Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-portable use of va_args #196

Open
tkoeppe opened this issue Jul 25, 2016 · 35 comments
Open

Non-portable use of va_args #196

tkoeppe opened this issue Jul 25, 2016 · 35 comments

Comments

@tkoeppe
Copy link

tkoeppe commented Jul 25, 2016

The VM_Call function in vm.c currently uses variable arguments in a very awkward fashion in order to fake "default arguments" for function parameters: va_arg(ap, int) is called 12 times unconditionally, independent of how the function was actually called.

This has undefined behaviour and is not portable. It will typically read outside the stack frame and throw off memory analysers.

I'm not sure how to solve this in C89, but if you are willing to require C99, this could be solved with a compound literal:

#define VM_Call(VM, ...) VM_Call_Impl(VM, &(int[13]){__VA_ARGS__})

intptr_t VM_Call_Impl(vm_t *vm, int (*args)[13]) {
  int callnum = (*args)[0];  // etc.
}

What do you think of such a change?

@tkoeppe
Copy link
Author

tkoeppe commented Jul 25, 2016

Or perhaps we could have a "strict" mode:

#if defined(PORTABLE_VA_ARGS)
#define VM_Call(VM, ...) VM_Call_Impl(VM, &(int[13]){__VA_ARGS__})
#elif defined(CLANG_ABI)
#define VM_Call(VM, ...) VM_Call_Clang(VM, __VA_ARGS__)
#else
#define VM_Call(VM, ...) VM_Call_Hackish(VM, __VA_ARGS__)
#endif

That way you could remove all the compile-time branching from the individual functions, too.

@ensiform
Copy link

Invoking C99 makes ioquake3 and forks impossible to compile with MSVC < 2015.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 26, 2016

I understand, that would be a deal breaker.

Are you still interested in a conditionally-compiled version (like with PORTABLE_VA_ARGS above), or should we just close this?

@timangus
Copy link
Member

I would be mildly surprised if you needed 2015 for VA_ARGS. In any case, we don't actually even officially support VS so that shouldn't really be a gate, and requiring 2015 is no bad thing anyway since it's the first compiler they've come up with since the mid 90s that is actually quite good.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 26, 2016

@timangus: No, sorry, to be clear here, the crucial feature from C99 is the compound literal:

   &(int[13]){__VA_ARGS__}
// ^^^^^^^^^^

The __VA_ARGS__ aren't a big issue.

@timangus
Copy link
Member

Yeah, github seems to have stripped my leading and trailing underscores. Not very helpful.

@timangus
Copy link
Member

TEST_COMMENT "TEST_COMMENT"

@timangus
Copy link
Member

Well that's stupid.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 26, 2016

@timangus: That's markdown, baby :-)

@timangus
Copy link
Member

D'oh.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 26, 2016

By the way, there's a "preview" tab when you write a comment, so you can check before submitting what it will look like. And you can also edit comments if you discover a problem only after submission. :-)

@timangus
Copy link
Member

Yeah, but none of the subsequent comments would make any sense if I edited it now ;). Annnyway, if __VA_ARGS__ works, I don't have a problem with it (even if that means needing VS2015.)

@NuclearMonster
Copy link
Member

VS2015 community works? that's fine.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 27, 2016

So... would you be interested in a patch along those lines? The key features would be:

  • VM_Call becomes a macro.
  • The default implementation of the macro would use a C99 compound literal.

We can also keep other code branches around that use the two existing versions of va_arg abuse, both as an alternative for non-C99 compilers and in case someone derives a genuine performance advantage from that.

@wtfbbqhax
Copy link

Opened the debugger- I observe that every invocations (in Tremulous at least) VM_Call() results in leaking at least the callers stack frame, but often several frames to the vm 😈

Also, don't forget about VM_DllSyscall in the fix.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 27, 2016

Another interesting tidbit: Every instance of vmMain actually declares thirteen int parameters, and never an ellipsis. So I'm very tempted to also change the definition of entryPoint, because that's another abuse of va-args at the moment.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 27, 2016

Here is a tentative change: tkoeppe@9719411

This is just a rough skeleton; it has issues:

  • Missing fall-back for non-C99 compilers
  • Magic numbers 13 and 16 appear, which should really use the macros MAX_VM{MAIN,SYSCALL}_ARGS...
  • ... although those macros are a bit silly now given that we have fixed parameter lists and there's nothing to "max" here.
  • There should probably be typedefs for some of the more arcane function types.

@wtfbbqhax
Copy link

@tkoeppe I've never seen &(int[13]){__VA_ARGS__} before..

Let me know if I'm reading this correctly, __VA_ARGS__ populates elements of a temporary array which cast to an array of 13 integers. Is & actually needed?

Assuming this is true have you verified with -O0 and -O3 that optimization doesn't break anything?

@tkoeppe
Copy link
Author

tkoeppe commented Jul 27, 2016

@wtfbbqhax: ish. The compound literal is an lvalue, initialized from the given initializer (which means that array elements that aren't mentioned are zero-intialized. We then take the address of that array. Yes, I do want to take the address of the array rather than decay the array so as to retain a semblance of type safety here; that way we can at least check that the number of array elements is the same everywhere.

@zturtleman
Copy link
Member

I named MAX_VM{MAIN,SYSCALL}_ARGS and I don't think this changes the meaning of them. It's not a "max" in the sense that arrays might have less elements, it's that less might actually be used and that more can't be used. It makes it easier and more obvious to increase the max number of syscall arguments compared to magic numbers and non-looped array access. My engine fork uses MAX_VMSYSCALL_ARGS = 17. Historical reference: f7a2006, c5af65f.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 27, 2016

@zturtleman: OK, sure, my main issue with those macros in the light of this change is that you can't tie the macro value to the function signature easily. That is, I can't generate f(int, int, int) from a macro #define NUM_ARGS 3. (At least not without some heavy preprocessor lifting.) So we'd still need to keep the function signatures and forwarding calls in sync manually.

@zturtleman
Copy link
Member

Yeah, it requires additional code changes to actually get more args to (DLL) vmMain now as well. This isn't really much different.

@tkoeppe
Copy link
Author

tkoeppe commented Jul 27, 2016

I updated the commit, it's now: tkoeppe@614848a

This one includes changes to the three *_syscall.c DLL implementations. The changes follow the same pattern: the syscall function is replaced by a macro that creates an array as a compound literal, and passes the (decayed) array to the syscall function pointer. That function pointer now actually matches the function signature of the various syscall implementations (e.g. CL_CgameSystemCalls).

@wtfbbqhax
Copy link

Can you please make the VM_Call require an actual call number i.e.,

#define VM_Call( VM, CALL, ... ) VM_Call_Impl( VM, CALL, &(int[12]){__VA_ARGS__})
intptr_t QDECL VM_Call_Impl( vm_t *vm, int callnum, int (*arg)[12] )

@zturtleman
Copy link
Member

This one includes changes to the three *_syscall.c DLL implementations.

Doesn't that break compatibility with existing DLLs?

@tkoeppe
Copy link
Author

tkoeppe commented Jul 28, 2016

This one includes changes to the three *_syscall.c DLL implementations.

Doesn't that break compatibility with existing DLLs?

Hm, only if the name mangling includes the function signature. This isn't the case on Linux, but I don't know about Windows. I doubt it, though, because you don't include the signature in Sys_LoadFunction, so I think it cannot be part of the mangling on any ABI. Moreover, the function signature changes from a function taking one parameter of one function pointer type to a function with one parameter taking a different function pointer type, so even if the size of the parameters is part of the mangling, it shouldn't change.

@zturtleman
Copy link
Member

Sorry, I should of been more specific. I meant when calling syscall in a DLL as it use to be multiple arguments read using va_args in the engine and now it's takes a pointer to an array. So existing DLLs give engine a syscall number value which the engine with this patch thinks is a pointer. Engine will probably SEGFAULT when trying to read syscall number and arguments. Using new DLLs on old engine would likely error about unknown syscall number because it's actually passing a pointer address.

@tkoeppe
Copy link
Author

tkoeppe commented Aug 1, 2016

@wtfbbqhax: Unfortunately no, because you cannot portably have zero variadic arguments in a variadic macro.

@tkoeppe
Copy link
Author

tkoeppe commented Aug 1, 2016

@zturtleman: Yes, you're right. Previously compiled DLLs will pass the arguments as if to an ellipsis, but new client code would treat it as a pointer to the first element of an array, and that would break. So this change would require recompiling DLLs.

@zturtleman
Copy link
Member

Rocket Arena 3 uses a game DLL with no source release so it can't be recompiled. It seems like a bad idea to change the DLL-VM API in general.

@tkoeppe
Copy link
Author

tkoeppe commented Aug 5, 2016

OK, so we should leave the DLLs alone then.

Are you still interested in the change to VM_Call?

@tkoeppe
Copy link
Author

tkoeppe commented Sep 11, 2018

(Note that DeepMind Lab uses the C99 compound literal: https://github.com/deepmind/lab/blob/master/engine/code/qcommon/qcommon.h#L369)

@ensiform
Copy link

Presumably they're using MSVC 2015/17 rather than mingw/make or extremely old msvc

@tkoeppe
Copy link
Author

tkoeppe commented Sep 11, 2018

No, neither, we're using Clang and GCC only -- I just wanted to show how the proposed change looks in a real fork.

@ghost
Copy link

ghost commented Feb 13, 2020

Any progress on this? QuakeJS -s SAFE_HEAP=1 errors out on this kind of stuff. I don't mind only supporting mods with source/reverse engineering just to recompile if someone can point me towards a good implementation like Daemon or or one of the other off-shoots?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants