New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
loader.c uses AT_FLAGS to communicate literally #1168
Conversation
Now the `ape - prog args...` form communicates to the loaded binary that argv[0] is preserved. This is done via either editing the previous flags passed in auxv, or adding a new AT_FLAGS value at the end if one was not found. I decided to always insert an AT_FLAGS entry, although another option is to only insert one if literally is true. This would change the value for `n` to `... + 2 * (!flags && literally) + ...` and add another branch to the if statement. I guessed that the extra branch was not worth saving 2 words on the stack sometimes.
We should be measuring the AT_FLAGS of the child process, since it may be true even if it is false for the parent process.
I hadn't noticed the previous AT_FLAGS changes, so I might be a little behind. Will this method work with macOS, given that it doesn't receive an AUX vector? |
Hmm. It will not work on x86_64 XNU, you're right. On arm64 XNU, auxv is polyfilled. |
This is a pretty niche piece of functionality — basically just a way for the loader to tell the binary that its argv[0] has been preserved. I'm sort of okay with it not working on x86_64 XNU; |
ad8a9c5
to
d42907f
Compare
Agreed, it seems a number of (somewhat fancy?) niche modifications to the loader have been made, to solve the old problem of "who am i and where did I come from" for the loaded binary. I just wonder what their utility is if they come to be depended on and don't work on all mainline supported Cosmo host systems. |
If there is loader code that should be changed or removed, please just point at what it is in particular. Or better still, suggest a better design. The current and planned work is to try to get a reasonable compromise on two desiderata that are in tension:
We need 2 to work under all circumstances to the best of our ability. It would be nice for 1 to work where practicable. The old school approach was to just sacrifice 1 entirely for the sake of 2. The new school approach is to try to get both more of the time. |
The only cases where point 2 currently does not work reliably are OpenBSD and x86_64 XNU assimilated binaries. We can fix x86_64 XNU trivially by reading out the Point 1 doesn't work a lot more of the time, but is also already depended on by a bunch of stuff, so it's not like we're creating that problem by having incrementally better support for it. |
Hello @mrdomino, I am sorry, my previous comments were not intended as a complaint in any way. I'm still developing mostly on Intel macOS, with my point of view mostly coming from that perspective.
Sounds good!
That also sounds like potentially a good idea. I had thought of the idea of fully polyfilling aux for macOS x86_64, but wasn't sure most of the useful auxv entries could be usefully emulated, nor whether more dependencies on upstairs auxv[] access is a good idea within Cosmopolitan either.
I think auxv is also zeroed out in ape/loader.c when os is XNU. Thank you! |
No worries and that makes perfect sense then; yeah, x86_64 XNU is at risk of becoming a dinosaur, so it's good to be vocal about stuff that doesn't work right on it.
If you are interested, I have just filed #1169 for this and would be happy to answer any questions or chat about it further on redbean discord if you wanted to tackle it. Otherwise it is on my radar but currently “best effort” (meaning I may get around to it one day.)
You're right, it is, but we don't care about it in this case since |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
I haven’t actually thought extremely carefully about exactly which value gets zeroed by the previously-mentioned loop, but it varies by one word, so it’s wrong half the time.
@ghaerr Nothing is niche if one of us needs it. Especially when that person is volunteer to be the one to write it. I believe Cosmopolitan should be inclusive of your needs, @mrdomino's needs, and others too. I believe we can keep the code simple and elegant in the process if we devote ourselves to having a culture that values this. However our needs and curiosities should come first, since there's no shame in refactoring things later. |
This comment was marked as outdated.
This comment was marked as outdated.
I'm gonna go ahead and say this is ready. Should work everywhere except x86_64 XNU, which has an issue filed. I will leave it to be tackled in a future PR, possibly by some intrepid young coder, who might even be me. |
(comments marked off-topic, including mine, are fine and good but not relevant to this PR; discussion should probably happen at #1169 or in new issues instead) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good. Before I approve, I'm curious how this is intended to be used? Should programs dependent on preserved argv[0]/non path argv[0] check the flag? I'd imagine that would be error prone as programs could be assimilated and I'd think it wouldn't be set then? It's okay if this is purely advisory information too. Feel free to just link me is this is explained elsewhere, I haven't followed the argv[0] preserving improvements super closely.
The more I think about this, the less I like it. That flag has particular semantics in the context of binfmt on Linux. It's the kernel's way of telling a binfmt interpreter, "I have incremented argc, and your argv[1] is the original argv[0] of the script you are interpreting." The ape binary, when run as a binfmt interpreter, now knows how to act on this flag in order to present the correct arguments to the binary it loads. But this flag currently loses its meaning in the context of the loaded binary. The ape interpreter has left it set as a side effect; this side effect can be observed, which will tell the binary that its interpreter has preserved argv[0]. But suppose you want to use an assimilated ape binary as an interpreter; now, it needs to know whether that flag came from the kernel or from the ape loader, and it has two different meanings in those two cases. Overall I do not like this anymore. I think if anything, the ape loader should actually unset this flag if it is set. The test cases should do something else to decide whether to test argv[0]. A bit late in the game, but it also occurs to me that you could use this flag in the original sense; increment argc and pass an extra argv with the original argv[0] in it, instead of doing the register trick. At this point just more arguments for unsetting the flag and not abusing it in current binaries... |
No description provided.