Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CommandLine][Linux] Don't read argv from /proc/self/cmdline. #71611

Merged
merged 1 commit into from
Feb 20, 2024

Conversation

al45tair
Copy link
Contributor

@al45tair al45tair commented Feb 14, 2024

Instead of reading from /proc/self/cmdline, take advantage of the fact that the initial stack layout is ABI specified, and that we already have a pointer into it (environ). This lets us walk up the stack until we find argc, at which point we also know where argv is.

We do this from a static initializer because a setenv() or putenv() can change environ (if you add a new environment variable), and it's even permissible to just outright change environ yourself too. It seems reasonable to suggest to people that they shouldn't be doing those things from a static initializer, and as long as they don't, they won't run before we've had a chance to find argv.

Just in case someone does do this, we also check that environ points into the stack. If it doesn't, they won't get any arguments, so if that happens, that's a clue that they're messing with environ too early.

This works around a problem (#69658) with Docker Desktop 4.25.0 and Rosetta, wherein we end up with an extra argument visible in /proc/self/cmdline, and also avoids allocating memory for the command line arguments.

rdar://117963394

@kateinoigakukun
Copy link
Member

kateinoigakukun commented Feb 14, 2024

Can we just save argv/argc passed to main entrypoint by emit a runtime function call at the main? This is only for Linux, so introducing a new ABI function wouldn't be a big deal, I think.

For the record, Rust takes this saving-at-entry approach for some platforms including Linux. (but for Linux + glibc, they use the fact that glibc passes argv/argc/envp to init_array functions)

@al45tair
Copy link
Contributor Author

Can we just save argv/argc passed to main entrypoint by emit a runtime function call at the main? This is only for Linux, so introducing a new ABI function wouldn't be a big deal, I think.

The problem with that is that the command line would only be available after main has started. Library code may want it before then, and indeed it looks like that would work just fine on Darwin, Windows, FreeBSD, OpenBSD and indeed WASI. It just wouldn't work on Linux.

Ideally, Swift should behave the same way everywhere.

@kateinoigakukun
Copy link
Member

kateinoigakukun commented Feb 14, 2024

The problem with that is that the command line would only be available after main has started. Library code may want it before then

Ah, I see... Then it makes sense to depend on initial stack layout as libc-independent approach on Linux.

@al45tair
Copy link
Contributor Author

For the record, Rust takes this saving-at-entry approach for some platforms including Linux. (but for Linux + glibc, they use the fact that glibc passes argv/argc/envp to init_array functions)

Yes. I'd be in favour of the save-at-entry approach if I was doing something like this from scratch, I think, because that's completely portable, whereas with this we're dependent on platform specifics.

As for the init_array behaviour, if every Linux libc did that, that would be the way to go. I think glibc and Bionic both do, but musl doesn't, sadly :-(

…ine.

Instead of reading from `/proc/self/cmdline`, take advantage of the fact
that the initial stack layout is ABI specified, and that we already have
a pointer into it (`environ`).  This lets us walk up the stack until we
find `argc`, at which point we also know where `argv` is.

We do this from a static initializer because a `setenv()` or `putenv()`
can change `environ` (if you add a new environment variable), and it's
even permissible to just outright change `environ` yourself too.  It
seems reasonable to suggest to people that they shouldn't be doing
those things from a static initializer, and as long as they don't,
they won't run before we've had a chance to find `argv`.

Just in case someone _does_ do this, we also check that `environ`
points into the stack.  If it doesn't, they won't get any arguments,
so if that happens, that's a clue that they're messing with `environ`
too early.

This works around a problem (apple#69658) with Docker Desktop 4.25.0 and
Rosetta, wherein we end up with an extra argument visible in
`/proc/self/cmdline`, and also avoids allocating memory for the
command line arguments.

rdar://117963394
@al45tair
Copy link
Contributor Author

@swift-ci Please smoke test Linux platform

1 similar comment
@al45tair
Copy link
Contributor Author

@swift-ci Please smoke test Linux platform

@al45tair al45tair marked this pull request as ready for review February 16, 2024 15:17
@al45tair al45tair requested a review from a team as a code owner February 16, 2024 15:17
@al45tair
Copy link
Contributor Author

@swift-ci Please smoke test Windows platform

@al45tair
Copy link
Contributor Author

@swift-ci Please smoke test macOS platform

Copy link
Contributor

@mikeash mikeash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh

Copy link
Member

@Catfish-Man Catfish-Man left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awful (laudatory)

@al45tair al45tair merged commit 5a90430 into apple:main Feb 20, 2024
3 checks passed
al45tair added a commit to al45tair/swift that referenced this pull request Feb 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants