[CommandLine][Linux] Don't read argv from /proc/self/cmdline. #71611

al45tair · 2024-02-14T13:32:48Z

Instead of reading from /proc/self/cmdline, take advantage of the fact that the initial stack layout is ABI specified, and that we already have a pointer into it (environ). This lets us walk up the stack until we find argc, at which point we also know where argv is.

We do this from a static initializer because a setenv() or putenv() can change environ (if you add a new environment variable), and it's even permissible to just outright change environ yourself too. It seems reasonable to suggest to people that they shouldn't be doing those things from a static initializer, and as long as they don't, they won't run before we've had a chance to find argv.

Just in case someone does do this, we also check that environ points into the stack. If it doesn't, they won't get any arguments, so if that happens, that's a clue that they're messing with environ too early.

This works around a problem (#69658) with Docker Desktop 4.25.0 and Rosetta, wherein we end up with an extra argument visible in /proc/self/cmdline, and also avoids allocating memory for the command line arguments.

rdar://117963394

kateinoigakukun · 2024-02-14T14:16:02Z

Can we just save argv/argc passed to main entrypoint by emit a runtime function call at the main? This is only for Linux, so introducing a new ABI function wouldn't be a big deal, I think.

For the record, Rust takes this saving-at-entry approach for some platforms including Linux. (but for Linux + glibc, they use the fact that glibc passes argv/argc/envp to init_array functions)

al45tair · 2024-02-14T14:23:57Z

Can we just save argv/argc passed to main entrypoint by emit a runtime function call at the main? This is only for Linux, so introducing a new ABI function wouldn't be a big deal, I think.

The problem with that is that the command line would only be available after main has started. Library code may want it before then, and indeed it looks like that would work just fine on Darwin, Windows, FreeBSD, OpenBSD and indeed WASI. It just wouldn't work on Linux.

Ideally, Swift should behave the same way everywhere.

kateinoigakukun · 2024-02-14T14:32:05Z

The problem with that is that the command line would only be available after main has started. Library code may want it before then

Ah, I see... Then it makes sense to depend on initial stack layout as libc-independent approach on Linux.

al45tair · 2024-02-14T15:03:57Z

For the record, Rust takes this saving-at-entry approach for some platforms including Linux. (but for Linux + glibc, they use the fact that glibc passes argv/argc/envp to init_array functions)

Yes. I'd be in favour of the save-at-entry approach if I was doing something like this from scratch, I think, because that's completely portable, whereas with this we're dependent on platform specifics.

As for the init_array behaviour, if every Linux libc did that, that would be the way to go. I think glibc and Bionic both do, but musl doesn't, sadly :-(

…ine. Instead of reading from `/proc/self/cmdline`, take advantage of the fact that the initial stack layout is ABI specified, and that we already have a pointer into it (`environ`). This lets us walk up the stack until we find `argc`, at which point we also know where `argv` is. We do this from a static initializer because a `setenv()` or `putenv()` can change `environ` (if you add a new environment variable), and it's even permissible to just outright change `environ` yourself too. It seems reasonable to suggest to people that they shouldn't be doing those things from a static initializer, and as long as they don't, they won't run before we've had a chance to find `argv`. Just in case someone _does_ do this, we also check that `environ` points into the stack. If it doesn't, they won't get any arguments, so if that happens, that's a clue that they're messing with `environ` too early. This works around a problem (apple#69658) with Docker Desktop 4.25.0 and Rosetta, wherein we end up with an extra argument visible in `/proc/self/cmdline`, and also avoids allocating memory for the command line arguments. rdar://117963394

al45tair · 2024-02-14T19:04:38Z

@swift-ci Please smoke test Linux platform

al45tair · 2024-02-16T12:18:44Z

@swift-ci Please smoke test Linux platform

al45tair · 2024-02-16T15:18:10Z

@swift-ci Please smoke test Windows platform

al45tair · 2024-02-16T15:18:20Z

@swift-ci Please smoke test macOS platform

mikeash

huh

Catfish-Man

This is awful (laudatory)

This reverts the reversion of apple#71611. rdar://117963394

al45tair force-pushed the eng/PR-117963394 branch from 2b886b1 to 1da4132 Compare February 14, 2024 14:15

al45tair force-pushed the eng/PR-117963394 branch from 1da4132 to 04c496c Compare February 14, 2024 17:33

al45tair requested a review from mikeash February 16, 2024 15:17

al45tair marked this pull request as ready for review February 16, 2024 15:17

al45tair requested a review from a team as a code owner February 16, 2024 15:17

mikeash approved these changes Feb 16, 2024

View reviewed changes

Catfish-Man approved these changes Feb 20, 2024

View reviewed changes

al45tair merged commit 5a90430 into apple:main Feb 20, 2024
3 checks passed

jckarter mentioned this pull request Feb 20, 2024

Revert "[CommandLine][Linux] Don't read argv from /proc/self/cmdline." #71761

Merged

al45tair added a commit to al45tair/swift that referenced this pull request Feb 21, 2024

[CommandLine][Linux] Don't read from /proc/self/cmdline (take 2)

ff1cfbd

This reverts the reversion of apple#71611. rdar://117963394

al45tair mentioned this pull request Feb 22, 2024

[CommandLine][Linux] Don't read argv from /proc/self/cmdline (take 2). #71781

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CommandLine][Linux] Don't read argv from /proc/self/cmdline. #71611

[CommandLine][Linux] Don't read argv from /proc/self/cmdline. #71611

al45tair commented Feb 14, 2024 •

edited

kateinoigakukun commented Feb 14, 2024 •

edited

al45tair commented Feb 14, 2024

kateinoigakukun commented Feb 14, 2024 •

edited

al45tair commented Feb 14, 2024

al45tair commented Feb 14, 2024

al45tair commented Feb 16, 2024

al45tair commented Feb 16, 2024

al45tair commented Feb 16, 2024

mikeash left a comment

Catfish-Man left a comment

[CommandLine][Linux] Don't read argv from /proc/self/cmdline. #71611

[CommandLine][Linux] Don't read argv from /proc/self/cmdline. #71611

Conversation

al45tair commented Feb 14, 2024 • edited

kateinoigakukun commented Feb 14, 2024 • edited

al45tair commented Feb 14, 2024

kateinoigakukun commented Feb 14, 2024 • edited

al45tair commented Feb 14, 2024

al45tair commented Feb 14, 2024

al45tair commented Feb 16, 2024

al45tair commented Feb 16, 2024

al45tair commented Feb 16, 2024

mikeash left a comment

Choose a reason for hiding this comment

Catfish-Man left a comment

Choose a reason for hiding this comment

al45tair commented Feb 14, 2024 •

edited

kateinoigakukun commented Feb 14, 2024 •

edited

kateinoigakukun commented Feb 14, 2024 •

edited