Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use functions from crt_externs.h on iOS/tvOS/watchOS/visionOS #125225

Merged
merged 4 commits into from
May 21, 2024

Conversation

madsmtm
Copy link
Contributor

@madsmtm madsmtm commented May 17, 2024

Use _NSGetEnviron, _NSGetArgc and _NSGetArgv on iOS/tvOS/watchOS/visionOS, see each commit and the code comments for details. This allows us to unify more code with the macOS implementation, as well as avoiding linking to the Foundation framework (which is good for startup performance).

The biggest problem with doing this would be if it lead to App Store rejections. After doing a bunch of research on this, while it did happen once in 2009, I find it fairly unlikely to happen nowadays, especially considering that Apple has later added crt_externs.h to the iOS/tvOS/watchOS/visionOS SDKs, strongly signifying the functions therein is indeed supported on those platforms (even though they lack an availability attribute).

That we've been overly cautious here has also been noted by @thomcc in #117910 (comment).

r? @workingjubilee

@rustbot label O-apple

If we're comfortable using `_NSGetEnviron` from `crt_externs.h`, there shouldn't be an issue with using these either, and then we can merge with the macOS implementation.

This also fixes two test cases on Mac Catalyst:
- `tests/ui/command/command-argv0.rs`, maybe because `[[NSProcessInfo processInfo] arguments]` somehow converts the name of the first argument?
- `tests/ui/env-funky-keys.rs` since we no longer link to Foundation.
@rustbot rustbot added O-unix Operating system: Unix-like S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) labels May 17, 2024
@workingjubilee
Copy link
Contributor

I probably can't accept this in its current form, the way we handle args on macOS has problems:

We need to make getting args on all platforms Much Less Clever.

@workingjubilee
Copy link
Contributor

workingjubilee commented May 18, 2024

Technically this diff is not making the situation worse I suppose? Hmm... and apparently things like googletest straight-up mutate the pointees of these, eesh... What do you think?

@madsmtm
Copy link
Contributor Author

madsmtm commented May 18, 2024

I'm not entirely sure I understand your linked issues? How could we implement these in a "Much Less Clever" way on macOS, while still allowing it to work in dynamic libraries?

Technically this diff is not making the situation worse I suppose?

Indeed, there aren't any functional changes on macOS, and [[NSProcessInfo processInfo] arguments] also takes the value from _NSGetArgc/_NSGetArgv (and, if I'm reading the disassembly correctly, stores a copy upon first access).

Hmm... and apparently things like googletest straight-up mutate the pointees of these, eesh... What do you think?

Pretty sure that mutating these in main, while not explicitly defined, should be valid, especially in testing scenarios. While researcing this, I found this 2022 proposal for C2Y, which under "Const correctness and thread safety" mentions:

main() can mutate its arguments. [...] we take the position that argv [h]as always been mutable, and users do rely on that for a few use cases including:

  • Hiding from users args that are specific to a given framework
  • Changing the name of the program as displayed by process monitoring tools

@workingjubilee
Copy link
Contributor

workingjubilee commented May 19, 2024

Interesting.

...hmm, actually, one thing I noticed through all this. MacOS was never hardened against foolishness the same way that Linux was.

@madsmtm A program can update the pointee of the value of _NSGetArgc to have a higher value than the actual number of arguments available from the perspective of _NSGetArgv, or alternatively move the null terminator we will find from _NSGetArgv earlier. Please fix that by not only bounding our iteration to the argc but also checking for the null terminator and ending the iteration early if these values are made inconsistent. We cannot avoid a host program setting us up for UB in all cases but we can duck the most obvious error that reckless calling code might throw at us.

@madsmtm
Copy link
Contributor Author

madsmtm commented May 19, 2024

A program can update the pointee of the value of _NSGetArgc to have a higher value than the actual number of arguments available from the perspective of _NSGetArgv,

From my reading of the disassembly of -[NSProcessInfo arguments], _NSGetArgc can only contain a smaller value than the number of arguments available in _NSGetArgv.

or alternatively move the null terminator we will find from _NSGetArgv earlier. Please fix that by not only bounding our iteration to the argc but also checking for the null terminator and ending the iteration early if these values are made inconsistent.

And instead, when encountering a NULL argument in _NSGetArgv, -[NSProcessInfo arguments] skips over it, but continues processing the rest of the arguments up until _NSGetArgc. I.e. it's not treated as a NULL terminator, rather just as an ignored value.

I've implemented this behaviour in the latest commit.

@workingjubilee
Copy link
Contributor

@madsmtm Hmm. Are you saying this doesn't work?

#include "whatever.h"

int overflow_argc() {
   *_NSGetArgc() = 9001;
    return 0;
}

@madsmtm
Copy link
Contributor Author

madsmtm commented May 19, 2024

Well... It's kinda muddied by the fact that -[NSProcessInfo arguments] is initialized by dyld the moment that Foundation is loaded (learned this just now by experimenting), and then a cached value is used ever since.

But if something else is loaded before Foundation and modifies _NSGetArgc/_NSGetArgv (which I think is how e.g. testing frameworks would usually do this, right? If not, how else would they ensure that the value that -[NSProcessInfo arguments] returns is updated?), then it's UB if _NSGetArgc is larger than the number of entries in _NSGetArgv.

This can be seen with the following:

// File: overflow_argc.c
#import <crt_externs.h>

__attribute__((constructor)) // Run the function upon loading this dylib
void overflow_argc(void) {
    *_NSGetArgc() = 9001;
}
// File: read_args.m
#import <Foundation/Foundation.h>
#import <crt_externs.h>

int main(int argc, const char * argv[]) {
    NSLog(@"argc: %i", argc);
    NSLog(@"*_NSGetArgc(): %i", *_NSGetArgc());
    NSLog(@"-[NSProcessInfo arguments]: %@", [[NSProcessInfo processInfo] arguments]);
    return 0;
}

Compile with:

clang overflow_argc.c -dynamiclib -o liboverflow_argc.dylib
clang read_args.m -L . -framework Foundation -l overflow_argc

Running ./a.out works, but if the -l overflow_argc argument is passed before -framework Foundation, it segfaults (before even running main).

@workingjubilee
Copy link
Contributor

Hmm. Based on what you said, -[NSProcessInfo arguments] uses _NSGetArgc and not the other way around, right?

@workingjubilee
Copy link
Contributor

For this exact moment I am most concerned about the behavior of Rust code compiled into a cdylib using this version of the standard library, assuming this patch is accepted, and then run with a C host that uses essentially no meaningful library code beyond the barest minimum, so my only question is "Can that C host make us do a UB by being reckless while updating the pointees of these functions in a way we can easily avoid?"

@madsmtm
Copy link
Contributor Author

madsmtm commented May 19, 2024

Hmm. Based on what you said, -[NSProcessInfo arguments] uses _NSGetArgc and not the other way around, right?

Yes.

For this exact moment I am most concerned about the behavior of Rust code compiled into a cdylib using this version of the standard library, assuming this patch is accepted, and then run with a C host that uses essentially no meaningful library code beyond the barest minimum, so my only question is "Can that C host make us do a UB by being reckless while updating the pointees of these functions in a way we can easily avoid?"

Yeah, I get the concern - and the answer right now is yes, we get UB if some random C code sets the value of _NSGetArgc to something that doesn't match _NSGetArgv.

And we definitely could just do a break in the loop instead of continue, my concern is that that has different behaviour from the standard API that's otherwise available on the platform (-[NSProcessInfo arguments]).

That is, well-behaving C code like the following (at least arguably more well-behaving than setting _NSGetArgc to some crazy value):

#import <crt_externs.h>

__attribute__((constructor)) // Run the function upon loading this dylib
void erase_first_arg(void) {
    *_NSGetArgv()[1] = 0; // Erase first argument
}

Would erase the first argument of -[NSProcessInfo arguments], but would erase all subsequent arguments in an implementation that used break.


So I think the question here is more: What is the semantics of *_NSGetArgv()[1] = 0;? Should it follow the (very implicit) platform convention and erase just that argument, or should it follow the current Linux implementation and "erase" the argument and all following it?

@workingjubilee
Copy link
Contributor

Hm. As far as C is concerned, there is no real question here: the semantics of argv is that it is null-terminated, thus the only sensible behavior when you assign NULL to a value is that everything stops there.

The Objective-C code obeys a different semantic.

While there is no question as to whether or not Objective-C code has a place on macOS and even defines much of the platform's properties, thus we should not simply shrug off this behavior, nonetheless the real question that is hanging in the air is:

...Why did an Apple engineer commit that change?

...Unfortunately, the reputation of the company is that they will not be forthcoming with answers, even if we ask, so we are left to speculate. I feel I can assume their decision is not meritless, but I don't know if they were weighing things the same way.

Hm.

I'll sleep on it.

@madsmtm
Copy link
Contributor Author

madsmtm commented May 19, 2024

I'll add a bit more context for your deliberations:

So another solution could be to skip (i.e. continue instead of break on) NULL arguments on all platforms. This would also handle e.g. string allocation failure slightly better when doing e.g. argv[i] = strdup("foo"); (however unlikely allocation failure is at the start of program execution).

For myself, I think I slightly prefer the "continue parsing, don't break" option that Apple went with, though I am leaning towards wanting the same behaviour on all (unix) platforms, whatever we end up choosing.

@workingjubilee
Copy link
Contributor

perhaps for historical reasons where argv wasn't originally null-terminated.

We do not support such platforms so I don't think we have to worry about that. A platform either supports at least ANSI C or it doesn't support C at all, to us.

So another solution could be to skip (i.e. continue instead of break on) NULL arguments on all platforms. This would also handle e.g. string allocation failure slightly better when doing e.g. argv[i] = strdup("foo"); (however unlikely allocation failure is at the start of program execution).

Hm.

What I am currently struggling to believe is that there are cases where the filter case actually recovers data more usefully than the take_while in the "well-behaved code" cases. All existing libraries that I'm aware of that permute the argv will also move NULL to the end. In other words, they also respect argv's spec. So it shouldn't matter which we do... unless the code is ill-behaved. In the ill-behaved code case, the heuristic that is simpler... "take minimum", break, take_while... prevents errors we would have otherwise encountered, and there is no reason to believe that the data beyond that first NULL is not corrupt.

@workingjubilee
Copy link
Contributor

workingjubilee commented May 19, 2024

...In other words, for all major libraries I am aware of, -[NSProcessInfo arguments] will hit a NULL and then skip over it, and then hit a NULL and then skip over it, etc., for argc - (current_index) arguments.

A major benefit I could see of the filter behavior is to allow lazily iterating the argv since it removes directionality from the argv, instead of preallocating the strings, but

  • we need to come up with Strings anyways for our current type signature
  • it seems doubtful to me this is going to be a huge performance thing tbh
  • because of the "now we expose ourselves to bad/erroneous code racing us" problem, that seems like it will lead to a repeat of the get_env / set_env debacle and get rolled back either way

@madsmtm
Copy link
Contributor Author

madsmtm commented May 20, 2024

perhaps for historical reasons where argv wasn't originally null-terminated.

We do not support such platforms so I don't think we have to worry about that. A platform either supports at least ANSI C or it doesn't support C at all, to us.

Yeah, mostly noted it as a curiosity / an explanation for why argv is both null-terminated and has its length defined by argc (as opposed to just one of those).

So another solution could be to skip (i.e. continue instead of break on) NULL arguments on all platforms. This would also handle e.g. string allocation failure slightly better when doing e.g. argv[i] = strdup("foo"); (however unlikely allocation failure is at the start of program execution).

Hm.

What I am currently struggling to believe is that there are cases where the filter case actually recovers data more usefully than the take_while in the "well-behaved code" cases. All existing libraries that I'm aware of that permute the argv will also move NULL to the end. In other words, they also respect argv's spec. So it shouldn't matter which we do... unless the code is ill-behaved. In the ill-behaved code case, the heuristic that is simpler... "take minimum", break, take_while... prevents errors we would have otherwise encountered, and there is no reason to believe that the data beyond that first NULL is not corrupt.

Fair point.

I've pushed a new commit that merges the implementation on Apple with that of the other unixes, so that we use the "break" strategy on all platforms. I was unsure of exactly what to write as the reasoning, please say so if you have a better suggestion for the wording in my comments!

@workingjubilee
Copy link
Contributor

That commentary looks good to me!

@workingjubilee
Copy link
Contributor

@bors r+

@bors
Copy link
Contributor

bors commented May 21, 2024

📌 Commit 38ad851 has been approved by workingjubilee

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels May 21, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request May 21, 2024
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#124570 (Miscellaneous cleanups)
 - rust-lang#124772 (Refactor documentation for Apple targets)
 - rust-lang#125011 (Add opt-for-size core lib feature flag)
 - rust-lang#125218 (Migrate `run-make/no-intermediate-extras` to new `rmake.rs`)
 - rust-lang#125225 (Use functions from `crt_externs.h` on iOS/tvOS/watchOS/visionOS)
 - rust-lang#125266 (compiler: add simd_ctpop intrinsic)
 - rust-lang#125348 (Small fixes to `std::path::absolute` docs)

Failed merges:

 - rust-lang#125296 (Fix `unexpected_cfgs` lint on std)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit a8ee8d5 into rust-lang:master May 21, 2024
6 checks passed
@rustbot rustbot added this to the 1.80.0 milestone May 21, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 21, 2024
Rollup merge of rust-lang#125225 - madsmtm:ios-crt_externs.h, r=workingjubilee

Use functions from `crt_externs.h` on iOS/tvOS/watchOS/visionOS

Use `_NSGetEnviron`, `_NSGetArgc` and `_NSGetArgv` on iOS/tvOS/watchOS/visionOS, see each commit and the code comments for details. This allows us to unify more code with the macOS implementation, as well as avoiding linking to the `Foundation` framework (which is good for startup performance).

The biggest problem with doing this would be if it lead to App Store rejections. After doing a bunch of research on this, while [it did happen once in 2009](https://blog.unity.com/engine-platform/unity-app-store-submissions-problem-solved), I find it fairly unlikely to happen nowadays, especially considering that Apple has later _added_ `crt_externs.h` to the iOS/tvOS/watchOS/visionOS SDKs, strongly signifying the functions therein is indeed supported on those platforms (even though they lack an availability attribute).

That we've been overly cautious here has also been noted by `@thomcc` in rust-lang#117910 (comment).

r? `@workingjubilee`

`@rustbot` label O-apple
@madsmtm madsmtm deleted the ios-crt_externs.h branch May 21, 2024 19:57
@simlay
Copy link
Contributor

simlay commented May 22, 2024

Weirdly, I think think this uncovered a bug with the CoreFoundation not having the correct target_os annotations for visionOS and watchOS targets and thus failing at link time. servo/core-foundation-rs#679 looks to be the fix. Figured it'd be worth writing it down in case someone else comes across a similar issue.

jieyouxu added a commit to jieyouxu/rust that referenced this pull request May 28, 2024
…workingjubilee

Make more of the test suite run on Mac Catalyst

Combined with rust-lang#125225, the only failing parts of the test suite are in `tests/rustdoc-js`, `tests/rustdoc-js-std` and `tests/debuginfo`. Tested with:
```console
./x test --target=aarch64-apple-ios-macabi library/std
./x test --target=aarch64-apple-ios-macabi --skip=tests/rustdoc-js --skip=tests/rustdoc-js-std --skip=tests/debuginfo tests
```

Will probably put up a PR later to enable _running_ on (not just compiling for) Mac Catalyst in CI, though not sure where exactly I should do so? `src/ci/github-actions/jobs.yml`?

Note that I've deliberately _not_ enabled stack overflow handlers on iOS/tvOS/watchOS/visionOS (see rust-lang#25872), but rather just skipped those tests, as it uses quite a few APIs that I'd be weary about getting rejected by the App Store (note that Swift doesn't do it on those platforms either).

r? `@workingjubilee`

CC `@thomcc`

`@rustbot` label O-ios O-apple
jieyouxu added a commit to jieyouxu/rust that referenced this pull request May 29, 2024
…workingjubilee

Make more of the test suite run on Mac Catalyst

Combined with rust-lang#125225, the only failing parts of the test suite are in `tests/rustdoc-js`, `tests/rustdoc-js-std` and `tests/debuginfo`. Tested with:
```console
./x test --target=aarch64-apple-ios-macabi library/std
./x test --target=aarch64-apple-ios-macabi --skip=tests/rustdoc-js --skip=tests/rustdoc-js-std --skip=tests/debuginfo tests
```

Will probably put up a PR later to enable _running_ on (not just compiling for) Mac Catalyst in CI, though not sure where exactly I should do so? `src/ci/github-actions/jobs.yml`?

Note that I've deliberately _not_ enabled stack overflow handlers on iOS/tvOS/watchOS/visionOS (see rust-lang#25872), but rather just skipped those tests, as it uses quite a few APIs that I'd be weary about getting rejected by the App Store (note that Swift doesn't do it on those platforms either).

r? ``@workingjubilee``

CC ``@thomcc``

``@rustbot`` label O-ios O-apple
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 29, 2024
Rollup merge of rust-lang#125226 - madsmtm:fix-mac-catalyst-tests, r=workingjubilee

Make more of the test suite run on Mac Catalyst

Combined with rust-lang#125225, the only failing parts of the test suite are in `tests/rustdoc-js`, `tests/rustdoc-js-std` and `tests/debuginfo`. Tested with:
```console
./x test --target=aarch64-apple-ios-macabi library/std
./x test --target=aarch64-apple-ios-macabi --skip=tests/rustdoc-js --skip=tests/rustdoc-js-std --skip=tests/debuginfo tests
```

Will probably put up a PR later to enable _running_ on (not just compiling for) Mac Catalyst in CI, though not sure where exactly I should do so? `src/ci/github-actions/jobs.yml`?

Note that I've deliberately _not_ enabled stack overflow handlers on iOS/tvOS/watchOS/visionOS (see rust-lang#25872), but rather just skipped those tests, as it uses quite a few APIs that I'd be weary about getting rejected by the App Store (note that Swift doesn't do it on those platforms either).

r? ``@workingjubilee``

CC ``@thomcc``

``@rustbot`` label O-ios O-apple
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request May 30, 2024
…bilee

Make more of the test suite run on Mac Catalyst

Combined with rust-lang/rust#125225, the only failing parts of the test suite are in `tests/rustdoc-js`, `tests/rustdoc-js-std` and `tests/debuginfo`. Tested with:
```console
./x test --target=aarch64-apple-ios-macabi library/std
./x test --target=aarch64-apple-ios-macabi --skip=tests/rustdoc-js --skip=tests/rustdoc-js-std --skip=tests/debuginfo tests
```

Will probably put up a PR later to enable _running_ on (not just compiling for) Mac Catalyst in CI, though not sure where exactly I should do so? `src/ci/github-actions/jobs.yml`?

Note that I've deliberately _not_ enabled stack overflow handlers on iOS/tvOS/watchOS/visionOS (see rust-lang/rust#25872), but rather just skipped those tests, as it uses quite a few APIs that I'd be weary about getting rejected by the App Store (note that Swift doesn't do it on those platforms either).

r? ``@workingjubilee``

CC ``@thomcc``

``@rustbot`` label O-ios O-apple
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-apple Operating system: Apple (macOS, iOS, tvOS, visionOS, watchOS) O-unix Operating system: Unix-like S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants