Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rustup can be invoked by this module as the wrong architecture on MacOS #80

Closed
zbentley opened this issue Feb 4, 2024 · 14 comments
Closed

Comments

@zbentley
Copy link
Contributor

zbentley commented Feb 4, 2024

Context/Setup

  • MacOS 14
  • An ARM/M1/2 MacOS environment.
  • Rosetta installed.
  • Puppet-bolt installed via brew install --cask puppetlabs/puppet/puppet-bolt, the suggested install method.
    • An important note: for whatever reason, puppet labs does not provide a bolt installer for arm64, and only provides an x86_64 installer: https://downloads.puppet.com/mac/puppet-tools/12/x86_64/puppet-bolt-3.27.4-1.osx12.dmg at the time of this writing. This means that bolt will run via Rosetta in x86 emulation mode.
  • A puppet-bolt project which ncludes this module in its dependencies.
  • A puppet manifest which invokes this module to install any toolchain/target.
  • No pre-existing rustup/rust/cargo installation or materials anywhere on the system (rustup self uninstall followed by rm -rf ~/.rustup ~/.cargo should reset things to this state, albeit somewhat destructively).

To reproduce

  • Using bolt, apply a manifest that uses this module. For example (given your user name as $USER):
> bolt apply --target localhost --run-as root --execute 'rustup { "$USER": toolchains => ["nightly-aarch64-apple-darwin"], targets => ["aarch64-apple-darwin"] }'
  • Check the file type of cargo with file $(which cargo).

Expected behavior:

cargo and rustc report file types matching the host architecture. For an ARM mac that would look like

> file $(which cargo)
/Users/$USER/.cargo/bin/cargo: Mach-O 64-bit executable arm64

> file $(which rustc)
/Users/$USER/.cargo/bin/rustc: Mach-O 64-bit executable arm64

Observed behavior

> file $(which cargo)
file $(which cargo)
/Users/$USER/.cargo/bin/cargo: Mach-O 64-bit executable x86_64

> file $(which rustc)
/Users/$USER/.cargo/bin/rustc: Mach-O 64-bit executable x86_64

Why this matters

So what happens if cargo is a non-native architecture?

  • That does not affect cargo's build behavior. It and rustc will still compile/cross-compile Rust code normally according to the targets configured in rustup and will produce binaries of the selected architecture.
  • Medium severity: cargo and rustc will run unusually slowly on MacOS due to the presence of binary emulation translating the x86_64 binaries into arm64 code.
  • High severity: programs launched by cargo and their subprograms will prefer to launch x86 programs on MacOS. In other words, if cargo run or some Cargo plugin runs a universal binary on the system (like, say sudo), that subprogram will run in x86 mode. Even if spawned subprograms do not support x86 execution (in which case cargo will spawn them as arm64), those programs' spawned subprocesses will still run preferring x86 execution.

That last one can cause problems. I encountered one such issue where the cargo flamegraph plugin supplied by flamegraph-rs runs its inner dtrace process as x86, causing dtrace to fail when it traced arm64 processes. The bug report for that issue is here, and I blogged about the discovery process of this issue here.

Given that rustup/cargo installations can persist untouched for long periods of time, and given that the main compiler suite still works, I think it's likely that incorrect-architecture installations of cargo/rustc by this module persist without users' knowledge in many cases, silently causing slowdowns and occasionally causing louder issues like the above.

Suggested resolution

When this module bootstraps cargo with the rustup resource, or when it installs a per-toolchain cargo via the rustup::toolchain resource, it should allow user specification of the architecture to those resources.

The new architecture => or toolchain_architecture => parameter that users can specify should, if unspecified, default to the invoking machine's native architecture via the "architecture" fact.

When this module invokes subprograms internally, they should be wrapped in the arch command to force the architecture as requested by the user.

This behavior need only be present on MacOS, as other platforms do not do multiarch/binfmt translation in the same way (and the issue is much more common on MacOS than on free operating systems due to the frequently limited availability of source builds, as is the case with puppet-bolt in this case: only an x86_64 binary installable is provided, and build steps for creating a correct executable are not readily available).

@danielparks
Copy link
Owner

It’s been a while since I worked on this, so I could have the details wrong. Also, I don’t have an ARM Mac to test this on, which makes it harder to track down the issue myself.

I think the issue is the rustup wrappers, e.g. ~/.cargo/bin/cargo, are being installed using the “host” architecture, i.e. x84 because it’s running in Rosetta. The toolchains should still be installed for the architecture you specified, though the symptoms you report suggest that might not be happening — I would assume that if something running under Rosetta ran an ARM binary, then the ARM binary ran a universal binary, that it would choose the ARM version.

What does file ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* give you?

@danielparks
Copy link
Owner

Probably the best way to solve the rustup architecture problem is to allow choosing a specific rustup-init (see Other ways to install rustup). The downside is that it doesn’t make sense to use that on the main path — reimplementing the installation shell script that figures out the architecture string would be too fragile.

@zbentley
Copy link
Contributor Author

zbentley commented Feb 6, 2024

Thanks for the replies!

if something running under Rosetta ran an ARM binary, then the ARM binary ran a universal binary, that it would choose the ARM version.

I'd expect that too! But it does not seem to be occurring; it seems like if anything up the chain has been invoked as x86, even without an explicit arch override to do so, then future universal binaries will be launched as x86 even if there are intermediate ARM binaries launched (because they're non-universal and only support ARM). I really don't like that behavior from Rosetta/Apple, it seems quite counterintuitive, but ... what can you do 🤷

What does file ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* give you?

When rustup-init is called by this module via x86 bolt, all binaries in that folder are x86.

When rustup-init is run by me (on a clean/non-initted system) from an ARM Bash/Zsh shell binary, all binaries in that folder are ARM.

Probably the best way to solve the rustup architecture problem is to allow choosing a specific rustup-init

I don't think that'll change anything, sadly. I think that, so long as Bolt's x86-only (which it is for now), regardless of how rustup-init is run, unless whatever runs it wraps it in arch -arch $native_arch rustup-init ..., this issue will reoccur.

I don’t have an ARM Mac to test this on, which makes it harder to track down the issue myself.

I'll take a stab at a PR (my Ruby is rusty though) in the coming weeks and you can decide if the arch -arch approach seems feasible or not.

@danielparks
Copy link
Owner

danielparks commented Feb 8, 2024

What does file ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* give you?

When rustup-init is called by this module via x86 bolt, all binaries in that folder are x86.

That sounds like a bug in this module or maybe in rustup itself (unlikely, I think) — the toolchain should always be in the architecture in its name. For example, on my Intel Mac:

❯ rustup toolchain install --no-self-update --force-non-host nightly-aarch64-apple-darwin
  . . .
❯ cd ~/.rustup/toolchains
❯ file nightly-*-apple-darwin/bin/cargo       
nightly-aarch64-apple-darwin/bin/cargo: Mach-O 64-bit executable arm64
nightly-x86_64-apple-darwin/bin/cargo:  Mach-O 64-bit executable x86_64

Similarly, if I install the stable-x86_64-pc-windows-gnu toolchain, then the corresponding toolchain binaries are Windows executables.

Probably the best way to solve the rustup architecture problem is to allow choosing a specific rustup-init

I don't think that'll change anything, sadly. I think that, so long as Bolt's x86-only (which it is for now), regardless of how rustup-init is run, unless whatever runs it wraps it in arch -arch $native_arch rustup-init ..., this issue will reoccur.

I figure that the architecture of rustup-init is what determines the architecture of ~/.cargo/bin/cargo. That binary should just determine which toolchain is desired, then exec the correct cargo binary in ~/.rustup/toolchains/. Since ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/cargo is x86_64 for you, I don't think that’s the (immediate) problem.

It still might be a problem later once the toolchains binaries get fixed.


All that said, it does seem to be working on my machine (not really a surprise, I suppose). Can you run it with the environment variable RUSTUP_TRACE=1 and the -v flag? That will show exactly what commands it runs.

For example, on my local machine:

❯ RUSTUP_TRACE=1 puppet apply -ve 'rustup { daniel: toolchains => ["nightly-aarch64-apple-darwin"], targets => [] }' 
Info: Loading facts
Info: Loading facts
Notice: Compiled catalog for marlow.local in environment production in 0.19 seconds
Info: Using environment 'production'
Info: Applying configuration version '1707364703'
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup show
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup toolchain list
Notice: /Stage[main]/Main/Rustup[daniel]/Rustup_internal[daniel]/toolchains: toolchains changed [
  {
    'ensure' => 'present',
    'name' => 'stable-x86_64-apple-darwin'
  },
  {
    'ensure' => 'present',
    'name' => 'nightly-x86_64-apple-darwin'
  },
  {
    'ensure' => 'present',
    'name' => '1.64-x86_64-apple-darwin'
  },
  . . .
  {
    'ensure' => 'present',
    'name' => '1.69.0-x86_64-apple-darwin'
  }] to [
  {
    'ensure' => 'present',
    'name' => 'nightly-aarch64-apple-darwin',
    'profile' => 'default'
  }]
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain stable-x86_64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain nightly-x86_64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain 1.64-x86_64-apple-darwin
. . .
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup target list --installed --toolchain 1.69.0-x86_64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup toolchain install --no-self-update --force-non-host --profile default nightly-aarch64-apple-darwin
Info: rustup_internal: as daniel: /Users/daniel/.cargo/bin/rustup toolchain list
Notice: Applied catalog in 13.45 seconds
❯ file ~/.rustup/toolchains/nightly-*-apple-darwin/bin/cargo
/Users/daniel/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/cargo: Mach-O 64-bit executable arm64
/Users/daniel/.rustup/toolchains/nightly-x86_64-apple-darwin/bin/cargo:  Mach-O 64-bit executable x86_64

@zbentley
Copy link
Contributor Author

zbentley commented Feb 8, 2024

My output looks exactly like yours when I run puppet apply, but the issue is the cargo binary being used. The ones in my toolchain directories do match the toolchain's arch. However, the cargo in ~/.cargo/bin/cargo matches the arch of the program that installed it, and that's the cargo on my PATH (is that bad? Should I change that?).

If I install it via puppet apply, it's ARM. If I do it via bolt, it's x86, because bolt ix x86. Unfortunately, bolt seems to be swallowing debug output related to RUSTUP_TRACE; here's my command and output:

> RUSTUP_TRACE=1 bolt apply --verbose --log-level debug --target localhost --run-as root --execute 'rustup { "zac": toolchains => ["nightly-aarch64-apple-darwin"], targets => ["aarch64-apple-darwin"] }'
<snip, lots of facts>
Finished: task apply_helpers::custom_facts with 0 failures in 12.62 sec
Finished: install puppet and gather facts with 0 failures in 12.89 sec
Finished: install puppet and gather facts with 0 failures in 12.89 sec
Starting: apply catalog on localhost
Starting: apply catalog on localhost
Applying manifest block on ["localhost"]
Running task 'apply_helpers::apply_catalog' on localhost
Started on localhost...
[{"target":"localhost","action":"apply","object":null,"status":"success","value":{"report":{"host":"atropos.local","time":"2024-02-08T10:57:32.795239000-05:00","configuration_version":1707407840,"transaction_uuid":null,"report_format":12,"puppet_version":"7.26.0","status":"changed","transaction_completed":true,"noop":false,"noop_pending":false,"environment":null,"logs":[{"level":"warning","message":"Private key for 'atropos.local' does not exist","source":"Puppet","tags":["warning"],"time":"2024-02-08T10:57:32.846257000-05:00","file":null,"line":null},{"level":"warning","message":"Client certificate for 'atropos.local' does not exist","source":"Puppet","tags":["warning"],"time":"2024-02-08T10:57:32.846321000-05:00","file":null,"line":null},{"level":"notice","message":"created","source":"/Stage[main]/Main/Rustup[zac]/Rustup_internal[zac]/ensure","tags":["notice","rustup_internal","zac","rustup","class"],"time":"2024-02-08T10:57:34.107290000-05:00","file":"/Users/zac/Desktop/Projects/Personal/zbox/.modules/rustFinished on localhost:
up/manifests/init.pp","line":81},{"level":"notice","message":"Applied catalog in 13.23 seconds","source":"Puppet","tags":["notice"],"time":"2024-02-08T10:57:46.068321000-05:00","file":null,"line":null}],"metrics":{"resources":{"name":"resources","label":"Resources","values":[["total","Total",1],["skipped","Skipped",0],["failed","Failed",0],["failed_to_restart","Failed to restart",0],["restarted","Restarted",0],["changed","Changed",1],["out_of_sync","Out of sync",1],["scheduled","Scheduled",0],["corrective_change","Corrective change",0]]},"time":{"name":"time","label":"Time","values":[["rustup_internal","Rustup internal",13.223539],["config_retrieval","Config retrieval",0],["transaction_evaluation","Transaction evaluation",13.226376999984495],["catalog_application","Catalog application",13.231105000013486],["total","Total",13.273145]]},"changes":{"name":"changes","label":"Changes","values":[["total","Total",1]]},"events":{"name":"events","label":"Events","values":[["total","Tot  Notice: /Stage[main]/Main/Rustup[zac]/Rustup_internal[zac]/ensure: created
al",1],["failure","Failure",0],["success","Success",1]]}},"resource_statuses":{"Rustup_internal[zac]":{"title":"zac","file":"/Users/zac/Desktop/Projects/Personal/zbox/.modules/rustup/manifests/init.pp","line":81,"resource":"Rustup_internal[zac]","resource_type":"Rustup_internal","provider_used":"default","containment_path":["Stage[main]","Main","Rustup[zac]","Rustup_internal[zac]"],"evaluation_time":13.223539,"tags":["rustup_internal","zac","rustup","class"],"time":"2024-02-08T10:57:32.839099000-05:00","failed":false,"failed_to_restart":false,"changed":true,"out_of_sync":true,"skipped":false,"change_count":1,"out_of_sync_count":1,"events":[{"audited":false,"property":"ensure","previous_value":"absent","desired_value":"present","historical_value":null,"message":"created","name":"rustup_internal_created","status":"success","time":"2024-02-08T10:57:32.839219000-05:00","redacted":null,"corrective_change":false}],"corrective_change":false}},"corrective_change":false,"catalog_uuid":"1d5863e1-6e89-4578-80ff-3f9caa22af36","cached_catalog_status":"not_used"},"_output":"changed: 1, failed: 0, unchanged: 0 skipped: 0, noop: 0","_sensitive":"Sensitive [value redacted]"}}]
  changed: 1, failed: 0, unchanged: 0 skipped: 0, noop: 0
Finished: apply catalog with 0 failures in 26.74 sec
Finished: apply catalog with 0 failures in 26.74 sec
Successful on 1 target: localhost
Ran on 1 target in 39.67 sec

This is easier to reproduce without Puppet entirely. If I brew install rustup, I get an ARM rustup-init binary:

∴ file $(which rustup-init)
/opt/homebrew/bin/rustup-init: Mach-O 64-bit executable arm6

If I run rustup-init from an ARM shell (default answers), once it's done I have an ARM ~/.cargo/bin/cargo:

zac@atropos ~ ∴ which cargo
/Users/zac/.cargo/bin/cargo
zac@atropos ~ ∴ file $(which cargo)
/Users/zac/.cargo/bin/cargo: Mach-O 64-bit executable arm64

But if I run it via this module in an x86 bolt, things malfunction.

@zbentley
Copy link
Contributor Author

zbentley commented Feb 8, 2024

The linked PR does seem to work, and always installs the requisite base/bootstrap toolchain as the native architecture.

@danielparks
Copy link
Owner

Great, thanks! Merged, with only a bit of struggling with CI.

I imagine that fixes the problem with the wrappers being the wrong arch, but I would not have expected this to fix the issue where ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* are x86_64 instead of ARM. Can you confirm that?

@zbentley
Copy link
Contributor Author

I imagine that fixes the problem with the wrappers being the wrong arch, but I would not have expected this to fix the issue where ~/.rustup/toolchains/nightly-aarch64-apple-darwin/bin/* are x86_64 instead of ARM. Can you confirm that?

Sorry, I could have been clearer in the earlier comment: the issue was only ever with ~/.cargo/bin/cargo; the architectures in ~/.rustup/toolchains were always the correct ones for their respective toolchains. If that should change, I can apply a similar fix to their bootstrap code, just let me know!

@danielparks
Copy link
Owner

Ah, okay. That makes a lot more sense.

@danielparks
Copy link
Owner

I’ve just cut a release that includes this. Do you think anything else needs to be done before closing this?

Thanks again for report and the PR!

@zbentley
Copy link
Contributor Author

Just tested the new release, and my test succeeded on my laptop, but failed on a blank MacOS VM.

That's entirely my fault. My facter environment downcases the OS family for .... frankly silly reasons (which routinely cause me issues) that I should undo. The code in the PR I submitted is a no-op on all Macs except mine.

The fix is simple and will be linked right after this comment; upcase the "d" in "Darwin".

So sorry, I should have tested on a fresh install before I sent that in and made you spend the time.

Please don't feel pressured to cut a second release any time soon; this is a fairly rare issue and I'm happy to pin to master until whenever's convenient for you.

@danielparks
Copy link
Owner

Ha ha, no problem. I should have caught that in review.

@zbentley
Copy link
Contributor Author

Tested on main just now and this is confirmed fix. Sorry for all the back-n-forth, and thanks for authoring and maintaining this module!

@danielparks
Copy link
Owner

Ok, I cut a new release with an additional fix for macOS (in rustup::global).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants