Hello! 👋
I was trying to run an omicron-nexus test locally on my macbook and hit a wrinkle so I decided to report the problem and offer a PR with a fix. I don't think this is urgent since it only affects running a subset of tests on macOS.
Also, this is my first time trying to get things to run (and also contributing), so please let me know if I misunderstood or misexplained anything. 🙏
TLDR version
On startup, Dendrite tries to infer the P4 directory and, as part of that, runs path-shape validation checks on the dpd binary path. This path is obtained via std::env::current_exe(), which has platform-specific behavior when the executable is invoked through a symlink: some platforms return the symlink path, while others return the symlink target. Omicron sets up Dendrite via a symlinked directory, and on macOS std::env::current_exe() returns the symlink path (not the target), so the checks run against the wrong path and Dendrite fails to start.
Context
I wanted to see if I could work on a random TODO from the codebase. After spinning the roulette wheel, I landed on this nexus test TODO. As a first step, I wanted to get the closest test to run locally on my macbook (with macOS Sequoia 15.7.5). After setting up all the dependencies, I ran:
> cargo nextest run -p omicron-nexus --test test_all test_local_users
and observed this output:
info: experimental features enabled: setup-scripts, benchmarks
Finished `test` profile [unoptimized + debuginfo] target(s) in 2.22s
Finished `test` profile [unoptimized + debuginfo] target(s) in 1.31s
Running `target/debug/crdb-seed`
May 02 10:19:16.051 INFO Using existing CRDB seed tarball: `/var/folders/b3/1nh9mpbx7g7d38ygchdkt2nr0000gn/T/crdb-base-izuzak/990110b8072bc48e04d1498a031bd5a9299dacec66e5450fac2a9ba1a75193c9.tar`
SETUP PASS [ 1.801s] crdb-seed: cargo run -p crdb-seed --profile test
FAIL [ 6.363s] omicron-nexus::test_all integration_tests::password_login::test_local_users
stdout ───
running 1 test
test integration_tests::password_login::test_local_users ... FAILED
failures:
failures:
integration_tests::password_login::test_local_users
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 560 filtered out; finished in 6.29s
...
thread 'integration_tests::password_login::test_local_users' (38538025) panicked at nexus/test-utils/src/starter.rs:435:10:
called `Result::unwrap()` on an `Err` value: failed to discover dendrite port from files in /var/folders/b3/1nh9mpbx7g7d38ygchdkt2nr0000gn/T/.tmp5d88DD
Caused by:
dpd exited with status signal: 6 (SIGABRT) before port could be discovered:
thread 'main' (38538468) panicked at dpd/src/main.rs:272:23:
unable to initialize bf: P4Missing("/Users/izuzak/Developer/omicron/out/dendrite-stub expected to end with dendrite")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
...
Investigation notes
As far as I can tell, that expectation regarding the path ending in dendrite comes from here in the dendrite repo, which is called from here with the path obtained here. So, on startup, dpd tries to infer the P4 directory (unless it's defined via ENV) and as a part of that it does some path-shape validations on the dpd binary location which it obtains from std::env::current_exe().
Okay, that's one side of the equation. The other side is that omicron-nexus tests are set up to run dpd using Command::new("dpd"). Effectively, this executes: dpd run --listen-addresses .... Since only the dpd binary name is used, dpd is searched for in PATH. env.sh adds out/dendrite-stub/bin to the PATH and that directory is a symlink to out/dendrite-stub/root/opt/oxide/dendrite/bin, created as a part of install_builder_prerequisites.sh. So, when omicron-nexus tests start Dendrite, the dpd binary is executed from a symlinked directory.
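To make the PATH lookup step concrete, here is a rough sketch of how a bare program name gets turned into a full path. This is a simplified approximation of what the OS does for Command::new("dpd"), not code from Omicron or Dendrite: find_in_path is a hypothetical helper, and it skips details such as checking the execute permission.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Hypothetical helper: returns the first `<dir>/<program>` that exists as a
/// file, scanning the entries of `path_var` in order.
/// Simplified on purpose: it does not check the execute permission bit.
fn find_in_path(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|candidate| candidate.is_file())
}

fn main() {
    match env::var_os("PATH").and_then(|p| find_in_path("dpd", &p)) {
        Some(path) => println!("dpd would be launched as {}", path.display()),
        None => println!("dpd not found on PATH"),
    }
}
```

The point is that the result is built by joining the PATH entry with the program name, with no symlink resolution. If the matching PATH entry is out/dendrite-stub/bin, the launch path still goes through the symlinked directory.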
And the final piece in all of this is that std::env::current_exe() has platform-specific behavior when the executable is invoked through a symlink: some platforms return the symlink path, while others return the symlink target. I see that this platform-specific behavior was brought up in this old rust-lang issue where it was decided to address this through documentation.
So, the problem seems to happen due to a combination of a few things:
- Dendrite uses std::env::current_exe() and performs path-shape checks on the resulting path.
- std::env::current_exe() returns the symlink path on macOS instead of the symlink target (at least on macOS 15.7.5 🤷).
- Omicron exposes dpd through a symlinked bin directory.
Because of that, Dendrite ends up checking the symlinked path out/dendrite-stub/bin/dpd instead of the target out/dendrite-stub/root/opt/oxide/dendrite/bin/dpd and fails with the message above.
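Here is a minimal sketch of why the two paths give different results. It is my guess at the shape of the check based on the error message, not Dendrite's actual code: dendrite_root is a hypothetical function that expects the binary to live at <something>/dendrite/bin/dpd.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the path-shape check: expects
/// `<...>/dendrite/bin/dpd` and returns the `<...>/dendrite` directory.
fn dendrite_root(exe: &Path) -> Result<PathBuf, String> {
    let bin_dir = exe
        .parent()
        .ok_or_else(|| format!("{} has no parent", exe.display()))?;
    let root = bin_dir
        .parent()
        .ok_or_else(|| format!("{} has no parent", bin_dir.display()))?;
    // Path::ends_with compares whole components, so a directory named
    // "dendrite-stub" does not count as ending with "dendrite".
    if root.ends_with("dendrite") {
        Ok(root.to_path_buf())
    } else {
        Err(format!("{} expected to end with dendrite", root.display()))
    }
}

fn main() {
    for exe in [
        "out/dendrite-stub/bin/dpd",
        "out/dendrite-stub/root/opt/oxide/dendrite/bin/dpd",
    ] {
        match dendrite_root(Path::new(exe)) {
            Ok(root) => println!("ok: dendrite root is {}", root.display()),
            Err(msg) => println!("error: {msg}"),
        }
    }
}
```

The symlinked path fails the check because two levels up from dpd is out/dendrite-stub, while the target path passes because two levels up is a directory named dendrite.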
Ideas for fixing this
I can think of a few ways to make this problem go away, but updating Dendrite to resolve symlinks before performing path-shape checks on the dpd executable path feels like the cleanest/simplest thing to do. std::fs::canonicalize() can do this so I'll open a PR with that approach. I worked around the problem locally (so that I can run the tests) by prepending the symlink target directory to PATH.
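A rough sketch of the canonicalize approach, under stated assumptions: has_dendrite_shape, resolved_exe and make_stub_layout are hypothetical names, the check is my approximation of the real one, and the layout is a throwaway copy in a temp directory. In dpd the input path would come from std::env::current_exe() instead. The sketch is Unix-only because it uses std::os::unix::fs::symlink.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Hypothetical approximation of the path-shape check: the directory two
/// levels above the executable must be named "dendrite".
fn has_dendrite_shape(exe: &Path) -> bool {
    exe.parent()
        .and_then(Path::parent)
        .map_or(false, |root| root.ends_with("dendrite"))
}

/// Resolves all symlinks in `exe` before any shape checks run.
fn resolved_exe(exe: &Path) -> io::Result<PathBuf> {
    fs::canonicalize(exe)
}

/// Builds a throwaway layout with a real `root/opt/oxide/dendrite/bin/dpd`
/// and a `bin` symlink pointing at that directory, then returns the path
/// to `dpd` as seen through the symlink.
fn make_stub_layout(base: &Path) -> io::Result<PathBuf> {
    let real_bin = base.join("root/opt/oxide/dendrite/bin");
    fs::create_dir_all(&real_bin)?;
    fs::write(real_bin.join("dpd"), b"")?;
    symlink(&real_bin, base.join("bin"))?;
    Ok(base.join("bin/dpd"))
}

fn main() -> io::Result<()> {
    let base = std::env::temp_dir().join(format!("dendrite-stub-{}", std::process::id()));
    let via_symlink = make_stub_layout(&base)?;
    let resolved = resolved_exe(&via_symlink)?;
    println!(
        "{} -> check passes: {}",
        via_symlink.display(),
        has_dendrite_shape(&via_symlink)
    );
    println!(
        "{} -> check passes: {}",
        resolved.display(),
        has_dendrite_shape(&resolved)
    );
    fs::remove_dir_all(&base)?;
    Ok(())
}
```

One thing to keep in mind with this approach: canonicalize returns an error if the path does not exist, so the caller has to decide how to handle that case.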
I also noticed that std::env::current_exe() is used in a few other places across oxide repositories. There aren't too many of them, so it should be possible to manually check whether any other usages have a similar problem.