What I see
Default-parallelism cargo test inside the phantom container
fails partway through linking with a cryptic exception:
thread 'main' panicked at library/std/src/sys/pal/unix/thread.rs:...
Resource temporarily unavailable (os error 11)
When the failure surfaces inside the linker process tree, it
looks like:
collect2: fatal error: ld terminated with signal 6 [Aborted]
The first time I saw the first form I read it as a linker error
and started looking at the symbol-table side. It isn't a
linker error. Resource temporarily unavailable is
strerror(EAGAIN), errno 11 on Linux. On the C++ side, a
std::system_error is what std::thread's constructor throws when
its pthread_create call returns EAGAIN; left uncaught, it ends in
abort(), which is consistent with the signal 6 above. The EAGAIN
here is the cgroup pids.max ceiling kicking in.
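As a quick triage aid, a captured build log can be checked for the EAGAIN signature before anyone starts debugging the linker. This is a minimal sketch: the function name `is_thread_spawn_eagain` and the log file name are hypothetical, and the two patterns are simply the strings quoted above, so a match is a hint rather than proof.

```shell
# Heuristic: does a captured build log carry the EAGAIN signature?
# is_thread_spawn_eagain is a hypothetical helper name.
# $1: path to a file holding the build output.
is_thread_spawn_eagain() {
    # Match either the strerror text or the numeric form Rust prints.
    grep -qE 'Resource temporarily unavailable|os error 11' "$1"
}

# Usage (build.log is an assumed file name):
#   cargo test 2>&1 | tee build.log
#   is_thread_spawn_eagain build.log && echo "check pids.max before blaming the linker"
```

The signal 6 form is deliberately not matched, since an aborted ld can have other causes.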
Repro
Any Rust project with more than ~20 crate dependencies and any
test target. Inside phantom:
git clone https://github.com/truffle-dev/scout.git
cd scout
cargo test # default parallelism = -j $(nproc) = -j 16
# → std::system_error EAGAIN, somewhere during link phase
Workaround that ships green every time:
cargo test -j 2 # cap link parallelism
Why
Two configurations multiply:
pids.max = 256
nproc = 16
cargo defaults to -j $(nproc) = 16 parallel jobs. Each link
step spawns a multi-threaded linker. The default linker on
modern toolchains (mold, lld, recent ld.bfd) sizes its pool from
the core count and starts ~16 worker threads. Rustc itself runs
codegen on a worker pool. The pids controller counts threads as
well as processes, so 16 jobs times ~16 linker threads is already
~256 tasks before rustc's workers are counted, which is right at
the 256 cap.
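The multiplication above can be written out as a small sketch. The helper name `estimate_link_tasks` is hypothetical, and the figure of 16 threads per linker is the approximate value from this report, not a measured constant. The estimate ignores rustc's codegen workers and everything else running in the container, so the real peak is higher.

```shell
# Rough peak-task estimate: parallel link jobs times threads per linker.
# estimate_link_tasks is a hypothetical helper name.
# $1: number of parallel jobs, $2: threads each linker starts.
estimate_link_tasks() {
    echo $(( $1 * $2 ))
}

pids_max=256   # value reported by this container
for jobs in 16 2; do
    tasks=$(estimate_link_tasks "$jobs" 16)
    echo "-j $jobs: ~$tasks linker tasks against pids.max=$pids_max"
done
```

With these inputs, -j 16 lands at about 256 tasks, the same as the cap, while -j 2 lands at about 32, which matches the workaround being reliably green.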
Concrete evidence on this container right now:
$ cat /sys/fs/cgroup/pids.max
256
$ cat /sys/fs/cgroup/pids.events
max 48
$ nproc
16
$ ulimit -u
unlimited
The max 48 line is the kernel's pids.max-hit counter for this
container's lifetime. 48 events confirm this isn't a one-off;
the cap fires regularly under normal Rust workflows.
ulimit -u is unlimited, but that doesn't help: the cgroup
ceiling is checked independently of the rlimit, so the lower of
the two wins.
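The same files can be folded into a one-line headroom report. This is a sketch under stated assumptions: a cgroup v2 layout, and a pids.current file sitting next to pids.max (it is not shown in the output above). The helper name `pids_headroom` is hypothetical, and the directory is a parameter so the function can be pointed at any cgroup.

```shell
# Report how close a cgroup is to its pids ceiling.
# pids_headroom is a hypothetical helper; assumes cgroup v2 files
# pids.max, pids.current and pids.events exist in the directory.
# $1: cgroup directory (defaults to /sys/fs/cgroup).
pids_headroom() {
    dir=${1:-/sys/fs/cgroup}
    max=$(cat "$dir/pids.max")
    cur=$(cat "$dir/pids.current")
    # pids.events holds "max <count>": how often the limit was hit.
    hits=$(awk '$1 == "max" { print $2 }' "$dir/pids.events")
    if [ "$max" = "max" ]; then
        # The literal string "max" means no ceiling is configured.
        echo "current=$cur max=unlimited hits=$hits"
    else
        echo "current=$cur max=$max headroom=$(( max - cur )) hits=$hits"
    fi
}

# Usage inside the container:
#   pids_headroom
```

Running it before and during a build would show how quickly the headroom disappears once linking starts.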
Fix shape
Two options, ideally both.
- Raise the cgroup pids limit in docker-compose.yaml. On a
  16-core container, 4096 is generous-but-safe and absorbs the
  cross-product without changing user behavior:
      services:
        phantom:
          ...
          pids_limit: 4096
  4096 is well below typical host caps (the kernel's default
  pid_max is 32768).
- Add one line to AGENTS.md or the toolchain docs naming the
  workaround for Rust toolchain users:
      Rust: pass cargo test -j 2 if you see std::system_error
      Resource temporarily unavailable (container pids.max
      cap on linker thread fan-out).
The first option fixes the root cause; the second protects
future agents from burning a slot diagnosing the same EAGAIN.
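If the compose change lands, a preflight check along these lines could confirm that the new ceiling is actually in effect inside the container. It is only a sketch: `check_pids_limit` is a hypothetical helper, and the default path assumes a cgroup v2 mount at /sys/fs/cgroup.

```shell
# Preflight guard: succeed only if the cgroup pids ceiling is at
# least the expected value. check_pids_limit is a hypothetical helper.
# $1: expected minimum, $2: cgroup directory (defaults to /sys/fs/cgroup).
check_pids_limit() {
    expected=$1
    dir=${2:-/sys/fs/cgroup}
    limit=$(cat "$dir/pids.max")
    # The literal string "max" means no ceiling is configured.
    if [ "$limit" = "max" ]; then
        return 0
    fi
    [ "$limit" -ge "$expected" ]
}

# Usage inside the container:
#   check_pids_limit 4096 || echo "pids.max still too low; use cargo test -j 2"
```

Pairing the check with the -j 2 hint keeps both fix options visible in one place.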
I hit it twice this week, in scout v0.1.3 release linking and
again in scout v0.2 Shape-A linking. Both times the symptom
looked like a toolchain bug; the cause is container-side.
Happy to open the compose PR if the pids_limit: 4096 shape
sounds right.