Summary
The @simlin/server Jest suite (src/server) intermittently fails one test when run under the parallel load of the pre-commit hook. A single pre-commit run reported 1 failed, 72 passed for @simlin/server, but re-running the suite standalone passed cleanly 73/73 (confirmed twice). This is a pre-existing flake: the change that triggered the failed pre-commit run was a Rust-only change to src/simlin-engine/src/layout/metrics.rs, completely unrelated to the server, so the test itself was almost certainly the cause rather than the diff under test.
Why it matters
This undermines the reliability of the pre-commit gate (scripts/pre-commit). The hook runs many checks in parallel (Rust fmt/clippy/tests, WASM build, TS lint/typecheck/tests, Python tests -- see root CLAUDE.md "Pre-commit Hooks"), and under that contention one server test flips red. The consequence is a spurious red on a branch that is actually green: a developer can be blocked (or worse, conditioned to distrust the gate) by a failure that has nothing to do with their change. It is a developer-experience and CI-trust problem, not a product correctness bug, but it erodes the value of the canonical gate.
This is the same class of problem as the recently filed #629 (pre-commit Rust pipeline spuriously failing on a cold cache due to parallel clippy --all-features + capped cargo test contending on the package-cache lock), but in a different component: #629 concerns the Rust pipeline and the cargo package-cache lock, whereas this issue concerns the @simlin/server Jest suite. It is also distinct from #474 (an order-dependent/flaky Rust engine test) and from the pysimlin Hypothesis health-check flake tracked in docs/tech-debt.md item 19.
Component affected
- src/server -- the @simlin/server Jest test suite (7 test files; see src/server/CLAUDE.md).
- Surfaces through scripts/pre-commit (and by extension CI, if the same suite runs there under parallel load).
How it reproduces / what's known
- Symptom: under parallel pre-commit load, 1 failed, 72 passed; standalone re-run passes 73/73 (reproduced-clean twice in isolation).
- The specific failing test has not been identified yet: either the parallel run did not surface which of the 73 tests flaked, or that output was not captured.
- Likely root-cause families for a "fails under load, passes in isolation" flake: test-isolation / shared mutable fixture state, a hard-coded port or other shared OS resource, a timer/setTimeout-driven async race, or resource contention (CPU starvation pushing an implicit timeout over the edge when the box is busy with the rest of the pre-commit run).
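To make the last family concrete, here is a minimal sketch of how a wall-clock deadline can flip from pass to fail purely because the process was starved. Nothing here is taken from the server suite: sleep, twoStepWork, withDeadline, blockEventLoop, and attempt are hypothetical names, and a synchronous busy-wait stands in for the CPU contention of the pre-commit run.

```typescript
// Hypothetical helpers for illustration; none of these names come from the server suite.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Stand-in for an async operation under test: two short waits, about 20 ms in total.
async function twoStepWork(): Promise<string> {
  await sleep(10);
  await sleep(10);
  return "done";
}

// Race the work against a wall-clock deadline, much as an implicit test timeout would.
function withDeadline<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, deadline]).finally(() => clearTimeout(timer));
}

// Simulate starvation by blocking the event loop synchronously for `ms` milliseconds.
function blockEventLoop(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // busy-wait on purpose
  }
}

// stallMs = 0 models an idle machine; a stall longer than the 200 ms deadline
// models a machine that is busy with the rest of the pre-commit run.
async function attempt(stallMs: number): Promise<string> {
  const pending = withDeadline(twoStepWork(), 200);
  blockEventLoop(stallMs);
  return pending.then(
    () => "passed",
    (err: Error) => `failed: ${err.message}`,
  );
}
```

The work itself never gets slower in this sketch; only its second step is scheduled late, so the deadline timer fires first. That is why this kind of test tends to pass in isolation and trip only when the machine is contended.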
Possible approaches for resolution
- Identify the flaky test. Run the server suite repeatedly under load to reproduce. Useful levers:
  - Run with reduced/maximal worker parallelism to bracket the behavior, e.g. pnpm --filter @simlin/server test -- --runInBand (serial) vs. the default parallel run, and a high --maxWorkers while the machine is otherwise busy.
  - Loop the suite (e.g. a shell for loop, or Jest with a repeat) while a CPU/IO load generator runs in parallel to mimic the rest of the pre-commit hook.
  - Capture the failing test name and its error/stack the first time it trips.
- Find the root cause once the test is known: look for shared module-level fixture state not reset between tests, a fixed port / temp path collision, a real timer vs. fake timers, an unawaited promise, or an implicit timeout that's too tight under contention.
- Make it deterministic: isolate the shared state (per-test fixtures / beforeEach reset), bind to an ephemeral port (:0) or unique temp paths, use fake timers / explicit awaits instead of wall-clock waits, and/or relax overly tight timeouts. Prefer fixing the root cause over papering over it with retries.
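The "loop the suite and capture the first failure" step could be scripted roughly as follows. This is a minimal sketch, not something that exists in the repository: runUntilFailure and LoopResult are hypothetical names, and it assumes the load generator (for example, the rest of the pre-commit hook) is started separately.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape: which attempt failed (if any) and what it printed.
interface LoopResult {
  failedOnAttempt: number | null;
  output: string;
}

// Run a command repeatedly and stop at the first non-zero exit, keeping its
// combined stdout/stderr so the failing test name and stack are not lost.
function runUntilFailure(
  command: string,
  args: string[],
  maxAttempts: number,
): LoopResult {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const run = spawnSync(command, args, {
      encoding: "utf8",
      maxBuffer: 64 * 1024 * 1024, // test output can be large
    });
    if (run.error) {
      throw run.error; // the command could not be started at all
    }
    if (run.status !== 0) {
      return { failedOnAttempt: attempt, output: `${run.stdout}${run.stderr}` };
    }
  }
  return { failedOnAttempt: null, output: "" };
}

// Example invocation (not executed here), using the command from this issue:
//   const result = runUntilFailure("pnpm", ["--filter", "@simlin/server", "test"], 50);
//   if (result.failedOnAttempt !== null) console.log(result.output);
```

Stopping at the first failure rather than counting failures keeps the captured output small and points straight at the test that tripped.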
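If the root cause turns out to be a fixed port, binding to port 0 lets the OS pick a free one for each test. The sketch below assumes the tests start a plain Node http server, which this issue does not confirm; listenOnEphemeralPort and closeServer are hypothetical helper names.

```typescript
import { createServer, Server } from "node:http";
import type { AddressInfo } from "node:net";

// Listen on port 0 so the OS assigns a free port, then report which one it chose.
function listenOnEphemeralPort(server: Server): Promise<number> {
  return new Promise((resolve, reject) => {
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => {
      // For a TCP server that is listening, address() returns an AddressInfo object.
      resolve((server.address() as AddressInfo).port);
    });
  });
}

// Close the server and wait until it has actually released the port.
function closeServer(server: Server): Promise<void> {
  return new Promise((resolve, reject) => {
    server.close((err) => (err ? reject(err) : resolve()));
  });
}
```

In a Jest file these would typically sit in beforeEach/afterEach, with the test building its request URL from the returned port instead of a constant.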
Context
Discovered during the layout-quality-eval work (branch layout-quality-eval). The triggering pre-commit run was for an unrelated Rust-only change to src/simlin-engine/src/layout/metrics.rs; the server flake is therefore pre-existing and not caused by that change.