server: @simlin/server Jest suite intermittently fails one test under parallel pre-commit load #635

## Summary

The `@simlin/server` Jest suite (`src/server`) intermittently fails **one** test when run under the parallel load of the pre-commit hook. A single pre-commit run reported `1 failed, 72 passed` for `@simlin/server`, but re-running the suite standalone passed cleanly `73/73` (confirmed twice). This is a pre-existing flake: the change that triggered the failed pre-commit run was a Rust-only change to `src/simlin-engine/src/layout/metrics.rs`, which does not touch the server, so the failure almost certainly comes from the test (or its sensitivity to load) rather than from the diff under test.

## Why it matters

This undermines the reliability of the pre-commit gate (`scripts/pre-commit`). The hook runs many checks in parallel (Rust fmt/clippy/tests, WASM build, TS lint/typecheck/tests, Python tests -- see root `CLAUDE.md` "Pre-commit Hooks"), and under that contention one server test flips red. The consequence is a spurious red on a branch that is actually green: a developer can be blocked (or worse, conditioned to distrust the gate) by a failure that has nothing to do with their change. It is a developer-experience and CI-trust problem, not a product correctness bug, but it erodes the value of the canonical gate.

This is the same *class* of problem as the recently filed #629 (pre-commit Rust pipeline spuriously failing on a cold cache due to parallel `clippy --all-features` + capped `cargo test` contending on the package-cache lock), but in a different component: that one is the Rust pipeline / cargo package-cache lock; this one is the `@simlin/server` Jest suite. It is also distinct from #474 (an order-dependent/flaky *Rust engine* test) and from the pysimlin Hypothesis health-check flake tracked in `docs/tech-debt.md` item 19.

## Component affected

- `src/server` -- the `@simlin/server` Jest test suite (7 test files; see `src/server/CLAUDE.md`).
- Surfaces through `scripts/pre-commit` (and by extension CI, if the same suite runs there under parallel load).

## How it reproduces / what's known

- Symptom: under parallel pre-commit load, `1 failed, 72 passed`; a standalone re-run passes `73/73` (passed cleanly twice in isolation).
- The specific failing test has **not** been identified yet: either the parallel run did not print which of the 73 tests flaked, or that output was not captured.
- Likely root-cause families for a "fails under load, passes in isolation" flake: test-isolation / shared mutable fixture state, a hard-coded port or other shared OS resource, a timer/`setTimeout`-driven async race, or resource contention (CPU starvation pushing an implicit timeout over the edge when the box is busy with the rest of the pre-commit run).
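
To make the "shared OS resource" family concrete, here is a minimal sketch of one way such a collision is usually avoided. It assumes, without evidence from the failing run, that some server test writes to a fixed temp path; `makeScratchDir`, `removeScratchDir`, and the `simlin-server-test-` prefix are hypothetical names, not code from `src/server`. Two Jest workers that share one fixed path can overwrite each other's files under parallel load, whereas a directory created with `fs.mkdtempSync` is unique per call.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Collision-prone pattern (assumed, for contrast): every test file uses
//   path.join(os.tmpdir(), "simlin-server-test.json")
// so two parallel workers can clobber each other's data.

// Hypothetical helper: create a fresh scratch directory for one test.
// mkdtempSync appends random characters to the prefix, so concurrent
// callers receive distinct directories.
function makeScratchDir(prefix: string = "simlin-server-test-"): string {
  return fs.mkdtempSync(path.join(os.tmpdir(), prefix));
}

// Hypothetical helper: remove the scratch directory and its contents.
function removeScratchDir(dir: string): void {
  fs.rmSync(dir, { recursive: true, force: true });
}
```

In a Jest file these helpers would typically be called from `beforeEach` and `afterEach`, so that no path outlives the test that created it.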

## Possible approaches for resolution

1. **Identify the flaky test.** Run the server suite repeatedly under load to reproduce. Useful levers:
   - Run with reduced/maximal worker parallelism to bracket the behavior, e.g. `pnpm --filter @simlin/server test -- --runInBand` (serial) vs. the default parallel run, and a high `--maxWorkers` while the machine is otherwise busy.
   - Loop the suite (e.g. a shell `for` loop, or Jest with a repeat) while a CPU/IO load generator runs in parallel to mimic the rest of the pre-commit hook.
   - Capture the failing test name and its error/stack the first time it trips.
2. **Find the root cause** once the test is known: look for shared module-level fixture state not reset between tests, a fixed port / temp path collision, real timers where fake timers would be safer, an unawaited promise, or an implicit timeout that's too tight under contention.
3. **Make it deterministic:** isolate the shared state (per-test fixtures / `beforeEach` reset), bind to an ephemeral port (`:0`) or unique temp paths, use fake timers / explicit `await`s instead of wall-clock waits, and/or relax overly tight timeouts. Prefer fixing the root cause over papering it with retries.
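
The loop in step 1 could be sketched as a small Node script. This is only an illustration: `runUntilFailure` and `LoopResult` are hypothetical names, the script does not generate load itself (a separate load generator or a concurrent pre-commit run would be needed), and the test command has to be supplied by the caller. It re-runs a command until the first non-zero exit and keeps that run's output, so the failing test name and stack are captured the first time the flake trips.

```typescript
import { spawnSync } from "node:child_process";

// Result of looping a command: how many runs happened, and the combined
// output of the first failing run (null if every run passed).
interface LoopResult {
  runs: number;
  failureOutput: string | null;
}

// Hypothetical helper: run `command args...` up to maxRuns times and stop
// at the first run that does not exit with status 0.
function runUntilFailure(command: string, args: string[], maxRuns: number): LoopResult {
  for (let run = 1; run <= maxRuns; run++) {
    const result = spawnSync(command, args, {
      encoding: "utf8",
      maxBuffer: 64 * 1024 * 1024, // leave room for verbose test output
    });
    if (result.status !== 0) {
      // status is null when the process could not be spawned or was killed;
      // include the spawn error message in that case.
      const spawnError = result.error ? `${result.error.message}\n` : "";
      const output = `${spawnError}${result.stdout ?? ""}${result.stderr ?? ""}`;
      return { runs: run, failureOutput: output };
    }
  }
  return { runs: maxRuns, failureOutput: null };
}
```

For example, `runUntilFailure("pnpm", ["--filter", "@simlin/server", "test"], 50)` would loop the suite using the command quoted in step 1; writing `failureOutput` to a file preserves the evidence for step 2.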

## Context

Discovered during the layout-quality-eval work (branch `layout-quality-eval`). The triggering pre-commit run was for an unrelated Rust-only change to `src/simlin-engine/src/layout/metrics.rs`; the server flake is therefore pre-existing and not caused by that change.
