Commit 3846387
fix(tracing): de-flake CI by guarding async snapshot vs sync flushSync race
CI run 25143783932 (commit ee73863, agent_outcome cherry-pick) failed
with `flushSync — crash recovery > flushSync preserves all accumulated
data` after passing locally and on prior + subsequent runs. Root cause
analysis surfaced a real race in the tracer:
The `snapshot()` method of the `Recap` class (tracing.ts) writes
asynchronously via `fs.writeFile(tmpPath) → fs.rename(tmpPath, filePath)`
for atomicity, storing the in-flight promise in `this.snapshotPromise`
but never awaiting it. `flushSync()` writes synchronously to the SAME
`filePath` via `fsSync.writeFileSync`. On slow CI runners with disk
contention, a snapshot that was in flight when flushSync ran could
complete its `fs.rename` AFTER flushSync's sync write, clobbering the
crashed-trace file with stale (non-crashed) content. The test then read
the stale content and failed assertions like
`summary.status === "crashed"`.
Fix:
- Add a `private crashed = false` flag on the Recap class.
- `flushSync()` sets `this.crashed = true` BEFORE its sync write so any
  in-flight async snapshot() can detect the takeover.
- `snapshot()` checks `this.crashed` at two points:
  - Entry: bail before starting a new write.
  - Just before `fs.rename`: if a flushSync ran during the write, drop
    the temp file and skip the rename (so the crashed file stays
    canonical).
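
The guard described above can be sketched roughly as follows. This is a
simplified illustration, not the actual tracing.ts code: `RecapSketch`,
`demo`, the temp-file naming, and the JSON payload are all hypothetical,
and the sketch returns the snapshot promise (instead of storing it in
`this.snapshotPromise`) only so the demo can wait for it.

```typescript
import { promises as fs } from "node:fs";
import * as fsSync from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical, simplified stand-in for the Recap class in tracing.ts.
class RecapSketch {
  private crashed = false;
  private tmpCounter = 0;

  constructor(private readonly filePath: string) {}

  // Async atomic write: write a temp file, then rename it onto filePath.
  snapshot(status: string): Promise<void> {
    if (this.crashed) return Promise.resolve(); // guard 1: entry check
    const tmpPath = `${this.filePath}.${this.tmpCounter++}.tmp`;
    return (async () => {
      await fs.writeFile(tmpPath, JSON.stringify({ status }));
      if (this.crashed) {
        // guard 2: flushSync ran while the write was in flight, so
        // drop the temp file and leave the crashed file untouched.
        await fs.unlink(tmpPath).catch(() => {});
        return;
      }
      await fs.rename(tmpPath, this.filePath);
    })();
  }

  // Synchronous crash-path write to the same filePath.
  flushSync(): void {
    this.crashed = true; // set BEFORE the sync write
    fsSync.writeFileSync(this.filePath, JSON.stringify({ status: "crashed" }));
  }
}

// Starts a snapshot, lets flushSync take over, then reads the result.
async function demo(): Promise<string> {
  const dir = fsSync.mkdtempSync(path.join(os.tmpdir(), "recap-sketch-"));
  const filePath = path.join(dir, "trace.json");
  const recap = new RecapSketch(filePath);
  const inFlight = recap.snapshot("running"); // async write begins
  recap.flushSync(); // crash path runs while the snapshot is in flight
  await inFlight; // snapshot sees the flag and skips its rename
  return JSON.parse(fsSync.readFileSync(filePath, "utf8")).status;
}
```

In this sketch, `demo()` calls `flushSync()` before the temp-file write
can resolve, so the second check fires and the file keeps the crashed
status.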
Why not just await snapshotPromise in flushSync: flushSync is invoked
from SIGINT/SIGTERM/beforeExit handlers where async code may not run.
The flag-based guard makes the race deterministic without depending on
event-loop progression after the signal.
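
As an illustration of that constraint, a synchronous flush can be wired
to the exit paths roughly like this. The helper name
`installExitFlush`, the `flush` callback (standing in for
`flushSync()`), and the exit code are assumptions made for this sketch;
the actual handler wiring in the codebase is not shown in this commit.

```typescript
// Hypothetical helper: `flush` stands in for a synchronous flushSync().
// Returns a function that removes the installed handlers again.
function installExitFlush(flush: () => void): () => void {
  let done = false;
  const flushOnce = (): void => {
    if (done) return; // flush at most once across all exit paths
    done = true;
    flush(); // synchronous: finished before the handler returns
  };
  const onSignal = (): void => {
    flushOnce();
    // No awaited work here: nothing after the signal is relied upon.
    process.exit(1);
  };
  const onBeforeExit = (): void => flushOnce();

  process.on("SIGINT", onSignal);
  process.on("SIGTERM", onSignal);
  process.on("beforeExit", onBeforeExit);

  return () => {
    process.off("SIGINT", onSignal);
    process.off("SIGTERM", onSignal);
    process.off("beforeExit", onBeforeExit);
  };
}
```

Because the handler does its work before returning, it does not depend
on any promise continuation being scheduled after the signal arrives.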
Verification:
- The flaky test now passes 10/10 times locally.
- Broader regression run: 3554/3554 tests pass across
test/altimate/ + test/upstream/ + test/branding/.
- Single-file run: 27/27 pass in tracing-display-crash.test.ts every time.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 3aaf064
1 file changed
Lines changed: 19 additions & 2 deletions
[Diff body not captured. Changed regions, by new line number: 331-335
added; 753 added; 761-763 replace old line 755; 766-772 replace old
line 758; 965-967 added.]