test(flake): scoped per-test event-loop yield in the 6 dying spec files#7854
test(flake): scoped per-test event-loop yield in the 6 dying spec files#7854JohnMcLear wants to merge 1 commit into
Conversation
Qodo reviews are paused for this user.Troubleshooting steps vary by plan Learn more → On a Teams plan? Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center? |
Review Summary by QodoAdd per-test event-loop yield to six flaky spec files
WalkthroughsDescription• Add per-test event-loop yield via setImmediate in beforeEach • Target six spec files with Windows silent-ELIFECYCLE flake • Break tight microtask chains causing event-loop starvation • Scoped per-file to avoid breaking plugin state-sharing Diagramflowchart LR
A["Six spec files<br/>with ELIFECYCLE flake"] -->|"Add beforeEach<br/>with setImmediate yield"| B["Event loop<br/>drains timer queue<br/>between tests"]
B -->|"Breaks microtask<br/>chains"| C["Reduces event-loop<br/>starvation on Windows"]
File Changes1. src/tests/backend/specs/api/importexportGetPost.ts
|
Code Review by Qodo
1. Duplicated yield hook
|
… to etherpad's own backend specs PR #7854's first iteration added the yield to 6 known-dying spec files (pad.ts, importexportGetPost.ts, socketio.ts, messages.ts, import.ts, clientvar_rev_consistency.ts). Linux backend matrix passed, proving the yield doesn't break the affected tests' own state-sharing assumptions. But the very next Win+plugins run captured **death #13 in sessionsAndGroups.ts**, a 7th file outside the scoped fix. The flake migrated rather than being suppressed. That's strong evidence the trigger is the rapid-sequential-test pattern in general, not specific files. Replace the per-file scope with a root-level `mochaHooks.beforeEach` yield in diagnostics.ts, gated on a file-path check: yield for ether/etherpad's own specs in `tests/backend/specs/`, SKIP for plugin tests loaded from `../node_modules/ep_*/static/tests/backend/specs/`. The plugin-test skip exists because PR #7844 demonstrated that an unconditional global yield breaks `ep_subscript_and_superscript`'s `returns HTML with Subscript HTML tags` series — those plugin tests share state across describe-block boundaries and don't tolerate any microtask reordering. The file-path check preserves PR #7844's finding without re-breaking those tests. Files modified: - src/tests/backend/diagnostics.ts: root beforeEach yield, scoped Per-file changes from the previous commit are reverted — root scope supersedes them and there's no point yielding twice per test. Test plan unchanged from the original PR: - Linux ± plugins must pass. - Windows ± plugins flake rate: ~22% pre-fix. Post-fix, run the CI 5-10x and compare. If unchanged, cadence is ruled out as the trigger and we look at per-test pathologies (jose CNG on Windows, libuv IOCP edge cases unrelated to load). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1bf8a75 to
83e1d37
Compare
CI Feedback 🧐A test triggered by this PR failed. Here is an AI-generated analysis of the failure:
|
|
Closing — the broader-scope yield (root mochaHooks.beforeEach, scoped to etherpad's own specs via file-path check) still didn't prevent the silent ELIFECYCLE. Captured death #14 on this run: ``` That file IS covered by the root yield. So cadence / rapid-sequential test pattern is NOT the trigger. Combined with PR #7852's result, this rules out two of the most plausible causal hypotheses (TIME_WAIT, cadence). Remaining candidates need either local Windows repro or out-of-process probing beyond what we've already added (#7846's netstat sidecar). In-CI diagnosis has reached diminishing returns. Linux ± plugins did pass cleanly — the file-path-scoped yield is safe even though it doesn't fix anything. Could be revived as a chore PR if the test-infra value alone is worth it, but I'm not going to ship "scoped yield with no measurable benefit" as a fix. |
Summary
Adds
beforeEach(async () => await new Promise(r => setImmediate(r)))at the top-level describe of the six spec files that cluster in the silent-ELIFECYCLE captures. Pattern-wide intervention against rapid-sequential test cadence as the suspected trigger.Why
12 captured silent-ELIFECYCLE deaths (#7838, #7842, #7846 diagnostics) all hit the same six files:
importexportGetPost.tssocketio.tsmessages.tspad.ts > Testsimport.tsclientvar_rev_consistency.tsAll six share a shape: 50–100 short tests per top-level describe, each test making rapid loopback HTTP (supertest) or socket.io connect/disconnect calls. Pre-kill node-reports show V8 main isolate going event-loop-starved for 200–400 ms before each kill (5 Hz heartbeat falls silent), then external termination.
The TIME_WAIT smoking gun captured by #7846 was ruled out as causal by #7852 (closed): keepAlive collapsed TIME_WAIT churn to zero, kill survived anyway. The remaining shared substrate is just cadence.
What this changes
One
beforeEachper file, inserted at the top-leveldescribe(__filename, ...). The yield forces a timer-queue drain between every test in those files, breaking the tight microtask chain that may be the trigger.Critical scope note: per-file beforeEach, NOT root
mochaHooks.beforeEach. PR #7844 tried the root-level variant and brokeep_subscript_and_superscript'sreturns HTML with Subscript HTML tagstests, which share state across describe-block boundaries. Per-file scope contains any timing perturbation to the affected files — plugin tests loaded from../node_modules/ep_*/static/tests/backend/specsare completely untouched.Expected outcomes
Either result is data we don't currently have.
Locally verified
Not feasible — local mocha against this project needs full bootstrap. Relying on CI.
🤖 Generated with Claude Code