Symptom
Multiple integration/e2e tests reliably time out at the 5000 ms default when run as part of the full `bun test` suite under contention. They all share the same root cause: real-e2e operations (subprocess spawns, workspace materialisation, pool acquisition) whose wall-clock under suite load exceeds Bun's 5s default per-test timeout.
Same bug class as #1169 (fixed in #1170 for `pipeline-e2e.test.ts`) and the `pipeline input` block (fixed in #1176). New occurrences keep surfacing as more PRs get pushed — every push attempt this week has hit a different file.
Known offending tests (as of 2026-04-27)
`packages/core/test/evaluation/orchestrator.test.ts` (and related workspace test files):
- `WorkspacePoolManager > slot acquisition > throws when all slots are locked`
- `RepoManager > materializeAll > materializes multiple repos`
- workspace lifecycle tests
- `--workspace` flag tests
`apps/cli/test/`:
- `eval.integration.test.ts` — multiple cases
- `commands/results/serve.test.ts`
- `pipeline grade — builtin assertions` tests
- `agentv eval assert > exits 0 when grader returns ...`
- `agentv eval assert > exits 1 when grader returns ...`
- `trend command > ...`
This list is non-exhaustive — any test that spawns subprocesses or materialises workspaces is a candidate. Recommend a sweep rather than filing per-file issues.
Why this matters
`validate.yml` (CI) does not run `bun test` — but the local prek pre-push hook does, and it is the only test gate before push. PRs #1167, #1168, #1174, #1175 all required `--no-verify` bypass. PR #1176 needed seven push attempts before catching a contention-free run. That undermines the safety the hook is supposed to provide and trains contributors to ignore it.
Suggested fix
Same one-liner pattern as #1170 / #1176: bump per-test timeout to 30000 ms using Bun's numeric third-arg form:
```ts
it('test name', async () => { ... }, 30_000);
```
(The files import from `'vitest'` but are run by `bun test` — the numeric form works; the vitest options-object does not.)
When all tests in a `describe` block share the same risk profile, prefer setting it once at the describe level.
Approach
Recommend one sweep PR that walks the repo, identifies every test using `execa`, subprocess spawning, or `materializeAll`/`WorkspacePoolManager`, and applies the 30s timeout uniformly. The list above is a starting point; the sweep should grep more broadly:
```bash
grep -rn 'execa|materializeAll|WorkspacePoolManager' apps/cli/test packages/core/test
```
This is preferable to per-file PRs because:
- Each new flake costs a PR and a contention bypass.
- The fix is mechanical with no architectural risk.
- Future contributors stop hitting the bypass-needed pattern.
Handoff context
Symptom
Multiple integration/e2e tests reliably time out at the 5000 ms default when run as part of the full `bun test` suite under contention. They all share the same root cause: real-e2e operations (subprocess spawns, workspace materialisation, pool acquisition) whose wall-clock under suite load exceeds Bun's 5s default per-test timeout.
Same bug class as #1169 (fixed in #1170 for `pipeline-e2e.test.ts`) and the `pipeline input` block (fixed in #1176). New occurrences keep surfacing as more PRs get pushed — every push attempt this week has hit a different file.
Known offending tests (as of 2026-04-27)
`packages/core/test/evaluation/orchestrator.test.ts` (and related workspace test files):
`apps/cli/test/`:
This list is non-exhaustive — any test that spawns subprocesses or materialises workspaces is a candidate. Recommend a sweep rather than filing per-file issues.
Why this matters
`validate.yml` (CI) does not run `bun test` — but the local prek pre-push hook does, and it is the only test gate before push. PRs #1167, #1168, #1174, #1175 all required `--no-verify` bypass. PR #1176 needed seven push attempts before catching a contention-free run. That undermines the safety the hook is supposed to provide and trains contributors to ignore it.
Suggested fix
Same one-liner pattern as #1170 / #1176: bump per-test timeout to 30000 ms using Bun's numeric third-arg form:
```ts
it('test name', async () => { ... }, 30_000);
```
(The files import from `'vitest'` but are run by `bun test` — the numeric form works; the vitest options-object does not.)
When all tests in a `describe` block share the same risk profile, prefer setting it once at the describe level.
Approach
Recommend one sweep PR that walks the repo, identifies every test using `execa`, subprocess spawning, or `materializeAll`/`WorkspacePoolManager`, and applies the 30s timeout uniformly. The list above is a starting point; the sweep should grep more broadly:
```bash
grep -rn 'execa|materializeAll|WorkspacePoolManager' apps/cli/test packages/core/test
```
This is preferable to per-file PRs because:
Handoff context
<<:) in eval loader #1174 push attempt). Broadened on 2026-04-27 after PR fix(test): raise input.test.ts pipeline timeouts to 30s #1176 surfaced the wider list.