fix(test): raise pipeline-e2e timeout to 30s (#1169) #1170

Merged: christso merged 1 commit into main from fix/1169-pipeline-e2e-timeout on Apr 27, 2026

@christso (Collaborator)

Closes #1169

Problem

`apps/cli/test/commands/eval/pipeline/pipeline-e2e.test.ts` (`eval pipeline e2e > runs full input → grade → bench pipeline`) flakes under suite contention from `bun --filter agentv test`, timing out at the 5000 ms per-test default. The test passes in isolation in ~7-12s. It is a real end-to-end test that spawns three sequential `bun apps/cli/src/cli.ts pipeline ...` subprocesses (input → grade → bench), so 5s is too tight when the suite is busy.

Fix

Raise the per-test timeout to 30 s using Bun's `it(name, fn, timeout)` form, where the numeric third argument is the timeout in milliseconds. This is a one-line change with no scope creep. The pipeline plumbing assertion ("the wiring works") is preserved as-is. The fixture is not trimmed, because it is already minimal and the timeout bump alone fully solves the flake.
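
For illustration, the third-argument form can be sketched as follows. This is a minimal sketch, not the actual diff: `Registrar`, `PIPELINE_E2E_TIMEOUT_MS`, `registerPipelineE2E`, and `runPipeline` are hypothetical names, and the registrar is passed in as a parameter so the snippet stays self-contained outside the test runner.

```typescript
// Minimal structural type for a test registrar such as `it` from "bun:test".
// (Assumption: only the name, body, and numeric timeout are modelled here.)
type TestBody = () => void | Promise<void>;
type Registrar = (name: string, body: TestBody, timeoutMs?: number) => void;

// Hypothetical constant name; 30 s is the value chosen in this PR.
const PIPELINE_E2E_TIMEOUT_MS = 30_000;

// Registers the e2e test with an explicit per-test timeout as the third argument.
// `runPipeline` is a placeholder for the real test body (input → grade → bench).
function registerPipelineE2E(it: Registrar, runPipeline: TestBody): void {
  it("runs full input → grade → bench pipeline", runPipeline, PIPELINE_E2E_TIMEOUT_MS);
}
```

In a real test file the registrar would presumably be the `it` imported from `bun:test`, and the call would sit directly inside the existing `describe` block rather than behind a helper.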

Red / green evidence

  • Red (before): the suite-context run produced `(fail) eval pipeline e2e > runs full input → grade → bench pipeline ... ^ this test timed out after 5000ms.`
  • Green (after):
    • Isolated: `bun test apps/cli/test/commands/eval/pipeline/pipeline-e2e.test.ts` reported 1 pass, 0 fail in ~12 s.
    • Suite context: `bun --filter agentv test` ran multiple times; the pipeline-e2e test no longer appears in any failure output. The pre-push hook (which runs the full suite) passed cleanly on the green push.

Notes

  • A few neighbouring pipeline input tests in `apps/cli/test/commands/eval/pipeline/input.test.ts` exhibit the same 5s-timeout-under-contention pattern and flake intermittently. They are out of scope for #1169 (test: pipeline-e2e flake at 5000ms default timeout) and would benefit from the same treatment in a follow-up.

…ntion flake (#1169)

The `eval pipeline e2e > runs full input → grade → bench pipeline` test
spawns three sequential `bun apps/cli/src/cli.ts pipeline ...` subprocesses
(input → grade → bench). In isolation it completes in ~7-12s, but under
suite contention from `bun --filter agentv test` it routinely overshoots
the 5000ms default per-test timeout and times out.

Bump the per-test timeout to 30s using `it(name, fn, timeout)` (Bun's
test runner supports the numeric third-arg form). Other suite tests are
not modified — only the test cited in the issue.

Closes #1169

@cloudflare-workers-and-pages

Deploying agentv with Cloudflare Pages

Latest commit: c50d372
Status: ✅  Deploy successful!
Preview URL: https://e0da385c.agentv.pages.dev
Branch Preview URL: https://fix-1169-pipeline-e2e-timeou.agentv.pages.dev

@christso christso merged commit 65944a8 into main Apr 27, 2026
4 checks passed
@christso christso deleted the fix/1169-pipeline-e2e-timeout branch April 27, 2026 10:08
christso added a commit that referenced this pull request Apr 27, 2026
…1170) (#1176)

The `pipeline input` describe block in
`apps/cli/test/commands/eval/pipeline/input.test.ts` exhibits the same
5s-default timeout flake under suite contention as the e2e test fixed in
#1170. Each test spawns a `bun apps/cli/src/cli.ts pipeline input ...`
subprocess that completes in ~1-2s in isolation but routinely overshoots
5s under `bun --filter agentv test` contention.

Apply the same per-test timeout bump (`it(name, fn, 30_000)`) to all 10
tests in the describe block. The user brief named three sibling tests
(`writes code_graders/<name>.json for deterministic assertions`,
`omits experiment from manifest...`, `falls back to eval file
basename...`); the remaining seven exhibited identical SIGTERM-at-5s
failures in the same suite run, so the bump is applied uniformly to
prevent partial coverage churn.

Same one-line treatment per the #1170 precedent.
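
The commit above adds the third argument to each of the 10 tests individually. As a hedged alternative sketch (not what the commit does), a uniform bump could also be expressed with a small wrapper around the registrar. `SpecRegistrar` and `withDefaultTimeout` are hypothetical names introduced only for this illustration.

```typescript
// Minimal structural type for a test registrar such as `it` from "bun:test".
// (Assumption: only the name, body, and numeric timeout are modelled here.)
type SpecBody = () => void | Promise<void>;
type SpecRegistrar = (name: string, body: SpecBody, timeoutMs?: number) => void;

// Returns a registrar that applies `defaultMs` unless a test passes its own timeout.
function withDefaultTimeout(it: SpecRegistrar, defaultMs: number): SpecRegistrar {
  return (name, body, timeoutMs) => it(name, body, timeoutMs ?? defaultMs);
}
```

The per-call form used in the commit keeps each timeout visible at the test site, which matches the #1170 precedent. A wrapper like this trades that visibility for less repetition.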