fix: harden E2E tests against CI timing flakiness #787
Merged
alishakawaguchi merged 3 commits into main on Mar 26, 2026
… bugs

Timing fixes (legitimate; CI is slower than local):
- WaitForCheckpoint/WaitForCheckpointAdvanceFrom: 15s → 30s everywhere
- AssertNoShadowBranches → WaitForNoShadowBranches(10s) with polling (shadow branch cleanup is async, needs time to complete)
- opencode StartSession: startup timeout 15s → 30s (TUI render + settle)
- TestPartialStaging: per-prompt timeout 90s → 2m (cursor agent slowness)

Real bugs documented (not worked around):
- Mid-turn commits don't get checkpoint trailers for headless agents
- Shadow branches left orphaned after carry-forward
- Agent-internal files (.opencode/) included in files_touched

See e2e/OPENCODE_BUGS.md for full details.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: c75d6c04967c
Pull request overview
Harden the E2E suite against CI timing variability by increasing checkpoint/prompt timeouts and replacing brittle immediate assertions with polling, while documenting known OpenCode-specific CLI bugs uncovered by E2E runs.
Changes:
- Increase checkpoint wait timeouts (15s → 30s) across E2E tests and add per-prompt timeout overrides where needed.
- Replace `AssertNoShadowBranches` with polling-based `WaitForNoShadowBranches(10s)` to tolerate async cleanup lag.
- Increase OpenCode TUI startup wait (15s → 30s) and add `e2e/OPENCODE_BUGS.md` documenting observed issues.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| e2e/testutil/assertions.go | Adds WaitForNoShadowBranches polling helper to reduce shadow-branch cleanup flakes. |
| e2e/tests/subagent_commit_flow_test.go | Uses longer checkpoint wait + polls for shadow-branch cleanup. |
| e2e/tests/stash_workflows_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup (with added rationale comment). |
| e2e/tests/split_commits_test.go | Uses longer checkpoint waits, polls for cleanup, and increases per-prompt timeout via agents.WithPromptTimeout. |
| e2e/tests/single_session_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/session_lifecycle_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/rewind_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/resume_test.go | Uses longer checkpoint waits for resume flows. |
| e2e/tests/resume_remote_test.go | Uses longer checkpoint waits for remote resume flows. |
| e2e/tests/multi_session_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/mid_turn_commit_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/interactive_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/external_agent_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/existing_files_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/edge_cases_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/deleted_files_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/checkpoint_metadata_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/tests/attribution_test.go | Uses longer checkpoint waits + polls for shadow-branch cleanup. |
| e2e/agents/opencode.go | Increases OpenCode TUI readiness wait to reduce CI startup flakiness. |
| e2e/OPENCODE_BUGS.md | Documents OpenCode-driven failures as real CLI bugs (no test workarounds). |
- Remove unused AssertNoShadowBranches (all callers migrated)
- WaitForNoShadowBranches: use require.Emptyf to fail fast, include timeout in error message for easier debugging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Entire-Checkpoint: a4a48ee818e0
bugbot run
Summary
- Timing fixes: `WaitForCheckpoint`/`WaitForCheckpointAdvanceFrom` 15s → 30s, `AssertNoShadowBranches` → polling `WaitForNoShadowBranches(10s)`, opencode startup timeout 15s → 30s, cursor per-prompt timeout 90s → 2m
- Real bugs documented in `e2e/OPENCODE_BUGS.md`, exposed by opencode E2E tests (mid-turn checkpoint trailers, orphaned shadow branches, agent-internal files in files_touched)

What this does NOT do
These are timing-only fixes. The 3 real opencode bugs documented in `OPENCODE_BUGS.md` are not worked around; the tests will continue to fail for opencode until the CLI bugs are fixed.

Test plan
- `mise run fmt && mise run lint`: passes
- `mise run test:e2e:canary`: passes

🤖 Generated with Claude Code
Note
Low Risk
Low risk: changes are confined to E2E harness/test timing and assertions, with no impact on production code paths. Main risk is masking real regressions by allowing slower completion or delayed cleanup in CI.
Overview
Hardens E2E tests against CI timing variance by increasing checkpoint-related waits (typically 15s → 30s) and extending OpenCode TUI startup readiness waiting.

Replaces the instant `AssertNoShadowBranches` check with a polling `WaitForNoShadowBranches` helper to tolerate asynchronous shadow-branch deletion after condensation, and adjusts a couple of prompts to use a longer per-prompt timeout.

Written by Cursor Bugbot for commit 601f139.