chore(timeouts): loosen aggressive defaults causing canceled steps #1589
Merged
Real-world signal: impl-issue runs on the Wave codebase that exercise `go test ./...` routinely take 6-10 minutes; the prior 5-minute StepDefault forced a "canceled" failure class even when the step was making clear progress (caught during Phase 1 dispatch on #1577 + #1578).

Bumps:
- StepDefault 5m → 30m
- RelayCompaction 5m → 15m
- MetaDefault 30m → 60m
- SkillInstall/CLI/HTTP 2m → 5m
- SkillHTTPHeader 30s → 60s
- SkillPublish 30s → 60s
- GatePollTimeout 30m → 60m
- GitCommand 30s → 90s
- ForgeAPI 15s → 60s
- ForgeAPIList 30s → 90s

Pipeline yaml agent_review contract timeouts (60/90/120s) → 300s across audit-*, impl-issue, ops-pr-*: agent reviews on multi-file diffs need real read time and the prior bound was the dominant "contract validation failed" cause in Phase 1.

All values stay overridable per-step in pipeline yaml or per-runtime in wave.yaml; this only loosens the fallback floor.
nextlevelshit added a commit that referenced this pull request on Apr 30, 2026
Per memory feedback_defaults_agnostic: internal/defaults/ ships language-agnostic content. Hardcoding *_test.go + 'func Test*' patterns into the shipped impl-issue.yaml breaks Wave for Node/Python/Rust users. Test-deletion guard stays in .agents/ (Wave's own project config). Other projects opt-in via their own .agents/pipelines/impl-issue.yaml. Also reverts the embedfs/ timeout regression that re-introduced 120s/90s contract bounds (PR #1589 already bumped them to 300s).
nextlevelshit added a commit that referenced this pull request on Apr 30, 2026
…1596)

* feat(pipeline): reject net test deletions in impl-issue via llm_judge

  Adds a navigator step `judge-test-deletion` between `implement` and `create-pr` in both `.agents/pipelines/impl-issue.yaml` and the embedfs default. The step captures the diff of `_test.go` files into `.agents/output/test-diff.md`, then an `llm_judge` contract evaluates a single binary criterion: net removal of `func Test*` declarations is rejected unless replaced. On failure the step reworks via the existing `fix-implement` step.

  Mirrors the failure mode reproduced in run `impl-issue-20260429-221616-d8e4` on issue #1580.

  Regression coverage in `internal/contract/llm_judge_test.go` exercises the three diff shapes the gate must distinguish: deletion-only (FAIL), addition-only (PASS), replacement (PASS).

  Related to #1582

* fix(defaults): embed judge-test-deletion prompt in shipped defaults

  The shipped impl-issue.yaml references .agents/prompts/implement/judge-test-deletion.md, which lived only in the project's local .agents tree. TestShippedPipelines_ValidateAll caught the gap by walking the embedfs and verifying every source_path resolves. Mirror the prompt under internal/defaults/embedfs/prompts/implement/ so the default works for users who haven't run wave init.

* fix(scope): revert defaults regression - guard belongs in .agents/ only

  Per memory feedback_defaults_agnostic: internal/defaults/ ships language-agnostic content. Hardcoding *_test.go + 'func Test*' patterns into the shipped impl-issue.yaml breaks Wave for Node/Python/Rust users. Test-deletion guard stays in .agents/ (Wave's own project config). Other projects opt-in via their own .agents/pipelines/impl-issue.yaml.

  Also reverts the embedfs/ timeout regression that re-introduced 120s/90s contract bounds (PR #1589 already bumped them to 300s).

* fix(defaults): also revert impl-issue.yaml - embedfs untouched, project-only guard
Summary
During Phase 1 dispatch (#1577 + #1578), the impl-issue pipeline terminated steps with the `canceled` failure class even when the step was making clear progress. The cause was a 5-minute `StepDefault`, which is shorter than the 6-10 minutes that `go test ./...` routinely takes on this codebase.
Changes
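Constants in `internal/timeouts/timeouts.go`:
- StepDefault 5m → 30m
- RelayCompaction 5m → 15m
- MetaDefault 30m → 60m
- SkillInstall/CLI/HTTP 2m → 5m
- SkillHTTPHeader 30s → 60s
- SkillPublish 30s → 60s
- GatePollTimeout 30m → 60m
- GitCommand 30s → 90s
- ForgeAPI 15s → 60s
- ForgeAPIList 30s → 90s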
Pipeline yaml agent_review contract timeouts, all 60/90/120s → 300s, across:
- audit-*
- impl-issue
- ops-pr-*
Why
Agent reviews on multi-file diffs need real read time. 90s was the dominant "contract validation failed" cause in Phase 1 runs. Step defaults that fight the rest of the system mask real failures behind `canceled` noise.
All values are still overridable per-step in pipeline yaml or per-runtime in wave.yaml; this only loosens the fallback floor.
Test plan
Related