chore(timeouts): loosen aggressive defaults causing canceled steps #1589

Merged: nextlevelshit merged 2 commits into main from chore/loosen-timeouts on Apr 29, 2026
Conversation

@nextlevelshit
Collaborator

Summary

During Phase 1 dispatch (#1577 + #1578), the impl-issue pipeline terminated steps with the `canceled` failure class even when the step was making clear progress. The cause was a 5-minute `StepDefault`, which is shorter than `go test ./...` typically takes on this codebase.
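The failure mode can be reproduced in miniature. The sketch below is not Wave's step runner; `runStep`, `classify`, and the scaled-down durations are hypothetical names and values chosen for illustration, and the mapping from a deadline error to the `canceled` class is an assumption about how the pipeline labels it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Scaled-down stand-ins: milliseconds play the role of minutes.
const (
	tightLimit   = 5 * time.Millisecond   // like the old 5m StepDefault
	looseLimit   = 300 * time.Millisecond // like the new 30m StepDefault
	workDuration = 80 * time.Millisecond  // like a 6-10 minute `go test ./...`
)

// runStep simulates a step that needs `work` to finish but is bounded by `limit`.
// It returns nil on completion, or the context error if the deadline fires first.
func runStep(limit, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// classify maps a step error to a failure-class label.
// Treating a deadline as "canceled" is an assumption made for this sketch.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, context.DeadlineExceeded), errors.Is(err, context.Canceled):
		return "canceled"
	default:
		return "failed"
	}
}

func main() {
	fmt.Println(classify(runStep(tightLimit, workDuration))) // canceled
	fmt.Println(classify(runStep(looseLimit, workDuration))) // ok
}
```

The step does the same work in both calls; only the bound differs, which is why a too-tight default shows up as `canceled` rather than as a real failure.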

Changes

Constants in `internal/timeouts/timeouts.go`:

| Constant | Old | New |
|---|---|---|
| StepDefault | 5m | 30m |
| RelayCompaction | 5m | 15m |
| MetaDefault | 30m | 60m |
| SkillInstall/CLI/HTTP | 2m | 5m |
| SkillHTTPHeader | 30s | 60s |
| SkillPublish | 30s | 60s |
| GatePollTimeout | 30m | 60m |
| GitCommand | 30s | 90s |
| ForgeAPI | 15s | 60s |
| ForgeAPIList | 30s | 90s |

Pipeline yaml agent_review contract timeouts — all 60/90/120s → 300s across:

  • audit-architecture.yaml, audit-issue.yaml, audit-security.yaml, audit-tests.yaml
  • impl-issue.yaml, impl-issue-core.yaml
  • ops-pr-respond.yaml, ops-pr-review.yaml

Why

Agent reviews on multi-file diffs need real read time. 90s was the dominant "contract validation failed" cause in Phase 1 runs. Step defaults that fight the rest of the system mask real failures behind `canceled` noise.

All values still overridable per-step in pipeline yaml or per-runtime in wave.yaml; this only loosens the fallback floor.
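A minimal sketch of how such a fallback could resolve, assuming a zero duration means "not set". The function name `effectiveTimeout`, the example override values, and the step-before-runtime ordering are all assumptions for illustration; the PR only states that both override levels exist.

```go
package main

import (
	"fmt"
	"time"
)

// StepDefault mirrors the new fallback value from this PR.
const StepDefault = 30 * time.Minute

// Hypothetical override values used only by this example.
const (
	exampleStepOverride    = 10 * time.Minute
	exampleRuntimeOverride = 45 * time.Minute
)

// effectiveTimeout returns the first non-zero value among the per-step
// override, the per-runtime override, and the package fallback.
// The step-before-runtime precedence is an assumption, not confirmed by the PR.
func effectiveTimeout(stepOverride, runtimeOverride time.Duration) time.Duration {
	if stepOverride > 0 {
		return stepOverride
	}
	if runtimeOverride > 0 {
		return runtimeOverride
	}
	return StepDefault
}

func main() {
	fmt.Println(effectiveTimeout(0, 0))                                        // 30m0s
	fmt.Println(effectiveTimeout(0, exampleRuntimeOverride))                   // 45m0s
	fmt.Println(effectiveTimeout(exampleStepOverride, exampleRuntimeOverride)) // 10m0s
}
```

Under this reading, raising `StepDefault` changes behavior only for steps that set neither override.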

Test plan

  • `go test ./internal/timeouts/...` green
  • `go build ./...` clean
  • Future impl-issue dispatches no longer canceled at the 5-min mark

Related

Real-world signal: impl-issue runs on the Wave codebase that exercise
`go test ./...` routinely take 6-10 minutes; the prior 5-minute
StepDefault forced a "canceled" failure class even when the step was
making clear progress (caught during Phase 1 dispatch on #1577 + #1578).

Bumps:

- StepDefault           5m → 30m
- RelayCompaction       5m → 15m
- MetaDefault          30m → 60m
- SkillInstall/CLI/HTTP 2m → 5m
- SkillHTTPHeader      30s → 60s
- SkillPublish         30s → 60s
- GatePollTimeout      30m → 60m
- GitCommand           30s → 90s
- ForgeAPI             15s → 60s
- ForgeAPIList         30s → 90s

Pipeline yaml agent_review contract timeouts (60/90/120s) → 300s
across audit-*, impl-issue, ops-pr-* — agent reviews on multi-file
diffs need real read time and the prior bound was the dominant
"contract validation failed" cause in Phase 1.

All values stay overridable per-step in pipeline yaml or per-runtime
in wave.yaml; this only loosens the fallback floor.
@nextlevelshit nextlevelshit merged commit bf39994 into main Apr 29, 2026
10 checks passed
@nextlevelshit nextlevelshit deleted the chore/loosen-timeouts branch April 29, 2026 23:59
nextlevelshit added a commit that referenced this pull request Apr 30, 2026
Per memory feedback_defaults_agnostic: internal/defaults/ ships
language-agnostic content. Hardcoding *_test.go + 'func Test*' patterns
into the shipped impl-issue.yaml breaks Wave for Node/Python/Rust users.

Test-deletion guard stays in .agents/ (Wave's own project config).
Other projects opt-in via their own .agents/pipelines/impl-issue.yaml.

Also reverts the embedfs/ timeout regression that re-introduced
120s/90s contract bounds (PR #1589 already bumped them to 300s).
nextlevelshit added a commit that referenced this pull request Apr 30, 2026
…1596)

* feat(pipeline): reject net test deletions in impl-issue via llm_judge

Adds a navigator step `judge-test-deletion` between `implement` and
`create-pr` in both `.agents/pipelines/impl-issue.yaml` and the embedfs
default. The step captures the diff of `_test.go` files into
`.agents/output/test-diff.md`, then an `llm_judge` contract evaluates a
single binary criterion: net removal of `func Test*` declarations is
rejected unless replaced. On failure the step reworks via the existing
`fix-implement` step. Mirrors the failure mode reproduced in run
`impl-issue-20260429-221616-d8e4` on issue #1580.

Regression coverage in `internal/contract/llm_judge_test.go` exercises
the three diff shapes the gate must distinguish: deletion-only (FAIL),
addition-only (PASS), replacement (PASS).
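
The three diff shapes can be illustrated with a deterministic stand-in. This is not the `llm_judge` contract, which evaluates the criterion with a model; `netTestDeletion` and the sample diffs are hypothetical, and the sketch only counts added versus removed `func Test*` declarations.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical sample diffs, one per shape named in the commit message.
const (
	deletionOnly = "--- a/x_test.go\n+++ b/x_test.go\n-func TestParse(t *testing.T) {\n-}\n"
	additionOnly = "--- a/x_test.go\n+++ b/x_test.go\n+func TestParse(t *testing.T) {\n+}\n"
	replacement  = "--- a/x_test.go\n+++ b/x_test.go\n-func TestParse(t *testing.T) {\n-}\n+func TestParseTable(t *testing.T) {\n+}\n"
)

// netTestDeletion reports whether a unified diff removes more `func Test*`
// declarations than it adds. It is a line-count heuristic, not a judge:
// it cannot tell whether a replacement actually covers the removed test.
func netTestDeletion(diff string) bool {
	removed, added := 0, 0
	for _, line := range strings.Split(diff, "\n") {
		switch {
		case strings.HasPrefix(line, "---"), strings.HasPrefix(line, "+++"):
			// File headers, not content lines.
		case strings.HasPrefix(line, "-func Test"):
			removed++
		case strings.HasPrefix(line, "+func Test"):
			added++
		}
	}
	return removed > added
}

func main() {
	fmt.Println(netTestDeletion(deletionOnly)) // true  (FAIL)
	fmt.Println(netTestDeletion(additionOnly)) // false (PASS)
	fmt.Println(netTestDeletion(replacement))  // false (PASS)
}
```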

Related to #1582

* fix(defaults): embed judge-test-deletion prompt in shipped defaults

The shipped impl-issue.yaml references
.agents/prompts/implement/judge-test-deletion.md, which lived only in the
project's local .agents tree. TestShippedPipelines_ValidateAll caught the
gap by walking the embedfs and verifying every source_path resolves.
Mirror the prompt under internal/defaults/embedfs/prompts/implement/ so
the default works for users who haven't run wave init.
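
A check of this kind can be sketched over any `fs.FS`. The code below is not `TestShippedPipelines_ValidateAll`; `missingSourcePaths`, `sampleFS`, the line-based YAML scan, and the assumption that each `source_path` resolves relative to the filesystem root are all hypothetical simplifications.

```go
package main

import (
	"fmt"
	"io/fs"
	"strings"
	"testing/fstest"
)

// missingSourcePaths walks fsys, scans every .yaml file for lines of the
// form `source_path: <path>`, and returns the referenced paths that do not
// exist in fsys. A real validator would parse the YAML instead of scanning lines.
func missingSourcePaths(fsys fs.FS) ([]string, error) {
	var missing []string
	err := fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() || !strings.HasSuffix(path, ".yaml") {
			return nil
		}
		data, err := fs.ReadFile(fsys, path)
		if err != nil {
			return err
		}
		for _, line := range strings.Split(string(data), "\n") {
			rest, ok := strings.CutPrefix(strings.TrimSpace(line), "source_path:")
			if !ok {
				continue
			}
			ref := strings.TrimSpace(rest)
			if _, err := fs.Stat(fsys, ref); err != nil {
				missing = append(missing, ref)
			}
		}
		return nil
	})
	return missing, err
}

// sampleFS builds an in-memory tree with one pipeline that references a
// prompt; withPrompt controls whether the prompt file is actually present.
func sampleFS(withPrompt bool) fstest.MapFS {
	fsys := fstest.MapFS{
		"pipelines/impl-issue.yaml": {Data: []byte(
			"steps:\n  - id: judge-test-deletion\n    source_path: prompts/implement/judge-test-deletion.md\n")},
	}
	if withPrompt {
		fsys["prompts/implement/judge-test-deletion.md"] = &fstest.MapFile{Data: []byte("prompt")}
	}
	return fsys
}

func main() {
	missing, _ := missingSourcePaths(sampleFS(false))
	fmt.Println("without prompt:", missing)
	missing, _ = missingSourcePaths(sampleFS(true))
	fmt.Println("with prompt:", missing)
}
```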

* fix(scope): revert defaults regression — guard belongs in .agents/ only

Per memory feedback_defaults_agnostic: internal/defaults/ ships
language-agnostic content. Hardcoding *_test.go + 'func Test*' patterns
into the shipped impl-issue.yaml breaks Wave for Node/Python/Rust users.

Test-deletion guard stays in .agents/ (Wave's own project config).
Other projects opt-in via their own .agents/pipelines/impl-issue.yaml.

Also reverts the embedfs/ timeout regression that re-introduced
120s/90s contract bounds (PR #1589 already bumped them to 300s).

* fix(defaults): also revert impl-issue.yaml — embedfs untouched, project-only guard