Problem Statement
The Design Decision Gate workflow was previously fixed by raising --max-turns from 5 to 10 (per the investigation in #27469). However, the workflow still produces false-failure runs: the agent successfully commits an ADR draft and posts a PR comment, but exits with code 1 because it consumed 11 turns against a 10-turn cap.
Affected Workflow and Run
| Workflow | PR | Run ID | Branch | Outputs Delivered? |
| --- | --- | --- | --- | --- |
| Design Decision Gate | #27479 | §24706352108 | copilot/add-safe-output-type-comment-memory | ✅ ADR committed + PR comment posted |
Run 24706352108 is the definitive evidence: the agent used 11 turns against a 10-turn cap. Terminal exit reason was error_max_turns. Both safeoutputs calls — push_to_pull_request_branch and add_comment — succeeded. The task was completed correctly; only the exit code is wrong.
Root Cause
The task shape requires more than 10 turns:
read git bundle → inspect diff → generate ADR → write file → git add/commit → push branch → post comment
This audit found 15 distinct tool types across 11 turns, with approximately 50% of turns spent on deterministic data-gathering reads that could be moved to pre-agent steps. The cap was raised 5→10 but 10 is still short by 1–2 turns.
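The turn arithmetic behind this can be sketched as follows. This is only a back-of-the-envelope estimate built from the audit figures above (11 turns used, cap of 10, roughly 50% of turns on deterministic reads); the function name `turn_budget` and the rounding choice are illustrative assumptions, not part of any audit tooling.

```python
import math


def turn_budget(turns_used, cap, deterministic_fraction):
    """Estimate the shortfall today and the headroom if deterministic reads moved out."""
    # Turns that could, in principle, run as pre-agent steps instead.
    removable = turns_used * deterministic_fraction
    # Round up so the estimate of remaining agentic turns stays conservative.
    remaining = math.ceil(turns_used - removable)
    return {
        "shortfall": max(0, turns_used - cap),
        "removable": removable,
        "remaining": remaining,
        "headroom": cap - remaining,
    }


# Figures from run 24706352108.
estimate = turn_budget(turns_used=11, cap=10, deterministic_fraction=0.5)
print(estimate)
# {'shortfall': 1, 'removable': 5.5, 'remaining': 6, 'headroom': 4}
```

Under these assumptions the run overshoots the cap by one turn today, and would need about 6 agentic turns if the deterministic reads were moved out.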
Audit metrics (run 24706352108):
- Turns used: 11, cap: 10
- Cache efficiency: 97.3% (tokens: 734K input, 133K effective)
- Estimated cost: $0.67
- Firewall: 17 requests, 0 blocked (all to api.anthropic.com)
- MCP health: 1/1 (safeoutputs)
Proposed Remediation
Option A — Quick fix (raise cap): Increase --max-turns to 15 in the Design Decision Gate workflow definition. Low risk, immediate unblocking.
Option B — Better long-term (pre-agent steps): Move deterministic operations (read git bundle, extract diff, write file list to /tmp/gh-aw/agent/context.json) into pre-agent harness steps. Removes ~5–6 turns from the agentic phase and keeps the task within the existing cap while reducing cost.
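A minimal sketch of what such a pre-agent step might look like is shown here. It assumes the repository is already checked out and that the changed-file list can be obtained with `git diff --name-only`; the function names, the `base_ref` parameter, and the `changed_files` JSON key are hypothetical, and only the output path comes from the proposal above. Reading the git bundle itself is not covered.

```python
import json
import subprocess
from pathlib import Path

# Output path named in Option B; everything else here is illustrative.
CONTEXT_PATH = Path("/tmp/gh-aw/agent/context.json")


def changed_files(base_ref, head_ref="HEAD", repo_dir="."):
    """Return files changed between base_ref and head_ref (merge-base diff)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base_ref}...{head_ref}"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def write_context(files, path=CONTEXT_PATH):
    """Write the file list as JSON so the agent can read it in a single turn."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # "changed_files" is an assumed key name, not an established schema.
    path.write_text(json.dumps({"changed_files": files}, indent=2), encoding="utf-8")
    return path
```

A harness step could then call `write_context(changed_files("origin/main"))` before the agent starts, where `origin/main` stands in for whatever base ref the PR actually targets.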
Option C — Outcome-based pass/fail: Add a post-agent step that reads safeoutputs.jsonl and marks the run success if both push_to_pull_request_branch and add_comment succeeded — independent of the Claude exit code. This would have correctly passed run 24706352108 without changing the cap.
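The outcome check could be sketched roughly as below. The schema of safeoutputs.jsonl is not shown in this issue, so the sketch assumes one JSON object per line with a `type` field naming the call and a `status` field equal to `"success"` on success; both field names, and all function names, are assumptions that would need to be checked against the real file.

```python
import json
from pathlib import Path

# The two outputs this issue treats as required for a successful run.
REQUIRED_OUTPUTS = frozenset({"push_to_pull_request_branch", "add_comment"})


def delivered_outputs(jsonl_text):
    """Collect output types recorded as successful (assumed 'type'/'status' fields)."""
    delivered = set()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore malformed lines; they cannot prove delivery
        if not isinstance(record, dict):
            continue
        kind = record.get("type")
        if isinstance(kind, str) and record.get("status") == "success":
            delivered.add(kind)
    return delivered


def run_succeeded(jsonl_text, required=REQUIRED_OUTPUTS):
    """True only if every required output was delivered successfully."""
    return set(required) <= delivered_outputs(jsonl_text)


def exit_code_for(path):
    """Map the outcome to a process exit code: 0 on success, 1 otherwise."""
    p = Path(path)
    if not p.is_file():
        return 1
    return 0 if run_succeeded(p.read_text(encoding="utf-8")) else 1
```

A post-agent step would call `exit_code_for` on the safeoutputs.jsonl path and use the result as its own exit status, so that the run conclusion reflects delivered outputs and not the agent's turn count. Treating a missing file as failure is a deliberate choice here: absence of evidence should not pass the gate.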
Recommendation: implement Option A immediately, Option B as a follow-on.
Success Criteria
- Design Decision Gate runs where the agent delivers all required outputs are marked conclusion: success
- No false-negative failures from turn-count ceiling
- Verified by re-running on a PR with an active ADR task
References: §24706352108
Related to #27411
Generated by [aw] Failure Investigator (6h) · ● 840.5K · ◷