[aw-failures] Design Decision Gate still failing at --max-turns 10 after prior 5→10 fix (false failure, outputs delivered) #27513

@github-actions

Problem Statement

The Design Decision Gate workflow was previously fixed by raising --max-turns from 5 to 10 (per the investigation in #27469). However, the workflow is still producing false-failure runs: the agent successfully commits an ADR draft and posts a PR comment, but exits with code 1 because it consumed 11 turns against a 10-turn cap.

Affected Workflow and Run

| Workflow | PR | Run ID | Branch | Outputs Delivered? |
| --- | --- | --- | --- | --- |
| Design Decision Gate | #27479 | §24706352108 | copilot/add-safe-output-type-comment-memory | ✅ ADR committed + PR comment posted |

Run 24706352108 is the definitive evidence: the agent used 11 turns against a 10-turn cap. Terminal exit reason was error_max_turns. Both safeoutputs calls — push_to_pull_request_branch and add_comment — succeeded. The task was completed correctly; only the exit code is wrong.

Root Cause

The task shape requires more than 10 turns:

read git bundle → inspect diff → generate ADR → write file → git add/commit → push branch → post comment

This audit found 15 distinct tool types across 11 turns, with approximately 50% of turns spent on deterministic data-gathering reads that could be moved to pre-agent steps. The cap was raised 5→10 but 10 is still short by 1–2 turns.

Audit metrics (run 24706352108):

  • Turns used: 11, cap: 10
  • Cache efficiency: 97.3% (tokens: 734K input, 133K effective)
  • Estimated cost: $0.67
  • Firewall: 17 requests, 0 blocked (all api.anthropic.com)
  • MCP health: 1/1 (safeoutputs)
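
The turn arithmetic behind these figures can be sketched as follows. This is an illustrative calculation only: the function name `turn_budget` and its return keys are hypothetical, and the 50% share is the approximate figure from the audit, not a measured constant.

```python
import math


def turn_budget(turns_used: int, cap: int, deterministic_share: float) -> dict:
    """Estimate the overshoot and the turns left if deterministic reads move out.

    deterministic_share is the approximate fraction of turns spent on
    data-gathering reads (about 0.5 in the audited run).
    """
    overshoot = max(0, turns_used - cap)
    movable = turns_used * deterministic_share
    # The share is approximate, so report a floor/ceil range, not a point value.
    removable = (math.floor(movable), math.ceil(movable))
    remaining = (turns_used - removable[1], turns_used - removable[0])
    return {
        "overshoot": overshoot,
        "removable_turns": removable,
        "remaining_turns": remaining,
        "fits_cap_after_move": remaining[1] <= cap,
    }


# Figures from run 24706352108: 11 turns used, cap of 10, ~50% deterministic.
budget = turn_budget(turns_used=11, cap=10, deterministic_share=0.5)
print(budget)
```

With the audited figures this gives an overshoot of 1 turn and a removable range of 5 to 6 turns, which would leave 5 to 6 agentic turns under the existing cap.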

Proposed Remediation

Option A — Quick fix (raise cap): Increase --max-turns to 15 in the Design Decision Gate workflow definition. Low risk, immediate unblocking.

Option B — Better long-term (pre-agent steps): Move deterministic operations (read git bundle, extract diff, write file list to /tmp/gh-aw/agent/context.json) into pre-agent harness steps. Removes ~5–6 turns from the agentic phase and keeps the task within the existing cap while reducing cost.
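
A minimal sketch of what such a pre-agent step might write, assuming the harness can run a Python script before the agent starts. The helper names (`build_context`, `write_context`) and the JSON keys (`changed_files`, `diff`) are hypothetical; the issue only names the target path /tmp/gh-aw/agent/context.json.

```python
import json
from pathlib import Path


def build_context(changed_files_output: str, diff_text: str) -> dict:
    """Build the context payload from raw git output.

    changed_files_output is newline-separated file paths, e.g. the output
    of `git diff --name-only`. The key names are assumptions.
    """
    files = [line.strip() for line in changed_files_output.splitlines() if line.strip()]
    return {"changed_files": files, "diff": diff_text}


def write_context(context: dict, path: Path) -> Path:
    """Write the payload as JSON, creating parent directories if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(context, indent=2), encoding="utf-8")
    return path
```

In a harness step, the two inputs could come from `git diff --name-only` and `git diff` against the PR base, with the result written via `write_context(ctx, Path("/tmp/gh-aw/agent/context.json"))`. The agent would then read a single file instead of spending turns on the bundle and diff.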

Option C — Outcome-based pass/fail: Add a post-agent step that reads safeoutputs.jsonl and marks the run success if both push_to_pull_request_branch and add_comment succeeded — independent of the Claude exit code. This would have correctly passed run 24706352108 without changing the cap.
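
The outcome-based check could look roughly like the sketch below. The record schema is an assumption: this issue does not document the layout of safeoutputs.jsonl, so the `type` and `status` fields and the `"success"` value are hypothetical and would need to be matched to the real file.

```python
from __future__ import annotations

import json
from typing import Iterable

# Outputs the Design Decision Gate must deliver, per the issue.
REQUIRED_OUTPUTS = {"push_to_pull_request_branch", "add_comment"}


def delivered_outputs(lines: Iterable[str]) -> set[str]:
    """Return the names of safe outputs recorded as successful.

    Assumes one JSON object per line with hypothetical "type" and
    "status" fields; malformed or unrelated lines are skipped.
    """
    delivered: set[str] = set()
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        name = record.get("type")
        if isinstance(name, str) and record.get("status") == "success":
            delivered.add(name)
    return delivered


def run_succeeded(lines: Iterable[str]) -> bool:
    """True when every required output was delivered, whatever the exit code."""
    return REQUIRED_OUTPUTS <= delivered_outputs(lines)
```

A post-agent step could open the file, pass it to `run_succeeded`, and set the step's exit status from the result, so that a run like 24706352108 would pass even though the agent hit error_max_turns.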

Recommendation: implement Option A immediately, Option B as a follow-on.

Success Criteria

  • Design Decision Gate runs where the agent delivers all required outputs are marked conclusion: success
  • No false failures caused by the turn-count ceiling
  • Verified by re-running on a PR with an active ADR task

References: §24706352108
Related to #27411

Generated by [aw] Failure Investigator (6h) · ● 840.5K

  • expires on Apr 28, 2026, 7:29 AM UTC
