Overview
Failure investigation covering the 6h window ending 2026-05-03T07:27Z. Found 3 failed runs out of 34 total (8.8% failure rate). Two workflows are P1 failures (one regression, one first-run failure); one is a P0 chronic failure spanning 7 consecutive days.
⚠️ GitHub issue correlation was skipped — gh CLI is not authenticated in this environment. New issues may duplicate existing tracking; please cross-check the agentic-workflows open issues before acting on sub-issues.
Failure Clusters
| # | Workflow | Run ID | Engine | Duration | Severity | Root Cause |
|---|---|---|---|---|---|---|
| 1 | GitHub Remote MCP Authentication Test | §25271765723 | Copilot / gpt-4.1-mini | 2.2 min | P0 chronic | Unsupported model; 7 consecutive failures |
| 2 | Schema Consistency Checker | §25271731933 | Claude Code | 7.5 min | P1 regression | max_turns (60) hit; playwright CLI migration expanded scope |
| 3 | Artifacts Summary | §25271895822 | Copilot / claude-sonnet-4.6 | 17 min | P1 first-run | Infeasible pagination strategy; never reached safeoutputs |
Evidence
Cluster 1 — GitHub Remote MCP Authentication Test (P0 chronic)
7 consecutive failures (2026-04-27 → 2026-05-03); no successful run in history.
Error from agent-stdio.log:
● Request failed (transient_bad_request). Retrying...
● Request failed (transient_bad_request). Retrying...
400 The requested model is not supported.
- Engine: Copilot CLI v1.0.40, model gpt-4.1-mini
- Turns: 0 (agent exits immediately before any work)
- Firewall: 2 requests blocked from (unknown) domain (likely the model endpoint)
- All runs: Apr 27, Apr 28, Apr 29, Apr 30, May 2, May 3
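The log excerpt above mixes two retry notices with one terminal error, and only the terminal error explains the failure. A minimal sketch of how a triage step might separate the two, assuming the wording seen in this excerpt is representative (the function name `classify_log` and the category labels are hypothetical, not part of any existing tool):

```python
import re

# Assumption: both patterns are copied from the log excerpt in this report.
# The real CLI may emit other wordings that this sketch does not cover.
TRANSIENT = re.compile(r"Request failed \(transient_bad_request\)")
UNSUPPORTED = re.compile(r"400 The requested model is not supported")


def classify_log(text: str) -> str:
    """Return 'unsupported_model', 'transient', or 'unknown' for a log excerpt."""
    # Check the terminal error first: retries that end in an unsupported-model
    # response will never succeed, so they should not be treated as transient.
    if UNSUPPORTED.search(text):
        return "unsupported_model"
    if TRANSIENT.search(text):
        return "transient"
    return "unknown"


log = """\
● Request failed (transient_bad_request). Retrying...
● Request failed (transient_bad_request). Retrying...
400 The requested model is not supported.
"""
print(classify_log(log))
```

Checking the terminal error before the retry notice is the point of the ordering: a configuration error would otherwise be reported as a flaky request.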
Cluster 2 — Schema Consistency Checker (P1 regression)
First failure after 6 consecutive successful runs. The failure occurred on the same day as commit 15b7dd5 (playwright CLI migration).
Error from agent-stdio.log:
{"subtype":"error_max_turns","stop_reason":"tool_use",
"total_cost_usd":2.0151049,"terminal_reason":"max_turns"}
Audit-diff vs last success (run 25245322929 → 25271731933):
- Removed: safeoutputs.create_discussion (was called 1× in successful run; never reached in failed run)
- Added: 85 new bash investigation tools (all # Check playwright..., # Check mcp..., schema inspection) that weren't in the previous run
- GitHub API core consumption: +1,671% (21 → 372 quota units)
- Turns: 61 (hit the 60-turn cap) vs unknown baseline
- Cost: $2.02 for a failed run
Root cause: The playwright CLI migration commit added new schema fields/patterns that Schema Consistency Checker tried to fully validate, causing an exploratory bash loop that consumed all 60 turns before reaching the reporting step.
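One way to keep an exploratory loop from starving the reporting step is to reserve the last few turns for it. The sketch below illustrates the idea only; `phase_for_turn` and the `reserve` value of 5 are hypothetical, and nothing in the logs indicates that the workflow engine exposes such a hook.

```python
def phase_for_turn(turn: int, max_turns: int = 60, reserve: int = 5) -> str:
    """Return 'explore' while budget remains, 'report' once only the reserve is left.

    Turns are 1-indexed. max_turns=60 matches the cap hit in this run;
    reserve=5 is an illustrative assumption, not a measured requirement.
    """
    if reserve < 0 or reserve > max_turns:
        raise ValueError("reserve must be between 0 and max_turns")
    return "explore" if turn <= max_turns - reserve else "report"


for turn in (1, 55, 56, 60):
    print(turn, phase_for_turn(turn))
```

With these defaults, turns 1 through 55 are available for investigation and turns 56 through 60 are held back for producing output.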
Cluster 3 — Artifacts Summary (P1 first-run)
First-ever run of this workflow; immediately failed.
Agent approach observed in logs:
- Tried gh api repos/github/gh-aw/actions/artifacts?per_page=100 and got partial data
- Attempted to paginate 20+ pages of artifacts to build 30-day summary
- Tried to build run_id → workflow_name mapping by paginating 500+ runs
- Recognized "at 500 runs per ~4 hours, it's ~90k runs for 30 days — This isn't feasible"
- Tried alternative strategies but never produced output; safe_outputs job was skipped
Key metrics: 17 minutes, 20 turns, 843K tokens (842K cache reads = 48.5% cache hit rate)
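The agent's feasibility estimate can be reproduced with a few lines of arithmetic. This sketch uses only the figures quoted from the logs (about 500 runs per 4 hours, a 30-day window) and assumes the same page size of 100 that appeared in the observed artifacts call.

```python
import math

RUNS_PER_WINDOW = 500  # from the agent's log: ~500 runs per ~4 hours
WINDOW_HOURS = 4
DAYS = 30
PER_PAGE = 100  # assumption: same page size as the observed artifacts call

windows = DAYS * 24 // WINDOW_HOURS       # number of 4-hour windows in 30 days
total_runs = RUNS_PER_WINDOW * windows    # estimated runs over the full period
pages = math.ceil(total_runs / PER_PAGE)  # API pages needed to list them all

print(windows, total_runs, pages)  # 180 90000 900
```

Roughly 900 paginated requests would be needed just to build the run mapping, which is consistent with the agent abandoning the approach within its 20 turns.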
Proposed Fix Roadmap
| Priority | Issue | Workflow | Action |
|---|---|---|---|
| P0 | Unsupported model gpt-4.1-mini | GitHub Remote MCP Authentication Test | Replace model with supported alternative |
| P1 | max_turns regression from playwright migration | Schema Consistency Checker | Add turn budget or pre-agent deterministic data collection |
| P1 | Infeasible pagination strategy | Artifacts Summary | Rewrite with deterministic pre-agent artifact collection |
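As an illustration of what "deterministic pre-agent artifact collection" could look like, here is a minimal sketch of a bounded paginator that runs before the agent starts. It assumes each page has the shape `{"artifacts": [...]}` with a `created_at` timestamp per item; `fetch_page`, `collect_artifacts`, and the `max_pages` cap are hypothetical names introduced for this example, and the fetcher is injected so the logic can be exercised without network access.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List


def collect_artifacts(
    fetch_page: Callable[[int], Dict],
    since: datetime,
    max_pages: int = 50,  # hypothetical hard cap so the step always terminates
) -> List[Dict]:
    """Collect artifacts created at or after `since`, reading at most `max_pages` pages."""
    kept: List[Dict] = []
    for page in range(1, max_pages + 1):
        items = fetch_page(page).get("artifacts", [])
        if not items:
            break  # an empty page means there is nothing left to read
        for art in items:
            # Timestamps are assumed to be ISO 8601 with a trailing "Z".
            created = datetime.fromisoformat(art["created_at"].replace("Z", "+00:00"))
            if created >= since:
                kept.append(art)
    return kept


def fake_fetch(page: int) -> Dict:
    """Stand-in for a real API call; returns two small pages, then empty pages."""
    pages = {
        1: {"artifacts": [{"name": "a", "created_at": "2026-05-02T00:00:00Z"}]},
        2: {"artifacts": [{"name": "b", "created_at": "2026-03-01T00:00:00Z"}]},
    }
    return pages.get(page, {"artifacts": []})


since = datetime(2026, 5, 3, tzinfo=timezone.utc) - timedelta(days=30)
print([a["name"] for a in collect_artifacts(fake_fetch, since)])  # ['a']
```

The sketch does not stop early on the first old artifact because it makes no assumption about the ordering of results; the page cap alone bounds the work.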
References:
Generated by [aw] Failure Investigator (6h) · ● 401.4K