Window: partial ~9.5h (11:39–21:12Z) · 101 runs with full summaries (of 301 downloaded dirs; bridge 120s cap truncated the fetch) · repository github/gh-aw
Headline
89.1% success (90/101) → 90.9% excluding 2 intentional credit-cap test workflows. Rebound from the 07-03 trough (68.1%).
🎉 pi engine fully recovered: 27/27 = 100% (was 30.8% on 07-03). PR Sous Chef 19/19 = 100% — the workflow whose sustained 80%-fail incident collapsed pi on 07-03 is now clean. Auto-Triage Issues 2/2 also recovered. The 07-02→07-03 pi-0turn incident is RESOLVED.
All 9 real failures are copilot-engine 0-turn/0-tok pre-agent driver fails. claude 15/15, pi 27/27, codex 2/2 all clean.
0 missing-tools · 0 missing-data · 0 MCP failures · 0 noop-errors. No tool-gap signal this window.
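The headline percentages and the failure accounting can be re-derived from the counts in this report. The sketch below is only a consistency check, not part of the audit tooling. The variable names are illustrative, and the count of 4 for cluster B is taken from the recommendation text ("4 workflows") rather than from a per-run listing.

```python
# Counts quoted in this audit window.
total_runs = 101
successes = 90
intentional_failures = 2  # credit-cap test workflows, excluded from "real" failures

# Per-cluster real failures: A (Smoke CI), B (copilot-sdk longrun), C (chroot-node).
# The value for B is inferred from the recommendations section.
cluster_failures = {"A": 4, "B": 4, "C": 1}


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the headline."""
    return round(100 * numerator / denominator, 1)


raw_rate = pct(successes, total_runs)
adjusted_rate = pct(successes, total_runs - intentional_failures)
real_failures = total_runs - successes - intentional_failures

print(f"raw success rate:      {raw_rate}%")
print(f"adjusted success rate: {adjusted_rate}%")
print(f"real failures:         {real_failures}")
print(f"clusters add up:       {sum(cluster_failures.values()) == real_failures}")
```

The adjusted rate removes the two intentional failures from the denominator only, since they were never counted as successes.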
Engine health
Every failure is concentrated in the copilot engine; all other engines are perfect this window.
Trend charts
Workflow health (30d observed windows)
Success rate rebounds to 89.1% after the 07-02/07-03 dip (60%/68%), returning toward the ~85–92% baseline. The dip was driven by the transient PR Sous Chef / pi incident, now resolved — the bar composition today shows a thin failure band entirely from copilot 0-turn fails.
Token usage (7-day moving average)
Daily observed tokens sit at 4.37M, below the 7-day average — consistent with a partial (~9.5h) observation window rather than a real drop in fleet load. metrics.TokenUsage remains 0 fleet-wide; the working source stays token_usage_summary.total_*.
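Since metrics.TokenUsage reads 0 everywhere, a consumer of the run summaries has to prefer the token_usage_summary totals. A minimal sketch of that fallback follows, assuming each run summary is a JSON-like dict with nested "metrics" and "token_usage_summary" objects. That shape, the helper name observed_tokens, and the field name "total_tokens" are assumptions for illustration, as the report only names the total_* family.

```python
def observed_tokens(run_summary: dict, field: str = "total_tokens") -> int:
    """Return the token count for one run.

    Prefers token_usage_summary[field] (the working source) and falls back
    to metrics.TokenUsage, which currently reads 0 fleet-wide.
    The dict layout and the default field name are hypothetical.
    """
    summary = run_summary.get("token_usage_summary") or {}
    value = summary.get(field)
    if isinstance(value, (int, float)) and value > 0:
        return int(value)
    metrics = run_summary.get("metrics") or {}
    return int(metrics.get("TokenUsage") or 0)


# Illustrative records, not real audit data.
runs = [
    {"metrics": {"TokenUsage": 0}, "token_usage_summary": {"total_tokens": 1200}},
    {"metrics": {"TokenUsage": 0}},
]
print(sum(observed_tokens(run) for run in runs))
```

Passing the field name as a parameter avoids summing several total_* fields that may overlap.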
Failure breakdown (all copilot 0-turn)
9 real failures across 3 known clusters
A. Smoke CI — 4/4 = 100% fail on push to main (chronic, smoke-ci-copilot-cli-100pct-fail-on-push)
copilot/claude-sonnet-4.6, ~3m, 0-turn at Execute GitHub Copilot CLI. Now failing across 06-30...07-04 observed windows. High-frequency (fires on every push) → constant red on main.
B. copilot-sdk-driver longrun-0turn (copilot-sdk-driver-failures, recur 22): the agent runs real work, then the job fails 0-turn/0-tok. 4 workflows hit this today, burning ~88 agent-minutes in total.
C. chroot-node (chroot-node-not-available, recur 10)
Daily Issues Report Generator — 10.1m, 0-turn.
Excluded (intentional): Daily Max Ai Credits Test, Daily Credit Limit Test — credit-cap probes designed to 0-turn.
What changed vs prior audits
Resolved: 07-03 pi-collapse / PR Sous Chef incident (pi 30.8%→100%).
Chronic, unfixed: Smoke CI 100%-on-push (copilot CLI startup), copilot-sdk longrun agent-job fails, chroot-node (Daily Issues Report Generator), codex gh-aw-binary-not-found (no codex prod runs observed this window).
Absent: Avenger had 0 runs (schedule appears paused again); err-config root cause still unaddressed when it fires.
Recommendations (carried, still open)
P1 — Smoke CI copilot-CLI startup on main. 100% deterministic red on every push across 4+ windows. If Smoke CI is the copilot-engine canary, its constant red is a real startup-regression signal. Capture stderr/exit-code at the Execute GitHub Copilot CLI step for a fast-fail diagnostic.
HIGH — copilot-sdk longrun 0-turn family. 4 workflows burned ~88 agent-minutes of real work today then reddened at 0-turn. Instrument the CLI exec step to emit exit code + stderr into artifacts so these stop being opaque; distinguish instant vs longrun sub-variants.
MEDIUM — chroot-node. Daily Issues Report Generator still fails because node isn't on PATH in the AWF chroot; add a pre-flight node --version fast-fail and bind-mount/bundle node.
WATCH — pi/gpt-5.4. Recovered today; watch 07-05 for relapse before closing pi-gpt54-0tok-agentjob-fail.
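The first three recommendations share one idea: fail fast and leave evidence behind. The sketch below shows one way that could look, assuming a Python wrapper around the CLI step. It is not how gh-aw currently invokes the Copilot CLI. The function names, the 60-second instant/longrun cut-off, and the artifact file name cli-exec-diagnostics.json are all hypothetical.

```python
import json
import shutil
import subprocess
import time
from pathlib import Path

# Hypothetical cut-off separating "instant" from "longrun" 0-turn failures.
INSTANT_THRESHOLD_S = 60


def preflight(binary: str = "node") -> str:
    """Fail fast if `binary` is not on PATH; otherwise return its version string."""
    path = shutil.which(binary)
    if path is None:
        raise RuntimeError(f"{binary} not found on PATH")
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, timeout=30
    )
    if result.returncode != 0:
        raise RuntimeError(f"{binary} --version exited with {result.returncode}")
    return result.stdout.strip()


def run_with_diagnostics(cmd: list, artifact_dir: str) -> dict:
    """Run `cmd`, then write exit code, duration and a stderr tail to an artifact file.

    stdout is left attached to the parent so live logs still stream;
    only stderr is captured.
    """
    start = time.monotonic()
    result = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
    duration = time.monotonic() - start
    record = {
        "cmd": cmd,
        "exit_code": result.returncode,
        "duration_s": round(duration, 2),
        "variant": "instant" if duration < INSTANT_THRESHOLD_S else "longrun",
        "stderr_tail": result.stderr[-4000:],  # keep the artifact small
    }
    out_dir = Path(artifact_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "cli-exec-diagnostics.json").write_text(json.dumps(record, indent=2))
    return record
```

A job step could call preflight("node") before starting the agent and wrap the CLI command in run_with_diagnostics(cmd, "artifacts"), so a 0-turn failure leaves an exit code, a stderr tail, and an instant/longrun label behind.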
Agentic Workflow Audit — 2026-07-04
Warning
Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
awmgmcpg
See Network Configuration for more information.