[audit-workflows] 🔍 Agentic Workflow Audit — 2026-07-01 #42840

2026-07-01T21:55:43Z

github-actions[bot]
Bot Jul 1, 2026

Daily audit of agentic workflow runs (evening window). Log download was capped at the ~60s bridge timeout, so only 84 of 196 run directories had a complete run_summary.json and were analyzed.

Overview

Metric	Value
Runs analyzed	84
Success	72 (85.7%)
Failure	12 (14.3%)
Total AIC (credits)	5,703.6
Action-minutes	1,037
Missing tools / data / MCP failures	0 / 0 / 0 ✅
Tokens	unreported — artifact empty since 2026-06-19; AIC is the primary usage metric

TokenUsage remains 0 across all runs (known empty-artifact condition since 2026-06-19). Cost and usage are tracked via AIC instead.

Engine Health

Engine	Runs	Fail	Success %	AIC
copilot	50	9	82.0%	4,005
pi	16	0	100% ✅	155
claude	13	1	92.3%	1,528
codex	3	2	33.3% ⚠️	16
gemini	1	0	100%	0
antigravity	1	0	100%	0
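The success percentages above follow directly from the run and failure counts. As a quick cross-check, here is a minimal sketch that recomputes them from the table rows; the names `ENGINE_ROWS`, `success_pct`, and `summarize` are illustrative and not part of the audit tooling.

```python
# Rows copied from the Engine Health table: (engine, runs, failures, AIC).
ENGINE_ROWS = [
    ("copilot", 50, 9, 4005),
    ("pi", 16, 0, 155),
    ("claude", 13, 1, 1528),
    ("codex", 3, 2, 16),
    ("gemini", 1, 0, 0),
    ("antigravity", 1, 0, 0),
]


def success_pct(runs, failures):
    """Success rate as a percentage, rounded to one decimal place."""
    return round(100.0 * (runs - failures) / runs, 1)


def summarize(rows):
    """Aggregate totals and per-engine success rates (hypothetical helper)."""
    total_runs = sum(runs for _, runs, _, _ in rows)
    total_fail = sum(fails for _, _, fails, _ in rows)
    return {
        "runs": total_runs,
        "failures": total_fail,
        "success_pct": success_pct(total_runs, total_fail),
        "per_engine": {
            name: success_pct(runs, fails) for name, runs, fails, _ in rows
        },
    }
```

The per-engine counts sum to the 84 runs and 12 failures reported in the Overview.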

Failure Clusters (12 failures → 3 root-cause groups)

All 12 failures were 0-turn / 0-token — i.e. the agent never produced output; failures occurred in setup/CLI-launch or post-agent steps, not during reasoning.

Cluster A — copilot "Execute GitHub Copilot CLI" 0-turn (8 fails) — CHRONIC

Workflow	Fail/Runs	Note
Smoke CI	4/4 = 100%	Still 100% red on every push to main
Daily Formal Spec Verifier	1/1
Matt Pocock Skills Reviewer	1/5
Smoke Copilot Sub Agents	1/1
Daily Safe Output Integrator	1/1

Agent job fails at the Execute GitHub Copilot CLI step after a few minutes with Turns=0 / Tokens=0 / ErrorCount=0. Maps to known issues copilot-sdk-driver-failures (recurrence now 20) and smoke-ci-copilot-cli-100pct-fail-on-push (recurrence now 2 — 2nd consecutive 100%-fail window).

Cluster B — NEW: cross-engine "Install Playwright CLI" failures (3 fails)

Workflow	Engine
Slide Deck Maintainer	copilot
Smoke Claude	claude
Smoke Codex	codex

All three failed at the Install Playwright CLI agent-setup step in the same window, across three different engines. The cross-engine signature strongly implies an infra/network cause (Playwright browser download / npm registry / firewall allowlist), not an engine bug. New known issue: playwright-install-cli-failure.
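The cross-engine reasoning above can be sketched as a simple check: a failing step that appears under two or more engines is flagged as a likely infrastructure issue. This is only an illustration; `FAILURES` is transcribed from the cluster tables in this report, and `cross_engine_steps` is a hypothetical helper, not part of the audit agent.

```python
from collections import defaultdict

# (workflow, engine, failing step), transcribed from the cluster tables.
FAILURES = [("Smoke CI", "copilot", "Execute GitHub Copilot CLI")] * 4 + [
    ("Daily Formal Spec Verifier", "copilot", "Execute GitHub Copilot CLI"),
    ("Matt Pocock Skills Reviewer", "copilot", "Execute GitHub Copilot CLI"),
    ("Smoke Copilot Sub Agents", "copilot", "Execute GitHub Copilot CLI"),
    ("Daily Safe Output Integrator", "copilot", "Execute GitHub Copilot CLI"),
    ("Slide Deck Maintainer", "copilot", "Install Playwright CLI"),
    ("Smoke Claude", "claude", "Install Playwright CLI"),
    ("Smoke Codex", "codex", "Install Playwright CLI"),
    ("Changeset Generator", "codex", "Process Safe Outputs"),
]


def cross_engine_steps(failures, min_engines=2):
    """Return failing steps seen under at least `min_engines` distinct engines."""
    engines_by_step = defaultdict(set)
    for _workflow, engine, step in failures:
        engines_by_step[step].add(engine)
    return {
        step: sorted(engines)
        for step, engines in engines_by_step.items()
        if len(engines) >= min_engines
    }
```

Applied to today's 12 failures, only the Install Playwright CLI step spans more than one engine.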

Cluster C — codex "Process Safe Outputs" failure (1 fail)

Changeset Generator (codex): agent job succeeded, but the safe_outputs job failed at the Process Safe Outputs step. Consistent with the chronic safe-output-partial-failure-intolerance pattern (recurrence now 8) where a single safe-output item reds the whole job.
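One way to tolerate partial failure is to isolate each safe-output item and decide the job status from the aggregate. The sketch below assumes a generic per-item `handler` callable; `process_safe_outputs` and `job_should_fail` are hypothetical names and do not reflect how the actual Process Safe Outputs step is implemented.

```python
def process_safe_outputs(items, handler):
    """Run `handler` on each item, recording failures instead of raising."""
    results = []
    for index, item in enumerate(items):
        try:
            handler(item)
            results.append({"index": index, "ok": True, "error": None})
        except Exception as exc:  # isolate the failure to this item
            results.append({"index": index, "ok": False, "error": str(exc)})
    return results


def job_should_fail(results):
    """Assumed policy: fail the job only when every item failed."""
    return bool(results) and all(not r["ok"] for r in results)
```

With this policy, a single bad item is reported in the results but does not turn an otherwise green run red. A stricter threshold may suit workflows where every output is required.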

📈 30-Day Trends

Workflow Health — success/failure counts with success-rate overlay (dashed line = 85% baseline). Volume swings reflect the audit's variable download window; today's 85.7% sits right on the long-run baseline, with the failure share dominated by the copilot Execute-CLI cluster.

Token Usage — daily tokens (M) + 7-day moving average. The shaded region marks the ongoing gap: the token artifact has been empty since 2026-06-19, so per-run token accounting is unavailable and AIC has become the effective usage metric (5,703.6 today). Restoring token reporting would re-enable this chart.

🎯 Recommended Actions

[HIGH] Escalate Smoke CI. copilot/claude-sonnet-4.6 Smoke CI is now 100% red across 2+ consecutive windows at the Execute GitHub Copilot CLI step (0-turn). If Smoke CI is the copilot-engine canary, its constant failure on push likely signals a real copilot CLI startup regression on main. Add a fast-fail diagnostic that surfaces why the CLI aborts with 0 turns.
[MEDIUM] Harden Playwright install. Confirm the Playwright download/npm hosts are on the firewall allowlist for playwright-enabled workflows, add retry-with-backoff, and cache browser binaries so a transient download failure doesn't red the whole agent job.
[MEDIUM] Isolate safe-output failures. The Process Safe Outputs step continues to red otherwise-green runs (Changeset Generator today). Make per-item safe-output failures non-fatal to the job.
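The retry-with-backoff suggestion for the Playwright install could look roughly like the following. This is a minimal sketch under stated assumptions: `retry_with_backoff` is a hypothetical helper, `action` stands in for whatever performs the install, and the attempt count and delays are placeholders rather than tuned values.

```python
import time


def retry_with_backoff(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds, doubling the delay after each failure.

    `sleep` is injectable so the helper can be exercised without real waits.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Delays grow as base_delay * 1, 2, 4, ...
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

If every attempt fails, the last error is re-raised so the step still fails visibly instead of being masked.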

✅ Positives

pi engine: 16/16 = 100% success.
Zero missing-tools, missing-data, and MCP-failure reports across all 84 runs — tool/permission surface is healthy.
No new categories of failure beyond the Playwright infra cluster; the rest are known, tracked recurrences.

Repo memory updated

audit-history.jsonl, known-issues.json (+1 new: playwright-install-cli-failure; bumped copilot-sdk, smoke-ci, safe-output recurrences), anomalies.json, recommendations.json, metrics-summary.json, workflow-trends.json. Files compacted to fit the 60 KB memory budget (50.8 KB total, push validated).

References:

§28544490596 — Smoke CI (Execute Copilot CLI 0-turn)
§28542293845 — Smoke Claude (Playwright install fail)
§28542294866 — Changeset Generator (Process Safe Outputs fail)

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

awmgmcpg

To allow this domain, add it to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

Generated by 🔍 Agentic Workflow Audit Agent · 241.9 AIC · ⌖ 31.7 AIC · ⊞ 7.3K · ◷

expires on Jul 2, 2026, 1:55 PM UTC-08:00
