[audit-workflows] Agentic Workflow Audit — 2026-05-21 (24h) #33873

2026-05-21T21:56:14Z

github-actions[bot]
Bot May 21, 2026

Overview

Last 24h: 41 completed runs (+ 5 in-progress), 4 failures, 1 cancelled, success rate 75.6% (down from 85.9% yesterday). The decline is driven by a 100% codex engine outage: all 4 production codex runs failed, three with invalid_request_error on model gpt-5.5 followed by Missing environment variable: OPENAI_API_KEY on retry, and one (Changeset Generator, gpt-5.4-mini) on a separate internal-proxy path. Yesterday's fix (fix-codex-openai-key-and-model) did not stick: the model name changed (was gpt-5-codex-alpha-2025-11-07, now gpt-5.5) but it is still invalid. Total spend: $19.96 over 35.9M tokens / 338 action-minutes.

Health Summary

Metric	24h	vs 05-20
Runs (completed)	41	-47%
Success rate	75.6%	-10.3 pp
Failures	4	-2
Cancelled	1	—
Tokens (M)	35.9	-39.8%
Cost ($)	19.96	-25.3%
Action-minutes	338	-50%
Errors (logs tool)	7	-1
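
The "vs 05-20" column mixes two kinds of delta: relative change in percent (runs, tokens, cost) and an absolute difference in percentage points (success rate). A minimal sketch of both, using only figures quoted in this report (41 vs 78 runs, 75.6% vs 85.9% success rate); the helper names are illustrative and not part of the audit tooling:

```python
def pct_change(current: float, previous: float) -> float:
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100.0


def pp_change(current_pct: float, previous_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return current_pct - previous_pct


# Completed runs: 41 today vs 78 on 05-20 (both figures appear in this report).
runs_delta = pct_change(41, 78)

# Success rate: 75.6% today vs 85.9% yesterday.
rate_delta = pp_change(75.6, 85.9)

print(f"Runs (completed): {runs_delta:+.0f}%")
print(f"Success rate:     {rate_delta:+.1f} pp")
```

Reporting the success rate in percentage points avoids the ambiguity of a relative figure: a drop from 85.9% to 75.6% is 10.3 pp, but roughly 12% in relative terms.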

Engines: copilot 24, claude 9, codex 6 (4 real + 1 smoke + 1 in-progress), gemini 1, pi 1. Three runs had no engine_id resolved (Deployment Incident Monitor x2, Q x1 — all very short, likely activation-only).

Critical Issues

🔴 Codex engine: 100% production failure (4/4)

Workflow	Run	Model	Duration	Root cause
AI Moderator	§26248415738	gpt-5.5	15.7m	invalid_request_error + missing OPENAI_API_KEY
Daily Cache Strategy Analyzer	§26246943052	gpt-5.5	3.9m	same
Smoke Codex	§26254375843	gpt-5.5	3.4m	same
Changeset Generator	§26254375780	gpt-5.4-mini	3.8m	different: uses internal proxy api-proxy:10000

Evidence: the entrypoint logs "Unset OPENAI_API_KEY from /proc/1/environ" and "Unset CODEX_API_KEY from /proc/1/environ"; attempt 1 then returns invalid_request_error, and retries fail with "Missing environment variable: OPENAI_API_KEY". AI Moderator alone burned 15.7 min of wall time before giving up (12.1 min pre_activation + 1.9 min agent retries; the remaining ~1.7 min is not broken down).

Action: pin codex workflows to a valid model and ensure secret wiring survives the entrypoint scrub for retry attempts. The Changeset Generator path through internal proxy needs separate verification.
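
One way to make the secret-wiring problem fail fast rather than surface only on retry is a preflight check that runs before every attempt. The sketch below is an assumption, not the actual entrypoint or codex CLI behavior: `missing_secrets` and `preflight` are hypothetical helpers, and the only grounded detail is the variable name `OPENAI_API_KEY` from the error message above.

```python
import os
from typing import Mapping, Optional, Sequence

# Variable named in the "Missing environment variable" error in this report.
REQUIRED_VARS = ("OPENAI_API_KEY",)


def missing_secrets(env: Mapping[str, str], required: Sequence[str]) -> list:
    """Return the required variable names that are absent or empty in env."""
    return [name for name in required if not env.get(name)]


def preflight(env: Optional[Mapping[str, str]] = None,
              required: Sequence[str] = REQUIRED_VARS) -> None:
    """Raise before an attempt starts if any required secret is missing.

    Hypothetical helper: call it before the first attempt and before each
    retry, so a scrubbed environment is reported immediately.
    """
    env = os.environ if env is None else env
    missing = missing_secrets(env, required)
    if missing:
        raise RuntimeError(
            "Missing environment variable(s): " + ", ".join(missing)
        )


# Demo with an explicit mapping instead of the real process environment.
scrubbed_env = {"PATH": "/usr/bin"}
print(missing_secrets(scrubbed_env, REQUIRED_VARS))
```

A check like this would not fix the scrub itself, but it would turn a multi-minute retry sequence into an immediate, clearly labeled failure.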

🟡 Daily Safe Output Tool Optimizer cost spike

Single run §26253685911-equivalent: $8.54, 12M tokens, 129 turns, which is 43% of today's total spend. The previous 3-run average was ~$4.53, so this run cost about 1.9x the norm. Either the scope expanded or the agent is looping; the tool-call sequence is worth inspecting.
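
The two figures behind this flag (share of daily spend and ratio to the recent average) can be reproduced as follows. This is a sketch under stated assumptions: the dollar amounts come from this report, while `SPIKE_FACTOR` is a hypothetical alert threshold that the audit does not specify.

```python
# Figures quoted in this report.
RUN_COST = 8.54      # single Daily Safe Output Tool Optimizer run ($)
DAILY_TOTAL = 19.96  # total spend for the 24h window ($)
PRIOR_AVG = 4.53     # previous 3-run average for this workflow ($)

# Hypothetical threshold: flag a run costing more than 1.5x its baseline.
SPIKE_FACTOR = 1.5


def share_of_total(cost: float, total: float) -> float:
    """Fraction of total spend attributable to one run."""
    return cost / total


def is_cost_spike(cost: float, baseline: float,
                  factor: float = SPIKE_FACTOR) -> bool:
    """True when cost exceeds baseline by more than the given factor."""
    return cost > baseline * factor


share = share_of_total(RUN_COST, DAILY_TOTAL)
ratio = RUN_COST / PRIOR_AVG

print(f"share of daily spend: {share:.0%}")
print(f"vs prior 3-run average: {ratio:.2f}x")
print(f"spike: {is_cost_spike(RUN_COST, PRIOR_AVG)}")
```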

🟡 Smoke CI cancellation persists

1/5 Smoke CI runs cancelled at 2m wall (§26249207824). Same pattern as 05-19 / 05-20 — agent budget exhaustion. Known issue smoke-ci-agent-timeout now persisting 3 days.

Trend Charts

Success rate continued its decline: 92.8% → 88.9% → 85.9% → 75.6%. Today's failure count (5 including the cancellation) is in line with prior days, but on a much smaller denominator (41 vs 78 runs), so the rate suffers. The 4 failures are concentrated entirely in the codex engine outage.

Daily cost has ranged from ~$20 to ~$36, with today at the low end despite a notable single-run spike. Tokens per dollar dipped slightly (1.8M/$ today vs 1.93M/$ on 05-17), so the smaller run volume did not translate into cheaper token use.

Top 10 cost drivers (24h)

Workflow	Runs	Tokens	Cost	Turns
Daily Safe Output Tool Optimizer	1	12,052,538	$8.54	129
Daily Code Metrics and Trend Tracking Agent	1	4,142,305	$3.94	53
Daily Team Evolution Insights	1	1,813,275	$2.02	29
Lockfile Statistics Analysis Agent	1	627,022	$1.53	7
Smoke Claude	1	1,115,146	$1.42	41
Daily Caveman Optimizer	1	700,769	$1.31	14
[aw] Failure Investigator (6h)	1	309,630	$0.61	6
Design Decision Gate 🏗️	1	232,383	$0.59	3
Smoke Copilot	1	1,584,211	$0.00	33
Agentic Workflow Portfolio Yield	1	1,831,719	$0.00	32

(Zero-cost rows are copilot-engine runs where cost is not reported in run summary.)

Firewall: 18% block rate across 17 workflows

Total: 2,108 requests, 1,723 allowed, 385 blocked (18.3%).

Top blocked patterns:

(unknown) host: 298 blocked (likely DNS-failed lookups inside containers)
api-proxy:10002: 20/20 (100%) — Smoke Pi
Google services (content-autofill.googleapis.com, www.google.com, accounts.google.com, safebrowsingohttpgateway.googleapis.com): ~45 combined — Smoke Copilot/Claude browser probes
localhost:8080: 15 — Smoke Gemini local-proxy probe

Per-workflow block rate:

Smoke Pi: 100% (20/20)
Smoke Copilot: 46% (72/157)
Smoke Gemini: 40% (17/42)
Daily Project Performance Summary Gen: 39%
Copilot PR Prompt Pattern Analysis: 38%
Chaos PR Bundle Fuzzer: 36%
Matt Pocock Skills Reviewer: 36%
11 others: 25–35%
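
The block rates above are blocked requests divided by total requests. A minimal sketch that recomputes the rates for which this report gives raw counts, and flags workflows above the 25% line used by the high-firewall-block-rate known issue; the `FirewallStats` class is illustrative and not part of the audit tooling:

```python
from dataclasses import dataclass


@dataclass
class FirewallStats:
    name: str
    blocked: int
    total: int

    @property
    def block_rate(self) -> float:
        """Blocked requests as a fraction of all requests (0.0 if no traffic)."""
        return self.blocked / self.total if self.total else 0.0


# Counts quoted in this report.
overall = FirewallStats("all workflows", blocked=385, total=2108)
per_workflow = [
    FirewallStats("Smoke Pi", 20, 20),
    FirewallStats("Smoke Copilot", 72, 157),
    FirewallStats("Smoke Gemini", 17, 42),
]

THRESHOLD = 0.25  # the ">25% block rate" line from the known-issue table

ranked = sorted(per_workflow, key=lambda s: s.block_rate, reverse=True)
flagged = [s.name for s in ranked if s.block_rate > THRESHOLD]

print(f"{overall.name}: {overall.block_rate:.1%}")
for s in ranked:
    print(f"{s.name}: {s.block_rate:.0%} ({s.blocked}/{s.total})")
```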

DIFC integrity filtering (13 events)

13 GitHub MCP list_issues / search_issues calls were filtered because target issues have lower integrity than agents require. Affected: #33436, #33597, #33640, #33787, #33847 (x2), #33777, #32446, #33605, #33649. Tags: none:all / unapproved:all. This is expected DIFC behavior — verify agents do not loop on the empty result.
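
A simple guard against looping on a filtered (empty) result is to cap the number of attempts and treat an empty list as a valid terminal answer. The sketch below is an assumption about how such a guard could look; `bounded_query` and the `fetch` callable are hypothetical and do not correspond to any GitHub MCP or agent API.

```python
from typing import Callable, List, Tuple


def bounded_query(fetch: Callable[[], List[dict]],
                  max_attempts: int = 2) -> Tuple[List[dict], int]:
    """Call fetch at most max_attempts times, stopping on the first non-empty result.

    Returns (results, attempts_used). An empty list after the final attempt
    is returned as-is instead of triggering further retries.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        results = fetch()
        if results:
            return results, attempt
    return [], max_attempts


# Demo: a stand-in for a query whose results are always filtered out.
call_count = {"n": 0}


def always_filtered() -> List[dict]:
    call_count["n"] += 1
    return []


results, attempts = bounded_query(always_filtered, max_attempts=2)
print(results, attempts, call_count["n"])
```

With a cap in place, a fully filtered query costs a fixed number of calls and the agent can move on to a noop or a reduced-scope report.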

MCP tool usage (24h)

Server	Calls	Notes
safeoutputs	92	Heavy — add_comment, noop, create_pull_request_review_comment, update_pull_request dominant
github	29	pull_request_read, issue_read, search_repositories
agenticworkflows	19	audit, status (this workflow + meta-analysis)
serena	10	find_symbol, activate_project
sentry	8	list_events, find_organizations
mcpscripts	6	github_pr_query, github_discussion_query
tavily	1	one search

All tool calls show status=unknown in summary (telemetry capture issue — does not indicate failure). No missing_tool events flagged.

Known-Issue Status

Issue	Severity	First Seen	Today
codex-model-not-found	🔴 high	05-20	Worse — model name changed but still invalid; 4/4 codex runs failed
smoke-ci-agent-timeout	🟡 med	05-19	Persists — 1/5 cancelled today
high-firewall-block-rate	🟡 med	05-19	Persists — 17 workflows with >25% block rate
pr-sous-chef-execution-drift	🟡 med	05-21	New — turn count varies 0–18 across 4 runs (avg 10.5)
daily-safe-output-optimizer-cost-spike	🟡 med	05-21	New — $8.54 single run, 43% of daily spend
auto-triage-100pct-failure	🟢 med	05-20	Resolved — 1/1 success today
safe-outputs-job-failure	🟡 med	05-19	No recurrence today
upload-assets-job-failure	🟡 med	05-19	No recurrence today
push-repo-memory-patch-size	🟢 med	05-20	No recurrence — today's memory writes well under limit
otlp-404-not-found	🟢 med	05-20	No direct evidence today

Recommendations

🔴 Fix codex model + secret wiring (urgent). Pin codex workflows to a model the API accepts (the CLI ships gpt-5 etc.) and ensure the entrypoint does not scrub the secret needed by retry attempts. Yesterday's patch did not solve the issue.
🔴 Investigate Changeset Generator proxy path. Different code path (api-proxy:10000, model gpt-5.4-mini) — needs separate verification.
🟡 Audit Daily Safe Output Tool Optimizer. 129 turns / $8.54 / 12M tokens in one run is an outlier; inspect tool-call sequence for loops or scope creep.
🟡 Smoke CI cancellation. Raise the Smoke CI agent timeout or move noop emission earlier in the flow; the pattern has now persisted for 3 days.
🟡 PR Sous Chef execution drift. Investigate why turn count varies 0–18 across 4 runs.
🟢 DIFC integrity handling. Confirm 13 filtered list_issues/search_issues calls do not cause infinite-loop or noop failures downstream.
🟢 Smoke Pi firewall: allowlist api-proxy:10002 or remove the probe.

References:

§26248415738 AI Moderator codex failure (longest failed run, 15.7m wall)
§26253685911 Daily Safe Output Tool Optimizer ($8.54)
§26249207824 Smoke CI cancellation

Generated by 🔍 Agentic Workflow Audit Agent · ● 15.5M · ◷

expires on May 22, 2026, 9:56 PM UTC

2026-05-21T22:23:15Z

github-actions[bot]
Bot May 21, 2026
Author

💥 WHOOSH! 🦸‍♂️

The Smoke Test Agent rockets in with a sonic BOOM!

🚀 KA-POW! Claude engine smoke test 26256118523 zipped through the skies, dodged the firewall, vanquished the build errors, and landed safely on the runway! ✨

"Holy MCP servers, Batman — every tool checked out!" 🦇

Cape flapping, the agent leaves a glowing trail of green checkmarks ✅✅✅ and vanishes into the next workflow run... 🌟

THWIP! 🕸️ See you in the next adventure!

Warning

Firewall blocked 6 domains

The following domains were blocked by the firewall during workflow execution:

accounts.google.com
android.clients.google.com
clients2.google.com
contentautofill.googleapis.com
safebrowsingohttpgateway.googleapis.com
www.google.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"

See Network Configuration for more information.

💥 [THE END] — Illustrated by Smoke Claude · ● 6.1M · ◷
