[audit-workflows] Daily Audit — 2026-05-16 — 63 runs, 21 errors, $17.59 spend #32712
Closed
Replies: 1 comment
This discussion has been marked as outdated by Agentic Workflow Audit Agent. A newer discussion is available at Discussion #32908.
Overview
Audited 63 workflow runs across the last 24 hours (window ending 2026-05-16 ~21:30 UTC). Total spend was $17.59 over 5.4h of agent activity and 41.0M raw / 241.5M effective tokens. 21 errors were recorded with 0 warnings and 0 missing tools. Engine mix: 47 copilot · 12 claude · 2 codex · 2 unknown.
The day was dominated by one recurring infrastructure failure: the Safe Outputs MCP HTTP Server step failing at startup, which knocked out runs across at least three unrelated workflows. A separate `actions/upload-artifact` pattern bug broke Daily Caveman Optimizer, and PR Sous Chef hit transient `Could not resolve host: github.com` errors during checkout.
Headline Metrics
Critical Issues (action required)
- `safe-outputs-mcp-server-startup-failure` — high severity, 4+ recurrences
  - `agent/Start Safe Outputs MCP HTTP Server` exited with code 1.
- `cache-memory-upload-artifact-path-invalid` — medium severity
  - `Upload cache-memory data as artifact` rejected the pattern `/tmp/gh-aw/cache-memory/.` with: "Relative pathing '.' and '..' is not allowed."
  - Use `/tmp/gh-aw/cache-memory/**` or `/tmp/gh-aw/cache-memory/*` instead of the trailing `.`.
- `pr-sous-chef-dns-checkout-failure` — medium severity, 3 recurrences
  - `activation/Checkout actions folder` failed with `Could not resolve host: github.com`.
Workflow Health Trends
Success rate fell from 100% in the 17:00 UTC bucket to 52.2% at 20:00 UTC as the Safe Outputs MCP failures clustered in the late-afternoon scheduled runs, before partial recovery at 21:00. Repo memory had no prior history at the start of this audit, so the chart is hour-bucketed for today; once daily snapshots accumulate, future audits will render a true day-over-day curve.
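As a rough illustration of the hour-bucketed success rate described above, the sketch below groups runs by start hour and computes the percentage that succeeded. The record shape (ISO timestamp plus a conclusion string), the function name, and the sample counts are all assumptions made for illustration. The audit does not publish per-bucket run counts, so 12 successes out of 23 runs is simply one combination that reproduces a 52.2% rate.

```python
from collections import defaultdict
from datetime import datetime


def hourly_success_rate(runs):
    """Return {hour: success percentage} for (ISO timestamp, conclusion) pairs."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for started_at, conclusion in runs:
        hour = datetime.fromisoformat(started_at).hour
        totals[hour] += 1
        if conclusion == "success":
            successes[hour] += 1
    # One decimal place, matching the precision used in the audit text.
    return {h: round(100.0 * successes[h] / totals[h], 1) for h in sorted(totals)}


# Illustrative data only: these are not the real run records.
sample_runs = (
    [("2026-05-16T17:05:00", "success")] * 4
    + [("2026-05-16T20:10:00", "success")] * 12
    + [("2026-05-16T20:40:00", "failure")] * 11
)
rates = hourly_success_rate(sample_runs)
print(rates)
```

Once daily snapshots exist, the same grouping could key on the date instead of the hour to produce the day-over-day curve.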
Token & Cost Trends
Cost was front-loaded into the 18:00–19:00 buckets ($5.83 + $5.19), driven by claude-engine workflows like Daily Code Metrics, Copilot Agent PR Analysis, and Design Decision Gate. A single late run — Daily Safe Output Tool Optimizer (§25972929656) — drove the 21:00 spike to $5.99 (≈34% of the daily total) and ended in failure, making it the day's most expensive outlier.
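The share figures above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the dollar amounts quoted in this section; the variable and function names are hypothetical.

```python
# Figures quoted in the audit text; every other bucket is left out.
daily_total = 17.59
bucket_cost = {"18:00": 5.83, "19:00": 5.19, "21:00": 5.99}


def share_of_total(cost, total):
    """Return cost as a percentage of total, rounded to one decimal place."""
    return round(100.0 * cost / total, 1)


shares = {hour: share_of_total(cost, daily_total) for hour, cost in bucket_cost.items()}
# Spend not covered by the three quoted buckets.
remainder = round(daily_total - sum(bucket_cost.values()), 2)
print(shares, remainder)
```

The 21:00 bucket comes out at about 34% of the daily total, consistent with the figure above, and the three quoted buckets together account for all but $0.58 of the day's spend.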
Top Failing Workflows
Firewall Activity & Network Friction
15 workflows routed requests to the `(unknown)` domain category and were blocked.
The consistent `(unknown)` domain category strongly suggests the proxy CONNECT host header is being stripped or unresolved. Recommendation: capture the blocked CONNECT targets and add known-good entries to the firewall allowlist; verify that DNS for `api.githubcopilot.com` resolves inside the sandbox.
Cost Outliers & Anomalies
11 high-anomaly events flagged across 63 runs by cross-run log template analysis (anomaly score > 0.6).
Cost spikes (single-run):
Execution drift: PR Sous Chef turn count varied 0–40 (mean 14.9) — suggests prompt instability or task-shape changes. Worth inspecting recent prompt edits and adding a 0-turn branch for empty PRs.
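One way to picture the suggested 0-turn branch is a guard that decides, before the agent starts, whether there is anything to do. The sketch below is hypothetical: `plan_run`, `summarize_turns`, and the sample turn counts are illustrative names and values, not part of the PR Sous Chef workflow, and the sample is not the audited data.

```python
from statistics import mean, pstdev


def summarize_turns(turn_counts):
    """Summarize the spread of per-run turn counts to make drift visible."""
    return {
        "min": min(turn_counts),
        "max": max(turn_counts),
        "mean": round(mean(turn_counts), 1),
        "stdev": round(pstdev(turn_counts), 1),
    }


def plan_run(changed_files):
    """Return 'skip' for an empty PR so a 0-turn run is an explicit outcome."""
    return "skip" if not changed_files else "run-agent"


# Illustrative turn counts spanning the 0-40 range mentioned above.
illustrative_turns = [0, 0, 5, 12, 18, 25, 40]
summary = summarize_turns(illustrative_turns)
print(summary, plan_run([]), plan_run(["README.md"]))
```

Separating intentional skips from real agent runs would keep 0-turn runs out of the drift statistics, so any remaining variance points more directly at prompt or task-shape changes.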
Recommendations Summary
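For the cache-memory upload fix, a small pre-flight check can catch the rejected pattern before a workflow is published. The sketch below is a conservative local check based only on the error text quoted under Critical Issues. It is not the validation logic of `actions/upload-artifact`, which may differ (for example in how it treats a leading `.`). The function names are hypothetical.

```python
def relative_segments(pattern: str) -> list[str]:
    """Return any '.' or '..' path segments found in an upload path pattern."""
    return [seg for seg in pattern.split("/") if seg in (".", "..")]


def check_upload_pattern(pattern: str) -> None:
    """Raise ValueError if the pattern contains a relative path segment."""
    bad = relative_segments(pattern)
    if bad:
        raise ValueError(f"pattern {pattern!r} contains relative segment(s): {bad}")


# The rejected pattern and the two replacements suggested in this audit.
rejected = "/tmp/gh-aw/cache-memory/."
accepted = ["/tmp/gh-aw/cache-memory/**", "/tmp/gh-aw/cache-memory/*"]

for candidate in [rejected, *accepted]:
    try:
        check_upload_pattern(candidate)
        print(f"ok:       {candidate}")
    except ValueError as err:
        print(f"rejected: {err}")
```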
`actions/upload-artifact` pattern in cache-memory upload step
`(unknown)` blocks
Repo Memory Updates
This audit seeded the previously-empty repo memory with: `audit-history.jsonl` (1 entry), `workflow-trends.json` (top 15 workflows), `known-issues.json` (5 issues), `recommendations.json` (5 recs), `anomalies.json`, and `metrics-summary.json` (1 day). Subsequent audits will accumulate multi-day history so the trend charts can become genuine 30-day rollups.
References: