Agentic Workflow Audit — 2026-03-26 #23173
Closed
This discussion has been marked as outdated by Agentic Workflow Audit Agent. A newer discussion is available at Discussion #23276.
Daily audit of agentic workflow runs for the last 24 hours (2026-03-25 → 2026-03-26).
Summary
Workflow Health Trend
The sharp drop today is driven primarily by widespread Copilot authentication failures (7+ runs) and GitHub API rate limiting during a concurrent execution burst at ~20:25 UTC.
Token & Cost Trend
Today's $7.15 cost is the highest in the historical record, driven by multiple high-turn Claude runs (Smoke Claude ran twice at ~$1.10 each, plus Daily Doc Updater at $1.11).
🔴 Critical Issues
1. Copilot Authentication Failures (7 runs)
Root cause: `Authentication failed (Request ID: ...)`. The Copilot token is invalid, expired, or missing the `Copilot Requests` permission.
Affected workflows:
- `issue-monster`: 3 failures (§23614004922, §23615946471, §23617232281)
- `smoke-copilot`: 2 failures (§23614909987, §23615931012)
- `agent-container-smoke-test`: 2 failures (§23614910062, §23615931059)
- `metrics-collector` (§23614384478)
- `daily-difc-integrity-filtered-events-analyzer` (§23615370549)
- `daily-workflow-updater` (§23617471074)
Recommended fix: Refresh the `COPILOT_GITHUB_TOKEN` (or `GH_AW_GITHUB_TOKEN`) secret. Verify the Fine-Grained PAT has the `Copilot Requests` permission enabled. Check with `gh auth status`.
2. GitHub API Rate Limiting: safe_outputs failures
Root cause: Concurrent PR workflows targeting PR #23160 triggered the installation rate limit at 2026-03-26T20:25 UTC. The `safe_outputs` job received `API rate limit exceeded for installation`, and 6 safe-output messages failed.
Affected workflows:
- `smoke-claude` (§23615931027): agent succeeded ✅, safe_outputs failed ❌ (add_comment, update_pull_request, create_pull_request_review_comment ×2, add_reviewer, submit_pull_request_review)
- `daily-doc-updater` (§23616275567): agent succeeded ✅, safe_outputs failed ❌
Recommended fix: Stagger schedule times for workflows that run on the same PR. Consider adding retry logic with exponential backoff in `safe_outputs` for rate limit errors (HTTP 429).
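The retry recommendation above could look roughly like the following. This is a minimal sketch, not the actual `safe_outputs` implementation, which this report does not show. `RateLimitError`, `backoff_delay`, and `call_with_backoff` are hypothetical names introduced for illustration, and the sketch assumes the failing call can surface a rate-limit response as an exception, optionally carrying a server-provided wait hint.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a safe-output call raises on a rate-limited response."""

    def __init__(self, status: int, retry_after: Optional[float] = None):
        super().__init__(f"rate limited (HTTP {status})")
        self.status = status
        self.retry_after = retry_after  # seconds suggested by the server, if any


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    retry_after: Optional[float] = None,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before retrying after failed attempt `attempt` (0-based)."""
    if retry_after is not None:
        return retry_after  # honor the server hint when one is available
    ceiling = min(cap, base * 2 ** attempt)  # exponential growth, capped
    return ceiling * rng()  # "full jitter": spread concurrent retries apart


def call_with_backoff(
    op: Callable[[], T],
    max_attempts: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Run `op`, retrying only on RateLimitError, up to `max_attempts` tries."""
    for attempt in range(max_attempts):
        try:
            return op()
        except RateLimitError as err:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the failure to the job
            sleep(backoff_delay(attempt, base, cap, err.retry_after, rng))
    raise ValueError("max_attempts must be at least 1")
```

The jitter matters in this incident because the failures came from a burst of concurrent workflows: without it, every rate-limited job would retry at the same instants and collide again. Errors other than the rate-limit case are deliberately not retried here.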
High-Cost Runs (Top 5)
The audit system flagged Smoke Claude as `resource_heavy_for_domain` with a `partially_reducible` assessment: ~91% of turns are data-gathering that could move to deterministic pre-agent steps. Consider moving file reads and PR metadata fetching to frontmatter steps to reduce inference cost.
Agentic Behavior Concerns
- `exploratory` execution, `write_heavy` actuation. Baseline comparison shows turns increased from 40 → 45 vs. the prior successful run.
- poor `cost_efficiency` rating.
- `resource_heavy_for_domain` for a Triage task with 14 tool types and 13.6m duration despite 0 agent turns.
🔧 Missing Tool
`add_smoked_label` was requested by Smoke Codex (§23614909967) but is not configured. The agent correctly reported this via `missing_tool` and the run succeeded. This tool should be added to the workflow's safe-output permissions.
✅ Healthy Runs
Recommendations
- Add `add_smoked_label` to the Smoke Codex workflow safe-output configuration.