[audit-workflows] Daily Audit 2026-06-07 — prod-main 92.9% healthy; 3 config-drift recurrences (day 3), minimatch crash resolved #37663
Closed
This discussion has been marked as outdated by Agentic Workflow Audit Agent. A newer discussion is available at Discussion #37950.
Overview
Daily audit of agentic workflow runs for 2026-06-07. The 24h window was small and quiet (Sunday): the continuation fetch confirmed that all runs landed in a ~3.7h cluster (17:56–21:38Z), with 56 runs, 54 of them completed (2 still in progress, including this agent). Headline: 46 success / 8 failure = 85.2% overall, but on production `main` the real success rate is 92.9% (39/42, excluding one intentional-failure test workflow). Four of the eight failures are confined to a single feature PR branch.
Good news up top: the `minimatch` SDK-driver crash that dominated 06-06 (6 workflows) is gone; it was not seen in any of 28 copilot runs. The token-budget 429, safe-output partial-failure, and detection-parse failures were also all absent. The firewall blocked 0 of 2516 requests.
Summary
`main` success (real): 92.9% (39/42)
Critical Issues (recurring on prod `main`)
These three have now failed a prod-main schedule on three consecutive days (06-05/06/07). All are static config problems, all are cheap but total (0 useful tokens), and all should be one-line fixes:
- `Daily Caveman Optimizer` [claude]: `400 "This model does not support the effort parameter"` (run 27104658186; now retries 11×). Fix: drop the `effort`/`reasoning-effort` param from the workflow frontmatter.
- `Daily Cache Strategy Analyzer` [codex]: `404 "Model not found gpt-5-codex-alpha-2025-11-07"` (run 27101492605). The model id was rotated from `adelie-alpha-2026-02-19` but still 404s, so the workflow is chasing a moving endpoint. Fix: pin a stable/aliased codex model id rather than another snapshot.
- `Daily Safe Output Integrator` [copilot]: copilot-sdk-driver tool-permission lockout (run 27101776716): 13 denials of `read(pkg/workflow/*_test.go)` + `shell(sed ... safe_outputs_config.go)`, aborted after the 5-denial threshold. Persisting 5+ windows (narrower than 06-06). Fix: reconcile the sdk-driver `allow-tool` mapping with the workflow's declared tools.
New This Window
- `activation-guardrail-cjs-module-not-found` (MEDIUM): All 4 PR-review workflows on branch `copilot/fix-daily-credit-limit-test` failed identically: the activation job crashes with `Cannot find module 'check_daily_effective_workflow_guardrail.cjs'`. The agent job is then correctly skipped, but the activation failure reds the run. This is confined to the dev branch of the feature PR building the daily-credit-limit guardrail, but it is a reproducible packaging gap: the compiled lock files `require()` a `.cjs` that isn't bundled in the action package. It will redden `main` if merged unbundled. Affected: Test Quality Sentinel, PR Code Quality Reviewer, Design Decision Gate, Matt Pocock Skills Reviewer (runs 27103538546/47/66/67). Fix: bundle the `.cjs` before merge and add a CI smoke test that `node`-requires every `.cjs` the activation job references (same class as the now-fixed minimatch gap).
Full failure breakdown (8 failures / 4 classes)
Trend Charts (to 2026-06-07)
Workflow Health
Success rate sits at 85.2% for the window (7-day avg ~86.5%), holding just below the 90% line. The dip vs. recent ~90% days is almost entirely the dev-branch guardrail cluster — prod-main real success is 92.9%, so platform health is steady, not regressing.
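
The two rates quoted here follow directly from the counts in the Overview. A minimal sketch of the arithmetic, where the variable and function names are illustrative and not taken from the audit tooling:

```javascript
// Counts as reported in the Overview.
const windowRuns = { success: 46, failure: 8 };   // 54 completed runs in the window
const prodMain = { success: 39, completed: 42 };  // excludes the intentional-failure test workflow

// Success percentage, rounded to one decimal place.
function successRate(success, completed) {
  return Math.round((success / completed) * 1000) / 10;
}

const overall = successRate(windowRuns.success, windowRuns.success + windowRuns.failure);
const real = successRate(prodMain.success, prodMain.completed);

console.log(`overall: ${overall}%`);   // overall: 85.2%
console.log(`prod-main: ${real}%`);    // prod-main: 92.9%
```

The 3 non-successes out of 42 on prod-main line up with the three recurring config failures listed under Critical Issues.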
Token Usage
Daily tokens came in at 24.0M, below the 7-day moving average of ~29.3M — expected for a low-volume Sunday with fewer PR-triggered runs. No runaway-cost or token-cap (25M) pressure this window; the prior 429 risk on Daily Ambient Context Optimizer stayed quiet.
Top cost runs (claude-measured, all successful)
Note: codex/copilot cost is not reported by the harness, so $ totals are claude-only.
`PR Sous Chef` showed execution drift (0/7/21 turns across 3 runs, all success); this was flagged by observability and is not a failure.
Recommendations
- Drop `effort` (Caveman), pin a stable codex id (Cache Strategy), and reconcile sdk-driver tool permissions (Safe Output Integrator).
- Bundle `check_daily_effective_workflow_guardrail.cjs` before merging the daily-credit-limit feature; extend the CI smoke test to `node`-require every activation `.cjs`.
References: §27104658186 · §27101492605 · §27103538546