Skip to content

[aw-failures] 6h failure cluster (2026-05-24): Copilot CLI 1.0.51 anthropic-beta header regression broke 13 workflows #34394

@github-actions

Description

@github-actions

Executive summary

In the 6-hour window ending 2026-05-24T07:44 UTC, 16 agentic workflow runs failed with this distribution by engine:

Engine Failures Cluster Severity
GitHub Copilot CLI 1.0.51 13 A — anthropic-beta: context-1m-2025-08-07 rejected (HTTP 400) P0
Codex 0.130.0 1 B — Missing OPENAI_API_KEY after fallback retry P2
Claude Code 2 C — 30-min action timeout; D — max_turns=30 exhausted P2

The dominant cluster (81% of failures) is a single, fully reproducible upstream regression in Copilot CLI 1.0.51. The fix is already proposed in #34390 (bump to Copilot 1.0.52 / Codex 0.133.0 / GitHub MCP v1.0.5) — merging that PR should resolve Cluster A immediately. This report tracks the production impact and ties together the 13 per-workflow auto-failure issues already created.

Failure clusters

Cluster Engine Runs Symptom Recommendation
A Copilot CLI 1.0.51 13 400 Unexpected value(s) \context-1m-2025-08-07` for the `anthropic-beta` header` on every attempt → all 4 retries fail Merge #34390 (Copilot 1.0.51 → 1.0.52)
B Codex 0.130.0 1 First attempt: invalid_request_error. Retries 2–4: Missing environment variable: 'OPENAI_API_KEY' (env not propagated to fresh-run retry) Same upgrade in #34390 bumps Codex; also investigate why fresh-run retry loses OPENAI_API_KEY
C Claude Code 1 Agent completed successfully (called noop), but the Execute Claude Code CLI action timed out at 30 minutes during browser-driven docs testing Raise action timeout or shorten the per-device test plan in multi-device-docs-tester.md
D Claude Code 1 terminal_reason: max_turns, Reached maximum number of turns (30). Many turns burned on permission-denied bash attempts against /tmp/gh-aw/cache-memory/ and /tmp/gh-aw/agent/ Either raise max-turns for step-name-alignment.md or allow-list those tmp paths

Evidence — Cluster A (P0)

All 13 Copilot runs use identical engine config:

engine_id: copilot
model: claude-sonnet-4.5
version: 1.0.51
firewall_version: v0.25.53

The error appears on every Copilot CLI attempt (4 retries per run × 13 runs):

Sample agent stdio log (run 26354999018, Sub-Issue Closer)
● Request failed (transient_bad_request). Retrying...
● Request failed (transient_bad_request). Retrying...

400 Unexpected value(s) `context-1m-2025-08-07` for the `anthropic-beta` header.
Please consult our documentation at docs.anthropic.com or try again without the header.

[copilot-harness] attempt 1: process closed exitCode=1 duration=8s
[copilot-harness] attempt 1 failed: isCAPIError400=false isMCPPolicyError=false
[copilot-harness] retry 1/3 → attempt 2 → same 400 → retry 2/3 → attempt 3 → same 400
[copilot-harness] all 3 retries exhausted — giving up (exitCode=1)

Affected runs (all conclusion=failure, error_count=1, missing_tool_count=0):

All 13 Cluster-A runs and their auto-issues
Workflow Run ID Auto-issue
Sub-Issue Closer §26354999018 #34392
PR Triage Agent §26354991782 #34340
Daily CLI Tools Exploratory Tester §26353911648 #34383
Contribution Check §26353674661 #34348
Workflow Health Manager - Meta-Orchestrator §26353253807 #34376
PR Description Updater §26352706155 (no auto-issue created)
Documentation Noob Tester §26352499399 #34367
jsweep - JavaScript Unbloater §26352390877 #34365
Code Simplifier §26352104868 #34363
GPL Dependency Cleaner (gpclean) §26352005318 #34362
Metrics Collector - Infrastructure Agent §26350966504 #34355
Daily Compiler Threat Spec Optimizer §26350785539 #34353
Daily Firewall Logs Collector and Reporter §26350663635 #34352

Evidence — Cluster B (Codex)

Run §26353930732 (Duplicate Code Detector). Auto-issue: #34384.

First attempt produced an invalid_request_error (the model gpt-5.5 is configured in codex.turn). Retries 2–4 then failed with Missing environment variable: 'OPENAI_API_KEY'. This suggests the harness's --continue / fresh-run retry path does not re-inject the API key secret, which converts a transient model error into a hard auth failure.

Codex turn errors across 4 attempts (run 26353930732)
attempt 1: codex_core::session::turn: Turn error: { "type": "invalid_request_error", ... }
attempt 2: codex_core::session::turn: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 3: codex_core::session::turn: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 4: codex_core::session::turn: Turn error: Missing environment variable: `OPENAI_API_KEY`.
[codex-harness] all 3 retries exhausted — giving up (exitCode=1)

Evidence — Cluster C (Claude Code action timeout)

Run §26354235499 (Multi-Device Docs Tester). No auto-issue was created because the agent itself reported success.

The Claude agent ran for ~26 turns, completed multi-device docs testing successfully across iPhone 12 / iPad / FHD viewports, and called noop. However, the wrapping GitHub Action (Execute Claude Code CLI) hit a 30-minute timeout — the agent finished but the action runner kept blocking. Two permission_denials (nohup npm run dev, redirect to /tmp/gh-aw/agent/astro-dev.log) likely lengthened the run as the agent worked around restrictions.

##[error]The action 'Execute Claude Code CLI' has timed out after 30 minutes.
[claude-harness] attempt 1: process closed duration=2m 56s
# (agent ran 2m 56s but the action wrapper waited longer)

Evidence — Cluster D (Claude Code max_turns)

Run §26352648076 (Step Name Alignment). Auto-issue: #34371.

terminal_reason: max_turns
errors: ["Reached maximum number of turns (30)"]
num_turns: 31
permission_denials: 10  (all reading /tmp/gh-aw/cache-memory/ or /tmp/gh-aw/agent/*)

The agent burned many turns on bash commands probing tmp paths it could not access. Compaction-style behavior plus permission friction caused it to hit the 30-turn cap. Either raise max-turns for this workflow or extend the allowed-paths list to include /tmp/gh-aw/cache-memory/ and /tmp/gh-aw/agent/.

Audit-diff baseline comparison (Cluster A)

For run 26354999018 vs. the most recent successful Sub-Issue Closer baseline (§26274652795, 2026-05-22):

Metric Successful baseline Failed run Delta
turns 5 0 agent never produced a turn
blocked_requests 12 8 both runs have firewall noise from (unknown) domains
posture read_only read_only unchanged

The collapse from 5 turns to 0 turns is the smoking gun: the Copilot CLI's HTTP 400 fails before any agent reasoning step occurs.

Existing tracking & correlation

Issue Type Relevance
#34386 CLI Version Checker auto-report Documents the Copilot 1.0.51 → 1.0.52 upgrade with full release notes (created 06:42 UTC, after the first 9 Cluster-A failures)
#34390 PR/issue with code change Already implements the fix: bumps DefaultCopilotVersion to 1.0.52, DefaultCodexVersion to 0.133.0, regenerates 235 lockfiles
13 per-workflow auto-issues [aw] <workflow> failed One per failed Copilot run; listed in the Cluster A table above
#34342 Prior failure-investigator self-failure Unrelated root cause (npm error notarget @anthropic-ai/claude-code@2.1.150 with a date before 5/21/2026) — not part of this cluster

Proposed fix roadmap

P0 — immediate

  1. Merge #34390 to pin Copilot CLI 1.0.52. This unblocks all 13 Cluster-A workflows. Anthropic's API rejection of anthropic-beta: context-1m-2025-08-07 is hardcoded inside Copilot CLI 1.0.51; only a CLI upgrade fixes it.
  2. After merge, re-trigger one Copilot workflow (e.g. Sub-Issue Closer) and confirm exit 0.

P1 — short-term
3. Codex retry resilience (Cluster B): investigate why codex exec retries lose OPENAI_API_KEY. The harness retry path likely respawns the process without preserving secrets. Add an env-inheritance check before retry, or surface Missing OPENAI_API_KEY as a non-retriable auth error so the workflow fails fast with a clearer signal.
4. Action-timeout vs agent-completion mismatch (Cluster C): the Execute Claude Code CLI step waited past agent completion. Confirm the harness exits promptly after the agent's final tool call.

P2 — backlog
5. Workflow-specific tuning (Cluster D): step-name-alignment.md either needs max-turns raised or its bash allow-list widened to include /tmp/gh-aw/cache-memory/ and /tmp/gh-aw/agent/ so the agent stops burning turns on denied probes.

Sub-issues created

No new sub-issues are filed. Per-workflow auto-failure issues already exist for 14 of the 16 runs (linked above). Creating duplicates would add noise; this parent issue links the existing tracking issues via GitHub's sub-issue mechanism for visibility.

Confidence & unknowns

  • High confidence: Cluster A root cause and fix path. Verified across 4 sampled runs (grep -c context-1m-2025-08-07 returned 4 occurrences per agent-stdio.log, one per Copilot retry).
  • Medium confidence: Cluster B is fixable by the same upgrade. Codex 0.133.0 ships with auth/session changes; the missing-env-var symptom may not survive the bump even if the underlying invalid_request_error does.
  • Unknowns: Why PR Description Updater (run 26352706155) and Multi-Device Docs Tester (run 26354235499) did not generate auto-failure issues. Possibly report-failure-as-issue: false in their frontmatter, or auto-issue creation skipped because the agent step itself reported success while a later job failed.

References

Generated by 🔍 [aw] Failure Investigator (6h) · ● opu47 21.2M ·

  • expires on May 31, 2026, 7:58 AM UTC


6h follow-up — 2026-05-24T13:11 UTC

Cluster A (Copilot CLI 1.0.51 anthropic-beta 400) — RESOLVED on main. Fix PR #34390 ("Bump pinned Copilot/Codex/GitHub MCP versions and regenerate workflow artifacts") was merged at 2026-05-24T12:59:44 UTC.

Failures observed in the 6h window ending 2026-05-24T13:11 UTC

12 additional failures observed; 11 of 12 occurred before the fix was merged, and the single post-merge failure is on a PR branch that has not yet rebased onto the new pinned version. No new failure modes detected.

Failures in this window (12)
Run ID Workflow Engine Created Pre/Post merge Cluster
§26361438009 PR Triage Agent Copilot 1.0.51 12:37:18Z Pre A
§26361477910 PR Description Updater Copilot 1.0.51 12:39:08Z Pre A
§26361491942 PR Description Updater Copilot 1.0.51 12:39:49Z Pre A
§26361510407 PR Description Updater Copilot 1.0.51 12:40:37Z Pre A
§26361519301 PR Description Updater Copilot 1.0.51 12:41:02Z Pre A
§26361533861 PR Description Updater Copilot 1.0.51 12:41:43Z Pre A
§26361624539 PR Code Quality Reviewer Copilot 1.0.51 12:45:51Z Pre A
§26361627620 PR Description Updater Copilot 1.0.51 12:46:00Z Pre A
§26361673647 Changeset Generator Codex 0.133.0 12:48:04Z Pre B
§26361673669 Smoke Codex Codex 0.133.0 12:48:04Z Pre B
§26361673672 Smoke Copilot Copilot 1.0.51 12:48:04Z Pre A
§26361959405 PR Code Quality Reviewer Copilot 1.0.51 13:00:50Z Post (PR branch unmerged) A

Cluster A — closing thoughts

The fix is live on main. Outstanding PR branches that pin Copilot 1.0.51 in their workflow artifacts will continue to fail until they rebase. This is a transient effect and does not require additional remediation beyond normal PR maintenance. Recommended: confirm post-merge Copilot runs on main succeed in the next 6h cycle, then close this issue.

Cluster B — still active and not addressed by the version bump

The two Codex failures in this window already run on Codex 0.133.0 (the bumped version), yet still exhibit the documented pattern: attempt 1 produces an invalid_request_error and retries 2–4 fail with Missing environment variable: 'OPENAI_API_KEY'. The version bump did not fix the harness retry env-propagation bug — that remains a separate, actionable harness-side issue that should be investigated independently.

Codex stdio excerpt (run 26361673647)
attempt 1: Turn error: { "type": "invalid_request_error", ... } (model=gpt-5.4-mini)
attempt 2: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 3: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 4: Turn error: Missing environment variable: `OPENAI_API_KEY`.
[codex-harness] all 3 retries exhausted — giving up (exitCode=1)

No new clusters

No new failure modes appeared in this window. Clusters C (Claude Code action timeout) and D (Claude Code max_turns) had no recurrence.

Update generated by [aw] Failure Investigator (6h) run §26362131625.

Generated by 🔍 [aw] Failure Investigator (6h) · ● opu47 9.3M ·


Status: resolved — closing

The dominant P0 Cluster A tracked here (Copilot CLI 1.0.51 anthropic-beta: context-1m-2025-08-07 regression that broke 13 workflows) has been resolved by PR #34390, which bumped DefaultCopilotVersion from 1.0.51 to 1.0.52. As of 2026-05-25T01:30 UTC:

  • pkg/constants/version_constants.go has const DefaultCopilotVersion Version = "1.0.52" (fix landed).
  • Zero Copilot-engine failures observed in the last 6h sample (43 runs total, 3 failures — all on Codex/Claude, not Copilot).

Residual concerns from this report — already tracked or self-healed

Cluster Original symptom Current state
A — Copilot 1.0.51 anthropic-beta 13 failures Fixed by #34390 → 1.0.52. No recurrences.
B — Codex 0.130.0 Missing OPENAI_API_KEY after fallback retry 1 failure Codex regression now tracked separately in #34522 (different root cause: PR #34390 also bumped Codex to 0.133.0, which introduced the stream_options.include_usage bug).
C — Claude 30-min action timeout (multi-device-docs-tester) 1 failure Not observed in last 6h. Standalone follow-up if it recurs.
D — Claude max_turns=30 exhausted 1 failure Avenger ran into a similar max_turns=25 situation 2026-05-24 18:46–20:39 (§26372192847 and 2 others), then self-recovered. Not a persistent issue — no follow-up filed.

Why close now

Closing as resolved. Re-open if the Copilot regression resurfaces.


Closed 2026-05-25 by failure-investigator after verifying DefaultCopilotVersion = 1.0.52 and zero Copilot failures in 6h lookback.

Generated by 🔍 [aw] Failure Investigator (6h) · opus47 19.7M ·

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions