[aw-failures] 6h failure cluster (2026-05-24): Copilot CLI 1.0.51 `anthropic-beta` header regression broke 13 workflows #34394

### Executive summary

In the 6-hour window ending 2026-05-24T07:44 UTC, **16 agentic workflow runs failed** with this distribution by engine:

| Engine | Failures | Cluster | Severity |
|---|---:|---|---|
| GitHub Copilot CLI 1.0.51 | 13 | A — `anthropic-beta: context-1m-2025-08-07` rejected (HTTP 400) | **P0** |
| Codex 0.130.0 | 1 | B — `Missing OPENAI_API_KEY` after fallback retry | P2 |
| Claude Code | 2 | C — 30-min action timeout; D — `max_turns=30` exhausted | P2 |

The dominant cluster (81% of failures) is a single, fully reproducible upstream regression in Copilot CLI 1.0.51. **The fix is already proposed** in [#34390](https://github.com/github/gh-aw/issues/34390) (bump to Copilot 1.0.52 / Codex 0.133.0 / GitHub MCP v1.0.5); merging it should resolve Cluster A immediately. This report tracks the production impact and ties together the 12 per-workflow auto-failure issues already created for the 13 Cluster-A runs.
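The 81% figure follows directly from the engine table. A quick check, using only the counts listed above (the dictionary keys are shorthand chosen for this sketch):

```python
# Failure counts per engine, copied from the summary table.
failures = {"copilot": 13, "codex": 1, "claude": 2}


def share(engine: str, counts: dict) -> float:
    """Return the engine's share of all failures as a percentage."""
    total = sum(counts.values())
    return 100.0 * counts[engine] / total


print(f"total={sum(failures.values())} copilot_share={share('copilot', failures):.0f}%")
```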

### Failure clusters

| Cluster | Engine | Runs | Symptom | Recommendation |
|---|---|---:|---|---|
| A | Copilot CLI 1.0.51 | 13 | `` 400 Unexpected value(s) `context-1m-2025-08-07` for the `anthropic-beta` header `` on every attempt → all 4 attempts (1 initial + 3 retries) fail | Merge [#34390](https://github.com/github/gh-aw/issues/34390) (Copilot 1.0.51 → 1.0.52) |
| B | Codex 0.130.0 | 1 | Attempt 1: `invalid_request_error`. Attempts 2–4: `Missing environment variable: 'OPENAI_API_KEY'` (env apparently not propagated to fresh-run retry) | Same upgrade in [#34390](https://github.com/github/gh-aw/issues/34390) bumps Codex; also investigate why fresh-run retry loses `OPENAI_API_KEY` |
| C | Claude Code | 1 | Agent completed successfully (called `noop`), but the `Execute Claude Code CLI` action timed out at 30 minutes during browser-driven docs testing | Raise action timeout or shorten the per-device test plan in `multi-device-docs-tester.md` |
| D | Claude Code | 1 | `terminal_reason: max_turns`, `Reached maximum number of turns (30)`. Many turns burned on permission-denied bash attempts against `/tmp/gh-aw/cache-memory/` and `/tmp/gh-aw/agent/` | Either raise `max-turns` for `step-name-alignment.md` or allow-list those tmp paths |

### Evidence — Cluster A (P0)

All 13 Copilot runs use identical engine config:

```
engine_id: copilot
model: claude-sonnet-4.5
version: 1.0.51
firewall_version: v0.25.53
```

The error appears on every Copilot CLI attempt (4 attempts per run, i.e. 1 initial + 3 retries, × 13 runs):

<details>
<summary>Sample agent stdio log (run 26354999018, Sub-Issue Closer)</summary>

```
● Request failed (transient_bad_request). Retrying...
● Request failed (transient_bad_request). Retrying...

400 Unexpected value(s) `context-1m-2025-08-07` for the `anthropic-beta` header.
Please consult our documentation at docs.anthropic.com or try again without the header.

[copilot-harness] attempt 1: process closed exitCode=1 duration=8s
[copilot-harness] attempt 1 failed: isCAPIError400=false isMCPPolicyError=false
[copilot-harness] retry 1/3 → attempt 2 → same 400 → retry 2/3 → attempt 3 → same 400
[copilot-harness] all 3 retries exhausted — giving up (exitCode=1)
```

</details>
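Because the rejection is deterministic, retrying it cannot succeed. The sketch below shows one way a log scanner could recognise this error line and extract the offending header and value. The regex is derived only from the sample log above, and `find_header_rejection` is a hypothetical helper, not part of the Copilot harness.

```python
import re

# Pattern based on the error line in the sample agent stdio log.
HEADER_REJECTION = re.compile(
    r"400 Unexpected value\(s\) `(?P<value>[^`]+)` for the `(?P<header>[^`]+)` header"
)


def find_header_rejection(stdio: str):
    """Return (header, value) if the log contains a rejected-header 400, else None."""
    match = HEADER_REJECTION.search(stdio)
    if match is None:
        return None
    return match.group("header"), match.group("value")


sample = "400 Unexpected value(s) `context-1m-2025-08-07` for the `anthropic-beta` header."
print(find_header_rejection(sample))
```

A harness could use a check like this to stop after the first attempt instead of exhausting all retries on an error that no retry can fix.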

Affected runs (all conclusion=failure, error_count=1, missing_tool_count=0):

<details>
<summary>All 13 Cluster-A runs and their auto-issues</summary>

| Workflow | Run ID | Auto-issue |
|---|---|---|
| Sub-Issue Closer | [§26354999018](https://github.com/github/gh-aw/actions/runs/26354999018) | #34392 |
| PR Triage Agent | [§26354991782](https://github.com/github/gh-aw/actions/runs/26354991782) | #34340 |
| Daily CLI Tools Exploratory Tester | [§26353911648](https://github.com/github/gh-aw/actions/runs/26353911648) | #34383 |
| Contribution Check | [§26353674661](https://github.com/github/gh-aw/actions/runs/26353674661) | #34348 |
| Workflow Health Manager - Meta-Orchestrator | [§26353253807](https://github.com/github/gh-aw/actions/runs/26353253807) | #34376 |
| PR Description Updater | [§26352706155](https://github.com/github/gh-aw/actions/runs/26352706155) | _(no auto-issue created)_ |
| Documentation Noob Tester | [§26352499399](https://github.com/github/gh-aw/actions/runs/26352499399) | #34367 |
| jsweep - JavaScript Unbloater | [§26352390877](https://github.com/github/gh-aw/actions/runs/26352390877) | #34365 |
| Code Simplifier | [§26352104868](https://github.com/github/gh-aw/actions/runs/26352104868) | #34363 |
| GPL Dependency Cleaner (gpclean) | [§26352005318](https://github.com/github/gh-aw/actions/runs/26352005318) | #34362 |
| Metrics Collector - Infrastructure Agent | [§26350966504](https://github.com/github/gh-aw/actions/runs/26350966504) | #34355 |
| Daily Compiler Threat Spec Optimizer | [§26350785539](https://github.com/github/gh-aw/actions/runs/26350785539) | #34353 |
| Daily Firewall Logs Collector and Reporter | [§26350663635](https://github.com/github/gh-aw/actions/runs/26350663635) | #34352 |

</details>

### Evidence — Cluster B (Codex)

Run [§26353930732](https://github.com/github/gh-aw/actions/runs/26353930732) (Duplicate Code Detector). Auto-issue: #34384.

The first attempt produced an `invalid_request_error` (the model `gpt-5.5` is configured in `codex.turn`). Attempts 2–4 then failed with `Missing environment variable: 'OPENAI_API_KEY'`. This suggests the harness's `--continue` / fresh-run retry path does not re-inject the API key secret, which converts a transient model error into a hard auth failure.

<details>
<summary>Codex turn errors across 4 attempts (run 26353930732)</summary>

```
attempt 1: codex_core::session::turn: Turn error: { "type": "invalid_request_error", ... }
attempt 2: codex_core::session::turn: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 3: codex_core::session::turn: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 4: codex_core::session::turn: Turn error: Missing environment variable: `OPENAI_API_KEY`.
[codex-harness] all 3 retries exhausted — giving up (exitCode=1)
```

</details>
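The suspected mechanism can be illustrated in isolation. The sketch below is not the harness's code; it only shows how a child process loses `OPENAI_API_KEY` when the retry path rebuilds the environment without it. The key value is a placeholder, and the filtered environment is an assumption about what the buggy path might do.

```python
import os
import subprocess
import sys

# Child program: report whether the key is visible in its environment.
CHILD = "import os; print(os.environ.get('OPENAI_API_KEY', 'MISSING'))"


def run_child(env: dict) -> str:
    """Run a child Python process with exactly the given environment."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Placeholder secret, for illustration only.
parent_env = dict(os.environ, OPENAI_API_KEY="sk-example")

# First attempt: the full environment, including the secret, is passed on.
print(run_child(parent_env))

# Hypothetical buggy retry: the environment is rebuilt without the key.
retry_env = {k: v for k, v in parent_env.items() if k != "OPENAI_API_KEY"}
print(run_child(retry_env))
```

If the real retry path behaves like the second call, passing the original environment through unchanged on respawn would be the smallest fix.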

### Evidence — Cluster C (Claude Code action timeout)

Run [§26354235499](https://github.com/github/gh-aw/actions/runs/26354235499) (Multi-Device Docs Tester). No auto-issue was created because the agent itself reported success.

The Claude agent ran for ~26 turns, completed multi-device docs testing successfully across iPhone 12 / iPad / FHD viewports, and called `noop`. However, the wrapping GitHub Action (`Execute Claude Code CLI`) hit a 30-minute timeout — the agent finished but the action runner kept blocking. Two `permission_denials` (`nohup npm run dev`, redirect to `/tmp/gh-aw/agent/astro-dev.log`) likely lengthened the run as the agent worked around restrictions.

```
##[error]The action 'Execute Claude Code CLI' has timed out after 30 minutes.
[claude-harness] attempt 1: process closed duration=2m 56s
# (agent ran 2m 56s but the action wrapper waited longer)
```

### Evidence — Cluster D (Claude Code max_turns)

Run [§26352648076](https://github.com/github/gh-aw/actions/runs/26352648076) (Step Name Alignment). Auto-issue: #34371.

```
terminal_reason: max_turns
errors: ["Reached maximum number of turns (30)"]
num_turns: 31
permission_denials: 10  (all reading /tmp/gh-aw/cache-memory/ or /tmp/gh-aw/agent/*)
```

The agent burned many turns on bash commands probing tmp paths it could not access. Compaction-style behavior plus permission friction caused it to hit the 30-turn cap. Either raise `max-turns` for this workflow or extend the allowed-paths list to include `/tmp/gh-aw/cache-memory/` and `/tmp/gh-aw/agent/`.
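A rough estimate of the turn budget lost to denials, using the numbers from the run summary above. It assumes each denied bash call consumed exactly one turn, which the logs do not confirm.

```python
num_turns = 31           # from the run summary
permission_denials = 10  # from the run summary

# Assumption: each denied call cost exactly one turn.
wasted_share = permission_denials / num_turns
productive_turns = num_turns - permission_denials

print(f"{wasted_share:.0%} of turns denied; {productive_turns} productive turns")
```

Under that assumption roughly a third of the budget went to denied probes, which is why allow-listing the two tmp paths may help more than raising `max-turns` alone.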

### Audit-diff baseline comparison (Cluster A)

For run 26354999018 vs. the most recent successful Sub-Issue Closer baseline ([§26274652795](https://github.com/github/gh-aw/actions/runs/26274652795), 2026-05-22):

| Metric | Successful baseline | Failed run | Delta |
|---|---:|---:|---|
| turns | 5 | 0 | agent never produced a turn |
| blocked_requests | 12 | 8 | both runs have firewall noise from `(unknown)` domains |
| posture | read_only | read_only | unchanged |

The collapse from 5 turns to 0 turns is the smoking gun: the HTTP 400 aborts the Copilot CLI before any agent reasoning step occurs.
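The comparison in the table can be reproduced with a small metric diff. This is a sketch over the three values shown above; `diff_metrics` is a hypothetical helper, not the audit-diff tool itself.

```python
# Metric values copied from the baseline comparison table.
baseline = {"turns": 5, "blocked_requests": 12, "posture": "read_only"}
failed = {"turns": 0, "blocked_requests": 8, "posture": "read_only"}


def diff_metrics(before: dict, after: dict) -> dict:
    """Return {metric: (before, after)} for metrics whose values differ."""
    return {
        key: (before[key], after.get(key))
        for key in before
        if before[key] != after.get(key)
    }


print(diff_metrics(baseline, failed))
```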

### Existing tracking & correlation

| Issue | Type | Relevance |
|---|---|---|
| [#34386](https://github.com/github/gh-aw/issues/34386) | CLI Version Checker auto-report | Documents the Copilot 1.0.51 → 1.0.52 upgrade with full release notes (created 06:42 UTC, after the first 9 Cluster-A failures) |
| [#34390](https://github.com/github/gh-aw/issues/34390) | PR/issue with code change | **Already implements the fix**: bumps `DefaultCopilotVersion` to `1.0.52`, `DefaultCodexVersion` to `0.133.0`, regenerates 235 lockfiles |
| 12 per-workflow auto-issues | `[aw] <workflow> failed` | One per failed Copilot run, except PR Description Updater (no auto-issue created); listed in the Cluster A table above |
| [#34342](https://github.com/github/gh-aw/issues/34342) | Prior failure-investigator self-failure | Unrelated root cause (`` npm error notarget `@anthropic-ai/claude-code`@2.1.150 with a date before 5/21/2026 ``), not part of this cluster |

### Proposed fix roadmap

**P0 — immediate**
1. **Merge [#34390](https://github.com/github/gh-aw/issues/34390)** to pin Copilot CLI 1.0.52. This unblocks all 13 Cluster-A workflows. The `anthropic-beta: context-1m-2025-08-07` header that Anthropic's API rejects is hardcoded inside Copilot CLI 1.0.51; only a CLI upgrade fixes it.
2. After merge, re-trigger one Copilot workflow (e.g. Sub-Issue Closer) and confirm exit 0.

**P1 — short-term**
3. **Codex retry resilience** (Cluster B): investigate why `codex exec` retries lose `OPENAI_API_KEY`. The harness retry path likely respawns the process without preserving secrets. Add an env-inheritance check before retry, or surface `Missing OPENAI_API_KEY` as a non-retriable auth error so the workflow fails fast with a clearer signal.
4. **Action-timeout vs agent-completion mismatch** (Cluster C): the `Execute Claude Code CLI` step waited past agent completion. Confirm the harness exits promptly after the agent's final tool call.

**P2 — backlog**
5. **Workflow-specific tuning** (Cluster D): `step-name-alignment.md` either needs `max-turns` raised or its bash allow-list widened to include `/tmp/gh-aw/cache-memory/` and `/tmp/gh-aw/agent/` so the agent stops burning turns on denied probes.
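The fail-fast option in item 3 could look roughly like the sketch below. It is a minimal illustration, not the harness implementation: `run_with_retries`, the marker tuple, and the substring matching are all assumptions, and `replay` only replays the error sequence recorded for run 26353930732.

```python
from typing import Callable, Optional, Tuple

# Assumption: a substring match is enough to recognise the auth failure.
NON_RETRIABLE_MARKERS = ("Missing environment variable",)


def run_with_retries(
    attempt: Callable[[int], Optional[str]], max_retries: int = 3
) -> Tuple[int, Optional[str]]:
    """Call attempt(n) until success (None) or a non-retriable error.

    Returns (attempts_made, last_error).
    """
    error: Optional[str] = None
    attempts = 0
    for n in range(1, max_retries + 2):  # 1 initial attempt + max_retries retries
        attempts = n
        error = attempt(n)
        if error is None:
            break
        if any(marker in error for marker in NON_RETRIABLE_MARKERS):
            break  # fail fast: retrying cannot restore a missing secret
    return attempts, error


def replay(n: int) -> Optional[str]:
    """Replay the error sequence seen in run 26353930732."""
    if n == 1:
        return "invalid_request_error"
    return "Missing environment variable: `OPENAI_API_KEY`."


print(run_with_retries(replay))
```

With this policy the replayed run stops after attempt 2 instead of attempt 4, and the last error names the real problem.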

### Sub-issues created

No new sub-issues are filed. Per-workflow auto-failure issues already exist for 14 of the 16 runs (linked above). Creating duplicates would add noise; this parent issue links the existing tracking issues via GitHub's sub-issue mechanism for visibility.

### Confidence & unknowns

- **High confidence**: Cluster A root cause and fix path. Verified across 4 sampled runs (`grep -c context-1m-2025-08-07` returned 4 occurrences per agent-stdio.log, one per Copilot attempt).
- **Medium confidence**: Cluster B is fixable by the same upgrade. Codex 0.133.0 ships with auth/session changes; the missing-env-var symptom may not survive the bump even if the underlying `invalid_request_error` does.
- **Unknowns**: Why PR Description Updater (run 26352706155) and Multi-Device Docs Tester (run 26354235499) did not generate auto-failure issues. Possibly `report-failure-as-issue: false` in their frontmatter, or auto-issue creation skipped because the agent step itself reported success while a later job failed.

### References

- [§26354999018 — Sub-Issue Closer (Cluster A representative)](https://github.com/github/gh-aw/actions/runs/26354999018)
- [§26353930732 — Duplicate Code Detector (Cluster B)](https://github.com/github/gh-aw/actions/runs/26353930732)
- [§26354235499 — Multi-Device Docs Tester (Cluster C)](https://github.com/github/gh-aw/actions/runs/26354235499)

> Generated by [🔍 [aw] Failure Investigator (6h)](https://github.com/github/gh-aw/actions/runs/26355447076) · ● opu47 21.2M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Faw-failure-investigator%22&type=issues)
> - [x] expires on May 31, 2026, 7:58 AM UTC

---

### 6h follow-up — 2026-05-24T13:11 UTC

**Cluster A (Copilot CLI 1.0.51 `anthropic-beta` 400) — RESOLVED on `main`.** Fix PR #34390 ("Bump pinned Copilot/Codex/GitHub MCP versions and regenerate workflow artifacts") was **merged at 2026-05-24T12:59:44 UTC**.

#### Failures observed in the 6h window ending 2026-05-24T13:11 UTC

12 additional failures observed; **11 of 12 occurred before the fix was merged**, and the single post-merge failure is on a PR branch that has not yet rebased onto the new pinned version. No new failure modes detected.

<details>
<summary>Failures in this window (12)</summary>

| Run ID | Workflow | Engine | Created | Pre/Post merge | Cluster |
|---|---|---|---|---|---|
| [§26361438009](https://github.com/github/gh-aw/actions/runs/26361438009) | PR Triage Agent | Copilot 1.0.51 | 12:37:18Z | Pre | A |
| [§26361477910](https://github.com/github/gh-aw/actions/runs/26361477910) | PR Description Updater | Copilot 1.0.51 | 12:39:08Z | Pre | A |
| [§26361491942](https://github.com/github/gh-aw/actions/runs/26361491942) | PR Description Updater | Copilot 1.0.51 | 12:39:49Z | Pre | A |
| [§26361510407](https://github.com/github/gh-aw/actions/runs/26361510407) | PR Description Updater | Copilot 1.0.51 | 12:40:37Z | Pre | A |
| [§26361519301](https://github.com/github/gh-aw/actions/runs/26361519301) | PR Description Updater | Copilot 1.0.51 | 12:41:02Z | Pre | A |
| [§26361533861](https://github.com/github/gh-aw/actions/runs/26361533861) | PR Description Updater | Copilot 1.0.51 | 12:41:43Z | Pre | A |
| [§26361624539](https://github.com/github/gh-aw/actions/runs/26361624539) | PR Code Quality Reviewer | Copilot 1.0.51 | 12:45:51Z | Pre | A |
| [§26361627620](https://github.com/github/gh-aw/actions/runs/26361627620) | PR Description Updater | Copilot 1.0.51 | 12:46:00Z | Pre | A |
| [§26361673647](https://github.com/github/gh-aw/actions/runs/26361673647) | Changeset Generator | Codex 0.133.0 | 12:48:04Z | Pre | B |
| [§26361673669](https://github.com/github/gh-aw/actions/runs/26361673669) | Smoke Codex | Codex 0.133.0 | 12:48:04Z | Pre | B |
| [§26361673672](https://github.com/github/gh-aw/actions/runs/26361673672) | Smoke Copilot | Copilot 1.0.51 | 12:48:04Z | Pre | A |
| [§26361959405](https://github.com/github/gh-aw/actions/runs/26361959405) | PR Code Quality Reviewer | Copilot 1.0.51 | 13:00:50Z | Post (PR branch unmerged) | A |

</details>
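The Pre/Post column follows from comparing each run's creation time with the merge time of PR #34390 (12:59:44 UTC). A sketch that recomputes the split from the timestamps in the table, assuming all runs were created on 2026-05-24:

```python
from datetime import datetime, timezone

# Merge time of PR #34390, as stated in this follow-up.
MERGED_AT = datetime(2026, 5, 24, 12, 59, 44, tzinfo=timezone.utc)


def classify(created_hms: str) -> str:
    """Classify a run created on 2026-05-24 at HH:MM:SS UTC as 'Pre' or 'Post' merge."""
    created = datetime.strptime(
        f"2026-05-24T{created_hms}", "%Y-%m-%dT%H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    return "Pre" if created < MERGED_AT else "Post"


# Creation times copied from the table, in row order.
created_times = [
    "12:37:18", "12:39:08", "12:39:49", "12:40:37", "12:41:02", "12:41:43",
    "12:45:51", "12:46:00", "12:48:04", "12:48:04", "12:48:04", "13:00:50",
]
labels = [classify(t) for t in created_times]
print(labels.count("Pre"), labels.count("Post"))
```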

#### Cluster A — closing thoughts

The fix is live on `main`. Outstanding PR branches that pin Copilot 1.0.51 in their workflow artifacts will continue to fail until they rebase. This is a transient effect and **does not require additional remediation** beyond normal PR maintenance. **Recommended:** confirm post-merge Copilot runs on `main` succeed in the next 6h cycle, then close this issue.
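To spot branches that still need a rebase, the pinned version can be compared against the fixed one. The sketch below assumes plain `MAJOR.MINOR.PATCH` version strings; `needs_rebase` is a hypothetical helper, and how the pinned version is read from a branch's workflow artifacts is left out.

```python
def parse_version(version: str) -> tuple:
    """Parse a dotted numeric version such as '1.0.52' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# First Copilot CLI version without the regression, per PR #34390.
FIXED = parse_version("1.0.52")


def needs_rebase(pinned: str) -> bool:
    """Return True if the pinned Copilot CLI version predates the fix."""
    return parse_version(pinned) < FIXED


print(needs_rebase("1.0.51"), needs_rebase("1.0.52"))
```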

#### Cluster B — still active and not addressed by the version bump

The two Codex failures in this window already ran on Codex **0.133.0** (the bumped version), yet still exhibit the documented pattern: attempt 1 produces an `invalid_request_error` and attempts 2–4 fail with `Missing environment variable: 'OPENAI_API_KEY'`. The version bump did **not** fix the harness retry env-propagation bug. That remains a separate, actionable harness-side issue that should be investigated independently.

<details>
<summary>Codex stdio excerpt (run 26361673647)</summary>

```
attempt 1: Turn error: { "type": "invalid_request_error", ... } (model=gpt-5.4-mini)
attempt 2: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 3: Turn error: Missing environment variable: `OPENAI_API_KEY`.
attempt 4: Turn error: Missing environment variable: `OPENAI_API_KEY`.
[codex-harness] all 3 retries exhausted — giving up (exitCode=1)
```

</details>

#### No new clusters

No new failure modes appeared in this window. Clusters C (Claude Code action timeout) and D (Claude Code `max_turns`) had no recurrence.

*Update generated by [aw] Failure Investigator (6h) run [§26362131625](https://github.com/github/gh-aw/actions/runs/26362131625).*

> Generated by [🔍 [aw] Failure Investigator (6h)](https://github.com/github/gh-aw/actions/runs/26362131625) · ● opu47 9.3M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Faw-failure-investigator%22&type=issues)



---

### Status: resolved — closing

The dominant **P0 Cluster A** tracked here (Copilot CLI 1.0.51 `anthropic-beta: context-1m-2025-08-07` regression that broke 13 workflows) has been resolved by PR #34390, which bumped `DefaultCopilotVersion` from `1.0.51` to `1.0.52`. As of 2026-05-25T01:30 UTC:

- `pkg/constants/version_constants.go` has `const DefaultCopilotVersion Version = "1.0.52"` (fix landed).
- **Zero Copilot-engine failures observed** in the last 6h sample (43 runs total, 3 failures — all on Codex/Claude, not Copilot).

### Residual concerns from this report — already tracked or self-healed

| Cluster | Original symptom | Current state |
|---|---|---|
| A — Copilot 1.0.51 anthropic-beta | 13 failures | **Fixed** by #34390 → 1.0.52. No recurrences. |
| B — Codex 0.130.0 `Missing OPENAI_API_KEY` after fallback retry | 1 failure | Codex regression now tracked separately in #34522 (different root cause: PR #34390 also bumped Codex to 0.133.0, which introduced the `stream_options.include_usage` bug). |
| C — Claude 30-min action timeout (multi-device-docs-tester) | 1 failure | Not observed in last 6h. Standalone follow-up if it recurs. |
| D — Claude `max_turns=30` exhausted | 1 failure | Avenger ran into a similar `max_turns=25` situation 2026-05-24 18:46–20:39 ([§26372192847](https://github.com/github/gh-aw/actions/runs/26372192847) and 2 others), then self-recovered. Not a persistent issue — no follow-up filed. |

### Why close now

- The P0 cluster this issue exists to track is resolved end-to-end.
- The 12 per-workflow auto-failure issues linked from this report should be closed independently once their workflows are no longer failing.
- The follow-on Codex regression introduced by the same fix PR is tracked in #34522 — keeping this issue open would conflate two distinct (and now-disjoint) root causes.

Closing as resolved. Re-open if the Copilot regression resurfaces.

---

<sub>Closed 2026-05-25 by failure-investigator after verifying `DefaultCopilotVersion = 1.0.52` and zero Copilot failures in 6h lookback.</sub>

> Generated by [🔍 [aw] Failure Investigator (6h)](https://github.com/github/gh-aw/actions/runs/26378530589) · opus47 19.7M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Faw-failure-investigator%22&type=issues)
