Describe the bug
A long-running session became permanently wedged with three identical CAPIError: 400 failures ("messages.5.content.2: Invalid `signature` in `thinking` block"). Each failing turn started within ~1 second of a background research sub-agent completing. After the first failure, every subsequent attempt to continue produced the same error against the same messages.5.content.2 slot, because the bad thinking block sits permanently in the conversation history and is re-sent on every retry. The session had to be abandoned: there is no in-product way to repair, prune, or rewind past the corrupted thinking block, and no way for the user to see which message is the corrupted one.
The trigger pattern is consistent across the events log: every error follows (by roughly 4.5 to 6.5 s) a subagent.completed for a research sub-agent and the corresponding system.notification that splices the completion into the parent loop. The parent then issues a brand-new assistant.turn_start → assistant.turn_end (very fast, ~4 to 5.5 s, suggesting the API call returned an error rather than streaming a real response) → session.error.
The parent agent was running claude-opus-4.7-1m-internal; the sub-agents were research agents (the most recent one ran on claude-opus-4.6-1m). One plausible root cause is that thinking-block signatures issued for one Anthropic model are being kept in the parent's history when a sub-agent on a different model is integrated. As I understand it, Anthropic's API ties thinking-block signatures to the specific (model, turn) that produced them and rejects the request if they are presented out of context.
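To make the suspected root cause concrete, here is a minimal sketch of a provenance filter that could run just before a request is serialized. It is an illustration only: the `producer_model` field and the `strip_foreign_thinking` helper are hypothetical names, not anything taken from the CLI, and the sketch assumes history entries follow the Anthropic Messages shape (`role` plus a list of typed content blocks).

```python
from typing import Any

# Block types that carry model-issued reasoning in the Messages format.
THINKING_TYPES = {"thinking", "redacted_thinking"}


def strip_foreign_thinking(
    history: list[dict[str, Any]], target_model: str
) -> list[dict[str, Any]]:
    """Return request-ready messages without thinking blocks from other models.

    Assumption: the client records which model produced each assistant
    message in a local-only "producer_model" field (hypothetical).
    """
    cleaned = []
    for message in history:
        producer = message.get("producer_model")
        content = message["content"]
        is_foreign = producer is not None and producer != target_model
        if is_foreign and isinstance(content, list):
            # Keep text / tool blocks, drop only the signed reasoning blocks.
            content = [b for b in content if b.get("type") not in THINKING_TYPES]
        # Copy only role and content so local metadata never reaches the API.
        cleaned.append({"role": message["role"], "content": content})
    return cleaned
```

The sketch does not handle a message whose content becomes empty after filtering, and it says nothing about how the real CLI merges sub-agent output; both would need checking against the actual integration step.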
What makes this worse than a normal transient API error:
- No automatic recovery. The CLI surfaces the error verbatim and stops the turn, but does not strip / regenerate / quarantine the offending thinking block, so the next user prompt re-sends the same poisoned history and gets the same 400.
- No user-visible repair affordance. There is no /rewind to a known-good turn, no "drop the corrupted thinking block" option, no diagnostic telling the user "your conversation history has been wedged by a stale thinking-block signature; start a new session".
- The natural fallback (closing and /resume-ing the session, or sending continue again) does not help as long as the bad block stays in the cached history.
Affected version
GitHub Copilot CLI 1.0.49
Stack trace references the same code path in 1.0.48 (app.js).
file:///home/jaytau/.copilot/pkg/universal/1.0.48/app.js:1254:1046 t.fromAPIError
file:///home/jaytau/.copilot/pkg/universal/1.0.48/app.js:3439:15527 vmt.getCompletionWithTools
file:///home/jaytau/.copilot/pkg/universal/1.0.48/app.js:3472:2751 O3e.getCompletionWithTools
file:///home/jaytau/.copilot/pkg/universal/1.0.48/app.js:4483:4797 t.runAgenticLoop
file:///home/jaytau/.copilot/pkg/universal/1.0.48/app.js:4481:12744 t.processQueuedItems
file:///home/jaytau/.copilot/pkg/universal/1.0.48/app.js:4481:3688 t.processQueue
file:///home/jaytau/.copilot/pkg/universal/1.0.48/app.js:4479:4392 t.send
(The session was started under 1.0.48 and the binary auto-updated to 1.0.49 between the failure and the time of this report; the failing code path is the same.)
Steps to reproduce the behavior
I can't synthetically reproduce this on demand yet, but the repro signature from the events log is:
- Open a long session on claude-opus-4.7-1m-internal.
- Launch one or more background research sub-agents that run on a different Claude model (in my case claude-opus-4.6-1m), e.g. via the task tool with mode: "background" and agent_type: "research".
- Continue working in the parent session while the sub-agents are running (turns that produce thinking blocks).
- Let one or more of the sub-agents complete between parent turns, so the completion notification is delivered while the parent is preparing its next turn.
- The next parent turn issues an API request whose serialized history has a thinking block at some messages.N.content.M with a signature value the API no longer accepts → 400 invalid_request_error: Invalid signature in thinking block.
- Every subsequent continue (or any user prompt) reproduces the identical error against the same messages.N.content.M until you abandon the session.
In my case three different research sub-agents (tier1-r5-docs, tier1-r5-code, tier1-r5-tax-law) each triggered the same failure when they reported back into the parent loop:
2026-05-18T21:21:00.502Z subagent.completed tier1-r5-docs
2026-05-18T21:21:00.521Z system.notification Agent "tier1-r5-docs" (research) has completed successfully…
2026-05-18T21:21:00.859Z assistant.turn_start
2026-05-18T21:21:05.107Z assistant.turn_end
2026-05-18T21:21:05.163Z session.error CAPIError: 400 … messages.5.content.2: Invalid `signature` in `thinking` block (request_id: req_011CbAjZdRJmTT961kRWg8Ws)
2026-05-18T21:21:24.222Z subagent.completed tier1-r5-code
2026-05-18T21:21:24.237Z system.notification Agent "tier1-r5-code" (research) has completed successfully…
2026-05-18T21:21:24.593Z assistant.turn_start
2026-05-18T21:21:28.665Z assistant.turn_end
2026-05-18T21:21:28.723Z session.error CAPIError: 400 … messages.5.content.2: Invalid `signature` in `thinking` block (request_id: req_011CbAjbNMziS4DdrkguBbBG)
2026-05-18T22:10:03.836Z subagent.completed tier1-r5-tax-law
2026-05-18T22:10:03.856Z system.notification Agent "tier1-r5-tax-law" (research) has completed successfully…
2026-05-18T22:10:04.623Z assistant.turn_start
2026-05-18T22:10:10.120Z assistant.turn_end
2026-05-18T22:10:10.178Z session.error CAPIError: 400 … messages.5.content.2: Invalid `signature` in `thinking` block (request_id: req_vrtx_011CbAoJhWv6GbNY7j8vdC93)
All three errors point at the same messages.5.content.2 slot — i.e. a single corrupted thinking block at a fixed position in the cached history is wedging every retry.
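The timing and slot observations can be checked mechanically. The following is a rough sketch that pairs each session.error with the preceding subagent.completed and pulls out the messages.N.content.M slot. It assumes the events log is available as plain text in the `timestamp event details` layout shown above; the `summarize` helper is a name made up for this example.

```python
import re
from datetime import datetime

EVENT_RE = re.compile(r"^(\S+)\s+(\S+)\s*(.*)$")
SLOT_RE = re.compile(r"messages\.(\d+)\.content\.(\d+)")

# Excerpt of the events log quoted in this report (notification lines omitted).
SAMPLE_LOG = """
2026-05-18T21:21:00.502Z subagent.completed tier1-r5-docs
2026-05-18T21:21:00.859Z assistant.turn_start
2026-05-18T21:21:05.107Z assistant.turn_end
2026-05-18T21:21:05.163Z session.error CAPIError: 400 … messages.5.content.2: Invalid `signature` in `thinking` block
2026-05-18T21:21:24.222Z subagent.completed tier1-r5-code
2026-05-18T21:21:24.593Z assistant.turn_start
2026-05-18T21:21:28.665Z assistant.turn_end
2026-05-18T21:21:28.723Z session.error CAPIError: 400 … messages.5.content.2: Invalid `signature` in `thinking` block
2026-05-18T22:10:03.836Z subagent.completed tier1-r5-tax-law
2026-05-18T22:10:04.623Z assistant.turn_start
2026-05-18T22:10:10.120Z assistant.turn_end
2026-05-18T22:10:10.178Z session.error CAPIError: 400 … messages.5.content.2: Invalid `signature` in `thinking` block
"""


def parse_ts(ts: str) -> datetime:
    # Older Python versions reject a trailing "Z" in fromisoformat().
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def summarize(log_text: str) -> list[dict]:
    """Pair each session.error with the most recent subagent.completed."""
    rows = []
    last_completed = None  # (timestamp, agent name)
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if not match:
            continue
        ts, event, details = match.groups()
        if event == "subagent.completed":
            last_completed = (parse_ts(ts), details.strip())
        elif event == "session.error" and last_completed is not None:
            slot = SLOT_RE.search(details)
            delay = (parse_ts(ts) - last_completed[0]).total_seconds()
            rows.append({
                "agent": last_completed[1],
                "delay_s": round(delay, 3),
                "slot": (int(slot.group(1)), int(slot.group(2))) if slot else None,
            })
            last_completed = None
    return rows


for row in summarize(SAMPLE_LOG):
    print(row)
```

On the excerpt above, the completion-to-error delays come out as 4.661 s, 4.501 s, and 6.342 s, and every row reports slot (5, 2).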
The error messages as they appeared in the TUI were exactly:
✗ Execution failed: CAPIError: 400 {"type":"error","error":{"type":"invalid_request_error","message":"messages.5.content.2: Invalid `signature` in `thinking` block"},"request_id":"req_011CbAjZdRJmTT961kRWg8Ws"} (Request ID: 603E:D8F5A:C58A2B:D7672B:6A0B82BF)
● Background agent "Tier-1 round-5 code review" (research) completed
└ You are a Tier-1 reviewer for PR #132 of github.com/jay-tau/ibkr-fa, a Python…
✗ Execution failed: CAPIError: 400 {"type":"error","error":{"type":"invalid_request_error","message":"messages.5.content.2: Invalid `signature` in `thinking` block"},"request_id":"req_011CbAjbNMziS4DdrkguBbBG"} (Request ID: 603E:D8F5A:C605AF:D7EF6C:6A0B82D6)
● Background agent "Tier-1 round-5 tax-law review" (research) completed
└ You are a Tier-1 reviewer for PR #132 of github.com/jay-tau/ibkr-fa, a Python…
✗ Execution failed: CAPIError: 400 {"type":"error","error":{"type":"invalid_request_error","message":"messages.5.content.2: Invalid `signature` in `thinking` block"},"request_id":"req_vrtx_011CbAoJhWv6GbNY7j8vdC93"} (Request ID: 08FF:29E602:F6CB52:10E625B:6A0B8E3F)
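Because each TUI line embeds the provider's JSON error body, the failing slot and request ID can be extracted without guesswork. Below is a small sketch of that parsing; `parse_capi_error` is a hypothetical helper written for this report, and it assumes the line layout seen above (free text, then one JSON object, then a trailing "(Request ID: …)").

```python
import json
import re

SLOT_RE = re.compile(r"messages\.(\d+)\.content\.(\d+)")

# Two of the three TUI lines from this report, copied verbatim.
TUI_LINES = [
    '✗ Execution failed: CAPIError: 400 {"type":"error","error":{"type":"invalid_request_error","message":"messages.5.content.2: Invalid `signature` in `thinking` block"},"request_id":"req_011CbAjZdRJmTT961kRWg8Ws"} (Request ID: 603E:D8F5A:C58A2B:D7672B:6A0B82BF)',
    '✗ Execution failed: CAPIError: 400 {"type":"error","error":{"type":"invalid_request_error","message":"messages.5.content.2: Invalid `signature` in `thinking` block"},"request_id":"req_vrtx_011CbAoJhWv6GbNY7j8vdC93"} (Request ID: 08FF:29E602:F6CB52:10E625B:6A0B8E3F)',
]


def parse_capi_error(line: str):
    """Return the structured fields of one error line, or None if unparseable."""
    start = line.find("{")
    if start == -1:
        return None
    try:
        # raw_decode stops after the first JSON value, so the trailing
        # "(Request ID: ...)" text does not need to be trimmed first.
        payload, _ = json.JSONDecoder().raw_decode(line[start:])
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    error = payload.get("error") or {}
    message = error.get("message", "")
    slot = SLOT_RE.search(message)
    return {
        "error_type": error.get("type"),
        "message": message,
        "request_id": payload.get("request_id"),
        "message_index": int(slot.group(1)) if slot else None,
        "content_index": int(slot.group(2)) if slot else None,
    }


for line in TUI_LINES:
    print(parse_capi_error(line))
```

Whether the index in messages.N matches the CLI's own history index (for example, if a system prompt or injected notification shifts positions) is not something the error text alone confirms.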
Expected behavior
The CLI should not let a single corrupted thinking-block signature permanently brick a session. At least one, and ideally several, of the following:
- Detect and recover from invalid_request_error: Invalid signature in thinking block automatically by stripping or redacting the offending thinking block from the cached history and retrying. (As far as I can tell from Anthropic's documentation, thinking blocks from earlier turns may be omitted on a follow-up request; the API also defines a redacted_thinking block type, though I have not verified that a client can substitute one for a block whose signature can't be regenerated.)
- Guarantee thinking-block signatures stay paired with the model that produced them. When a sub-agent runs under a different Claude model than the parent (e.g. an opus-4.6 sub-agent under an opus-4.7 parent), the integration step should not leave any of the sub-agent's signed thinking blocks in the parent's serialized history. (And vice versa: the parent's signed thinking blocks must not leak into the sub-agent's request.)
- Surface a user-actionable error, not a verbatim CAPI dump. Something like: "This session's conversation history was rejected by the model (corrupted reasoning signature at message 5). I've removed the corrupted block; please try again." Or, if recovery isn't possible, "…please run /rewind to roll back to turn N, or /new to start fresh."
- Expose /rewind (or similar) as a recovery option in the error message itself, since today the only options the user can guess at (continue, closing the terminal and /resume-ing, sending the same prompt again) all fail identically because they all re-send the same poisoned history.
- Telemetry: a session-level counter / health indicator that flips when 400 invalid_request_error is encountered, so the TUI can render a "session corrupted, start fresh" affordance instead of looking indistinguishable from a working session.
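The first item in the list above (detect, strip, retry) could look roughly like the sketch below. Everything here is an assumption made for illustration: `ApiError` stands in for the CLI's CAPIError, `send` is any callable that submits a message list, and the sketch assumes the indices in the error text map directly onto the client's history list.

```python
import copy
import re

SIGNATURE_ERROR = re.compile(
    r"messages\.(\d+)\.content\.(\d+): Invalid `signature` in `thinking` block"
)
THINKING_TYPES = ("thinking", "redacted_thinking")


class ApiError(Exception):
    """Hypothetical stand-in for the CLI's CAPIError."""

    def __init__(self, status: int, message: str):
        super().__init__(message)
        self.status = status
        self.message = message


def send_with_signature_recovery(send, messages, max_repairs=3):
    """Call send(messages); on a signature 400, drop the named block and retry.

    Returns (response, repaired_messages). The caller's list is not mutated.
    """
    messages = copy.deepcopy(messages)
    repairs = 0
    while True:
        try:
            return send(messages), messages
        except ApiError as err:
            match = SIGNATURE_ERROR.search(err.message) if err.status == 400 else None
            if match is None or repairs >= max_repairs:
                raise
            m_idx, c_idx = int(match.group(1)), int(match.group(2))
            if m_idx >= len(messages):
                raise
            content = messages[m_idx].get("content")
            if not isinstance(content, list) or c_idx >= len(content):
                raise
            if content[c_idx].get("type") not in THINKING_TYPES:
                # Never delete a block the error did not actually blame.
                raise
            del content[c_idx]
            repairs += 1
```

A bounded `max_repairs` keeps a persistently rejected history from looping forever. The API may also have its own rules about when a thinking block can be left out (for instance around tool-use turns), so a dropped block might still be rejected for a different reason; the sketch re-raises in that case instead of masking it.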
Additional context
- OS: Linux
- Parent model: claude-opus-4.7-1m-internal
- Sub-agent model: claude-opus-4.6-1m (running as a research sub-agent)
- Workspace: /home/jaytau/temp/ibkr-fa (commit 8923d5fe)
- Session ID: 96bf5b50-79d8-4de9-b01a-8b57e95b3eaf
- Provider request IDs (one per failure, in order):
  - req_011CbAjZdRJmTT961kRWg8Ws → CAPI Request ID 603E:D8F5A:C58A2B:D7672B:6A0B82BF
  - req_011CbAjbNMziS4DdrkguBbBG → CAPI Request ID 603E:D8F5A:C605AF:D7EF6C:6A0B82D6
  - req_vrtx_011CbAoJhWv6GbNY7j8vdC93 → CAPI Request ID 08FF:29E602:F6CB52:10E625B:6A0B8E3F
- The session was using background research sub-agents extensively (a 5-round multi-tier code/docs/tax-law review workflow), so triggering this required nothing exotic, just sustained parallel task(mode: "background", agent_type: "research") usage on a tab- or branch-of-thought-heavy parent agent.
Workaround
The only workaround I found was to:
- End the wedged copilot process.
- /resume the session. The corrupted thinking block from the assistant's last turn would presumably still be in the history, but the very next user continue after resume happened to succeed in my case, possibly because the resume path re-serializes the history slightly differently, or possibly because the sub-agent results that were causing the conflict had by then been fully consumed and dropped from the live cache.
No in-CLI recovery (continue, retrying the same prompt, sending a new prompt, switching models with /model) helped before the resume.
Related (not duplicates)