
fix(ai-chat): stop provider tool-call replays from regressing tool part state (#1404) #1412

Merged
threepointone merged 1 commit into main from fix/tool-replay-regression-1404 on Apr 29, 2026

@threepointone threepointone commented Apr 29, 2026

Summary

Closes #1404. The OpenAI Responses API (and likely other providers) re-emits prior tool calls in continuation streams as a tool-input-start → tool-input-delta → tool-input-available → tool-output-available sequence carrying the same toolCallId and the same output the part already holds. AI SDK v6's updateToolPart mutates an existing tool part in place when the toolCallId matches, so a replayed tool-input-start was clobbering an output-available part back to input-streaming on the client and producing the worker warn:

(warn) [AIChatAgent] _applyToolResult: Tool part with toolCallId call_xxx
not in expected state (expected: input-available|approval-requested|approval-responded)

What changed

packages/agents/src/chat/message-builder.ts

  • applyChunkToParts is now idempotent against an existing tool part with the same toolCallId for tool-input-start, tool-input-delta, tool-input-available, and tool-input-error. A replayed tool-input-start no longer pushes a duplicate part or regresses state; deltas/available chunks only mutate while still input-streaming; tool-input-error preserves an existing terminal state (first-write-wins).
  • New exported helper isReplayChunk(parts, chunk) returns true when a tool-input-* chunk would visibly regress an AI SDK v6 client's tool part. Stream broadcasters use it to drop replay chunks before forwarding them. tool-output-available is intentionally not in the helper because its in-place update is safe when the data already matches.
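
The idempotency rules above can be sketched roughly as follows. This is a minimal illustration, not the code in message-builder.ts: the `ToolPart` and `ToolInputChunk` shapes, the `inputText` field, and the name `applyToolInputChunkSketch` are hypothetical, and mapping `tool-input-error` to `output-error` is an assumption about how the error lands on the part.

```typescript
// Hypothetical, simplified shapes; the real AI SDK v6 types carry more fields.
type ToolState =
  | "input-streaming"
  | "input-available"
  | "approval-requested"
  | "approval-responded"
  | "output-available"
  | "output-error"
  | "output-denied";

interface ToolPart {
  toolCallId: string;
  state: ToolState;
  inputText: string;
  input?: unknown;
  errorText?: string;
}

type ToolInputChunk =
  | { type: "tool-input-start"; toolCallId: string }
  | { type: "tool-input-delta"; toolCallId: string; inputTextDelta: string }
  | { type: "tool-input-available"; toolCallId: string; input: unknown }
  | { type: "tool-input-error"; toolCallId: string; errorText: string };

const TERMINAL: ReadonlySet<ToolState> = new Set<ToolState>([
  "output-available",
  "output-error",
  "output-denied",
]);

function applyToolInputChunkSketch(parts: ToolPart[], chunk: ToolInputChunk): void {
  const existing = parts.find((p) => p.toolCallId === chunk.toolCallId);
  switch (chunk.type) {
    case "tool-input-start":
      // A replayed start must not push a duplicate part or reset state.
      if (existing) return;
      parts.push({ toolCallId: chunk.toolCallId, state: "input-streaming", inputText: "" });
      return;
    case "tool-input-delta":
      // Deltas only mutate while the part is still streaming its input.
      if (existing?.state === "input-streaming") existing.inputText += chunk.inputTextDelta;
      return;
    case "tool-input-available":
      if (!existing) {
        parts.push({ toolCallId: chunk.toolCallId, state: "input-available", inputText: "", input: chunk.input });
      } else if (existing.state === "input-streaming") {
        existing.state = "input-available";
        existing.input = chunk.input;
      }
      return;
    case "tool-input-error":
      // First-write-wins: an existing terminal state is preserved (assumed mapping to output-error).
      if (existing && !TERMINAL.has(existing.state)) {
        existing.state = "output-error";
        existing.errorText = chunk.errorText;
      }
      return;
  }
}
```

Feeding the full replay sequence for a `toolCallId` whose part is already `output-available` leaves that part untouched in this sketch.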

packages/ai-chat/src/index.ts

  • _streamSSEReply calls isReplayChunk on each chunk and skips applying / storing / broadcasting if it's a replay. Combined with the applyChunkToParts idempotency, the cloned server-side streaming message stays clean and the regression-inducing chunks never reach the client.
  • _applyToolResult accepts output-available, output-error, and output-denied as starting states for idempotent re-application. A duplicate cf_agent_tool_result (cross-tab re-run, redelivered WS frame, provider replay round-trip) is now a silent no-op rather than a warn + skipped update. The cross-message tool-output-available / tool-output-error fallback gets the same first-write-wins semantics.
  • _findAndUpdateToolPart separately tracks wasFound (a matching part was processed) and hasRealChange (the apply actually mutated state). Idempotent re-applies skip the SQLite write and MESSAGE_UPDATED broadcast. The split also handles legacy duplicate tool parts correctly: a real change to one duplicate is still persisted even when another duplicate is already terminal.
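
A rough sketch of the `wasFound` / `hasRealChange` split described above. The part shape and the names `applyToolOutputSketch` and `shouldPersistSketch` are hypothetical, and the real `_findAndUpdateToolPart` works against stored messages rather than a flat array; only the two-flag idea is taken from this PR.

```typescript
// Hypothetical, simplified part shape for illustration only.
interface ToolPart {
  toolCallId: string;
  state: string;
  output?: unknown;
}

interface UpdateResult {
  wasFound: boolean; // a part with this toolCallId was processed
  hasRealChange: boolean; // at least one part was actually mutated
}

const TERMINAL_STATES = new Set(["output-available", "output-error", "output-denied"]);

function applyToolOutputSketch(parts: ToolPart[], toolCallId: string, output: unknown): UpdateResult {
  let wasFound = false;
  let hasRealChange = false;
  for (const part of parts) {
    if (part.toolCallId !== toolCallId) continue;
    wasFound = true;
    // First-write-wins: a terminal part is left alone, silently.
    if (TERMINAL_STATES.has(part.state)) continue;
    part.state = "output-available";
    part.output = output;
    hasRealChange = true;
  }
  return { wasFound, hasRealChange };
}

// A caller would persist and broadcast only when something really changed.
function shouldPersistSketch(result: UpdateResult): boolean {
  return result.wasFound && result.hasRealChange;
}
```

Because the flags are tracked across every matching part, a legacy duplicate that is already terminal does not hide a real change made to its sibling.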

Why two layers

The visible regression on the client comes from the AI SDK's own updateToolPart, which we can't fix from this repo, so the server has to stop forwarding the regression-inducing chunks (layer 1: the isReplayChunk filter in _streamSSEReply). The worker warn comes from _applyToolResult not accepting an already-terminal state, which we fix by making the apply itself idempotent (layer 2). The two fixes are independent: either alone reduces the symptoms, and both together eliminate the bug.
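
Layer 1 can be pictured as a filter over the continuation stream. The sketch below is an assumption-laden illustration: `isReplaySketch` is a hypothetical stand-in for `isReplayChunk` (the real predicate may differ), `filterReplaysSketch` is an invented name, and `parts` is held fixed for brevity where the real loop would also apply each kept chunk.

```typescript
// Hypothetical, simplified shapes for illustration only.
type Chunk = { type: string; toolCallId: string };
type Part = { toolCallId: string; state: string };

// Assumed rule: a tool-input-* chunk is a replay when a part with the same
// toolCallId has already advanced past input-streaming.
const isReplaySketch = (parts: Part[], chunk: Chunk): boolean =>
  chunk.type.startsWith("tool-input-") &&
  parts.some((p) => p.toolCallId === chunk.toolCallId && p.state !== "input-streaming");

function filterReplaysSketch(parts: Part[], chunks: Chunk[]): Chunk[] {
  const forwarded: Chunk[] = [];
  for (const chunk of chunks) {
    // Layer 1: a replay is neither applied, stored, nor broadcast.
    if (isReplaySketch(parts, chunk)) continue;
    forwarded.push(chunk);
  }
  return forwarded;
}

// Usage: call_1 already finished before the continuation; call_2 is new.
const before: Part[] = [{ toolCallId: "call_1", state: "output-available" }];
const continuation: Chunk[] = [
  { type: "tool-input-start", toolCallId: "call_1" },
  { type: "tool-input-delta", toolCallId: "call_1" },
  { type: "tool-input-available", toolCallId: "call_1" },
  { type: "tool-output-available", toolCallId: "call_1" },
  { type: "tool-input-start", toolCallId: "call_2" },
];
const kept = filterReplaysSketch(before, continuation);
```

In this sketch the three replayed `tool-input-*` chunks for `call_1` are dropped, while `tool-output-available` and the new call's chunk pass through to be handled by the idempotent apply (layer 2).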

The pre-existing "first-write-wins" contract for terminal tool states (locked in by client-tool-duplicate-message.test.ts) is preserved — terminal-state re-applies always become silent no-ops, never overwrites.

Test plan

  • packages/ai-chat/src/tests/message-builder.test.ts — new describe blocks for tool-input-* idempotency against existing tool parts (8 tests) and isReplayChunk (8 tests). Covers single-chunk regressions, the full provider replay sequence, approval-state preservation, and tool-input-error against terminal parts.
  • packages/ai-chat/src/tests/tool-result-replay.test.ts — new file (5 tests). Idempotent re-apply for output-available / output-error (no warn, no broadcast); regression guard that real transitions still broadcast; the full continuation-stream replay scenario verifying chunks are not forwarded; legacy duplicate-tool-part edge case.
  • packages/ai-chat/src/tests/worker.ts — adds a replayPriorToolCall body knob to TestChatAgent.onChatMessage simulating the OpenAI replay chunk pattern, and a testApplyToolResult helper.
  • npm run check — sherif + export checks + oxfmt + oxlint + typecheck all clean.
  • packages/ai-chat — 476/476 tests pass.
  • packages/agents — 1372/1372 tests pass (8 skipped, unchanged).
  • packages/think — 270/270 tests pass.
  • Two changesets included (agents patch + @cloudflare/ai-chat patch).

Made with Cursor


Some providers (notably the OpenAI Responses API) re-emit prior tool
calls in continuation streams as a `tool-input-start` →
`tool-input-delta` → `tool-input-available` → `tool-output-available`
sequence carrying the same `toolCallId` and the same `output` the part
already holds. AI SDK v6's `updateToolPart` mutates an existing tool
part in place when the toolCallId matches, so a replayed
`tool-input-start` was clobbering an `output-available` part back to
`input-streaming` on the client and producing the worker warn
`_applyToolResult: Tool part with toolCallId X not in expected state`.

`packages/agents/src/chat/message-builder.ts`:

- `applyChunkToParts` is now idempotent against an existing tool part
  with the same `toolCallId` for `tool-input-start`, `tool-input-delta`,
  `tool-input-available`, and `tool-input-error`. A replayed
  `tool-input-start` no longer pushes a duplicate part or regresses
  state; deltas/available chunks only mutate while still
  `input-streaming`; `tool-input-error` preserves an existing terminal
  state (first-write-wins).
- New exported helper `isReplayChunk(parts, chunk)` returns true when a
  `tool-input-*` chunk would visibly regress an AI SDK v6 client's tool
  part. Stream broadcasters use it to drop replay chunks before
  forwarding them.

`packages/ai-chat/src/index.ts`:

- `_streamSSEReply` calls `isReplayChunk` on each chunk and skips
  applying / storing / broadcasting if it's a replay. Combined with the
  `applyChunkToParts` idempotency above, the cloned server-side
  streaming message stays clean and the regression-inducing chunks
  never reach the client.
- `_applyToolResult` accepts `output-available`, `output-error`, and
  `output-denied` as starting states for *idempotent* re-application.
  A duplicate `cf_agent_tool_result` (cross-tab re-run, redelivered WS
  frame, provider replay round-trip) is now a silent no-op rather than
  a warn + skipped update. The cross-message `tool-output-available` /
  `tool-output-error` fallback gets the same first-write-wins
  semantics.
- `_findAndUpdateToolPart` separately tracks `wasFound` (a matching
  part was processed) and `hasRealChange` (the apply actually mutated
  state). Idempotent re-applies skip the SQLite write and
  `MESSAGE_UPDATED` broadcast. The split also handles legacy duplicate
  tool parts correctly: a real change to one duplicate is still
  persisted even when another duplicate is already terminal.

Tests:

- `packages/ai-chat/src/tests/message-builder.test.ts` — new describe
  blocks for `tool-input-* idempotency against existing tool parts`
  and `isReplayChunk`, covering single-chunk regressions, the full
  provider replay sequence, approval-state preservation, and
  `tool-input-error` against terminal parts.
- `packages/ai-chat/src/tests/tool-result-replay.test.ts` — new file
  with end-to-end tests: idempotent re-apply for output-available and
  output-error (no warn, no broadcast), regression guard that real
  transitions still broadcast, the full continuation-stream replay
  scenario verifying chunks are not forwarded, and the legacy
  duplicate-tool-part edge case.
- `packages/ai-chat/src/tests/worker.ts` — adds a `replayPriorToolCall`
  body knob to `TestChatAgent.onChatMessage` simulating the OpenAI
  replay chunk pattern, and a `testApplyToolResult` helper.

Refs: #1404
Made-with: Cursor

changeset-bot Bot commented Apr 29, 2026

🦋 Changeset detected

Latest commit: 3e6a10a

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
| Name | Type |
| --- | --- |
| agents | Patch |
| @cloudflare/ai-chat | Patch |

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

@threepointone threepointone merged commit 8fb7c03 into main Apr 29, 2026
1 check passed
@threepointone threepointone deleted the fix/tool-replay-regression-1404 branch April 29, 2026 01:25
@github-actions github-actions Bot mentioned this pull request Apr 29, 2026
Development

Successfully merging this pull request may close these issues.

Tool part state regresses during client tool round-trip; continuation starts but emits no output