fix(think,ai-chat): wait for all parallel client-tool results before auto-continuing (#1649) by threepointone · Pull Request #1651 · cloudflare/agents

threepointone · 2026-06-02T08:20:57Z

Problem

When the model emits multiple tool calls in a single step and the client resolves them independently (each addToolOutput sends cf_agent_tool_result with autoContinue: true), a fast tool's result could trigger the next inference while a slower sibling was still input-available.

The server's auto-continuation fired on a short coalesce timer (think 50ms, ai-chat 10ms), not on "is the model's tool-call batch actually complete." So the first result's autoContinue would continue the turn with an incomplete tool-result set:

On a build without transcript repair → AI_MissingToolResultsError (the crash the reporter saw).
On current main (Think repairs + convertToModelMessages(..., { ignoreIncompleteToolCalls: true })) → arguably worse: the still-in-flight sibling is silently flipped to errored and a spurious extra continuation runs, then the real result lands and churns the transcript.

Repro shape (from the issue):

onToolCall: async ({ toolCall, addToolOutput }) => {
  if (toolCall.toolName === "fastTool") {
    addToolOutput({ toolCallId: toolCall.toolCallId, output: { ok: true } }); // sync
  }
  if (toolCall.toolName === "slowTool") {
    const result = await agent.call("someCallable", [toolCall.input]); // >50ms
    addToolOutput({ toolCallId: toolCall.toolCallId, output: result });
  }
}

Fix

Add a server-side barrier: auto-continuation now holds until the model's parallel tool batch is fully answered, then runs a single continuation — regardless of result arrival order. This uses the transcript state the framework already tracks, so it covers tool results and approvals and needs no counting.

Scoped to the leaf (the key design choice)

The barrier fires only when the latest assistant message is mid-batch — it carries both a settled tool result and a still-pending (input-available/approval-requested) sibling (_hasIncompleteToolBatch()). That is precisely the #1649 signature (the model fanned out parallel calls and only some are answered).

It is intentionally not "any pending interaction anywhere in the transcript." A broad barrier over-blocked a legitimate, pre-existing path — a follow-up continuation when a result lands in a separate assistant message before a stream starts (chat-turn-serialization test). Leaf-scoping fixes #1649 without disturbing that.

Mechanics

think (packages/think/src/think.ts): _scheduleAutoContinuation's timer now calls _fireAutoContinuationWhenStable, which gates _fireAutoContinuation behind _awaitToolBatch. The wait runs before the turn is enqueued, so it never occupies the turn queue, and is wrapped in keepAliveWhile so the (typically sub-second) remainder of the batch can land. It awaits the in-flight tool-result apply (_pendingInteractionPromise) so a sibling that lands mid-wait is observed promptly.
ai-chat (packages/ai-chat/src/index.ts): _awaitPendingInteractionBarrier runs inside the queued continuation (after _awaitPendingAutoContinuationPrerequisite), because that path runs in-queue and so can't reuse the queue-idle waitUntilStable (it would deadlock). Same _hasIncompleteToolBatch predicate.
Bounded by AUTO_CONTINUATION_PENDING_TOOL_TIMEOUT_MS (60s): on timeout the continuation proceeds — Think's repair backstop errors the unanswered call, ai-chat surfaces the provider error — so a genuinely orphaned tool (client disconnected mid-batch) never pins the continuation open. A console.warn makes that fall-through visible.

Hardening found during review

Double-fire race: during a barrier wait, each sibling result re-arms the 50ms coalesce timer; when the barrier fired the continuation, that leftover timer could enqueue a second continuation in the window before pastCoalesce was set. _fireAutoContinuation now cancels any still-armed timer when it fires, and _fireAutoContinuationWhenStable bails if the continuation is already pastCoalesce. (ai-chat is structurally immune — a sibling during a pending continuation hits the merge branch, not a re-queue.)
Stale barrier on clear: resetTurnState now clears the barrier flag, and both wait loops bail when _continuation.pending is gone, so a chat clear / turn reset can't strand the wait holding the isolate.

Why server-side (not the client-side option from the issue)

The issue floated "client only sends autoContinue on the last result." That's fragile (tools resolve independently in onToolCall; the client isn't the transcript's source of truth) and wouldn't protect custom / non-useAgentChat clients. The server already owns the transcript and the completeness invariant, so the barrier belongs there.

Tests

New parallel fast/slow coverage in both packages: assert exactly one continuation fires after both results land, and the slow tool is never flipped to errored by a premature repair.
Updated the think deferred continuation test and the ai-chat continuation-stream-error test — both previously encoded the old partial-continuation behavior (continue after the first of two parallel tools).
Full suites green: think 483, ai-chat 492. Typecheck, oxlint, oxfmt clean.

Follow-up

#1650 tracks making the Think barrier event-driven (return-and-wait-for-the-next-result-event instead of a bounded timer) to avoid the 60s fire-through for human-in-the-loop tools (e.g. ask_user/display_ui emitted in parallel) and the orphan keepAlive hold. Kept symmetric (bounded-wait) with ai-chat here on purpose, since #1642 (N3) wants to unify the duplicated think↔ai-chat layer.

Test plan

CI green (npm run build + npm run check + affected tests)
Manual: parallel client tools where one resolves synchronously and one via a >50ms backend RPC → single clean continuation, no MissingToolResultsError, no spurious repair
Sanity: a single client tool still auto-continues immediately (no added latency)
Sanity: server-approval continuation (Bug: needsApproval tool approval-responded state never transitions to output-available (think@0.7.3 + agents@0.13.3) #1627) and chained approvals (ai-chat: tool continuations after client-side tools lose earlier steps' parts #1160) still work

Made with Cursor

…auto-continuing (#1649) When the model emitted multiple tool calls in one step and the client resolved them independently via `addToolOutput`, a fast result's `autoContinue` could trigger the next inference while a slower sibling was still `input-available`. That fed the provider an incomplete tool-result set (`MissingToolResultsError`) or — via the transcript-repair backstop — silently flipped the in-flight sibling to errored and ran a spurious extra continuation. Auto-continuation now holds on a barrier until the model's parallel tool batch is fully answered. The barrier is scoped to the *leaf* assistant message — it fires only when that step is mid-batch (carries both a settled tool result and a still-pending `input-available`/`approval-requested` sibling), so an unrelated dangling tool in an earlier message can't block a legitimate follow-up continuation. The wait awaits the in-flight tool-result apply (so a sibling landing mid-wait is observed promptly) and is bounded by `AUTO_CONTINUATION_PENDING_TOOL_TIMEOUT_MS` (60s) so a genuinely orphaned tool call (client disconnected mid-batch) still falls through to the existing backstop instead of pinning the continuation open. - think: gate `_fireAutoContinuation` behind `_fireAutoContinuationWhenStable` (runs before enqueue, so it never occupies the turn queue) + `_awaitToolBatch` + `_hasIncompleteToolBatch`. Hardened against a double-fire race where a sibling result re-arms the coalesce timer during the wait: `_fireAutoContinuation` cancels any still-armed timer, and `_fireAutoContinuationWhenStable` bails if the continuation is already `pastCoalesce`. `resetTurnState` clears the barrier flag so a chat clear can't strand it. - ai-chat: gate inside the queued continuation via `_awaitPendingInteractionBarrier` + `_hasIncompleteToolBatch` (it runs in-queue, so it can't reuse the queue-idle `waitUntilStable`). - Both: a `console.warn` on timeout fall-through so a continuation proceeding with an unanswered tool isn't mysterious. Tests: new parallel fast/slow coverage in both packages (exactly one continuation after both results land, no premature repair of the slow tool); updated the think "deferred continuation" and ai-chat stream-error tests, which encoded the old partial-continuation behavior. Follow-up tracked in #1650 (event-driven barrier to avoid the bounded fire-through for human-in-the-loop tools + orphan keepAlive). Co-authored-by: Cursor <cursoragent@cursor.com>

changeset-bot · 2026-06-02T08:21:02Z

🦋 Changeset detected

Latest commit: f7bb27c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

Name	Type
@cloudflare/think	Patch
@cloudflare/ai-chat	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

threepointone · 2026-06-02T08:22:46Z

/bonk do a deep review here, look for edge cases and things we might've missed

pkg-pr-new · 2026-06-02T08:26:40Z

Open in StackBlitz

agents

npm i https://pkg.pr.new/agents@1651

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1651

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1651

hono-agents

npm i https://pkg.pr.new/hono-agents@1651

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1651

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1651

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1651

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1651

commit: f7bb27c

ask-bonk

Deep Review — Parallel Tool-Call Barrier (#1649 fix)

Went through the full diff, the surrounding implementation (tool-result application, _pendingInteractionPromise lifecycle, continuation state machine, ContinuationState, keepAliveWhile, turn queue), and analyzed edge cases. Overall this is a well-designed fix — the leaf-scoped barrier predicate is the right call, the bounded wait with timeout fallback is solid, and the double-fire hardening in Think is necessary and correct. Tests are thorough.

A few things I want to flag:

1. `_pendingInteractionPromise` is a single slot — N≥3 tool calls with near-simultaneous arrivals

Think path (_awaitToolBatch): When the barrier is in its 100ms poll sleep and two tool results arrive ~simultaneously (say medium and slow for a 3-tool batch), the second overwrites _pendingInteractionPromise before the first resolves. The barrier loop only awaits whichever value is current when it wakes up.

This is actually fine in practice because:

_applyToolResult is microtask-deferred (Promise.resolve().then(...)) so both results' apply promises are created synchronously during the same event-loop turn
The second one overwrites the first, but the first still runs and patches the cache
After awaiting the second, _hasIncompleteToolBatch() sees both updates since both patched _cachedMessages
The only risk is if the first apply throws and the barrier never observes it — but the error is irrelevant since we continue to re-check _hasIncompleteToolBatch() anyway

However, this is worth noting: the barrier's correctness relies on _hasIncompleteToolBatch() being the true gate, not on awaiting every individual _pendingInteractionPromise. The single-slot pattern is a "wake up sooner" optimization, not a correctness invariant. If that's the intent, a brief comment on the single-slot nature would help future readers understand why this isn't a bug.

2. `_hasIncompleteToolBatch` — `[...this.messages].reverse()` allocation per poll tick

Both Think and ai-chat do:

const leaf = [...this.messages].reverse().find(m => m.role === "assistant");

This copies + reverses the entire message array on every 100ms poll tick and on every _pendingInteractionPromise resolution during the barrier wait. For long conversations this is wasteful. A backward for loop would be zero-allocation:

const msgs = this.messages;
let leaf: typeof msgs[number] | undefined;
for (let i = msgs.length - 1; i >= 0; i--) {
  if (msgs[i].role === "assistant") { leaf = msgs[i]; break; }
}

Not a correctness issue, but given this runs in a tight loop during a potentially latent barrier, it's worth optimizing — especially since this.messages in Think returns the raw _cachedMessages reference so the array can be large.

3. ai-chat: no `_continuationBarrierActive` guard — is double-entry possible?

Think has an explicit _continuationBarrierActive flag to prevent multiple concurrent barrier waits. ai-chat does not — its barrier runs inside the turn queue, so the turn queue serialization prevents concurrent entry.

But: the ai-chat barrier is called from _awaitPendingAutoContinuationPrerequisite, which is called inside _queueAutoContinuation's turn. If a second tool result arrives while the first continuation is waiting on the barrier (still in the turn queue), it hits the merge branch in _enqueueAutoContinuation (updating pending.prerequisite), not enqueueing a new turn. So the turn queue + merge semantics structurally prevent double-entry.

This is correct but subtle. The PR description mentions "ai-chat is structurally immune" — accurate, but a brief comment in _awaitPendingInteractionBarrier noting why a guard isn't needed (turn queue serialization) would make the asymmetry between Think and ai-chat more obvious.

4. Timeout + `keepAliveWhile` holds the isolate for up to 60 seconds on orphan

When a client disconnects mid-batch and a tool result never arrives, the barrier holds for AUTO_CONTINUATION_PENDING_TOOL_TIMEOUT_MS (60s) with keepAliveWhile active. This pins the DO alive on the alarm cycle for a full minute before falling through to the repair backstop.

The PR description acknowledges this and #1650 tracks making it event-driven. Just confirming: is 60s a reasonable worst-case hold? For human-in-the-loop tools (mentioned in the follow-up), 60s is quite short — but those tools presumably don't use autoContinue: true from the client, so they shouldn't trigger the barrier at all. Worth verifying that assumption holds.

5. Edge case: all tool calls in the batch fail/error

If the model emits 2 tool calls and the client sends both back as state: "output-error", _hasIncompleteToolBatch correctly treats output-error as settled (it's in the settled set). Good — no issue here.

6. Edge case: tool result arrives after a `resetTurnState` / chat clear

Think: resetTurnState sets _continuationBarrierActive = false and _continuation.pending = null. The _awaitToolBatch loop checks !this._continuation.pending and bails. The .finally() in _fireAutoContinuationWhenStable checks pending before firing. All correct.

ai-chat: resetTurnState clears _continuation.pending. The barrier checks !this._continuation.pending and returns. The continuation then checks for connection and bails at the if (!connection) guard. Correct.

7. The updated test in `client-tools-continuation.test.ts` — both results sent ~synchronously

The updated stream-error test now sends both tool results back-to-back before waiting for the continuation. This is the correct behavior under the new barrier, but it also means the test no longer exercises the "second result arrives during an active continuation" path. The old test had a specific ordering (first result → continuation starts → second result arrives during stream). The new ordering (both results → single continuation) is what the fix mandates, but the "result during active stream" path is still a valid scenario for separate assistant messages. Worth confirming that path has test coverage elsewhere (the chat-turn-serialization test mentioned in the PR description).

Summary

The fix is correct. The leaf-scoped _hasIncompleteToolBatch() predicate is the right boundary — it precisely targets the #1649 signature without over-blocking. The double-fire hardening in Think (clearing the timer in _fireAutoContinuation, the pastCoalesce + _continuationBarrierActive guards) is necessary and well-implemented. The ai-chat path is structurally simpler because it runs in-queue.

The two minor suggestions (backward loop instead of spread+reverse, comments on single-slot _pendingInteractionPromise and ai-chat's implicit guard) are polish, not blockers.

devin-ai-integration

Devin Review found 2 potential issues.

View 5 additional findings in Devin Review.

devin-ai-integration · 2026-06-02T08:29:30Z

+      } else if (
+        typeof record.type === "string" &&
+        record.type.startsWith("tool-") &&
+        (state === "output-available" ||
+          state === "output-error" ||
+          state === "output-denied" ||
+          state === "approval-responded")
+      ) {
+        hasSettled = true;


🟡 _hasIncompleteToolBatch omits dynamic-tool from settled check, breaking barrier for dynamic tool batches

In Think's _hasIncompleteToolBatch, the settled-state check only matches record.type.startsWith("tool-"), but "dynamic-tool".startsWith("tool-") is false. Every other tool-part identification site in Think consistently checks record.type.startsWith("tool-") || record.type === "dynamic-tool" — see _incompleteToolCallIds at packages/think/src/think.ts:1663, _repairInterruptedMessages at packages/think/src/think.ts:1724, and _partialHasSettledToolResults at packages/think/src/think.ts:8030. Meanwhile, the pending side of _hasIncompleteToolBatch has no type filter at all, so a dynamic-tool in input-available IS counted as pending. This asymmetry means if a parallel batch contains a settled dynamic-tool result alongside a still-pending tool-* call, hasSettled stays false and the method returns false — the barrier doesn't hold and the continuation fires prematurely, which is exactly the #1649 scenario this PR is supposed to fix.

Suggested change

} else if (

typeof record.type === "string" &&

record.type.startsWith("tool-") &&

(state === "output-available" ||

state === "output-error" ||

state === "output-denied" ||

state === "approval-responded")

) {

hasSettled = true;

} else if (

typeof record.type === "string" &&

(record.type.startsWith("tool-") || record.type === "dynamic-tool") &&

(state === "output-available" ||

state === "output-error" ||

state === "output-denied" ||

state === "approval-responded")

) {

hasSettled = true;

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-06-02T08:29:31Z

+      } else if (
+        typeof record.type === "string" &&
+        record.type.startsWith("tool-") &&
+        (state === "output-available" ||
+          state === "output-error" ||
+          state === "output-denied" ||
+          state === "approval-responded")
+      ) {
+        hasSettled = true;


🟡 _hasIncompleteToolBatch in AIChatAgent also omits dynamic-tool from settled check

Same issue as in Think. AIChatAgent's _hasIncompleteToolBatch checks record.type.startsWith("tool-") for settled parts but "dynamic-tool" doesn't match this prefix. AIChatAgent recognizes dynamic-tool as a tool type elsewhere (see packages/ai-chat/src/index.ts:3337: type.startsWith("tool-") || type === "dynamic-tool"). If a dynamic-tool settles and triggers autoContinue while a sibling tool-* call is still input-available, the barrier won't detect the batch as incomplete and the continuation fires prematurely.

Suggested change

} else if (

typeof record.type === "string" &&

record.type.startsWith("tool-") &&

(state === "output-available" ||

state === "output-error" ||

state === "output-denied" ||

state === "approval-responded")

) {

hasSettled = true;

} else if (

typeof record.type === "string" &&

(record.type.startsWith("tool-") || record.type === "dynamic-tool") &&

(state === "output-available" ||

state === "output-error" ||

state === "output-denied" ||

state === "approval-responded")

) {

hasSettled = true;

Was this helpful? React with 👍 or 👎 to provide feedback.

ask-bonk · 2026-06-02T08:29:52Z

+   */
+  private _hasIncompleteToolBatch(): boolean {
+    const leaf = [...this.messages]
+      .reverse()


Nit: [...this.messages].reverse().find(...) allocates a full copy + reverse on every 100ms poll tick during the barrier wait. For long conversations this is wasteful. A backward for-loop would be zero-allocation:

const msgs = this.messages; let leaf: UIMessage | undefined; for (let i = msgs.length - 1; i >= 0; i--) { if (msgs[i].role === "assistant") { leaf = msgs[i]; break; } }

Not blocking, but this is a hot path during the barrier poll.

ask-bonk · 2026-06-02T08:29:52Z

+   */
+  private _hasIncompleteToolBatch(): boolean {
+    const leaf = [...this.messages]
+      .reverse()


Same spread+reverse nit as in Think — a backward for-loop avoids the allocation.

ask-bonk · 2026-06-02T08:29:52Z

+   * continuation turn, so — unlike `waitUntilStable` — it must not wait on the
+   * turn queue (that would deadlock).
+   */
+  private async _awaitPendingInteractionBarrier(): Promise<void> {


Worth a brief comment that this barrier does NOT need a _continuationBarrierActive guard (unlike Think's) because it runs inside the turn queue — the queue serialization structurally prevents concurrent entry. The asymmetry is correct but easy to miss when reading the two implementations side by side.

ask-bonk · 2026-06-02T08:29:52Z

+      // The pending continuation was cleared (chat clear / turn reset) — nothing
+      // to wait for; bail so the isolate isn't held by a stale wait.
+      if (!this._continuation.pending) return;
+      const pending = this._pendingInteractionPromise;


_pendingInteractionPromise is a single slot — when N≥3 tool results arrive in quick succession, later arrivals overwrite earlier ones before the barrier observes them. This is correct because the true gate is _hasIncompleteToolBatch() (which reads the patched cache), and the single-slot await is a "wake up sooner" optimization, not a correctness invariant. A brief comment noting that would help future readers understand why the overwrite is not a bug.

ask-bonk · 2026-06-02T08:30:10Z

Done. I posted a comprehensive review comment and 4 inline comments on PR #1651. Here's a summary of what I found:

Overall: The fix is correct and well-designed. The leaf-scoped _hasIncompleteToolBatch() predicate precisely targets the #1649 race without over-blocking legitimate continuation paths. The double-fire hardening in Think is necessary and correctly implemented.

Key findings:

_pendingInteractionPromise single-slot with N≥3 tools — Later arrivals overwrite earlier ones, but this is fine because the true correctness gate is _hasIncompleteToolBatch() reading the patched message cache, not awaiting every promise. Suggested a clarifying comment.
[...this.messages].reverse().find(...) allocation — Copies + reverses the full message array on every 100ms poll tick. A backward for-loop would be zero-allocation. Minor but easy fix since this is a hot path.
Think vs ai-chat barrier guard asymmetry — Think needs _continuationBarrierActive because its barrier runs outside the turn queue; ai-chat doesn't because the turn queue serialization prevents concurrent entry. Correct but worth documenting the "why".
60s keepAliveWhile on orphaned tools — Acknowledged as a known trade-off; Auto-continuation parallel-tool barrier: make Think event-driven (avoid timeout fire-through for human-in-the-loop tools + orphan keepAlive) — follow-up to #1649 #1650 tracks the event-driven improvement. Confirmed human-in-the-loop tools shouldn't trigger this since they don't use autoContinue: true.
Edge cases verified as handled: all-errored batches, resetTurnState during barrier wait, deferred continuation after barrier-gated continuation.

github run

- `_hasIncompleteToolBatch`: replace `[...this.messages].reverse().find(...)` with a zero-allocation backward scan — it runs on every barrier poll tick and `this.messages` can be large. - Document that `_pendingInteractionPromise` is a single slot: awaiting it is a "wake sooner" optimization, not the correctness gate (`_hasIncompleteToolBatch` re-checked each loop is). Clarifies the N>=3 near-simultaneous-arrival case. - ai-chat: note why no `_continuationBarrierActive`-style guard is needed (the barrier runs inside the exclusive continuation turn; a sibling result hits the merge branch of `_enqueueAutoContinuation`, so the turn queue serializes it). No behavior change. Co-authored-by: Cursor <cursoragent@cursor.com>

…barrier (#1649) `_hasIncompleteToolBatch` only matched `type.startsWith("tool-")` on the settled side, but the pending side has no type filter — so a `dynamic-tool` in `input-available` counted as pending while a settled `dynamic-tool` (`output-available`/etc.) did NOT count as settled. A parallel batch with a settled `dynamic-tool` beside a pending `tool-*` would therefore read as not-mid-batch and the continuation would fire prematurely — the exact #1649 failure, for dynamic-tool batches. Match `dynamic-tool` alongside `tool-*` on the settled side, consistent with every other tool-part check in both packages (`_incompleteToolCallIds`, `_repairInterruptedMessages`, `_partialHasSettledToolResults`, ai-chat's sanitizer). Regression test added in each package: a settled `dynamic-tool` beside a pending `tool-*` holds the barrier until the regular tool resolves. Co-authored-by: Cursor <cursoragent@cursor.com>

threepointone · 2026-06-02T08:49:21Z

/bonk please review again

ask-bonk

Re-review — Parallel Tool-Call Barrier (#1649 fix)

Went through all three commits on the branch (e65bdf0 initial fix, bddfa14 review polish, f7bb27c dynamic-tool fix), the full diff against main, and verified the issues from the first review round.

Previous review items — status

Item	Status
`dynamic-tool` omitted from settled check (Devin flagged)	Fixed in `f7bb27c` — both packages now have `record.type.startsWith("tool-") \|\| record.type === "dynamic-tool"` on the settled side, with regression tests
`[...this.messages].reverse().find(...)` allocation (bonk flagged)	Fixed in `bddfa14` — replaced with zero-allocation backward `for` loop in both packages
`_pendingInteractionPromise` single-slot comment	Fixed — clear comment added in both `_awaitToolBatch` (Think) and `_awaitPendingInteractionBarrier` (ai-chat) explaining the single-slot is a wake-up optimization, not the correctness gate
ai-chat missing `_continuationBarrierActive` guard comment	Fixed — JSDoc on `_awaitPendingInteractionBarrier` explains why no guard is needed (turn queue serialization)

All four items are resolved.

Architecture verification

Confirmed the following structural properties:

No barrier bypass paths. Every entry into _fireAutoContinuation in Think goes through _fireAutoContinuationWhenStable — the timer callback (line ~8698) and _activateDeferredContinuation (line ~8897) both route through the stable wrapper. No direct calls exist outside the wrapper itself.
Double-fire prevention in Think is correct. _fireAutoContinuationWhenStable has three guards: (a) !pending early return, (b) pastCoalesce check (continuation already running), (c) _continuationBarrierActive check (barrier already waiting). _fireAutoContinuation clears the timer synchronously before any async work. resetTurnState clears _continuationBarrierActive.
ai-chat barrier placement is correct. _awaitPendingInteractionBarrier() is called after the prerequisite loop exits in _awaitPendingAutoContinuationPrerequisite, not inside it. The break change (line ~2101, ~2114) replaces return true to allow fall-through to the barrier. The return true after the barrier (line ~2127) preserves the method contract.
resetTurnState teardown is complete in both packages. Think clears _continuationBarrierActive, _continuationTimer, _pendingInteractionPromise, and calls _continuation.clearAll(). ai-chat clears _pendingInteractionPromise, resets the turn queue, and calls _continuation.clearAll(). Both barrier loops check !this._continuation.pending at the top of each iteration, so a chat clear during a barrier wait causes immediate bail-out.

Test coverage

The new tests are well-structured:

Think: 3 new test cases — fast/slow parallel batch, dynamic-tool mixed batch, and deferred-continuation coalescing (updated from the old "deferred tool result runs after active continuation" test that encoded the pre-barrier behavior).
ai-chat: 2 new test cases — fast/slow parallel batch, dynamic-tool mixed batch. Plus the stream-error continuation test updated to send both results before expecting the continuation.

All tests verify: (a) zero continuations after the first result + delay, (b) exactly one continuation after the batch completes, (c) no spurious repair of the slow sibling.

Test coverage gaps (non-blocking)

Three scenarios lack test coverage. None are blocking for this PR but are worth noting for follow-up:

60s timeout path. No test exercises AUTO_CONTINUATION_PENDING_TOOL_TIMEOUT_MS firing when a tool result genuinely never arrives. The console.warn + fall-through-to-repair path is untested. Testing with a reduced timeout constant would be straightforward and would catch regressions where the timeout breaks (e.g., the barrier hangs forever on orphaned tools).
resetTurnState / chat clear during active barrier wait. The barrier loop's !this._continuation.pending bail-out is the safety valve for this, but no test sends a chat clear while the barrier is actively polling between two tool results. A regression here would cause the barrier to hold the isolate after a clear.
N >= 3 parallel tool calls. All barrier tests use exactly 2 tools. The barrier logic is a simple "any pending AND any settled = hold" scan so 2 tools is likely sufficient, but a 3-tool test would catch potential off-by-one issues with the settled/pending detection (e.g., "all tools must be settled" vs "at least one settled" confusion).

Verdict

The fix is correct and complete. The leaf-scoped _hasIncompleteToolBatch() predicate precisely targets the #1649 race without over-blocking. The three-layer guard in Think (pastCoalesce / barrierActive / timer clearing) prevents double-fires. The ai-chat design correctly relies on turn-queue serialization instead. The dynamic-tool fix and allocation optimization from review round 1 are both landed.

The test gaps above are minor — consider them for follow-up alongside #1650 (event-driven barrier).

Looks good.

ask-bonk · 2026-06-02T08:53:27Z

Review posted on PR #1651. Here's the summary:

All four items from the first review round are resolved — the dynamic-tool fix, the allocation optimization, and the two documentation comments are all landed across the three commits.

The fix is correct. The leaf-scoped _hasIncompleteToolBatch() barrier precisely targets the #1649 race (fast tool result triggering continuation while slow sibling is still in-flight) without over-blocking legitimate continuation paths. The double-fire prevention in Think (three-layer guard + timer clearing) and ai-chat's reliance on turn-queue serialization are both sound.

Three minor test gaps noted (non-blocking): no coverage for the 60s timeout path, no resetTurnState during active barrier wait test, and no N>=3 parallel tool calls test. These are follow-up material alongside #1650.

github run

…1649 follow-up) (#1657) * fix(think): serialize parallel client-tool result/approval applies (#1649 follow-up) ## Problem The auto-continuation barrier added in #1651 stopped *premature* continuation, but a deeper data race remained in Think. Each `cf_agent_tool_result` / `cf_agent_tool_approval` WebSocket message triggers a read-modify-write of the whole assistant message via `_applyToolUpdateToMessages`, which awaits a storage read (`await this.session.getHistory()`) before writing. When the model fans out parallel tool calls in a single assistant step, the per-result applies run concurrently: 1. apply A and apply B both `await` the storage read and resolve with the same snapshot — every sibling still `input-available`. 2. A patches only its own part and writes the full message back. 3. B (holding the stale snapshot) patches only *its* part and writes the full message back, clobbering A's result to `input-available`. The clobber lands in both durable storage and the `this.messages` cache, so `_hasIncompleteToolBatch()` keeps seeing an unanswered sibling. The continuation barrier then times out and the transcript-repair backstop errors the lost call with "The tool call was interrupted before a result was recorded." ## Fix Introduce a per-instance serialization tail (`_interactionApplyTail`) and route every result/approval apply through `_enqueueInteractionApply`, which chains each apply off the previous one. This makes each read-modify-write atomic and ordered, so siblings can no longer clobber each other. - `.then(run, run)` is used so a rejected apply can't poison the rest of the batch. - `_pendingInteractionPromise` still tracks the newest link; because the chain is serial, the barrier's single-slot wake-up transitively waits for every predecessor. - `resetTurnState` drops the tail so a new turn never queues behind a stale (possibly hung) apply from a reset turn. ## ai-chat (defensive) The same serialization is added to `@cloudflare/ai-chat` for symmetry. ai-chat does **not** exhibit this bug today: `_findAndUpdateToolPart` performs the read-modify-write synchronously (no await between reading `this.messages` and the synchronous SQLite write), so concurrent applies cannot interleave. The queue is a forward-guard that keeps the invariant safe if that apply ever gains an await (e.g. async storage). ## Tests - Added a deterministic serialization test in both packages (`testInteractionApplySerialization`) that drives two overlapping read-modify-writes through the queue. Without serialization the result is 1 (clobbered); serialized it is 2. Verified the Think test FAILS without the fix (`expected 1 to be 2`). - Added an integration test in Think asserting three concurrently-sent parallel results all settle to `output-available`, none is repaired, and exactly one continuation runs. - `_enqueueInteractionApply` is `protected` + `@internal` so the test agents can drive it deterministically (the vitest-workers harness can't reproduce the real storage interleave because DO SQLite is synchronous). All checks pass: format, oxlint, typecheck (91/91), think (486) and ai-chat (566) suites. * fix(think): apply client-tool results/approvals that arrive mid-stream (#1649 follow-up) ## Problem The serialization fix (prior commit) stopped parallel client-tool results from clobbering each other, but the reporter still hit "The tool call was interrupted before a result was recorded." on every client-resolved tool call — even single (non-parallel) ones. Root cause is an ordering inversion between streaming and persistence: - During a streaming turn the assistant message lives ONLY in the in-flight `StreamAccumulator`. It is not written to durable storage (or added to `this.messages`) until `_persistAssistantMessage` runs at the end of the stream. - The `tool-input-available` chunk, however, is broadcast to the client mid-stream. A fast client resolves the tool and sends `cf_agent_tool_result` BEFORE the message is ever persisted. - `_applyToolUpdateToMessages` only scanned durable storage, so the apply found no matching message and silently no-op'd. - The end-of-stream persist then wrote the tool part as `input-available`, and the auto-continuation's transcript-repair backstop errored it as "interrupted". This is why "parallel sometimes works": a slow sibling (e.g. an RPC tool) resolves AFTER the stream has finished and persisted, so its apply finds the message in storage. ## Fix `_applyToolUpdateToMessages` now applies the update to BOTH: 1. the in-flight `StreamAccumulator` (in place), so the result rides into the eventual end-of-stream persist instead of being dropped; and 2. durable storage (unchanged), which still handles already-persisted messages — including partials written mid-stream by stall recovery and cross-message tool results that target an earlier message. Applying to both is deliberately monotonic: a later persist of `accumulator.toMessage()` (e.g. stall-recovery partial persist) carries the settled result and cannot downgrade it back to `input-available`. The accumulator is exposed via a new `_streamingAssistant` field for the duration of each streaming turn and cleared on every exit path (`_streamResult` normal + stall-scheduled/exhausted early returns, `_streamResultToRpcCallback` finally) and in `resetTurnState`. The field is intentionally ephemeral and never persisted; it is only ever set while the isolate is held alive by `keepAliveWhile`, so hibernation cannot strand it. This mirrors `@cloudflare/ai-chat`'s existing `_streamingMessage` handling, generalized to also cover the post-persist path. It fixes both client tool RESULTS and APPROVALS, since both flow through `_applyToolUpdateToMessages`. ## Why correctness holds for the auto-continuation barrier Mid-stream, the in-flight message is not yet in `this.messages`, so `_hasIncompleteToolBatch()` inspects a stale (prior) leaf. This does NOT break correctness: the continuation is enqueued behind the active streaming turn on the turn queue, and that turn persists the now-settled result before the continuation's transcript repair runs. The ordering guarantee is the turn queue, not the barrier's cache view. Documented inline. ## Tests - Deterministic reproduction (`simulateMidStreamClientToolResult`): a tool call exists only in the accumulator when the result is applied, then the message is persisted. Asserts the part ends `output-available`. Fails on the pre-fix code (persists `input-available`). - Deterministic approval variant (`simulateMidStreamClientToolApproval`): an `approval-requested` part reaches `approval-responded` via the same path. - Full-pipeline integration test over the real WS protocol: a new slow client-tool mock model finalizes the tool call (emits `tool-call`) then holds the stream open with trailing text. The test waits via `streamingToolCallState` until the call is exposed, sends `cf_agent_tool_result` while the stream is still open, then asserts the part is `output-available`, carries the output, and that NO `chat:transcript:repaired` fired for it. E2E (wrangler-spawning) was assessed and deferred: the integration test already exercises the full server pipeline in the real Workers runtime and times the mid-stream send deterministically; the e2e harness here is built for process-kill/eviction scenarios where a timing-based test would be flaky and largely redundant. ## Verification - format, oxlint (0/0), typecheck (91/91 projects) - full think suite: 488 passed (pre-additions); client-tools: 61 passed including the 3 new tests

ask-bonk Bot reviewed Jun 2, 2026

View reviewed changes

devin-ai-integration Bot reviewed Jun 2, 2026

View reviewed changes

ask-bonk Bot reviewed Jun 2, 2026

View reviewed changes

threepointone and others added 2 commits June 2, 2026 09:40

ask-bonk Bot reviewed Jun 2, 2026

View reviewed changes

threepointone mentioned this pull request Jun 2, 2026

_scheduleAutoContinuation 50ms timer fires before all parallel client-tool results arrive → MissingToolResultsError #1649

Closed

threepointone merged commit d118d11 into main Jun 2, 2026
4 checks passed

threepointone deleted the fix/parallel-client-tool-continuation-1649 branch June 2, 2026 08:55

threepointone mentioned this pull request Jun 2, 2026

fix(think): serialize parallel client-tool result/approval applies (#1649 follow-up) #1657

Merged

github-actions Bot mentioned this pull request Jun 2, 2026

Version Packages #1660

Merged

threepointone mentioned this pull request Jun 3, 2026

refactor: unify duplicated chat-recovery/repair machinery into the shared agents/chat layer (N3) #1642

Open

Conversation

threepointone commented Jun 2, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Scoped to the leaf (the key design choice)

Mechanics

Hardening found during review

Why server-side (not the client-side option from the issue)

Tests

Follow-up

Test plan

Uh oh!

changeset-bot Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

threepointone commented Jun 2, 2026

Uh oh!

pkg-pr-new Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ask-bonk Bot left a comment

Choose a reason for hiding this comment

Deep Review — Parallel Tool-Call Barrier (#1649 fix)

1. _pendingInteractionPromise is a single slot — N≥3 tool calls with near-simultaneous arrivals

2. _hasIncompleteToolBatch — [...this.messages].reverse() allocation per poll tick

3. ai-chat: no _continuationBarrierActive guard — is double-entry possible?

4. Timeout + keepAliveWhile holds the isolate for up to 60 seconds on orphan

5. Edge case: all tool calls in the batch fail/error

6. Edge case: tool result arrives after a resetTurnState / chat clear

7. The updated test in client-tools-continuation.test.ts — both results sent ~synchronously

Summary

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

ask-bonk Bot commented Jun 2, 2026

Uh oh!

threepointone commented Jun 2, 2026

Uh oh!

ask-bonk Bot left a comment

Choose a reason for hiding this comment

Re-review — Parallel Tool-Call Barrier (#1649 fix)

Previous review items — status

Architecture verification

Test coverage

Test coverage gaps (non-blocking)

Verdict

Uh oh!

ask-bonk Bot commented Jun 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

threepointone commented Jun 2, 2026 •

edited by devin-ai-integration Bot

Loading

changeset-bot Bot commented Jun 2, 2026 •

edited

Loading

pkg-pr-new Bot commented Jun 2, 2026 •

edited

Loading

1. `_pendingInteractionPromise` is a single slot — N≥3 tool calls with near-simultaneous arrivals

2. `_hasIncompleteToolBatch` — `[...this.messages].reverse()` allocation per poll tick

3. ai-chat: no `_continuationBarrierActive` guard — is double-entry possible?

4. Timeout + `keepAliveWhile` holds the isolate for up to 60 seconds on orphan

6. Edge case: tool result arrives after a `resetTurnState` / chat clear

7. The updated test in `client-tools-continuation.test.ts` — both results sent ~synchronously