fix(ai-chat): simplify turn coordination API #1151

Merged
threepointone merged 12 commits into main from docs/turn-coordination-patterns
Mar 23, 2026

whoiskatrin (Contributor) commented Mar 22, 2026

Summary

Simplifies the turn coordination API surface from 6 protected helpers to 3, and rewrites waitForPendingInteractionResolution() with event-driven semantics (no more polling).

Builds on the turn serialization machinery from #1142. That PR added 6 protected helpers; this PR consolidates them before any release ships.


What changed

1. API surface: 6 protected helpers → 3

Kept (protected):

| Helper | Purpose |
| --- | --- |
| waitUntilStable() | Waits until fully stable: no active stream, no pending interactions, no queued continuations |
| resetTurnState() | Aborts the active turn and invalidates queued continuations; use in clear interceptors |
| hasPendingInteraction() | Synchronous check for pending tool input or approval |

Demoted to private:

| Helper | Why |
| --- | --- |
| isChatTurnActive() | Sync check with no standalone subclass use case; callers always go on to wait or abort |
| waitForIdle() | Strict subset of waitUntilStable(), which drains the queue as its first step |
| abortActiveTurn() | resetTurnState() is correct in every real scenario (clear, workflow switch) |

2. waitForPendingInteractionResolution() → waitUntilStable()

Renamed to match the actual semantics ("waits until the conversation is fully stable").

Before: Polled hasPendingInteraction() on an interval. The poll could resolve at the exact moment the SDK queued an auto-continuation, so callers needed the fragile waitForIdle() → waitForPendingInteractionResolution() → waitForIdle() dance.

After: Event-driven via a new _pendingInteractionPromise field. Drains _chatTurnQueue, awaits the in-flight tool apply, then loops to catch any autoContinue continuation. One call does everything:

const ready = await this.waitUntilStable({ timeout: 30_000 });
if (!ready) return;
await this.saveMessages([...this.messages, syntheticMessage]);
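
The event-driven loop described above can be pictured roughly as follows. This is a minimal, self-contained sketch under stated assumptions, not the code in packages/ai-chat/src/index.ts: the class `TurnCoordinator`, its fields `turnQueue` and `pendingInteraction` (stand-ins for `_chatTurnQueue` and `_pendingInteractionPromise`), and the methods `enqueueTurn`, `trackInteraction`, and `raceDeadline` are hypothetical names chosen for illustration.

```typescript
// Hypothetical model of the coordination state; all names are illustrative.
const TIMED_OUT = Symbol("timed-out");

class TurnCoordinator {
  // Tail of the serialized turn queue (stand-in for _chatTurnQueue).
  private turnQueue: Promise<void> = Promise.resolve();
  // In-flight tool-result / approval apply (stand-in for _pendingInteractionPromise).
  private pendingInteraction: Promise<void> | null = null;

  enqueueTurn(run: () => Promise<void>): void {
    // Chain after the current tail; swallow errors so the queue keeps moving.
    this.turnQueue = this.turnQueue.then(run).catch(() => {});
  }

  trackInteraction(apply: Promise<void>): void {
    // The tracked promise settles only after the field has been cleared.
    const tracked: Promise<void> = apply
      .catch(() => {})
      .then(() => {
        if (this.pendingInteraction === tracked) this.pendingInteraction = null;
      });
    this.pendingInteraction = tracked;
  }

  async waitUntilStable(opts: { timeout: number }): Promise<boolean> {
    const deadline = Date.now() + opts.timeout;
    for (;;) {
      // Step 1: drain whatever is queued right now.
      const tail = this.turnQueue;
      if ((await this.raceDeadline(tail, deadline)) === TIMED_OUT) return false;
      // Step 2: await the in-flight interaction, then loop again because
      // resolving it may have queued an auto-continuation turn.
      const pending = this.pendingInteraction;
      if (pending) {
        if ((await this.raceDeadline(pending, deadline)) === TIMED_OUT) return false;
        continue;
      }
      // Step 3: stable only if nothing new was queued while we waited.
      if (this.turnQueue === tail) return true;
    }
  }

  private async raceDeadline<T>(
    promise: Promise<T>,
    deadline: number
  ): Promise<T | typeof TIMED_OUT> {
    const remainingMs = Math.max(0, deadline - Date.now());
    let timer: ReturnType<typeof setTimeout> | undefined;
    try {
      return await Promise.race([
        promise,
        new Promise<typeof TIMED_OUT>((resolve) => {
          timer = setTimeout(() => resolve(TIMED_OUT), remainingMs);
        })
      ]);
    } finally {
      clearTimeout(timer); // runs on resolve, reject, and timeout alike
    }
  }
}
```

Because the loop re-reads the queue tail after every await, a continuation queued while the interaction is resolving is picked up on the next pass, which is why a single call can replace the earlier three-step recipe in this model.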

3. resetTurnState() added

Encapsulates the full reset (epoch increment + abort controllers + pending promise cleanup) that the built-in CF_AGENT_CHAT_CLEAR handler previously did inline. Subclasses intercepting clear can now call one method instead of knowing three internals.
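
As a rough illustration of the three steps bundled into the reset (epoch increment, abort controllers, pending promise cleanup), here is a small sketch. The `TurnState` class and its members `epoch`, `abortControllers`, `pendingInteraction`, `beginTurn`, and `isCurrent` are hypothetical names for this example and are not taken from the package source.

```typescript
// Hypothetical stand-in for the agent's in-memory turn state.
class TurnState {
  epoch = 0;
  readonly abortControllers = new Map<string, AbortController>();
  pendingInteraction: Promise<void> | null = null;

  // Start a turn and remember the epoch it was created under.
  beginTurn(id: string): { signal: AbortSignal; epoch: number } {
    const controller = new AbortController();
    this.abortControllers.set(id, controller);
    return { signal: controller.signal, epoch: this.epoch };
  }

  // A queued continuation would check this before running.
  isCurrent(epoch: number): boolean {
    return epoch === this.epoch;
  }

  resetTurnState(): void {
    // 1. Bump the epoch so continuations queued earlier see they are stale.
    this.epoch += 1;
    // 2. Abort every active stream.
    for (const controller of this.abortControllers.values()) controller.abort();
    this.abortControllers.clear();
    // 3. Forget the in-flight interaction.
    this.pendingInteraction = null;
  }
}
```

A subclass that intercepts clear would then call the single reset method before doing its own scoped delete, instead of touching the three pieces of state individually.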

4. Unhandled rejection suppression

The _pendingInteractionPromise cleanup chains now attach .catch(() => {}) so rejected tool-result / approval applies do not leak as unhandled rejections.
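
The reason the extra `.catch(() => {})` matters is that a cleanup chain is a new, derived promise: if the original apply rejects, the derived promise rejects too, and nothing is listening to it. The sketch below shows the general pattern only; `InteractionTracker`, `track`, and `pending` are hypothetical names, and the use of `.finally()` for cleanup is an assumption of this example.

```typescript
// Hypothetical tracker illustrating the suppression pattern.
class InteractionTracker {
  pending: Promise<unknown> | null = null;

  track<T>(apply: Promise<T>): Promise<T> {
    this.pending = apply;
    // The cleanup chain is a derived promise. Without the trailing catch,
    // a rejected apply would leave this derived promise rejected with no
    // handler attached, which the runtime reports as an unhandled rejection.
    apply
      .finally(() => {
        if (this.pending === apply) this.pending = null;
      })
      .catch(() => {});
    // The caller still receives the original promise and its rejection.
    return apply;
  }
}
```

Only the bookkeeping branch swallows the error; code that awaits the returned promise still observes the failure.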

5. Expanded documentation

The docs/chat-agents.md turn coordination section now includes:

  • When coordination is needed (and when it is not)
  • How the turn lifecycle works (turns vs pending interactions)
  • Usage guidance for each helper with code examples
  • Hibernation behavior (all coordination state is in-memory; messages persist in SQLite)
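
To make the hibernation point concrete: a check that is derived purely from stored messages keeps working after a wake, because it needs no in-memory state. The sketch below is an assumption-heavy illustration. The `Message`, `ToolPart`, and `TextPart` shapes and the state names `"input-available"` and `"approval-requested"` are hypothetical simplifications, not the package's actual types.

```typescript
// Hypothetical, simplified message shapes for illustration only.
type ToolPart = {
  type: string;
  state: "input-available" | "approval-requested" | "output-available" | "output-error";
};
type TextPart = { type: "text"; text: string };
type Message = { role: "user" | "assistant"; parts: Array<ToolPart | TextPart> };

// Synchronous check computed only from the messages themselves, so it does
// not depend on any coordination state that is lost on hibernation.
function hasPendingInteraction(messages: readonly Message[]): boolean {
  return messages.some(
    (message) =>
      message.role === "assistant" &&
      message.parts.some(
        (part) =>
          "state" in part &&
          (part.state === "input-available" || part.state === "approval-requested")
      )
  );
}
```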

Files changed

| File | Change |
| --- | --- |
| packages/ai-chat/src/index.ts | Demote 3 helpers to private; rename waitForPendingInteractionResolution → waitUntilStable; add _pendingInteractionPromise; rewrite with event-driven semantics; add resetTurnState(); suppress unhandled rejections |
| packages/ai-chat/src/tests/pending-interaction.test.ts | Rename to waitUntilStableForTest; add WebSocket tool-result, tool-approval, auto-continuation, active-turn timeout, and resetTurnState coverage |
| packages/ai-chat/src/tests/worker.ts | Add waitUntilStableForTest, resetTurnStateForTest; rewrite demoted wrappers to use as unknown as casts for private field access |
| docs/chat-agents.md | Rewrite turn coordination section with lifecycle explanation, usage guidance, and examples |
| packages/ai-chat/README.md | Trim API table from 6 to 3 helpers |
| .changeset/... | Patch changeset for @cloudflare/ai-chat |

Verification

  • npm run check passes (sherif, export checks, oxfmt, oxlint, typecheck across 65 projects)
  • packages/ai-chat vitest suite passes (315 tests)

Reviewer notes

  • Recommended review order: index.ts (API changes) → tests → docs
  • All coordination state is in-memory and intentionally does not survive hibernation — the queue resets on wake, and hasPendingInteraction() reads from persisted messages in SQLite
  • Public surface is unchanged: all touched methods are protected or private
  • The as unknown as pattern in test wrappers for private field access matches the existing getAbortControllerCount() pattern in the same file
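
For readers unfamiliar with the `as unknown as` pattern mentioned in the last note, the sketch below shows the general TypeScript technique. `Agent`, `AgentInternals`, `TestAgent`, and `isChatTurnActiveForTest` are hypothetical names for this example; the real wrappers live in packages/ai-chat/src/tests/worker.ts and may look different.

```typescript
// Hypothetical base class with a private helper.
class Agent {
  private active = false;

  private isChatTurnActive(): boolean {
    return this.active;
  }

  start(): void {
    this.active = true;
  }
}

// Structural view of the private members the test needs to reach.
type AgentInternals = { isChatTurnActive(): boolean };

class TestAgent extends Agent {
  // Test-only wrapper: `private` is enforced at compile time only, so a
  // double cast through `unknown` lets the test call the method at runtime.
  isChatTurnActiveForTest(): boolean {
    return (this as unknown as AgentInternals).isChatTurnActive();
  }
}
```

The cast keeps the production class free of test hooks, at the cost of losing compiler checks if the private member is later renamed.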

…interception

- expand waitForPendingInteractionResolution JSDoc to warn about the
  auto-continuation race: when a tool interaction resolves the SDK queues
  a continuation turn at the same moment the poll resolves; a second
  waitForIdle() is required before reading this.messages or calling
  saveMessages(); add the full three-step recipe in the docstring

- expand abortActiveTurn JSDoc to warn that subclasses intercepting
  CF_AGENT_CHAT_CLEAR and returning early bypass the SDK's built-in epoch
  increment and abort; they must call abortActiveTurn() explicitly to stop
  any active stream before performing a scoped delete

- expand docs/chat-agents.md turn coordination section: add helper
  reference table, three-step saveMessages pattern with explanation of why
  the second waitForIdle is needed, workflow-switch abort example, and
  CF_AGENT_CHAT_CLEAR interception guide with the mandatory abortActiveTurn
  call and consequences of omitting it

changeset-bot bot commented Mar 22, 2026

🦋 Changeset detected

Latest commit: af096ac

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @cloudflare/ai-chat | Patch |

@whoiskatrin whoiskatrin marked this pull request as ready for review March 22, 2026 16:46

devin-ai-integration bot (Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 1 additional finding.

@threepointone
hmm we might have to revisit #1142, I'll have a look tomorrow

pkg-pr-new bot commented Mar 22, 2026

agents

npm i https://pkg.pr.new/agents@1151

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1151

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1151

hono-agents

npm i https://pkg.pr.new/hono-agents@1151

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1151

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1151

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1151

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1151

commit: 6006449

whoiskatrin changed the title from "docs(ai-chat): document turn coordination patterns and clear handler interception" to "fix(ai-chat): event-driven turn coordination, resetTurnState(), drop polling" on Mar 22, 2026
whoiskatrin changed the title from "fix(ai-chat): event-driven turn coordination, resetTurnState(), drop polling" to "fix(ai-chat): tighten pending interaction coordination and clear resets" on Mar 22, 2026
whoiskatrin and others added 3 commits March 22, 2026 20:41
Rename waitForPendingInteractionResolution() to waitUntilStable() and make it wait for a fully stable conversation (including queued continuation turns). Demote isChatTurnActive(), waitForIdle(), and abortActiveTurn() to private; expose resetTurnState() as the subclass-facing (protected) way to abort and invalidate queued continuations. Update docs and README to reflect the new helpers and usage guidance, and adjust tests/worker helpers to call the renamed APIs or access internal fields for test-only behavior. Also tighten pending-interaction bookkeeping to avoid leaking rejected tool-result / approval apply promises.
threepointone changed the title from "fix(ai-chat): tighten pending interaction coordination and clear resets" to "fix(ai-chat): simplify turn coordination API" on Mar 23, 2026
Replace the inline timeout Promise in a Promise.race with an explicit timer variable, await the race result, then clearTimeout(timer) before returning to avoid the timeout callback firing after the race completes. Also update a comment to reference waitUntilStable instead of waitForPendingInteractionResolution for clarity.
@threepointone threepointone merged commit b0c52a5 into main Mar 23, 2026
1 check passed
@threepointone threepointone deleted the docs/turn-coordination-patterns branch March 23, 2026 11:09
@github-actions github-actions bot mentioned this pull request Mar 23, 2026

devin-ai-integration bot (Contributor) left a comment

Devin Review found 1 new potential issue.

View 13 additional findings in Devin Review.

Comment on lines +1068 to +1077
const remainingMs = Math.max(0, deadline - Date.now());
let timer: ReturnType<typeof setTimeout>;
const result = await Promise.race([
  promise,
  new Promise<typeof TIMED_OUT>((resolve) => {
    timer = setTimeout(() => resolve(TIMED_OUT), remainingMs);
  })
]);
clearTimeout(timer!);
return result;
🟡 Timer leak in _awaitWithDeadline when the input promise rejects

If the promise argument rejects, Promise.race rejects and the await throws, so the clearTimeout(timer!) on line 1076 is never reached. The timer keeps running in the background until it fires after remainingMs. This is called from waitUntilStable at packages/ai-chat/src/index.ts:1017, where a rejected _pendingInteractionPromise (e.g., from a SQLite failure in _findAndUpdateToolPart) triggers catch { continue }, re-entering the loop and potentially creating additional leaked timers. Each leaked timer lives for up to remainingMs before self-cleaning, so the impact is bounded but unnecessary.

Suggested change
- const remainingMs = Math.max(0, deadline - Date.now());
- let timer: ReturnType<typeof setTimeout>;
- const result = await Promise.race([
-   promise,
-   new Promise<typeof TIMED_OUT>((resolve) => {
-     timer = setTimeout(() => resolve(TIMED_OUT), remainingMs);
-   })
- ]);
- clearTimeout(timer!);
- return result;
+ const remainingMs = Math.max(0, deadline - Date.now());
+ let timer: ReturnType<typeof setTimeout>;
+ try {
+   const result = await Promise.race([
+     promise,
+     new Promise<typeof TIMED_OUT>((resolve) => {
+       timer = setTimeout(() => resolve(TIMED_OUT), remainingMs);
+     })
+   ]);
+   return result;
+ } finally {
+   clearTimeout(timer!);
+ }