
Follow-up cleanups for recent chat and voice PRs #1497

Merged
threepointone merged 9 commits into main from cleanups on May 11, 2026

Conversation


@threepointone threepointone commented May 11, 2026

Thank you @whoiskatrin for the very good PRs here. I spent some time reviewing the recent fixes in detail, and I am adding a handful of cleanups on top because I am, unfortunately, a little obsessive about tightening nearby edge cases once I see them. Sorry for the extra follow-up churn; these are all meant to preserve the direction of the landed PRs while hardening the corners around them.

Summary

  • Adds follow-up robustness for the recently landed voice STT, voice streaming, useVoiceAgent, agent-tool recovery, chat resume negotiation, and ai-chat socket disconnect fixes.
  • Keeps the fixes scoped to the reviewed behavior, with regression tests for the edge cases we found during review.
  • Updates relevant changesets/docs/readmes where the behavior is user-facing.

Commit Details

bd1348a4 - fix(voice): harden Workers AI STT turn handling

Builds on PR #1458.

This expands the Workers AI STT fix beyond the original late/empty final transcript issue:

  • Tracks the current Flux transcript across Update, StartOfTurn, and TurnResumed, so an empty EndOfTurn can still emit the best known utterance.
  • Uses StartOfTurn as a model-driven speech-start signal for server-side barge-in.
  • Adds playback_interrupt handling so the server can tell the client to stop playback when user speech starts during assistant audio.
  • Adds a playback generation guard in VoiceClient so a stale decodeAudioData() task cannot start playing after an interrupt has already arrived.
  • Adds server/client regression coverage for Flux turn handling, Nova late messages, barge-in, playback interruption, and decode races.
  • Updates the voice docs/readmes to describe model-driven barge-in behavior.
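
For illustration, the transcript-tracking idea in the first bullet could look roughly like the sketch below. The `FluxEvent` shape and the `TurnTranscriptTracker` class are hypothetical names invented for this example, not the types used in `@cloudflare/voice`; the sketch only shows how an empty `EndOfTurn` can fall back to the best transcript seen earlier in the turn.

```typescript
// Hypothetical event shape; the real Flux payloads may differ.
type FluxEventType = "Update" | "StartOfTurn" | "TurnResumed" | "EndOfTurn";

interface FluxEvent {
  type: FluxEventType;
  transcript?: string;
}

// Hypothetical tracker: remembers the best transcript seen during a turn.
class TurnTranscriptTracker {
  private best = "";

  // Returns the utterance to emit when a turn ends, otherwise null.
  handle(event: FluxEvent): string | null {
    const text = (event.transcript ?? "").trim();
    if (event.type === "StartOfTurn") {
      // A new turn starts: drop anything left over from the previous one.
      this.best = text;
      return null;
    }
    if (event.type === "Update" || event.type === "TurnResumed") {
      // Only overwrite with non-empty text so a blank update cannot erase it.
      if (text) this.best = text;
      return null;
    }
    // EndOfTurn: prefer the final transcript, fall back to the best known one.
    const utterance = text || this.best;
    this.best = "";
    return utterance || null;
  }
}
```

In this sketch `StartOfTurn` is also the natural place to trigger barge-in, since it is the first model-driven signal that the user has started speaking.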

f7b1b344 - fix(chat): harden stream resume negotiation close races

Builds on PR #1463.

This extends the sendIfOpen pattern from the original replay fallback fix to the rest of the resume negotiation paths:

  • Wraps bare resume protocol sends in @cloudflare/think and @cloudflare/ai-chat so TypeError: WebSocket send() after close does not bubble out of close races.
  • Only records pending resume connections when the STREAM_RESUMING notification send actually succeeds.
  • Applies the same close-safe behavior to ContinuationState.sendResumeNone() in the shared agents chat continuation path.
  • Adds focused tests for closed connection handling and non-close send errors.
  • Updates changesets to reflect the broader resume negotiation hardening.
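
A minimal sketch of the close-safe send pattern described above, assuming a structural `SocketLike` type and a JSON message shape that are both invented for this example. The real `sendIfOpen` helper in the packages may detect close races differently; the point here is only that a send racing with close returns `false`, other errors still surface, and the pending set is updated only on success.

```typescript
// Minimal structural type so the sketch does not depend on a specific runtime.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const OPEN: number = 1; // numeric value of WebSocket.OPEN

// Hypothetical helper: returns true only when the message was handed to the socket.
function sendIfOpen(socket: SocketLike, data: string): boolean {
  if (socket.readyState !== OPEN) return false;
  try {
    socket.send(data);
    return true;
  } catch (err) {
    // The socket may have closed between the readyState check and send().
    if (socket.readyState !== OPEN) return false;
    throw err; // non-close errors still surface
  }
}

// Record a pending resume only when the notification actually went out.
// The message payload shape is an assumption made for this sketch.
function notifyResuming(socket: SocketLike, id: string, pending: Set<string>): void {
  const sent = sendIfOpen(socket, JSON.stringify({ type: "STREAM_RESUMING", id }));
  if (sent) pending.add(id);
}
```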

2954bd35 - fix(voice): parse raw NDJSON text streams

Builds on PR #1462.

This fills the NDJSON half of the documented iterateText() byte-stream support:

  • Parses raw newline-delimited JSON as well as SSE data: lines.
  • Handles OpenAI-style choices[0].delta.content, response, split byte chunks, final unterminated lines, raw [DONE], and malformed raw JSON lines.
  • Accepts valid SSE data:{...} lines without requiring a space after data:.
  • Preserves the earlier AI SDK textStream custom async iterator preference from PR #1462 (Fix voice TTS for AI SDK textStream responses).
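
To make the parsing rules above concrete, here is a rough sketch of a line parser that accepts both raw NDJSON and SSE `data:` lines. `extractText` and `TextLineParser` are hypothetical names for this example and are not the `iterateText()` implementation; the sketch covers split byte chunks, a final unterminated line, raw `[DONE]`, and malformed lines.

```typescript
// Pull text out of one line, which may be raw NDJSON or an SSE `data:` line.
function extractText(rawLine: string): string | null {
  let line = rawLine.trim();
  if (line.startsWith("data:")) line = line.slice(5).trim(); // space after `data:` is optional
  if (line === "" || line === "[DONE]") return null;
  try {
    const json = JSON.parse(line);
    const delta = json?.choices?.[0]?.delta?.content; // OpenAI-style chunk
    if (typeof delta === "string") return delta;
    if (typeof json?.response === "string") return json.response;
    return null;
  } catch {
    return null; // malformed line: skip rather than abort the stream
  }
}

// Hypothetical incremental parser over byte chunks.
class TextLineParser {
  private decoder = new TextDecoder();
  private buffer = "";

  push(chunk: Uint8Array): string[] {
    // stream: true keeps multi-byte characters split across chunks intact.
    this.buffer += this.decoder.decode(chunk, { stream: true });
    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // keep the trailing partial line
    return this.collect(lines);
  }

  // Call once at end of stream to handle a final line with no newline.
  flush(): string[] {
    const rest = this.buffer + this.decoder.decode();
    this.buffer = "";
    return this.collect([rest]);
  }

  private collect(lines: string[]): string[] {
    const out: string[] = [];
    for (const line of lines) {
      const text = extractText(line);
      if (text) out.push(text);
    }
    return out;
  }
}
```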

5aefa984 - fix(agents): defer recovered agent-tool finish hooks (#1476)

Builds on PR #1476.

This addresses recovery lifecycle gaps found during review of agent-tool finalization:

  • Defers recovered onAgentToolFinish hooks until after the agent's user onStart has completed, so user startup/mirror initialization runs before recovered finish hooks execute.
  • Keeps durable internal state and terminal client events finalized during recovery before user finish hooks run.
  • Isolates errors from deferred finish hooks through onError, so one failed hook does not prevent later recovered runs from draining or agent startup from completing.
  • Adds coverage for completed, still-running, uninspectable, replay-failure, deferred-hook ordering, and throwing-hook recovery scenarios.
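
The ordering described above can be sketched as follows. `RecoverySketch` and its members are hypothetical and stand in for the agent lifecycle; the sketch assumes finish hooks are queued during recovery, drained only after the user `onStart` succeeds, and that each hook failure is routed to `onError` without stopping the rest.

```typescript
type FinishHook = () => Promise<void> | void;

// Hypothetical stand-in for the agent startup and recovery lifecycle.
class RecoverySketch {
  private deferred: FinishHook[] = [];
  private userOnStart: () => Promise<void> | void;
  private onError: (err: unknown) => void;

  constructor(userOnStart: () => Promise<void> | void, onError: (err: unknown) => void) {
    this.userOnStart = userOnStart;
    this.onError = onError;
  }

  // Called during recovery: internal state is finalized elsewhere, the user hook waits.
  deferFinishHook(hook: FinishHook): void {
    this.deferred.push(hook);
  }

  async start(): Promise<void> {
    await this.userOnStart(); // if this throws, deferred hooks never run
    const hooks = this.deferred;
    this.deferred = [];
    for (const hook of hooks) {
      try {
        await hook();
      } catch (err) {
        this.onError(err); // isolate the failure so later hooks still drain
      }
    }
  }
}
```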

527c5ba2 - test(voice): cover useVoiceAgent enabled lifecycle (#1478)

Builds on PR #1478.

This adds follow-up tests/docs around the new useVoiceAgent({ enabled }) gate:

  • Verifies that enabling after a disabled mount connects using the latest query/capability token.
  • Verifies onReconnect fires for real connection identity changes while enabled, but not for first enable.
  • Documents that disabled action callbacks are safe no-ops and that first enable is treated as an initial connection rather than a reconnect.
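
As a rough illustration of the reconnect rule verified above, the decision can be written as a pure function. `ConnectionSnapshot` and `shouldFireReconnect` are hypothetical names, and representing connection identity as a string is an assumption of this sketch rather than how `useVoiceAgent` tracks it.

```typescript
// Hypothetical snapshot of the hook state between renders.
interface ConnectionSnapshot {
  enabled: boolean;
  identity: string | null; // assumed stand-in for whatever identifies a connection
}

// Decide whether onReconnect should fire when moving from prev to next.
function shouldFireReconnect(prev: ConnectionSnapshot, next: ConnectionSnapshot): boolean {
  if (!next.enabled || next.identity === null) return false;
  // First enable is an initial connection, not a reconnect.
  if (!prev.enabled || prev.identity === null) return false;
  return prev.identity !== next.identity;
}
```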

b1497715 - fix(ai-chat): close resumed streams on disconnect (#1487)

Builds on PR #1487.

PR #1487 correctly closed the original sendMessages() stream when the socket closes before a terminal frame. This extends the same transport-owned stream cleanup to the other WebSocket-fed paths:

  • Closes tool continuation streams if the socket closes before the resume handshake completes.
  • Closes and cleans request bookkeeping if the socket closes mid tool-continuation stream.
  • Closes resumed replay/live streams and removes their active request ids if the socket closes before done: true.
  • Adds focused regression tests for those disconnect paths.
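
The transport-owned cleanup described above could be sketched like this. `ActiveRequests` and `StreamController` are hypothetical names for this example; `StreamController` is shaped so that a `ReadableStreamDefaultController` would satisfy it. The sketch assumes every WebSocket-fed stream is tracked by request id and closed when the socket closes before a terminal frame.

```typescript
// Anything with close(); ReadableStreamDefaultController matches structurally.
interface StreamController {
  close(): void;
}

// Hypothetical bookkeeping for streams fed by a single socket.
class ActiveRequests {
  private controllers = new Map<string, StreamController>();

  track(requestId: string, controller: StreamController): void {
    this.controllers.set(requestId, controller);
  }

  // Terminal frame (done: true) arrived: normal completion.
  finish(requestId: string): void {
    const controller = this.controllers.get(requestId);
    if (!controller) return;
    this.controllers.delete(requestId);
    controller.close();
  }

  // Socket closed: close every stream that never saw a terminal frame.
  handleSocketClose(): void {
    const pending = Array.from(this.controllers.values());
    this.controllers.clear();
    for (const controller of pending) {
      try {
        controller.close();
      } catch {
        // Already closed or errored; nothing left to clean up.
      }
    }
  }

  get size(): number {
    return this.controllers.size;
  }
}
```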

Test Plan

Ran focused verification while developing the commits:

  • npm run build -w agents
  • npm run build -w @cloudflare/ai-chat
  • npm --workspace @cloudflare/voice run build
  • npm run test:workers -w @cloudflare/ai-chat -- src/tests/agent-tools.test.ts
  • npm run test:workers -w @cloudflare/ai-chat -- src/tests/ws-transport-resume.test.ts
  • npm run test:react -w @cloudflare/ai-chat -- use-agent-chat.test.tsx
  • npm run test:workers -w @cloudflare/think -- agent-tools.test.ts
  • npm --workspace @cloudflare/voice run test:react -- useVoiceAgent.test.tsx
  • npx tsc -p packages/ai-chat/src/tests/tsconfig.json --noEmit
  • Focused oxfmt --check / lint checks for edited files

Made with Cursor



threepointone and others added 6 commits May 11, 2026 12:42
Follow up on PR #1458 by preserving Flux turn transcripts across lifecycle events and using model-detected speech start for low-latency barge-in.

Co-authored-by: Cursor <cursoragent@cursor.com>
Follow-up to PR #1463: route stream-resume negotiation sends through close-safe helpers so WebSocket close races do not crash resume handling in think and ai-chat.

Co-authored-by: Cursor <cursoragent@cursor.com>
Follow-up to PR #1462: make the voice text stream parser honor its documented NDJSON support while preserving SSE parsing for AI text streams.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 potential issue.

View 6 additional findings in Devin Review.


🟡 Client-side interrupt detection bypasses #playbackGeneration guard, allowing stale audio to play after interrupt

The PR introduces #playbackGeneration and #stopPlayback() (at packages/voice/src/voice-client.ts:766-779) to prevent a race where a pending decodeAudioData resolves after playback is stopped and starts unwanted audio. #stopPlayback() increments #playbackGeneration, and #playAudio() checks it at lines 726 and 731. However, the client-side interrupt path in #processAudioLevel still uses the old manual pattern — it clears #activeSource, #playbackQueue, and #isPlaying without incrementing #playbackGeneration. If decodeAudioData is pending when the client-side interrupt fires, the decode resolves, the generation check passes (generation hasn't changed), and unwanted audio plays after the user interrupted.

(Refers to lines 863-867)
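
The race described in this finding can be reproduced in a small sketch. `PlaybackSketch` is a hypothetical class, and the `decode` callback stands in for `decodeAudioData()`; it is not the `VoiceClient` code. It assumes that every interrupt path, server- or client-initiated, goes through the same `stopPlayback()` so the generation counter always advances.

```typescript
// Hypothetical model of the playback generation guard.
class PlaybackSketch {
  private generation = 0;
  queue: string[] = [];

  // decode stands in for AudioContext.decodeAudioData().
  async playAudio(data: string, decode: (d: string) => Promise<string>): Promise<void> {
    const generation = this.generation; // capture before the async gap
    const decoded = await decode(data);
    if (generation !== this.generation) return; // interrupted while decoding
    this.queue.push(decoded);
  }

  // Every interrupt path should call this, so stale decodes are always dropped.
  stopPlayback(): void {
    this.generation++;
    this.queue.length = 0;
  }
}
```

Clearing the queue without bumping the counter, as the client-side interrupt path is described as doing, would let the pending decode pass the check and enqueue stale audio.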


@pkg-pr-new

pkg-pr-new Bot commented May 11, 2026


agents

npm i https://pkg.pr.new/agents@1497

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1497

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1497

hono-agents

npm i https://pkg.pr.new/hono-agents@1497

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1497

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1497

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1497

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1497

commit: 71cd902

@whoiskatrin whoiskatrin self-requested a review May 11, 2026 14:07
Co-authored-by: Cursor <cursoragent@cursor.com>
@changeset-bot

changeset-bot Bot commented May 11, 2026

⚠️ No Changeset found

Latest commit: 71cd902

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes changesets to release 4 packages
| Name | Type |
| --- | --- |
| agents | Patch |
| @cloudflare/ai-chat | Patch |
| @cloudflare/voice | Patch |
| @cloudflare/think | Patch |


threepointone and others added 2 commits May 11, 2026 15:36
Co-authored-by: Cursor <cursoragent@cursor.com>
Ensure recovered agent-tool finish hooks are only executed after a successful user onStart. Await _runDeferredAgentToolFinishHooks inside the onStart flow so deferred finishes are skipped when startup fails. Add a test and helper (reconcileCompletedChildWithFailedStartupForTest) to verify finish hooks are not run on failed startup and to cover lifecycle ordering and event emission.
@threepointone threepointone merged commit f5df638 into main May 11, 2026
4 checks passed
@threepointone threepointone deleted the cleanups branch May 11, 2026 15:26

2 participants