Skip to content

fix: preserve voice text across streamed tool gaps#1525

Draft
whoiskatrin wants to merge 5 commits into
mainfrom
fix/voice-tool-gap-streaming
Draft

fix: preserve voice text across streamed tool gaps#1525
whoiskatrin wants to merge 5 commits into
mainfrom
fix/voice-tool-gap-streaming

Conversation

@whoiskatrin
Copy link
Copy Markdown
Contributor

@whoiskatrin whoiskatrin commented May 14, 2026

Summary

  • fix voice text-stream parsing for OpenAI/OpenRouter-compatible SSE streams where only the first assistant delta includes role: \"assistant\"
  • preserve assistant text chunks that arrive after tool-call deltas instead of dropping them from the voice pipeline
  • add focused parser regressions for omitted-role chunks and an OpenRouter-style text → tool → text → tool → text stream, plus the required @cloudflare/voice patch changeset

Problem

A voice user reported that when an agent streams:

  1. assistant intro text
  2. tool call
  3. assistant explanatory text
  4. another tool call
  5. assistant conclusion

the WebSocket client does not receive the assistant text between tool calls until the stream ends, so voice playback starts late and reads the accumulated text together.

For OpenAI/OpenRouter-style raw SSE streams, later assistant content deltas may omit role after the initial assistant chunk. iterateText() previously required every content delta to include role: \"assistant\", so later text chunks in the same assistant stream were skipped.

Fix

iterateText() now remembers when an OpenAI-style stream is currently emitting assistant deltas. Once an assistant role is established, subsequent content deltas without a repeated role are still yielded until another role appears.

This keeps assistant text available to withVoice across intervening tool-call chunks instead of dropping the in-between and final text segments.

Reviewer notes

The core behavior change is in:

  • packages/voice/src/text-stream.ts

The regression coverage is intentionally parser-level because that is where the bug occurs:

  • omitted-role assistant content continuation
  • OpenRouter-style SSE stream with text → tool → text → tool → text

See:

  • packages/voice/src/tests/text-stream.test.ts
  • .changeset/bright-voices-stream.md

Validation

  • npm run test -- src/tests/text-stream.test.ts -w @cloudflare/voice
  • npm run check
    • formatting, lint, exports, and most typecheck projects pass
    • repo typecheck still reports existing unrelated Think/submissions failures in packages/think and examples/think-submissions

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 14, 2026

🦋 Changeset detected

Latest commit: 2203a85

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@cloudflare/voice Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new Bot commented May 14, 2026

Open in StackBlitz

agents

npm i https://pkg.pr.new/agents@1525

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1525

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1525

hono-agents

npm i https://pkg.pr.new/hono-agents@1525

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1525

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1525

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1525

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1525

commit: 2203a85

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant