feat(realtime): support multi-message generation per response by rosetta-livekit-bot[bot] · Pull Request #1555 · livekit/agents-js

rosetta-livekit-bot · 2026-05-20T00:43:01Z

Summary

Process each MessageGeneration from generation_ev.message_stream serially via perform_audio_forwarding + perform_text_forwarding + wait_for_playout. Only one flush is in flight at a time.

Per-message state is derived directly from the playback_finished event:
- full → emit ChatMessage(interrupted=False) with the message's message_id
- partial → emit ChatMessage(interrupted=True) and call _rt_session.truncate(...) with this message's local playback_position (not a cumulative offset)
- skipped → drop locally and call update_chat_ctx(...) so the realtime server removes never-played items from its history

_on_first_frame now returns early once started_speaking_at is set, so per-message first-frame callbacks don't re-fire _update_agent_state("speaking") for each message.
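
The three per-message states above can be sketched as a small classifier. This is a minimal illustration, not the code in this PR: the `PlaybackFinished` and `MessageOutcome` shapes and the `classifyPlayback` name are hypothetical, and the rule "interrupted with zero playback position means skipped" is an assumption inferred from the description.

```typescript
// Hypothetical event shape; field names are illustrative only.
interface PlaybackFinished {
  playbackPosition: number; // seconds played for THIS message, not cumulative
  interrupted: boolean;
}

// One outcome per message in the response.
type MessageOutcome =
  | { kind: 'full' }
  | { kind: 'partial'; truncateAtMs: number }
  | { kind: 'skipped' };

function classifyPlayback(ev: PlaybackFinished): MessageOutcome {
  // Played to the end: emit the chat message as not interrupted.
  if (!ev.interrupted) return { kind: 'full' };
  // Interrupted mid-playout: truncate at the message-local position.
  if (ev.playbackPosition > 0) {
    return { kind: 'partial', truncateAtMs: Math.floor(ev.playbackPosition * 1000) };
  }
  // Assumption: interrupted before any audio played means the message
  // was never heard, so it is dropped and removed from server history.
  return { kind: 'skipped' };
}
```

Because the position is local to each message, the truncate offset never needs to account for audio played by earlier messages in the same response.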

Alternative considered

#5690 makes multi-message work by flushing per message, which requires the synchronizer to keep pending/finalizing impls alive and to serialize concurrent flushes in room_io/_output.py. Our AudioOutput assumes there is only one speech at a time, so serializing per message at the wait_for_playout boundary (this PR) avoids both changes.
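
A rough sketch of serializing at the playout boundary, for illustration only. `processSerially` and `playOne` are hypothetical names; `playOne` stands in for forwarding one message's audio and text and then waiting for playout. The assumption that messages after an interruption are never started is inferred from the "skipped" state in the summary.

```typescript
type Outcome = 'full' | 'partial' | 'skipped';

interface PlayoutResult {
  interrupted: boolean;
  playbackPosition: number; // seconds, local to the message
}

// Plays messages one at a time; the next message starts only after the
// previous playout has finished, so at most one flush is in flight.
async function processSerially(
  messageIds: string[],
  playOne: (id: string) => Promise<PlayoutResult>, // hypothetical stand-in
): Promise<Map<string, Outcome>> {
  const outcomes = new Map<string, Outcome>();
  let interrupted = false;
  for (const id of messageIds) {
    if (interrupted) {
      // Assumption: after an interruption, remaining messages never play.
      outcomes.set(id, 'skipped');
      continue;
    }
    const res = await playOne(id);
    if (!res.interrupted) {
      outcomes.set(id, 'full');
    } else {
      interrupted = true;
      outcomes.set(id, res.playbackPosition > 0 ? 'partial' : 'skipped');
    }
  }
  return outcomes;
}
```

Since the loop awaits each playout before starting the next, the audio output only ever sees one speech at a time, which is the property the paragraph above relies on.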

close livekit/agents#5690, livekit/agents#5684

changeset-bot · 2026-05-20T00:43:07Z

🦋 Changeset detected

Latest commit: 896d71c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 33 packages

Name	Type
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-assemblyai	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-cerebras	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-fishaudio	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-hume	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-lemonslice	Patch
@livekit/agents-plugin-liveavatar	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-minimax	Patch
@livekit/agents-plugin-mistral	Patch
@livekit/agents-plugin-mistralai	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-perplexity	Patch
@livekit/agents-plugin-phonic	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-runway	Patch
@livekit/agents-plugin-sarvam	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugin-tavus	Patch
@livekit/agents-plugins-test	Patch
@livekit/agents-plugin-trugen	Patch
@livekit/agents-plugin-xai	Patch


yaniv-peretz · 2026-05-21T08:09:36Z

Local Test Results: Success

I built the package and swapped the agents/dist/voice/agent_activity.* files in my local dev server.

Result

Instructions

Availability per group size:
1-2: not available
3-15: available - only if all are adult males
15-30: available - only if all are adult females
25+: not available

Conversation Log

Note the 2nd Agent conversation item

Agent: hi;
Agent: hi;
User: ~~On planet~~ **I Plan** to come either as a group of two males, a group of ten mixed, a group of twenty only females, or a group of forty. Can I come?
Agent: Let’s walk through each group option against the availability rules.
Agent: Only the group of 20 adult females can come. The group of two males is too small, the group of 10 is mixed so it doesn’t qualify, and the group of 40 is not available.

yaniv-peretz · 2026-05-21T08:41:27Z

@tinalenguyen Stating the obvious: what is required to push this through?
gpt-realtime-2 is expected to be a banger (75% cost reduction + higher intelligence).
It opens up opportunities for customer support and more complex customer-service scenarios.

[Figure: Speech Reasoning (Big Bench Audio) vs. Cost per Hour of Input Audio (21 May '26)]

longcw · 2026-05-22T02:08:40Z

-          this.agentSession._conversationItemAdded(message);
-
-          // TODO(brian): add tracing span
+      if (realtimeModel.capabilities.midSessionChatCtxUpdate) {


This doesn't check whether any messages were skipped, so updateChatCtx is called every time the agent speech is interrupted. Should we check whether any were skipped before this?

Updated to only call updateChatCtx on interruption when at least one processed message was skipped.
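
A minimal sketch of the guard described in this reply. `shouldSyncChatCtx` and the `Outcome` type are hypothetical; only the `midSessionChatCtxUpdate` capability name comes from the diff above, and how the real code combines these conditions is an assumption.

```typescript
type Outcome = 'full' | 'partial' | 'skipped';

// Returns true only when a server-side chat context update is worthwhile:
// the model supports mid-session updates, the speech was interrupted, and
// at least one processed message was never played.
function shouldSyncChatCtx(
  midSessionChatCtxUpdate: boolean,
  interrupted: boolean,
  outcomes: Outcome[],
): boolean {
  if (!midSessionChatCtxUpdate || !interrupted) return false;
  return outcomes.some((o) => o === 'skipped');
}
```

With this check, an interruption that only truncates the current message no longer triggers a full chat context update.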

longcw · 2026-05-22T02:15:30Z

+        } else if (interrupted && output.synchronizedTranscript !== undefined) {
+          forwardedText = output.synchronizedTranscript;
+        }


this branch is not reachable since synchronizedTranscript is set only if audioOut is valid?

Removed the unreachable fallback branch; interrupted audio still uses synchronizedTranscript when available and falls back to an empty string otherwise.
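
The resulting text selection might look roughly like the sketch below. `MessageOutput`, `fullText`, and `resolveForwardedText` are hypothetical names used for illustration; the behavior for non-interrupted or text-only output is an assumption, not taken from the diff.

```typescript
// Hypothetical per-message output record.
interface MessageOutput {
  audioOut?: object; // present only when audio is being forwarded
  synchronizedTranscript?: string; // text aligned to the audio actually played
  fullText: string; // everything the model generated for this message
}

function resolveForwardedText(output: MessageOutput, interrupted: boolean): string {
  if (output.audioOut !== undefined && interrupted) {
    // Interrupted audio: keep only what was heard, or nothing if unknown.
    return output.synchronizedTranscript ?? '';
  }
  // Assumption: otherwise the full generated text is forwarded.
  return output.fullText;
}
```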

longcw · 2026-05-22T02:16:01Z

        }
-        await waitFor(forwardTasks);
      } catch (error) {
        this.logger.error(error, 'error reading messages from the realtime API');


should we raise this error? what was the original behavior on failure?

Kept the existing TS behavior here: the previous readMessages path caught and logged errors from reading the realtime message stream rather than rethrowing. This port preserves that behavior.

longcw · 2026-05-22T02:18:52Z

-            textOut = _textOut;
+            forwardTasks.push(forwardTask);
+            output.audioOut = audioOut;
+            audioOut.firstFrameFut.await


should we clean up firstFrameFut like that in python?

No extra cleanup is needed in the JS path: performAudioForwarding resolves firstFrameFut on playback start and rejects it in its finally block if playback never starts, and each await already has a catch to avoid an unhandled rejection.
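
To illustrate the resolve-or-reject pattern described here, a self-contained sketch follows. The `Future` class below is a simplified stand-in written for this example and is not the library's implementation; `forwardAudio` is likewise hypothetical.

```typescript
// Simplified stand-in for a settle-once future (not the library's class).
class Future<T> {
  readonly await: Promise<T>;
  private settled = false;
  private resolveFn!: (value: T) => void;
  private rejectFn!: (err: Error) => void;

  constructor() {
    this.await = new Promise<T>((resolve, reject) => {
      this.resolveFn = resolve;
      this.rejectFn = reject;
    });
  }

  resolve(value: T): void {
    if (this.settled) return; // later calls are no-ops
    this.settled = true;
    this.resolveFn(value);
  }

  reject(err: Error): void {
    if (this.settled) return; // no-op if playback already started
    this.settled = true;
    this.rejectFn(err);
  }
}

// Hypothetical forwarding loop: resolves the future on the first frame and
// rejects it in `finally` if no frame was ever played.
async function forwardAudio(frames: string[], firstFrameFut: Future<void>): Promise<string[]> {
  const played: string[] = [];
  try {
    for (const frame of frames) {
      firstFrameFut.resolve();
      played.push(frame);
      await Promise.resolve(); // placeholder for pushing the frame to the output
    }
  } finally {
    firstFrameFut.reject(new Error('playback never started'));
  }
  return played;
}
```

A caller would attach both handlers, for example `fut.await.then(onFirstFrame).catch(() => {})`, so the rejection in the "never started" case does not surface as an unhandled rejection.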

feat(realtime): support multi-message generation per response

186bab6

This comment was marked as resolved.


Merge origin/main into realtime multi-message response

91764f1

tinalenguyen linked an issue May 21, 2026 that may be closed by this pull request

feat(realtime): support multi-message generation per response - gpt-realtime-2 #1563

Closed

yaniv-peretz mentioned this pull request May 21, 2026

support multi message - pgt-realtime-2 #1570

Open

8 tasks

theomonnom requested review from longcw and removed request for longcw May 21, 2026 17:56

longcw reviewed May 22, 2026


fix(realtime): address multi-message review feedback

896d71c

longcw approved these changes May 25, 2026


longcw merged commit 181c868 into main May 25, 2026
9 checks passed

longcw deleted the footsie-sundial-perused branch May 25, 2026 07:26

github-actions Bot mentioned this pull request May 25, 2026

Version Packages #1580

Open

longcw merged 3 commits into main from footsie-sundial-perused
