Inworld tts auto mode #1008

ianbbqzy · 2026-01-29T22:14:22Z

auto_mode will be exposed as a config param in a separate PR, once the word tokenizer and user-controlled manual flushes are supported. For now, auto_mode is always enabled and should improve the quality and naturalness of agent responses.

Description

Changes Made

Pre-Review Checklist

Build passes: All builds (lint, typecheck, tests) pass locally
AI-generated code reviewed: Removed unnecessary comments and ensured code quality
Changes explained: All changes are properly documented and justified above
Scope appropriate: All changes relate to the PR title, or explanations provided for why they're included
Video demo: A small video demo showing changes works as expected and did not break any existing functionality using Agent Playground (if applicable)

Testing

Automated tests added/updated (if applicable)
All tests pass
Make sure both restaurant_agent.ts and realtime_agent.ts work properly (for major changes)

Additional Notes

Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.

Summary by CodeRabbit

Improvements
- Enhanced text-to-speech with automatic streaming for more responsive audio synthesis.
- Improved timing and alignment of words/characters in streamed audio for smoother, monotonic playback.
- Better handling of stream completion to ensure continuous, correctly-timed audio output.

changeset-bot · 2026-01-29T22:14:26Z

🦋 Changeset detected

Latest commit: 9a585c6

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 18 packages

Name	Type
@livekit/agents-plugin-inworld	Patch
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugin-xai	Patch
@livekit/agents-plugins-test	Patch

CLAassistant · 2026-01-29T22:14:30Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ toubatbrian
❌ Ian Lee

Ian Lee does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

coderabbitai · 2026-01-29T22:14:41Z

📝 Walkthrough

Added timestamp-cumulative handling and flush semantics to TTS synthesis stream; introduced flushCompleted?: boolean on InworldResult and autoMode?: boolean on CreateContextConfig; context creation now forces autoMode: true. Adjusted alignment timestamp offsets and generation end tracking in the stream implementation.

Changes

Cohort / File(s)	Summary
TTS implementation & types `plugins/inworld/src/tts.ts`	Added `autoMode?: boolean` to `CreateContextConfig` and `flushCompleted?: boolean` to `InworldResult`. Implemented cumulative timestamp tracking (`#cumulativeTime`, `#generationEndTime`) in `SynthesizeStream`, applied cumulative offsets to word/char alignments, reset cumulative time on `flushCompleted`, and always set `autoMode: true` in context creation. Added explanatory comments about monotonic timestamps and autoMode rationale.
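
The cumulative-offset behaviour summarised in the table can be sketched roughly as follows. This is a minimal illustration, not the code from `plugins/inworld/src/tts.ts`: the `WordTiming` and `ResultChunk` shapes and the `TimestampAccumulator` class are hypothetical, and TypeScript `private` fields stand in for the `#cumulativeTime` and `#generationEndTime` fields named above. It assumes the server restarts timestamps at zero for each generation and that `flushCompleted` marks the end of one.

```typescript
// Hypothetical shapes for illustration; the real InworldResult may differ.
interface WordTiming {
  word: string;
  startSec: number;
  endSec: number;
}

interface ResultChunk {
  words?: WordTiming[];
  flushCompleted?: boolean;
}

class TimestampAccumulator {
  // Offset added to every timestamp of the current generation.
  private cumulativeTime = 0;
  // Latest shifted end time seen so far in this context.
  private generationEndTime = 0;

  apply(chunk: ResultChunk): WordTiming[] {
    const shifted = (chunk.words ?? []).map((w) => {
      const out: WordTiming = {
        word: w.word,
        startSec: w.startSec + this.cumulativeTime,
        endSec: w.endSec + this.cumulativeTime,
      };
      this.generationEndTime = Math.max(this.generationEndTime, out.endSec);
      return out;
    });
    // When a generation finishes, the next one continues from where it ended.
    if (chunk.flushCompleted) {
      this.cumulativeTime = this.generationEndTime;
    }
    return shifted;
  }
}

const acc = new TimestampAccumulator();
const firstGen = acc.apply({ words: [{ word: 'hello', startSec: 0, endSec: 0.5 }] });
acc.apply({ flushCompleted: true });
const secondGen = acc.apply({ words: [{ word: 'world', startSec: 0, endSec: 0.25 }] });
// The second generation is shifted to start at 0.5 and end at 0.75.
console.log(JSON.stringify(firstGen), JSON.stringify(secondGen));
```

Keeping the offset in one small object means the word and character paths can share it, which is what keeps the emitted timestamps monotonic across generations in this sketch.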

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through timestamps, neat and spry,
Offsets stacked so words don't lie,
A tiny flag — autoMode true,
Flushes tidy, streaming through,
Carrots sync up — audio by-by-by 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description contains only a brief note about auto_mode without filling out the required template sections like 'Description', 'Changes Made', 'Testing', and most checklist items remain unchecked.	Complete the PR description by providing a clear description of changes, listing specific modifications made (interface additions, timestamp tracking logic), and documenting testing approach and checklist completion status.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'inworld tts auto mode' is directly related to the main change in the changeset, which adds auto mode functionality to the Inworld TTS system for enhancing response quality.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52d155b and 2baf4a1.

📒 Files selected for processing (1)

plugins/inworld/src/tts.ts

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

plugins/inworld/src/tts.ts

**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

plugins/inworld/src/tts.ts

**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

plugins/inworld/src/tts.ts

🧠 Learnings (3)

📓 Common learnings

Learnt from: cshape
Repo: livekit/agents-js PR: 1008
File: plugins/inworld/src/tts.ts:639-641
Timestamp: 2026-02-02T23:20:23.828Z
Learning: The `autoMode` field in Inworld's WebSocket TTS API `create_context` configuration is a forward-compatible feature that will be officially released by Inworld. It is safe to include this field in the configuration as Inworld's API will silently ignore unsupported fields until the feature is available.

📚 Learning: 2026-02-02T23:20:17.980Z

Learnt from: cshape
Repo: livekit/agents-js PR: 1008
File: plugins/inworld/src/tts.ts:639-641
Timestamp: 2026-02-02T23:20:17.980Z
Learning: Include the autoMode field in the Inworld WebSocket TTS API create_context configuration in plugins/inworld/src/tts.ts as a forward-compatible option. Since the API will silently ignore unsupported fields until the feature is released, adding autoMode now is safe and prepares for future usage. Ensure you don’t rely on autoMode for current behavior and consider adding a comment indicating it's forward-compatible. If possible, add a test to verify that existing behavior remains unchanged when autoMode is not yet recognized by the API.

Applied to files:

plugins/inworld/src/tts.ts
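
A minimal sketch of what sending the field as a forward-compatible option might look like. The message envelope (`contextId`, `create`), the `buildCreateContextMessage` helper, and the `voiceId`/`modelId` fields are hypothetical placeholders; only `CreateContextConfig` and `autoMode` are named in this PR.

```typescript
// Hypothetical subset of CreateContextConfig; only autoMode comes from this PR.
interface CreateContextConfigSketch {
  voiceId: string;
  modelId: string;
  autoMode?: boolean;
}

// Hypothetical helper: builds the JSON payload for a create-context message.
function buildCreateContextMessage(
  contextId: string,
  config: CreateContextConfigSketch,
): string {
  return JSON.stringify({
    contextId,
    create: {
      ...config,
      // Forward-compatible: per the learning above, the API is expected to
      // ignore this field until auto mode is released.
      autoMode: true,
    },
  });
}

const createPayload = JSON.parse(
  buildCreateContextMessage('ctx-1', { voiceId: 'example-voice', modelId: 'example-model' }),
);
console.log(createPayload.create.autoMode);
```

Placing `autoMode: true` after the spread means it always wins over whatever the caller passed, which matches the "always set" behaviour described in the walkthrough.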

📚 Learning: 2026-01-16T14:33:39.551Z

Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to examples/src/test_*.ts : For plugin component debugging (STT, TTS, LLM), create test example files prefixed with `test_` under the examples directory and run with `pnpm build && node ./examples/src/test_my_plugin.ts`

Applied to files:

plugins/inworld/src/tts.ts

🔇 Additional comments (5)

plugins/inworld/src/tts.ts (5)

77-77: LGTM!

The optional interface additions for autoMode and flushCompleted properly extend the existing types without breaking backward compatibility.

Also applies to: 106-106

481-486: LGTM!

The cumulative timestamp tracking fields are well-documented. The comment clearly explains the monotonic timestamp invariant and why this offset mechanism is needed when the server resets timestamps after each generation.

515-519: LGTM!

The flushCompleted handler correctly captures the generation end time as the new cumulative offset, ensuring subsequent generation timestamps continue monotonically from where the previous generation ended.

525-535: LGTM!

The cumulative offset is correctly applied to word alignment timestamps. Using Math.max to update #generationEndTime properly handles potential out-of-order timestamp arrivals within a generation.

547-557: LGTM!

Character alignment timestamp handling mirrors the word alignment logic correctly. Both contribute to tracking #generationEndTime, which is appropriate when both alignment types are present.

Optional: The word and character alignment processing blocks share similar structure. Consider extracting a helper if this pattern expands further.
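
One possible shape for such a helper, as a sketch only: the `TimedAlignment` interface and the `shiftAlignment` function are hypothetical names, and the actual alignment fields in `tts.ts` may be laid out differently.

```typescript
// Hypothetical common shape for word and character alignments.
interface TimedAlignment {
  startSec: number[];
  endSec: number[];
}

// Shifts every timestamp by `offset` and returns the updated generation end time.
function shiftAlignment(
  alignment: TimedAlignment,
  offset: number,
  generationEndTime: number,
): { shifted: TimedAlignment; generationEndTime: number } {
  const startSec = alignment.startSec.map((t) => t + offset);
  const endSec = alignment.endSec.map((t) => t + offset);
  // Math.max keeps the end time from moving backwards if entries arrive out of order.
  const end = endSec.reduce((max, t) => Math.max(max, t), generationEndTime);
  return { shifted: { startSec, endSec }, generationEndTime: end };
}

// Both alignment types go through the same helper and share the end-time tracking.
const wordsResult = shiftAlignment({ startSec: [0, 0.5], endSec: [0.5, 1] }, 2, 0);
const charsResult = shiftAlignment(
  { startSec: [0], endSec: [0.25] },
  2,
  wordsResult.generationEndTime,
);
console.log(wordsResult.generationEndTime, charsResult.generationEndTime); // 3 3
```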

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@plugins/inworld/src/tts.ts`:
- Around line 639-641: Remove the unsupported autoMode field and its comment
from the create_context/create (or create) message builder in the TTS WebSocket
code: find the autoMode: true property (and the preceding comment referencing
auto_mode) in the code that constructs the Inworld "create" context/message
(e.g., inside the function building the create_context payload) and delete both
the property and the misleading comment; if you believe auto-mode must be
enabled, instead add a TODO or a verification step to call Inworld support or
adjust the implementation to implement sentence-tokenizer-driven flush behavior
locally rather than relying on a non-existent API flag.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c02ff2 and 52d155b.

📒 Files selected for processing (1)

plugins/inworld/src/tts.ts

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

plugins/inworld/src/tts.ts

**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

plugins/inworld/src/tts.ts

**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

plugins/inworld/src/tts.ts

🧠 Learnings (1)

📚 Learning: 2026-01-16T14:33:39.551Z

Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to examples/src/test_*.ts : For plugin component debugging (STT, TTS, LLM), create test example files prefixed with `test_` under the examples directory and run with `pnpm build && node ./examples/src/test_my_plugin.ts`

Applied to files:

plugins/inworld/src/tts.ts

🔇 Additional comments (1)

plugins/inworld/src/tts.ts (1)

68-77: LGTM: optional autoMode in CreateContextConfig is a clean, non-breaking extension.

plugins/inworld/src/tts.ts

toubatbrian · 2026-02-09T23:31:45Z

Hi @ianbbqzy, is this PR ready for review? I saw it is still labeled as "[Draft]".

ianbbqzy · 2026-02-10T00:05:28Z

Hi @toubatbrian, yes! I don't seem to have access to edit the PR title once it's created. I will make sure not to include [DRAFT] in the PR title next time.

toubatbrian

Code looks good, have you tested the changes on your end? (just to make sure it doesn't break anything)

ianbbqzy · 2026-02-10T18:59:07Z

Thanks! Yup I have tested locally by connecting to the UI on cloud.livekit

ianbbqzy · 2026-02-10T19:00:16Z

Hi @toubatbrian, do you mind helping with the CLA situation? I have signed it multiple times but it's not registering?

toubatbrian · 2026-02-10T21:14:20Z

> Hi @toubatbrian, do you mind helping with the CLA situation? I have signed it multiple times but it's not registering?

@ianbbqzy Hmm, I'm also being constantly redirected to the same page. Let me see if anyone can help.

toubatbrian · 2026-02-10T21:18:16Z

In the meantime, would you mind trying to open another PR to see if the CLA gets unblocked?

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

inworld tts ws auto mode

52d155b

coderabbitai bot reviewed Jan 29, 2026

plugins/inworld/src/tts.ts (resolved)

fix timestamps cumulation within a context

2baf4a1

toubatbrian changed the title from "[Draft] inworld tts auto mode" to "Inworld tts auto mode" Feb 10, 2026

toubatbrian approved these changes Feb 10, 2026

Create little-tables-joke.md

9a585c6

devin-ai-integration bot reviewed Feb 10, 2026

toubatbrian merged commit 9eee0c2 into livekit:main Feb 10, 2026
3 of 4 checks passed

github-actions bot mentioned this pull request Feb 10, 2026

Version Packages #1039

Open
