Add Inworld STT provider to livekit-plugins-inworld by cshape · Pull Request #5451 · livekit/agents

cshape · 2026-04-14T19:47:35Z

Summary

Add streaming speech-to-text support to the existing livekit-plugins-inworld package via Inworld's WebSocket STT API (wss://api.inworld.ai/stt/v1/transcribe:streamBidirectional)
Follows established plugin patterns (Soniox, Deepgram, Google) including reconnection loop with tasks_group cleanup, stream tracking, START/END_OF_SPEECH events, and request_id
Pass-through model selection — any model string accepted (default: inworld/inworld-stt-1)
Eager end-of-turn defaults (endOfTurnConfidenceThreshold=0.3, minEndOfTurnSilenceWhenConfident=200ms) for low-latency voice agent use
Voice profile detection enabled by default — age, gender, emotion, accent data surfaced via SpeechData.metadata["voice_profile"]
update_options() for runtime config changes
Adds metadata: dict[str, Any] | None field to SpeechData in livekit-agents for plugin-specific data (backwards-compatible, defaults to None)
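
As an illustration of the last two points, here is a minimal sketch of how an optional metadata field stays backwards-compatible and how a consumer might read the voice profile. `SpeechDataSketch` and `voice_profile()` are hypothetical stand-ins, not the actual livekit-agents definitions, and the payload keys are assumed for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class SpeechDataSketch:
    """Hypothetical stand-in for SpeechData; only the fields needed here."""

    language: str
    text: str
    # Optional and defaulting to None, so existing call sites keep working.
    metadata: dict[str, Any] | None = None


def voice_profile(data: SpeechDataSketch) -> dict[str, Any]:
    """Return the voice profile dict, or an empty dict when none is attached."""
    if data.metadata is None:
        return {}
    return data.metadata.get("voice_profile") or {}


# Old-style construction still works because metadata is optional.
plain = SpeechDataSketch(language="en", text="hello")

# Illustrative payload; the real profile keys are not specified here.
profiled = SpeechDataSketch(
    language="en",
    text="hello",
    metadata={"voice_profile": {"emotion": "neutral"}},
)
```

Callers that never look at `metadata` are unaffected, which is the sense in which the field is backwards-compatible.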

Changes

livekit-agents/livekit/agents/stt/stt.py — Add optional metadata field to SpeechData for plugin-specific data
livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/stt.py — New STT + SpeechStream implementation
livekit-plugins/livekit-plugins-inworld/livekit/plugins/inworld/__init__.py — Add STT/SpeechStream exports
livekit-plugins/livekit-plugins-inworld/pyproject.toml — Update description/keywords for STT+TTS
livekit-plugins/livekit-plugins-inworld/README.md — Add STT usage docs
examples/voice_agents/inworld_agent.py — Voice agent using Inworld STT+TTS
.gitignore — Add .env

Usage

from livekit.agents import AgentSession
from livekit.plugins import inworld

session = AgentSession(
    stt=inworld.STT(),                          # default: inworld/inworld-stt-1
    tts=inworld.TTS(voice="Clive"),
    llm="openai/gpt-4.1-mini",
)

# Log each user transcript; voice profile data, when present, is carried on
# SpeechData.metadata["voice_profile"]
@session.on("user_input_transcribed")
def on_input(ev):
    print(ev.transcript)

# Override model or tune end-of-turn behavior
stt = inworld.STT(
    model="assemblyai/universal-streaming-multilingual",
    end_of_turn_confidence_threshold=0.5,
    min_end_of_turn_silence_when_confident=400,
)
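
The snake_case keyword arguments above correspond to the camelCase names quoted in the summary. A rough sketch of that mapping follows, assuming the options are sent as a flat dictionary; `EndOfTurnOptions` and `to_wire_config()` are hypothetical names, and the real message layout may differ.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EndOfTurnOptions:
    # Defaults taken from the PR summary; the class itself is hypothetical.
    end_of_turn_confidence_threshold: float = 0.3
    min_end_of_turn_silence_when_confident: int = 200  # milliseconds


def to_wire_config(opts: EndOfTurnOptions) -> dict[str, float | int]:
    """Translate Python-style option names to the camelCase wire names."""
    return {
        "endOfTurnConfidenceThreshold": opts.end_of_turn_confidence_threshold,
        "minEndOfTurnSilenceWhenConfident": opts.min_end_of_turn_silence_when_confident,
    }


# Eager defaults versus the more conservative override from the usage example.
eager = to_wire_config(EndOfTurnOptions())
patient = to_wire_config(
    EndOfTurnOptions(
        end_of_turn_confidence_threshold=0.5,
        min_end_of_turn_silence_when_confident=400,
    )
)
```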

Test plan

Verified imports with SDK 1.5.2
Tested streaming transcription end-to-end with 26s audio file (interim + final transcripts)
Tested full agent workflow via console mode (Inworld STT + TTS + OpenAI LLM)
Verified update_options() works for runtime model changes
Verified no secrets in committed files
ruff check and ruff format pass
mypy type checks pass (resolved all 11 prior errors)
CI checks (framework change to SpeechData may need separate PR if CI runs cross-package)

Add streaming speech-to-text support via Inworld's WebSocket STT API alongside the existing TTS plugin. The implementation follows established LiveKit plugin patterns (Deepgram, Soniox, Google) including reconnection logic, stream tracking, START/END_OF_SPEECH events, and request_id.

- STT class with streaming and sync recognition
- SpeechStream with reconnection loop and _reconnect_event
- WeakSet stream tracking for coordinated shutdown
- Pass-through model selection (no hardcoded model list)
- update_options() for runtime config changes
- Voice profile and VAD configuration support
- Example agent and standalone test script
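
The WeakSet stream tracking mentioned in this commit message can be sketched roughly as below. This is an illustration of the general pattern under simplified assumptions, not the plugin's code; `STTSketch` and `StreamSketch` are hypothetical classes.

```python
import asyncio
import weakref


class StreamSketch:
    """Hypothetical stream; a real one would own a WebSocket connection."""

    def __init__(self) -> None:
        self.closed = False

    async def aclose(self) -> None:
        self.closed = True


class STTSketch:
    """Hypothetical provider that tracks the streams it has handed out."""

    def __init__(self) -> None:
        # Weak references: a stream that is garbage-collected simply drops
        # out of the set, so tracking never keeps a dead stream alive.
        self._streams = weakref.WeakSet()

    def stream(self) -> StreamSketch:
        s = StreamSketch()
        self._streams.add(s)
        return s

    async def aclose(self) -> None:
        # Snapshot into a list so the set cannot change size mid-iteration.
        await asyncio.gather(*(s.aclose() for s in list(self._streams)))
        self._streams.clear()
```

Closing the provider then closes every stream that is still alive, without the provider having to be told when individual streams go away.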

tinalenguyen

left a few comments, i also think the example would be the most useful and accessible in the plugin README

A final transcript with an empty transcript string (VAD false positive, unrecognizable noise) was returning early before the END_OF_SPEECH block, leaving _speaking=True. Subsequent speechStarted events were then ignored, wedging the stream. Skip only the transcript event when text is empty; let final events fall through to the end-of-speech emission.
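
A minimal sketch of the control flow this fix describes, assuming a simple boolean speaking flag. `SpeakingTracker` is a hypothetical class and events are recorded as plain strings rather than real SpeechEvent objects.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SpeakingTracker:
    """Hypothetical reduction of the stream's speaking-state handling."""

    speaking: bool = False
    events: list[str] = field(default_factory=list)

    def on_speech_started(self) -> None:
        if self.speaking:
            return  # already inside a speech segment; ignore duplicates
        self.speaking = True
        self.events.append("START_OF_SPEECH")

    def on_transcript(self, text: str, is_final: bool) -> None:
        # Skip only the transcript event when the text is empty.
        if text:
            self.events.append("FINAL_TRANSCRIPT" if is_final else "INTERIM_TRANSCRIPT")
        # No early return above: a final result always reaches this block,
        # so the speaking flag is cleared even for empty transcripts.
        if is_final and self.speaking:
            self.speaking = False
            self.events.append("END_OF_SPEECH")
```

With an early return on empty text, the flag would stay set after an empty final transcript and the next `on_speech_started()` call would be ignored, which is the wedge described above.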

The base class stt._metrics_monitor_task reports metrics_collected for streaming STT off RECOGNITION_USAGE events; without them no usage was ever surfaced for Inworld sessions. Mirror the pattern used by deepgram, gladia, xai, and elevenlabs: accumulate audio frame durations via a PeriodicCollector and emit a RECOGNITION_USAGE event every 5 seconds. Flush on FlushSentinel and on input-channel close so the last partial window is reported.
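
The accumulate-and-flush behaviour can be sketched without any LiveKit imports. `UsageAccumulator` below is a hypothetical stand-in for the role the PeriodicCollector plays here, with an injectable clock so the 5-second window can be exercised deterministically; it does not reproduce the real class's API.

```python
from __future__ import annotations

import time
from typing import Callable


class UsageAccumulator:
    """Hypothetical collector: sums audio durations and reports periodically."""

    def __init__(
        self,
        on_report: Callable[[float], None],
        interval: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._on_report = on_report
        self._interval = interval
        self._clock = clock
        self._total = 0.0
        self._last_flush = clock()

    def push(self, duration: float) -> None:
        """Add one audio frame's duration (seconds); report if the window elapsed."""
        self._total += duration
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Report whatever has accumulated; call on flush markers and on close."""
        if self._total > 0:
            self._on_report(self._total)
            self._total = 0.0
        self._last_flush = self._clock()
```

Calling `flush()` when the input ends is what keeps the last partial window from being dropped.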

cshape force-pushed the inworld-stt branch 2 times, most recently from 0daf792 to 35acfa0 Compare April 14, 2026 20:20

ianbbqzy reviewed Apr 14, 2026

cshape force-pushed the inworld-stt branch 2 times, most recently from a850403 to 82b9dd7 Compare April 14, 2026 20:27

anotherkirillryzhov reviewed Apr 14, 2026

cshape force-pushed the inworld-stt branch from 82b9dd7 to 7bdb0a1 Compare April 14, 2026 20:39

cshape force-pushed the inworld-stt branch from 7bdb0a1 to 654c160 Compare April 14, 2026 20:49

cshape force-pushed the inworld-stt branch from 654c160 to 1719a4d Compare April 14, 2026 20:53

tinalenguyen reviewed Apr 21, 2026

cshape added 2 commits April 21, 2026 12:59

address review feedback: drop redundant _STTOptions defaults, extract…

13532df

… parsing helper, inline example in README, restore product-specific docs links

fix broken docs link in plugin __init__.py

1fc9674

The landing page /agents/integrations/inworld/ 404s. Point at the TTS and STT integration pages instead, matching the README.

Merge remote-tracking branch 'upstream/main' into inworld-stt-pr

8dfbb08

cshape added 5 commits April 21, 2026 14:19

move STT docs link to /models/ and drop examples/inworld_agent.py

f4085ea

Per review: STT page lives under /agents/models/stt/inworld/ (not /integrations/). Remove the now-redundant standalone example file — the same agent is inlined in the plugin README.

move TTS docs link to /models/

ccec936

revert TTS docs link to /integrations/ (served via LiveKit inference)

f3e4a10

tinalenguyen approved these changes Apr 21, 2026

tinalenguyen merged commit 3756780 into livekit:main Apr 21, 2026
14 of 15 checks passed

tinalenguyen merged 9 commits into livekit:main from cshape:inworld-stt

4 participants
