fix(elevenlabs/stt): allow specifying scribe_v2 non-realtime model #4515
chenghao-mou
left a comment
Some minor issues but looks good otherwise! Thanks for contributing!
/test-stt
STT Test Results
Status: ✗ Some tests failed
Triggered by workflow run #233
/test-stt
❌
I have made the suggested changes ⚡
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
if is_given(use_realtime):
    if use_realtime is True:
        logger.warning(
            "`use_realtime` parameter is deprecated. "
            "Specify a realtime model_id to enable streaming. "
            "Defaulting model_id to 'scribe_v2_realtime' "
        )
        model_id = "scribe_v2_realtime"
    else:
        logger.warning(
            "`use_realtime` parameter is deprecated. Instead set model_id to determine if streaming is enabled."
        )
        if is_given(model_id) and "realtime" in model_id:
            raise ValueError(
                "The currently selected model is a realtime model but use_realtime is False"
            )
else:
    use_realtime = True if (is_given(model_id) and "realtime" in model_id) else False

# Handle model_id defaults
if not is_given(model_id):
    if use_realtime:
        logger.warning("model_id is not provided. Defaulting to 'scribe_v2_realtime'.")
        model_id = "scribe_v2_realtime"
    else:
        logger.warning("model_id is not provided. Defaulting to 'scribe_v1'.")
        model_id = "scribe_v1"
maybe simplify the validation
if is_given(realtime_model):
    if is_given(model_id):
        logger.warning(
            "both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored."
        )
    else:
        logger.warning(
            "`use_realtime` parameter is deprecated. "
            "Specify a realtime model_id to enable streaming. "
            "Defaulting model_id to 'scribe_v2_realtime' "
        )
        model_id = "scribe_v2_realtime" if realtime_model else "scribe_v1"
model_id = model_id if is_given(model_id) else "scribe_v1"
realtime_model = model_id == "scribe_v2_realtime"
then use realtime_model for streaming and
if not realtime_model and is_given(server_vad):
    logger.warning("Server-side VAD is only supported for Scribe v2 realtime model")
Ok I have made that change
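For readers following the thread, the simplified validation suggested above can be tried in isolation. This is only a minimal sketch: `resolve_model`, the local `_NotGiven` sentinel, and the local `is_given` helper are stand-ins written for illustration (the plugin itself imports its sentinel and `is_given` from livekit-agents), and the warning text is shortened.

```python
import logging
from typing import Tuple

logger = logging.getLogger("elevenlabs_stt_sketch")


class _NotGiven:
    """Illustrative stand-in for the NOT_GIVEN sentinel used by the plugin."""


NOT_GIVEN = _NotGiven()


def is_given(value: object) -> bool:
    """Return True when the caller explicitly passed a value."""
    return not isinstance(value, _NotGiven)


def resolve_model(
    use_realtime: object = NOT_GIVEN,
    model_id: object = NOT_GIVEN,
) -> Tuple[str, bool]:
    """Return (model_id, realtime) following the simplified validation."""
    if is_given(use_realtime):
        if is_given(model_id):
            # An explicit model_id wins; the deprecated flag is ignored.
            logger.warning(
                "both `use_realtime` and `model_id` parameters are provided. "
                "`use_realtime` will be ignored."
            )
        else:
            # Only the deprecated flag was passed: map it onto a model id.
            logger.warning("`use_realtime` parameter is deprecated.")
            model_id = "scribe_v2_realtime" if use_realtime else "scribe_v1"
    resolved = str(model_id) if is_given(model_id) else "scribe_v1"
    return resolved, resolved == "scribe_v2_realtime"
```

Under these assumptions, `resolve_model(use_realtime=True)` keeps the old behaviour (the realtime model is selected), while `resolve_model(model_id="scribe_v2")` selects the non-realtime v2 model with streaming disabled.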
/test-stt
@bml1g12 just need to fix CI and we'll get this merged.
@davidzhao Sorry but I do not understand this CI failure - as when I run All looks healthy, and @chenghao-mou suggests the failure is unrelated to this PR here
The latest failure came from ruff in 3.9: You can run
📝 Walkthrough
Type annotations added to an ElevenLabs example script. The ElevenLabs STT component now supports configurable model selection through a new model_id parameter, replacing hardcoded model identifiers and deprecating the use_realtime flag with appropriate warnings.
Estimated Code Review Effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Oh my apologies for missing that, I have pushed the ruff formatted changes |
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py`:
- Around line 109-125: The logger.warning call in the branch that handles both
use_realtime and model_id ("both `use_realtime` and `model_id` parameters are
provided. `use_realtime` will be ignored.") exceeds the 100-character limit;
update the warning in the if is_given(use_realtime) and is_given(model_id)
branch by splitting the long string across multiple shorter literal parts (e.g.,
implicit string concatenation with parentheses or separate +-joined strings) in
the logger.warning call so it stays under 100 chars per line, keeping the same
message and leaving the surrounding logic (use_realtime, model_id, and
subsequent model_id defaulting) unchanged.
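The fix described above relies on Python concatenating adjacent string literals at compile time. A small sketch of that technique, using the warning text quoted in this thread (the logger name and the `SPLIT_MESSAGE` constant are illustrative, not names from the plugin):

```python
import logging

logger = logging.getLogger("line_length_sketch")

# Adjacent string literals inside parentheses are joined at compile time,
# so the logged message is unchanged while each source line stays short.
SPLIT_MESSAGE = (
    "both `use_realtime` and `model_id` parameters are provided. "
    "`use_realtime` will be ignored."
)

logger.warning(SPLIT_MESSAGE)
```

Note the trailing space inside the first literal; without it the two sentences would run together in the log output.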
🧹 Nitpick comments (1)
examples/other/elevenlab_scribe_v2.py (1)
27-36: Remove deprecated `use_realtime` parameter from example.
The example uses both `use_realtime=True` (deprecated) and `model_id="scribe_v2_realtime"` (new). Per the implementation in `stt.py`, this combination logs a warning: "both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored." Examples should demonstrate the recommended usage pattern.
♻️ Suggested fix
 stt = elevenlabs.STT(
-    use_realtime=True,
     server_vad={
         "vad_silence_threshold_secs": 0.5,
         "vad_threshold": 0.5,
         "min_speech_duration_ms": 100,
         "min_silence_duration_ms": 300,
     },
     model_id="scribe_v2_realtime",
 )
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
examples/other/elevenlab_scribe_v2.py
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
examples/other/elevenlab_scribe_v2.py
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Implement Model Interface Pattern for STT, TTS, LLM, and Realtime models with provider-agnostic interfaces, fallback adapters for resilience, and stream adapters for different streaming patterns
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Run all checks (format-check, lint, type-check) using 'make check' before committing
Applied to files:
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Applies to **/*.py : Run ruff linter and auto-fix issues
Applied to files:
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py
🧬 Code graph analysis (2)
examples/other/elevenlab_scribe_v2.py (7)
examples/bank-ivr/ivr_navigator_agent.py (1)
prewarm(79-80)
examples/bank-ivr/ivr_system_agent.py (1)
prewarm(625-626)
examples/other/kokoro_tts.py (1)
prewarm(30-31)
examples/voice_agents/basic_agent.py (1)
prewarm(71-72)
examples/other/transcription/multi-user-transcriber.py (1)
prewarm(139-140)
examples/voice_agents/nvidia_test.py (1)
prewarm(21-22)
examples/voice_agents/zapier_mcp_integration.py (1)
prewarm(48-49)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (2)
livekit-agents/livekit/agents/utils/misc.py (1)
is_given(25-26)
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/tts.py (1)
model(193-194)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: unit-tests
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
🔇 Additional comments (9)
examples/other/elevenlab_scribe_v2.py (2)
17-17: LGTM on type annotations!
The added return type annotations (`-> None`) improve code clarity and align with stricter typing patterns in the codebase (e.g., `examples/bank-ivr/ivr_navigator_agent.py`).
Also applies to: 25-25
38-38: Explicit type annotation is acceptable but optional.
The type annotation on `session` is valid. However, since `AgentSession(...)` already returns `AgentSession`, the annotation is redundant. This is a style preference; keeping it is fine for explicitness.
livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py (7)
23-23: LGTM on import addition.
The `Literal` import is correctly added to support the `ElevenLabsSTTModels` type alias.
60-66: LGTM on type definitions.
The `ElevenLabsSTTModels` type alias with documentation link and the `model_id` field in `STTOptions` provide good type safety while allowing custom model strings for flexibility. This aligns with the Model Interface Pattern.
84-89: LGTM on parameter additions.
The deprecation annotation for `use_realtime` and addition of `model_id` with `NOT_GIVEN` default follows the pattern established in past review discussions. Documentation in the docstring properly explains the deprecation.
155-157: LGTM on model property.
Returning `self._opts.model_id` aligns with the TTS pattern (`livekit-plugins-elevenlabs/tts.py` returns `self._opts.model`) and addresses the past review comment.
182-182: LGTM on API integration.
The `model_id` is correctly passed to the ElevenLabs API form data, replacing previously hardcoded values.
432-439: LGTM on WebSocket integration.
The `model_id` is correctly added to the WebSocket URL query parameters for the realtime streaming endpoint.
142-151: LGTM on options initialization.
The `model_id` is correctly passed to `STTOptions`, ensuring the resolved model is propagated through the options dataclass.
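As a rough illustration of the WebSocket point above, a configurable `model_id` can be carried as a URL query parameter roughly like this. This is a sketch only: `BASE_URL` is a placeholder rather than the real endpoint, and `build_ws_url` and its `language_code` argument are hypothetical names that do not come from the plugin.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration; the real URL is defined in stt.py.
BASE_URL = "wss://example.invalid/speech-to-text/realtime"


def build_ws_url(model_id: str, language_code: str = "") -> str:
    """Append model_id (and an optional extra field) as query parameters."""
    params = {"model_id": model_id}
    if language_code:
        # Hypothetical extra parameter, included to show how fields accumulate.
        params["language_code"] = language_code
    # urlencode percent-escapes values and joins the pairs with "&".
    return f"{BASE_URL}?{urlencode(params)}"
```

The point is simply that the model identifier is taken from the resolved options instead of being hardcoded in the URL.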
if is_given(use_realtime):
    if is_given(model_id):
        logger.warning(
            "both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored."
        )
    else:
        logger.warning(
            "`use_realtime` parameter is deprecated. "
            "Specify a realtime model_id to enable streaming. "
            "Defaulting model_id to one based on use_realtime parameter. "
        )
        model_id = "scribe_v2_realtime" if use_realtime else "scribe_v1"
model_id = model_id if is_given(model_id) else "scribe_v1"
use_realtime = model_id == "scribe_v2_realtime"

if not use_realtime and is_given(server_vad):
    logger.warning("Server-side VAD is only supported for Scribe v2 realtime model")
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the actual file
fd -t f "stt.py" | grep elevenlabs
Repository: livekit/agents
Length of output: 134
🏁 Script executed:
# Check the exact code at lines 109-125
if [ -f "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py" ]; then
  sed -n '100,130p' "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py"
fi
Repository: livekit/agents
Length of output: 1797
🌐 Web query:
ElevenLabs Scribe realtime models API 2025 streaming STT
💡 Result:
Short summary — ElevenLabs Scribe (2025/2026 Realtime):
- Model: Scribe v2 Realtime — low‑latency live STT (≈150 ms median latency), 90+ languages, word-level timestamps, speaker diarization, entity detection and advanced vocab handling. [1][2][5]
- Streaming API: WebSocket-based streaming (server- and client-side guides). Returns partial (interim) transcripts and committed segments; supports PCM 8–48 kHz and μ‑law. [3][4]
- Client vs Server: Client connections use a single-use temporary token (for mic/browser use); server implementations use a standard API key and can stream from URLs or files. [4][3]
- Developer controls: Voice Activity Detection (VAD), manual commit control, ability to send prior text context, and support for multi-language / auto language detection. [3][1]
- Where to start / docs: ElevenLabs realtime STT streaming docs and cookbooks (includes quickstarts and WebSocket examples). [3][4]
If you want, I can: (A) paste the WebSocket quickstart example for client or server, or (B) walk through obtaining temporary tokens and a minimal streaming demo in Node/Python. Which do you prefer?
🏁 Script executed:
# Check file structure and look for model constants/definitions
head -50 "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py"
Repository: livekit/agents
Length of output: 1506
🏁 Script executed:
# Check for any other realtime model references
rg "realtime|scribe" "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/" -i
Repository: livekit/agents
Length of output: 4141
🏁 Script executed:
# Check line lengths in the problematic section
sed -n '109,125p' "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py" | while IFS= read -r line; do
  len=${#line}
  if [ $len -gt 100 ]; then
    echo "Line exceeds 100 chars (length: $len): $line"
  fi
done
Repository: livekit/agents
Length of output: 209
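The same line-length check can be reproduced locally in Python before pushing. A minimal sketch, where `find_long_lines` is a hypothetical helper written for illustration and not part of the repository tooling:

```python
from typing import List, Tuple


def find_long_lines(text: str, limit: int = 100, start: int = 1) -> List[Tuple[int, int]]:
    """Return (line_number, length) for each line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=start)
        if len(line) > limit
    ]
```

Passing `start=109` when feeding it only the excerpt from lines 109-125 keeps the reported line numbers aligned with the file. In practice, the project's ruff configuration remains the authoritative check.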
🏁 Script executed:
# Look at the full method/context including docstrings
sed -n '70,130p' "livekit-plugins/livekit-plugins-elevenlabs/livekit/plugins/elevenlabs/stt.py"
Repository: livekit/agents
Length of output: 3107
Fix line length violation on line 113.
Line 113 exceeds the 100-character maximum:
"both `use_realtime` and `model_id` parameters are provided. `use_realtime` will be ignored."
Split this warning message across multiple lines to comply with the Python style requirement.
The deprecation logic is sound. The exact match on line 122 (model_id == "scribe_v2_realtime") is appropriate—ElevenLabs currently provides only one realtime model, and the Literal type restricts the model_id to known values ("scribe_v1", "scribe_v2", "scribe_v2_realtime").
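To make the trade-off between the exact match and the earlier substring check concrete, here is a small comparison sketch. Both helper names are illustrative, and `"scribe_v3_realtime"` is a made-up identifier used only to show where the two approaches diverge.

```python
def is_realtime_exact(model_id: str) -> bool:
    """Exact match, as in the final revision: only the known realtime model."""
    return model_id == "scribe_v2_realtime"


def is_realtime_substring(model_id: str) -> bool:
    """Substring check, as in the earlier revision: any id containing 'realtime'."""
    return "realtime" in model_id


# Hypothetical future model id, used only to show the difference.
future_id = "scribe_v3_realtime"
exact_result = is_realtime_exact(future_id)
substring_result = is_realtime_substring(future_id)
```

With the exact match, a new realtime model would need a code change before streaming is enabled for it; the substring check would pick it up automatically but could also match unrelated identifiers.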
This PR allows users to select scribe_v2 as a non-realtime STT engine and, more generally, to select the STT model by model_id. It deprecates the use_realtime parameter in favour of enabling streaming automatically based on the model_id name.
Context: I wanted to run the new scribe_v2 model (https://elevenlabs.io/docs/overview/models#models-overview) but saw that the STT plugin has no model parameter and even seems to be hardcoded to the old scribe_v1 model. This PR fixes that by introducing a model_id parameter, similar to many other plugins.
Disclaimer: I am not an expert on the ElevenLabs code; I just spotted what looks like a missing feature or oversight and made this PR accordingly.
Note: When I tried use_realtime=True with the latest livekit version, I regularly got "ElevenLabs STT connection closed unexpectedly" when using manual turn detection and ending the user turn. Searching the livekit agents forum, I see many people reporting similar issues. For this reason I wanted to disable use_realtime and work with the latest model.
Warning: I notice that when use_realtime is set, the plugin uses a hardcoded scribe_v2_realtime model. This PR tries to avoid a breaking change by preserving the existing behaviour, so use_realtime=True will always use that model. If we want to be more future-proof here, we could make a breaking change and make it fully dynamic.