feat(elevenlabs): add applyLanguageTextNormalization TTS option #1427
Ports livekit/agents#5679 to expose the ElevenLabs `apply_language_text_normalization` query parameter on the multi-stream WebSocket URL. The new option is optional and only sent when explicitly provided, matching Python's NotGivenOr semantics.
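A minimal sketch of the "only sent when explicitly provided" behavior described above. The `SketchOptions` shape and the `buildQuery` helper are hypothetical stand-ins for illustration, not the plugin's actual types or URL builder, and the `'auto' | 'on' | 'off'` union is an assumption.

```typescript
// Hypothetical, simplified option shape; the real resolved options carry more fields.
interface SketchOptions {
  inactivityTimeout: number;
  applyTextNormalization: 'auto' | 'on' | 'off'; // assumed value set
  applyLanguageTextNormalization?: boolean; // undefined means "not given"
}

// Hypothetical stand-in for the multi-stream URL builder's query assembly.
function buildQuery(opts: SketchOptions): string {
  const params: string[] = [];
  params.push(`inactivity_timeout=${opts.inactivityTimeout}`);
  params.push(`apply_text_normalization=${opts.applyTextNormalization}`);
  // Gate on `!== undefined` rather than truthiness so an explicit `false`
  // is still sent, while an omitted option leaves the server default in place.
  if (opts.applyLanguageTextNormalization !== undefined) {
    // Template-literal interpolation of a boolean yields "true" or "false".
    params.push(`apply_language_text_normalization=${opts.applyLanguageTextNormalization}`);
  }
  return params.join('&');
}
```

Checking for `undefined` instead of truthiness is the point of the design: `false` is a meaningful value that a caller may want to send explicitly.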
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: fd9c37b582
params.push(`inactivity_timeout=${opts.inactivityTimeout}`);
params.push(`apply_text_normalization=${opts.applyTextNormalization}`);
if (opts.applyLanguageTextNormalization !== undefined) {
  params.push(`apply_language_text_normalization=${opts.applyLanguageTextNormalization}`);
Send language normalization on the HTTP TTS path
When callers use `new TTS({ applyLanguageTextNormalization: true }).synthesize(...)`, the option is never sent: `ChunkedStream` posts to `synthesizeUrl()`, and the JSON body built in `run()` only includes `text`, `model_id`, and `voice_settings`. The ElevenLabs Stream speech endpoint is the documented path that accepts `apply_language_text_normalization`, but this change only appends it to the multi-context WebSocket URL. One-shot/chunked synthesis therefore still runs with the API default (`false`) for the Japanese normalization case this option is meant to enable.
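One way the suggestion could be addressed is sketched below, assuming the REST streaming endpoint accepts the flag as a JSON body field, as the comment implies. `SynthesizeBody` and `buildSynthesizeBody` are hypothetical names for illustration; the plugin's actual `run()` code is not shown in this PR.

```typescript
// Hypothetical request-body shape; only the fields named in the review comment
// plus the new flag are modeled here.
interface SynthesizeBody {
  text: string;
  model_id: string;
  voice_settings?: Record<string, unknown>;
  apply_language_text_normalization?: boolean; // assumed to be a body field
}

// Hypothetical helper: builds the JSON body and adds optional fields only
// when the caller provided them.
function buildSynthesizeBody(
  text: string,
  modelId: string,
  voiceSettings?: Record<string, unknown>,
  applyLanguageTextNormalization?: boolean,
): SynthesizeBody {
  const body: SynthesizeBody = { text, model_id: modelId };
  if (voiceSettings !== undefined) {
    body.voice_settings = voiceSettings;
  }
  // Same "not given" guard as the WebSocket path: omit the key entirely
  // so the API default applies when the option was never set.
  if (applyLanguageTextNormalization !== undefined) {
    body.apply_language_text_normalization = applyLanguageTextNormalization;
  }
  return body;
}
```

Using the same guard on both paths would keep streaming and one-shot synthesis consistent for callers who set the option once in the constructor.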
Summary
Ports livekit/agents#5679 ("(elevenlabs tts): add apply_language_text_normalization param") from the Python `livekit-agents` repo into `agents-js`.

This PR adds support for ElevenLabs' `apply_language_text_normalization` query parameter on the multi-stream WebSocket URL. When set to `true`, ElevenLabs applies language-aware text normalization, which helps with proper pronunciation of text in some supported languages.

cc @toubatbrian @livekit/agent-devs for review.
Ported features
`applyLanguageTextNormalization` option on `TTS` (`plugins/elevenlabs/src/tts.ts`)

New optional `applyLanguageTextNormalization?: boolean` field on `TTSOptions`. When provided, it is appended to the `multi-stream-input` WebSocket URL as `apply_language_text_normalization=<true|false>`. When omitted, the parameter is not sent at all (preserving ElevenLabs' server-side default).

The option flows through the same path as the existing `applyTextNormalization` option:
- `TTSOptions` interface.
- `ResolvedTTSOptions` (kept optional so the URL builder can detect "not given").
- `multiStreamUrl(...)` only when defined, matching Python's `is_given(...)` guard.

Implementation nuances vs Python
- `NotGivenOr[bool]` mapping: Python represents "not given" with the sentinel `NOT_GIVEN` and gates URL inclusion on `is_given(...)`. JS uses `boolean | undefined` and gates on `!== undefined`. Behavior is equivalent: when the user does not pass the option, the query parameter is omitted from the WebSocket URL and ElevenLabs falls back to its server-side default.
- Python uses `str(value).lower()` to produce `"true"`/`"false"`. JS template-literal interpolation of a `boolean` already yields lowercase `"true"`/`"false"`, so no explicit conversion is needed.
- `apply_language_text_normalization` (Python field & wire param) maps to camelCase `applyLanguageTextNormalization` in JS. The wire query parameter remains `apply_language_text_normalization`.
- `ChunkedStream` (REST `/text-to-speech/.../stream`) not modified: The Python diff only touches the multi-stream WebSocket URL builder (`_multi_stream_url`); the REST `synthesize_url` is unchanged. This port mirrors that exactly: only `multiStreamUrl` learns the new param.
- No `updateOptions` change: The Python PR does not add this field to runtime `update_options` either, so it is a constructor-only knob in JS for parity.

Files changed
- `plugins/elevenlabs/src/tts.ts`: add `applyLanguageTextNormalization?: boolean` to `TTSOptions` and `ResolvedTTSOptions`, propagate from constructor, conditionally append `apply_language_text_normalization=...` to the multi-stream URL.
- `.changeset/elevenlabs-language-text-normalization.md`: `patch` changeset for `@livekit/agents-plugin-elevenlabs`.

Test plan
- `pnpm --filter '@livekit/agents-plugin-elevenlabs...' build` succeeds
- `pnpm format:check` passes
- `pnpm --filter '@livekit/agents-plugin-elevenlabs' lint` passes
- Construct `new TTS({ applyLanguageTextNormalization: true })`, run a streaming synthesis on a non-English voice, and confirm the WebSocket URL includes `apply_language_text_normalization=true` and pronunciation is improved
- Construct `new TTS()` without the option and confirm the URL omits the parameter (current behavior unchanged)

This PR was created by an automated Claude Code Routine maintained by @toubatbrian. The routine is currently in experimentation stage.
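The two manual URL checks in the test plan could be approximated with a small helper like the one below. This is only a sketch: `queryParam` is a hypothetical helper, and the example URLs use a placeholder host and path rather than the real ElevenLabs endpoint.

```typescript
// Hypothetical helper: returns the value of a query parameter from a URL
// string, or null when the parameter is absent.
function queryParam(rawUrl: string, name: string): string | null {
  return new URL(rawUrl).searchParams.get(name);
}

// Placeholder URLs for illustration only; in a real check these would be
// captured from the plugin's WebSocket connection attempt.
const withOption =
  'wss://example.invalid/stream?inactivity_timeout=300&apply_language_text_normalization=true';
const withoutOption = 'wss://example.invalid/stream?inactivity_timeout=300';

// The option was set: the wire parameter should be present with value "true".
const present = queryParam(withOption, 'apply_language_text_normalization');
// The option was omitted: the parameter should not appear at all.
const absent = queryParam(withoutOption, 'apply_language_text_normalization');
```

Comparing against `null` rather than a falsy check distinguishes "parameter omitted" from "parameter sent as `false`".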
Generated by Claude Code