From 943e7c10cb7ebca4bbcb3eafecf124778352c8af Mon Sep 17 00:00:00 2001
From: Andre Natal
Date: Tue, 28 Apr 2026 14:32:48 -0700
Subject: [PATCH 1/2] Incorporating flux-general-multi

---
 fern/customization/speech-configuration.mdx   |   2 +-
 .../transcriber-fallback-plan.mdx             |   2 +-
 .../voice-pipeline-configuration.mdx          | 127 ++++++++++++------
 3 files changed, 90 insertions(+), 41 deletions(-)

diff --git a/fern/customization/speech-configuration.mdx b/fern/customization/speech-configuration.mdx
index 53a9bb27f..0a06af4a9 100644
--- a/fern/customization/speech-configuration.mdx
+++ b/fern/customization/speech-configuration.mdx
@@ -51,7 +51,7 @@ This plan defines the parameters for when the assistant begins speaking after th
   **Audio-text based providers:**
-  - **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy.
+  - **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy. Available in English (`flux-general-en`) and multilingual (`flux-general-multi`) variants.
   - **Assembly**: Transcriber that also reports end-of-turn detection. To use Assembly, choose it as your transcriber without setting a separate smart endpointing plan. As transcripts arrive, we consider the `end_of_turn` flag that Assembly sends to mark the end-of-turn, stream to the LLM, and generate a response.
diff --git a/fern/customization/transcriber-fallback-plan.mdx b/fern/customization/transcriber-fallback-plan.mdx
index 45a34548c..ee5ad72c2 100644
--- a/fern/customization/transcriber-fallback-plan.mdx
+++ b/fern/customization/transcriber-fallback-plan.mdx
@@ -124,7 +124,7 @@ Each transcriber provider supports different configuration options. Expand the a
-  - **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, etc.).
+  - **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, `flux-general-multi`, etc.). Use `flux-general-en` for English-only conversations and `flux-general-multi` for multilingual conversations.
   - **language**: Language code for transcription.
   - **keywords**: Keywords with optional boost values for improved recognition (e.g., `["companyname", "productname:2"]`).
   - **keyterm**: Keyterm prompting for up to 90% keyword recall rate improvement.
diff --git a/fern/customization/voice-pipeline-configuration.mdx b/fern/customization/voice-pipeline-configuration.mdx
index a3299b562..7fe863c9e 100644
--- a/fern/customization/voice-pipeline-configuration.mdx
+++ b/fern/customization/voice-pipeline-configuration.mdx
@@ -189,7 +189,7 @@ Uses AI models to analyze speech patterns, context, and audio cues to predict wh
   - **krisp**: Audio-based model analyzing prosodic features (intonation, pitch, rhythm)
   **Audio-text based providers:**
-  - **deepgram-flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. (English only)
+  - **deepgram-flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Use `flux-general-en` for English-only conversations or `flux-general-multi` for multilingual conversations.
   - **assembly**: Transcriber with built-in end-of-turn detection (English only)
@@ -199,7 +199,7 @@ Uses AI models to analyze speech patterns, context, and audio cues to predict wh
 **When to use smart endpointing:**
--- **Deepgram Flux**: English conversations using Deepgram as a transcriber.
+- **Deepgram Flux**: English and Multi-lingual conversations using Deepgram as a transcriber.
 - **Assembly**: Best used when Assembly is already your transcriber provider for English conversations with integrated end-of-turn detection
 - **LiveKit**: English conversations where Deepgram is not the transcriber of choice.
 - **Vapi**: Non-English conversations with default stop speaking plan settings
@@ -221,19 +221,37 @@ Deepgram Flux's end-of-turn detection is configured at the transcriber level, al
 - **4000-6000:** Standard timeout (default: 5000) - natural conversation flow
 - **7000-10000:** Extended timeout for complex or thoughtful responses
-**Configuration example:**
+**Configuration examples:**
-```json
-{
-  "transcriber": {
-    "provider": "deepgram",
-    "model": "flux-general-en",
-    "language": "en",
-    "eotThreshold": 0.7,
-    "eotTimeoutMs": 5000
-  }
-}
-```
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-en",
+        "language": "en",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      }
+    }
+    ```
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-multi",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      }
+    }
+    ```
+
+    When using `flux-general-multi`, omit or leave `language` unset to enable automatic language detection.
+
+
 ### LiveKit's Wait function
@@ -669,32 +687,63 @@ User Interrupts → Assistant Audio Stopped → backoffSeconds Blocks All Output
 ### Audio-text based endpointing (Deepgram Flux example)
-```json
-{
-  "transcriber": {
-    "provider": "deepgram",
-    "model": "flux-general-en",
-    "language": "en",
-    "eotThreshold": 0.7,
-    "eotTimeoutMs": 5000
-  },
-  "stopSpeakingPlan": {
-    "numWords": 2,
-    "voiceSeconds": 0.2,
-    "backoffSeconds": 1.0,
-    "acknowledgementPhrases": [
-      "okay",
-      "right",
-      "uh-huh",
-      "yeah",
-      "mm-hmm",
-      "got it"
-    ]
-  }
-}
-```
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-en",
+        "language": "en",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      },
+      "stopSpeakingPlan": {
+        "numWords": 2,
+        "voiceSeconds": 0.2,
+        "backoffSeconds": 1.0,
+        "acknowledgementPhrases": [
+          "okay",
+          "right",
+          "uh-huh",
+          "yeah",
+          "mm-hmm",
+          "got it"
+        ]
+      }
+    }
+    ```
+
+    **Optimized for:** English conversations where Deepgram is set as transcriber.
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-multi",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      },
+      "stopSpeakingPlan": {
+        "numWords": 2,
+        "voiceSeconds": 0.2,
+        "backoffSeconds": 1.0,
+        "acknowledgementPhrases": [
+          "okay",
+          "right",
+          "uh-huh",
+          "yeah",
+          "mm-hmm",
+          "got it"
+        ]
+      }
+    }
+    ```
-**Optimized for:** English conversations where Deepgram is set as transcriber.
+    **Optimized for:** Multilingual conversations where Deepgram is set as transcriber. Omit `language` to enable automatic language detection.
+
+
 ### Education and training

From 765c6658a4c65ae3ca735b57d4fbafd05282c1f0 Mon Sep 17 00:00:00 2001
From: Andre Natal
Date: Tue, 28 Apr 2026 14:36:56 -0700
Subject: [PATCH 2/2] Stating languages supported by multilang flux

---
 fern/customization/speech-configuration.mdx         | 2 +-
 fern/customization/transcriber-fallback-plan.mdx    | 2 +-
 fern/customization/voice-pipeline-configuration.mdx | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fern/customization/speech-configuration.mdx b/fern/customization/speech-configuration.mdx
index 0a06af4a9..974d53eaa 100644
--- a/fern/customization/speech-configuration.mdx
+++ b/fern/customization/speech-configuration.mdx
@@ -51,7 +51,7 @@ This plan defines the parameters for when the assistant begins speaking after th
   **Audio-text based providers:**
-  - **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy. Available in English (`flux-general-en`) and multilingual (`flux-general-multi`) variants.
+  - **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy. Available in English (`flux-general-en`) and multilingual (`flux-general-multi`) variants. Supported languages for `flux-general-multi`: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`).
   - **Assembly**: Transcriber that also reports end-of-turn detection. To use Assembly, choose it as your transcriber without setting a separate smart endpointing plan. As transcripts arrive, we consider the `end_of_turn` flag that Assembly sends to mark the end-of-turn, stream to the LLM, and generate a response.
diff --git a/fern/customization/transcriber-fallback-plan.mdx b/fern/customization/transcriber-fallback-plan.mdx
index ee5ad72c2..19466ca4c 100644
--- a/fern/customization/transcriber-fallback-plan.mdx
+++ b/fern/customization/transcriber-fallback-plan.mdx
@@ -124,7 +124,7 @@ Each transcriber provider supports different configuration options. Expand the a
-  - **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, `flux-general-multi`, etc.). Use `flux-general-en` for English-only conversations and `flux-general-multi` for multilingual conversations.
+  - **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, `flux-general-multi`, etc.). Use `flux-general-en` for English-only conversations and `flux-general-multi` for multilingual conversations. Supported languages for `flux-general-multi`: `en`, `es`, `fr`, `de`, `hi`, `ru`, `pt`, `ja`, `it`, `nl`.
   - **language**: Language code for transcription.
   - **keywords**: Keywords with optional boost values for improved recognition (e.g., `["companyname", "productname:2"]`).
   - **keyterm**: Keyterm prompting for up to 90% keyword recall rate improvement.
diff --git a/fern/customization/voice-pipeline-configuration.mdx b/fern/customization/voice-pipeline-configuration.mdx
index 7fe863c9e..2ac78292f 100644
--- a/fern/customization/voice-pipeline-configuration.mdx
+++ b/fern/customization/voice-pipeline-configuration.mdx
@@ -249,7 +249,7 @@ Deepgram Flux's end-of-turn detection is configured at the transcriber level, al
     }
     ```
-    When using `flux-general-multi`, omit or leave `language` unset to enable automatic language detection.
+    Supported languages: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`). Set the `language` field to one of these codes, or omit it to enable automatic language detection.
@@ -741,7 +741,7 @@ User Interrupts → Assistant Audio Stopped → backoffSeconds Blocks All Output
     }
     ```
-    **Optimized for:** Multilingual conversations where Deepgram is set as transcriber. Omit `language` to enable automatic language detection.
+    **Optimized for:** Multilingual conversations where Deepgram is set as transcriber. Supported languages: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`). Omit `language` to enable automatic language detection.
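
The second patch states that `language` may be set to one of the supported codes rather than omitted, but the added configuration examples only show the auto-detect case. A minimal sketch of the pinned-language variant follows, assuming the `language` field is accepted alongside `flux-general-multi` exactly as the prose describes. The choice of `es` is illustrative only, and the remaining values simply mirror the examples in the patch.

```json
{
  "transcriber": {
    "provider": "deepgram",
    "model": "flux-general-multi",
    "language": "es",
    "eotThreshold": 0.7,
    "eotTimeoutMs": 5000
  }
}
```

Leaving `language` out, as the multilingual examples in the patch do, keeps language detection automatic.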