diff --git a/fern/customization/speech-configuration.mdx b/fern/customization/speech-configuration.mdx
index 53a9bb27f..974d53eaa 100644
--- a/fern/customization/speech-configuration.mdx
+++ b/fern/customization/speech-configuration.mdx
@@ -51,7 +51,7 @@ This plan defines the parameters for when the assistant begins speaking after th
 **Audio-text based providers:**
- - **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy.
+ - **Deepgram Flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Flux combines high-quality speech-to-text with native turn detection, while delivering ultra-low latency and Nova-3 level accuracy. Available in English (`flux-general-en`) and multilingual (`flux-general-multi`) variants. Supported languages for `flux-general-multi`: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`).
  - **Assembly**: Transcriber that also reports end-of-turn detection. To use Assembly, choose it as your transcriber without setting a separate smart endpointing plan. As transcripts arrive, we consider the `end_of_turn` flag that Assembly sends to mark the end-of-turn, stream to the LLM, and generate a response.
diff --git a/fern/customization/transcriber-fallback-plan.mdx b/fern/customization/transcriber-fallback-plan.mdx
index 45a34548c..19466ca4c 100644
--- a/fern/customization/transcriber-fallback-plan.mdx
+++ b/fern/customization/transcriber-fallback-plan.mdx
@@ -124,7 +124,7 @@ Each transcriber provider supports different configuration options. Expand the a
- - **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, etc.).
+ - **model**: Model selection (`nova-3`, `nova-3-general`, `nova-3-medical`, `nova-2`, `flux-general-en`, `flux-general-multi`, etc.). Use `flux-general-en` for English-only conversations and `flux-general-multi` for multilingual conversations. Supported languages for `flux-general-multi`: `en`, `es`, `fr`, `de`, `hi`, `ru`, `pt`, `ja`, `it`, `nl`.
  - **language**: Language code for transcription.
  - **keywords**: Keywords with optional boost values for improved recognition (e.g., `["companyname", "productname:2"]`).
  - **keyterm**: Keyterm prompting for up to 90% keyword recall rate improvement.
diff --git a/fern/customization/voice-pipeline-configuration.mdx b/fern/customization/voice-pipeline-configuration.mdx
index a3299b562..2ac78292f 100644
--- a/fern/customization/voice-pipeline-configuration.mdx
+++ b/fern/customization/voice-pipeline-configuration.mdx
@@ -189,7 +189,7 @@ Uses AI models to analyze speech patterns, context, and audio cues to predict wh
  - **krisp**: Audio-based model analyzing prosodic features (intonation, pitch, rhythm)

 **Audio-text based providers:**
- - **deepgram-flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. (English only)
+ - **deepgram-flux**: Deepgram's latest transcriber model with built-in conversational speech recognition. Use `flux-general-en` for English-only conversations or `flux-general-multi` for multilingual conversations.
  - **assembly**: Transcriber with built-in end-of-turn detection (English only)
@@ -199,7 +199,7 @@ Uses AI models to analyze speech patterns, context, and audio cues to predict wh
 **When to use smart endpointing:**

-- **Deepgram Flux**: English conversations using Deepgram as a transcriber.
+- **Deepgram Flux**: English and Multi-lingual conversations using Deepgram as a transcriber.
 - **Assembly**: Best used when Assembly is already your transcriber provider for English conversations with integrated end-of-turn detection
 - **LiveKit**: English conversations where Deepgram is not the transcriber of choice.
 - **Vapi**: Non-English conversations with default stop speaking plan settings
@@ -221,19 +221,37 @@ Deepgram Flux's end-of-turn detection is configured at the transcriber level, al
 - **4000-6000:** Standard timeout (default: 5000) - natural conversation flow
 - **7000-10000:** Extended timeout for complex or thoughtful responses

-**Configuration example:**
+**Configuration examples:**

-```json
-{
-  "transcriber": {
-    "provider": "deepgram",
-    "model": "flux-general-en",
-    "language": "en",
-    "eotThreshold": 0.7,
-    "eotTimeoutMs": 5000
-  }
-}
-```
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-en",
+        "language": "en",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      }
+    }
+    ```
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-multi",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      }
+    }
+    ```
+
+    Supported languages: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`). Set the `language` field to one of these codes, or omit it to enable automatic language detection.
+
+

 ### LiveKit's Wait function

@@ -669,32 +687,63 @@ User Interrupts → Assistant Audio Stopped → backoffSeconds Blocks All Output

 ### Audio-text based endpointing (Deepgram Flux example)

-```json
-{
-  "transcriber": {
-    "provider": "deepgram",
-    "model": "flux-general-en",
-    "language": "en",
-    "eotThreshold": 0.7,
-    "eotTimeoutMs": 5000
-  },
-  "stopSpeakingPlan": {
-    "numWords": 2,
-    "voiceSeconds": 0.2,
-    "backoffSeconds": 1.0,
-    "acknowledgementPhrases": [
-      "okay",
-      "right",
-      "uh-huh",
-      "yeah",
-      "mm-hmm",
-      "got it"
-    ]
-  }
-}
-```
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-en",
+        "language": "en",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      },
+      "stopSpeakingPlan": {
+        "numWords": 2,
+        "voiceSeconds": 0.2,
+        "backoffSeconds": 1.0,
+        "acknowledgementPhrases": [
+          "okay",
+          "right",
+          "uh-huh",
+          "yeah",
+          "mm-hmm",
+          "got it"
+        ]
+      }
+    }
+    ```
+
+    **Optimized for:** English conversations where Deepgram is set as transcriber.
+
+
+    ```json
+    {
+      "transcriber": {
+        "provider": "deepgram",
+        "model": "flux-general-multi",
+        "eotThreshold": 0.7,
+        "eotTimeoutMs": 5000
+      },
+      "stopSpeakingPlan": {
+        "numWords": 2,
+        "voiceSeconds": 0.2,
+        "backoffSeconds": 1.0,
+        "acknowledgementPhrases": [
+          "okay",
+          "right",
+          "uh-huh",
+          "yeah",
+          "mm-hmm",
+          "got it"
+        ]
+      }
+    }
+    ```

-**Optimized for:** English conversations where Deepgram is set as transcriber.
+    **Optimized for:** Multilingual conversations where Deepgram is set as transcriber. Supported languages: English (`en`), Spanish (`es`), French (`fr`), German (`de`), Hindi (`hi`), Russian (`ru`), Portuguese (`pt`), Japanese (`ja`), Italian (`it`), Dutch (`nl`). Omit `language` to enable automatic language detection.
+
+

 ### Education and training