On-prem streaming STT latency — 8 kHz telephony, Finalize + endpointing #1617
Replies: 3 comments
Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com - being verified through this process will allow our team to help you in a much more streamlined fashion.
It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?
We use self-hosted Deepgram from a Go voice bot over WebSocket /v1/listen:
sample_rate=8000, encoding=linear16, interim_results=true, endpointing=300 (configurable)
We also send {"type":"Finalize"} when our VAD detects end-of-speech
KeepAlive every 5s
Goal: minimize latency
What do you recommend for cutting latency on-prem: model choice (Nova-2 vs Nova-3/Flux), endpointing vs Finalize, and any query params we should enable for telephony (utterance_end_ms, vad_events, etc.)?
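A minimal Go sketch of the setup described above, in case it helps anyone reproduce it. It only builds the /v1/listen URL with the listed query params and the JSON control frames; the base URL ws://localhost:8080 and the names listenConfig, listenURL, and controlFrame are hypothetical placeholders, not taken from the actual bot, and the WebSocket client itself is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// listenConfig holds the streaming options mentioned in the post.
// BaseURL is a hypothetical address for the self-hosted instance.
type listenConfig struct {
	BaseURL        string
	SampleRate     int
	Encoding       string
	InterimResults bool
	EndpointingMS  int
}

// listenURL builds the /v1/listen WebSocket URL with the query params.
func listenURL(c listenConfig) (string, error) {
	u, err := url.Parse(c.BaseURL)
	if err != nil {
		return "", err
	}
	u.Path = "/v1/listen"
	q := url.Values{}
	q.Set("sample_rate", strconv.Itoa(c.SampleRate))
	q.Set("encoding", c.Encoding)
	q.Set("interim_results", strconv.FormatBool(c.InterimResults))
	q.Set("endpointing", strconv.Itoa(c.EndpointingMS))
	u.RawQuery = q.Encode() // Encode sorts the params by key
	return u.String(), nil
}

// controlMessage is the JSON shape of the text frames sent alongside audio.
type controlMessage struct {
	Type string `json:"type"`
}

// controlFrame marshals a control message such as "Finalize" or "KeepAlive".
func controlFrame(kind string) ([]byte, error) {
	return json.Marshal(controlMessage{Type: kind})
}

func main() {
	u, err := listenURL(listenConfig{
		BaseURL:        "ws://localhost:8080", // assumption: local on-prem endpoint
		SampleRate:     8000,
		Encoding:       "linear16",
		InterimResults: true,
		EndpointingMS:  300,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(u)

	// Sent as a text frame when the local VAD detects end-of-speech.
	finalize, _ := controlFrame("Finalize")
	fmt.Println(string(finalize))

	// Sent as a text frame every 5s to keep the connection open.
	keepAlive, _ := controlFrame("KeepAlive")
	fmt.Println(string(keepAlive))
}
```

Keeping the params in one struct makes it easy to A/B different endpointing values against the VAD-triggered Finalize path while measuring time to final transcript.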