On-prem streaming STT latency — 8 kHz telephony, Finalize + endpointing #1617

bdc-001 · 2026-05-26T04:07:56Z

We use self-hosted Deepgram from a Go voice bot over WebSocket /v1/listen:

sample_rate=8000, encoding=linear16, interim_results=true, endpointing=300 (configurable)
We also send {"type":"Finalize"} when our VAD detects end-of-speech
KeepAlive every 5s
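
For reference, the setup above might be expressed roughly as follows in Go. This is a minimal sketch using only the standard library, not the poster's actual code: the host `dg.internal.example:8080`, the plain `ws` scheme, and the helper names `listenURL` and `control` are assumptions for illustration. The query parameters and the `Finalize` / `KeepAlive` message types are the ones named in the post.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strconv"
)

// controlMessage is a JSON text frame sent on the same WebSocket as the audio.
type controlMessage struct {
	Type string `json:"type"`
}

// listenURL builds the /v1/listen URL with the parameters listed in the post.
// host is a placeholder for the self-hosted endpoint (assumption).
func listenURL(host string, endpointingMs int) string {
	q := url.Values{}
	q.Set("sample_rate", "8000")
	q.Set("encoding", "linear16")
	q.Set("interim_results", "true")
	q.Set("endpointing", strconv.Itoa(endpointingMs))
	u := url.URL{Scheme: "ws", Host: host, Path: "/v1/listen", RawQuery: q.Encode()}
	return u.String()
}

// control serializes a control message such as "Finalize" or "KeepAlive".
// Marshalling a struct with a single string field cannot fail, so the error is dropped.
func control(kind string) string {
	b, _ := json.Marshal(controlMessage{Type: kind})
	return string(b)
}

func main() {
	fmt.Println(listenURL("dg.internal.example:8080", 300))
	fmt.Println(control("Finalize"))  // sent when the local VAD detects end-of-speech
	fmt.Println(control("KeepAlive")) // sent on a 5-second ticker
}
```

In a real client these strings would be written as text frames by whichever WebSocket library the bot uses, with audio going out as binary frames on the same connection.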

Goal: minimize latency

What do you recommend for on-prem to cut latency: model choice (Nova-2 vs Nova-3/Flux), endpointing vs Finalize, and any query params we should enable for telephony (utterance_end_ms, vad_events, etc.)?

2026-05-26T04:07:59Z

deepgram-community[bot]
Bot May 26, 2026

Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
Consider joining our Discord community for more opportunity to engage with your fellow Deepgram users. You can earn points which can be redeemed for cool stuff by being active in our communities!


2026-05-26T04:08:04Z

deepgram-community[bot]
Bot May 26, 2026

Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com - being verified through this process will allow our team to help you in a much more streamlined fashion.


2026-05-26T04:08:06Z

deepgram-community[bot]
Bot May 26, 2026

It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?

The Deepgram product you are using (e.g., Speech to Text, Agent API)
A request ID that triggered your error or issue.
