Native-script transcripts for multilingual/code-switching streaming STT #1606
Replies: 3 comments 1 reply
Thanks for asking your question. Please be sure to reply with as much detail as possible so the community can assist you efficiently.
Hey there! It looks like you haven't connected your GitHub account to your Deepgram account. You can do this at https://community.deepgram.com. Being verified through this process will allow our team to help you in a much more streamlined fashion.
It looks like we're missing some important information to help debug your issue. Would you mind providing us with the following details in a reply?
We are using Deepgram streaming STT for live transcription. We have tried both model=nova-2 and model=nova-3, with language=multi for multilingual/code-switching support.
Our speakers often mix Hindi with English technical terms. The transcript currently comes back mostly in Latin-script Hinglish, for example:
docker container me run...
instead of native Devanagari Hindi, such as:
docker container में run...
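To quantify how often this happens, we could count scripts per token in each transcript. The snippet below is only a rough sketch, not something Deepgram provides: the helper names `token_script` and `script_counts` are hypothetical, and it assumes Devanagari text falls in the Unicode block U+0900 to U+097F and that English terms are plain ASCII letters.

```python
from collections import Counter


def token_script(token: str) -> str:
    """Classify one whitespace-separated token by the script of its characters."""
    # Assumption: Devanagari letters and vowel signs live in U+0900..U+097F.
    has_devanagari = any("\u0900" <= ch <= "\u097f" for ch in token)
    # Assumption: English technical terms are written in ASCII letters.
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    if has_devanagari and has_latin:
        return "mixed"
    if has_devanagari:
        return "devanagari"
    if has_latin:
        return "latin"
    return "other"  # digits, punctuation, other scripts


def script_counts(transcript: str) -> Counter:
    """Count how many tokens of each script appear in a transcript."""
    return Counter(token_script(tok) for tok in transcript.split())


if __name__ == "__main__":
    print(script_counts("docker container me run..."))
    print(script_counts("docker container में run..."))
```

A transcript with no Devanagari tokens at all, for a speaker we know is talking mostly in Hindi, is the case we are trying to avoid.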
Questions:
Is Latin-script Hinglish expected behavior when using language=multi for Hindi/code-switched streaming audio?
Is there a supported way to bias or force Hindi output into native Devanagari script while still preserving English technical terms naturally?
Which of these would be the recommended setup for this use case: model=nova-3 with language=multi, language=hi, or flux-general-multi with language_hint=hi&language_hint=en?
For real-time streaming speech, what configuration do you recommend when the expected speech is mostly Hindi with English technical terms?
Are there any parameters or post-processing recommendations for native-script multilingual transcripts?
Our goal is to store and display speech in the selected speech language’s native script, while still allowing natural code-switching for technical terms.
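To make the comparison concrete, here is a small sketch of the query strings for the setups we are considering. It uses only Python's standard `urllib.parse.urlencode`. The dictionary name `CANDIDATES` and the helper `query_string` are hypothetical, pairing language=hi with model=nova-3 is our own assumption, and we have not confirmed that each combination is accepted by the streaming API.

```python
from urllib.parse import urlencode

# Candidate parameter sets from the questions above.
# Lists of pairs are used so that language_hint can repeat.
CANDIDATES = {
    "nova-3 multi": [("model", "nova-3"), ("language", "multi")],
    # Assumption: language=hi would be used with nova-3 as well.
    "nova-3 hi": [("model", "nova-3"), ("language", "hi")],
    "flux multi with hints": [
        ("model", "flux-general-multi"),
        ("language_hint", "hi"),
        ("language_hint", "en"),
    ],
}


def query_string(params: list[tuple[str, str]]) -> str:
    """Encode parameter pairs as a URL query string, preserving order."""
    return urlencode(params)


if __name__ == "__main__":
    for name, params in CANDIDATES.items():
        print(f"{name}: {query_string(params)}")
```

Each string would be appended to the streaming endpoint URL after a `?`; the point is only to show exactly which parameters differ between the options.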
Thanks!