nerd-dictation like output #382
Replies: 3 comments 5 replies
-
Hi @murali-agastya, thanks for the feedback. This is actually something we are actively working on right now! We are building real-time streaming transcription as an experimental feature. It will show text as you speak (similar to nerd-dictation's live output) rather than waiting until you stop recording. However, this requires significant architectural changes to the recognition pipeline:
All of this is behind a setting toggle ( We have an open PR in progress. Once it is ready for testing, we will circle back here so you can try it out. Appreciate your patience!
-
I would also like to try out and test this feature. Please let me know when it's ready. Thanks for your work. mr.kechen@outlook.com
-
Why do developers use a four-year-old, piece-of-shit ASR? Whisper is, quite literally, the worst ASR available, yet it continues to show up in apps like this one. It blows my mind. I saw you recommend 15 gigs of VRAM for Whisper medium. 15 gigs!! SenseVoice Small is 100 times the ASR Whisper is, and it is ~200 MB and uses 300 MB of VRAM. Absolute misguided lunacy. And your streaming implementation wouldn't need insane chunking hacks (see: simulstream) just to get something resembling realtime working. You could just use, oh I don't know, one of the 15 actually MODERN ASRs? What the hell are you doing? Seriously. Use Nemotron-realtime. qwen3-asr. There are so many possibilities, and you choose a four-year-old piece-of-shit relic. Awesome. I guess I'll just dominate this space when I publish mine, since all "devs" seem to think is out there is Whisper. Take a gander at hf from time to time; it'll do you some good.
-
Hi,
Thanks for the program. I tried it with the default settings on Linux. It works, but I was looking for something like nerd-dictation, where (I think using xdotool) keystrokes for anything spoken are inserted on the fly! Here, the insertion appears to happen only when I stop the recording. I wonder if your program can support that. The only reason I was looking to move away from nerd-dictation is that its backend is not Whisper (but Vosk).
I am happy to try any experimental versions if you choose to include this feature.
Thanks and kind regards
Murali
murali.agastya@gmail.com
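To make the request concrete, here is a rough sketch of the "insert on the fly" behaviour: each time the recognizer revises its partial transcript, send only the backspaces and new characters needed to turn the text already typed into the new text. This is an illustration under assumptions, not how this program or nerd-dictation is actually implemented. The helper names `keystroke_delta` and `apply_delta` are made up for the example, and the `xdotool key` / `xdotool type` invocations assume xdotool is installed and an X11 session is in use.

```python
import subprocess


def keystroke_delta(previous: str, current: str) -> tuple[int, str]:
    """Return (backspaces, text) that turns `previous` into `current`.

    Only the part after the longest common prefix is retyped.
    """
    common = 0
    for a, b in zip(previous, current):
        if a != b:
            break
        common += 1
    return len(previous) - common, current[common:]


def apply_delta(backspaces: int, text: str, dry_run: bool = True) -> list[list[str]]:
    """Build the xdotool commands for one update (hypothetical helper).

    With dry_run=True the commands are only returned, never executed.
    """
    commands = []
    if backspaces:
        # `xdotool key` accepts several keystrokes in one call.
        commands.append(["xdotool", "key"] + ["BackSpace"] * backspaces)
    if text:
        # `--` keeps text that starts with a dash from being read as an option.
        commands.append(["xdotool", "type", "--", text])
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    typed = ""
    # Successive partial transcripts, as a streaming recognizer might emit them.
    for partial in ["hello", "hello word", "hello world"]:
        backspaces, text = keystroke_delta(typed, partial)
        print(apply_delta(backspaces, text))
        typed = partial
```

The common-prefix diff keeps the number of synthetic keystrokes small when the recognizer only revises the last word or two, which is the usual case for streaming partial results.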