| aliases | title | tags | date | draft |
|---|---|---|---|---|
| | Speech to Text Apps | | 2022-05-05 12:41:00 -0700 | false |

Related links:
🔗 Speech to Text Apps
🔗 Text to Speech Apps
🔗 Speech to Speech (Fake Voice Generator)
- DeepSpeech: simpler, although inferior.
- Kaldi: supports hybrid NN-HMM and lattice-free MMI models. It is widely used both in research and in production.
- Lingvo: the open-source version of Google's speech recognition toolkit, with support mostly for end-to-end models.
- ESPnet: good and well known for end-to-end models as well.
- RASR + RETURNN: very good as well, both for end-to-end and hybrid NN-HMM models, but for non-commercial applications only unless you obtain a commercial licence (disclaimer: I work at the university chair which develops these frameworks).
- http://gkarsay.github.io/parlatype/
- https://github.com/juanerasmoe/pmTrans
- https://pythonbasics.org/transcribe-audio/
- Wav2Letter, the tool by Facebook.
- snakers4/silero-models at mlnews Silero Speech to Text
- Coqui: Coqui STT and TTS
- voice2json - Command-line tools for speech and intent recognition on Linux
- VOSK: offline speech recognition API
- Datasets
  - English: TED-LIUM, LibriSpeech, etc.
  - https://github.com/gooofy/zamia-speech
  - https://commonvoice.mozilla.org/en/datasets
  - https://www.openslr.org/resources.php
- snakers4/silero-models: Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple
- Voice Notebook
- Speech Texter
- Voicenote
- Speechnotes
- Dictation
- Dictanote
- oTranscribe
- Google Web Speech API
- Google Docs: Type with your voice (Tools, and then Voice typing)
- Wav2vec: Semi and Unsupervised Speech Recognition - Vaclav Kosar’s Blog
- The Illustrated Wav2vec - Jonathan Bgn
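
As a rough illustration of what offline transcription with one of the listed tools can look like, here is a minimal sketch using the VOSK Python bindings. It assumes the `vosk` package is installed and that a model has already been downloaded and unpacked; the helper names `join_transcript` and `transcribe_wav`, and the paths in the usage note, are made up for this example.

```python
import json
import wave


def join_transcript(result_jsons):
    """Join the "text" fields of Vosk-style JSON result strings."""
    parts = []
    for raw in result_jsons:
        text = json.loads(raw).get("text", "")
        if text:
            parts.append(text)
    return " ".join(parts)


def transcribe_wav(wav_path, model_dir):
    """Transcribe a 16-bit mono PCM WAV file offline with Vosk."""
    # Imported lazily so join_transcript works without Vosk installed.
    from vosk import Model, KaldiRecognizer

    results = []
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM WAV audio")
        recognizer = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True once an utterance is complete.
            if recognizer.AcceptWaveform(data):
                results.append(recognizer.Result())
        # Flush whatever audio is still buffered in the recognizer.
        results.append(recognizer.FinalResult())
    return join_transcript(results)
```

A call might look like `print(transcribe_wav("recording.wav", "model"))`, where both paths are placeholders. Audio in other formats would need converting to mono 16-bit PCM WAV first.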