STT

🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.


This project is no longer actively maintained, and we have stopped hosting the online Model Zoo. We've seen focus shift towards newer STT models such as Whisper, and have ourselves focused on Coqui TTS and Coqui Studio.

The models will remain available in the releases of the coqui-ai/STT-models repo.

Coqui STT (🐸STT) is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. 🐸STT is battle-tested in both production and research 🚀

🐸STT features

  • High-quality pre-trained STT model.
  • Efficient training pipeline with Multi-GPU support.
  • Streaming inference.
  • Multiple possible transcripts, each with an associated confidence score.
  • Real-time inference.
  • Small-footprint acoustic model.
  • Bindings for various programming languages.
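
As a rough illustration of the streaming-inference and multiple-transcript features listed above, here is a minimal Python sketch. It assumes the `stt` Python package and `numpy` are installed, and that the bindings expose `Model.createStream`, `feedAudioContent`, and `finishStreamWithMetadata` as described in the 🐸STT documentation. The helper names (`read_pcm16`, `chunks`, `transcribe_streaming`), the file paths, and the chunk size are hypothetical and chosen only for illustration.

```python
import sys
import wave
from array import array


def read_pcm16(path):
    """Read a mono 16-bit PCM WAV file and return (sample_rate, samples)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        rate = wav.getframerate()
        samples = array("h", wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return rate, samples


def chunks(samples, size):
    """Yield consecutive slices of at most `size` samples."""
    for start in range(0, len(samples), size):
        yield samples[start:start + size]


def transcribe_streaming(model_path, wav_path, chunk_size=4096, num_results=3):
    """Feed audio to a model chunk by chunk; return (text, confidence) pairs.

    Assumption: the `stt` bindings behave as in the project's documented API.
    """
    import numpy as np       # imported lazily; only needed for inference
    from stt import Model    # assumed: provided by `pip install stt`

    model = Model(model_path)
    rate, samples = read_pcm16(wav_path)
    if rate != model.sampleRate():
        raise ValueError("audio sample rate does not match the model")

    stream = model.createStream()
    for chunk in chunks(samples, chunk_size):
        stream.feedAudioContent(np.frombuffer(chunk, dtype=np.int16))

    # Ask for several candidate transcripts, each with a confidence score.
    metadata = stream.finishStreamWithMetadata(num_results)
    return [
        ("".join(token.text for token in candidate.tokens), candidate.confidence)
        for candidate in metadata.transcripts
    ]
```

A call such as `transcribe_streaming("model.tflite", "audio.wav")` would then return a list of candidate transcripts with their confidence scores, given a model and a recording at that model's sample rate. The audio helpers use only the standard library, so they can be tried without a model.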

Where to Ask Questions

| Type | Link |
| --- | --- |
| 🚨 Bug Reports | GitHub Issue Tracker |
| 🎁 Feature Requests & Ideas | GitHub Issue Tracker |
| Questions | GitHub Discussions |
| 💬 General Discussion | GitHub Discussions or Gitter Room |

Links & Resources

| Type | Link |
| --- | --- |
| 📰 Documentation | |
| 🚀 Latest release with pre-trained models | see the latest release on GitHub |
| 🤝 Contribution Guidelines | CONTRIBUTING.rst |