sherpa is an open-source speech-to-text inference framework using PyTorch, focusing exclusively on end-to-end (E2E) models, namely transducer- and CTC-based models. It provides both C++ and Python APIs.
This project focuses on deployment, i.e., using pre-trained models to transcribe speech. If you are interested in how to train or fine-tune your own models, please refer to icefall.
We also have other similar projects that don't depend on PyTorch: sherpa-onnx and sherpa-ncnn. Both also support iOS, Android and embedded systems.
Please refer to the documentation at https://k2-fsa.github.io/sherpa/
Try sherpa from within your browser without installing anything: https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition