Silero Models: pre-trained STT models and benchmarks

License: CC BY-NC 4.0


Silero Models

Silero Models: pre-trained enterprise-grade STT models and benchmarks. Enterprise-grade STT made refreshingly simple (seriously, see the benchmarks). We provide quality comparable to Google's STT (and sometimes even better), and we are not Google.

As a bonus:

  • No Kaldi;
  • No compilation;
  • No 20-step instructions;

Getting started

All of the provided models are listed in the models.yml file. Metadata and newer model versions will be added there.
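For illustration, an entry in models.yml might look roughly like the fragment below. The exact keys are an assumption inferred from the loading code in this README (which reads `models.stt_models.en.latest.jit`), and the URLs are placeholders; see the real models.yml for the actual structure.

```yaml
# hypothetical excerpt -- check the real models.yml for the actual layout
stt_models:
  en:
    latest:
      jit: https://example.com/en_v1.jit            # TorchScript package (placeholder URL)
      onnx: https://example.com/en_v1.onnx          # ONNX export (placeholder URL)
      labels: https://example.com/en_v1_labels.json # decoder labels (placeholder URL)
```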

Currently we provide the following checkpoints:

| Model           | PyTorch | ONNX | TensorFlow | Quantization | Quality | Colab         |
|-----------------|---------|------|------------|--------------|---------|---------------|
| English (en_v1) | ✔️      | ✔️   | ✔️         |              | link    | Open In Colab |
| German (de_v1)  | ✔️      | ✔️   | ✔️         |              | link    | Open In Colab |
| Spanish (es_v1) | ✔️      | ✔️   | ✔️         |              | link    | Open In Colab |

PyTorch

Open In Colab

Dependencies:

  • PyTorch 1.6+
  • TorchAudio 0.7+ (you can use your own data loaders)
  • omegaconf (or any similar library to work with yaml files)

Loading a model is as easy as cloning this repository and:

import torch
from omegaconf import OmegaConf
from utils import init_jit_model  # helper shipped in this repository

models = OmegaConf.load('models.yml')
device = torch.device('cpu')   # you can use any PyTorch device
model, decoder = init_jit_model(models.stt_models.en.latest.jit, device=device)

We provide our models as TorchScript packages, so you can use the deployment options PyTorch itself provides (C++, Java). See details in the example notebook.
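The decoder returned above turns the acoustic model's per-frame token predictions into text. The repository ships its own Decoder, but conceptually this is CTC-style greedy decoding: collapse repeated tokens, then drop the blank symbol. A minimal self-contained sketch (the label set and blank index here are made up for illustration, not the model's real alphabet):

```python
def ctc_greedy_decode(token_ids, labels, blank=0):
    """Collapse repeated tokens and drop blanks (CTC-style greedy decoding)."""
    out = []
    prev = None
    for t in token_ids:
        if t != prev and t != blank:
            out.append(labels[t])
        prev = t
    return ''.join(out)

labels = ['_', 'h', 'e', 'l', 'o']  # '_' stands in for the CTC blank
print(ctc_greedy_decode([1, 1, 0, 2, 3, 0, 3, 4], labels))  # hello
```

Note how the blank between the two 3s keeps the double "l" from being collapsed into one.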

ONNX

Open In Colab

You can run our models anywhere you can import an ONNX model or run the ONNX runtime.

Dependencies:

  • PyTorch 1.6+ (used for utilities only)
  • omegaconf (or any similar library to work with yaml files)
  • onnx
  • onnxruntime

Just clone the repo and:

import json
import onnx
import torch
import tempfile
import onnxruntime
from omegaconf import OmegaConf
from utils import Decoder  # helper shipped in this repository

models = OmegaConf.load('models.yml')

with tempfile.NamedTemporaryFile('wb', suffix='.json') as f:
    torch.hub.download_url_to_file(models.stt_models.en.latest.labels,
                                   f.name,
                                   progress=True)
    with open(f.name) as f:
        labels = json.load(f)
        decoder = Decoder(labels)

with tempfile.NamedTemporaryFile('wb', suffix='.model') as f:
    torch.hub.download_url_to_file(models.stt_models.en.latest.onnx,
                                   f.name,
                                   progress=True)
    onnx_model = onnx.load(f.name)
    onnx.checker.check_model(onnx_model)
    ort_session = onnxruntime.InferenceSession(f.name)

See details in the example notebook.
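The snippet above only loads the session; to run inference you still need to assemble a batch. Variable-length clips are typically zero-padded into one float32 array. A minimal padding sketch (the input tensor name 'input' in the commented call is an assumption; query `ort_session.get_inputs()` for the real name):

```python
import numpy as np

def prepare_batch(waveforms):
    """Zero-pad a list of 1-D float32 waveforms into a single (N, max_len) batch."""
    max_len = max(len(w) for w in waveforms)
    batch = np.zeros((len(waveforms), max_len), dtype=np.float32)
    for i, w in enumerate(waveforms):
        batch[i, :len(w)] = w
    return batch

# Two clips of different lengths are padded to a common length.
batch = prepare_batch([np.ones(8000, dtype=np.float32),
                       np.ones(4000, dtype=np.float32)])
print(batch.shape)  # (2, 8000)
# Then feed the batch to the session, e.g.:
#   ort_outputs = ort_session.run(None, {'input': batch})  # 'input' is an assumption
```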

TensorFlow

We provide TensorFlow checkpoints, but we do not provide any related utilities.

Examples

Colab notebooks and interactive demos are on the way. In the meantime, please refer to this notebook for:

  • PyTorch example
  • ONNX example

Wiki

Also check out our wiki.

Performance

TBD: a link and a summary of model sizes and performance metrics will appear here.

TL;DR for now: our models run faster than real time on CPUs (RTF < 1). Each model weighs about 200 MB. We have succeeded in compressing our models down to 50 MB with some loss of quality. Our aspirational goal is first to achieve 50 MB without quality loss, and then to scale down to 10-20 MB or provide compact variants of our EE models.
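RTF (real-time factor) here means processing time divided by audio duration, so RTF < 1 is faster than real time. A quick sanity-check helper (the timing numbers are made up for illustration):

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = processing time / audio duration; RTF < 1 means faster than real time."""
    return processing_seconds / audio_seconds

# e.g. transcribing 10 s of audio in 2.5 s of wall-clock time:
print(real_time_factor(2.5, 10.0))  # 0.25
```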

Get in Touch

Try our models, create an issue, join our chat, email us.

Commercial Inquiries

Please see our tiers and email us.
