<a href="https://colab.research.google.com/github/csukuangfj/colab/blob/master/sherpa_onnx_whisper_models.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

This Colab notebook shows how to use [sherpa-onnx][sherpa-onnx]
to run [whisper][whisper] models.

[sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx
[whisper]: https://github.com/openai/whisper/

The real-time factor (RTF) of each model variant, measured on CPU and on CUDA, is listed below:

|model|CPU or CUDA| RTF|
|---|---|---|
|float32 tiny.en| CPU|0.208|
|float32 tiny.en| CUDA|0.186|
|int8 tiny.en| CPU|0.142|
|int8 tiny.en| CUDA|0.146|
|float32 base.en| CPU|0.430|
|float32 base.en| CUDA|0.129|
|int8 base.en| CPU|0.332|
|int8 base.en| CUDA|0.259|
|float32 small.en| CPU|1.633|
|float32 small.en| CUDA|0.268|
|int8 small.en| CPU|1.369|
|int8 small.en| CUDA|0.722|
|float32 medium.en| CPU|5.515|
|float32 medium.en| CUDA|0.675|
|int8 medium.en| CPU|4.269|
|int8 medium.en| CUDA|1.867|
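
RTF is the processing time divided by the duration of the audio, so a value below 1 means decoding runs faster than real time. The following is a minimal sketch that recomputes the CUDA-over-CPU speedup from the table above. The dictionary simply copies the table; the function names `real_time_factor` and `cuda_speedup` are illustrative and not part of sherpa-onnx.

```python
# RTF values copied from the table above: (precision, model) -> {device: RTF}.
RTF = {
    ("float32", "tiny.en"): {"CPU": 0.208, "CUDA": 0.186},
    ("int8", "tiny.en"): {"CPU": 0.142, "CUDA": 0.146},
    ("float32", "base.en"): {"CPU": 0.430, "CUDA": 0.129},
    ("int8", "base.en"): {"CPU": 0.332, "CUDA": 0.259},
    ("float32", "small.en"): {"CPU": 1.633, "CUDA": 0.268},
    ("int8", "small.en"): {"CPU": 1.369, "CUDA": 0.722},
    ("float32", "medium.en"): {"CPU": 5.515, "CUDA": 0.675},
    ("int8", "medium.en"): {"CPU": 4.269, "CUDA": 1.867},
}


def real_time_factor(elapsed_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return elapsed_seconds / audio_seconds


def cuda_speedup(precision: str, model: str) -> float:
    """Ratio of CPU RTF to CUDA RTF for one model variant (hypothetical helper)."""
    row = RTF[(precision, model)]
    return row["CPU"] / row["CUDA"]


for (precision, model) in RTF:
    print(f"{precision:8s} {model:10s} CUDA speedup: {cuda_speedup(precision, model):.2f}x")
```

In these measurements the int8 tiny.en model gains nothing from CUDA (the ratio is slightly below 1), and from base.en upward the float32 models have a lower RTF on CUDA than the int8 models do.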


## Install sherpa-onnx

First we need to install [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx).

Please refer to
https://k2-fsa.github.io/sherpa/onnx/install/index.html
for other installation methods.

In [1]:
# ! pip install sherpa-onnx   # <--- This installs a CPU-only version

%%shell

git clone https://github.com/k2-fsa/sherpa-onnx
cd sherpa-onnx
mkdir build
cd build
cmake \
  -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=ON \
  -DSHERPA_ONNX_ENABLE_GPU=ON \
  -DCMAKE_INSTALL_PREFIX=./install \
  ..

make -j
make install

Cloning into 'sherpa-onnx'...
remote: Enumerating objects: 2497, done.
remote: Counting objects: 100% (969/969), done.
remote: Compressing objects: 100% (385/385), done.
remote: Total 2497 (delta 692), reused 678 (delta 573), pack-reused 1528
Receiving objects: 100% (2497/2497), 1.84 MiB | 10.34 MiB/s, done.
Resolving deltas: 100% (1430/1430), done.
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
  Compiling for NVIDIA GPU is enabled.  Please make sure cudatoolkit

  is installed on your system.  Otherwise, you will get error



Check that [sherpa-onnx](https://github.com/k2-fsa/sherpa-onnx) has been successfully installed:

In [2]:
! ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --help

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:PrintUsage:402 

Usage:

(1) Transducer from icefall

  ./bin/sherpa-onnx-offline \
    --tokens=/path/to/tokens.txt \
    --encoder=/path/to/encoder.onnx \
    --decoder=/path/to/decoder.onnx \
    --joiner=/path/to/joiner.onnx \
    --num-threads=1 \
    --decoding-method=greedy_search \
    /path/to/foo.wav [bar.wav foobar.wav ...]


(2) Paraformer from FunASR

  ./bin/sherpa-onnx-offline \
    --tokens=/path/to/tokens.txt \
    --paraformer=/path/to/model.onnx \
    --num-threads=1 \
    --decoding-method=greedy_search \
    /path/to/foo.wav [bar.wav foobar.wav ...]

(3) Whisper models

  ./bin/sherpa-onnx-offline \
    --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \
    --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \
    --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \
    --num-threads=1 \
    /path/to/foo.wav [bar.wav foobar.wav ...]


Note: It supports decod

## tiny.en

The models are hosted at
https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en

and we need `git-lfs` to download them.

In [3]:
! git lfs install

Git LFS initialized.


In [4]:
! git clone https://huggingface.co/csukuangfj/sherpa-onnx-whisper-tiny.en

Cloning into 'sherpa-onnx-whisper-tiny.en'...
remote: Enumerating objects: 30, done.
remote: Counting objects: 100% (30/30), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 30 (delta 3), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (30/30), 1023.85 KiB | 9.39 MiB/s, done.
Filtering content: 100% (8/8), 676.48 MiB | 68.58 MiB/s, done.


In [5]:
! ls -lh sherpa-onnx-whisper-tiny.en

total 678M
-rw-r--r-- 1 root root  259 Aug  7 14:06 README.md
drwxr-xr-x 2 root root 4.0K Aug  7 14:06 test_wavs
-rw-r--r-- 1 root root 105M Aug  7 14:06 tiny.en-decoder.int8.onnx
-rw-r--r-- 1 root root 105M Aug  7 14:06 tiny.en-decoder.int8.ort
-rw-r--r-- 1 root root 186M Aug  7 14:06 tiny.en-decoder.onnx
-rw-r--r-- 1 root root 186M Aug  7 14:06 tiny.en-decoder.ort
-rw-r--r-- 1 root root  13M Aug  7 14:06 tiny.en-encoder.int8.onnx
-rw-r--r-- 1 root root  13M Aug  7 14:06 tiny.en-encoder.int8.ort
-rw-r--r-- 1 root root  36M Aug  7 14:06 tiny.en-encoder.onnx
-rw-r--r-- 1 root root  36M Aug  7 14:06 tiny.en-encoder.ort
-rw-r--r-- 1 root root 816K Aug  7 14:06 tiny.en-tokens.txt
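
The `.onnx` files listed above are tens to hundreds of megabytes. If a repository is cloned without Git LFS set up, the working tree contains small text pointer files in their place. The sketch below checks for that case; it relies on the fact that a Git LFS pointer file begins with the line `version https://git-lfs.github.com/spec/v1`. The helper names `is_lfs_pointer` and `find_lfs_pointers` are hypothetical.

```python
from pathlib import Path

# First bytes of a Git LFS pointer file, per the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file looks like an LFS pointer instead of real data."""
    with path.open("rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def find_lfs_pointers(model_dir: str, pattern: str = "*.onnx") -> list[Path]:
    """List files in model_dir matching pattern that are still LFS pointers."""
    return sorted(p for p in Path(model_dir).glob(pattern) if is_lfs_pointer(p))


if __name__ == "__main__":
    # An empty list means every matching file holds real model data.
    pointers = find_lfs_pointers("sherpa-onnx-whisper-tiny.en")
    print("unresolved LFS pointers:", [p.name for p in pointers])
```

If the list is not empty, running `git lfs pull` inside the cloned directory should fetch the missing files.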


### float32 onnx models

We can use `sherpa-onnx-offline` to run them on CPU:

In [6]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx", decoder="./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx"), tokens="./sherp
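
The configuration logged above shows `sampling_rate=16000`, while one of the test files is named `8k.wav`, which suggests a different sample rate. A minimal sketch for inspecting the header of each test file with Python's standard `wave` module follows. The `wav_info` helper is hypothetical, and it assumes the files are plain PCM WAV. The duration it reports is also the denominator of the RTF.

```python
import os
import wave


def wav_info(path: str) -> dict:
    """Read basic header fields from a PCM WAV file using the standard library."""
    with wave.open(path, "rb") as w:
        rate = w.getframerate()
        frames = w.getnframes()
        return {
            "sample_rate": rate,
            "channels": w.getnchannels(),
            "duration_seconds": frames / rate,
        }


if __name__ == "__main__":
    for name in ("0.wav", "1.wav", "8k.wav"):
        path = f"./sherpa-onnx-whisper-tiny.en/test_wavs/{name}"
        if os.path.exists(path):  # skip quietly when the model repo is not cloned
            print(name, wav_info(path))
```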



To use CUDA, please provide the argument `--provider=cuda`:

In [7]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.onnx", decoder="./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.onnx"), tokens="./sher
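
The commands in this notebook differ only in the model name, the precision (float32 or int8), the provider, and the thread count. As a sketch, the argument list can be generated from those four values. The `whisper_command` helper is hypothetical; the flags and paths are the ones used in the cells of this notebook.

```python
import shlex

BIN = "./sherpa-onnx/build/install/bin/sherpa-onnx-offline"
TEST_WAVS = ("0.wav", "1.wav", "8k.wav")


def whisper_command(model: str, int8: bool = False, provider: str = "cpu",
                    num_threads: int = 1) -> list[str]:
    """Build the sherpa-onnx-offline argument list used throughout this notebook.

    `model` is a name such as "tiny.en"; directory and file names follow the
    layout of the repositories cloned in this notebook.
    """
    repo = f"./sherpa-onnx-whisper-{model}"
    suffix = ".int8.onnx" if int8 else ".onnx"
    return [
        BIN,
        f"--whisper-encoder={repo}/{model}-encoder{suffix}",
        f"--whisper-decoder={repo}/{model}-decoder{suffix}",
        f"--tokens={repo}/{model}-tokens.txt",
        f"--num-threads={num_threads}",
        f"--provider={provider}",
        *[f"{repo}/test_wavs/{wav}" for wav in TEST_WAVS],
    ]


cmd = whisper_command("tiny.en", int8=True, provider="cuda")
print(shlex.join(cmd))
# To execute it from Python: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting issues; `shlex.join` is used here only to print a copy-pasteable command line.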



### int8 onnx models

Run with CPU:

In [8]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onn



Run with CUDA:

In [9]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/1.wav ./sherpa-onnx-whisper-tiny.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.on



## base.en

In [10]:
! git clone https://huggingface.co/csukuangfj/sherpa-onnx-whisper-base.en

Cloning into 'sherpa-onnx-whisper-base.en'...
remote: Enumerating objects: 27, done.
remote: Counting objects:  81% (22/27)

In [11]:
! ls -lh sherpa-onnx-whisper-base.en

total 1.1G
-rw-r--r-- 1 root root 150M Aug  7 14:07 base.en-decoder.int8.onnx
-rw-r--r-- 1 root root 151M Aug  7 14:07 base.en-decoder.int8.ort
-rw-r--r-- 1 root root 289M Aug  7 14:07 base.en-decoder.onnx
-rw-r--r-- 1 root root 290M Aug  7 14:07 base.en-decoder.ort
-rw-r--r-- 1 root root  28M Aug  7 14:07 base.en-encoder.int8.onnx
-rw-r--r-- 1 root root  28M Aug  7 14:07 base.en-encoder.int8.ort
-rw-r--r-- 1 root root  91M Aug  7 14:07 base.en-encoder.onnx
-rw-r--r-- 1 root root  91M Aug  7 14:07 base.en-encoder.ort
-rw-r--r-- 1 root root 816K Aug  7 14:07 base.en-tokens.txt
-rw-r--r-- 1 root root  259 Aug  7 14:07 README.md
drwxr-xr-x 2 root root 4.0K Aug  7 14:07 test_wavs


### float32 onnx models

Run with CPU:

In [12]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-base.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.onnx --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-base.en/test_wavs/0.wav ./sherpa-onnx-whisper-base.en/test_wavs/1.wav ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-base.en/base.en-encoder.onnx", decoder="./sherpa-onnx-whisper-base.en/base.en-decoder.onnx"), tokens="./sherp



Run with CUDA:

In [13]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-base.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.onnx --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-base.en/test_wavs/0.wav ./sherpa-onnx-whisper-base.en/test_wavs/1.wav ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-base.en/base.en-encoder.onnx", decoder="./sherpa-onnx-whisper-base.en/base.en-decoder.onnx"), tokens="./sher



### int8 onnx models

Run with CPU:

In [14]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-base.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-base.en/test_wavs/0.wav ./sherpa-onnx-whisper-base.en/test_wavs/1.wav ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onn



Run with CUDA:

In [15]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-base.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-base.en/base.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-base.en/base.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-base.en/test_wavs/0.wav ./sherpa-onnx-whisper-base.en/test_wavs/1.wav ./sherpa-onnx-whisper-base.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-base.en/base.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-base.en/base.en-decoder.int8.on



## small.en

In [16]:
! git clone https://huggingface.co/csukuangfj/sherpa-onnx-whisper-small.en

Cloning into 'sherpa-onnx-whisper-small.en'...
remote: Enumerating objects: 27, done.
remote: Counting objects: 100% (27/27), done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 27 (delta 2), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (27/27), 1023.53 KiB | 9.84 MiB/s, done.
Filtering content: 100% (8/8), 2.87 GiB | 57.06 MiB/s, done.


In [17]:
! ls -lh sherpa-onnx-whisper-small.en

total 2.9G
-rw-r--r-- 1 root root  260 Aug  7 14:08 README.md
-rw-r--r-- 1 root root 288M Aug  7 14:08 small.en-decoder.int8.onnx
-rw-r--r-- 1 root root 289M Aug  7 14:08 small.en-decoder.int8.ort
-rw-r--r-- 1 root root 686M Aug  7 14:09 small.en-decoder.onnx
-rw-r--r-- 1 root root 686M Aug  7 14:09 small.en-decoder.ort
-rw-r--r-- 1 root root 108M Aug  7 14:08 small.en-encoder.int8.onnx
-rw-r--r-- 1 root root 108M Aug  7 14:08 small.en-encoder.int8.ort
-rw-r--r-- 1 root root 391M Aug  7 14:08 small.en-encoder.onnx
-rw-r--r-- 1 root root 391M Aug  7 14:08 small.en-encoder.ort
-rw-r--r-- 1 root root 816K Aug  7 14:08 small.en-tokens.txt
drwxr-xr-x 2 root root 4.0K Aug  7 14:08 test_wavs


### float32 onnx models

Run with CPU:

In [18]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-small.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.onnx --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-small.en/test_wavs/0.wav ./sherpa-onnx-whisper-small.en/test_wavs/1.wav ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-small.en/small.en-encoder.onnx", decoder="./sherpa-onnx-whisper-small.en/small.en-decoder.onnx"), to



Run with CUDA:

In [19]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-small.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.onnx --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-small.en/test_wavs/0.wav ./sherpa-onnx-whisper-small.en/test_wavs/1.wav ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-small.en/small.en-encoder.onnx", decoder="./sherpa-onnx-whisper-small.en/small.en-decoder.onnx"), t



### int8 onnx models

Run with CPU:

In [20]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-small.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-small.en/test_wavs/0.wav ./sherpa-onnx-whisper-small.en/test_wavs/1.wav ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-small.en/small.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-small.en/small.en-dec



Run with CUDA:

In [21]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-small.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-small.en/small.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-small.en/small.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-small.en/small.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-small.en/test_wavs/0.wav ./sherpa-onnx-whisper-small.en/test_wavs/1.wav ./sherpa-onnx-whisper-small.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-small.en/small.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-small.en/small.en-de



## medium.en

In [22]:
! rm -rf sherpa-onnx-whisper-small.en

In [23]:
! git clone https://huggingface.co/csukuangfj/sherpa-onnx-whisper-medium.en

Cloning into 'sherpa-onnx-whisper-medium.en'...
remote: Enumerating objects: 24, done.
remote: Counting objects: 100% (24/24), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 24 (delta 1), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (24/24), 1023.23 KiB | 7.31 MiB/s, done.
Filtering content: 100% (8/8), 7.95 GiB | 53.58 MiB/s, done.
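
The clone above reports 7.95 GiB of content, and the small.en directory was removed beforehand, presumably to free disk space (that motive is an assumption; the notebook does not state it). A minimal sketch for checking free space before such a download, using only the standard library, is shown below. The `has_room_for` helper and its 1 GiB safety margin are illustrative choices.

```python
import shutil

GIB = 1024 ** 3


def has_room_for(download_gib: float, path: str = ".", margin_gib: float = 1.0) -> bool:
    """Return True if `path` has at least download_gib + margin_gib free."""
    free_gib = shutil.disk_usage(path).free / GIB
    return free_gib >= download_gib + margin_gib


if __name__ == "__main__":
    # 7.95 GiB is the size reported by the medium.en clone in this notebook.
    print("enough space for medium.en:", has_room_for(7.95))
```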


In [24]:
%%shell
cd sherpa-onnx-whisper-medium.en/
git lfs pull --include "*.onnx"




### float32 onnx models

Run with CPU:

In [25]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.onnx --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-medium.en/medium.en-encoder.onnx", decoder="./sherpa-onnx-whisper-medium.en/medium.en-decod



Run with CUDA:

In [26]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.onnx \
  --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.onnx --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.onnx --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-medium.en/medium.en-encoder.onnx", decoder="./sherpa-onnx-whisper-medium.en/medium.en-deco



### int8 onnx models

Run with CPU:

In [27]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt \
  --num-threads=1 \
  --provider=cpu \
  ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt --num-threads=1 --provider=cpu ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-medium.en/



Run with CUDA:

In [28]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt \
  --num-threads=1 \
  --provider=cuda \
  ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt --num-threads=1 --provider=cuda ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-medium.en



Run with CUDA using 2 threads:

In [29]:
%%shell

./sherpa-onnx/build/install/bin/sherpa-onnx-offline \
  --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx \
  --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.int8.onnx \
  --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt \
  --num-threads=2 \
  --provider=cuda \
  ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav \
  ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav

/content/sherpa-onnx/sherpa-onnx/csrc/parse-options.cc:Read:361 ./sherpa-onnx/build/install/bin/sherpa-onnx-offline --whisper-encoder=./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx --whisper-decoder=./sherpa-onnx-whisper-medium.en/medium.en-decoder.int8.onnx --tokens=./sherpa-onnx-whisper-medium.en/medium.en-tokens.txt --num-threads=2 --provider=cuda ./sherpa-onnx-whisper-medium.en/test_wavs/0.wav ./sherpa-onnx-whisper-medium.en/test_wavs/1.wav ./sherpa-onnx-whisper-medium.en/test_wavs/8k.wav 

OfflineRecognizerConfig(feat_config=OfflineFeatureExtractorConfig(sampling_rate=16000, feature_dim=80), model_config=OfflineModelConfig(transducer=OfflineTransducerModelConfig(encoder_filename="", decoder_filename="", joiner_filename=""), paraformer=OfflineParaformerModelConfig(model=""), nemo_ctc=OfflineNemoEncDecCtcModelConfig(model=""), whisper=OfflineWhisperModelConfig(encoder="./sherpa-onnx-whisper-medium.en/medium.en-encoder.int8.onnx", decoder="./sherpa-onnx-whisper-medium.en

