Conformer-transducer-based Models

Hint

Please install sherpa-onnx first (see install_sherpa_onnx) before you read this section.

csukuangfj/sherpa-onnx-conformer-zh-stateless2-2023-05-23 (Chinese)

This model is converted from

https://huggingface.co/luomingshuang/icefall_asr_wenetspeech_pruned_transducer_stateless2

which supports only Chinese as it is trained on the WenetSpeech corpus.

You can find the training code at

https://github.com/k2-fsa/icefall/tree/master/egs/wenetspeech/ASR/pruned_transducer_stateless2

In the following, we describe how to download it and use it with sherpa-onnx.

Download the model

Please use the following commands to download it.

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-zh-stateless2-2023-05-23.tar.bz2

# Users in China can use the following mirror instead
# wget https://hub.nuaa.cf/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-zh-stateless2-2023-05-23.tar.bz2

tar xvf sherpa-onnx-conformer-zh-stateless2-2023-05-23.tar.bz2
rm sherpa-onnx-conformer-zh-stateless2-2023-05-23.tar.bz2

Please check that the file sizes of the pre-trained models are correct. See the file sizes of *.onnx files below.

sherpa-onnx-conformer-zh-stateless2-2023-05-23 fangjun$ ls -lh *.onnx
-rw-r--r--  1 fangjun  staff    11M May 23 15:29 decoder-epoch-99-avg-1.int8.onnx
-rw-r--r--  1 fangjun  staff    12M May 23 15:29 decoder-epoch-99-avg-1.onnx
-rw-r--r--  1 fangjun  staff   122M May 23 15:30 encoder-epoch-99-avg-1.int8.onnx
-rw-r--r--  1 fangjun  staff   315M May 23 15:31 encoder-epoch-99-avg-1.onnx
-rw-r--r--  1 fangjun  staff   2.7M May 23 15:29 joiner-epoch-99-avg-1.int8.onnx
-rw-r--r--  1 fangjun  staff    11M May 23 15:29 joiner-epoch-99-avg-1.onnx
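
The check above can also be scripted. The following is a minimal sketch, not part of sherpa-onnx: `check_onnx_files` is a hypothetical helper that only confirms the six files from the listing exist and are non-empty, and prints each size in bytes so that it can be compared with the listing by eye.

```shell
#!/usr/bin/env bash
# Sketch: confirm that the six model files listed above exist and are non-empty.
# check_onnx_files is a hypothetical helper, not a sherpa-onnx command.
check_onnx_files() {
  local dir="$1" status=0 part suffix f
  for part in encoder decoder joiner; do
    for suffix in onnx int8.onnx; do
      f="$dir/$part-epoch-99-avg-1.$suffix"
      if [ -s "$f" ]; then
        # wc -c reports the size in bytes; compare it with the listing above.
        echo "ok      $f ($(wc -c < "$f" | tr -d ' ') bytes)"
      else
        echo "missing $f" >&2
        status=1
      fi
    done
  done
  return "$status"
}

# Usage (from /path/to/sherpa-onnx, after extracting the archive):
#   check_onnx_files ./sherpa-onnx-conformer-zh-stateless2-2023-05-23
```

A non-zero exit status means at least one file is missing or empty, which usually points to an interrupted download.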

Decode wave files

Hint

Only single-channel (mono) wave files with 16-bit encoded samples can be decoded. The sampling rate does not need to be 16 kHz.
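
To check a file against these requirements before decoding it, one option is to read the header fields directly. The sketch below is an illustration only and is not how sherpa-onnx itself validates input. It assumes a canonical WAV header in which the `fmt ` chunk immediately follows the RIFF header, so files with extra chunks in front of `fmt ` will be misread. The helper names `wav_info` and `wav_is_supported` are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: read channel count, sampling rate and bit depth from a canonical WAV header.
# Assumption: the "fmt " chunk starts at byte 12, so the fields sit at bytes 22..35.
wav_info() {
  local f="$1" channels rate bits
  # Bytes 0..3 must be "RIFF" and bytes 8..11 must be "WAVE".
  [ "$(dd if="$f" bs=1 count=4 2>/dev/null)" = "RIFF" ] &&
    [ "$(dd if="$f" bs=1 skip=8 count=4 2>/dev/null)" = "WAVE" ] ||
    { echo "$f: not a RIFF/WAVE file" >&2; return 1; }
  # 14 bytes: channels(2) rate(4) byte-rate(4) block-align(2) bits-per-sample(2)
  set -- $(od -An -t u1 -j 22 -N 14 "$f")
  [ "$#" -eq 14 ] || { echo "$f: header too short" >&2; return 1; }
  # All fields are little-endian; combine the individual bytes by hand.
  channels=$(( $1 + $2 * 256 ))
  rate=$(( $3 + $4 * 256 + $5 * 65536 + $6 * 16777216 ))
  bits=$(( ${13} + ${14} * 256 ))
  echo "channels=$channels sample_rate=$rate bits=$bits"
}

# Succeeds only for single-channel, 16-bit files; any sampling rate is accepted.
wav_is_supported() {
  local info
  info=$(wav_info "$1") || return 1
  case "$info" in
    "channels=1 "*" bits=16") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   wav_info ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/0.wav
```

Reading single bytes with `od -t u1` keeps the sketch independent of the byte order of the machine it runs on.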

fp32

The following code shows how to use fp32 models to decode wave files:

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/tokens.txt \
  --encoder=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/joiner-epoch-99-avg-1.onnx \
  ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/0.wav \
  ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/1.wav \
  ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/2.wav

Note

Please use ./build/bin/Release/sherpa-onnx-offline.exe for Windows.

You should see the following output:

./code-conformer/sherpa-onnx-conformer-zh-stateless2-2023-05-23.txt

Caution

If you use Windows and get encoding issues, please run:

CHCP 65001

in your command line.

int8

The following code shows how to use int8 models to decode wave files:

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/tokens.txt \
  --encoder=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/encoder-epoch-99-avg-1.int8.onnx \
  --decoder=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/joiner-epoch-99-avg-1.int8.onnx \
  ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/0.wav \
  ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/1.wav \
  ./sherpa-onnx-conformer-zh-stateless2-2023-05-23/test_wavs/2.wav

Note

Please use ./build/bin/Release/sherpa-onnx-offline.exe for Windows.

Caution

The command above uses the fp32 decoder model (decoder-epoch-99-avg-1.onnx). Only the encoder and the joiner are int8 models.

You should see the following output:

./code-conformer/sherpa-onnx-conformer-zh-stateless2-2023-05-23.int8.txt
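
The fp32 and int8 commands above differ only in the file names passed to `--encoder` and `--joiner`. The following sketch makes that difference explicit. `model_flags` is a hypothetical helper introduced here for illustration; the flags and file names it prints are the ones used in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: print the four model flags for a given precision, one per line.
# model_flags is a hypothetical helper, not part of sherpa-onnx.
# PRECISION is "fp32" or "int8"; the decoder always stays fp32, as in the commands above.
model_flags() {
  local dir="$1" precision="${2:-fp32}" ext
  case "$precision" in
    fp32) ext="onnx" ;;
    int8) ext="int8.onnx" ;;
    *) echo "unknown precision: $precision" >&2; return 1 ;;
  esac
  printf '%s\n' \
    "--tokens=$dir/tokens.txt" \
    "--encoder=$dir/encoder-epoch-99-avg-1.$ext" \
    "--decoder=$dir/decoder-epoch-99-avg-1.onnx" \
    "--joiner=$dir/joiner-epoch-99-avg-1.$ext"
}

# Usage (relies on word splitting, so the model path must not contain spaces):
#   d=./sherpa-onnx-conformer-zh-stateless2-2023-05-23
#   ./build/bin/sherpa-onnx-offline $(model_flags "$d" int8) "$d/test_wavs/0.wav"
```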

Caution

If you use Windows and get encoding issues, please run:

CHCP 65001

in your command line.

Speech recognition from a microphone

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-microphone-offline \
  --tokens=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/tokens.txt \
  --encoder=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-zh-stateless2-2023-05-23/joiner-epoch-99-avg-1.onnx

Caution

If you use Windows and get encoding issues, please run:

CHCP 65001

in your command line.

csukuangfj/sherpa-onnx-conformer-zh-2023-05-23 (Chinese)

This model is converted from

https://huggingface.co/luomingshuang/icefall_asr_wenetspeech_pruned_transducer_stateless5_offline

which supports only Chinese as it is trained on the WenetSpeech corpus.

You can find the training code at

https://github.com/k2-fsa/icefall/tree/master/egs/wenetspeech/ASR/pruned_transducer_stateless5

In the following, we describe how to download it and use it with sherpa-onnx.

Download the model

Please use the following commands to download it.

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-zh-2023-05-23.tar.bz2

# Users in China can use the following mirror instead
# wget https://hub.nuaa.cf/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-zh-2023-05-23.tar.bz2

tar xvf sherpa-onnx-conformer-zh-2023-05-23.tar.bz2
rm sherpa-onnx-conformer-zh-2023-05-23.tar.bz2

Please check that the file sizes of the pre-trained models are correct. See the file sizes of *.onnx files below.

sherpa-onnx-conformer-zh-2023-05-23 fangjun$ ls -lh *.onnx
-rw-r--r--  1 fangjun  staff    11M May 23 13:45 decoder-epoch-99-avg-1.int8.onnx
-rw-r--r--  1 fangjun  staff    12M May 23 13:45 decoder-epoch-99-avg-1.onnx
-rw-r--r--  1 fangjun  staff   129M May 23 13:47 encoder-epoch-99-avg-1.int8.onnx
-rw-r--r--  1 fangjun  staff   345M May 23 13:48 encoder-epoch-99-avg-1.onnx
-rw-r--r--  1 fangjun  staff   2.7M May 23 13:45 joiner-epoch-99-avg-1.int8.onnx
-rw-r--r--  1 fangjun  staff    11M May 23 13:45 joiner-epoch-99-avg-1.onnx

Decode wave files

Hint

Only single-channel (mono) wave files with 16-bit encoded samples can be decoded. The sampling rate does not need to be 16 kHz.

fp32

The following code shows how to use fp32 models to decode wave files:

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-conformer-zh-2023-05-23/tokens.txt \
  --encoder=./sherpa-onnx-conformer-zh-2023-05-23/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-conformer-zh-2023-05-23/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-zh-2023-05-23/joiner-epoch-99-avg-1.onnx \
  ./sherpa-onnx-conformer-zh-2023-05-23/test_wavs/0.wav \
  ./sherpa-onnx-conformer-zh-2023-05-23/test_wavs/1.wav \
  ./sherpa-onnx-conformer-zh-2023-05-23/test_wavs/2.wav

Note

Please use ./build/bin/Release/sherpa-onnx-offline.exe for Windows.

You should see the following output:

./code-conformer/sherpa-onnx-conformer-zh-2023-05-23.txt
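
The fp32 command above lists three test files by hand. As a sketch, the same invocation can be wrapped so that it picks up every `.wav` file in the model's `test_wavs` directory. `decode_all` is a hypothetical helper; it passes all files in a single invocation, as the command above does with three files, and takes the path of the decoder binary as its first argument.

```shell
#!/usr/bin/env bash
# Sketch: decode every .wav file under DIR/test_wavs with the fp32 models.
# decode_all is a hypothetical helper, not part of sherpa-onnx.
#   $1: path to the sherpa-onnx-offline binary
#   $2: extracted model directory
decode_all() {
  local bin="$1" dir="$2"
  # Replace the positional parameters with the matching wave files.
  set -- "$dir"/test_wavs/*.wav
  # An unmatched pattern stays literal, so check that the first entry exists.
  [ -e "$1" ] || { echo "no wave files in $dir/test_wavs" >&2; return 1; }
  "$bin" \
    --tokens="$dir/tokens.txt" \
    --encoder="$dir/encoder-epoch-99-avg-1.onnx" \
    --decoder="$dir/decoder-epoch-99-avg-1.onnx" \
    --joiner="$dir/joiner-epoch-99-avg-1.onnx" \
    "$@"
}

# Usage (from /path/to/sherpa-onnx):
#   decode_all ./build/bin/sherpa-onnx-offline ./sherpa-onnx-conformer-zh-2023-05-23
```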

Caution

If you use Windows and get encoding issues, please run:

CHCP 65001

in your command line.

int8

The following code shows how to use int8 models to decode wave files:

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-conformer-zh-2023-05-23/tokens.txt \
  --encoder=./sherpa-onnx-conformer-zh-2023-05-23/encoder-epoch-99-avg-1.int8.onnx \
  --decoder=./sherpa-onnx-conformer-zh-2023-05-23/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-zh-2023-05-23/joiner-epoch-99-avg-1.int8.onnx \
  ./sherpa-onnx-conformer-zh-2023-05-23/test_wavs/0.wav \
  ./sherpa-onnx-conformer-zh-2023-05-23/test_wavs/1.wav \
  ./sherpa-onnx-conformer-zh-2023-05-23/test_wavs/2.wav

Note

Please use ./build/bin/Release/sherpa-onnx-offline.exe for Windows.

Caution

The command above uses the fp32 decoder model (decoder-epoch-99-avg-1.onnx). Only the encoder and the joiner are int8 models.

You should see the following output:

./code-conformer/sherpa-onnx-conformer-zh-2023-05-23.int8.txt

Caution

If you use Windows and get encoding issues, please run:

CHCP 65001

in your command line.

Speech recognition from a microphone

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-microphone-offline \
  --tokens=./sherpa-onnx-conformer-zh-2023-05-23/tokens.txt \
  --encoder=./sherpa-onnx-conformer-zh-2023-05-23/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-conformer-zh-2023-05-23/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-zh-2023-05-23/joiner-epoch-99-avg-1.onnx

Caution

If you use Windows and get encoding issues, please run:

CHCP 65001

in your command line.

csukuangfj/sherpa-onnx-conformer-en-2023-03-18 (English)

This model is converted from

https://huggingface.co/csukuangfj/icefall-asr-librispeech-pruned-transducer-stateless3-2022-05-13

which supports only English as it is trained on the LibriSpeech corpus.

You can find the training code at

https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/pruned_transducer_stateless3

In the following, we describe how to download it and use it with sherpa-onnx.

Download the model

Please use the following commands to download it.

cd /path/to/sherpa-onnx

wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-en-2023-03-18.tar.bz2

# Users in China can use the following mirror instead
# wget https://hub.nuaa.cf/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-conformer-en-2023-03-18.tar.bz2

tar xvf sherpa-onnx-conformer-en-2023-03-18.tar.bz2
rm sherpa-onnx-conformer-en-2023-03-18.tar.bz2
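
The download steps above can be combined into one function that falls back to the mirror when the GitHub download fails. This is only a sketch: `model_urls` and `download_model` are hypothetical helpers, and the two URLs are the ones shown in the commands above. Whether the mirror is reachable from your network is not something the sketch can guarantee.

```shell
#!/usr/bin/env bash
# Sketch: download a model archive, trying the GitHub release first and the mirror second.
# model_urls and download_model are hypothetical helpers, not part of sherpa-onnx.

# Print the GitHub release URL and the mirror URL for NAME.tar.bz2, one per line.
model_urls() {
  local name="$1" path="k2-fsa/sherpa-onnx/releases/download/asr-models"
  printf '%s\n' \
    "https://github.com/$path/$name.tar.bz2" \
    "https://hub.nuaa.cf/$path/$name.tar.bz2"
}

# Try each URL in turn, then extract the archive and remove it.
download_model() {
  local name="$1" url
  for url in $(model_urls "$name"); do
    # -O overwrites any partial file left behind by a failed attempt.
    if wget -O "$name.tar.bz2" "$url"; then
      tar xvf "$name.tar.bz2" && rm "$name.tar.bz2"
      return
    fi
  done
  echo "could not download $name" >&2
  return 1
}

# Usage (from /path/to/sherpa-onnx):
#   download_model sherpa-onnx-conformer-en-2023-03-18
```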

Please check that the file sizes of the pre-trained models are correct. See the file sizes of *.onnx files below.

sherpa-onnx-conformer-en-2023-03-18$ ls -lh *.onnx
-rw-r--r-- 1 kuangfangjun root  1.3M Apr  1 07:02 decoder-epoch-99-avg-1.int8.onnx
-rw-r--r-- 1 kuangfangjun root  2.0M Apr  1 07:02 decoder-epoch-99-avg-1.onnx
-rw-r--r-- 1 kuangfangjun root  122M Apr  1 07:02 encoder-epoch-99-avg-1.int8.onnx
-rw-r--r-- 1 kuangfangjun root  315M Apr  1 07:02 encoder-epoch-99-avg-1.onnx
-rw-r--r-- 1 kuangfangjun root  254K Apr  1 07:02 joiner-epoch-99-avg-1.int8.onnx
-rw-r--r-- 1 kuangfangjun root 1003K Apr  1 07:02 joiner-epoch-99-avg-1.onnx

Decode wave files

Hint

Only single-channel (mono) wave files with 16-bit encoded samples can be decoded. The sampling rate does not need to be 16 kHz.

fp32

The following code shows how to use fp32 models to decode wave files:

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-conformer-en-2023-03-18/tokens.txt \
  --encoder=./sherpa-onnx-conformer-en-2023-03-18/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-conformer-en-2023-03-18/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-en-2023-03-18/joiner-epoch-99-avg-1.onnx \
  ./sherpa-onnx-conformer-en-2023-03-18/test_wavs/0.wav \
  ./sherpa-onnx-conformer-en-2023-03-18/test_wavs/1.wav \
  ./sherpa-onnx-conformer-en-2023-03-18/test_wavs/8k.wav

Note

Please use ./build/bin/Release/sherpa-onnx-offline.exe for Windows.

You should see the following output:

./code-conformer/sherpa-onnx-conformer-en-2023-03-18.txt

int8

The following code shows how to use int8 models to decode wave files:

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-offline \
  --tokens=./sherpa-onnx-conformer-en-2023-03-18/tokens.txt \
  --encoder=./sherpa-onnx-conformer-en-2023-03-18/encoder-epoch-99-avg-1.int8.onnx \
  --decoder=./sherpa-onnx-conformer-en-2023-03-18/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-en-2023-03-18/joiner-epoch-99-avg-1.int8.onnx \
  ./sherpa-onnx-conformer-en-2023-03-18/test_wavs/0.wav \
  ./sherpa-onnx-conformer-en-2023-03-18/test_wavs/1.wav \
  ./sherpa-onnx-conformer-en-2023-03-18/test_wavs/8k.wav

Note

Please use ./build/bin/Release/sherpa-onnx-offline.exe for Windows.

You should see the following output:

./code-conformer/sherpa-onnx-conformer-en-2023-03-18-int8.txt

Speech recognition from a microphone

cd /path/to/sherpa-onnx

./build/bin/sherpa-onnx-microphone-offline \
  --tokens=./sherpa-onnx-conformer-en-2023-03-18/tokens.txt \
  --encoder=./sherpa-onnx-conformer-en-2023-03-18/encoder-epoch-99-avg-1.onnx \
  --decoder=./sherpa-onnx-conformer-en-2023-03-18/decoder-epoch-99-avg-1.onnx \
  --joiner=./sherpa-onnx-conformer-en-2023-03-18/joiner-epoch-99-avg-1.onnx