Hint
Please refer to install_sherpa_ncnn
to install sherpa-ncnn before you read this section.
This model is a small version of the lstm-transducer trained in icefall.
It has only 13.3 million parameters,
so it can be deployed on embedded devices
for real-time speech recognition. The models are available in fp16
format at https://huggingface.co/marcoyang/sherpa-ncnn-lstm-transducer-small-2023-02-13.
The model is trained on the bilingual dataset tal_csasr
(Chinese + English), so it can recognize both Chinese and English.
In the following, we show you how to download it and deploy it with sherpa-ncnn.
Please use the following commands to download it.
cd /path/to/sherpa-ncnn
wget https://github.com/k2-fsa/sherpa-ncnn/releases/download/models/sherpa-ncnn-lstm-transducer-small-2023-02-13.tar.bz2
tar xvf sherpa-ncnn-lstm-transducer-small-2023-02-13.tar.bz2
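After extracting the archive, it can help to confirm that the files passed to sherpa-ncnn in this section are actually present. The following is a minimal sketch: check_model_dir is a hypothetical helper, not part of sherpa-ncnn, and its file list is copied from the decoding command used in this section.

```shell
# check_model_dir is a hypothetical helper, not part of sherpa-ncnn.
# It prints one line per expected file that is missing from the given
# directory and returns non-zero if anything is missing.
check_model_dir() {
  model_dir=$1
  rc=0
  for f in tokens.txt \
    encoder_jit_trace-pnnx.ncnn.param encoder_jit_trace-pnnx.ncnn.bin \
    decoder_jit_trace-pnnx.ncnn.param decoder_jit_trace-pnnx.ncnn.bin \
    joiner_jit_trace-pnnx.ncnn.param joiner_jit_trace-pnnx.ncnn.bin; do
    if [ ! -f "$model_dir/$f" ]; then
      echo "missing: $model_dir/$f"
      rc=1
    fi
  done
  return $rc
}

# Usage (not run here):
# check_model_dir ./sherpa-ncnn-lstm-transducer-small-2023-02-13
```

The helper prints nothing when all seven files exist, so it can be used as a guard before running the decoder.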
Note
Please refer to sherpa-ncnn-embedded-linux-arm-install
for how to compile sherpa-ncnn for a 32-bit ARM platform.
Hint
It supports decoding only single-channel wave files with a sampling rate of 16 kHz.
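To check whether a wave file meets these requirements before decoding it, the channel count and sampling rate can be read from the file header. The sketch below is an illustration only: wav_info is a hypothetical helper, and it assumes the canonical WAV layout in which the fmt chunk immediately follows the 12-byte RIFF header (channels at byte offset 22, sampling rate at offset 24, both little-endian). Files with extra chunks before fmt would need a real parser.

```shell
# wav_info is a hypothetical helper, not part of sherpa-ncnn.
# It prints "<channels> <sample_rate>" for a canonical WAV file.
wav_info() {
  # Read 6 bytes starting at offset 22 as unsigned decimal values:
  # 2 bytes for the channel count, then 4 bytes for the sampling rate.
  set -- $(od -An -v -tu1 -j 22 -N 6 "$1")
  # Combine the bytes by hand so the result does not depend on host endianness.
  echo "$(( $1 + $2 * 256 )) $(( $3 + $4 * 256 + $5 * 65536 + $6 * 16777216 ))"
}

# Usage (not run here):
# wav_info ./sherpa-ncnn-lstm-transducer-small-2023-02-13/test_wavs/0.wav
```

A file that satisfies the hint prints 1 16000.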
cd /path/to/sherpa-ncnn
./build/bin/sherpa-ncnn \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/tokens.txt \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/encoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/encoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/decoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/decoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/joiner_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/joiner_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-lstm-transducer-small-2023-02-13/test_wavs/0.wav
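The seven model paths in the command above follow a fixed pattern: tokens.txt, then a .param and .bin file for each of the encoder, decoder, and joiner. As a convenience, they can be generated from the model directory. This is only a sketch; model_args is a hypothetical helper, not part of sherpa-ncnn.

```shell
# model_args is a hypothetical helper, not part of sherpa-ncnn.
# It prints tokens.txt followed by the encoder, decoder, and joiner
# param/bin paths, one per line, in the order used in this section.
model_args() {
  echo "$1/tokens.txt"
  for part in encoder decoder joiner; do
    echo "$1/${part}_jit_trace-pnnx.ncnn.param"
    echo "$1/${part}_jit_trace-pnnx.ncnn.bin"
  done
}

# Usage (not run here). The unquoted $(...) relies on word splitting,
# so the model directory must not contain whitespace.
# d=./sherpa-ncnn-lstm-transducer-small-2023-02-13
# ./build/bin/sherpa-ncnn $(model_args "$d") "$d/test_wavs/0.wav"
```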
Note
The default option uses 4 threads and greedy_search
for decoding.
Note
Please use ./build/bin/Release/sherpa-ncnn.exe
for Windows.
Caution
If you use Windows and get encoding issues, please run:
CHCP 65001
in your command line.
This is a model trained using the GigaSpeech and LibriSpeech datasets.
Please see k2-fsa/icefall#558 for how the model is trained.
You can find the training code at
https://github.com/k2-fsa/icefall/tree/master/egs/librispeech/ASR/lstm_transducer_stateless2
In the following, we describe how to download it and use it with sherpa-ncnn.
Please use the following commands to download it.
cd /path/to/sherpa-ncnn
wget https://github.com/k2-fsa/sherpa-ncnn/releases/download/models/sherpa-ncnn-2022-09-05.tar.bz2
tar xvf sherpa-ncnn-2022-09-05.tar.bz2
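The download commands in this section share one URL pattern and differ only in the archive name. The sketch below captures that pattern; model_url is a hypothetical helper, and it assumes the archive name is the model name followed by .tar.bz2, as in the commands shown here.

```shell
# model_url is a hypothetical helper, not part of sherpa-ncnn.
# It rebuilds the release URL pattern used by the download commands
# in this section from a model name.
model_url() {
  echo "https://github.com/k2-fsa/sherpa-ncnn/releases/download/models/$1.tar.bz2"
}

# Usage (not run here):
# name=sherpa-ncnn-2022-09-05
# wget "$(model_url "$name")" && tar xvf "$name.tar.bz2"
```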
Hint
It supports decoding only single-channel wave files with a sampling rate of 16 kHz.
cd /path/to/sherpa-ncnn
for method in greedy_search modified_beam_search; do
./build/bin/sherpa-ncnn \
./sherpa-ncnn-2022-09-05/tokens.txt \
./sherpa-ncnn-2022-09-05/encoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-05/encoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-05/decoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-05/decoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-05/joiner_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-05/joiner_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-05/test_wavs/1089-134686-0001.wav \
2 \
$method
done
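The loop above runs the same file through two decoding methods, greedy_search and modified_beam_search. When the method name comes from user input or a script argument, a small guard can catch typos before the binary is invoked. The sketch below is an assumption-laden illustration: is_known_method is a hypothetical helper, and it accepts only the two method names that appear in this section; the binary itself may accept others.

```shell
# is_known_method is a hypothetical helper, not part of sherpa-ncnn.
# It succeeds only for the two decoding methods named in this section.
is_known_method() {
  case "$1" in
    greedy_search|modified_beam_search) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (not run here):
# is_known_method "$method" || { echo "unknown method: $method" >&2; exit 1; }
```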
You should see the following output:
./code-lstm/2022-09-05.txt
Note
Please use ./build/bin/Release/sherpa-ncnn.exe
for Windows.
cd /path/to/sherpa-ncnn
./build/bin/sherpa-ncnn-microphone \
./sherpa-ncnn-2022-09-05/tokens.txt \
./sherpa-ncnn-2022-09-05/encoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-05/encoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-05/decoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-05/decoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-05/joiner_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-05/joiner_jit_trace-pnnx.ncnn.bin \
2 \
greedy_search
Note
Please use ./build/bin/Release/sherpa-ncnn-microphone.exe
for Windows.
It will print something like below:
Number of threads: 4
num devices: 4
Use default device: 2
Name: MacBook Pro Microphone
Max input channels: 1
Started
Speak, and the recognition results are shown in real time.
You can find a demo below:
https://www.youtube.com/watch?v=m6ynSxycpX0
This is a model trained using the WenetSpeech dataset.
Please see k2-fsa/icefall#595 for how the model is trained.
In the following, we describe how to download it and use it with sherpa-ncnn.
Please use the following commands to download it.
cd /path/to/sherpa-ncnn
wget https://github.com/k2-fsa/sherpa-ncnn/releases/download/models/sherpa-ncnn-2022-09-30.tar.bz2
tar xvf sherpa-ncnn-2022-09-30.tar.bz2
Hint
It supports decoding only single-channel wave files with a sampling rate of 16 kHz.
cd /path/to/sherpa-ncnn
for method in greedy_search modified_beam_search; do
./build/bin/sherpa-ncnn \
./sherpa-ncnn-2022-09-30/tokens.txt \
./sherpa-ncnn-2022-09-30/encoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-30/encoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-30/decoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-30/decoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-30/joiner_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-30/joiner_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-30/test_wavs/0.wav \
2 \
$method
done
You should see the following output:
./code-lstm/2022-09-30.txt
Caution
If you use Windows and get encoding issues, please run:
CHCP 65001
in your command line.
cd /path/to/sherpa-ncnn
./build/bin/sherpa-ncnn-microphone \
./sherpa-ncnn-2022-09-30/tokens.txt \
./sherpa-ncnn-2022-09-30/encoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-30/encoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-30/decoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-30/decoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-2022-09-30/joiner_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-2022-09-30/joiner_jit_trace-pnnx.ncnn.bin \
2 \
greedy_search
Note
Please use ./build/bin/Release/sherpa-ncnn-microphone.exe
for Windows.
Caution
If you use Windows and get encoding issues, please run:
CHCP 65001
in your command line.
You can find a demo below:
https://www.youtube.com/watch?v=bbQfoRT75oM