- Install Python packages
pip install -r ./requirements.txt
- Download checkpoint and default config
mkdir default_test_model
gdown --id 1FMoIxP_rQA4gXQ9395FZupQ4juKOu7LZ -O default_test_model/checkpoint.pth # checkpoint
gdown --id 1-VJb5kP2Pa7IL59WkQ38pLJkQLrxn6Bx -O default_test_model/config.json # default config
- Necessary resources are downloaded automatically: a class that requires external material fetches it itself. These classes are:
  - BackgroundNoise from hw_asr/augmentations/wave_augmentations: downloads noise for augmentations (line 22)
  - CTCBPETextEncoder from hw_asr/text_encoder: downloads a pretrained BPE model (line 19)
  - CTCCharTextEncoder from hw_asr/text_encoder: downloads a pretrained KenLM model and a vocab for shallow fusion (lines 42 and 68)
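The download-on-demand behaviour described above can be pictured with a minimal sketch. It is not the repository's code: the helper name `ensure_resource`, its parameters, and the optional `fetch` hook are all hypothetical and only illustrate the "download if missing" pattern.

```python
from pathlib import Path
from typing import Callable, Optional
from urllib.request import urlretrieve


def ensure_resource(
    path: Path,
    url: str,
    fetch: Optional[Callable[[str, Path], None]] = None,
) -> Path:
    """Return `path`, downloading it from `url` first if it does not exist.

    `fetch` is a hypothetical hook (not part of this repository) that lets
    the caller swap in another downloader, e.g. for Google Drive files.
    """
    path = Path(path)
    if not path.exists():
        # Create the target directory before writing the file.
        path.parent.mkdir(parents=True, exist_ok=True)
        if fetch is None:
            urlretrieve(url, str(path))  # plain HTTP(S) download
        else:
            fetch(url, path)
    return path
```

A class would call such a helper once in its constructor, so the first run downloads the file and later runs reuse the local copy.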
This repository includes implementations of:
- LSTM
- QuartzNet [1]
- Deep Speech 2 [2]
The repository also implements several decoding strategies and vocabularies. The language model is KenLM.
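The text above does not list the decoding strategies. Since the text encoders are CTC-based, greedy CTC decoding is one plausible example. The sketch below is an illustration under that assumption, not the repository's implementation; the function name, the alphabet, and the choice of index 0 for the blank token are hypothetical.

```python
from itertools import groupby
from typing import Sequence


def ctc_greedy_decode(
    frame_ids: Sequence[int],
    alphabet: Sequence[str],
    blank: int = 0,  # assumption: the blank token sits at index 0
) -> str:
    """Decode per-frame argmax indices into text with the CTC rule.

    Step 1: merge runs of identical consecutive indices.
    Step 2: drop the blank token.
    """
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    return "".join(alphabet[idx] for idx in collapsed if idx != blank)


# Hypothetical alphabet: "^" stands for the blank token.
alphabet = ["^", "a", "b", "c"]
print(ctc_greedy_decode([0, 1, 1, 0, 1, 2, 2, 0], alphabet))  # prints "aab"
```

The blank between the two runs of `a` is what keeps them as separate characters; without it the repeats would merge into a single `a`.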