This repository contains several ASR (automatic speech recognition) implementations. A full description of the project can be found in the paper.
- Download the AN4 dataset.
- Install the dependencies:

  ```
  pip install -r requirements.txt
  ```

- Alternatively, with conda:

  ```
  conda env create -f environment.yml
  ```

- Experiment with the models! (A quick environment sanity check is sketched below.)
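Before experimenting, you can optionally verify that the environment imports cleanly. The snippet below is a minimal sketch and assumes a PyTorch/torchaudio stack, which is an assumption here; adjust it to whatever `requirements.txt` actually pins.

```python
# sanity_check.py -- minimal environment check.
# Assumes a PyTorch/torchaudio stack (an assumption, not guaranteed by this repo).
import torch
import torchaudio

print(f"torch {torch.__version__}, torchaudio {torchaudio.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
```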
All available models are the Cartesian product of the following acoustic models and CTC decoder configurations (a minimal sketch of greedy CTC decoding follows the lists).

Acoustic models:
- Linear Layer
- LSTM
- DeepSpeech Toy
- DeepSpeech Small
- DeepSpeech Large

CTC decoders:
- Greedy CTC
- Lexicon CTC
- LM CTC
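To make the decoder options concrete: greedy CTC decoding takes the best-scoring token per frame, collapses consecutive repeats, and drops blanks. The sketch below is illustrative only; the repo's Greedy CTC decoder may differ in its details, and the blank index and toy vocabulary are assumptions.

```python
import numpy as np

BLANK = 0  # CTC blank index (assumed to be 0, as in many CTC setups)

def greedy_ctc_decode(log_probs: np.ndarray, vocab: list[str]) -> str:
    """Decode a (time, vocab) matrix of per-frame log-probabilities.

    Greedy CTC: argmax per frame, collapse consecutive repeats,
    then remove blank tokens.
    """
    best_path = log_probs.argmax(axis=1)               # best token per frame
    collapsed = [t for i, t in enumerate(best_path)
                 if i == 0 or t != best_path[i - 1]]   # collapse repeats
    return "".join(vocab[t] for t in collapsed if t != BLANK)

# Example: 4 frames over a toy vocabulary ["<blank>", "a", "n"]
frames = np.log(np.array([
    [0.1, 0.8, 0.1],   # "a"
    [0.1, 0.8, 0.1],   # "a" (repeat, collapsed)
    [0.8, 0.1, 0.1],   # blank
    [0.1, 0.1, 0.8],   # "n"
]))
print(greedy_ctc_decode(frames, ["<blank>", "a", "n"]))  # -> "an"
```

The lexicon and LM decoders refine this by constraining hypotheses to dictionary words and rescoring with a language model, respectively.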
You can train models with `train.py`, evaluate all models with `evaluate.py`, or try a specific configuration with `try.py`.
For example, to train the linear acoustic model (logging to Weights & Biases):

```
python train.py --conf confs/linear_acoustic.json --logger wandb
```
The following command evaluates all available models and reports their WER (word error rate) on the train, test, and validation data:

```
python evaluate.py --conf confs/linear_acoustic.json
```
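For reference, WER is the word-level edit distance between the hypothesis and the reference, divided by the reference length. A minimal sketch follows; the repo may compute it differently or rely on a library.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

print(wer("hello world", "hello there world"))  # 1 insertion / 2 words = 0.5
```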
`try.py` is a CLI that lets you construct the ASR model you want and transcribe with it :)

```
python try.py --conf confs/archive/try.json
```
Evaluation results on all the data are detailed in the paper. Here we include the graphs for the validation data.