A collection of Kaldi-based scripts for speech recognition, meant to simplify the training process as much as possible. The scripts cover:

- Data prep
- Lexicon generation
- Grammar generation (pocolm & SRILM)
- Feature extraction
- HMM-GMM training
- Data augmentation (speed, volume, reverb, music, noise, babble)
- Embedding (i-vector, x-vector)
- DNN training
- RNNLM training
- Rescoring
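
As an illustration of what the data prep step usually has to produce, here is a minimal sketch of writing the three core files of a Kaldi data directory (`wav.scp`, `text`, `utt2spk`). It is not taken from this repository's scripts: the function name `write_kaldi_data_dir` and the `(utt_id, spk_id, wav_path, transcript)` tuple layout are hypothetical, chosen only for the example.

```python
from pathlib import Path


def write_kaldi_data_dir(utterances, out_dir):
    """Write wav.scp, text and utt2spk into out_dir.

    `utterances` is an iterable of (utt_id, spk_id, wav_path, transcript)
    tuples. This layout is an assumption made for this sketch.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Kaldi expects these files sorted by utterance ID (C-locale order).
    # Python's default string sort compares code points, which matches
    # that order for ASCII IDs.
    rows = sorted(utterances, key=lambda u: u[0])

    with open(out / "wav.scp", "w", encoding="utf-8") as wav_scp, \
         open(out / "text", "w", encoding="utf-8") as text, \
         open(out / "utt2spk", "w", encoding="utf-8") as utt2spk:
        for utt_id, spk_id, wav_path, transcript in rows:
            wav_scp.write(f"{utt_id} {wav_path}\n")   # utt_id -> audio file
            text.write(f"{utt_id} {transcript}\n")    # utt_id -> transcript
            utt2spk.write(f"{utt_id} {spk_id}\n")     # utt_id -> speaker
    return out
```

Prefixing each utterance ID with its speaker ID (for example `spk1-utt1`) is a common convention, because it keeps `utt2spk` sorted consistently by both utterance and speaker.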

English | Spanish |
---|---|
common voice | common voice |
heroico | |
dimex | |

(c) 2020 Sylvain Le Groux <slegroux@ccrma.stanford.edu>