xieliang555/Automatic-Speech-Recognition

End-to-End Automatic Speech Recognition

This repository contains implementations of end-to-end ASR systems based on LAS, CTC (w/o attention), and transducer (w/o attention).

Dependencies

  • torch >= 1.5.1
  • torchtext >= 0.6.0
  • torchaudio
  • warp-rnnt

Phoneme Error Rate on TIMIT

Model                  Train/dev loss  Train/dev PER  Epoch
CTC                    0.64/1.03       0.20/0.315     178
Transducer             12.0/-          -/0.2662       13
Pretrained Transducer  0.7/-           -/0.2670       195
LAS
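
The PER column is presumably the usual phoneme error rate: the edit (Levenshtein) distance between the predicted and reference phoneme sequences, divided by the reference length. The following is a minimal sketch under that assumption. The function names are hypothetical and are not taken from this repository, and the repository may average per utterance rather than over the whole corpus.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # At the start of row i, prev[j] is the distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs, hyps):
    """Corpus-level PER: total edit distance / total reference length."""
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_length = sum(len(r) for r in refs)
    return total_errors / total_length


# Toy example: the hypothesis drops one phoneme out of six.
refs = [["sil", "hh", "ah", "l", "ow", "sil"]]
hyps = [["sil", "hh", "ah", "l", "sil"]]
print(phoneme_error_rate(refs, hyps))
```

In the toy example one of six reference phonemes is deleted, so the PER is 1/6.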

Language model: train/dev loss 2.68/2.80, train/dev perplexity (ppl) 14.5/16.49, epoch 292.
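
Perplexity is conventionally the exponential of the average cross-entropy loss. The sketch below assumes the loss is measured in nats (natural logarithm), which is the PyTorch default. The helper name is hypothetical.

```python
import math


def perplexity(cross_entropy_nats):
    """Perplexity corresponding to an average cross-entropy given in nats."""
    return math.exp(cross_entropy_nats)


for name, loss in [("train", 2.68), ("dev", 2.80)]:
    print(f"{name}: loss={loss:.2f} ppl={perplexity(loss):.2f}")
```

This gives roughly 14.6 and 16.4, close to the reported 14.5/16.49. The small gap may come from rounding of the reported losses or from averaging perplexity per batch, which is a guess rather than something the repository states.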

Note

  1. A smaller vocabulary (obtained through phoneme mapping [6]) improves performance.
  2. A VGG feature extractor [7] (ResNet works even better) helps the model converge faster.
  3. The transducer converges faster and generalizes better than CTC.
  4. Weight noise [8] is a useful regularizer for RNNs/LSTMs.
  5. Batch normalization helps the model converge faster.
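
The phoneme mapping in note 1 can be illustrated with a short sketch. It assumes the commonly used folding of TIMIT's 61 phone labels into 39 classes. The table and function name below are illustrative, and the mapping actually used in this repository may differ.

```python
# Assumed fold table in the style of the common TIMIT 61-to-39 reduction.
# This is an illustration, not the repository's own mapping.
FOLD = {
    "ao": "aa", "ax": "ah", "ax-h": "ah", "axr": "er", "hv": "hh",
    "ix": "ih", "el": "l", "em": "m", "en": "n", "nx": "n",
    "eng": "ng", "zh": "sh", "ux": "uw",
    # Closures and silences collapse into a single silence class.
    "pcl": "sil", "tcl": "sil", "kcl": "sil", "bcl": "sil",
    "dcl": "sil", "gcl": "sil", "h#": "sil", "pau": "sil", "epi": "sil",
}
DROP = {"q"}  # the glottal stop is usually discarded


def fold_phonemes(phonemes):
    """Map fine-grained labels to the reduced set and drop discarded ones."""
    return [FOLD.get(p, p) for p in phonemes if p not in DROP]


print(fold_phonemes(["h#", "sh", "ix", "hv", "eh", "dcl", "q", "h#"]))
# ['sil', 'sh', 'ih', 'hh', 'eh', 'sil', 'sil']
```

Applying the same folding to both references and predictions before scoring reduces the number of output classes, which is one plausible reason for the improvement reported in note 1.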

To do

  1. pretrained transducer
  2. LAS
  3. beam search
  4. hybrid
  5. add a visualization script (plot.py)

Reference

  1. A Comparison of Sequence-to-Sequence Models for Speech Recognition [Ref]
  2. Deep Learning for Human Language Processing (2020,Spring) [Ref]
  3. Alexander-H-Liu/End-to-end-ASR-Pytorch [Ref]
  4. Open Source Korean End-to-end Automatic Speech Recognition [Ref]
  5. Language Translation With TorchText [Ref]
  6. End-to-end automatic speech recognition system implemented in TensorFlow [Ref]
  7. Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM [Ref]
  8. Speech Recognition with Deep Recurrent Neural Networks [Ref]
  9. pretrained embedding [Ref]

Acknowledgements

  • Thanks to warp-rnnt, PyTorch bindings for the CUDA-Warp RNN-Transducer. Note that it is best installed from source.
  • Thanks to warp-transducer, a more general implementation of the RNN transducer. Set the environment variables carefully, as described here, before running python setup.py install.