- TensorFlow >= 1.2.0
- tqdm >= 4.14.0
- python-Levenshtein >= 0.12.0
- setproctitle >= 1.1.10
- seaborn >= 0.7.1
- phone-level (39, 48, 61 phones)
- character-level
- phone-level
- Japanese kana character-level
- Japanese grapheme-level (including kanji characters)
These corpora will be added in the future.
- Switchboard
- WSJ
- LibriSpeech
- AMI
This repository doesn't include pre-processing code; pre-processing is based on this repo. If you want to run the pre-processing yourself, please look at that repo.
Connectionist Temporal Classification (CTC) [Graves+ 2006]
- LSTM-CTC
- GRU-CTC
- Bidirectional LSTM-CTC (BLSTM-CTC)
- Bidirectional GRU-CTC (BGRU-CTC)
- Multitask CTC (you can attach another CTC layer to an arbitrary layer.)
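
As a rough illustration of how the frame-wise output of a CTC model is turned into a label sequence, here is a minimal sketch of greedy (best-path) CTC decoding in NumPy. It is not this repository's code: the function name `ctc_greedy_decode`, the toy posteriors, and the use of index 2 as the blank label are assumptions made for the example.

```python
import numpy as np


def ctc_greedy_decode(posteriors, blank_index):
    """Greedy (best-path) CTC decoding for one utterance.

    posteriors: array of shape (num_frames, num_classes).
    blank_index: index of the CTC blank label (an assumption of this sketch).
    """
    # Pick the most probable class at every frame.
    best_path = np.argmax(posteriors, axis=1)

    decoded = []
    previous = None
    for label in best_path:
        # Collapse consecutive repeats, then drop blanks.
        if label != previous and label != blank_index:
            decoded.append(int(label))
        previous = label
    return decoded


# Toy example: 3 classes, with index 2 playing the role of the blank.
toy_posteriors = np.array([
    [0.8, 0.1, 0.1],  # label 0
    [0.7, 0.2, 0.1],  # label 0 again (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank (separates the two 0s)
    [0.6, 0.3, 0.1],  # label 0
    [0.1, 0.8, 0.1],  # label 1
])
print(ctc_greedy_decode(toy_posteriors, blank_index=2))  # [0, 0, 1]
```

The blank between the second and fourth frames is what lets the same label appear twice in a row in the output.
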
- weight decay
- dropout
- gradient clipping
- activation clipping
- multitask learning
- projection layer [Sak+ 2014]
- frame-stacking [Sak+ 2015]
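
To make the frame-stacking option more concrete, the following is a minimal NumPy sketch of the general idea: several consecutive feature frames are concatenated into one longer vector, and the sequence is then subsampled. The function name `stack_frames`, the parameters `num_stack` and `num_skip`, and the padding strategy (repeating the last frame) are assumptions for this example and may differ from what this repository actually does.

```python
import numpy as np


def stack_frames(features, num_stack=3, num_skip=3):
    """Stack consecutive frames and subsample the sequence.

    features: array of shape (num_frames, feature_dim).
    Returns an array of shape (ceil(num_frames / num_skip),
    feature_dim * num_stack).
    """
    num_frames = features.shape[0]
    stacked = []
    for start in range(0, num_frames, num_skip):
        window = features[start:start + num_stack]
        if len(window) < num_stack:
            # Assumption of this sketch: pad the tail by repeating the last frame.
            pad = np.repeat(window[-1:], num_stack - len(window), axis=0)
            window = np.concatenate([window, pad], axis=0)
        # Flatten the window into a single stacked feature vector.
        stacked.append(window.reshape(-1))
    return np.stack(stacked)


# 7 frames of 2-dimensional features become 3 frames of 6 dimensions.
toy_features = np.arange(14, dtype=np.float32).reshape(7, 2)
toy_stacked = stack_frames(toy_features, num_stack=3, num_skip=3)
print(toy_stacked.shape)  # (3, 6)
```

With `num_stack == num_skip` the windows do not overlap, so the sequence length seen by the recurrent layers shrinks by roughly that factor.
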
- LSTM encoder
- BLSTM encoder
- GRU encoder
- BGRU encoder
Under implementation
- temperature in the softmax layer (Compute attention weights)
- temperature in the softmax layer (Output posteriors)
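
For readers unfamiliar with the temperature option listed above, here is a small NumPy sketch of a softmax with temperature. It only illustrates the general technique; the function name `softmax_with_temperature` and the toy logits are assumptions for the example, not part of this repository.

```python
import numpy as np


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over the last axis with logits divided by a temperature.

    temperature > 1 flattens the distribution,
    temperature < 1 sharpens it.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    # Subtract the maximum for numerical stability before exponentiating.
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)


toy_logits = np.array([2.0, 1.0, 0.1])
sharp = softmax_with_temperature(toy_logits, temperature=0.5)
flat = softmax_with_temperature(toy_logits, temperature=2.0)
print(sharp.max() > flat.max())  # True
```

The same function applies to both uses in the list: smoothing or sharpening attention weights, and smoothing or sharpening the output posteriors.
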
Under implementation
Coming soon
MIT