Speech-to-Text-WaveNet: End-to-end sentence-level Chinese speech recognition using DeepMind's WaveNet
A TensorFlow implementation of Chinese speech recognition based on DeepMind's paper "WaveNet: A Generative Model for Raw Audio" (hereafter "the Paper").
Current version: 0.0.1
- python == 3.5
- tensorflow == 1.0.0
- librosa == 0.5.0
- cache: stores the extracted data features and the word dictionary
- data: WAV files and their labels
- model: stores the saved models
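
As a rough illustration of what caching the word dictionary could look like, here is a minimal sketch using only the standard library. It is not taken from this repository: the function names (`build_word_dict`, `load_or_build`), the file name `word_dict.pkl`, the pickle format, and reserving id 0 for padding are all assumptions made for the example.

```python
import os
import pickle
import tempfile


def build_word_dict(transcripts):
    """Map each distinct non-space character to an integer id.

    Ids start at 1; id 0 is left free for padding (an assumption of this sketch).
    """
    chars = sorted({ch for text in transcripts for ch in text if not ch.isspace()})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def load_or_build(cache_path, transcripts):
    """Return the cached dictionary if present, otherwise build and save it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    word_dict = build_word_dict(transcripts)
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    with open(cache_path, "wb") as f:
        pickle.dump(word_dict, f)
    return word_dict


# Hypothetical usage: a temporary "cache" directory and two toy transcripts.
cache_path = os.path.join(tempfile.mkdtemp(), "cache", "word_dict.pkl")
first = load_or_build(cache_path, ["你好", "好的"])   # builds and writes the cache
second = load_or_build(cache_path, [])                # reads the cache back
```

The second call ignores its (empty) transcript list because the file already exists, which is the point of the cache: the dictionary is computed once and reused on later runs.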
- Random shuffling of the data every epoch
- Xavier initialization
- Adam optimization algorithm
- Batch Normalization
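
Two of the items above, per-epoch shuffling and Xavier initialization, can be sketched in a few lines of plain Python. This is an illustration of the general techniques, not the code in `train.py`: the helper names (`xavier_uniform`, `shuffled_batches`) and the toy sizes are assumptions, and the uniform variant of Xavier initialization is used here (weights drawn from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out))).

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit)."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


def shuffled_batches(samples, batch_size, rng):
    """Yield mini-batches in a fresh random order; call once per epoch."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = xavier_uniform(128, 64, rng)

# Toy (feature, label) pairs standing in for audio features and transcripts.
samples = [(i, "label-%d" % i) for i in range(10)]
epochs = [list(shuffled_batches(samples, 4, rng)) for _ in range(2)]
```

Shuffling indices rather than the data itself keeps each feature paired with its label, and every sample still appears exactly once per epoch.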
python3 train.py
python3 test.py