End-to-End Automatic Speech Recognition use tensorflow

use tensorflow to implement a end-to-end algorithm according baidu deepspeech paperDeepSpeech,DeepSpeech2 and seq2seq listener-attention-speller modelListen, Attend and Spell Towards better decoding and language model integration in sequence to Sequence model

0 change log:

2018.8.23:(1)fix ds2 model validation need seqlen input errors (2)divide train parameters and neual network graph params. (3)add test script

1 prerequests.

see requirements.txt

2 data and preprocess

2.1 English corpus:LibriSpeech

2.2 Chinese corpus:THCHS-30

2.3 preprocess

1.librispeech

usage: libri_preprocess [-h] [-m {mfcc,fbank,log}] [-f {13,81,161}]
                        [-wl WINLEN] [-ws WINSTEP] [-s SPLIT]
                        [-n {dev-clean,dev-other,test-clean,test-other,train-clean-100,train-clean-360,train-other-500}]
                        path save jsonfile
param:
     -m, feature type, mfcc=13 dim,fbank=40dim,log=81or161 dim
     -f, feature dims, mfcc=13,log=81,161
     -n, librispeech corpus sub dirs
     path,corpus data dir
     save,feature save dir
     jsonfile, json file to index all wav feature and ground truth

sample script
$python libri_preprocess.py -m log -f 81 -n dev-clean ~/asr_corpus/librispeech/LibriSpeech ~/asr_corpus/librispeech_feat ~/asr_corpus/dev-clean-featlabel.json

3 train

run_train.py

usage: run_train.py [-h] [-rc {gru,lstm,rnn}] [-b BATCH_SIZE] [-n HIDDENS]
                    [-f {13,39,81,161}] [-c CLASSES] [-rl RNN_LAYERS]
                    [-cl CONV_LAYERS] [-g GPUS] [-a {relu,tanh,sigmod}]
                    [-o OPTIMIZER] [-lr LEARNING_RATE] [-k KEEP_PROB]
                    [-gc GRAD_CLIP] [-m MODE] [-r RESTOREMODEL]
                    [-bn BATCHNORM] [-p EPOCHS] [-i INITIAL_EPOCH] -t
                    TRAINFILES [TRAINFILES ...] -d DEVFILES [DEVFILES ...] -s
                    SAVEPATH [-gf GPU_FRACTION] [-md MODEL]
                    [-ub USE_BIDIRECTIONAL_RNN] [-us USE_SUMMARY]
                    [-v VOCABFILE] [--ps_hosts PS_HOSTS] [--ws_hosts WS_HOSTS]
                    [--job_name {ps,worker}] [--task_index TASK_INDEX]

3.1 single machine training

deepspeech2 model
$./run_libri_ds2_train.sh

las model 
$./run_libri_las_train.sh

3.2 clustering machine training

two gpu cards.
deepspeech2 model
$./run_libri_ds2_train_dist.sh ps 0
$./run_libri_ds2_train_dist.sh worker 0
$./run_libri_ds2_train_dist.sh worker 1

las model 
$./run_libri_las_train_dist.sh ps 0
$./run_libri_las_train_dist.sh worker 0
$./run_libri_las_train_dist.sh worker 1

Name		Name	Last commit message	Last commit date
Latest commit History 48 Commits
conf		conf
images		images
model		model
preprocess		preprocess
trainer		trainer
utils		utils
ASRDBiRNN.ipynb		ASRDBiRNN.ipynb
README.md		README.md
plot.py		plot.py
requirements.txt		requirements.txt
run_libri_ds2_train.sh		run_libri_ds2_train.sh
run_libri_ds2_train_dist.sh		run_libri_ds2_train_dist.sh
run_libri_las_train.sh		run_libri_las_train.sh
run_libri_las_train_dist.sh		run_libri_las_train_dist.sh
run_libri_train.sh		run_libri_train.sh
run_libri_train_dist.sh		run_libri_train_dist.sh
run_predict.py		run_predict.py
run_test.py		run_test.py
run_train.py		run_train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

End-to-End Automatic Speech Recognition use tensorflow

0 change log:

1 prerequests.

2 data and preprocess

2.1 English corpus:LibriSpeech

2.2 Chinese corpus:THCHS-30

2.3 preprocess

1.librispeech

3 train

3.1 single machine training

3.2 clustering machine training

4 test

5 other

refs. zzw922cn/Automatic_Speech_Recognition

About

Releases

Packages

Contributors 2

Languages

cdyangbo/end2endASR

Folders and files

Latest commit

History

Repository files navigation

End-to-End Automatic Speech Recognition use tensorflow

0 change log:

1 prerequests.

2 data and preprocess

2.1 English corpus:LibriSpeech

2.2 Chinese corpus:THCHS-30

2.3 preprocess

1.librispeech

3 train

3.1 single machine training

3.2 clustering machine training

4 test

5 other

refs. zzw922cn/Automatic_Speech_Recognition

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages