GitHub - syhw/timit_tools: tools around preparing TIMIT for HMM (with HTK) and deep learning (with Theano) methods

Preparing the dataset

With the TIMIT dataset (.wav sound files, .wrd words annotations and .phn phones annotations):

Encode the wave sound in MFCCs: run python mfcc_and_gammatones.py --htk-mfcc $DATASET/train and python mfcc_and_gammatones.py --htk-mfcc $DATASET/test. This produces the .mfc files with HCopy according to wav_config (the .mfc_unnorm files are the same features without normalization).
Convert the annotations in .phn, whose times are given in samples, into HTK label times (units of 100 ns) in .lab: run python timit_to_htk_labels.py $DATASET/train and python timit_to_htk_labels.py $DATASET/test. This produces the .lab files.
Replace phones according to the seminal HMM paper of 1989, "Speaker-independent phone recognition using hidden Markov models": the number of phones (i.e. the number of lines in the future labels dictionary) should go from 61 to 39. Run python substitute_phones.py $DATASET/train and python substitute_phones.py $DATASET/test.
Run python create_phonesMLF_and_labels.py $DATASET/train and python create_phonesMLF_and_labels.py $DATASET/test.

You can also run all of the preparation steps above with make prepare dataset=DATASET_PATH.
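To make the two label-rewriting steps above more concrete, here is a minimal sketch of what they amount to for a single .phn line. It is not the code of timit_to_htk_labels.py or substitute_phones.py: the function names are hypothetical, the folding table is only a small subset of the usual 61-to-39 folding, and the use of sil as the target symbol for silences and closures is an assumption (the repository keeps its own table in timit_foldings.json). The repository performs the two steps separately; they are combined here only for brevity.

```python
# Illustrative sketch only; names and the partial folding table are assumptions.

SAMPLE_RATE_HZ = 16000        # TIMIT audio is sampled at 16 kHz
HTK_UNITS_PER_SECOND = 10**7  # HTK label times are expressed in units of 100 ns

# A few entries of the usual 61 -> 39 phone folding (NOT the full table).
FOLDINGS = {
    "ao": "aa",
    "ax": "ah",
    "ix": "ih",
    "zh": "sh",
    "hv": "hh",
    "pcl": "sil",  # closures and silences folded into one class (assumed name)
    "h#": "sil",
}


def samples_to_htk_time(n_samples):
    """Convert a sample index at 16 kHz into HTK time units (100 ns)."""
    return n_samples * HTK_UNITS_PER_SECOND // SAMPLE_RATE_HZ


def phn_line_to_lab_line(line, foldings=FOLDINGS):
    """Turn one 'start end phone' .phn line into a folded .lab line."""
    start, end, phone = line.split()
    phone = foldings.get(phone, phone)  # phones without a folding are kept
    return "%d %d %s" % (
        samples_to_htk_time(int(start)),
        samples_to_htk_time(int(end)),
        phone,
    )


if __name__ == "__main__":
    for phn_line in ["0 3050 h#", "3050 4559 sh", "4559 5723 ix"]:
        print(phn_line_to_lab_line(phn_line))
```

One sample at 16 kHz lasts 62.5 µs, i.e. 625 HTK units, so the conversion stays exact in integer arithmetic.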

You're ready for training with HTK (mfc and lab files)!

Train monophone HMMs:

make train_monophones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_monophones dataset_test_folder=PATH_TO_YOUR_DATASET/test

Or, train triphones:

TODO
make train_triphones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_triphones dataset_test_folder=PATH_TO_YOUR_DATASET/test

Do a full-state forced alignment of the .mlf files with make align.
Do a first preparation of the dataset with src/timit_to_numpy.py or src/mocha_timit_to_numpy.py (depending on the dataset) on the above aligned .mlf files.
Train the deep belief networks on it, either using DBN/DBN_timit.py or DBN/DBN_Gaussian_timit.py or DBN/DBN_Gaussian_mocha_timit.py (see inside these files for parameters). Save (pickle at the moment) the DBN objects and the states/indices mappings.
Use the serialized DBN objects and states/indices mappings with viterbi.py: cd to DBN and run:

python ../src/viterbi.py output_dbn.mlf /fhgfs/bootphon/scratch/gsynnaeve/TIMIT/test/test.scp ../tmp_train/hmm_final/hmmdefs --d ../dbn_5.pickle ../to_int_and_to_state_dicts_tuple.pickle
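As a rough illustration of the states/indices mappings mentioned above, the sketch below builds a pair of dictionaries and round-trips them through pickle as a single tuple. The tuple layout is only inferred from the file name to_int_and_to_state_dicts_tuple.pickle, and the function names and the state labels such as aa[2] are hypothetical; check the DBN scripts for the format they actually write.

```python
import pickle


def build_state_mappings(state_names):
    """Return (to_int, to_state): state name -> index and index -> state name."""
    to_state = dict(enumerate(sorted(set(state_names))))
    to_int = {name: idx for idx, name in to_state.items()}
    return to_int, to_state


def save_mappings(path, to_int, to_state):
    """Pickle both dictionaries together as one tuple."""
    with open(path, "wb") as f:
        pickle.dump((to_int, to_state), f, protocol=pickle.HIGHEST_PROTOCOL)


def load_mappings(path):
    """Load the (to_int, to_state) tuple written by save_mappings."""
    with open(path, "rb") as f:
        to_int, to_state = pickle.load(f)
    return to_int, to_state


if __name__ == "__main__":
    # Hypothetical HMM state labels, as they might appear in an aligned .mlf.
    to_int, to_state = build_state_mappings(["sil[2]", "aa[2]", "aa[3]", "aa[2]"])
    save_mappings("mappings_example.pickle", to_int, to_state)
    print(load_mappings("mappings_example.pickle"))
```

Keeping both directions in one file means the decoder can map network output indices back to HMM state names without rebuilding the mapping from the training data.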
