Tools for preparing TIMIT for HMM (with HTK) and deep learning (with Theano) methods
Preparing the dataset

With the TIMIT dataset (.wav sound files, .wrd word annotations and .phn phone annotations):

  1. Encode the waveforms as MFCCs: run python --htk-mfcc $DATASET/train and python --htk-mfcc $DATASET/test, producing the .mfc files with HCopy according to wav_config (the .mfc_unnorm files are the same features without normalization).

  2. Convert the annotations given in .phn, whose times are expressed in frames, into nanoseconds in .lab: run python $DATASET/train and python $DATASET/test, producing the .lab files.

  3. Replace phones according to the seminal HMM paper of 1989, "Speaker-independent phone recognition using hidden Markov models": the number of phones (i.e. the number of lines in the future labels dictionary) should go from 61 to 39. Run python $DATASET/train and python $DATASET/test.

  4. run python $DATASET/train and python $DATASET/test

You can also run all of these steps at once with make prepare dataset=DATASET_PATH.

You're ready for training with HTK (mfc and lab files)!

Training the HMM models

Train monophone HMMs:

make train_monophones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_monophones dataset_test_folder=PATH_TO_YOUR_DATASET/test

Or, train triphones:

make train_triphones dataset_train_folder=PATH_TO_YOUR_DATASET/train
make test_triphones dataset_test_folder=PATH_TO_YOUR_DATASET/test

Replacing the GMMs with DBNs

  1. Do a full-state forced alignment of the .mlf files with make align.

  2. Do a first preparation of the dataset with src/ or src/ (depending on the dataset) on the above aligned .mlf files.

  3. Train the deep belief networks on it, either using DBN/ or DBN/ or DBN/ (see inside these files for parameters). Save (pickle at the moment) the DBN objects and the states/indices mappings.

  4. Use the serialized DBN objects and the states/indices mappings: cd to DBN and run:

    python ../src/ output_dbn.mlf /fhgfs/bootphon/scratch/gsynnaeve/TIMIT/test/test.scp ../tmp_train/hmm_final/hmmdefs --d ../dbn_5.pickle ../to_int_and_to_state_dicts_tuple.pickle
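As an illustration of the states/indices mappings mentioned in steps 3 and 4, here is a minimal sketch of how such a pair of dictionaries could be built and pickled as a tuple. The file name above suggests a (to_int, to_state) tuple of dicts, but the exact layout used by the repository is an assumption here, and the function names and state names such as "aa[2]" are hypothetical.

```python
# Sketch only: the (to_int, to_state) layout is inferred from the file name
# to_int_and_to_state_dicts_tuple.pickle and may differ from the repository.
import os
import pickle
import tempfile


def build_state_mappings(state_names):
    """Map each distinct HMM state name to an integer index and back."""
    unique_states = sorted(set(state_names))
    to_int = {name: index for index, name in enumerate(unique_states)}
    to_state = {index: name for name, index in to_int.items()}
    return to_int, to_state


def save_mappings(path, to_int, to_state):
    """Pickle both dictionaries together as a single tuple."""
    with open(path, "wb") as handle:
        pickle.dump((to_int, to_state), handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_mappings(path):
    """Load the (to_int, to_state) tuple written by save_mappings."""
    with open(path, "rb") as handle:
        to_int, to_state = pickle.load(handle)
    return to_int, to_state


if __name__ == "__main__":
    # Hypothetical state names, as they might appear in an aligned .mlf file.
    to_int, to_state = build_state_mappings(["sil[2]", "aa[2]", "aa[3]", "aa[2]"])
    path = os.path.join(tempfile.mkdtemp(), "mappings.pickle")
    save_mappings(path, to_int, to_state)
    print(load_mappings(path))
```

Keeping both directions in one pickle means the network's output indices can be translated back to HMM state names with the same mapping that produced the training targets.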

