tf_speech_recognition

A simple convolutional neural network for word recognition.

Put your dataset into ./dataset/, then run:

python train.py --words one,two,three

See the argparser function for more arguments.

The expected dataset directory structure is shown in ./tests/dataset/ (the test model was trained on the TensorFlow Speech Commands dataset).

After training, you can find the trained models in the ./train/ directory. To recognize words in a WAV file, run:

python recognize.py \
    --wav_file audio_to_recognize.wav \
    --labels_file ./train/labels.txt \
    --model_file ./train/model.ckpt-10000

It will print something like this:

350 two
1000 one
1400 _unknown_
...

where the number is the offset in milliseconds and the word is the recognized label.
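If you want to post-process this output, the lines can be parsed into (offset, label) pairs. The following is a minimal sketch, assuming each result line has the form "<offset> <label>" as in the example above; the function name parse_recognition_output is hypothetical and is not part of this repository.

```python
from typing import List, Tuple


def parse_recognition_output(text: str) -> List[Tuple[int, str]]:
    """Parse lines like '350 two' into (offset_ms, label) tuples.

    Hypothetical helper: assumes the '<offset> <label>' format shown above.
    Lines that do not match this format are skipped.
    """
    results = []
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        # Keep only lines with a numeric offset followed by a label.
        if len(parts) != 2 or not parts[0].isdecimal():
            continue
        results.append((int(parts[0]), parts[1].strip()))
    return results


sample = "350 two\n1000 one\n1400 _unknown_\n"
events = parse_recognition_output(sample)

# Drop segments the model could not assign to a known word.
words = [label for _, label in events if label != "_unknown_"]
```

In practice, the text passed to the parser could be captured from the standard output of recognize.py, for example with subprocess.run(..., capture_output=True, text=True).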
