Skip to content

A model based in Tensorflow to recognize words from the 30 word Speech Commands Dataset from Google using LSTM based Recurrent Neural Network.

Notifications You must be signed in to change notification settings

codersinthestorm/RecurrentNN_SpeechRecognition

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 

Repository files navigation

RecurrentNN_SpeechRecognition

A model based in Tensorflow to recognize words from the 30 word Speech Commands Dataset from Google using LSTM based Recurrent Neural Network.

rnn_parser.py = { simply walks through all the individual .wav audio files in the training and 
test datasets and calculates MFCC values for each file and stores them in a 32x26 matrix. 
32 is the number of frames in the longest file and the files which do not yield 32 frames 
are padded with zeroes in the last few rows. The matrices are stacked on top of each other 
two yield one huge 3D array which hold all data files' features. It also extracts labels. 
In the end, the arrays are saved as numpy arrays for faster extraction duing training. }

model_rnn.py = { the model contains one initial layer with ReLU activation followed by 
two LSTM cells with 100 hidden units each, combined as a static RNN model. }

After 35,000 epochs,

TRAINING ACCURACY : 90 % TEST ACCURACY : 86 %

About

A model based in Tensorflow to recognize words from the 30 word Speech Commands Dataset from Google using LSTM based Recurrent Neural Network.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages