RecurrentNN_SpeechRecognition

A TensorFlow model that recognizes words from Google's 30-word Speech Commands Dataset using an LSTM-based recurrent neural network.

rnn_parser.py = { walks through every .wav audio file in the training and test 
datasets, computes MFCC values for each file, and stores them in a 32x26 matrix. 
32 is the number of frames in the longest file; files that yield fewer than 32 frames 
are zero-padded in the trailing rows. The matrices are stacked on top of each other 
to yield one large 3D array that holds the features of all data files. The script also 
extracts the labels. Finally, the arrays are saved as NumPy arrays for faster loading during training. }
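
The padding and stacking steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in rnn_parser.py: the helper names `pad_features` and `build_dataset` are hypothetical, and random matrices stand in for real MFCC output, since the MFCC library used by the script is not stated here.

```python
import numpy as np

MAX_FRAMES = 32  # frames in the longest file, per the description above
NUM_COEFFS = 26  # MFCC values per frame


def pad_features(mfcc, max_frames=MAX_FRAMES):
    """Zero-pad the trailing rows of a (frames, coeffs) matrix up to max_frames."""
    frames = mfcc.shape[0]
    if frames > max_frames:
        raise ValueError(f"expected at most {max_frames} frames, got {frames}")
    padded = np.zeros((max_frames, mfcc.shape[1]), dtype=mfcc.dtype)
    padded[:frames] = mfcc
    return padded


def build_dataset(mfcc_list):
    """Stack padded per-file matrices into one (num_files, 32, 26) array."""
    return np.stack([pad_features(m) for m in mfcc_list])


# Stand-in data (assumption): random matrices in place of real MFCC features,
# with 32, 20, and 27 frames respectively.
rng = np.random.default_rng(0)
fake_mfccs = [rng.normal(size=(n, NUM_COEFFS)) for n in (32, 20, 27)]

features = build_dataset(fake_mfccs)
print(features.shape)  # (3, 32, 26)
```

An array built this way could then be written with `np.save` and read back with `np.load` at training time, which avoids recomputing the features on every run.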

model_rnn.py = { the model has one initial layer with ReLU activation, followed by 
two LSTM cells with 100 hidden units each, combined as a static RNN. }
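
To make the data flow concrete, here is a NumPy sketch of a forward pass through this kind of architecture. It is not the TensorFlow code in model_rnn.py. Several details are assumptions made only for illustration: the two cells are taken to be stacked, the ReLU layer is applied per frame with a hypothetical width of 64, the output projection to 30 classes reads the last time step, and all weights are random.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(input_dim, hidden, rng):
    """Random weights for one LSTM cell; the four gates share one matrix."""
    return {
        "W": rng.normal(scale=0.1, size=(input_dim + hidden, 4 * hidden)),
        "b": np.zeros(4 * hidden),
    }


def lstm_step(params, x, h, c):
    """One time step of a standard LSTM cell."""
    z = np.concatenate([x, h], axis=1) @ params["W"] + params["b"]
    i, f, g, o = np.split(z, 4, axis=1)  # input, forget, candidate, output
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c


def forward(batch, dense_w, dense_b, cells, out_w, out_b):
    """Map a (batch, frames, coeffs) array to (batch, num_classes) logits."""
    n, steps, _ = batch.shape
    hidden = cells[0]["b"].size // 4
    states = [(np.zeros((n, hidden)), np.zeros((n, hidden))) for _ in cells]
    for t in range(steps):  # unrolled over a fixed number of frames
        x = np.maximum(0.0, batch[:, t] @ dense_w + dense_b)  # ReLU layer
        for k, cell in enumerate(cells):
            h, c = lstm_step(cell, x, *states[k])
            states[k] = (h, c)
            x = h  # the output of one cell feeds the next
    return states[-1][0] @ out_w + out_b  # classify from the last hidden state


# Hypothetical sizes: only HIDDEN = 100 and the 32x26 input come from the text.
HIDDEN, DENSE, CLASSES = 100, 64, 30
rng = np.random.default_rng(0)
dense_w = rng.normal(scale=0.1, size=(26, DENSE))
dense_b = np.zeros(DENSE)
cells = [init_lstm(DENSE, HIDDEN, rng), init_lstm(HIDDEN, HIDDEN, rng)]
out_w = rng.normal(scale=0.1, size=(HIDDEN, CLASSES))
out_b = np.zeros(CLASSES)

logits = forward(rng.normal(size=(4, 32, 26)), dense_w, dense_b, cells, out_w, out_b)
print(logits.shape)  # (4, 30)
```

Because every input is padded to 32 frames, the loop over time steps has a fixed length, which is what makes a statically unrolled RNN possible.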

After 35,000 epochs:

TRAINING ACCURACY: 90 %

TEST ACCURACY: 86 %

codersinthestorm/RecurrentNN_SpeechRecognition
