Speech_Recognition_with_Tensorflow

Implementation of a seq2seq model for speech recognition, with an architecture similar to "Listen, Attend and Spell" (https://arxiv.org/pdf/1508.01211.pdf).

Example prediction vs. ground-truth transcript:

Created: ['S', 'E', 'V', 'E', 'N', 'T', 'E', 'E', 'N', '<SPACE>', 'T', 'W', 'E', 'N', 'T', 'Y', '<SPACE>', 'F', 'O', 'U', 'R']
Actual: ['S', 'E', 'V', 'E', 'N', 'T', 'E', 'E', 'N', '<SPACE>', 'T', 'W', 'E', 'N', 'T', 'Y', '<SPACE>', 'F', 'O', 'U', 'R']

Prerequisites

  • TensorFlow
  • numpy
  • pandas
  • librosa
  • python_speech_features

Datasets

I used the LibriSpeech dataset, which contains about 1000 hours of 16 kHz read English speech. It is available at http://www.openslr.org/12/
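
As a rough illustration of how one utterance can be turned into input features with the libraries listed under Prerequisites, here is a minimal sketch; the file path and the 40-filterbank configuration are assumptions for the example, not necessarily the settings used in this repository.

```python
import librosa
from python_speech_features import logfbank

# Illustrative path to one LibriSpeech utterance (16 kHz FLAC) -- adjust to
# wherever the corpus was extracted; this exact file name is just an example.
path = "LibriSpeech/dev-clean/1272/128104/1272-128104-0000.flac"

# librosa decodes the file and resamples to 16 kHz if necessary.
signal, sample_rate = librosa.load(path, sr=16000)

# Log-mel filterbank features: 25 ms windows, 10 ms hop, 40 filters per frame.
features = logfbank(signal, samplerate=sample_rate,
                    winlen=0.025, winstep=0.01, nfilt=40)

print(features.shape)  # -> (num_frames, 40)
```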

Code

I uploaded three .py files and one .ipynb file: the .py files contain the network implementation and utilities, and the Jupyter notebook demonstrates how to apply the model.

Architecture

Seq2Seq model
As mentioned above, the model architecture is similar to the one used in "Listen, Attend and Spell": the encoder uses pyramidal bidirectional LSTMs, which reduce the time resolution and improve performance on longer sequences.
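
To make the pyramid step concrete, here is a rough tf.keras sketch of an encoder that concatenates pairs of consecutive frames before each bidirectional LSTM above the first, halving the time resolution at every step; the function names and layer sizes are illustrative assumptions, not code from this repository.

```python
import tensorflow as tf

def pyramid_reduce(inputs):
    """Concatenate each pair of consecutive frames, halving the time axis."""
    feat = inputs.shape[-1]                      # feature dim must be static
    batch = tf.shape(inputs)[0]
    time = tf.shape(inputs)[1]
    time = time - tf.math.floormod(time, 2)      # drop a trailing odd frame
    outputs = tf.reshape(inputs[:, :time, :], [batch, -1, feat * 2])
    outputs.set_shape([None, None, feat * 2])    # keep the static feature dim
    return outputs

def pyramidal_bilstm_encoder(features, units=256, num_layers=3):
    """Stacked BiLSTMs with a 2x time reduction before every layer after the first."""
    outputs = features
    for layer in range(num_layers):
        if layer > 0:
            outputs = pyramid_reduce(outputs)
        outputs = tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(units, return_sequences=True))(outputs)
    return outputs

# Example: 8 utterances, 500 frames of 40 filterbank features each.
encoded = pyramidal_bilstm_encoder(tf.random.normal([8, 500, 40]))
print(encoded.shape)  # (8, 125, 512): two pyramid steps reduce 500 frames to 125
```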

  • Encoder-Decoder
  • Pyramidal Bidirectional LSTM
  • Bahdanau Attention (see the sketch after this list)
  • Adam Optimizer
  • Exponential or cyclic learning rate
  • Beam Search or Greedy Decoding

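For reference, the sketch below spells out the additive (Bahdanau) scoring used by the attention mechanism, written with plain tf.keras layers for illustration; the repository itself may rely on TensorFlow's built-in seq2seq attention wrappers, so the class and variable names here are assumptions.

```python
import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    """Additive attention: score(s, h_i) = v^T tanh(W1 s + W2 h_i)."""

    def __init__(self, units):
        super().__init__()
        self.W_query = tf.keras.layers.Dense(units)  # projects the decoder state
        self.W_keys = tf.keras.layers.Dense(units)   # projects the encoder outputs
        self.v = tf.keras.layers.Dense(1)            # scalar score per encoder frame

    def call(self, decoder_state, encoder_outputs):
        # decoder_state: [batch, dec_units]; encoder_outputs: [batch, time, enc_units]
        query = tf.expand_dims(decoder_state, 1)                    # [batch, 1, dec_units]
        scores = self.v(tf.nn.tanh(
            self.W_query(query) + self.W_keys(encoder_outputs)))    # [batch, time, 1]
        weights = tf.nn.softmax(scores, axis=1)                     # distribution over frames
        context = tf.reduce_sum(weights * encoder_outputs, axis=1)  # [batch, enc_units]
        return context, weights

# Example: attend over the pyramidal encoder output from the sketch above.
attention = BahdanauAttention(units=128)
context, weights = attention(tf.random.normal([8, 256]), tf.random.normal([8, 125, 512]))
print(context.shape, weights.shape)  # (8, 512) (8, 125, 1)
```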