MFCC Automatic Speech Recognition Algorithm Implementation

A Python 2.7 implementation of the Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW) algorithms for Automatic Speech Recognition (ASR).

Method

Read audio data and sampling frequency from .wav file
Frame signal
Apply window function to frame (default=hamming)
Calculate DFT of frame
Calculate periodogram estimate of the power spectral density for each DFT bin
Apply Mel-frequency filterbank to the power spectrum
Sum energies within each filter and take the base 10 logarithm
Take DCT of the log filterbank energies
Keep coefficients [1:13]
Compute DTW best path and Euclidean distance between the reference vector and the input vector
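
The steps above can be sketched roughly as follows. This is a minimal illustration using only NumPy, not the code in mfcc.py: every function name, the 25 ms / 10 ms framing, the 26 filters and the periodogram normalisation are assumptions chosen for the example. It starts from a signal already in memory, so reading the .wav file is left out.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def frame_signal(signal, frame_len, frame_step):
    """Split a 1-D signal into overlapping frames, zero-padding the tail."""
    extra = max(0.0, len(signal) - frame_len) / float(frame_step)
    n_frames = 1 + int(np.ceil(extra))
    padded = np.zeros((n_frames - 1) * frame_step + frame_len)
    padded[:len(signal)] = signal
    idx = np.arange(frame_len)[None, :] + frame_step * np.arange(n_frames)[:, None]
    return padded[idx]

def mel_filterbank(n_filters, n_fft, rate):
    """Triangular filters evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, centre, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, centre):        # rising slope
            fbank[i, k] = (k - left) / float(centre - left)
        for k in range(centre, right):       # falling slope
            fbank[i, k] = (right - k) / float(right - centre)
    return fbank

def dct2(x, n_out):
    """Unnormalised DCT-II along the last axis, first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2.0 * n))
    return x.dot(basis.T)

def mfcc(signal, rate, frame_ms=25, step_ms=10, n_filters=26):
    frame_len = int(rate * frame_ms / 1000)
    frame_step = int(rate * step_ms / 1000)
    n_fft = 1 << (frame_len - 1).bit_length()            # next power of two
    frames = frame_signal(signal, frame_len, frame_step)  # frame signal
    frames = frames * np.hamming(frame_len)               # window
    spectrum = np.fft.rfft(frames, n_fft)                 # DFT
    power = (np.abs(spectrum) ** 2) / n_fft               # periodogram
    energies = power.dot(mel_filterbank(n_filters, n_fft, rate).T)
    log_energies = np.log10(np.maximum(energies, 1e-10))  # avoid log(0)
    return dct2(log_energies, 13)[:, 1:13]                # keep [1:13]

def dtw(ref, test):
    """Return (accumulated cost, best path) with Euclidean frame distance."""
    n, m = len(ref), len(test)
    dist = np.sqrt(((ref[:, None, :] - test[None, :, :]) ** 2).sum(axis=2))
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    # Walk back from the end, always stepping to the cheapest predecessor.
    path, i, j = [(n - 1, m - 1)], n, m
    while (i, j) != (1, 1):
        _, i, j = min((acc[i - 1, j - 1], i - 1, j - 1),
                      (acc[i - 1, j], i - 1, j),
                      (acc[i, j - 1], i, j - 1))
        path.append((i - 1, j - 1))
    return acc[n, m], path[::-1]
```

A recogniser built on this sketch would call `mfcc()` on the reference and input recordings and pick the reference with the lowest `dtw()` cost. The DTW loop is quadratic in the number of frames, which is fine for short utterances but slow for long ones.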

To-do

Noise gate
Pre-emphasis / Lifter
Feature vector database
Audio record / playback (audio.py)
Multithread MFCC extraction
Create MFCC extractor as class?
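
For the pre-emphasis and lifter items, one possible shape is sketched below. Nothing here exists in the repository yet: the function names, the 0.97 pre-emphasis coefficient and the lifter length of 22 are common textbook defaults picked as assumptions, and the lifter is the usual sinusoidal one.

```python
import numpy as np

def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1].

    Hypothetical helper; expects a non-empty 1-D signal.
    """
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])

def lifter(cepstra, lift=22, first_index=1):
    """Sinusoidal liftering: c[n] * (1 + (lift / 2) * sin(pi * n / lift)).

    first_index is the cepstral index of the first column; it defaults
    to 1 on the assumption that coefficient 0 was already dropped.
    """
    cepstra = np.asarray(cepstra, dtype=float)
    n = np.arange(first_index, first_index + cepstra.shape[-1])
    return cepstra * (1.0 + (lift / 2.0) * np.sin(np.pi * n / lift))
```

Pre-emphasis would run on the raw signal before framing, and the lifter on the cepstral coefficients after the DCT.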

amitchone/ASR
