Applies MFCC features of the waveform to an LSTM + CTC loss architecture

ss87021456/mfcc_ctc_speech

mfcc_ctc_speech

Speech recognition that combines MFCC features with an LSTM architecture, trained with the CTC loss function.
Tested on a YouTube dataset.
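
For readers unfamiliar with CTC, the quantity the loss is built on can be sketched in plain Python. This is an illustrative sketch, not the code in ctc_speech_recognition.py: the function names and the choice of index 0 for the blank symbol are assumptions made here. TensorFlow's own `tf.nn.ctc_loss` uses the last class index as the blank.

```python
import math

def ctc_label_probability(probs, labels, blank=0):
    """Total probability that the frame-wise outputs collapse to `labels`.

    probs  : list of T rows, each a list of per-class probabilities
    labels : target class indices, without blanks
    blank  : index of the blank class (0 here by assumption)
    """
    # Interleave blanks: [a, b] -> [blank, a, blank, b, blank]
    extended = [blank]
    for label in labels:
        extended += [label, blank]
    size = len(extended)

    # alpha[s]: probability of all partial paths ending at extended[s]
    alpha = [0.0] * size
    alpha[0] = probs[0][blank]
    if size > 1:
        alpha[1] = probs[0][extended[1]]

    for t in range(1, len(probs)):
        prev = alpha
        alpha = [0.0] * size
        for s in range(size):
            total = prev[s]
            if s >= 1:
                total += prev[s - 1]
            # Skipping a blank is allowed only between two different labels
            if s >= 2 and extended[s] != blank and extended[s] != extended[s - 2]:
                total += prev[s - 2]
            alpha[s] = total * probs[t][extended[s]]

    # Valid paths end on the last label or on the trailing blank
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)

def ctc_loss(probs, labels, blank=0):
    """Negative log-likelihood of `labels` under the frame-wise outputs."""
    return -math.log(ctc_label_probability(probs, labels, blank))
```

With two frames that are uniform over {blank, a}, `ctc_label_probability([[0.5, 0.5], [0.5, 0.5]], [1])` sums the three paths (a, a), (a, blank) and (blank, a), giving 0.75.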

Dependencies:

  • for creating labels (gen_label.py)
    python3 - 3.6.1
    webvtt-py - 0.4.0
  • for cutting the dataset (mp4_to_cut_wav.py)
    python2 - 2.7.14
    moviepy - 0.2.3.2
    cv2 (OpenCV) - 3.3.0
  • for training (ctc_speech_recognition.py)
    python2 - 2.7.14
    tensorflow - 1.4.0

Usage:

Step 1 : Download YouTube videos with CC subtitles
Step 2 : python3 gen_label.py - generates cleaned labels
Step 3 : python mp4_to_cut_wav.py - generates the WAV dataset
Step 4 : python ctc_speech_recognition.py - runs training
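
As a rough illustration of what Steps 2 and 3 need from each subtitle cue, the sketch below converts WebVTT timestamps to seconds (the clip boundaries for cutting) and normalizes caption text into a label string. It uses only the standard library and is not the code in gen_label.py, which relies on webvtt-py. The function names and the normalization rules (lowercase, letters, apostrophes and spaces only) are assumptions made for this example.

```python
import re

def vtt_timestamp_to_seconds(stamp):
    """Convert 'HH:MM:SS.mmm' or 'MM:SS.mmm' to seconds."""
    seconds = 0.0
    for part in stamp.strip().split(":"):
        seconds = seconds * 60 + float(part)
    return seconds

def clean_caption(text):
    """Normalize caption text into a plain label string (assumed rules)."""
    text = re.sub(r"<[^>]+>", "", text)    # drop inline cue tags such as <c>
    text = text.lower()
    text = re.sub(r"[^a-z' ]", " ", text)  # keep letters, apostrophes, spaces
    return " ".join(text.split())          # collapse repeated whitespace

def caption_to_clip(start, end, text):
    """Return (start_seconds, end_seconds, label) for one subtitle cue."""
    return (
        vtt_timestamp_to_seconds(start),
        vtt_timestamp_to_seconds(end),
        clean_caption(text),
    )
```

For instance, `caption_to_clip("00:01:02.500", "00:01:04.000", "<c>Hello,</c> World!")` yields a clip from 62.5 s to 64.0 s labeled `hello world`.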

Dataset description:

Training Process:
