Skip to content
/ kaldi Public

Wavelet Transform Feature Extraction Module for Kaldi

Notifications You must be signed in to change notification settings

444ghost/kaldi

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

93 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

(검색용 텍스트: 칼디, 웨이블릿, 음성인식, 머신러닝, 뉴럴넷)

Wavelet Transform Feature Extraction Module for Kaldi

Kaldi[1] doesn't have any wavelet transform modules but files contained in this repository should help you extract acoustic features using wavelet transforms. This work is still in progress for better WERs.

The conf/wavelet.conf file allows you to change wavelet tranform options.

  • --wavelet-type // wavelet type currently supports haar and db4
  • --decomposition-level // decomposition level
  • --num-feats // number of features
  • --transform-type // discrete wavelet transform(dwt) or wavelet packet transform(wpt)
  • --dyadic-zoom // time-frequency resolution localization factor, can be 0 for no effect

The neural network training seems to give better WERs. Other machine learnings may be explored later.

Kaldi version: 5.5.268 77ac79f70

Data

TIdigits[2] data is used.

Scripts

./run_wavelet.sh -t=dwt

./run_wavelet.sh -t=wpt

-t options should match transform-type in conf/wavelet.conf

Papers and tutorials

Cubic-root log in PLP for intensity vs perceived loudness [3].

Wavelet features [4].

Kaldi tutorials [5], [6], [11], [12].

VM symbolic link on Windows [7].

Neural network tutorials for Kaldi [8], [15], [16], [17].

Daubechies family of wavelets [9], [10].

Weighted finite state transducer [13], [14].

Reference

[1] https://github.com/kaldi-asr/kaldi

[2] https://catalog.ldc.upenn.edu/LDC93S10

[3] https://pdfs.semanticscholar.org/b578/f4faeb00b808e8786d897447f2493b12b4e9.pdf

[4] http://ecsjournal.org/Archive/Volume37/Issue3/7.pdf

[5] https://www.eleanorchodroff.com/

[6] http://www.inf.ed.ac.uk/teaching/courses/asr/2018-19/lab1.pdf

[7] https://www.nextofwindows.com/virtualbox-unable-to-merge-not-enough-free-storage-space

[8] http://jrmeyer.github.io/asr/2016/12/15/DNN-AM-Kaldi.html

[9] http://bearcave.com/misl/misl_tech/wavelets/matrix/index.html

[10] http://bearcave.com/misl/misl_tech/wavelets/daubechies/index.html

[11] https://towardsdatascience.com/how-to-start-with-kaldi-and-speech-recognition-a9b7670ffff6

[12] https://github.com/YoavRamon/awesome-kaldi

[13] http://jrmeyer.github.io/

[14] https://cs.nyu.edu/~mohri/pub/hbka.pdf

[15] https://www.youtube.com/playlist?list=PLxbPHSSMPBeicXAHVfyFvGfCywRCq39Mp

[16] https://towardsdatascience.com/recurrent-neural-networks-and-lstm-4b601dd822a5

[17] https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0

About

Wavelet Transform Feature Extraction Module for Kaldi

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published