KALDI:apply-cmvn-sliding #535

Closed
wanglong001 opened this issue Apr 12, 2020 · 2 comments · Fixed by #540

Comments

@wanglong001
Contributor

🚀 Feature

Apply sliding-window cepstral mean (and optionally variance)
normalization per utterance.

Motivation

My acoustic features are currently extracted with Kaldi. I want to use torchaudio instead, but it has no CMVN, so I wrote a torch version of CMVN based on Kaldi's implementation.

@wanglong001 wanglong001 mentioned this issue Apr 14, 2020
@vincentqb vincentqb changed the title KALID:apply-cmvn-sliding KALDI:apply-cmvn-sliding Apr 15, 2020
@stonelazy

Dear @wanglong001, I fail to understand where exactly we would use torchaudio.transforms.SlidingWindowCmn. Is it meant to normalize the output of STFT/MFCC at the window level?
Could you explain this? I am not familiar with Kaldi.

@wanglong001
Contributor Author


Yes, it normalizes cepstral features (the output of STFT/MFCC, etc.) at the window level, mainly to reduce the impact of environmental noise.

https://kaldi-asr.org/doc/apply-cmvn-sliding_8cc.html
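
For readers not familiar with Kaldi, here is a rough pure-Python sketch of the idea: for each frame, subtract the mean of the frames in a window around it, and optionally divide by the standard deviation of that window. This is a simplified illustration, not Kaldi's exact algorithm and not the torchaudio implementation. The function name, the `window` and `norm_vars` parameters, and the centered, edge-clipped windowing are assumptions made for the example.

```python
import math


def sliding_window_cmn(frames, window=3, norm_vars=False):
    """Simplified sliding-window mean (and optional variance) normalization.

    frames: list of per-frame feature vectors (lists of floats).
    window: number of frames used for the statistics (hypothetical default).
    norm_vars: if True, also divide by the window's standard deviation.
    """
    num_frames = len(frames)
    normalized = []
    for t in range(num_frames):
        # Centered window, shifted so it stays inside the utterance.
        start = max(0, t - window // 2)
        end = min(num_frames, start + window)
        start = max(0, end - window)
        chunk = frames[start:end]

        row = []
        for d in range(len(frames[t])):
            column = [frame[d] for frame in chunk]
            mean = sum(column) / len(column)
            value = frames[t][d] - mean
            if norm_vars:
                var = sum((x - mean) ** 2 for x in column) / len(column)
                # Floor the variance to avoid dividing by zero.
                value /= math.sqrt(max(var, 1e-20))
            row.append(value)
        normalized.append(row)
    return normalized


features = [[1.0], [2.0], [3.0], [4.0]]
print(sliding_window_cmn(features, window=3))
# → [[-1.0], [0.0], [0.0], [1.0]]
```

Because only the local mean is removed, adding a constant offset to every frame (a crude stand-in for a fixed channel or environment effect) leaves the output unchanged, which is the motivation described above.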
