How to implement VAD model with GMM? #13

Closed

kareemamrr opened this issue Jun 9, 2020 · 1 comment
Labels
question Further information is requested

Comments

@kareemamrr

The paper Speaker Diarization with LSTM states that the Voice Activity Detection (VAD) model is a GMM using the same PLP features as the i-vector model, with two full-covariance Gaussians. How can I implement this using scikit-learn's GMM class?

@wq2012
Owner

wq2012 commented Jun 14, 2020

We used a pretrained ASR model to generate forced alignments for the data, which gives per-frame speech vs. non-speech ground truth. Then you can simply fit one Gaussian to all speech frames, and another Gaussian to all non-speech frames.
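A minimal sketch of this with scikit-learn, assuming you already have per-frame features and speech/non-speech labels (random arrays stand in for PLP features here). Each class gets a single full-covariance `GaussianMixture`, and frames are classified by comparing log-likelihoods:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder data: in practice these would be PLP features with
# per-frame labels from the forced alignment.
rng = np.random.default_rng(0)
speech_frames = rng.normal(loc=1.0, size=(500, 13))      # labeled speech
nonspeech_frames = rng.normal(loc=-1.0, size=(500, 13))  # labeled non-speech

# One full-covariance Gaussian per class (n_components=1).
speech_gmm = GaussianMixture(n_components=1, covariance_type="full")
speech_gmm.fit(speech_frames)
nonspeech_gmm = GaussianMixture(n_components=1, covariance_type="full")
nonspeech_gmm.fit(nonspeech_frames)

def is_speech(frames):
    """Boolean mask: True where the speech Gaussian scores higher."""
    return speech_gmm.score_samples(frames) > nonspeech_gmm.score_samples(frames)

mask = is_speech(np.vstack([speech_frames, nonspeech_frames]))
```

In a real system you would also apply smoothing (e.g. a median filter or HMM) over the per-frame decisions rather than using the raw mask directly.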
