The paper Speaker Diarization with LSTM states that the Voice Activity Detection model is a GMM using the same PLP features as the i-vector model, with two full-covariance Gaussians. How can I implement this using scikit-learn's GMM class?
kareemamrr changed the title from "How to implement VAD model wit GMM?" to "How to implement VAD model with GMM?" on Jun 9, 2020
We used a pretrained ASR model to generate forced alignments for the data, giving per-frame speech vs. non-speech ground truth. Then you can simply fit one Gaussian to all the speech frames, and another Gaussian to all the non-speech frames.
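A minimal sketch of that recipe with scikit-learn, assuming you already have per-frame feature vectors and per-frame speech/non-speech labels from forced alignment (random arrays stand in for real PLP features here). Each class gets a single full-covariance Gaussian, implemented as a `GaussianMixture` with `n_components=1`; frames are classified by comparing per-frame log-likelihoods:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Placeholder data: in practice these would be PLP feature vectors,
# split into speech / non-speech frames via forced-alignment labels.
speech_feats = rng.normal(loc=1.0, size=(500, 13))
nonspeech_feats = rng.normal(loc=-1.0, size=(500, 13))

# One full-covariance Gaussian per class (n_components=1 makes the
# "mixture" a single Gaussian, matching the two-Gaussian VAD setup).
speech_g = GaussianMixture(n_components=1, covariance_type="full",
                           random_state=0).fit(speech_feats)
nonspeech_g = GaussianMixture(n_components=1, covariance_type="full",
                              random_state=0).fit(nonspeech_feats)

def is_speech(frames):
    """Per-frame decision: True where the speech Gaussian assigns
    higher log-likelihood than the non-speech Gaussian."""
    return speech_g.score_samples(frames) > nonspeech_g.score_samples(frames)

pred = is_speech(np.vstack([speech_feats, nonspeech_feats]))
```

A likelihood-ratio threshold other than 0 (i.e. adding a class prior) can be used to trade off missed speech against false alarms, but the simple comparison above matches the fit-two-Gaussians description.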