Is the model structure available as information? #387
Unanswered
helloooideeeeea
asked this question in Q&A
Replies: 0 comments
I would like to build a VAD for a specific speaker.
Does silero-vad extract features such as MFCCs from the speech waveform and feed those vectors to the model, or does the model take the raw waveform as input?
Also, is the model based on an RNN or a Transformer?
Is it possible that the VAD output layer alone is not accurate enough, so that a noise-suppression layer (like RNNoise) is added?
I would like to know these internal details.
I couldn't find them on GitHub.
Thanks.
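To make the first question concrete, here is a minimal sketch of the two input styles I mean. It is only an illustration and says nothing about how silero-vad actually works: the sample rate, the frame length, the synthetic tone, and the helper names `frame_signal` and `log_power_spectrum` are all assumptions, and a plain log power spectrum stands in for MFCC-style features.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumed sample rate, for illustration only
FRAME_LEN = 512      # assumed frame length in samples

def frame_signal(wave, frame_len=FRAME_LEN):
    """Split a 1-D waveform into non-overlapping frames, dropping the tail."""
    n_frames = len(wave) // frame_len
    return wave[: n_frames * frame_len].reshape(n_frames, frame_len)

def log_power_spectrum(frames):
    """Per-frame log power spectrum (a simple stand-in for MFCC-style features)."""
    window = np.hanning(frames.shape[1])
    spectrum = np.fft.rfft(frames * window, axis=1)
    return np.log(np.abs(spectrum) ** 2 + 1e-10)

# One second of a synthetic 440 Hz tone as placeholder audio.
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
wave = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

raw_input = frame_signal(wave)                  # option A: raw samples per frame
feature_input = log_power_spectrum(raw_input)   # option B: analyzed features per frame

print(raw_input.shape, feature_input.shape)  # (31, 512) (31, 257)
```

In option A the model would have to learn its own representation from the samples, while in option B the spectral analysis is done before the model sees the data. I would like to know which of these (if either) is closer to what silero-vad does.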