Is the model structure available as information? #387

helloooideeeeea · 2023-10-22T03:31:35Z

I would like to build a VAD (voice activity detector) for a specific speaker.

Does silero-vad extract features such as MFCCs from the speech waveform and feed those vectors to the model, or does the model take the raw waveform as input?
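
To make this first question concrete, here is a minimal sketch of the two input styles I mean. It is only an illustration under my own assumptions: the function names `frame_signal` and `log_power_spectrum`, the frame sizes, and the synthetic tone are all hypothetical, the log-power spectrum is a simplified stand-in for MFCC, and none of it reflects how silero-vad actually works.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a waveform into overlapping frames of raw samples."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


def log_power_spectrum(frame):
    """Naive DFT log-power spectrum of one frame (non-negative frequencies only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        # Small constant avoids log(0) on silent bins.
        spectrum.append(math.log(abs(coeff) ** 2 + 1e-10))
    return spectrum


# Synthetic 440 Hz tone at 16 kHz, standing in for real speech (assumption).
sample_rate = 16000
waveform = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(1024)]

# Option A: the model would see raw sample frames directly.
raw_frames = frame_signal(waveform, frame_len=256, hop=128)

# Option B: the model would see engineered spectral features per frame.
spectral_frames = [log_power_spectrum(frame) for frame in raw_frames]
```

In option A each frame is 256 raw samples; in option B each frame becomes 129 spectral values. My question is which of these (or something else) the model receives.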

Also, is the model based on an RNN or a Transformer?

Is the VAD output layer accurate enough on its own, or is a noise-suppression stage (like RNNoise) added as well?

I would like to know these internal details, but I couldn't find them on GitHub.

Thanks.
