
10, 20, 30ms VAD model #363

Answered by snakers4
siriusht asked this question in Q&A
Aug 3, 2023 · 1 comment · 1 reply

Hi,

Can I feed 10 ms, 20 ms, or 30 ms chunks of audio into this model?
Could you please share a 30 ms model?

The preferred chunk sizes are described here:

silero-vad/utils_vad.py

Lines 205 to 208 in 563106e

window_size_samples: int (default - 1536 samples)
Audio chunks of window_size_samples size are fed to the silero VAD model.
WARNING! Silero VAD models were trained using 512, 1024, 1536 samples for 16000 sample rate and 256, 512, 768 samples for 8000 sample rate.
Values other than these may affect model performance!!
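
To relate the sample counts above to the 10/20/30 ms durations in the question, here is a minimal sketch in plain Python. The helper names (`ms_to_samples`, `samples_to_ms`, `is_trained_window`, `rechunk`) are made up for illustration and are not part of silero-vad; the only values taken from the docstring are the window sizes themselves. The buffering approach in `rechunk` is just one possible way to regroup short frames and is an assumption, not something the maintainers prescribe.

```python
# Window sizes quoted in the docstring above, keyed by sample rate in Hz.
TRAINED_WINDOWS = {
    16000: (512, 1024, 1536),
    8000: (256, 512, 768),
}


def ms_to_samples(duration_ms, sample_rate):
    """Number of samples contained in duration_ms milliseconds of audio."""
    return int(round(duration_ms * sample_rate / 1000))


def samples_to_ms(num_samples, sample_rate):
    """Duration in milliseconds of num_samples samples."""
    return 1000 * num_samples / sample_rate


def is_trained_window(num_samples, sample_rate):
    """True if num_samples is one of the window sizes listed for sample_rate."""
    return num_samples in TRAINED_WINDOWS.get(sample_rate, ())


def rechunk(frames, window_size):
    """Regroup short frames (sequences of samples) into fixed-size windows.

    Hypothetical helper: trailing samples that do not fill a whole window
    are held back and never emitted.
    """
    buffer = []
    for frame in frames:
        buffer.extend(frame)
        while len(buffer) >= window_size:
            yield buffer[:window_size]
            buffer = buffer[window_size:]


# Durations asked about in the question, converted to samples at 16 kHz.
for ms in (10, 20, 30):
    n = ms_to_samples(ms, 16000)
    print(ms, "ms =", n, "samples, trained size:", is_trained_window(n, 16000))

# Durations of the window sizes listed in the docstring.
for sr, windows in TRAINED_WINDOWS.items():
    for w in windows:
        print(w, "samples @", sr, "Hz =", samples_to_ms(w, sr), "ms")
```

At 16000 Hz, 10, 20, and 30 ms correspond to 160, 320, and 480 samples, none of which is in the listed set; the listed sizes work out to 32, 64, and 96 ms at both sample rates. If the audio source delivers 10 ms frames, something like `rechunk(frames, 512)` could accumulate them into a listed window size before they are passed to the model.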

Does it support speech language Chinese?

It should be language agnostic, and I believe Chinese was in the train dataset, albeit a small amou…

Answer selected by snakers4

This discussion was converted from issue #362 on August 03, 2023 13:49.