VAD false-positive on long segments #50

Open
Macoron opened this issue Aug 14, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@Macoron
Owner

Macoron commented Aug 14, 2023

The VAD tends to report false-positive speech on long silent segments. It looks like the energy-based VAD detects speech whenever the whole audio chunk contains only silence.

You can mitigate this by setting VadContextSec in MicrophoneRecord to a higher value. But while the VAD is pretty cheap, its context can't grow indefinitely: a large enough context will cause lag spikes.
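To illustrate the failure mode, here is a minimal sketch in Python. It assumes a relative-energy detector that compares the mean absolute amplitude of the last window against that of the whole context. The function name, the 0.6 threshold, and the test signals are illustrative assumptions, not the project's actual code.

```python
import math
import random

def mean_abs(samples):
    """Mean absolute amplitude, used here as a cheap energy estimate."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0

def relative_energy_vad(samples, sample_rate, last_sec=1.0, threshold=0.6):
    """Hypothetical detector: True if the last `last_sec` seconds look like speech.

    The decision is purely relative: the tail is "speech" when its energy
    exceeds `threshold` times the energy of the whole context.
    """
    n_last = int(sample_rate * last_sec)
    if n_last >= len(samples):
        return False  # not enough context to compare against
    energy_all = mean_abs(samples)
    energy_last = mean_abs(samples[-n_last:])
    return energy_last > threshold * energy_all

rate = 16000
rng = random.Random(0)

def noise(seconds, amp=0.001):
    # Low-level background noise standing in for "silence".
    return [rng.uniform(-amp, amp) for _ in range(int(rate * seconds))]

def tone(seconds, amp=0.5, freq=220.0):
    # A loud sine wave standing in for speech.
    return [amp * math.sin(2 * math.pi * freq * i / rate)
            for i in range(int(rate * seconds))]

silence_only = noise(5)                 # context holds nothing but silence
speech_then_silence = tone(2) + noise(3)  # context still remembers speech

print(relative_energy_vad(silence_only, rate))
print(relative_energy_vad(speech_then_silence, rate))
```

With only silence in the context, the tail energy is about equal to the overall energy, so the ratio sits near 1 and exceeds the threshold: the chunk is flagged as speech. Once the context is long enough to still contain real speech, the same silent tail falls far below the threshold, which is why a larger context helps.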

Another solution is to replace the VAD with something more robust (like silero-vad). But I don't want to include any extra dependencies, especially ones that don't use ggml.

I think it can be fixed by some hack, but I don't know what that might be.
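One possible hack, sketched in Python purely as an illustration: add an absolute energy floor in front of the relative comparison, so a tail that is too quiet is never reported as speech. The function name and the `min_energy` value are hypothetical. The floor would have to be tuned to the microphone's noise level, and this is not something the project is known to implement.

```python
def mean_abs(samples):
    """Mean absolute amplitude, used here as a cheap energy estimate."""
    return sum(abs(s) for s in samples) / len(samples) if samples else 0.0

def vad_with_floor(samples, sample_rate, last_sec=1.0,
                   threshold=0.6, min_energy=0.005):
    """Hypothetical relative-energy VAD with an absolute floor.

    `min_energy` is an assumed, untuned value: any tail quieter than this
    is treated as silence no matter how it compares to the context.
    """
    n_last = int(sample_rate * last_sec)
    if n_last >= len(samples):
        return False  # not enough context to compare against
    energy_last = mean_abs(samples[-n_last:])
    if energy_last < min_energy:
        return False  # too quiet to be speech, regardless of the ratio
    return energy_last > threshold * mean_abs(samples)

rate = 16000
quiet = [0.001, -0.001] * rate              # 2 s of low-level noise
loud_tail = quiet + [0.3, -0.3] * (rate // 2)  # plus 1 s of loud signal

print(vad_with_floor(quiet, rate))
print(vad_with_floor(loud_tail, rate))
```

The trade-off is that a fixed floor can miss very quiet speech or still trigger on a noisy microphone, so it is a stopgap rather than a replacement for a more robust detector.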

@Macoron added the enhancement and bug labels and removed the enhancement label on Aug 14, 2023