fix(vad): add support for vad reset directly without stream close #5687
chenghao-mou wants to merge 5 commits into main
Only effective when the underlying VAD advertises
`VADCapabilities.supports_reset`. For implementations that do not
support reset, the sentinel is silently dropped; callers that need a
should we raise an error when reset is not supported?
good point. I'll add that.
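One way the agreed change could look is sketched below. This is only an illustration under stated assumptions: `SketchCapabilities` and `SketchVADStream` are hypothetical stand-ins rather than the library's classes, only the `supports_reset` flag name comes from the diff, and the choice of `NotImplementedError` is a guess at a suitable exception type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchCapabilities:
    """Hypothetical stand-in for VADCapabilities; only supports_reset is modeled."""

    supports_reset: bool = False


class SketchVADStream:
    """Hypothetical stream that fails loudly instead of dropping the reset."""

    def __init__(self, capabilities: SketchCapabilities) -> None:
        self._capabilities = capabilities
        self.reset_count = 0  # counts resets that were actually applied

    def reset(self) -> None:
        if not self._capabilities.supports_reset:
            # Assumption: NotImplementedError is used here for illustration only.
            raise NotImplementedError(
                "this VAD does not support reset; close and recreate the stream instead"
            )
        self.reset_count += 1


resettable = SketchVADStream(SketchCapabilities(supports_reset=True))
resettable.reset()

fixed = SketchVADStream(SketchCapabilities(supports_reset=False))
try:
    fixed.reset()
    raised = False
except NotImplementedError:
    raised = True
```

Raising at the call site means a caller that depends on reset finds out immediately, rather than continuing with state it believes was cleared.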
nonlocal input_copy_remaining_fract, extra_inference_time

self._model.reset()
self._exp_filter.reset()
ExpFilter.reset() called with no arguments is a no-op, leaving stale filter state after VAD reset
In _reset_state(), self._exp_filter.reset() is called at line 325 with no arguments. However, ExpFilter.reset() (livekit-agents/livekit/agents/utils/exp_filter.py:21-36) only updates fields whose corresponding parameter passes the is_given() check. Since all parameters default to NOT_GIVEN, none of the if is_given(...) branches execute, so the call is a complete no-op. The internal self._filtered value (which holds the smoothed probability from the previous speech segment) is carried into the next segment.
After a VAD stream reset, the first inference probability will be blended with the stale filter value (filtered = 0.35 * old + 0.65 * new with alpha=0.35) instead of being accepted directly as with a fresh stream. This can cause slightly delayed or premature speech detection for the first ~1-2 inference windows after reset.
- self._exp_filter.reset()
+ self._exp_filter = utils.ExpFilter(alpha=0.35)
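The no-op behaviour described in this comment can be reproduced with a small stand-in filter. The sketch below is not the library's `ExpFilter`: `SketchExpFilter`, `NOT_GIVEN`, and `is_given` are hypothetical reimplementations written only to mirror the description above (sentinel defaults in `reset()`, and the blend `alpha * old + (1 - alpha) * new`).

```python
class _NotGiven:
    """Hypothetical sentinel type standing in for the library's NOT_GIVEN."""


NOT_GIVEN = _NotGiven()


def is_given(value) -> bool:
    return value is not NOT_GIVEN


class SketchExpFilter:
    """Hypothetical exponential filter mirroring the behaviour described above."""

    def __init__(self, alpha: float) -> None:
        self._alpha = alpha
        self._filtered = NOT_GIVEN  # no sample seen yet

    def reset(self, alpha=NOT_GIVEN, initial=NOT_GIVEN) -> None:
        # Only explicitly passed fields are updated, so reset() alone changes nothing.
        if is_given(alpha):
            self._alpha = alpha
        if is_given(initial):
            self._filtered = initial

    def apply(self, sample: float) -> float:
        if not is_given(self._filtered):
            # The first sample of a fresh filter is accepted as-is.
            self._filtered = sample
        else:
            self._filtered = self._alpha * self._filtered + (1 - self._alpha) * sample
        return self._filtered


# Argument-less reset keeps the old smoothed value around.
stale = SketchExpFilter(alpha=0.35)
stale.apply(0.9)  # end of the previous speech segment
stale.reset()  # no-op: _filtered is still 0.9
after_noop_reset = stale.apply(0.1)  # blended with the stale 0.9

# Replacing the filter, as in the suggested change, starts clean.
fresh = SketchExpFilter(alpha=0.35)
after_fresh = fresh.apply(0.1)
```

In this sketch the first post-reset value is about 0.38 with the argument-less `reset()` and exactly 0.1 with a new instance, which is the gap the suggested change is meant to close.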
add vad stream reset method directly so that we don't have to close the stream or task
resetting directly with a new resampler and buffer zero-fill should take less than 0.1ms.
this closes #5674