fix(silero): include prefix-padding pre-roll in VAD speech buffer frame #1636

Merged: chenghao-mou merged 1 commit into main from fix/vad-speech-buffer-frame-metadata on May 28, 2026

@chenghao-mou (Member)

Summary

  • The AudioFrame emitted on START_OF_SPEECH / END_OF_SPEECH in the silero VAD sliced off the prefix-padding samples (subarray(prefixPaddingSamples, speechBufferIndex)) but still reported samplesPerChannel = speechBufferIndex. Result: frame metadata claimed more samples than its data contained, and the pre-roll audio that the buffer machinery deliberately preserves was discarded before reaching downstream consumers (STT, transcription).
  • Slice from 0 instead so data length matches samplesPerChannel and the pre-roll is delivered. Matches the Python original (livekit-agents/livekit/agents/inference/vad.py: self._speech_buffer[:speech_buffer_index]).

The same bug exists in agents/src/inference/vad.ts on the feat/AGT-2520-multimodal-eou branch and will be fixed there before that PR merges — the file does not yet exist on main.
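
The mismatch described in the summary can be reproduced in isolation. The following is a minimal sketch, not the plugin's actual code: `FrameLike`, `buggyFrame`, and `fixedFrame` are hypothetical names, the audio is assumed to be mono 16-bit PCM, and the buffer sizes are illustrative values chosen only to make the arithmetic visible.

```typescript
// Hypothetical stand-in for an audio frame; not the real AudioFrame class.
interface FrameLike {
  data: Int16Array;
  sampleRate: number;
  channels: number;
  samplesPerChannel: number;
}

// Old behaviour as described in the summary: the pre-roll is sliced off,
// but samplesPerChannel still reports the full speechBufferIndex.
function buggyFrame(
  speechBuffer: Int16Array,
  prefixPaddingSamples: number,
  speechBufferIndex: number,
  sampleRate: number,
): FrameLike {
  return {
    data: speechBuffer.subarray(prefixPaddingSamples, speechBufferIndex),
    sampleRate,
    channels: 1, // assumption: mono audio
    samplesPerChannel: speechBufferIndex,
  };
}

// Fixed behaviour: slice from 0 so the data length matches the metadata
// and the pre-roll samples are kept.
function fixedFrame(
  speechBuffer: Int16Array,
  speechBufferIndex: number,
  sampleRate: number,
): FrameLike {
  return {
    data: speechBuffer.subarray(0, speechBufferIndex),
    sampleRate,
    channels: 1, // assumption: mono audio
    samplesPerChannel: speechBufferIndex,
  };
}

// Illustrative values only: 16 kHz audio, 300 ms of prefix padding.
const sampleRate = 16000;
const speechBuffer = new Int16Array(16000);
const prefixPaddingSamples = 4800;
const speechBufferIndex = 8000;

const buggy = buggyFrame(speechBuffer, prefixPaddingSamples, speechBufferIndex, sampleRate);
const fixed = fixedFrame(speechBuffer, speechBufferIndex, sampleRate);

// buggy.data.length is 3200 while buggy.samplesPerChannel is 8000.
// fixed.data.length is 8000 and fixed.samplesPerChannel is 8000.
```

With these numbers the old slice under-delivers by exactly `prefixPaddingSamples` samples, which is the pre-roll that downstream consumers were missing.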

Test plan

  • pnpm build succeeds
  • pnpm test passes
  • Smoke test silero VAD: confirm START_OF_SPEECH AudioFrame data length equals samplesPerChannel and includes pre-roll
  • Run an STT-driven example to confirm transcription quality is unchanged (pre-roll now restored)
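
For the smoke-test item above, the consistency check could look roughly like the sketch below. `FrameShape` and `frameLengthMatchesMetadata` are hypothetical helpers, not part of the plugin or its test suite, and the check assumes interleaved 16-bit PCM, so the expected data length is `samplesPerChannel * channels`.

```typescript
// Hypothetical minimal shape of the fields the check needs.
interface FrameShape {
  data: Int16Array;
  channels: number;
  samplesPerChannel: number;
}

// True when the sample data is exactly as long as the metadata claims.
// Assumption: interleaved PCM, one Int16 value per sample per channel.
function frameLengthMatchesMetadata(frame: FrameShape): boolean {
  return frame.data.length === frame.samplesPerChannel * frame.channels;
}

// A consistent mono frame: 8000 samples reported, 8000 samples present.
const consistentFrame: FrameShape = {
  data: new Int16Array(8000),
  channels: 1,
  samplesPerChannel: 8000,
};

// An inconsistent frame of the kind this PR fixes: metadata over-reports.
const inconsistentFrame: FrameShape = {
  data: new Int16Array(3200),
  channels: 1,
  samplesPerChannel: 8000,
};

const consistentOk = frameLengthMatchesMetadata(consistentFrame); // true
const inconsistentOk = frameLengthMatchesMetadata(inconsistentFrame); // false
```

A check like this only verifies the length invariant; confirming that the pre-roll audio is actually present still requires inspecting the emitted samples.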

🤖 Generated with Claude Code

The AudioFrame emitted on START_OF_SPEECH / END_OF_SPEECH sliced off
the prefix-padding samples but still reported `samplesPerChannel =
speechBufferIndex`, so the frame's metadata claimed more samples than
its data contained and downstream consumers (STT, transcription) lost
the pre-roll context the surrounding buffer machinery is designed to
preserve.

Slice from 0 instead so data length matches samplesPerChannel and the
prefix-padding pre-roll is delivered, matching the Python original.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

changeset-bot Bot commented May 28, 2026

🦋 Changeset detected

Latest commit: eea6a7e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 33 packages
| Name | Type |
| --- | --- |
| @livekit/agents-plugin-silero | Patch |
| @livekit/agents-plugin-assemblyai | Patch |
| @livekit/agents-plugin-baseten | Patch |
| @livekit/agents-plugin-cartesia | Patch |
| @livekit/agents-plugin-deepgram | Patch |
| @livekit/agents-plugin-mistralai | Patch |
| @livekit/agents-plugin-openai | Patch |
| @livekit/agents-plugin-rime | Patch |
| @livekit/agents-plugin-sarvam | Patch |
| @livekit/agents-plugin-xai | Patch |
| @livekit/agents-plugin-mistral | Patch |
| @livekit/agents | Patch |
| @livekit/agents-plugin-anam | Patch |
| @livekit/agents-plugin-bey | Patch |
| @livekit/agents-plugin-cerebras | Patch |
| @livekit/agents-plugin-elevenlabs | Patch |
| @livekit/agents-plugin-fishaudio | Patch |
| @livekit/agents-plugin-google | Patch |
| @livekit/agents-plugin-hedra | Patch |
| @livekit/agents-plugin-hume | Patch |
| @livekit/agents-plugin-inworld | Patch |
| @livekit/agents-plugin-lemonslice | Patch |
| @livekit/agents-plugin-liveavatar | Patch |
| @livekit/agents-plugin-livekit | Patch |
| @livekit/agents-plugin-minimax | Patch |
| @livekit/agents-plugin-neuphonic | Patch |
| @livekit/agents-plugin-perplexity | Patch |
| @livekit/agents-plugin-phonic | Patch |
| @livekit/agents-plugin-resemble | Patch |
| @livekit/agents-plugin-runway | Patch |
| @livekit/agents-plugin-tavus | Patch |
| @livekit/agents-plugin-trugen | Patch |
| @livekit/agents-plugins-test | Patch |

devin-ai-integration Bot (Contributor) left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 2 additional findings.

@chenghao-mou merged commit 390da2c into main on May 28, 2026 (9 checks passed)
@chenghao-mou deleted the fix/vad-speech-buffer-frame-metadata branch on May 28, 2026 at 17:39