v2.0.2-pre — VAD & End-of-Speech Engines, STT/TTS Test Suite, Go 1.25.8

Pre-release

@iamprashant iamprashant released this 17 Mar 09:31
· 443 commits to main since this release

What's Changed in v2.0.2-pre

Voice Activity Detection (VAD) & End-of-Speech Engines

The voice pipeline now supports pluggable VAD and end-of-speech (EOS) detection, giving you fine-grained control over when the agent starts and stops listening.

New EOS Engines

  • LiveKit EOS — ONNX-based turn detection with custom tokenizer and chat template inference (livekit/turn_detector.go)
  • Pipecat EOS — Mel-spectrogram-based end-of-speech detection with platform-specific ONNX inference (pipecat/mel_spectrogram.go)
  • Silence-based EOS — Configurable silence threshold fallback (silence_based/silence_based_end_of_speech.go)

New VAD Providers

  • TEN VAD — lightweight voice activity detector
  • FireRed VAD — ONNX-based VAD with fbank feature extraction and postprocessor

All VAD/EOS ONNX models are now bundled in the repo and downloaded at Docker build time — no runtime model fetching required.

  • 48df33c0 1ef73aec 03332e79 41980364 f9c53e5a b047e755

Audio Heartbeat

Added an audio heartbeat mechanism to keep the speech pipeline active and optimize end-of-speech trigger timing, preventing premature cutoffs.

  • 03332e79 feat: audio heartbeat to optimize end of speech trigger

UI Configuration

New UI panels to configure VAD provider settings (FireRed, Silero, TEN) and EOS provider settings (LiveKit EOS) with sensible defaults.

  • 31c2d51d 31538388

Comprehensive STT/TTS Test Suite

Added integration and unit tests across all STT, TTS, and integration service providers: Google, Deepgram, ElevenLabs, Cartesia, AssemblyAI, Azure, Sarvam, Rime, Speechmatics. Includes shared test utilities for audio fixtures, credential loading, and metric collection.

  • 0d96809c feat: added integration and unit test for all the stt, tts and integration service
  • 3328f404 (from v2.0.1-pre) testing and refactoring stt and tts integration

Google STT Auto-Reconnect

Google STT streams now automatically reconnect when hitting the "Stream timed out after receiving no more client requests" error, preventing silent STT failures during long calls.

  • ca9e1b8d feat: reconnect google stt for stream timeout

Infrastructure & Build

Go 1.25.8

Bumped Go across all services and base Docker images.

  • 949288ad 3b591ec0

CI

Updated the CI workflow to match the new Go version and enabled knowledge/telemetry in the dev config.

  • 3b591ec0 chore: bump Go to 1.25.8, fix formatting, and enable knowledge/telemetry in dev

Web Widget & Deployment

  • Added idle timeout backoff configuration on web plugin deployments (migration 000009)
  • Fixed typo: renamed ideal_timeout → idle_timeout across entities (migration 000010)
  • Web widget deployment production testing and fixes
  • a7b9707a 095b9400

UI Improvements

  • Card list design made consistent across all listing pages (assistants, knowledge base, integrations, credentials)
  • Config form multi-input select component fix
  • Datepicker styling fixes (flatpickr CSS alignment)
  • Integration bridge updated for document-api
  • 81983940 b42ef01b 2ef61448 4df96dc1

SDKs & Examples

Updated SDKs (Python, React, React Widget) and examples (Go, Node.js, Python, React) to latest versions.

  • bd7152a6 feat: updated sdks and examples

Bug Fixes

  • 55cb24b4 fix: stream fixes for static packet (ElevenLabs TTS, dispatch behavior)
  • 46d1e541 fix: gofmt formatting across all callers and transformers
  • 095b9400 refactor: typo fix on deployment entity, cleanup web-widget unused vars

Community

  • Added Discord and Cal.com booking badges to README
  • 936160f5

Upgrade Guide

Self-hosted:

git pull origin main
docker compose down
docker compose up -d --build

Note: This release includes database migrations 000009 and 000010 for assistant-api. They will run automatically on startup.

Rapida Cloud: No action required — already deployed.


Full diff: v2.0.1-pre...v2.0.2-pre