v2.1.0 — Built-In Observability

@iamprashant iamprashant released this 01 Apr 05:59
ac19987

Rapida v2.1.0 — Built-In Observability. Richer Than Most Managed Platforms.

Rapida is the only open-source voice AI platform where you self-host the entire stack, see per-stage latency on every call, swap any provider via config, and own your data completely.

No more external media servers. No more fragmented systems stitched together with glue code. Just engineering.


Per-Stage Telemetry

Every call now tracks granular, per-stage latency across the entire voice pipeline. No external tooling required.

  • STT latency — time from audio frame to transcript token
  • LLM time-to-first-token (TTFT) — inference latency per turn
  • TTS time-to-first-byte (TTFB) — synthesis latency per utterance
  • Duration metrics — end-to-end call stage durations with drill-down
  • Configurable telemetry providers — CRUD APIs to plug your own telemetry exporters per assistant
  • Dashboard visualization — all metrics visible in the Rapida UI, per call and aggregated

Measure your own pipeline. Identify bottlenecks. Optimize with data.
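
As a rough illustration of what the three latency metrics above measure, the sketch below derives them from timestamps captured during one conversational turn. It is not Rapida's implementation: the `TurnTimestamps` fields, the metric key names, and the `bottleneck` helper are all hypothetical and exist only to make the definitions concrete.

```python
from dataclasses import dataclass


@dataclass
class TurnTimestamps:
    """Monotonic timestamps (seconds) for one turn.

    Every field name here is an assumption, not Rapida's schema.
    """
    audio_frame_received: float  # user audio frame handed to STT
    transcript_token: float      # STT emitted the transcript token
    llm_request_sent: float      # prompt sent to the LLM
    llm_first_token: float       # first LLM token received
    tts_request_sent: float      # text sent to TTS
    tts_first_byte: float        # first synthesized audio byte received


def stage_latencies_ms(t: TurnTimestamps) -> dict[str, float]:
    """Derive STT latency, LLM TTFT, and TTS TTFB in milliseconds."""
    return {
        "stt_latency_ms": (t.transcript_token - t.audio_frame_received) * 1000,
        "llm_ttft_ms": (t.llm_first_token - t.llm_request_sent) * 1000,
        "tts_ttfb_ms": (t.tts_first_byte - t.tts_request_sent) * 1000,
    }


def bottleneck(latencies: dict[str, float]) -> str:
    """Return the name of the slowest stage for this turn."""
    return max(latencies, key=latencies.get)


turn = TurnTimestamps(0.0, 0.25, 0.25, 0.75, 0.75, 0.875)
latencies = stage_latencies_ms(turn)
slowest = bottleneck(latencies)
```

In this made-up turn the LLM stage dominates, which is the kind of per-call comparison the stage breakdown is meant to support.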


Pipeline Architecture Rewrite

The executor layer has been refactored into a streaming pipeline architecture.

  • LLM executor abstraction — clean separation between AgentKit, model-based, and WebSocket LLM backends
  • Executor-to-pipeline refactoring — the dispatch loop now routes through a unified pipeline instead of discrete executors
  • Pipeline optimization — reduced allocation overhead and improved streaming throughput
  • Input normalizer — structured input preprocessing before LLM inference
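
To show the general shape of a unified streaming pipeline, here is a minimal sketch that chains stages as lazy generators, with an input normalizer ahead of a stand-in LLM stage. The names `normalize_input`, `fake_llm`, and `build_pipeline` are assumptions made for this example; the actual Rapida executor and pipeline interfaces are not described in these notes.

```python
from typing import Callable, Iterable, Iterator

# A stage consumes a stream of text chunks and yields a new stream.
Stage = Callable[[Iterator[str]], Iterator[str]]


def normalize_input(chunks: Iterator[str]) -> Iterator[str]:
    """Hypothetical input normalizer: collapse whitespace, drop empty chunks."""
    for chunk in chunks:
        cleaned = " ".join(chunk.split())
        if cleaned:
            yield cleaned


def fake_llm(chunks: Iterator[str]) -> Iterator[str]:
    """Stand-in for an LLM backend: emits one reply per normalized input."""
    for text in chunks:
        yield f"echo: {text}"


def build_pipeline(*stages: Stage) -> Callable[[Iterable[str]], Iterator[str]]:
    """Compose stages so each item flows through all of them lazily."""
    def run(source: Iterable[str]) -> Iterator[str]:
        stream: Iterator[str] = iter(source)
        for stage in stages:
            stream = stage(stream)
        return stream
    return run


pipeline = build_pipeline(normalize_input, fake_llm)
replies = list(pipeline(["  hello   world ", "   ", "hi"]))
```

Because every stage is a generator, nothing is buffered between stages: an item is pulled through the whole chain only when the consumer asks for the next output.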

JSON-Driven Provider Configuration

Adding a new STT, TTS, or LLM provider no longer requires deep codebase knowledge.

  • Provider configs defined declaratively in JSON
  • Eliminates boilerplate when integrating new providers
  • Validated and tested with the existing provider matrix
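
The sketch below illustrates the idea of a declarative provider definition that is validated before use. The JSON fields (`name`, `kind`, `endpoint`, `options`), the example provider, and the `load_provider_config` function are invented for illustration; they are not Rapida's actual schema or loader.

```python
import json

# Hypothetical schema: field name -> required Python type after parsing.
REQUIRED_FIELDS = {"name": str, "kind": str, "endpoint": str, "options": dict}
ALLOWED_KINDS = {"stt", "tts", "llm"}

# Made-up provider definition, used only to exercise the validator.
EXAMPLE_CONFIG = """
{
  "name": "example-tts",
  "kind": "tts",
  "endpoint": "https://tts.example.invalid/v1/synthesize",
  "options": {"voice": "default", "sample_rate_hz": 16000}
}
"""


def load_provider_config(raw: str) -> dict:
    """Parse a provider definition and reject missing or mistyped fields."""
    config = json.loads(raw)
    if not isinstance(config, dict):
        raise ValueError("provider config must be a JSON object")
    for field, expected in REQUIRED_FIELDS.items():
        if field not in config:
            raise ValueError(f"missing field: {field}")
        if not isinstance(config[field], expected):
            raise ValueError(f"field {field!r} must be of type {expected.__name__}")
    if config["kind"] not in ALLOWED_KINDS:
        raise ValueError(f"unknown provider kind: {config['kind']}")
    return config


provider = load_provider_config(EXAMPLE_CONFIG)
```

Validating at load time means a malformed definition fails immediately with a readable message instead of surfacing mid-call.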

Inline Noise Reduction

  • Integrated noise reduction into the audio input pipeline
  • Denoising runs inline before VAD, improving speech detection accuracy in noisy environments
  • New DenoiseAudioPacket and DenoisedAudioPacket packet types in the dispatch system
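
Why denoising before VAD helps can be shown with a toy example. The sketch below uses a simple amplitude noise gate and an energy-threshold VAD, both chosen only for brevity; the release notes do not say which denoising or VAD algorithms Rapida uses, and the function names and thresholds here are assumptions.

```python
def noise_gate(frame: list[int], floor: int = 500) -> list[int]:
    """Toy denoiser: zero out samples whose magnitude is below the floor."""
    return [s if abs(s) >= floor else 0 for s in frame]


def is_speech(frame: list[int], threshold: float = 200.0) -> bool:
    """Toy VAD: speech if the mean absolute amplitude exceeds the threshold."""
    if not frame:
        return False
    return sum(abs(s) for s in frame) / len(frame) > threshold


def detect(frame: list[int], denoise: bool = True) -> bool:
    """Run VAD, optionally passing the frame through the denoiser first."""
    return is_speech(noise_gate(frame) if denoise else frame)


hiss = [300, -300] * 80      # steady background noise, no speech
voice = [4000, -4000] * 80   # loud speech-like signal

false_trigger = detect(hiss, denoise=False)  # noise alone trips the VAD
gated_noise = detect(hiss)                   # denoised noise does not
gated_voice = detect(voice)                  # real signal still passes
```

The background hiss exceeds the VAD threshold on its own, so without the inline denoising step it would be reported as speech.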

UX Overhaul

  • Simplified assistant creation — fewer steps, better defaults, streamlined flow
  • Model settings modal — configure LLM parameters without leaving the assistant view
  • Simplified deployment workflow — get to production faster
  • Agent workspace management — manage multiple agents from a single workspace
  • Analysis UX — updated create-analysis flow with better visualization
  • System variable suggestions — autocomplete for reserved prompt variables
  • Argument suggestions — inline suggestions for tool/function arguments

Bug Fixes

  • Google STT timeout handling
  • Credential dropdown in telemetry provider configuration
  • Knowledge tool now loads only when the feature is enabled
  • Whitespace preservation after sentence boundaries for TTS
  • Missing VAD configuration parameters
  • Gemini LLM parameter mapping
  • First-time startup onboarding flow
  • Notification settings layout
  • Source indicator design alignment
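
For the TTS whitespace fix listed above, the underlying idea can be sketched as a sentence splitter that keeps trailing whitespace attached to each sentence, so that concatenating the pieces reproduces the original text. This is an illustrative regex-based approach under that assumption, not the code that shipped in the fix.

```python
import re

# A sentence ends at ., !, or ? followed by whitespace or end of text.
# The trailing whitespace stays attached to the sentence it follows.
_SENTENCE = re.compile(r".*?[.!?]+(?:\s+|$)|.+$", re.DOTALL)


def split_sentences(text: str) -> list[str]:
    """Split text into sentence chunks without dropping any characters."""
    return _SENTENCE.findall(text)


chunks = split_sentences("Pi is 3.14. How are you?  Fine")
```

Here `chunks` is `['Pi is 3.14. ', 'How are you?  ', 'Fine']`: the decimal point does not split the first sentence, and joining the chunks gives back the input unchanged.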

Testing

  • 142 test files changed across backend and UI
  • Unit tests for all critical path components
  • Provider config test coverage
  • Language fallback tests for STT
  • Model pipeline integration tests

Developer Experience

Skills Framework

New skills for AI-assisted development on the Rapida codebase:

  • Provider integration (LLM, STT, TTS, telephony, VAD)
  • Telemetry integration
  • Noise reduction integration
  • End-of-speech integration
  • System understanding and local setup

Each skill includes validation scripts, templates, and examples.

Hook Orchestration

  • Pre/post-implementation hooks for automated test validation
  • Changed-file test runners
  • Post-tool hints for test coverage gaps
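
A changed-file test runner and a coverage-gap hint can be approximated as below. The sketch assumes a hypothetical convention in which the tests for `pkg/module.py` live next to it as `pkg/test_module.py`; Rapida's real hooks and file layout are not described in these notes, and `select_tests` and `coverage_gaps` are names made up for this example.

```python
from pathlib import PurePosixPath


def _is_test(path: str) -> bool:
    return PurePosixPath(path).name.startswith("test_")


def _test_path_for(source: str) -> str:
    """Assumed convention: pkg/module.py is tested by pkg/test_module.py."""
    p = PurePosixPath(source)
    return str(p.with_name(f"test_{p.name}"))


def select_tests(changed: list[str], existing_tests: set[str]) -> list[str]:
    """Return the existing test files to run for a set of changed files."""
    selected: list[str] = []
    for path in changed:
        if PurePosixPath(path).suffix != ".py":
            continue  # ignore docs, configs, and other non-source files
        candidate = path if _is_test(path) else _test_path_for(path)
        if candidate in existing_tests and candidate not in selected:
            selected.append(candidate)
    return selected


def coverage_gaps(changed: list[str], existing_tests: set[str]) -> list[str]:
    """Return changed source files that have no matching test file."""
    return [
        path
        for path in changed
        if PurePosixPath(path).suffix == ".py"
        and not _is_test(path)
        and _test_path_for(path) not in existing_tests
    ]


changed_files = ["stt/google.py", "stt/test_google.py", "tts/azure.py", "README.md"]
known_tests = {"stt/test_google.py"}
to_run = select_tests(changed_files, known_tests)
gaps = coverage_gaps(changed_files, known_tests)
```

In practice the `changed_files` list could come from `git diff --name-only`, with `to_run` passed to the test runner and `gaps` surfaced as a hint after the change.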

Breaking Changes

None. Backwards-compatible with v2.0.2.


Upgrade

# Self-hosted (Docker Compose)
git pull origin main
docker compose pull
docker compose up -d

# Fresh install
git clone https://github.com/rapidaai/voice-ai.git
cd voice-ai
cp .env.example .env
docker compose up -d

What's Next

  • Lower latency and higher concurrency in the agent runtime
  • Local model deployment for on-prem and air-gapped environments
  • Extended telemetry: custom dashboards, alerting, export to Datadog/Grafana
  • Improved documentation at doc.rapida.ai

Full Changelog: v2.0.2...v2.1.0

Star the repo: https://github.com/rapidaai/voice-ai
Docs: https://doc.rapida.ai