
Outrageous Voice Assistant - Fully local end-to-end ASR + LLM + TTS pipeline using open-weight models and a simple web-based UI


acatovic/ova


Outrageous Voice Assistant

Outrageous Logo

A local voice assistant demo with a FastAPI backend and a simple HTML front-end. All models (ASR, LLM, TTS) are open-weight and run locally.

Models used:

Why "Outrageous"? Because it was outrageously easy to create!

How it works:

sequenceDiagram
  autonumber
  participant FE as Frontend (UI / Client)
  participant BE as Backend (API)
  participant ASR as Local ASR Model
  participant LLM as Local LLM
  participant TTS as Local TTS Model

  Note over FE,BE: Audio input -> response audio (all models run locally)
  FE->>BE: HTTP POST /chat (wave bytes)
  activate BE

  BE->>BE: Receive wave bytes
  BE->>BE: Parse header (channels, sample rate, bit depth)
  alt Resampling needed?
    BE->>BE: Resample audio (optional)
  end
  BE->>BE: Convert audio samples -> tensor

  BE->>ASR: Transcribe(tensor)
  activate ASR
  ASR-->>BE: transcript (text)
  deactivate ASR

  BE->>LLM: Generate(system prompt + transcript)
  activate LLM
  LLM-->>BE: response text
  deactivate LLM

  BE->>TTS: Synthesize(response text)
  activate TTS
  TTS-->>BE: audio output (samples)
  deactivate TTS

  BE->>BE: Encode samples -> wave bytes
  BE-->>FE: HTTP 200 (wave bytes)
  deactivate BE

  FE->>FE: Play audio to user
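
The backend steps in the diagram can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names (`decode_wav`, `resample_linear`, `encode_wav`, `handle_chat`), the 16 kHz ASR input rate, the 16-bit PCM restriction, and the linear-interpolation resampler are all assumptions made for the example. The three model stages are passed in as plain callables so that any ASR, LLM, or TTS implementation could be plugged in.

```python
import io
import struct
import wave
from typing import Callable, List, Sequence, Tuple


def decode_wav(data: bytes) -> Tuple[List[float], int]:
    """Parse 16-bit PCM WAV bytes into mono float samples in [-1, 1)."""
    with wave.open(io.BytesIO(data), "rb") as wf:
        channels = wf.getnchannels()
        sample_rate = wf.getframerate()
        sample_width = wf.getsampwidth()  # bytes per sample (bit depth / 8)
        frames = wf.readframes(wf.getnframes())
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    ints = struct.unpack(f"<{len(frames) // 2}h", frames)
    # Downmix interleaved channels to mono by averaging, then scale to floats.
    mono = [
        sum(ints[i:i + channels]) / (channels * 32768.0)
        for i in range(0, len(ints), channels)
    ]
    return mono, sample_rate


def resample_linear(samples: Sequence[float], src_rate: int, dst_rate: int) -> List[float]:
    """Linear-interpolation resampling; a no-op when the rates already match."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for j in range(n_out):
        pos = j * src_rate / dst_rate  # fractional position in the source
        i = int(pos)
        a = samples[min(i, last)]
        b = samples[min(i + 1, last)]
        out.append(a + (b - a) * (pos - i))
    return out


def encode_wav(samples: Sequence[float], sample_rate: int) -> bytes:
    """Encode mono float samples as 16-bit PCM WAV bytes."""
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack(f"<{len(ints)}h", *ints))
    return buf.getvalue()


def handle_chat(
    wav_in: bytes,
    transcribe: Callable[[List[float]], str],
    generate: Callable[[str, str], str],
    synthesize: Callable[[str], Tuple[Sequence[float], int]],
    asr_rate: int = 16000,  # assumed ASR input rate
    system_prompt: str = "You are a helpful voice assistant.",  # placeholder
) -> bytes:
    """Wave bytes in, wave bytes out: decode -> ASR -> LLM -> TTS -> encode."""
    samples, rate = decode_wav(wav_in)
    samples = resample_linear(samples, rate, asr_rate)
    transcript = transcribe(samples)
    reply = generate(system_prompt, transcript)
    out_samples, out_rate = synthesize(reply)
    return encode_wav(out_samples, out_rate)
```

In a FastAPI app, a function like `handle_chat` would sit behind the `POST /chat` route, with the request body passed in as `wav_in` and the return value sent back as the response body. A production resampler would normally apply a low-pass filter before downsampling; linear interpolation is used here only to keep the sketch dependency-free.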

Demo

ova-demo.mp4

Pre-requisites

  • Python >=3.13
  • uv installed and available in PATH
  • Ollama installed and running (ollama CLI available)
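
A quick way to verify the list above before installing is a small check script. The sketch below is not part of the project; `check_prerequisites` is a hypothetical helper that confirms the Python version and looks for the `uv` and `ollama` executables on PATH. It does not verify that the Ollama server is actually running.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 13), tools=("uv", "ollama")):
    """Return a dict mapping each prerequisite name to True/False."""
    results = {"python": sys.version_info[:2] >= tuple(min_python)}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```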

Install

Fetch the Python dependencies and the Hugging Face / Ollama models:

./ova install

Start

Start the front-end and back-end services (non-blocking):

./ova start

Logs and PIDs are stored under .ova/.
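
Once the services are up, the backend can also be exercised without the browser UI. The sketch below posts raw wave bytes to `/chat` as shown in the sequence diagram, using only the standard library. The base URL (`http://localhost:8000`), the `audio/wav` content type, the file names, and the assumption that the audio is sent as the raw request body (rather than, say, a multipart form) are guesses for illustration; check the backend source for the actual host, port, and request format.

```python
import urllib.request


def build_chat_request(base_url: str, wav_bytes: bytes) -> urllib.request.Request:
    """Build a POST /chat request carrying wave bytes as the body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat",
        data=wav_bytes,
        method="POST",
        headers={"Content-Type": "audio/wav"},  # assumed content type
    )


def chat(base_url: str, wav_bytes: bytes, timeout: float = 120.0) -> bytes:
    """Send the audio and return the response wave bytes."""
    request = build_chat_request(base_url, wav_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


if __name__ == "__main__":
    import pathlib
    import sys

    # Hypothetical usage: python client.py question.wav answer.wav
    if len(sys.argv) == 3:
        audio = pathlib.Path(sys.argv[1]).read_bytes()
        reply = chat("http://localhost:8000", audio)  # assumed address
        pathlib.Path(sys.argv[2]).write_bytes(reply)
```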

Stop

Stop all services:

./ova stop

Enjoy!


Disclaimer: This project is a proof-of-concept demonstration and is provided "as is" without any warranties or guarantees. It is intended for educational and experimental purposes only. Use at your own risk.
