EdgeVox

Sub-second local voice AI for robots and edge devices.

No cloud APIs. No internet after setup. Fully private. Powered by Gemma 4.

    ______    __         _    __
   / ____/___/ /___ ____| |  / /___  _  __
  / __/ / __  / __ `/ _ \ | / / __ \| |/_/
 / /___/ /_/ / /_/ /  __/ |/ / /_/ />  <
/_____/\__,_/\__, /\___/|___/\____/_/|_|
            /____/

Stack: Silero VAD -> faster-whisper (STT) -> Gemma 4 E2B IT via llama.cpp (LLM) -> Kokoro 82M (TTS)

Tested latency: 0.80s end-to-end on RTX 3080 (STT 0.40s + LLM 0.33s + TTS 0.08s)

Features

Streaming pipeline — speaks first sentence while LLM generates the rest
Interrupt support — speak while bot is talking to cut it off
Wake word detection — "Hey Jarvis" / "Lily" (optional, via OpenWakeWord)
Beautiful TUI — ASCII logo, sparkline waveform, latency history, GPU/RAM monitor, model info panel
ROS2 bridge — publishes STT/TTS/state to ROS2 topics for robotics integration
Slash commands — /reset, /lang, /voice, /say, /mictest, /model in the TUI
Chat export — Ctrl+S to save conversation as markdown
15 languages — English, Vietnamese, French, Spanish, Hindi, Italian, Portuguese, Japanese, Chinese, Korean, German, Thai, Russian, Arabic, Indonesian
Auto-detects hardware — GPU layers, model size, STT model

Hardware Requirements

Device	RAM	GPU	Expected Latency
PC (i9 + RTX 3080 16GB)	64GB	CUDA	~0.8s
Jetson Orin Nano	8GB	CUDA	~1.5-2s
MacBook Air M1	8GB	Metal	~2-3s
Any modern laptop	16GB+	CPU only	~2-4s

Quick Start

# 1. Install uv (fast Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Create virtual environment
uv venv --python 3.12
source .venv/bin/activate

# 3. Install llama-cpp-python with CUDA (prebuilt wheels)
uv pip install llama-cpp-python \
    --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124

# For Apple Silicon (Metal):
# CMAKE_ARGS="-DGGML_METAL=on" uv pip install llama-cpp-python

# For CPU only:
# uv pip install llama-cpp-python

# 4. Install EdgeVox
uv pip install -e .

# 5. Download all models (~3GB total)
edgevox-setup

# 6. Run!
edgevox

Usage

# TUI mode (default, recommended)
edgevox

# With wake word
edgevox --wakeword "hey jarvis"

# With ROS2 bridge (for robotics)
edgevox --ros2

# CLI mode (simpler, no TUI)
edgevox-cli

# Text mode (no microphone)
edgevox-cli --text-mode

# Custom options
edgevox \
    --whisper-model large-v3-turbo \
    --voice am_adam \
    --language en

TUI Controls

Key	Action
`Q`	Quit
`R`	Reset conversation
`M`	Mute/Unmute mic
`/`	Open command input
`Ctrl+S`	Export chat to markdown

Slash Commands

Command	Action
`/reset`	Reset conversation
`/lang XX`	Switch language (en, vi, fr, ko, ...)
`/langs`	List all supported languages
`/say TEXT`	TTS preview — speak text directly
`/mictest`	Record 3s + playback to test audio
`/model SIZE`	Switch Whisper model (small/medium/large-v3-turbo)
`/voice XX`	Switch TTS voice
`/voices`	List available voices
`/export`	Export chat to markdown
`/mute`	Mute microphone
`/unmute`	Unmute microphone
`/help`	Show all commands

ROS2 Integration

EdgeVox can publish voice pipeline events to ROS2 topics, making it easy to add voice interaction to any robot.

# Install with ROS2 support
uv pip install -e ".[ros2]"

# Run with ROS2 bridge
edgevox --ros2

Published Topics

Topic	Type	Description
`/edgevox/transcription`	`std_msgs/String`	User's speech (STT output)
`/edgevox/response`	`std_msgs/String`	Bot's response text
`/edgevox/state`	`std_msgs/String`	Pipeline state (listening, thinking, speaking)
`/edgevox/audio_level`	`std_msgs/Float32`	Mic level (0.0-1.0)
`/edgevox/metrics`	`std_msgs/String`	JSON latency metrics

Subscribed Topics

Topic	Type	Description
`/edgevox/tts_request`	`std_msgs/String`	Send text for the bot to speak
`/edgevox/command`	`std_msgs/String`	Commands: reset, mute, unmute

Example: Robot Integration

import rclpy
from std_msgs.msg import String

# Listen to what the user says
node.create_subscription(String, '/edgevox/transcription', on_user_speech, 10)

# Make the robot say something
pub = node.create_publisher(String, '/edgevox/tts_request', 10)
msg = String()
msg.data = "I detected an obstacle ahead."
pub.publish(msg)

Architecture

                        EdgeVox Pipeline
 +-----------+     +------------+     +----------------+
 | Microphone|---->| Silero VAD |---->| faster-whisper  |
 |           |     | (32ms)     |     | (STT)          |
 +-----------+     +------------+     +--------+-------+
                                               |
                                               v
                                      +----------------+
                                      | Gemma 4 E2B IT |
                                      | (streaming)    |
                                      +--------+-------+
                                               | sentence by sentence
                                               v
 +-----------+     +------------+     +----------------+
 |  Speaker  |<----| Kokoro 82M |<----| Sentence       |
 |           |     | (TTS)      |     | Splitter       |
 +-----------+     +------------+     +----------------+
                         |
                         v (optional)
                   +------------+
                   | ROS2 Bridge|----> /edgevox/* topics
                   +------------+

Model Sizes

Component	Model	Size	RAM
VAD	Silero VAD v6	~2MB	~10MB
STT	whisper-small	500MB	~600MB
STT	whisper-large-v3-turbo	1.5GB	~2GB
LLM	Gemma 4 E2B IT Q4_K_M	1.8GB	~2.5GB
TTS	Kokoro 82M	200MB	~300MB
Wake	OpenWakeWord	~2MB	~10MB

M1 Air (8GB): whisper-small + Q4_K_M = 3.4GB PC with GPU: whisper-large-v3-turbo + Q4_K_M = 5.8GB

Documentation

Full docs: EdgeVox Docs (built with VitePress)

cd website && npm run dev

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.github/workflows		.github/workflows
docs		docs
edgevox		edgevox
scripts		scripts
tests		tests
voices		voices
website		website
webui		webui
.gitignore		.gitignore
.gitleaks.toml		.gitleaks.toml
.pre-commit-config.yaml		.pre-commit-config.yaml
CLAUDE.md		CLAUDE.md
MANIFEST.in		MANIFEST.in
README.md		README.md
pyproject.toml		pyproject.toml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

EdgeVox

Features

Hardware Requirements

Quick Start

Usage

TUI Controls

Slash Commands

ROS2 Integration

Published Topics

Subscribed Topics

Example: Robot Integration

Architecture

Model Sizes

Documentation

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

EdgeVox

Features

Hardware Requirements

Quick Start

Usage

TUI Controls

Slash Commands

ROS2 Integration

Published Topics

Subscribed Topics

Example: Robot Integration

Architecture

Model Sizes

Documentation

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages