patelnav/voca

🎙️ Voca

Give Claude Code the ability to hear and speak.

Apple Silicon Required · 100% Local · Low Latency


Voca is a Claude Code plugin that adds an ambient voice interface to your coding sessions. It listens to your microphone, transcribes speech in real time, and speaks back, all running locally on Apple Silicon. No cloud APIs, no network round trips, no data leaving your machine.

✨ Features

  • 🗣️ Natural Voice Interface: Talk to Claude Code naturally as you work.
  • ⚡️ Real-time Processing: Sub-second transcription ensures fluid conversations.
  • 🔒 Privacy First: 100% local inference. Your audio never leaves your machine.
  • 🎙️ Advanced Audio Routing: Seamless text-to-speech and voice activity detection built directly in.
  • 🔁 Instant Focus Handoff: Starting voice mode in a second session interrupts the active poll and transfers ownership immediately.

🚀 Getting Started

Prerequisites

All inference runs on-device using MLX.

⚠️ Requirement: Voca runs only on macOS with Apple Silicon (M1 or later).

Installation

Add the marketplace and install the plugin:

/plugin marketplace add patelnav/voca
/plugin install voca@patelnav-voca

Setup & Usage

  1. Configure Devices: Run the setup command to configure your audio input/output devices and verify that everything is working:

    /voca:setup

  2. Start a Voice Session: Once setup is complete, start the ambient voice interface:

    /voca:start

Voca uses token-based HTTP polling under the hood. The first poll claims focus; every later poll in that session reuses the returned token. Starting voice mode in another session claims a new token and hands off listening immediately.
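
The focus rule described above can be sketched in a few lines of Python. This is an illustration only, not Voca's actual code: the `FocusManager` class and its `claim` and `poll` methods are hypothetical names, and the HTTP layer and the interruption of an in-flight poll are left out.

```python
import secrets
import threading
from typing import Optional


class FocusManager:
    """Hypothetical model of token-based focus: one owner at a time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._owner: Optional[str] = None  # token of the session that owns listening

    def claim(self) -> str:
        """First poll of a session: issue a fresh token and take focus."""
        token = secrets.token_hex(8)
        with self._lock:
            self._owner = token  # any previous owner loses focus here
        return token

    def poll(self, token: str) -> bool:
        """Later polls: True only while this token still owns focus."""
        with self._lock:
            return token == self._owner


focus = FocusManager()
session_a = focus.claim()          # session A starts voice mode
assert focus.poll(session_a)       # A keeps polling with its token
session_b = focus.claim()          # session B starts voice mode
assert not focus.poll(session_a)   # A's token no longer owns focus
assert focus.poll(session_b)       # B now receives the audio
```

In this sketch a stale token simply gets `False` on its next poll. A real server would presumably also wake the poll that is currently waiting, so the handoff is immediate.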

Claude will now listen in the background and respond seamlessly to your voice.

🛠️ Tech Stack

Voca runs entirely on local models and tools, chosen for speed and privacy:

| Component | Technology | Description |
| --- | --- | --- |
| Speech-to-Text | Parakeet MLX v2 | Blazing fast transcription (~0.1s on M-series) |
| Text-to-Speech | Kokoro via mlx-audio | High-quality, natural voice generation |
| Voice Activity | Silero VAD | Enterprise-grade voice activity detection |
| Runtime | Python MCP | Model Context Protocol server over stdio |
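
For readers new to MCP over stdio: the protocol exchanges JSON-RPC 2.0 messages, one per line, over the server's stdin and stdout. The sketch below shows only that framing. The tool name `listen` is a made-up placeholder, not necessarily a tool Voca exposes, and the helper functions are hypothetical.

```python
import json


def encode_message(msg: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the line stays intact.
    return (json.dumps(msg, separators=(",", ":")) + "\n").encode("utf-8")


def decode_messages(buffer: bytes) -> list:
    """Split a byte buffer into lines and parse each as a JSON message."""
    return [json.loads(line) for line in buffer.splitlines() if line.strip()]


# A tool-call request as a client might send it.
# "listen" is a hypothetical tool name used only for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "listen", "arguments": {}},
}

wire = encode_message(request)
assert wire.endswith(b"\n") and wire.count(b"\n") == 1
assert decode_messages(wire) == [request]
```

A real server would normally use an MCP SDK rather than framing messages by hand. The point here is only what travels over the pipe.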
