Speech-to-text plugin for OpenCode with support for Moonshine (fast, edge-optimized) and Whisper (OpenAI's model) backends.
- Voice Input Tool - Record audio from microphone and transcribe to text
- Automatic Silence Detection - Recording stops when you stop speaking
- Multiple Backends - Moonshine (recommended), Whisper, or Faster-Whisper
- Privacy-First - All processing is done locally, no cloud APIs
- Configurable - Choose model size, language, and recording duration
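How the plugin detects silence is not described here. As a rough illustration of the idea, the sketch below stops once speech has been heard and a run of quiet frames follows. The function names, the RMS threshold, and the frame counts are assumptions made for this example, not the plugin's actual implementation.

```python
import math
from typing import Iterable, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame of samples in [-1.0, 1.0]."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def frames_until_silence(
    frames: Iterable[Sequence[float]],
    threshold: float = 0.01,      # hypothetical loudness cutoff
    max_silent_frames: int = 3,   # hypothetical length of the quiet run
) -> int:
    """Return how many frames are consumed before recording would stop.

    Leading silence is ignored: the quiet-frame counter only starts
    once at least one frame has exceeded the threshold.
    """
    heard_speech = False
    silent_run = 0
    consumed = 0
    for frame in frames:
        consumed += 1
        if rms(frame) >= threshold:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= max_silent_frames:
                break
    return consumed
```

In a real recorder the frames would come from the microphone stream, and a maximum duration would still cap the loop if the speaker never pauses.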
Add to your opencode.json:
{
"plugin": ["opencode-speech-to-text"]
}
# Base dependencies (required)
uv pip install sounddevice soundfile numpy
# Choose ONE backend:
# Option A: Moonshine (Recommended - fastest, smallest)
uv pip install useful-moonshine-onnx@git+https://github.com/moonshine-ai/moonshine.git#subdirectory=moonshine-onnx
# Option B: Whisper (OpenAI's original)
uv pip install openai-whisper
# Option C: Faster-Whisper (Optimized Whisper)
uv pip install faster-whisper
Ensure your terminal application has microphone access in your system settings.
OpenCode will automatically have access to the voice_input tool:
You: Record my voice input
Assistant: I'll record your voice now. Please speak...
[Recording starts, stops on silence]
Assistant: I heard: "Please help me refactor the authentication module"
Type /voice to get guidance on using voice input.
Use the voice_check tool to verify your setup:
You: Check my voice setup
Assistant: [Uses voice_check tool]
Available STT backends: moonshine
Current configuration:
- Backend: auto
- Model: tiny
- Language: en
- Max duration: 30s
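A check like the one above could be approximated by probing for each backend's Python module. The sketch below is a guess at that logic rather than the plugin's own code: the import names (`moonshine_onnx`, `faster_whisper`, `whisper`) are the usual ones for these packages, and the preference order used for `auto` is an assumption based on the recommendation to prefer Moonshine.

```python
from importlib.util import find_spec
from typing import Callable, List, Optional

# Backend name -> importable module name. The preference order is an assumption.
BACKEND_MODULES = {
    "moonshine": "moonshine_onnx",
    "faster-whisper": "faster_whisper",
    "whisper": "whisper",
}


def _is_installed(module: str) -> bool:
    """True if the module can be located without importing it."""
    return find_spec(module) is not None


def available_backends(
    is_installed: Callable[[str], bool] = _is_installed,
) -> List[str]:
    """List the backends whose Python module is present, in preference order."""
    return [name for name, module in BACKEND_MODULES.items() if is_installed(module)]


def resolve_backend(requested: str, available: List[str]) -> Optional[str]:
    """Map a requested backend (or "auto") to an installed one, else None."""
    if requested == "auto":
        return available[0] if available else None
    return requested if requested in available else None
```

Calling `resolve_backend("auto", available_backends())` returns `None` when no backend is installed, which is the case where installing one of the backends is needed.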
Configure via environment variables:
| Variable | Description | Default |
|---|---|---|
| STT_BACKEND | Backend: moonshine, whisper, faster-whisper, auto | auto |
| STT_MODEL | Model size: tiny, base, small, medium, large | tiny |
| STT_LANGUAGE | Language code | en |
| STT_MAX_DURATION | Max recording seconds | 30 |
| STT_PYTHON_PATH | Path to Python | python3 |
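To make the defaults concrete, here is a minimal sketch of reading these variables in Python. The variable names and defaults come from the table above; the `SttConfig` class and `load_config` function are hypothetical names for this example and are not part of the plugin.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SttConfig:
    """Settings mirrored from the STT_* environment variables."""
    backend: str = "auto"
    model: str = "tiny"
    language: str = "en"
    max_duration: int = 30        # seconds
    python_path: str = "python3"


def load_config(env: Mapping[str, str] = os.environ) -> SttConfig:
    """Build a config from an environment mapping, using the documented defaults."""
    return SttConfig(
        backend=env.get("STT_BACKEND", "auto"),
        model=env.get("STT_MODEL", "tiny"),
        language=env.get("STT_LANGUAGE", "en"),
        max_duration=int(env.get("STT_MAX_DURATION", "30")),
        python_path=env.get("STT_PYTHON_PATH", "python3"),
    )
```

For example, starting OpenCode with `STT_BACKEND=whisper STT_MODEL=base` set in the shell would correspond to `SttConfig(backend="whisper", model="base", ...)` in this sketch.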
| Backend | Model | Size | Speed | Quality |
|---|---|---|---|---|
| Moonshine | tiny | 27M (190MB) | Fastest | Good |
| Moonshine | base | 62M (400MB) | Very Fast | Better |
| Whisper | tiny | 39M | Slow | Good |
| Whisper | base | 74M | Slower | Better |
| Whisper | small | 244M | Much Slower | Great |
Recommendation: Use Moonshine tiny for the best balance of speed and accuracy.
- Moonshine: Arabic (ar), Chinese (zh), English (en), Japanese (ja), Korean (ko), Spanish (es), Ukrainian (uk), Vietnamese (vi)
- Whisper: 99+ languages (see OpenAI Whisper)
# macOS (Homebrew)
brew install portaudio ffmpeg
# Debian/Ubuntu (apt)
sudo apt install -y portaudio19-dev python3-dev ffmpeg
# Fedora (dnf)
sudo dnf install portaudio-devel python3-devel ffmpeg
Install one of the supported backends:
uv pip install useful-moonshine-onnx@git+https://github.com/moonshine-ai/moonshine.git#subdirectory=moonshine-onnx
- Check system permissions for your terminal
- Verify microphone is connected:
python -c "import sounddevice; print(sounddevice.query_devices())"
- Use Moonshine instead of Whisper
- Use the tiny model instead of larger variants
- Ensure you have a capable CPU/GPU
# Install dependencies
bun install
# Build
bun run build
# Type check
bun run typecheck
MIT
- Moonshine - Fast ASR for edge devices
- OpenAI Whisper - Robust speech recognition
- VoiceMode MCP - Inspiration for the voice interface