Control your terminal with your voice. No typing. No keyboard. Just speak.
Voice Shell is a browser-based terminal you operate entirely by voice. Tap the mic, say "show files" or "создай папку app" (Russian for "create folder app"), and it executes the command in real time. Built for developers who want to stay focused on their code instead of typing shell commands.
Typing commands breaks flow. Switching between IDE and terminal, remembering flags, mistyping paths — it all adds friction. Voice Shell removes the keyboard from the equation entirely.
- Tap the mic in your browser
- Speak naturally — "create folder app", "go to src", "show files"
- Watch it execute — stdout/stderr streams back in real time via WebSocket
- Hear confirmations — dangerous commands ask for a verbal "yes" before running
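The streaming step above can be sketched with Node's built-in `child_process.spawn`. This is a minimal illustration, not the project's actual code: the `{ type, data }` message shape and the names `toFrame` and `streamCommand` are hypothetical, and `send` stands in for whatever delivers a message to the browser (for example `socket.send` on a `ws` connection).

```javascript
const { spawn } = require("node:child_process");

// Wrap one piece of process output as a JSON message for the browser.
// The { type, data } shape is an assumed wire format, not the project's.
function toFrame(type, data) {
  return JSON.stringify({ type, data: String(data) });
}

// Run a command and forward its output chunk by chunk through `send`,
// instead of buffering everything until the process exits.
function streamCommand(command, args, send, cwd = process.cwd()) {
  const child = spawn(command, args, { cwd });
  child.stdout.on("data", (chunk) => send(toFrame("stdout", chunk)));
  child.stderr.on("data", (chunk) => send(toFrame("stderr", chunk)));
  child.on("error", (err) => send(toFrame("error", err.message)));
  child.on("close", (code) => send(toFrame("exit", code)));
  return child;
}
```

For a quick local check, `streamCommand("ls", ["-la"], (frame) => console.log(frame))` prints each frame as it arrives.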
| Layer | Tech |
|---|---|
| STT | ElevenLabs Scribe v2 (99+ languages, real-time streaming) |
| TTS | ElevenLabs Flash v2.5 (instant voice confirmations) |
| NLU | Regex-based natural language parser (25+ patterns, multilingual) |
| Shell | Node.js child_process.spawn with safety wrappers |
| Transport | WebSocket (ws) for real-time bidirectional streaming |
| UI | Vanilla JS + CSS (Matrix terminal aesthetic) |
| Editor | Cursor IDE (AI-assisted development) |
- Scribe v2 transcribes voice in 99+ languages with low latency. The entire pipeline (speech → text → shell command → stdout) completes in under 2 seconds.
- Flash v2.5 generates instant TTS confirmations ("Folder created", "Confirm delete?"). No external audio files, fully dynamic.
- Multilingual by default — the parser handles English, Russian, Ukrainian, Spanish, German, and Japanese out of the box.
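As a rough sketch of how a dynamic confirmation such as "Folder created" might be requested, the snippet below builds a call to the ElevenLabs text-to-speech endpoint with the Flash v2.5 model. The function names and the `voiceId` parameter are hypothetical (this README does not say which voice is used), and the project's real integration may differ, for instance by using an SDK or a streaming endpoint.

```javascript
// Build a fetch() request for the ElevenLabs text-to-speech endpoint.
// `voiceId` is a placeholder; the voice used by the project is not documented here.
function buildTtsRequest(text, voiceId, apiKey) {
  return {
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    options: {
      method: "POST",
      headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
      body: JSON.stringify({ text, model_id: "eleven_flash_v2_5" }),
    },
  };
}

// Fetch the spoken confirmation as audio bytes (needs Node 18+ for global fetch).
async function speak(text, voiceId, apiKey) {
  const { url, options } = buildTtsRequest(text, voiceId, apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);
  return Buffer.from(await res.arrayBuffer());
}
```

Keeping the request builder separate from the network call makes the payload easy to inspect without spending API credits.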
Dangerous commands (rm, sudo, kill) trigger a voice confirmation. The terminal literally asks "Confirm: rm -rf app. Say yes or no." No accidental deletions.
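One way to picture the confirmation flow is a small gate that holds a dangerous command until the next utterance answers it. This is a sketch under assumptions: `createConfirmationGate`, the `{ action, ... }` result shape, and the accepted "yes" words are hypothetical, while the dangerous command names and the prompt wording come from the description above.

```javascript
// Commands that need a spoken "yes" before they run (from the README).
const DANGEROUS = new Set(["rm", "sudo", "kill"]);
// Assumed list of accepted confirmations; "да" is Russian for "yes".
const YES_WORDS = new Set(["yes", "да"]);

function createConfirmationGate() {
  let pending = null; // the command waiting for an answer, if any
  return {
    // Decide whether a parsed command runs now or needs confirmation.
    submit(cmd) {
      if (!DANGEROUS.has(cmd.command)) return { action: "run", cmd };
      pending = cmd;
      const line = [cmd.command, ...cmd.args].join(" ");
      return { action: "ask", prompt: `Confirm: ${line}. Say yes or no.` };
    },
    // Treat the next utterance as the answer; anything but "yes" cancels.
    answer(reply) {
      if (!pending) return { action: "ignore" };
      const cmd = pending;
      pending = null;
      const word = reply.trim().toLowerCase().replace(/[.!?]+$/, "");
      return YES_WORDS.has(word) ? { action: "run", cmd } : { action: "cancel", cmd };
    },
  };
}
```

Cancelling on anything other than an explicit "yes" keeps a misheard reply from deleting files.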
git clone https://github.com/checkra1neth/voice-shell.git
cd voice-shell
npm install
cp .env.example .env
# Add your ELEVENLABS_API_KEY
npm start
# Open http://localhost:3001

| Variable | Required | Description |
|---|---|---|
| ELEVENLABS_API_KEY | Yes | ElevenLabs API key |
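A server can fail fast when the required key is missing. The sketch below assumes the values from `.env` have already been loaded into `process.env` (how the project loads them is not stated here); `loadConfig` is a hypothetical helper, and the port matches the URL in the setup steps.

```javascript
// Validate required settings once at startup.
// Assumes .env values are already present in the environment object.
function loadConfig(env = process.env) {
  const apiKey = (env.ELEVENLABS_API_KEY || "").trim();
  if (!apiKey) {
    throw new Error("ELEVENLABS_API_KEY is required; add it to .env");
  }
  // 3001 is the port shown in the setup steps.
  return { apiKey, port: 3001 };
}
```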
| Natural Language | Shell Command |
|---|---|
| "show files" / "покажи файлы" | ls -la |
| "create folder [name]" / "создай папку [name]" | mkdir [name] |
| "delete folder [name]" / "удали папку [name]" | rm -rf [name] (confirm) |
| "go to [name]" / "перейди в [name]" | cd [name] |
| "go back" / "вернись назад" | cd .. |
| "create file [name]" / "создай файл [name]" | touch [name] |
| "show content [file]" / "покажи содержимое [file]" | cat [file] |
| "clear screen" / "очисти экран" | clear |
| "git status" / "git статус" | git status |
| "npm start" | npm start |
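A few rows of the table above can be expressed as a regex-based parser along these lines. It is only an illustration: the `PATTERNS` list, the `parseUtterance` name, and the `{ command, args, confirm }` result shape are hypothetical, and the project's own parser (described as having 25+ patterns) may be structured differently.

```javascript
// Each entry pairs a regex with a builder for the resulting shell command.
// English and Russian phrasings share one pattern, as in the table.
const PATTERNS = [
  { re: /^(?:show files|покажи файлы)$/i,
    build: () => ({ command: "ls", args: ["-la"] }) },
  { re: /^(?:create folder|создай папку)\s+(\S+)$/i,
    build: (m) => ({ command: "mkdir", args: [m[1]] }) },
  { re: /^(?:delete folder|удали папку)\s+(\S+)$/i,
    build: (m) => ({ command: "rm", args: ["-rf", m[1]], confirm: true }) },
  { re: /^(?:go to|перейди в)\s+(\S+)$/i,
    build: (m) => ({ command: "cd", args: [m[1]] }) },
  { re: /^(?:go back|вернись назад)$/i,
    build: () => ({ command: "cd", args: [".."] }) },
];

// Return the first matching command, or null if nothing matches.
function parseUtterance(text) {
  // Transcripts often end with punctuation; strip it before matching.
  const cleaned = text.trim().replace(/[.!?]+$/, "");
  for (const { re, build } of PATTERNS) {
    const match = cleaned.match(re);
    if (match) return build(match);
  }
  return null;
}
```

Returning the command and its arguments separately, rather than one shell string, lets the caller pass them to `spawn` without shell interpolation. Because `cd` is a shell builtin, running it as a child process would not change later commands, so a server would likely track the working directory itself.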
- Hands-free terminal — the entire workflow is voice-driven, from navigation to file operations
- Real-time streaming — not polling, not batching. True WebSocket streaming of shell output
- Cross-language — switch languages mid-session, parser adapts automatically
- Production safety — verbal confirmation for destructive operations, not just a UI checkbox
| Format | File |
|---|---|
| TikTok / Reels (9:16) | demo/VoiceShell-Final-Hackathon.mp4 |
| YouTube / Twitter (16:9) | demo/VoiceShell-YouTube-Twitter.mp4 |
MIT