System-wide, privacy-focused voice typing powered by local AI.
Dictate anywhere. Instant transcription. Total privacy.
Wanna start right away? If you are on Windows, just run:

```shell
.\setup.bat
```

Or use npm:

```shell
npm run setup
```

Then, launch the app:

```shell
npm run tauri dev
```
WhisperHook in action: instant transcription across multiple applications.
Left: The sleek, dark-themed floating indicator. Right: Transcription results in real-time.
- 🔒 **100% Local & Private**: Your voice never leaves your computer. Powered by `faster-whisper`, everything processes offline.
- ⚡ **System-Wide Integration**: Works with your browser, Word, IDEs, or any other application.
- 🎤 **Zero-Friction Dictation**:
  - **Push-to-Talk**: Hold `Ctrl+Shift+M` to transcribe.
  - **Real-Time**: Speak naturally and seamlessly, powered by Voice Activity Detection (VAD).
- 🎨 **Unobtrusive Design**: Lives quietly in your system tray with a floating indicator.
- ⚙️ **Customizable**: Choose your Whisper model (tiny to medium), hardware (CPU/CUDA), and language.
WhisperHook achieves its performance and privacy by combining two state-of-the-art AI technologies:

- `faster-whisper`: A high-performance reimplementation of OpenAI's Whisper model using CTranslate2. It is up to 4x faster than the original implementation and uses significantly less memory through techniques like 8-bit quantization and layer fusion.
- `silero-vad`: A professional-grade Voice Activity Detector (VAD) that filters out non-human noise and silence, ensuring the transcription engine only processes actual speech.
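The role VAD plays in the pipeline can be illustrated with a toy energy gate. This is a simplified stand-in, not how silero-vad actually works (silero-vad uses a trained neural model); the `vad_gate` helper and its threshold below are invented for illustration. The pipeline role is the same, though: drop silence before the transcription engine ever sees it.

```python
import math

def frame_energy(samples):
    """RMS energy of one audio frame (a list of float samples)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def vad_gate(frames, threshold=0.01):
    """Keep only frames whose energy exceeds the threshold.

    Toy stand-in for silero-vad: a real VAD classifies speech with a
    trained model, but either way only the surviving frames are
    forwarded to the transcriber.
    """
    return [f for f in frames if frame_energy(f) > threshold]

# Synthetic demo: one "speech" frame (a 220 Hz sine burst at 16 kHz)
# and one silent frame.
speech = [0.5 * math.sin(2 * math.pi * 220 * t / 16000) for t in range(160)]
silence = [0.0] * 160
print(len(vad_gate([speech, silence])))  # → 1 (the silent frame is dropped)
```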
WhisperHook supports multiple model sizes, allowing you to balance speed and accuracy based on your hardware. Below are the typical requirements for faster-whisper (running in FP16/INT8 precision):
| Model Size | Parameters | VRAM (GPU) | RAM (CPU) | Speed (Rel.) | Recommended Use Case |
|---|---|---|---|---|---|
| Tiny | 39M | ~390 MB | ~500 MB | ~32x | Real-time, low-end hardware. |
| Base | 74M | ~500 MB | ~600 MB | ~16x | Fast transcription, good for English. |
| Small | 244M | ~1.0 GB | ~1.2 GB | ~6x | Best balance of speed and accuracy. |
| Medium | 769M | ~2.5 GB | ~3.0 GB | ~2x | High accuracy for complex audio. |
| Large-v3 | 1550M | ~3.1 GB | ~4.5 GB | 1x | State-of-the-art accuracy. |
| Turbo | 809M | ~3.1 GB | ~4.0 GB | ~8x | SOTA accuracy with near-Small speed. |
> [!NOTE]
> Performance varies depending on your hardware (CPU vs. NVIDIA GPU with CUDA, or MPS on a Mac). For the best experience, an NVIDIA GPU with at least 4 GB of VRAM is recommended for larger models.
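As a quick way to read the table against your hardware, here is a toy helper (not part of WhisperHook) that picks the largest model whose GPU VRAM figure from the table fits a given budget. It covers a size-ordered subset of the table:

```python
# VRAM figures (MB) taken from the table above, ordered by parameter count.
# Toy helper for reading the table -- not part of WhisperHook itself.
MODELS = [
    ("tiny", 390),
    ("base", 500),
    ("small", 1000),
    ("medium", 2500),
    ("large-v3", 3100),
]

def largest_fitting_model(vram_mb):
    """Return the largest model whose VRAM requirement fits the budget."""
    fitting = [name for name, need in MODELS if need <= vram_mb]
    return fitting[-1] if fitting else None

print(largest_fitting_model(4096))  # → large-v3 (matches the 4 GB recommendation)
print(largest_fitting_model(800))   # → base
```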
- **Zero cloud dependency**: All transcription happens locally via `faster-whisper`.
- **No data collection**: No audio or transcription data is ever transmitted or stored outside your machine.
- **No internet required**: Works 100% offline. Zero telemetry.
While we recommend using `.\setup.bat` for a one-click experience, here are the manual steps if you prefer:
- Node.js (v18+)
- Rust & Cargo (`rustup`)
- Python (3.10+)
The application uses a Python sidecar for transcription. Prepare the virtual environment:
```shell
cd python-core
python -m venv venv

# Windows
.\venv\Scripts\activate
# Linux/macOS
source venv/bin/activate

pip install -r requirements.txt
```

Then install the Node dependencies and launch the dev build:

```shell
# In the project root
npm install
npm run tauri dev
```

To build a production-ready installer, you must bundle both the frontend and the Python sidecar.
```shell
# Build the frontend
npm run build
```

WhisperHook expects the Python engine as a standalone executable. Use PyInstaller to build it:
```shell
cd python-core
# Ensure your venv is active and requirements are installed
pyinstaller whisper-engine.spec
```

This generates the executable in `python-core/dist/whisper-engine`.
Finally, build the installer (NSIS on Windows):

```shell
npm run tauri build
```

The final installer will be located in `src-tauri/target/release/bundle/`.
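For context on what the bundled engine does at runtime: the Python sidecar is launched by the Tauri host as a child process and driven over stdio. Below is a hedged sketch of that general pattern using line-delimited JSON. The message shape (`{"cmd": ...}`) is hypothetical; the actual protocol is defined in `python-core` and may differ.

```python
import json
import sys

def handle_request(req):
    """Dispatch one request from the host app.

    The {"cmd": ...} message shape here is invented for illustration;
    the real protocol lives in python-core.
    """
    if req.get("cmd") == "ping":
        return {"ok": True}
    if req.get("cmd") == "transcribe":
        # The real sidecar would run faster-whisper on the audio here.
        return {"ok": True, "text": "<transcription of %s>" % req.get("audio_path", "")}
    return {"ok": False, "error": "unknown command"}

def main():
    # Line-delimited JSON over stdin/stdout: one request per line in,
    # one response per line out.
    for line in sys.stdin:
        resp = handle_request(json.loads(line))
        sys.stdout.write(json.dumps(resp) + "\n")
        sys.stdout.flush()

if __name__ == "__main__":
    main()
```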
This project is licensed under the MIT License. See the LICENSE file for details.


