🎙️ WhisperHook

System-wide, privacy-focused voice typing powered by local AI.

License: MIT | Windows | Local AI | Tauri

Dictate anywhere. Instant transcription. Total privacy.

🚀 Quickstart (Local Development)

Want to start right away? If you are on Windows, just run:

.\setup.bat

Or use npm:

npm run setup

Then, launch the app:

npm run tauri dev

🎬 Demo

WhisperHook in action: instant transcription across multiple applications.

📸 Screenshots

WhisperHook Interface WhisperHook Settings

Left: The sleek, dark-themed floating indicator. Right: Transcription results in real time.


✨ Why WhisperHook?

  • 🔐 100% Local & Private: Your voice never leaves your computer. Powered by faster-whisper, everything processes offline.
  • ⚡ System-Wide Integration: Works with your browser, Word, IDEs, or any other application.
  • 🎤 Zero-Friction Dictation:
    • Push-to-Talk: Hold Ctrl+Shift+M to transcribe.
    • Real-Time: Seamless natural speaking, powered by Voice Activity Detection (VAD).
  • 🎨 Unobtrusive Design: Lives quietly in your system tray with a floating indicator.
  • ⚙️ Customizable: Choose your Whisper model (tiny to medium), hardware (CPU/CUDA), and language.

🧠 The AI Engine

WhisperHook gets its speed and privacy from two components that run entirely on your machine:

  1. faster-whisper: A high-performance reimplementation of OpenAI's Whisper model using CTranslate2. It is up to 4x faster than the original implementation and uses significantly less memory through techniques like 8-bit quantization and layer fusion.
  2. silero-vad: A professional-grade Voice Activity Detector (VAD) that filters out non-human noise and silence, ensuring the transcription engine only processes actual speech.
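As a rough illustration of how the two pieces fit together, the sketch below uses faster-whisper's `WhisperModel` with its `vad_filter` option, which runs Silero VAD before transcription. This is not WhisperHook's actual engine code: the helper names `pick_compute_type` and `transcribe_file`, the default model size, and the FP16-on-GPU / INT8-on-CPU choice are assumptions made for the example.

```python
from typing import List


def pick_compute_type(device: str) -> str:
    """Return a CTranslate2 compute type for the given device.

    Assumption for this sketch: FP16 on CUDA GPUs, INT8 on CPU.
    """
    return "float16" if device == "cuda" else "int8"


def transcribe_file(path: str, model_size: str = "small", device: str = "cpu") -> List[str]:
    """Transcribe an audio file and return the text of each speech segment."""
    # Imported lazily so pick_compute_type works without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        model_size,
        device=device,
        compute_type=pick_compute_type(device),
    )
    # vad_filter=True applies Silero VAD so silence and noise are skipped.
    segments, _info = model.transcribe(path, vad_filter=True)
    # segments is a generator; transcription happens as it is consumed.
    return [segment.text.strip() for segment in segments]
```

With faster-whisper installed, `transcribe_file("sample.wav")` would return one string per detected speech segment; `sample.wav` is a placeholder path, not a file shipped with the project.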

📊 Model Comparison & Requirements

WhisperHook supports multiple model sizes, allowing you to balance speed and accuracy based on your hardware. Below are the typical requirements for faster-whisper (running in FP16/INT8 precision):

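| Model Size | Parameters | VRAM (GPU) | RAM (CPU) | Relative Speed (Large-v3 = 1x) | Recommended Use Case                  |
|------------|------------|------------|-----------|--------------------------------|---------------------------------------|
| Tiny       | 39M        | ~390 MB    | ~500 MB   | ~32x                           | Real-time, low-end hardware.          |
| Base       | 74M        | ~500 MB    | ~600 MB   | ~16x                           | Fast transcription, good for English. |
| Small      | 244M       | ~1.0 GB    | ~1.2 GB   | ~6x                            | Best balance of speed and accuracy.   |
| Medium     | 769M       | ~2.5 GB    | ~3.0 GB   | ~2x                            | High accuracy for complex audio.      |
| Large-v3   | 1550M      | ~3.1 GB    | ~4.5 GB   | 1x                             | State-of-the-art accuracy.            |
| Turbo      | 809M       | ~3.1 GB    | ~4.0 GB   | ~8x                            | SOTA accuracy with near-Small speed.  |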

Note

Performance varies depending on your hardware (CPU vs. NVIDIA GPU with CUDA). For the best experience with larger models, an NVIDIA GPU with at least 4GB of VRAM is recommended.
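To make the trade-off concrete, here is a small sketch that picks a model for a given memory budget using the approximate figures from the table above. The `pick_model` helper is hypothetical and not part of WhisperHook, and using parameter count as a stand-in for accuracy is a simplifying assumption.

```python
from typing import NamedTuple, Optional


class ModelSpec(NamedTuple):
    name: str
    params_m: int  # parameters, in millions
    vram_mb: int   # approximate GPU memory, from the table
    ram_mb: int    # approximate CPU memory, from the table


MODELS = [
    ModelSpec("tiny", 39, 390, 500),
    ModelSpec("base", 74, 500, 600),
    ModelSpec("small", 244, 1000, 1200),
    ModelSpec("medium", 769, 2500, 3000),
    ModelSpec("large-v3", 1550, 3100, 4500),
    ModelSpec("turbo", 809, 3100, 4000),
]


def pick_model(budget_mb: int, device: str = "cpu") -> Optional[str]:
    """Return the largest model that fits the memory budget, or None."""

    def footprint(spec: ModelSpec) -> int:
        # GPU runs are limited by VRAM, CPU runs by system RAM.
        return spec.vram_mb if device == "cuda" else spec.ram_mb

    candidates = [spec for spec in MODELS if footprint(spec) <= budget_mb]
    if not candidates:
        return None
    # Assumption: more parameters roughly means higher accuracy.
    return max(candidates, key=lambda spec: spec.params_m).name
```

For example, `pick_model(4000, device="cuda")` selects `large-v3`, while the same budget on CPU falls back to `turbo` because Large-v3 needs about 4.5 GB of RAM.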


🔒 Privacy & Offline Execution

  • Zero cloud dependency: All transcription happens locally via faster-whisper.
  • No data collection: No audio or transcription data is ever transmitted or stored outside your machine.
  • No internet required: Works 100% offline. Zero telemetry.

🛠️ Local Development Setup

While we recommend using .\setup.bat for a one-click experience, here are the manual steps if you prefer:

Prerequisites

  1. Node.js (v18+)
  2. Rust & Cargo (rustup)
  3. Python (3.10+)
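If you want to verify these prerequisites before starting, a short script along the following lines can help. It is only a sketch: the command names (`node`, `python`, `cargo`) are assumptions about what is on your PATH, and since no minimum Rust version is stated here, Cargo is only checked for presence.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    """Extract the first dotted version number from --version output."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text: str, minimum: Tuple[int, ...]) -> bool:
    """True if the version found in text is at least the minimum."""
    version = parse_version(text)
    return version is not None and version >= minimum


# Minimums taken from the prerequisites list; None means "presence only".
REQUIREMENTS = {"node": (18,), "python": (3, 10), "cargo": None}


def check_tool(command: str, minimum: Optional[Tuple[int, ...]]) -> bool:
    """Run '<command> --version' and compare it against the minimum."""
    if shutil.which(command) is None:
        return False
    result = subprocess.run([command, "--version"], capture_output=True, text=True)
    # Some tools print their version to stderr, so check both streams.
    output = result.stdout + result.stderr
    return minimum is None or meets_minimum(output, minimum)


def check_all() -> dict:
    """Return a {command: ok} mapping for every prerequisite."""
    return {cmd: check_tool(cmd, minimum) for cmd, minimum in REQUIREMENTS.items()}
```

Calling `check_all()` returns a dictionary mapping each command to `True` or `False`, which you can print before running the setup steps.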

1. Python Environment Setup

The application uses a Python sidecar for transcription. Prepare the virtual environment:

cd python-core
python -m venv venv

# Windows
.\venv\Scripts\activate

# Linux/macOS
source venv/bin/activate

pip install -r requirements.txt

2. Install Frontend Dependencies

# In the project root
npm install

3. Run Development Mode

npm run tauri dev

📦 Generating the Installer

To build a production-ready installer, you must bundle both the frontend and the Python sidecar.

1. Build the Frontend

npm run build

2. Freeze the Python Sidecar

WhisperHook expects the Python engine as a standalone executable. Use PyInstaller to build it:

cd python-core
# Ensure your venv is active and requirements are installed
pyinstaller whisper-engine.spec

This generates the executable in python-core/dist/whisper-engine.
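Before moving on to the Tauri build, it can be useful to confirm that the frozen sidecar actually exists. The sketch below is a hypothetical helper, not part of the project; it assumes the executable is named `whisper-engine` (with `.exe` appended on Windows) and checks both PyInstaller one-file and one-folder layouts, since the spec file's mode is not stated here.

```python
from pathlib import Path
from typing import Optional


def find_sidecar(dist_dir: Path, name: str = "whisper-engine") -> Optional[Path]:
    """Return the frozen sidecar executable under dist_dir, or None if missing."""
    candidates = [
        dist_dir / name,                  # one-file build on Linux/macOS
        dist_dir / f"{name}.exe",         # one-file build on Windows
        dist_dir / name / name,           # one-folder build on Linux/macOS
        dist_dir / name / f"{name}.exe",  # one-folder build on Windows
    ]
    for candidate in candidates:
        # is_file() skips the one-folder directory itself.
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_sidecar(Path("python-core/dist"))` returns the path to the executable if the PyInstaller step succeeded, and `None` otherwise.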

3. Build the Tauri Application

Finally, build the installer (NSIS on Windows):

npm run tauri build

The final installer will be located in src-tauri/target/release/bundle/.


📄 License

This project is licensed under the MIT License. See the LICENSE file for details.
