Voice to text, entirely on your machine. Hold Fn, speak, release — your words are transcribed, polished, and pasted right where you need them. No cloud, no account, no latency.
Built in a weekend because I kept getting ads for Wispr Flow and thought — why not build it myself?
- Hold Fn — OpenWhisp starts listening
- Speak — your voice is captured locally
- Release Fn — Whisper transcribes your speech, a local LLM polishes the text, and the result is pasted into whatever app you were using
The entire pipeline runs locally via Whisper (speech-to-text) and Ollama (text enhancement).
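The hold-to-talk flow above can be sketched as a small pipeline. This is an illustrative sketch, not OpenWhisp's actual code: the injected `transcribe`, `enhance`, and `paste` functions are hypothetical stand-ins for the real Whisper, Ollama, and native-paste integrations.

```typescript
// Minimal sketch of the dictation pipeline, with the three stages injected
// as hypothetical stand-ins for the real integrations.
interface Pipeline {
  transcribe: (audio: Float32Array) => Promise<string>; // Whisper speech-to-text
  enhance: (raw: string) => Promise<string>;            // local LLM polish
  paste: (text: string) => Promise<void>;               // native paste helper
}

// Runs when the Fn key is released: transcribe, polish, paste.
async function onFnReleased(audio: Float32Array, p: Pipeline): Promise<string> {
  const raw = await p.transcribe(audio);  // speech -> raw text
  const polished = await p.enhance(raw);  // raw text -> cleaned-up text
  await p.paste(polished);                // into the focused app
  return polished;
}
```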
- Fully local — no data leaves your Mac
- Styles — switch between Conversation and Vibe Coding modes depending on context
- Enhancement levels — from raw transcription (No Filter) to professional polish (High)
- Intent resolution — if you change your mind mid-sentence ("make it white... actually, black"), OpenWhisp resolves to your final intent
- Auto-paste — refined text is pasted directly into the active app
- Auto-launch Ollama — if Ollama is installed, OpenWhisp starts it automatically
- Setup wizard — guided first-launch experience for permissions, models, and configuration
- Minimal overlay — a small audio-reactive grid appears at the bottom of your screen during dictation
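The overlay's audio reactivity comes down to deriving a loudness level from the microphone samples each frame. A rough sketch of an RMS level meter (illustrative only, not the code in `audio-recorder.ts`):

```typescript
// Compute a 0..1 loudness level from one frame of PCM samples in the -1..1
// range, e.g. a buffer filled by an AnalyserNode's getFloatTimeDomainData().
function rmsLevel(samples: Float32Array): number {
  let sum = 0;
  for (const s of samples) sum += s * s;
  return Math.sqrt(sum / Math.max(samples.length, 1)); // root mean square
}
```

In a real recorder this runs once per animation frame, and the resulting level drives the grid's brightness.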
| Style | Use case |
|---|---|
| Conversation | Messages, emails, notes, everyday writing |
| Vibe Coding | Developer communication — translates casual speech into proper engineering language |
Each style has four enhancement levels: No Filter, Soft, Medium, and High.
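One way to organize the style/level combinations is a small prompt matrix, as the repo's `prompts.ts` suggests. The prompt strings below are invented for illustration; the real prompts live in `src/main/prompts.ts`:

```typescript
type Style = "conversation" | "vibeCoding";
type Level = "noFilter" | "soft" | "medium" | "high";

// Hypothetical prompt fragments -- not OpenWhisp's actual prompts.
const styleRules: Record<Style, string> = {
  conversation: "Keep a natural, personal tone.",
  vibeCoding: "Translate casual speech into precise engineering language.",
};

const levelRules: Record<Level, string> = {
  noFilter: "Return the transcription verbatim.",
  soft: "Fix only obvious transcription errors.",
  medium: "Clean up grammar and filler words.",
  high: "Rewrite for professional polish.",
};

// No Filter skips style rules entirely; other levels combine style + level.
function buildSystemPrompt(style: Style, level: Level): string {
  if (level === "noFilter") return levelRules.noFilter;
  return `${styleRules[style]} ${levelRules[level]}`;
}
```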
- macOS (Apple Silicon recommended)
- Ollama — OpenWhisp auto-launches it if installed
- ~10 GB disk space for models (downloaded on first launch)
1. Install Ollama and download the text model first:
```bash
# Install Ollama from https://ollama.com/download/mac, then:
ollama serve

# In a new terminal, pull the text enhancement model (~9.6 GB)
ollama pull gemma4:e4b
```

2. Clone and run OpenWhisp:
```bash
git clone https://github.com/giusmarci/openwhisp.git
cd openwhisp
npm install
npm run build:native
npm run dev
```

On first launch, the setup wizard will walk you through:
- Ollama — verifies the connection. If Ollama is running, it connects automatically.
- Speech model — downloads Whisper Base Multilingual (~150 MB) automatically.
- Text model — detects the Gemma 4 model you already pulled.
- Permissions — microphone access for recording, plus Accessibility and Input Monitoring for Fn key listening and auto-paste.
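The wizard's Ollama and text-model checks can be approximated with Ollama's HTTP API: `GET /api/tags` on the default port 11434 lists installed models. A sketch of the detection logic (the function names are mine, not OpenWhisp's):

```typescript
// Shape of (part of) the GET http://localhost:11434/api/tags response.
interface TagsResponse {
  models: { name: string }[];
}

// True if a model matching the wanted name (with or without the ":tag"
// suffix) is installed locally.
function hasModel(tags: TagsResponse, wanted: string): boolean {
  const base = wanted.split(":")[0];
  return tags.models.some(
    (m) => m.name === wanted || m.name.split(":")[0] === base
  );
}

// Returns false when Ollama is unreachable (i.e. not running).
async function detectTextModel(wanted: string): Promise<boolean> {
  try {
    const res = await fetch("http://localhost:11434/api/tags");
    return hasModel((await res.json()) as TagsResponse, wanted);
  } catch {
    return false;
  }
}
```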
After setup, click into the text field where you want the text to go (an email, chat, code editor, etc.), then hold Fn and speak. When you release, the transcribed and enhanced text is automatically pasted into that field. If you move away or no text field is selected, the text is still copied to your clipboard — just use Cmd+V to paste it wherever you need.
| Purpose | Model | Size |
|---|---|---|
| Speech-to-text | onnx-community/whisper-base | ~150 MB |
| Text enhancement | gemma4:e4b | ~9.6 GB |
You can switch to any Ollama-compatible model from the Models page.
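Model switching works because the enhancement step is ultimately a call to Ollama's `/api/generate` endpoint with the chosen model name. A sketch of such a request; the prompt text is invented, and `enhanceWith` is a hypothetical helper, not OpenWhisp's client:

```typescript
// Build the JSON body for POST http://localhost:11434/api/generate.
function buildGenerateBody(model: string, text: string): string {
  return JSON.stringify({
    model,                                     // any installed Ollama model
    prompt: `Polish this dictation:\n${text}`, // illustrative prompt
    stream: false,                             // single JSON response, no streaming
  });
}

async function enhanceWith(model: string, text: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildGenerateBody(model, text),
  });
  const data = (await res.json()) as { response: string };
  return data.response; // the model's generated text
}
```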
- Electron + React + TypeScript — desktop shell and UI
- @huggingface/transformers — local Whisper inference
- Ollama — local LLM inference via API
- Swift — native macOS helper for Fn key listening, focus detection, and paste simulation
- electron-vite — build tooling
- Hugeicons — UI icons
```bash
npm run package
```

Builds the Electron app, compiles the Swift helper, and packages everything into a `.dmg` and `.zip` in the `release/` directory.
```
src/
  main/                  # Electron main process
    dictation.ts         # Transcription + rewrite pipeline
    ollama.ts            # Ollama API client + auto-launch
    prompts.ts           # Global rules + style + level prompt matrix
    settings.ts          # Settings persistence
    windows.ts           # Window creation and positioning
  renderer/              # React UI
    App.tsx              # Sidebar layout, pages, setup wizard, overlay
    styles.css           # Complete styling
    audio-recorder.ts    # Web Audio recorder with level metering
  preload/               # Electron preload bridge
  shared/                # Shared types and constants
swift/
  OpenWhispHelper.swift  # Native macOS helper
```
MIT
Made by Raelume
