Double-tap a key. Talk. Tap once. Your words appear.
System-wide voice dictation for macOS, Windows, and Linux. No cloud lock-in — uses Groq's free Whisper API (or fully offline MLX Whisper on Apple Silicon).
You're typing in any app. Hands on the keyboard.
Step 1 — Double-tap ⌘ (or Ctrl on Win/Linux)
┌──────────────────────────────────────────────────────────────┐
│ │
│ Listening... │
│ │
└──▬▬▬▬▬▬▬▬▬▬▬▬▬ soft green glow at the bottom ▬▬▬▬▬▬▬▬▬▬▬▬──┘
Step 2 — Speak. Live text streams in as you talk.
┌──────────────────────────────────────────────────────────────┐
│ │
│ "let's ship this feature by friday" ... │
│ │
└──▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬──┘
Step 3 — Single-tap ⌘ (or Ctrl). Final text pastes at your cursor.
┌──────────────────────────────────────────────────────────────┐
│ │
│ "Let's ship this feature by Friday." ✓ │
│ │
└──────────────────── green flash ─────────────────────────────┘
sequenceDiagram
participant U as You
participant K as Keyboard hook
participant M as Microphone
participant G as Groq Whisper
participant A as Your app
U->>K: Double-tap ⌘ / Ctrl
K->>M: Start capture
Note over M: Live preview every 2s
M->>G: Stream audio chunks
G-->>K: "let's ship..."
U->>K: Single tap ⌘ / Ctrl
K->>M: Stop
M->>G: Final audio
G-->>K: "Let's ship this feature by Friday."
K->>A: Paste at cursor (⌘V / Ctrl+V)
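The "live preview every 2s" step in the diagram can be sketched as a small background loop. This is a minimal illustration, not the project's actual code: `live_loop`, `get_audio`, `transcribe`, and `on_text` are hypothetical names, and only the 2-second interval comes from this README.

```python
import threading

LIVE_INTERVAL = 2.0  # seconds between live previews, as in the diagram


def live_loop(stop_event, get_audio, transcribe, on_text, interval=LIVE_INTERVAL):
    """Every `interval` seconds, transcribe the audio so far and publish the text.

    get_audio, transcribe and on_text are caller-supplied callables
    (hypothetical names used only for this sketch).
    """
    # Event.wait() returns True as soon as stop_event is set, so a stop
    # request ends the loop without waiting out the full interval.
    while not stop_event.wait(interval):
        audio = get_audio()
        if len(audio) == 0:
            continue  # nothing captured yet
        on_text(transcribe(audio))
```

Running it on a daemon thread and calling `stop_event.set()` on the single tap would leave the final transcription to the main path.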
Pick your OS:

macOS:

Easiest - download .app:
From source:
git clone https://github.com/murataslan1/stt
cd stt/macos
./install.sh
python3 stt.py

Windows:

From source:
git clone https://github.com/murataslan1/stt
cd stt\windows
pip install -r requirements.txt
python stt_windows.py
Build standalone .exe: build.bat

Linux:

git clone https://github.com/murataslan1/stt
cd stt/linux
./install.sh
python3 stt_linux.py
Handles apt / dnf / pacman automatically. X11 works out of the box. Wayland: see caveats below.

| | macOS | Windows | Linux |
|---|---|---|---|
| Start recording | Double-tap ⌘ | Double-tap Ctrl | Double-tap Ctrl |
| Stop & paste | Single-tap ⌘ | Single-tap Ctrl | Single-tap Ctrl |
That's it. No menu, no clicks. Just the key.
The menu bar (macOS) lets you toggle between Groq (fast, cloud) and Local (offline MLX) mode or change your API key.
First launch shows this:
┌─────────────────────────────────────────┐
│ Welcome to STT │
│ │
│ Enter your Groq API key │
│ (free at console.groq.com/keys) │
│ │
│ ┌───────────────────────────────────┐ │
│ │ gsk_... │ │
│ └───────────────────────────────────┘ │
│ │
│ [ Save & Start ] │
└─────────────────────────────────────────┘
Key is stored at ~/.config/stt/settings.json (chmod 600 recommended). You can also set GROQ_API_KEY as an env var.
Get a free key: https://console.groq.com/keys (takes 30 seconds)
Skip the key? macOS falls back to local MLX Whisper (downloads ~1.5GB once, runs fully offline).
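As a rough sketch of how the key lookup described above could work: check `GROQ_API_KEY` first, then the settings file, and treat "no key" as the cue for local mode. The JSON field name `groq_api_key`, the env-before-file precedence, and both function names are assumptions for illustration; check your own `settings.json` for the real layout.

```python
import json
import os
from pathlib import Path

SETTINGS_PATH = Path.home() / ".config" / "stt" / "settings.json"
KEY_FIELD = "groq_api_key"  # assumed field name, not confirmed by the project


def load_api_key(path=SETTINGS_PATH, env=os.environ):
    """Return the key from GROQ_API_KEY, else from the settings file, else None."""
    key = env.get("GROQ_API_KEY", "").strip()
    if key:
        return key
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None  # missing or unreadable file: caller can fall back to local mode
    value = data.get(KEY_FIELD) if isinstance(data, dict) else None
    return value.strip() or None if isinstance(value, str) else None


def save_api_key(key, path=SETTINGS_PATH):
    """Store the key, keep other settings, and restrict the file to its owner."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        data = {}
    if not isinstance(data, dict):
        data = {}
    data[KEY_FIELD] = key
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    os.chmod(path, 0o600)  # the "chmod 600" recommended above; limited effect on Windows
```

A caller could then pick the engine with something like `mode = "groq" if load_api_key() else "local"`.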
flowchart LR
A[Double-tap<br/>detection] --> B{Recording?}
B -->|no| C[Start<br/>InputStream]
B -->|yes| D[Stop &<br/>transcribe]
C --> E[Bottom-bar<br/>overlay]
C --> F[Live loop<br/>every 2s]
F -->|Groq API| G[Live text]
G --> E
D -->|Groq API| H[Final text]
H --> I[Clipboard]
I --> J[Synthesize<br/>⌘V / Ctrl+V]
J --> K[Paste at<br/>cursor ✓]
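The first two nodes of the flowchart (double-tap detection, then the "Recording?" branch) amount to a small state machine. The sketch below is one way to write it, assuming the 0.4-second window from the config section; the `TapToggle` class and its method names are hypothetical and not taken from the project's source.

```python
import time

DOUBLE_TAP_WINDOW = 0.4  # max seconds between taps


class TapToggle:
    """Double-tap starts recording; a single tap while recording stops it."""

    def __init__(self, window=DOUBLE_TAP_WINDOW, clock=time.monotonic):
        self.window = window
        self.clock = clock          # injectable so the logic can be tested
        self.recording = False
        self._last_tap = None       # time of the previous tap while idle

    def on_tap(self):
        """Handle one tap of the hotkey. Returns 'start', 'stop', or None."""
        now = self.clock()
        if self.recording:
            # Any single tap ends the recording.
            self.recording = False
            self._last_tap = None
            return "stop"
        if self._last_tap is not None and now - self._last_tap <= self.window:
            # Second tap arrived inside the window: begin recording.
            self.recording = True
            self._last_tap = None
            return "start"
        # First tap, or the previous one was too long ago.
        self._last_tap = now
        return None
```

A key hook would call `on_tap()` on each release of ⌘ or Ctrl and start or stop the audio stream based on the return value.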
Key files per platform:
| Piece | macOS | Windows | Linux |
|---|---|---|---|
| Entry | macos/stt.py | windows/stt_windows.py | linux/stt_linux.py |
| Key hook | NSEvent.addGlobalMonitor | pynput | pynput (X11) |
| Overlay | AppKit NSWindow | Tkinter Canvas | Tkinter Canvas |
| Clipboard | NSPasteboard | pyperclip | xclip / wl-copy |
| Paste key | Quartz.CGEvent ⌘V | pynput Ctrl+V | xdotool / wtype |
Open the entry file for your OS — config is at the top:
DOUBLE_TAP_WINDOW = 0.4 # max seconds between taps
LIVE_INTERVAL = 2.0 # live transcription refresh
PREVIEW_LINGER = 3.0 # how long final text stays on screen
GROQ_MODEL = "whisper-large-v3-turbo"

Different hotkey? Change the key check in handle_event (macOS) or _key_listener (Win/Linux).
Different STT engine? Replace transcribe_audio() — takes a numpy float32 array, returns a string.
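Most replacement engines want a file-like payload rather than raw samples, so a swapped-in `transcribe_audio()` usually starts by encoding the array as WAV. The sketch below shows that step using only the standard library. It assumes mono, 1-D samples in the range -1.0 to 1.0 and a 16 kHz sample rate (an assumption; match whatever the entry file records at). The `engine` parameter is a hypothetical hook for your own backend.

```python
import io
import struct
import wave

SAMPLE_RATE = 16000  # assumed capture rate, not confirmed by the project


def float_samples_to_wav(samples, sample_rate=SAMPLE_RATE):
    """Encode mono float samples in [-1.0, 1.0] as 16-bit PCM WAV bytes."""
    pcm = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        pcm += struct.pack("<h", int(clipped * 32767))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm))
    return buf.getvalue()


def transcribe_audio(audio, engine=None):
    """Same shape as the app's hook: audio samples in, text out."""
    if len(audio) == 0:
        return ""
    if engine is None:
        raise NotImplementedError("plug in your STT engine here")
    # `engine` is any callable that takes WAV bytes and returns text.
    return engine(float_samples_to_wav(audio)).strip()
```

With NumPy available, `(np.clip(audio, -1, 1) * 32767).astype("<i2").tobytes()` does the same conversion much faster than the per-sample loop.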
macOS: nothing happens when I double-tap ⌘
Grant Accessibility permission:
System Settings → Privacy & Security → Accessibility → toggle on for STT.app (or Terminal if running from source).
Then restart the app.
macOS: "STT.app is damaged and can't be opened"
Gatekeeper is blocking the unsigned app. Run once:
xattr -dr com.apple.quarantine /Applications/STT.app

Linux Wayland: key hook doesn't fire
pynput can't grab global keys on most Wayland compositors by design. Options:
- Switch to X11 session at login (GDM/SDDM session picker).
- Use a compositor shortcut: bind Super+Space (or whatever) to kill -USR1 $(pgrep -f stt_linux.py) and patch the script to toggle on that signal.
- Run under Xorg; most distros still ship an X session.
The paste side works fine on Wayland (uses wl-copy + wtype / ydotool).
Live transcription is blank / errors in logs
Probably an invalid Groq key. Menu bar → Set Groq API Key... (macOS), or edit ~/.config/stt/settings.json directly.
Audio is too quiet / not picking up
Check default input device: macOS System Settings → Sound → Input; Windows Sound panel; Linux pavucontrol. sounddevice uses the system default.
- Audio streams to Groq only while you're recording. Nothing persists server-side (per Groq's privacy policy).
- Local MLX mode (macOS) keeps audio on-device. Zero network.
- Only your API key is stored on disk (~/.config/stt/settings.json).
stt/
├── macos/
│ ├── stt.py Main app (AppKit + MLX fallback)
│ ├── STT.app/ Double-clickable wrapper
│ ├── install.sh
│ └── requirements.txt
├── windows/
│ ├── stt_windows.py Main app (Tkinter + pynput)
│ ├── build.bat PyInstaller → standalone .exe
│ └── requirements.txt
└── linux/
├── stt_linux.py Main app (Tkinter + pynput)
├── install.sh apt / dnf / pacman auto-detect
├── stt.desktop Menu entry
└── requirements.txt
MIT — do whatever, attribution appreciated.
Contributions welcome. If you build a Windows/Linux binary, open a PR or attach it to the release.