Audio Input

Press a global hotkey, speak, and your words are transcribed into whatever is focused — any app, any input.

Open source (MIT). No account, no telemetry, no server of ours. Your Groq API key goes directly to Groq.

Free alternative to SuperWhisper.

Install

macOS

brew install --cask tonyyun/tap/audio-input

Or grab the .dmg from Releases. First launch: right-click → Open to bypass Gatekeeper.

Windows

Download Audio.Input_x.x.x_x64-setup.exe from Releases and run it.

First launch: Windows SmartScreen may say "Windows protected your PC". Click More info → Run anyway.

Setup

Get a free API key at console.groq.com (no credit card required)
Right-click the system tray mic icon → Configure API Key
Press Ctrl+Shift+Space (Windows) or ⌘⇧Space (macOS) anywhere and start talking

Features

Global hotkey — default ⌘⇧Space, fully customizable
Works everywhere — injects text into any focused input via Accessibility API
50+ languages — Whisper large-v3-turbo auto-detects your language
AI polish — optional LLM pass to clean up filler words and punctuation (toggle from menu bar). At recording start, a screenshot is taken and sent as context to a vision LLM (llama-4-scout on Groq) to improve accuracy of technical and domain-specific terms.
Tiny footprint — ~20 MB RAM, built with Rust + Tauri

Cost

Powered by Groq's Whisper large-v3-turbo — the fastest Whisper inference available.

$0.04 per hour of audio (~$0.00067/minute).

For typical use — a few minutes of voice input per day — that's well under $0.10/month. The Groq free tier alone covers most personal use.

How It Works

Press the global hotkey — a screenshot of the active screen is captured immediately
Speak; audio is recorded locally while you hold (or toggle) the hotkey
Audio is sent to Groq's Whisper large-v3-turbo for transcription
If AI polish is enabled, the transcript + screenshot are sent to a vision LLM (llama-4-scout) to fix technical terms, proper nouns, and punctuation
The final text is injected into whatever input is focused via the Accessibility API

Privacy

Audio is sent to Groq for transcription — Groq's data retention policy applies. Screenshots are taken locally and sent to Groq's vision API only when AI polish is enabled; neither audio nor screenshots are stored by this app. No analytics, no telemetry, no account required. See PRIVACY.md for full details.

Menu bar states

Icon	State
Black mic	Idle
Red mic	Recording
Blue mic	Transcribing
Orange mic	Error

Build from source

macOS

Prerequisites: Node 20+, Rust stable

# Install Rust if needed
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

git clone https://github.com/tyun08/audio-input
cd audio-input
npm install
npm run tauri dev    # dev mode
npm run tauri build  # release build → produces .dmg + .app in src-tauri/target/release/bundle/

Windows

Prerequisites:

Node.js 20+ — https://nodejs.org (LTS)
Rust — https://rustup.rs
Microsoft C++ Build Tools — https://visualstudio.microsoft.com/visual-cpp-build-tools/
- In the installer select "Desktop development with C++"
WebView2 Runtime — pre-installed on Windows 11; on Windows 10 get it from https://developer.microsoft.com/microsoft-edge/webview2/

git clone https://github.com/tyun08/audio-input
cd audio-input
npm install
npm run tauri dev

Stack

Tauri 2 · Rust (cpal, reqwest) · Svelte · Groq API (Whisper large-v3-turbo + LLM polish)

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 143 Commits
.github		.github
Casks		Casks
docs		docs
src-tauri		src-tauri
src		src
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.prettierrc		.prettierrc
PLAN.md		PLAN.md
PRIVACY.md		PRIVACY.md
README.md		README.md
ROADMAP.md		ROADMAP.md
SPEC.md		SPEC.md
TASKS.md		TASKS.md
eslint.config.js		eslint.config.js
fix-accessibility-permissions.sh		fix-accessibility-permissions.sh
fix-permissions.sh		fix-permissions.sh
index.html		index.html
install.sh		install.sh
package-lock.json		package-lock.json
package.json		package.json
playwright.config.ts		playwright.config.ts
tsconfig.json		tsconfig.json
vite.config.ts		vite.config.ts
whisper-example.py		whisper-example.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Audio Input

Install

Setup

Features

Cost

How It Works

Privacy

Menu bar states

Build from source

macOS

Windows

Stack

License

About

Uh oh!

Releases 9

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Audio Input

Install

Setup

Features

Cost

How It Works

Privacy

Menu bar states

Build from source

macOS

Windows

Stack

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 9

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages