Skip to content

suvamsh/screamer

Repository files navigation

Screamer

Screamer

The fastest free speech to text AI in the world.

www.screamer.app

Push-to-talk transcription. Hold a key, speak, release, and your text is pasted instantly.

Built with Rust Metal GPU License: MIT 100% Offline Download for Mac

What is it?

Screamer is a free, open source, offline push-to-talk speech-to-text app for macOS with about ~55ms average release-to-paste-dispatch latency on Apple M2 Max.

  • Hold a key, speak, release, and your text is pasted into the app you are using
  • Runs locally with Whisper, so there is no cloud round-trip
  • Shows a live overlay with waveform and rolling preview while you talk
  • Built for low-latency dictation instead of full meeting transcription

Project docs:

How it works

Hold Left Control -> Speak -> See waveform + live text -> Release -> Text pastes instantly
  • Local Whisper transcription with no cloud round-trip
  • Live overlay with waveform and rolling preview while the hotkey is held
  • Final transcription and paste on release, so unstable partials are never typed into the target app
  • Free, offline, and open source

Performance

Screamer latency

Metric Result
Average release-to-paste-dispatch latency ~55ms
Benchmark path Stop, resample, transcription, clipboard write, and Cmd+V dispatch
Verification harness ./verify_latency.sh running app_path_latency --dispatch-paste on the current synthetic phrase set
Test setup Apple M2 Max with base.en

Under the hood

  • Two pre-warmed Whisper states stay ready in memory: one for final transcription and one for live preview, so release-time transcription does not have to build a fresh state.
  • Adaptive audio context keeps short dictation fast by shrinking Whisper's work to match the utterance instead of always using the model's full default window.
  • The decode path is tuned for push-to-talk, not long recordings: no timestamps, single-segment output, no rolling context, and a fast greedy decode.
  • Live preview is best-effort and never types into your app. Only the final transcript is pasted, which lets Screamer drop stale preview work instead of slowing down the critical path.
  • Silence is trimmed before inference, and the recorder keeps latency low with preallocated buffers, small input buffers, and a lightweight paste dispatch.

Model guidance

Screamer does not yet ship a Screamer-specific WER harness in this repo. Accuracy mainly follows the underlying Whisper model, so treat these as relative model tradeoffs rather than audited Screamer eval numbers.

Model Tradeoff Best for
tiny.en Fastest, lowest accuracy Maximum speed
base.en Balanced default Best default for most people
small.en Slower, more accurate Better accuracy for harder vocabulary
medium.en High accuracy, higher latency Higher accuracy
large-v3 Highest accuracy, highest latency Highest accuracy

All models are free to download with ./download_model.sh.

Speed vs. the competition

App Latency Source
Screamer ~55ms Local average from ./verify_latency.sh (app_path_latency --dispatch-paste) on Apple M2 Max with base.en
Dictato 80ms Dictato
Handy ~350ms estimated Handy README, Handy model config, Handy settings strings
SuperWhisper ~700ms estimated Superwhisper, App Store, MacSources review, Declom review
Wispr Flow ~600ms estimated Wispr Flow, App Store, Microsoft Store, AI Productivity Coach review, Letterly review
Otter.ai ~1500ms estimated Otter, App Store

Screamer's number is the approximate average from the current synthetic phrase set on Apple M2 Max. Competitor numbers are public claims or rough public estimates as of March 29, 2026.

Install

Requirements:

  • macOS 13+ on Apple Silicon and Intel Macs
  • Rust toolchain 1.94+
  • cmake via brew install cmake

Build from source:

git clone https://github.com/suvamsh/screamer.git
cd screamer
./download_model.sh bundled
GGML_NATIVE=OFF cargo build --release
./bundle.sh
open Screamer.app

After first launch, grant Accessibility permission in:

System Settings -> Privacy & Security -> Accessibility -> Screamer

This is required for the global hotkey and paste simulation. If it isn't enabled yet, Screamer will keep an in-app helper window visible and can open the exact Accessibility pane for you.

macOS will also prompt for Microphone permission the first time you record.

bundle.sh will automatically try the first installed Developer ID Application certificate if one is available, which helps macOS keep Accessibility approval across rebuilds. If no usable certificate is installed, it falls back to ad-hoc signing and macOS may ask you to re-enable Accessibility after rebuilds.

Configuration

Config lives at ~/Library/Application Support/Screamer/config.json:

{
  "model": "base",
  "hotkey": "left_control",
  "overlay_position": "center",
  "appearance": "dark",
  "live_transcription": true,
  "sound_effects": true,
  "show_accessibility_helper_on_launch": true,
  "accessibility_helper_dismissed": false
}

Key settings:

  • model: Whisper model to use
  • hotkey: push-to-talk key
  • overlay_position: overlay placement
  • appearance: app theme
  • live_transcription: live preview in the overlay
  • sound_effects: start and finish cue sounds
  • show_accessibility_helper_on_launch: whether the helper window should appear on first launch
  • accessibility_helper_dismissed: remembers whether the helper window was dismissed

Privacy and logging

  • Transcription runs locally. Screamer does not send audio or text to a cloud service.
  • Runtime logs are written to ~/Library/Logs/Screamer/screamer.log by default.
  • Transcript contents are not logged by default.
  • Set SCREAMER_LOG_TRANSCRIPTS=1 only when you explicitly want transcript text in logs for debugging.
  • Set SCREAMER_LOG_FILE=/custom/path.log to override the log file location.

Stack

  • Rust app with a single native binary
  • whisper.cpp via whisper-rs
  • CoreAudio capture
  • Metal GPU acceleration on Apple Silicon

License

MIT