Add ExecuWhisper macOS app for low-latency on-device Parakeet dictation #232

Open
seyeong-han wants to merge 2 commits into meta-pytorch:main from seyeong-han:execuwhisper-macos-app

seyeong-han commented Apr 13, 2026

Summary

[Image: Screenshot 2026-04-13 at 1 01 08 PM]
  • Add ExecuWhisper, a native macOS app for on-device Parakeet transcription with manual recording and batch dictation flows.
  • Integrate the warm parakeet_helper path so repeated dictation requests do not relaunch the model, reload weights, or pay cold-start latency on every Ctrl+Space session.
  • Include overlay/system dictation, history, replacements, export, microphone selection, packaging, and regression coverage.

Why the helper matters

  • The app runs repeated record-then-transcribe requests, so using a one-shot runner for every recording would force a fresh process and model load each time.
  • Keeping a helper warm makes the stop-to-text path noticeably faster for dictation UX and enables explicit preload/unload behavior in the app.
  • Sending captured PCM directly to the helper avoids temporary WAV round-trips and fits the app's in-memory audio pipeline.
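The warm-helper idea above can be sketched in a few lines. This is a language-neutral illustration in Python, not the app's Swift code or the real parakeet_helper protocol: the wire format (a little-endian sample rate and sample count followed by float32 samples), the `WarmHelper` class, and the `load_model` callback are all hypothetical names introduced here. The point it shows is that the model loads once and later requests reuse it, and that PCM travels as an in-memory frame rather than a temporary WAV file.

```python
import struct
import sys
from array import array

# Hypothetical wire format (an assumption, not the real helper protocol):
# little-endian uint32 sample rate, uint32 sample count, then float32 samples.
HEADER = struct.Struct("<II")


def encode_request(samples, sample_rate=16000):
    """Pack in-memory PCM into one frame, with no temporary WAV file."""
    payload = array("f", samples)
    if sys.byteorder != "little":
        payload.byteswap()
    return HEADER.pack(sample_rate, len(payload)) + payload.tobytes()


def decode_request(frame):
    """Inverse of encode_request: return (sample_rate, samples)."""
    sample_rate, count = HEADER.unpack_from(frame)
    payload = array("f")
    start = HEADER.size
    payload.frombytes(frame[start:start + count * payload.itemsize])
    if sys.byteorder != "little":
        payload.byteswap()
    return sample_rate, payload.tolist()


class WarmHelper:
    """Stand-in for a long-lived helper: load once, serve many requests."""

    def __init__(self, load_model):
        self._load_model = load_model  # hypothetical expensive loader
        self._model = None
        self.load_count = 0

    def preload(self):
        # Pay the cold-start cost only if the model is not resident.
        if self._model is None:
            self._model = self._load_model()
            self.load_count += 1

    def unload(self):
        # Explicitly release the model, e.g. to free memory.
        self._model = None

    def transcribe(self, frame):
        self.preload()
        sample_rate, samples = decode_request(frame)
        return self._model(samples, sample_rate)
```

With a one-shot runner, every recording would go through the equivalent of `load_model`. Here, `load_count` stays at 1 across repeated `transcribe` calls until `unload` is called.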

Dependencies

  • pytorch/executorch#18861 — adds the parakeet_helper binary and shared ParakeetTranscriber that this app builds against

Test plan

  • xcodebuild -project "ExecuWhisper.xcodeproj" -scheme "ExecuWhisper" -derivedDataPath build -destination 'platform=macOS' -only-testing:ExecuWhisperTests test

Known issue

  • A fresh full-scheme test run currently fails in ImportedAudioDecoderTests.decodeAudioFileNormalizesWavToFloat32Mono16kPCM with the error "Could not decode audio frames from the file."

Package the warm-helper Parakeet Metal workflow into a native macOS app with dictation, history, snippets, export, and build scripts so it can be tested and shared outside the CLI.

Made-with: Cursor
meta-cla bot added the CLA Signed label Apr 13, 2026
seyeong-han changed the title from "Add ExecuWhisper macOS app for on-device Parakeet transcription" to "Add ExecuWhisper macOS app for low-latency on-device Parakeet dictation" Apr 13, 2026
Replace AVCaptureSession-based mic capture with AVAudioEngine, which
captures in the hardware's native format and normalizes offline via
ImportedAudioDecoder. This eliminates per-chunk AVAudioConverter churn
and format ambiguity that caused garbled transcriptions on short
utterances. Temp capture files are cleaned up after normalization.
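As a rough illustration of the offline normalization described above (capture in the native format first, then convert the whole recording once), the following Python sketch downmixes interleaved samples to mono and resamples to 16 kHz. It is not the ImportedAudioDecoder implementation: the function names are hypothetical, the float32 mono 16 kHz target is taken from the test name in the PR description, and the linear interpolation used here has no low-pass filter, so a production converter would resample more carefully.

```python
def to_mono(interleaved, channels):
    """Average each frame of interleaved samples down to one channel."""
    if channels < 1:
        raise ValueError("channels must be >= 1")
    frames = len(interleaved) // channels
    return [
        sum(interleaved[i * channels:(i + 1) * channels]) / channels
        for i in range(frames)
    ]


def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample by linear interpolation (no anti-aliasing filter)."""
    if not samples or src_rate == dst_rate:
        return list(samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate
    last = len(samples) - 1
    out = []
    for n in range(out_len):
        pos = n * step
        i = int(pos)
        frac = pos - i
        a = samples[min(i, last)]
        b = samples[min(i + 1, last)]
        out.append(a + (b - a) * frac)
    return out


def normalize_capture(interleaved, channels, src_rate):
    """Convert a whole native-format capture to mono 16 kHz in one pass."""
    return resample_linear(to_mono(interleaved, channels), src_rate)
```

Converting the complete recording in one pass, rather than chunk by chunk during capture, means the source format is known once and no converter state has to be carried across chunk boundaries.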

Remove the snippets feature (trigger-phrase text expansion) since the
Parakeet model's multilingual decoder produces unreliable results for
short dictated phrases. Remove Snippet model, SnippetStore, sidebar
page, management view, and all snippet wiring from TextPipeline and
TranscriptStore.

Add macOS app icon assets from Voxtral Realtime. Remove debug WAV dump
code that was used during the recording quality investigation.

Co-authored-by: Claude <noreply@anthropic.com>
Made-with: Cursor