Native macOS app for fully offline audio transcription with speaker diarization. Drop in audio files, get back VTT subtitles, plain text and CSV — every byte stays on your Mac.
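To make the output formats concrete, here is a minimal sketch of how transcript segments could be rendered as WebVTT. The segment tuple shape and function names are illustrative, not the app's actual writer:

```python
def to_vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, e.g. 2.5 -> '00:00:02.500'."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

def write_vtt(segments) -> str:
    """Render (start, end, speaker, text) tuples as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, end, speaker, text in segments:
        lines.append(f"{to_vtt_timestamp(start)} --> {to_vtt_timestamp(end)}")
        lines.append(f"<v {speaker}>{text}")
        lines.append("")
    return "\n".join(lines)

print(write_vtt([(0.0, 2.5, "Speaker 1", "Hello there.")]))
```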
- Transcription: whisper.cpp with Apple Metal GPU acceleration; OpenAI's `large-v3-turbo` model is bundled.
- Speaker diarization: silero-vad + SpeechBrain ECAPA-TDNN with cosine-distance clustering. No Hugging Face token required.
- Shell: Electron with a small Python (FastAPI + uvicorn) backend spawned on a free local port.
- Privacy: loopback only (`127.0.0.1`), no network calls during transcription; all data stays on disk under your chosen folder.
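The clustering step can be sketched as follows: embeddings that are close in cosine distance get the same speaker label. This is a simplified greedy illustration with a made-up threshold, not the app's actual algorithm or parameters:

```python
import numpy as np

def cluster_speakers(embeddings: np.ndarray, threshold: float = 0.45) -> list[int]:
    """Greedy cosine-distance clustering sketch: assign each embedding to the
    nearest existing centroid if the cosine distance is below `threshold`,
    otherwise open a new speaker. Illustrative only."""
    centroids: list[np.ndarray] = []
    labels: list[int] = []
    for e in embeddings:
        e = e / np.linalg.norm(e)
        if centroids:
            dists = [1.0 - float(np.dot(e, c / np.linalg.norm(c))) for c in centroids]
            best = int(np.argmin(dists))
            if dists[best] < threshold:
                labels.append(best)
                centroids[best] = centroids[best] + e  # running sum acts as centroid
                continue
        centroids.append(e.copy())
        labels.append(len(centroids) - 1)
    return labels
```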
This is a working prototype, rebuilt from an earlier py2app/PyWebView app. Apple Silicon only; tested on macOS 14+.
- Download the latest `LocalTranscript-x.y.z-arm64.dmg` from the Releases page.
- Mount it and drag `LocalTranscript.app` into `/Applications`.
- First launch: right-click → Open → confirm (Gatekeeper; the build is not Apple-notarized).
- The app asks once where transcripts should be stored. The default is `~/Documents/LocalTranscript`. Each batch creates a `yymmdd_hhmm/` subfolder with the original audio plus VTT, TXT and CSV.
The .dmg ships everything: standalone CPython 3.13, a venv with PyTorch, SpeechBrain and silero-vad, the Whisper model (≈1.5 GB), whisper-cli with all dylibs, and a static ffmpeg. Total ≈1.8 GB.
- macOS on Apple Silicon
- Homebrew with `whisper-cpp` installed (`brew install whisper-cpp`) — supplies `whisper-cli` and its dylibs
- Node.js 18+ and npm
- Python 3 on `PATH` (only used as the bootstrap to download the standalone CPython for the bundle)
- A copy of the Whisper `large-v3-turbo` model at `models/ggml-large-v3-turbo.bin` (download from Hugging Face)
```bash
# one-time: create a project venv with the backend deps
python3 -m venv venv
./venv/bin/pip install -r backend/requirements.txt

# install electron deps
cd electron
npm install

# run
npm start
```
The Electron main process spawns `python -m uvicorn backend.main:app` against your local venv on a free port, then loads the UI.
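Picking a free loopback port before spawning the backend can be sketched in Python (the real logic lives in `electron/main.js`; the helper name here is illustrative):

```python
import socket

def free_loopback_port() -> int:
    """Ask the OS for an unused TCP port on 127.0.0.1 by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = free_loopback_port()
# The Electron side then runs the equivalent of:
#   python -m uvicorn backend.main:app --host 127.0.0.1 --port <port>
# Binding to 127.0.0.1 keeps the backend unreachable from the network.
```

A small race remains (the port could be taken between the check and the spawn), which is acceptable for a local desktop app.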
```bash
cd electron
npm install
npm run build:bundle   # downloads python-build-standalone, creates a venv from backend/requirements.txt,
                       # copies whisper-cli + dylibs from the Homebrew Cellar, copies ffmpeg-static,
                       # syncs backend/, frontend/, models/ into resources/
npm run dist           # invokes electron-builder, produces dist/LocalTranscript-x.y.z-arm64.dmg
```
To launch the packaged layout from a checkout (without re-running the bundle every time):
```bash
WHISPER_USE_BUNDLE=1 npm start
```
```
backend/     FastAPI + uvicorn server, whisper.cpp wrapper, SpeechBrain diarization
frontend/    vanilla HTML/CSS/JS UI
electron/
├── main.js                        spawns the backend, owns the window, native menu, IPC handlers
├── preload.js                     contextBridge with first-run config + autosave + shell helpers
├── package.json                   electron-builder config (dmg, extraResources)
├── build/icon.icns                app icon
└── scripts/
    ├── build-python-runtime.mjs   downloads CPython 3.13 standalone and creates the venv
    ├── build-binaries.mjs         copies whisper-cli + dylibs + ffmpeg-static into resources/
    └── sync-resources.mjs         mirrors backend/, frontend/, models/, .env into resources/
```
Application code is MIT (see LICENSE). The bundled Python runtime, models and binaries each come with their own licenses; the complete list with upstream links is in THIRD-PARTY-LICENSES.md. Required attributions for Apache 2.0 and LGPL components are in NOTICE.
You may redistribute the .dmg for free or for donations, as long as you ship it together with LICENSE, NOTICE and THIRD-PARTY-LICENSES.md (which the bundle does automatically).
Built on top of:
- whisper.cpp by Georgi Gerganov / ggml.ai
- OpenAI Whisper model weights
- SpeechBrain
- silero-vad
- PyTorch, scikit-learn, FastAPI, Electron
- astral-sh/python-build-standalone for the relocatable CPython
- ffmpeg-static for the LGPL ffmpeg build