Releases: BenPohlBasel/LocalTranscript
LocalTranscript 0.1.0
First public release of LocalTranscript — fully offline audio transcription with speaker diarization for macOS Apple Silicon.
Install
- Download LocalTranscript-0.1.0-arm64.dmg below
- Mount the image and drag LocalTranscript.app into /Applications
- On first launch: right-click → Open → Confirm (Gatekeeper prompts because the build is not Apple-notarized)
- The app asks once where transcripts should be stored. Default: ~/Documents/LocalTranscript. Each batch creates a yymmdd_hhmm/ subfolder containing the original audio plus VTT, TXT and CSV.
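The batch folder layout described above can be sketched as follows. This is a minimal illustration, not the app's actual code: the helper names `batch_folder` and `output_paths` are hypothetical, and naming each transcript after the audio file's stem is an assumption (the release notes only state that the subfolder holds the original audio plus VTT, TXT and CSV).

```python
from datetime import datetime
from pathlib import Path


def batch_folder(root: Path, started: datetime) -> Path:
    """Return the per-batch subfolder, named yymmdd_hhmm."""
    return root / started.strftime("%y%m%d_%H%M")


def output_paths(root: Path, started: datetime, audio_name: str) -> dict[str, Path]:
    """Map each expected artifact to a path inside the batch folder.

    Assumption: transcripts reuse the audio file's stem as their base name.
    """
    folder = batch_folder(root, started)
    stem = Path(audio_name).stem
    paths = {"audio": folder / audio_name}
    for ext in ("vtt", "txt", "csv"):
        paths[ext] = folder / f"{stem}.{ext}"
    return paths


if __name__ == "__main__":
    root = Path.home() / "Documents" / "LocalTranscript"  # default from the notes
    for kind, path in output_paths(root, datetime.now(), "interview.m4a").items():
        print(kind, path)
```

Because the folder name starts with the two-digit year, batches sort chronologically in Finder when sorted by name.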
What's bundled
- Transcription: whisper.cpp with Apple Metal GPU, OpenAI large-v3-turbo model
- Speaker diarization: silero-vad + SpeechBrain ECAPA-TDNN with cosine clustering; no HuggingFace token required
- Speaker rename: per-file modal with 10-second audio loop preview
- Term correction: spaCy de_core_news_sm extracts named entities and proper nouns (e.g., place names, people, project names), sorted by frequency and editable in a modal; all occurrences are replaced word-precisely in VTT/TXT/CSV
- Pipeline: per-segment cuts at speaker boundaries → each whisper call sees exactly one speaker, which makes the result word-accurate by construction (~99.8% frame-level agreement with the previous pyannote reference on a 42-min benchmark audio)
- Standalone CPython 3.13, all required dylibs, static ffmpeg
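To illustrate what "word-precise" replacement in the term-correction step could look like, here is a minimal sketch using only Python's standard `re` module. It is an assumption about the approach, not the app's actual implementation; the function name `replace_term` is hypothetical.

```python
import re


def replace_term(text: str, old: str, new: str) -> tuple[str, int]:
    """Replace whole-word occurrences of `old` with `new`.

    Lookarounds are used instead of \\b so that terms which begin or end
    with punctuation still match; \\w is Unicode-aware for str patterns,
    so umlauts count as word characters.
    Returns the new text and the number of replacements made.
    """
    pattern = re.compile(rf"(?<!\w){re.escape(old)}(?!\w)")
    # A function replacement avoids interpreting backslashes in `new`.
    return pattern.subn(lambda _match: new, text)


if __name__ == "__main__":
    line = "Bern und Berner Oberland, Bern."
    corrected, count = replace_term(line, "Bern", "Basel")
    print(corrected, count)
```

In this sketch "Berner" is left untouched because the match must not be followed by another word character. The same function could be applied to the VTT, TXT and CSV text in turn.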
Requirements
- macOS Apple Silicon (M1/M2/M3/M4) — arm64 only
- macOS 11 (Big Sur) or newer
- ~3 GB free disk
Bundle size: 1.8 GB compressed, 2.6 GB unpacked.
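The requirements above can be checked before installing with a short script. This is a rough sketch using only the Python standard library; the function names are hypothetical, and treating "~3 GB" as 3 × 10⁹ bytes is an assumption.

```python
import platform
import shutil
from pathlib import Path

MIN_MACOS = (11, 0)            # macOS 11 (Big Sur)
MIN_FREE_BYTES = 3 * 1000**3   # assumption: "~3 GB" read as 3 * 10^9 bytes


def meets_requirements(system: str, machine: str,
                       macos_version: str, free_bytes: int) -> list[str]:
    """Return a list of unmet requirements (empty if everything is fine)."""
    problems = []
    if system != "Darwin":
        problems.append("not macOS")
    if machine != "arm64":
        problems.append("not Apple Silicon (arm64)")
    parts = [int(p) for p in macos_version.split(".") if p.isdigit()]
    version = tuple((parts + [0, 0])[:2])  # pad so "11" compares as (11, 0)
    if version < MIN_MACOS:
        problems.append("macOS older than 11")
    if free_bytes < MIN_FREE_BYTES:
        problems.append("less than ~3 GB free disk")
    return problems


def check_this_machine() -> list[str]:
    """Run the check against the current machine."""
    return meets_requirements(
        platform.system(),
        platform.machine(),
        platform.mac_ver()[0],
        shutil.disk_usage(Path.home()).free,
    )


if __name__ == "__main__":
    print(check_this_machine() or "All requirements met")
```

Note that a Python interpreter running under Rosetta may report `x86_64` from `platform.machine()` even on Apple Silicon, so a failed architecture check is worth verifying by hand.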
License
Application code is MIT (see LICENSE). The bundled Python runtime, models and binaries each come with their own licenses; the complete list with upstream links is in THIRD-PARTY-LICENSES.md. Required attributions for Apache 2.0 and LGPL components are in NOTICE.