Releases: Gremble-io/Detto
Detto 2.0.4
Trimmed the bundled vocabulary list to improve refiner output. This is the downloadable build for the 2.0.x vocabulary update; supersedes 2.0.3, which shipped without an attached binary.
Detto 2.0.2 - Model download recovery
[2.0.2] — 2026-06-02
Fixed
- Model download recovery: if the initial Llama 3.2 3B download is interrupted (force-quit, network drop, sleep mid-download), the app now correctly detects the partial state on relaunch and resumes the download instead of failing silently. Previously, affected users had to manually clear ~/Documents/huggingface/ to recover. (#36)
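The detect-and-resume behaviour described above can be illustrated with a small sketch. It is written in Python for brevity and is not taken from the app; the names `DownloadPlan`, `plan_download`, `bytes_on_disk`, and `expected_size` are hypothetical, and the notes do not say how Detto actually detects a partial download. The sketch assumes the server honours HTTP `Range` requests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DownloadPlan:
    action: str                  # "skip", "resume", or "restart"
    range_header: Optional[str]  # value for an HTTP Range header when resuming


def plan_download(bytes_on_disk: int, expected_size: int) -> DownloadPlan:
    """Decide what to do with a model file found on relaunch.

    Hypothetical logic: compare the bytes already on disk with the size the
    server reports, then skip, resume, or start over.
    """
    if bytes_on_disk == expected_size:
        return DownloadPlan("skip", None)
    if 0 < bytes_on_disk < expected_size:
        # Request only the missing tail of the file.
        return DownloadPlan("resume", f"bytes={bytes_on_disk}-")
    # Nothing on disk, or more bytes than expected (likely corrupt): start fresh.
    return DownloadPlan("restart", None)
```

The point of the three-way split is that a partial file is treated as a recoverable state rather than an error, which is the difference between resuming and failing silently.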
Added
- Loading indicator in the main window during model download so users can see progress on first launch.
Detto 2.0.1 - First-launch crash fix
[2.0.1] — 2026-06-02
Fixed
- Crash on first launch during model initialization on clean installs. swift-transformers resource bundles (containing fallback tokenizer configs) were not being copied into the shipped app, causing Bundle.module to fail when loading a tokenizer for the first time. Thanks to @krazos for the detailed bug report and crash log (#35).
Detto 2.0.0
[2.0.0] — 2026-06-02
Initial public release of Detto, a complete rewrite of Tome. All processing on-device.
Added
- Dictation mode: hold-to-speak with configurable hotkeys (Control, Fn, Right Option, Right Command, Hyper Key, custom combos). Floating overlay with waveform visualization. Text injection at the cursor.
- Client briefing system: vault-aware context loading, attendee tracking, per-client transcript folders. Reads client metadata from your Obsidian vault.
- Vocabulary correction: 3-pass post-ASR correction (explicit mappings, case normalization, fuzzy matching). Bundled Canadian Politics pack (53 terms, 30 corrections); custom vocabulary file support via Settings.
- On-device transcript refinement: optional post-session cleanup via Llama 3.2 3B (MLX). Fixes proper nouns and grammar with five validation guards against hallucination.
- Channel-based speaker separation: dual-stream mic + system audio with per-stream VAD and ASR. Improved diarization on call recordings.
- Confidence scores: Parakeet ASR confidence captured per utterance and used to skip refinement on high-confidence segments.
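The three correction passes named in the vocabulary item above (explicit mappings, case normalization, fuzzy matching) could look roughly like the following. This is a minimal Python sketch, not Detto's implementation: the vocabulary entries, the `correct` function, the 0.8 similarity cutoff, and the six-character length guard are all assumptions made for illustration, and fuzzy matching here uses the standard library's `difflib` rather than whatever the app uses.

```python
import difflib
import re

# Hypothetical vocabulary; the bundled pack's real contents are not shown here.
EXPLICIT = {"new brunswik": "New Brunswick"}
CANONICAL = ["Saskatchewan", "Nunavut", "Iqaluit"]


def correct(text: str, cutoff: float = 0.8) -> str:
    # Pass 1: explicit mappings, matched case-insensitively on whole phrases.
    for wrong, right in EXPLICIT.items():
        text = re.sub(rf"\b{re.escape(wrong)}\b", right, text, flags=re.IGNORECASE)

    by_lower = {term.lower(): term for term in CANONICAL}
    out = []
    # Split into alternating word and non-word runs so spacing is preserved.
    for token in re.findall(r"\w+|\W+", text):
        if not token[0].isalnum():
            out.append(token)  # whitespace and punctuation pass through
            continue
        # Pass 2: case normalization for exact, case-insensitive matches.
        if token.lower() in by_lower:
            out.append(by_lower[token.lower()])
            continue
        # Pass 3: fuzzy matching against known terms. The length guard
        # (an assumption) keeps short common words from being rewritten.
        close = []
        if len(token) >= 6:
            close = difflib.get_close_matches(
                token.lower(), list(by_lower), n=1, cutoff=cutoff
            )
        out.append(by_lower[close[0]] if close else token)
    return "".join(out)
```

Running the passes in this order means that explicit corrections always win, exact terms are never sent through the fuzzy matcher, and only unrecognized words pay the cost of similarity scoring.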
Changed
- ASR engine: Parakeet-TDT v3 via GrembleVoice (25 European languages, multilingual auto-detection).
- License: MIT → Business Source License 1.1 (converts to MIT 2030-05-12). Existing Tome v1 forks retain their MIT copy.
- CI: signed and notarized DMG builds via GitHub Actions.
Removed
- Tome v1 codebase (complete rewrite).
V1.2.2
v1.2.1
- Fixed crash on every button tap (SIGSEGV in MainActor.assumeIsolated) caused by audio level polling flooding SwiftUI's AttributeGraph
- Fixed audio input breaking when switching mic devices between recording sessions (AVAudioEngine cached stale device state)
v1.2.0
- Fixed Swift 6 build failure on Xcode 26.4+ (upgraded FluidAudio to actor-based AsrManager)
- Build script now fails on missing code signing identity instead of shipping unsigned
- Added Gatekeeper troubleshooting to README
- Added SECURITY.md
v1.1.0 — Multilingual Transcription
Upgraded from Parakeet-TDT v2 (English-only) to v3 (25 European languages with auto-detection).
Another one-time ~600MB model download is required on first launch after updating.
Pinned FluidAudio to 0.12.1 for stability.
v1.0.1 — Spectrum Visualizer
Reactive Winamp-style spectrum analyzer replaces static waveform. Bars respond to real audio levels with peak hold and dynamic glow.
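For readers unfamiliar with peak hold, the idea is that each bar's peak marker stays at its highest recent level for a short time and then falls back. The Python sketch below shows one common way to do this. It is an illustration only: `update_peaks`, `hold_frames`, and `decay` are hypothetical names and values, and the release notes do not describe how the visualizer is actually implemented.

```python
def update_peaks(levels, peaks, hold_counters, hold_frames=20, decay=0.05):
    """Advance the peak markers by one animation frame.

    levels        -- current bar levels (0.0 to 1.0), one per frequency band
    peaks         -- peak marker positions, updated in place
    hold_counters -- frames left before each peak starts to fall, updated in place
    """
    for i, level in enumerate(levels):
        if level >= peaks[i]:
            # New high: move the marker up and restart the hold timer.
            peaks[i] = level
            hold_counters[i] = hold_frames
        elif hold_counters[i] > 0:
            # Still holding: keep the marker where it is.
            hold_counters[i] -= 1
        else:
            # Hold expired: let the marker fall, but never below the bar.
            peaks[i] = max(level, peaks[i] - decay)
    return peaks, hold_counters
```

Calling this once per rendered frame with the latest band levels gives the familiar effect of markers that linger above the bars before dropping.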