v0.5.0

vitormf released this 01 Jun 21:14

32cf315

What's new

New features

Persistent transcription cache — Whisper transcriptions are now saved to ~/.cache/submatch/ keyed by video path, modification time, model, and segment count. Repeated runs on the same video skip audio extraction and Whisper entirely, making it fast to test multiple subtitles against the same video.
Audio-driven segment selection — Segments are now chosen using ffmpeg silencedetect to locate speech-rich regions, independent of any subtitle file. This lets the cache work across all subtitle files tested against the same video.
Transcription quality gate — After each Whisper call, segments are validated (no_speech_prob < 0.6, word count ≥ 3). If a candidate fails (silence, music, noise), the next candidate in the zone is tried automatically. The best available candidate is used as a fallback if all fail.
--no-cache — Bypass the cache entirely and use the original subtitle-driven segment selection for a single run.
--clear-cache — Delete all cached transcriptions and exit.
Cache configuration — Three new config keys: cache_ttl_days (default: 30), cache_max_mb (default: 200), cache_dir (default: ~/.cache/submatch). Cache is automatically evicted by TTL then LRU when limits are exceeded.
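The "TTL then LRU" eviction order can be sketched roughly as follows. This is a minimal illustration, not submatch's actual code: the `CacheEntry` type and `select_evictions` function are hypothetical names, and the sketch assumes that the TTL is measured from an entry's last use and that 1 MB means 1024 × 1024 bytes. Only the defaults (30 days, 200 MB) come from the notes above.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    # Hypothetical record for one cached transcription.
    key: str
    size_bytes: int
    last_used: float  # Unix timestamp of the most recent cache hit


def select_evictions(entries, now, ttl_days=30, max_mb=200):
    """Return the keys to delete: expired entries first, then least recently used."""
    ttl_seconds = ttl_days * 86400
    max_bytes = max_mb * 1024 * 1024  # assumption: binary megabytes

    # Pass 1 (TTL): drop every entry not used within the TTL window.
    expired = [e for e in entries if now - e.last_used > ttl_seconds]
    survivors = [e for e in entries if now - e.last_used <= ttl_seconds]
    evicted = [e.key for e in expired]

    # Pass 2 (LRU): while still over the size limit, drop the oldest-used entry.
    total = sum(e.size_bytes for e in survivors)
    for entry in sorted(survivors, key=lambda e: e.last_used):
        if total <= max_bytes:
            break
        evicted.append(entry.key)
        total -= entry.size_bytes

    return evicted
```

Running the TTL pass first means stale entries never count against the size budget, so a recently used transcription is only removed when the cache is genuinely over cache_max_mb.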

Bug fixes

Language detection across zones now requires a strict majority (>50% of zones) before setting audio_lang, preventing a false cross_language flag when some zones hit music or noise.
Cache hits are now returned correctly even when the last_used write-back fails (e.g. read-only filesystem).
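The strict-majority rule from the first fix can be illustrated with a short sketch. The function name `pick_audio_lang` and the convention that a zone with no usable detection is represented by `None` are assumptions made for this example. Only the ">50% of zones" threshold comes from the notes above.

```python
from collections import Counter


def pick_audio_lang(zone_langs):
    """Return the language detected in more than half of all zones, else None."""
    # Ignore zones where detection produced nothing (assumed to be None here).
    votes = Counter(lang for lang in zone_langs if lang is not None)
    if not votes:
        return None

    lang, count = votes.most_common(1)[0]
    # Strict majority: the denominator is every zone, including undetected ones,
    # so a few zones that hit music or noise cannot decide the result alone.
    return lang if count * 2 > len(zone_langs) else None
```

With this rule, `["en", "en", "pt"]` yields `"en"`, while an even split such as `["en", "pt"]` yields `None`, so no audio language is set and no cross-language comparison is triggered by a tie.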
