Releases · vitormf/submatch

04 Jun 16:22

vitormf

v0.7.0

aff1670

v0.7.0 Latest


What's new

New features

Per-segment cross-language scoring: each segment now detects its own audio language via Whisper, and cross-language scoring activates per segment rather than all-or-nothing. Dubbed or mixed-language files are handled correctly even when only some segments are cross-language.
Per-segment audio language in output: --verbose now shows asr[lang] for each segment, and segment audio languages are also included in the CSV and HTML reports.
Language confidence gate: segments where Whisper reports low language confidence are excluded from audio language voting. For unsupported languages (Basque, Filipino), submatch bails out early rather than spending time on unreliable transcription.
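
The per-segment behaviour described above can be illustrated with a minimal sketch. It is not submatch's actual code: the function names, the `(language, confidence)` tuple shape, and the 0.5 confidence cutoff are all assumptions made for illustration, since the release notes do not state the real threshold.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Hypothetical cutoff; the release notes do not state the real value.
MIN_LANG_CONFIDENCE = 0.5


def vote_audio_language(
    segments: Iterable[Tuple[str, float]],
    min_confidence: float = MIN_LANG_CONFIDENCE,
) -> Optional[str]:
    """Return the most common language among confident segments, or None."""
    # Low-confidence segments are dropped before any votes are counted.
    votes = Counter(lang for lang, conf in segments if conf >= min_confidence)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def scoring_mode(segment_lang: str, subtitle_lang: str) -> str:
    """Choose the scoring mode for one segment, independently of the others."""
    return "semantic" if segment_lang != subtitle_lang else "token_f1"


segments = [("en", 0.93), ("pt", 0.31), ("en", 0.88), ("pt", 0.95)]
print(vote_audio_language(segments))
print([scoring_mode(lang, "pt") for lang, _ in segments])
```

Deciding the mode per segment means a dubbed file with a few original-language segments still gets each segment scored with the appropriate method.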

Bug fixes

Fixed crash on video containers that omit duration at the format level (e.g., raw MPEG-TS recordings)
Fixed audio candidate positions exceeding the audio track duration, which caused ffmpeg errors on recordings padded with video after audio ends
Fixed audio language voting to only count segments that pass the quality gate
Fixed segment_langs padding in cache store when fewer segments were transcribed than expected
Fixed quality gate not applying to the --no-cache transcription voting path


03 Jun 12:58

vitormf

v0.6.1

b32d6d1

v0.6.1

Bug fixes

Fix crash on embedded image track extraction — when ffmpeg failed to extract a VOBSUB or PGS subtitle track (e.g. corrupted stream, unsupported mux format), the CalledProcessError propagated and aborted extraction for all remaining tracks on that video. The fix catches the error per-track and skips the failing one, so other tracks continue normally. Text tracks (SRT, ASS) are unaffected.
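
The per-track error handling described above might look roughly like the following sketch. `extract_one` is a hypothetical callable standing in for whatever invokes ffmpeg for a single track; only `subprocess.CalledProcessError` is taken from the release note.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def extract_tracks(
    track_ids: Iterable[int],
    extract_one: Callable[[int], str],
) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Extract each track, skipping the ones whose ffmpeg call fails.

    extract_one is a hypothetical helper that returns an output path and
    raises CalledProcessError when ffmpeg exits with a non-zero status.
    """
    extracted: List[str] = []
    skipped: List[Tuple[int, int]] = []
    for track_id in track_ids:
        try:
            extracted.append(extract_one(track_id))
        except subprocess.CalledProcessError as exc:
            # Record the failure and continue with the remaining tracks.
            skipped.append((track_id, exc.returncode))
    return extracted, skipped
```

Catching the error inside the loop, rather than around it, is what keeps one corrupted stream from aborting the other tracks.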


03 Jun 09:00

vitormf

v0.6.0

fbfa6ef

v0.6.0

What's new

New features

Image-based subtitle support (VOBSUB/PGS): submatch can now score bitmap subtitle formats. pytesseract is bundled with pip install submatch; only the Tesseract engine binary needs to be installed separately. If Tesseract is missing when an image subtitle is processed, submatch exits with code 2 and prints installation instructions.
Cross-language threshold now defaults to 0.20: The --cross-threshold default has been recalibrated from 0.35 to 0.20, based on empirical data showing true positive cross-language pairs typically score 0.24–0.49 while false positives peak at 0.18. Use --cross-threshold to override.
Lazy sync: ffsubsync now runs only when the initial score is FAIL, cutting runtime for passing pairs.
GPU mismatch detection: warns when CPU-only PyTorch is installed on a machine with an NVIDIA GPU, with instructions for installing the CUDA-enabled build.
Crash telemetry: pipeline errors are reported to Sentry to help improve reliability. No file paths or personal data are transmitted. Opt out with SUBMATCH_NO_TELEMETRY=1 or telemetry = false in config.

Bug fixes

Audio language detection: the plurality rule now accepts a language that wins ≥50% of the votes (was >50%), fixing edge cases where the correct audio language was rejected in mixed-language content (e.g. when Whisper tags some segments as a different language).
Temp file cleanup: resync temp files are cleaned up on copy failure.
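
To make the change to the plurality rule concrete, here is a small sketch comparing the old and new comparisons. The function name and the `inclusive` flag are illustrative assumptions, and tie-breaking between equally common languages is not specified in the release notes (this sketch simply takes whichever `Counter` returns first).

```python
from collections import Counter
from typing import Optional, Sequence


def pick_audio_language(
    segment_langs: Sequence[str], inclusive: bool = True
) -> Optional[str]:
    """Accept the most common language if its share clears the 50% bar."""
    if not segment_langs:
        return None
    lang, count = Counter(segment_langs).most_common(1)[0]
    share = count / len(segment_langs)
    # inclusive=True models the new rule (>= 50%); False models the old one (> 50%).
    accepted = share >= 0.5 if inclusive else share > 0.5
    return lang if accepted else None


langs = ["en", "en", "pt", "es"]  # "en" holds exactly half of the votes
print(pick_audio_language(langs, inclusive=True))
print(pick_audio_language(langs, inclusive=False))
```

With exactly half of the segments tagged "en", the old strict comparison rejects the language while the new one accepts it.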

Improvements

Embedded subtitle tracks are extracted in a single ffmpeg pass (faster batch processing).
Telemetry is automatically disabled on editable installs to avoid sending development errors to production.


01 Jun 21:14

vitormf

v0.5.0

32cf315

v0.5.0

What's new

New features

Persistent transcription cache — Whisper transcriptions are now saved to ~/.cache/submatch/ keyed by video path, modification time, model, and segment count. Repeated runs on the same video skip audio extraction and Whisper entirely, making it fast to test multiple subtitles against the same video.
Audio-driven segment selection — Segments are now chosen using ffmpeg silencedetect to locate speech-rich regions, independent of any subtitle file. This lets the cache work across all subtitle files tested against the same video.
Transcription quality gate — After each Whisper call, segments are validated (no_speech_prob < 0.6, word count ≥ 3). If a candidate fails (silence, music, noise), the next candidate in the zone is tried automatically. The best available candidate is used as a fallback if all fail.
--no-cache — Bypass the cache entirely and use the original subtitle-driven segment selection for a single run.
--clear-cache — Delete all cached transcriptions and exit.
Cache configuration — Three new config keys: cache_ttl_days (default: 30), cache_max_mb (default: 200), cache_dir (default: ~/.cache/submatch). Cache is automatically evicted by TTL then LRU when limits are exceeded.
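
The transcription quality gate listed above can be sketched as follows. The two thresholds come from the release note; the `Transcript` type, the function names, and the choice of "lowest no_speech_prob" as the fallback are assumptions, since the notes do not define what "best available" means. The real tool presumably transcribes candidates one at a time rather than receiving them all up front.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Transcript:
    text: str
    no_speech_prob: float


def passes_gate(t: Transcript) -> bool:
    """Thresholds as stated in the release note."""
    return t.no_speech_prob < 0.6 and len(t.text.split()) >= 3


def pick_candidate(candidates: Sequence[Transcript]) -> Optional[Transcript]:
    """Return the first candidate that passes, else a fallback."""
    for t in candidates:
        if passes_gate(t):
            return t
    # Assumed fallback: the candidate least likely to be non-speech.
    return min(candidates, key=lambda t: t.no_speech_prob, default=None)


zone = [
    Transcript("", 0.95),                       # silence
    Transcript("la la", 0.40),                  # too few words
    Transcript("where did you put it", 0.05),   # passes
]
print(pick_candidate(zone).text)
```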

Bug fixes

Language detection across zones now requires a strict majority (>50% of zones) before setting audio_lang, preventing a false cross_language flag when some zones hit music or noise.
Cache hits are now returned correctly even when the last_used write-back fails (e.g. read-only filesystem).
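
For the persistent transcription cache introduced in this release, a key built from the four stated components (video path, modification time, model, segment count) could be derived along these lines. The hashing scheme, separator, and function name are assumptions; the release notes only list what the key depends on.

```python
import hashlib


def cache_key(video_path: str, mtime_ns: int, model: str, segment_count: int) -> str:
    """Combine the four components into a stable hex digest.

    mtime_ns would typically come from os.stat(video_path).st_mtime_ns.
    """
    raw = "|".join([video_path, str(mtime_ns), model, str(segment_count)])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


key_a = cache_key("/media/show/e01.mkv", 1717430400_000000000, "small", 6)
key_b = cache_key("/media/show/e01.mkv", 1717430400_000000000, "medium", 6)
print(key_a != key_b)  # a different model yields a different cache entry
```

Because no subtitle file is part of the key, every subtitle tested against the same video can reuse the same entry.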


31 May 15:23

vitormf

v0.4.0

e048e0f

v0.4.0

What's new

New features

--json FILE, --csv FILE, --html FILE: write results to JSON, CSV, or self-contained HTML report files. Breaking change: --json previously printed JSON to stdout; it now requires a file path. Update scripts from --json to --json output.json.
--embedded: score subtitle tracks embedded in the video container (MKV, MP4, etc.) without needing external SRT files
--watch: monitor a directory for new video/subtitle pairs and score them as they appear; use --poll and --interval on network mounts (NFS, SMB)
Config file support: set persistent defaults in ~/.config/submatch/config.toml or ./submatch.toml

Bug fixes

Terminate child process groups (ffmpeg, ffs) on Ctrl+C to prevent orphan processes
Fix config file validation for --model / --device choices and sub_lang string values


29 May 18:48

vitormf

v0.3.0

6c552a4

v0.3.0

What's new

New features

Fractional progress bar updates per segment and dynamic terminal resize support
Transcription caching to skip re-transcribing already-processed segments
ISO 639-2 language code normalisation
Batch report headers showing source directory and pair count

Improvements

Cross-language subtitle matching using multilingual sentence embeddings (paraphrase-multilingual-MiniLM-L12-v2)


29 May 00:19

vitormf

v0.2.0

5fca827

v0.2.0

What's new

New features

Flexible input: pass any mix of video files, subtitle files, and directories — submatch auto-pairs them
--no-recursive flag to disable recursive directory scanning (directories are scanned recursively by default)
ffmpeg is now bundled via static-ffmpeg — no system ffmpeg install required
--drift-threshold flag to control how many seconds of offset trigger a drift warning (default: 2.0)

Bug fixes

Chinese, Japanese, and Korean subtitles now score correctly (character-level tokenization)
Unknown file types (.DS_Store, .nfo, images) are no longer misclassified as video when scanning directories
Spurious "no subtitles found" warnings are suppressed when inputs come from directory scans
UTF-8 output on Windows no longer crashes with UnicodeEncodeError when piped
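
As an illustration of the character-level tokenization mentioned in the first fix above, the sketch below splits CJK text into single-character tokens while keeping other scripts as whole words. The Unicode ranges and the function name are assumptions chosen for the example, not necessarily the ranges submatch uses.

```python
import re
from typing import List

# Assumed ranges: Hiragana/Katakana, CJK Unified Ideographs, Hangul syllables.
_CJK = r"\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af"

# Either one CJK character, or a run of word characters that are not CJK.
_TOKEN_RE = re.compile(rf"[{_CJK}]|[^\W{_CJK}]+")


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word or single-character tokens."""
    return _TOKEN_RE.findall(text.lower())


print(tokenize("Hello 世界"))
print(tokenize("안녕 world"))
```

Whitespace splitting alone would turn a whole Chinese or Japanese sentence into one token, so token overlap with the transcription would almost always be zero.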

Improvements

The default number of parallel batch workers is now capped at 4, regardless of device

Install / upgrade

pip install --upgrade submatch


28 May 13:46

vitormf

v0.1.0

1164255

v0.1.0 — Initial release

`submatch` verifies that a subtitle file actually matches the audio content of a video — catching the case where subtitle tools like subliminal or Bazarr return correctly-timed but wrong-content subtitles.

Install

```bash
pip install submatch
```

System dependencies: `ffmpeg` (`brew install ffmpeg`) and `ffsubsync` (`pip install ffsubsync`).

What's in this release

Core

Transcribes short audio segments with Whisper and scores against subtitle text using token F1
Dialogue-density segment sampling — picks the 30s windows with the most subtitle words per zone, skipping intros/credits
Timing drift detection via ffsubsync, flagging offsets > 2s
Three language signals: Whisper audio language, langdetect on subtitle text, filename convention + ffprobe metadata
4-state result system: PASS, DRIFT (content matches but timing drift detected), FAIL (wrong content), UNSURE (insufficient transcription data)
--resync: auto-correct drift in place on DRIFT; --pass-unsure: exit 0 for UNSURE results
--keep-synced: save the timing-corrected subtitle to disk; --delete-failures: remove subtitle files that fail the match check
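
Token F1, the scoring metric named in the first item above, is the harmonic mean of token-level precision and recall. The sketch below shows the standard bag-of-tokens formulation; the function name and the choice of which side counts as the reference are illustrative, and submatch's own normalisation steps are not shown.

```python
from collections import Counter
from typing import Sequence


def token_f1(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Harmonic mean of precision and recall over overlapping tokens."""
    if not reference or not hypothesis:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(reference) & Counter(hypothesis)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hypothesis)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


subtitle = "where did you put the keys".split()
transcript = "where did you leave the keys".split()
print(round(token_f1(subtitle, transcript), 3))
```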

Cross-language matching

When subtitle and audio languages differ (e.g. English audio + Portuguese subtitles), scoring automatically switches from token F1 to multilingual semantic similarity via paraphrase-multilingual-MiniLM-L12-v2. Use --cross-threshold to tune the cutoff independently.
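
A rough sketch of the threshold decision follows. It assumes the semantic similarity is a cosine similarity between two sentence-embedding vectors and that a score equal to the threshold passes; neither detail is confirmed by the release notes. The 0.20 default is the value documented for v0.6.0 and later, and the short vectors here are stand-ins for real model embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cross_language_pass(similarity: float, threshold: float = 0.20) -> bool:
    """Assumed comparison: a similarity at or above the threshold passes."""
    return similarity >= threshold


audio_vec = [0.2, 0.7, 0.1]     # stand-in for the transcription embedding
subtitle_vec = [0.1, 0.8, 0.3]  # stand-in for the subtitle embedding
print(cross_language_pass(cosine_similarity(audio_vec, subtitle_vec)))
```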

Batch mode

Pair a directory of videos with their same-stem subtitles, or score one video against a subtitle directory
--recursive / -r for Plex/Kodi/Jellyfin nested library layouts
--sub-lang CODE to filter by language tag (e.g. pt, en, pt-BR)
--filter GLOB to filter by filename pattern
--workers for parallel processing; --device to target CPU, MPS (Apple Silicon), or CUDA
Live progress with ETA, in-place result lines, and --compact one-line-per-pair summary

Subtitle formats

SRT, WebVTT, and ASS/SSA — via pysubs2.

Output

Human-readable with ANSI colour, or --json for machine-readable output. Transcription results are cached per video so re-runs against a different subtitle skip re-transcription.

States and exit codes

| State | Meaning | Exit code |
| --- | --- | --- |
| `PASS` | Content matches, no timing drift | `0` |
| `DRIFT` | Content matches, but timing drift detected | `1` (use `--resync` to fix in place) |
| `FAIL` | Content does not match | `1` |
| `UNSURE` | Not enough transcription data to decide | `1` (use `--pass-unsure` to exit `0`) |
| (none) | Error (missing dependency, unreadable file, no audio track) | `2` |
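
For scripts that call submatch, the exit codes above can be mapped back to their possible meanings. This is a small illustrative helper, not part of submatch; note that exit code `1` alone does not distinguish DRIFT, FAIL, and UNSURE.

```python
# Meanings taken from the exit code table; the helper itself is hypothetical.
EXIT_CODE_MEANINGS = {
    0: "PASS (or UNSURE when --pass-unsure is set)",
    1: "DRIFT, FAIL, or UNSURE",
    2: "error (missing dependency, unreadable file, no audio track)",
}


def describe_exit_code(code: int) -> str:
    """Translate a submatch exit code into a human-readable summary."""
    return EXIT_CODE_MEANINGS.get(code, f"unexpected exit code {code}")


for code in (0, 1, 2):
    print(code, describe_exit_code(code))
```

A wrapper would typically obtain the code from `subprocess.run(...).returncode` and read the printed or reported state to tell the three exit-`1` cases apart.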
