
subarr 1.1.0 — speech-aware audio


@coaxk coaxk released this 04 Jun 06:50
9558ff4

🎙 subarr 1.1 — speech-aware audio

The headline: when you check a file's audio language by ear, subarr now lands the review clip on actual dialogue, instead of the old fixed 5-second window that hit silence or intro music most of the time. Silero voice-activity detection (VAD) picks a ~12s speech window and marks the clip with a "🎙 speech-detected" badge. The feature is opt-in: enable it from the "Speech detection" onboarding step or the Settings → System card, which downloads a ~2 MB model. When it's off or the model isn't downloaded, subarr falls back cleanly to the previous behaviour, so nothing changes unless you enable it.
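
To illustrate the idea of landing a clip on dialogue, here is a minimal sketch of choosing a ~12s window from a list of detected speech segments, with a fallback to a fixed 5s window when no segments are available. It is not subarr's actual implementation: the function names, the "maximise speech overlap" rule, and the fallback start offset are all assumptions made for the example, and the segments are assumed to come from a VAD step as `(start, end)` pairs in seconds.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start_s, end_s) of one detected speech span


def speech_overlap(segments: List[Segment], start: float, end: float) -> float:
    """Total seconds of detected speech that fall inside [start, end]."""
    return sum(max(0.0, min(e, end) - max(s, start)) for s, e in segments)


def pick_review_window(
    segments: Optional[List[Segment]],
    duration: float,
    window: float = 12.0,          # speech-aware clip length from the release notes
    fallback_window: float = 5.0,  # previous fixed clip length
    fallback_start: float = 0.0,   # hypothetical: the old start offset is not documented here
) -> Tuple[float, float, bool]:
    """Return (clip_start, clip_end, speech_detected)."""
    if not segments:
        # Detection is off, the model is missing, or no speech was found:
        # keep the old fixed-window behaviour.
        end = min(duration, fallback_start + fallback_window)
        return (fallback_start, end, False)

    # Try a window starting at each speech onset, clamped so the clip
    # does not run past the end of the file.
    latest_start = max(0.0, duration - window)
    candidates = sorted({min(s, latest_start) for s, _ in segments})

    # Keep the candidate containing the most speech; ties go to the earliest.
    best = max(candidates, key=lambda st: speech_overlap(segments, st, st + window))
    return (best, min(duration, best + window), True)
```

With segments `[(40.0, 42.0), (90.0, 99.0), (100.0, 104.0)]` in a 600-second file, this sketch picks the window starting at 90s, because that window contains 11 seconds of speech, more than either other candidate.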

Added

  • Speech-aware audio (silero VAD) — review clips land on dialogue, not dead air. (#110, #111)
  • Config persistence — UI settings survive a container restart (env vars still authoritative). (#112)
  • Deterministic subtitle readability linter (CPS/CPL/timing). (#92, #108)
  • Whisper-tuning tournament — judging engine + reference-free quality judges (hallucination / looping / canned-phrase / coverage / cross-config consensus) + a Tier-B validation harness. Internal foundation this release, validated against professional-reference accuracy; the tuning lab surfaces as a user feature in v1.2. See docs/research/tournament-validation.md. (#65, #120, #121, #122)
  • Throttled library-backfill selection core (opt-in foundation). (#116)
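
For readers unfamiliar with the readability metrics named above, the following sketch shows what a deterministic CPS (characters per second), CPL (characters per line), and timing check can look like. It is an illustration only, not the linter shipped in subarr: the `Cue` type, the `lint_cue` function, and the threshold defaults (17 CPS, 42 CPL, 0.7s minimum duration) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str     # may contain newlines for multi-line cues


def lint_cue(
    cue: Cue,
    max_cps: float = 17.0,      # illustrative threshold, not subarr's setting
    max_cpl: int = 42,          # illustrative threshold, not subarr's setting
    min_duration: float = 0.7,  # illustrative threshold, not subarr's setting
) -> List[str]:
    """Return a list of readability issues for one subtitle cue."""
    duration = cue.end - cue.start
    if duration <= 0:
        # CPS is undefined for a zero or negative duration, so stop here.
        return ["timing: cue ends at or before its start"]

    issues: List[str] = []
    if duration < min_duration:
        issues.append(f"timing: {duration:.2f}s is shorter than {min_duration}s")

    lines = cue.text.splitlines() or [""]

    # CPS: characters across all lines divided by on-screen time.
    cps = sum(len(line) for line in lines) / duration
    if cps > max_cps:
        issues.append(f"cps: {cps:.1f} exceeds {max_cps}")

    # CPL: length of the longest single line.
    longest = max(len(line) for line in lines)
    if longest > max_cpl:
        issues.append(f"cpl: {longest} exceeds {max_cpl}")

    return issues
```

Because the checks depend only on the cue's text and timestamps, the same input always produces the same issues, which is what makes a linter of this kind deterministic.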

Changed

  • Audio-review clips are now ~12s (was 5s) — long enough to reliably hear dialogue.

Upgrade

docker compose pull && docker compose up -d   # track :1.1 or :stable

New image dependency: the speech-detection runtime (onnxruntime + numpy, ~65 MB, no PyTorch) is baked in but inert until you opt in.

Full changelog: https://github.com/coaxk/subarr/blob/v1.1.0/CHANGELOG.md