v4.1.0 — audio quality, synthesis performance, and developer experience.
Audio quality
- Degenerate-chunk guard — all-silence chunks retry synthesis once before being dropped
- Crossfade joins — 100 ms linear fade-out at chunk tails smooths the voiced→silence transition, completing the post-processing chain alongside peak normalization
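
A minimal sketch of the two audio-quality steps above, not the project's actual code. It assumes chunks are plain lists of float samples; `SAMPLE_RATE`, the silence `threshold`, and the function names `is_silent`, `synthesize_with_guard`, and `fade_out_tail` are hypothetical. Only the single retry and the 100 ms linear fade come from the notes.

```python
from typing import Callable, List, Optional

SAMPLE_RATE = 24_000  # assumed; the release notes do not state a sample rate
FADE_MS = 100         # fade length taken from the release notes


def is_silent(samples: List[float], threshold: float = 1e-4) -> bool:
    """True when every sample is at or below the (assumed) silence threshold."""
    return all(abs(s) <= threshold for s in samples)


def synthesize_with_guard(
    synthesize: Callable[[str], List[float]], text: str
) -> Optional[List[float]]:
    """Retry an all-silence chunk once; return None if it is still silent."""
    for _attempt in range(2):  # first try plus one retry
        samples = synthesize(text)
        if not is_silent(samples):
            return samples
    return None  # caller drops the chunk


def fade_out_tail(
    samples: List[float], sample_rate: int = SAMPLE_RATE, fade_ms: int = FADE_MS
) -> List[float]:
    """Apply a linear fade-out to the last `fade_ms` milliseconds of a chunk."""
    n = min(len(samples), sample_rate * fade_ms // 1000)
    out = list(samples)
    start = len(out) - n
    for i in range(n):
        # Gain ramps down linearly and reaches exactly 0.0 on the last sample.
        out[start + i] *= 1.0 - (i + 1) / n
    return out
```

A caller would run `synthesize_with_guard` per chunk, skip chunks that come back as `None`, and pass the rest through `fade_out_tail` before joining them.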
Performance
- Read cache — re-reads skip the entire pipeline (fetch → summarize → synthesize). Cache key `(url, mode, voice, llm_model)` with a composite index; only hits when the WAV still exists on disk
- Faster synthesis — fp32 → bf16 default (~6% faster), chunk cap 280 → 400 chars (~30% fewer CSM prefills), sampler cached per `(temperature, top_k)`
- New `llm_model` column on the `reads` table (auto-migrated)
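
The read cache and the auto-migration could look roughly like the sketch below, which assumes a SQLite store. The `reads` table, the `llm_model` column, and the `(url, mode, voice, llm_model)` key come from the notes. The `wav_path` column, the index name, and the function names are assumptions made for illustration.

```python
import os
import sqlite3
from typing import Optional


def ensure_schema(conn: sqlite3.Connection) -> None:
    """Create the table if needed, add llm_model if missing, and index the cache key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reads ("
        "id INTEGER PRIMARY KEY, url TEXT, mode TEXT, voice TEXT, wav_path TEXT)"
    )
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = {row[1] for row in conn.execute("PRAGMA table_info(reads)")}
    if "llm_model" not in columns:
        conn.execute("ALTER TABLE reads ADD COLUMN llm_model TEXT")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_reads_cache_key "  # hypothetical index name
        "ON reads (url, mode, voice, llm_model)"
    )
    conn.commit()


def lookup_cached_wav(
    conn: sqlite3.Connection, url: str, mode: str, voice: str, llm_model: str
) -> Optional[str]:
    """Return the cached WAV path, or None on a miss or when the file is gone."""
    row = conn.execute(
        "SELECT wav_path FROM reads "
        "WHERE url = ? AND mode = ? AND voice = ? AND llm_model = ? "
        "ORDER BY id DESC LIMIT 1",
        (url, mode, voice, llm_model),
    ).fetchone()
    if row is None or not row[0] or not os.path.exists(row[0]):
        return None  # treat a missing WAV as a cache miss
    return row[0]
```

Checking the file on disk at lookup time means a deleted WAV falls back to the full pipeline instead of returning a dangling path.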
CLI
- Generation timer — player shows "Xs to generate" for live reads
- Library UI revamp — inline `mode · duration · words · date` per row, space to preview audio without leaving the library, enter for the full player
- Venv auto-detect — server spawns via `.venv/bin/python3 -m readback` (no activation needed); startup stderr captured
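
As an illustration of the venv auto-detect behavior, the sketch below resolves the interpreter and spawns the server with stderr captured. The `.venv/bin/python3 -m readback` command is from the notes. The fallback to `sys.executable`, the `project_root` parameter, and the function names are assumptions; the real CLI may be written in another language entirely.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def resolve_python(project_root: Path) -> str:
    """Prefer the project's venv interpreter; fall back to the current one (assumed)."""
    candidate = project_root / ".venv" / "bin" / "python3"
    if candidate.is_file():
        return str(candidate)
    return sys.executable


def server_command(project_root: Path) -> List[str]:
    """Build the argv used to start the server, no venv activation required."""
    return [resolve_python(project_root), "-m", "readback"]


def spawn_server(project_root: Path) -> subprocess.Popen:
    """Start the server and keep stderr on a pipe so startup errors can be shown."""
    return subprocess.Popen(
        server_command(project_root),
        cwd=project_root,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
    )
```

If the process exits during startup, the caller can read `proc.stderr` and surface the message instead of failing silently.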
Tests, CI & docs
- Test suite trimmed 59 → 38; new `docs/TESTS.md` catalogue; CI JUnit summaries
- All doc surfaces synced; JOURNEY.md + finetune README rewritten
- CLI screenshots refreshed for v4.1.0
Full changelog: #20