docker-talkies v0.7.0 — Qwen3-TTS PCM streaming + supply-chain
bump-on-mutation Makefile workflow.
Minor release. Two user-visible threads.
1. PCM streaming for Qwen3-TTS. response_format="pcm" against a
qwen3_tts model now streams the raw PCM body via HTTP/1.1
chunked transfer-encoding instead of buffering the full
utterance. First-audio latency drops from ~3-8 s (synthesise +
buffer) to ~200-700 ms (time to first audio, TTFA, measured at
the first decoded chunk). Marked WIP in the original development
commit: the endpoint is live, but edge cases are still being
soak-tested. Other formats and the Kokoro backends are
unchanged. New env var TALKIES_QWEN3_STREAM_CHUNK_SIZE (default
8) sets the number of codec steps per chunk.
2. pkg-* Makefile workflow. New make targets (pkg-lock / pkg-add /
pkg-update / pkg-upgrade / pkg-remove) call
scripts/bump_exclude_newer.sh before any uv operation so the
[tool.uv] exclude-newer age gate is always anchored to the
moment of the mutation. Closes the "silent drift forward" hole.
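The bump-on-mutation idea can be sketched as follows. This is not the contents of scripts/bump_exclude_newer.sh (a shell script not reproduced in these notes); it is a Python illustration that assumes exclude-newer is stored as a quoted RFC 3339 timestamp in pyproject.toml. The function name bump_exclude_newer and the regex are hypothetical, and the real script may subtract an age window rather than write the current time.

```python
import re
from datetime import datetime, timezone

# Matches a line such as: exclude-newer = "2024-01-01T00:00:00Z"
# Simplification: the pattern does not check that the key sits under [tool.uv].
_EXCLUDE_NEWER = re.compile(r'^([ \t]*exclude-newer[ \t]*=[ \t]*)"[^"]*"', re.MULTILINE)


def bump_exclude_newer(pyproject_text: str, now: datetime) -> str:
    """Return pyproject_text with exclude-newer rewritten to `now` (UTC).

    `now` must be timezone-aware. Hypothetical helper for illustration only.
    """
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    new_text, count = _EXCLUDE_NEWER.subn(
        lambda m: f'{m.group(1)}"{stamp}"', pyproject_text, count=1
    )
    if count == 0:
        # Refuse to continue silently: an absent gate is exactly the drift
        # the workflow is meant to prevent.
        raise ValueError("no exclude-newer key found")
    return new_text


example = '[tool.uv]\nexclude-newer = "2024-01-01T00:00:00Z"\n'
bumped = bump_exclude_newer(
    example, datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
)
```

Running the bump before every uv operation means the recorded timestamp always reflects the moment a dependency change was made, not whenever the lockfile happened to be regenerated.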
Housekeeping: .gitattributes enforces LF on shell scripts,
Dockerfile.cuda strips CRLF defensively, and the qwen3-tts
xvec_only kwarg fix landed (a parallel patch with the same content
as the v0.6.1 fix).
Callers that rely on a Content-Length header from /v1/audio/speech
must adapt for the qwen3_tts + response_format=pcm case: a chunked
response carries no Content-Length, so read until the stream ends.
Every other code path is wire-compatible with v0.6.1.
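A minimal client-side sketch of that adaptation, using only the Python standard library. It reads the body incrementally until end of stream and never consults Content-Length. Several details are assumptions, not taken from these notes: the JSON request fields ("model", "input"), the host and port, the helper names iter_pcm_blocks and stream_speech, and 16-bit samples (sample_width=2).

```python
import http.client
import io
import json
from typing import BinaryIO, Iterator


def iter_pcm_blocks(body: BinaryIO, read_size: int = 4096,
                    sample_width: int = 2) -> Iterator[bytes]:
    """Yield sample-aligned PCM blocks from a body of unknown length.

    sample_width=2 assumes 16-bit PCM (an assumption, not confirmed here).
    """
    pending = b""
    while True:
        data = body.read(read_size)
        if not data:
            # End of stream: with chunked encoding this is the only end marker.
            break
        pending += data
        usable = len(pending) - (len(pending) % sample_width)
        if usable:
            yield pending[:usable]
            pending = pending[usable:]
    # Any trailing partial sample left in `pending` is dropped.


def stream_speech(host: str, port: int, model: str, text: str) -> Iterator[bytes]:
    """Hypothetical caller; request field names are assumed, not documented here."""
    conn = http.client.HTTPConnection(host, port)
    payload = json.dumps({"model": model, "input": text, "response_format": "pcm"})
    conn.request("POST", "/v1/audio/speech", body=payload,
                 headers={"Content-Type": "application/json"})
    resp = conn.getresponse()
    try:
        if resp.status != 200:
            raise RuntimeError(f"unexpected status {resp.status}")
        # http.client decodes chunked transfer-encoding transparently.
        yield from iter_pcm_blocks(resp)
    finally:
        conn.close()


# Offline demo: five bytes arriving in odd-sized reads.
demo = list(iter_pcm_blocks(io.BytesIO(b"\x00\x01\x02\x03\x04"), read_size=3))
```

Because blocks are yielded as they arrive, a player can start output on the first block instead of waiting for the whole utterance, which is where the latency gain comes from on the client side.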
v0.6.2 was a local-only tag (never published) — this is the next
public release.