v0.7.0

@github-actions github-actions released this 31 May 10:17
docker-talkies v0.7.0: Qwen3-TTS PCM streaming + supply-chain
bump-on-mutation Makefile workflow.

Minor release. Two user-visible threads.

1. PCM streaming for Qwen3-TTS. A request with response_format="pcm"
   against a qwen3_tts model now streams the raw PCM body via
   HTTP/1.1 chunked transfer-encoding instead of buffering the full
   utterance. Time to first audio (TTFA) drops from ~3-8 s
   (synthesise + buffer the whole utterance) to ~200-700 ms (first
   decoded chunk). Marked WIP in the original development commit:
   the surface is live, but edge cases are still soaking. Other
   formats and the Kokoro backends are unchanged. New env var
   TALKIES_QWEN3_STREAM_CHUNK_SIZE (default 8) sets the number of
   codec steps per streamed chunk.

2. pkg-* Makefile workflow. New make targets (pkg-lock / pkg-add /
   pkg-update / pkg-upgrade / pkg-remove) call
   scripts/bump_exclude_newer.sh before any uv operation so the
   [tool.uv] exclude-newer age gate is always anchored to the
   moment of the mutation. Closes the "silent drift forward" hole.
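
For readers unfamiliar with the bump-on-mutation idea, here is a
minimal Python sketch of what such a bump step could look like. It is
not the contents of scripts/bump_exclude_newer.sh, which is not shown
in these notes. The function name, the regex-based rewrite, and the
optional age offset are all assumptions made for illustration; the
only part taken from the notes is that the exclude-newer value is
re-anchored to the time of the mutation.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches a line such as:  exclude-newer = "2025-01-01T00:00:00Z"
# Simplification: this does not check that the key sits under [tool.uv].
_EXCLUDE_NEWER = re.compile(r'^([ \t]*exclude-newer[ \t]*=[ \t]*)"[^"]*"', re.MULTILINE)


def bump_exclude_newer(pyproject_text: str, now: datetime,
                       age: timedelta = timedelta(0)) -> str:
    """Return pyproject_text with exclude-newer set to (now - age) in UTC.

    `now` should be timezone-aware. `age` is a hypothetical knob; whether
    the real script subtracts an offset is not stated in the release notes.
    """
    stamp = (now - age).astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    new_text, count = _EXCLUDE_NEWER.subn(
        lambda m: f'{m.group(1)}"{stamp}"', pyproject_text, count=1
    )
    if count == 0:
        raise ValueError("no exclude-newer key found")
    return new_text
```

A pkg-* style target would run a step like this first and only then
invoke uv, so the cutoff never lags behind the actual dependency
change.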

Plus housekeeping: .gitattributes enforces LF on shell scripts,
Dockerfile.cuda strips CRLF defensively, and the qwen3-tts xvec_only
kwarg fix landed (parallel patch, same content as v0.6.1's fix).

A chunked response carries no Content-Length header, so caller code
that assumed Content-Length on /v1/audio/speech needs to read until
end of stream for the qwen3_tts + response_format=pcm case. Every
other code path is wire-compatible with v0.6.1.
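
A minimal client-side sketch of that adaptation, using only the
Python standard library. Several things here are assumptions and not
confirmed by these notes: the JSON field names "model" and "input"
(borrowed from OpenAI-style speech endpoints), 16-bit samples (2
bytes per frame), and the helper names themselves. The point is only
that the reader loops until end of stream and never consults
Content-Length.

```python
import json
import urllib.request
from typing import Iterable, Iterator

FRAME_BYTES = 2  # assumption: 16-bit mono PCM, not stated in the notes


def aligned_frames(chunks: Iterable[bytes],
                   frame_bytes: int = FRAME_BYTES) -> Iterator[bytes]:
    """Re-slice arbitrary byte chunks so each yielded block holds whole frames.

    Network chunk boundaries need not fall on sample boundaries, so a
    partial frame is carried over to the next chunk. A trailing partial
    frame at end of stream is dropped.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        usable = len(pending) - (len(pending) % frame_bytes)
        if usable:
            yield pending[:usable]
            pending = pending[usable:]


def stream_speech(base_url: str, model: str, text: str,
                  read_size: int = 4096) -> Iterator[bytes]:
    """Yield frame-aligned PCM blocks as they arrive from the server."""
    payload = json.dumps(
        {"model": model, "input": text, "response_format": "pcm"}  # field names assumed
    ).encode("utf-8")
    request = urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        def raw_blocks() -> Iterator[bytes]:
            # http.client decodes chunked transfer-encoding transparently;
            # an empty read signals end of stream.
            while True:
                block = response.read(read_size)
                if not block:
                    return
                yield block

        yield from aligned_frames(raw_blocks())
```

A caller would iterate over stream_speech(...) and hand each block to
an audio sink or append it to a file as it arrives, instead of
pre-allocating a buffer from a declared length.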

v0.6.2 was a local-only tag (never published), so this is the next
public release.