v0.3.0

github-actions released this 28 May 14:32
· 17 commits to main since this release
docker-talkies v0.3.0 — Kokoro TTS.

Adds an OpenAI-compatible /v1/audio/speech endpoint with
mp3/opus/aac/flac/wav/pcm output, /v1/audio/voices for voice discovery,
and kokoro-82m in both the CPU and CUDA images. The backend protocol is
now split into BackendBase / ASRBackend / TTSBackend. Cross-modality
eviction lets ASR and TTS share one VRAM pool, and the idle TTL sweeper
applies to both.
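
The protocol split and shared pool can be pictured with a minimal
sketch like the one below. Only the three class names come from these
notes; the method names, the VramPool class, the least-recently-used
policy, and all sizes are assumptions made for illustration and are not
the project's actual implementation.

```python
from typing import Protocol


class BackendBase(Protocol):
    """Shared surface for every backend (attribute names are hypothetical)."""
    slug: str
    vram_mb: int

    def unload(self) -> None: ...


class ASRBackend(BackendBase, Protocol):
    def transcribe(self, audio: bytes) -> str: ...  # hypothetical signature


class TTSBackend(BackendBase, Protocol):
    def synthesize(self, text: str, voice: str) -> bytes: ...  # hypothetical


class VramPool:
    """One budget shared by ASR and TTS (hypothetical helper)."""

    def __init__(self, budget_mb: int, idle_ttl_s: float) -> None:
        self.budget_mb = budget_mb
        self.idle_ttl_s = idle_ttl_s
        self._loaded: dict[str, tuple[BackendBase, float]] = {}

    def used_mb(self) -> int:
        return sum(b.vram_mb for b, _ in self._loaded.values())

    def loaded_slugs(self) -> list[str]:
        return list(self._loaded)

    def touch(self, backend: BackendBase, now: float) -> None:
        """Record a use, evicting least-recently-used backends of either
        modality until the new one fits. A backend larger than the whole
        budget is still admitted once the pool is empty."""
        self._loaded.pop(backend.slug, None)
        while self._loaded and self.used_mb() + backend.vram_mb > self.budget_mb:
            victim = min(self._loaded, key=lambda s: self._loaded[s][1])
            self._evict(victim)
        self._loaded[backend.slug] = (backend, now)

    def sweep(self, now: float) -> list[str]:
        """Evict every backend idle for at least idle_ttl_s seconds."""
        expired = [s for s, (_, last) in self._loaded.items()
                   if now - last >= self.idle_ttl_s]
        for slug in expired:
            self._evict(slug)
        return expired

    def _evict(self, slug: str) -> None:
        backend, _ = self._loaded.pop(slug)
        backend.unload()


class StubBackend:
    """Stand-in used only to exercise the pool; it does no inference."""

    def __init__(self, slug: str, vram_mb: int) -> None:
        self.slug = slug
        self.vram_mb = vram_mb
        self.loaded = True

    def unload(self) -> None:
        self.loaded = False
```

Because the pool only relies on the BackendBase surface, loading a TTS
model can push out an idle ASR model and vice versa, which is the
behaviour "cross-modality eviction" describes.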

Both runtime images now bundle en_core_web_sm, so Kokoro's English G2P
never attempts a pip download on first call (the runtime images ship
without pip).
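
One way to confirm the model is baked in is to check that its package
is importable inside the image. The sketch below assumes
en_core_web_sm is installed as a regular Python package (as spaCy
models normally are); the helper name is hypothetical.

```python
import importlib.util


def model_is_bundled(package: str = "en_core_web_sm") -> bool:
    """Return True if the named top-level package can be imported here."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Exit non-zero when the model is missing, so a build step can fail fast.
    raise SystemExit(0 if model_is_bundled() else 1)
```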

Integration suite gains a cross-modality round-trip test (Kokoro synth
→ fast ASR → assert expected words) plus CPU/memory caps on the test
container to keep the host responsive while inference is running.
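
The round trip can be sketched as follows. This is an illustration
under assumptions, not the suite's actual test: the base URL, the
"af_heart" voice, and both helper names are hypothetical, and the
request fields follow the OpenAI speech API that the endpoint is
described as compatible with. The ASR call is left out because its
endpoint is not named in these notes.

```python
import json
import re
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: adjust to your deployment


def build_speech_request(text: str, voice: str = "af_heart",
                         response_format: str = "wav",
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST to the OpenAI-style speech endpoint."""
    payload = {
        "model": "kokoro-82m",
        "input": text,
        "voice": voice,  # assumption: list real voices via /v1/audio/voices
        "response_format": response_format,
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def missing_words(transcript: str, expected: list[str]) -> list[str]:
    """Return the expected words absent from the transcript, ignoring
    case and punctuation."""
    heard = set(re.findall(r"[a-z0-9']+", transcript.lower()))
    return [w for w in expected if w.lower() not in heard]
```

A test would send the request with urllib.request.urlopen, pass the
returned audio to an ASR endpoint, and assert that missing_words
returns an empty list for the resulting transcript.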

Backwards-compatible: all existing ASR endpoints, model slugs, MCP
tools, and response shapes work exactly as before.