v1.0.4 — token-aware TTS chunking

@dta121 dta121 released this 29 May 11:23
· 19 commits to main since this release

Fixed

  • TTS requests to gpt-4o-mini-tts failed with the error "Input of N tokens is over the maximum input limit of 2000 tokens", especially for translated (CJK) narration. chunk_text() split purely on character count (3800), which suits the tts-1 family's 4096-character limit but ignores the newer model's 2000-token cap: 3800 dense CJK characters are ~3000 tokens.
    • Chunking is now token-aware: a new estimate_tokens() heuristic (wide/CJK characters ~1 token each, other text ~4 chars/token, rounded up as a safe ceiling) plus an 1800-token default cap applied alongside the character cap.
    • Sentence splitting now recognises CJK terminators (。．！？), which carry no trailing whitespace, so unspaced Japanese/Chinese narration splits at real sentence boundaries instead of falling through to a single oversized hard cut.
    • Added unit tests for the token cap and the estimator.
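
To illustrate the fix, here is a minimal Python sketch of token-aware chunking along the lines described above. It is not the project's actual code: only the names estimate_tokens() and chunk_text() and the 3800-character / 1800-token caps come from these notes. The helpers split_sentences() and _hard_cut(), the use of unicodedata.east_asian_width() to detect wide characters, the terminator set, and the joining rules are all assumptions made for illustration.

```python
import math
import re
import unicodedata

CJK_TERMINATORS = "。．！？"  # assumed terminator set, for illustration only

def estimate_tokens(text: str) -> int:
    """Heuristic ceiling: wide/CJK chars ~1 token each, other text ~4 chars/token."""
    wide = sum(1 for ch in text if unicodedata.east_asian_width(ch) in ("W", "F"))
    return wide + math.ceil((len(text) - wide) / 4)

# Split after ASCII terminators followed by whitespace, or directly after
# CJK terminators, which usually have no trailing whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+|(?<=[" + CJK_TERMINATORS + r"])\s*")

def split_sentences(text: str) -> list[str]:
    """Hypothetical helper: split text at sentence boundaries (Python 3.7+)."""
    return [s for s in _SENTENCE_END.split(text) if s]

def _hard_cut(sentence: str, max_chars: int, max_tokens: int) -> list[str]:
    """Hypothetical helper: cut one oversized sentence into pieces under both caps."""
    pieces, start = [], 0
    while start < len(sentence):
        end = min(len(sentence), start + max_chars)
        # Shrink the window until the token estimate fits the cap.
        while estimate_tokens(sentence[start:end]) > max_tokens:
            end -= max(1, (end - start) // 10)
        pieces.append(sentence[start:end])
        start = end
    return pieces

def chunk_text(text: str, max_chars: int = 3800, max_tokens: int = 1800) -> list[str]:
    """Pack sentences into chunks that respect both the character and token caps."""
    chunks, current = [], ""
    for sentence in split_sentences(text):
        for piece in _hard_cut(sentence, max_chars, max_tokens):
            if not current:
                candidate = piece
            else:
                # No joining space after a CJK terminator (an assumption).
                joiner = "" if current[-1] in CJK_TERMINATORS else " "
                candidate = current + joiner + piece
            if len(candidate) <= max_chars and estimate_tokens(candidate) <= max_tokens:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Under this estimator, unspaced Japanese text hits the token cap long before the character cap, so the token check is what bounds each chunk for CJK narration, while the character check still governs mostly-ASCII text.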

Full changelog: v1.0.3...v1.0.4