OpenQuack 2.0.0-alpha.18

Pre-release

@larryxiao released this 03 Jun 14:02
· 13 commits to main since this release

Highlights

Mixed Chinese/English now holds up across a whole recording. 🌏

Start a thought in English and finish it in 中文 — your Chinese comes back as Chinese, not a botched English "translation," and it's normalised to your system's script (简体 or 繁體).

Before, the streaming engine locked onto the language of your first sentence. If you switched to Chinese partway through a long recording, the Chinese tail was silently translated to English:

…continue speaking the next portion entirely in Chinese. I am still trying to read and organize my books with it. The effect is very satisfying…

Now each full chunk re-detects, so the switch is transcribed as Chinese:

…所以,在此刻,让我转过来,并继续在中文中的下一部分说。 最近我还在尝试用它来朗读和整理我的读书笔记…

The script normaliser then puts those characters in your system's script, so you stop getting a random mix of Traditional and Simplified.
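As a rough illustration of what script normalisation involves, the sketch below picks a target script from a locale tag and maps characters one by one. It is not OpenQuack's normaliser: `target_script`, `normalise_script`, and the six-entry table are hypothetical, and the rule for mapping a locale to a script is an assumption. A real converter also needs a full dictionary and phrase-level handling, because some Simplified characters correspond to more than one Traditional character.

```python
# Tiny illustrative table (hypothetical); a real normaliser needs a full dictionary.
TRAD_TO_SIMP = {"說": "说", "讀": "读", "書": "书", "筆": "笔", "記": "记", "體": "体"}
SIMP_TO_TRAD = {simp: trad for trad, simp in TRAD_TO_SIMP.items()}


def target_script(locale_tag: str) -> str:
    """Guess the preferred script from a locale tag such as 'zh-Hant' or 'zh_TW'.

    Assumption: an explicit 'Hant' subtag, or a TW/HK/MO region, means Traditional;
    everything else falls back to Simplified.
    """
    parts = locale_tag.replace("_", "-").lower().split("-")
    if "hant" in parts or parts[-1] in {"tw", "hk", "mo"}:
        return "traditional"
    return "simplified"


def normalise_script(text: str, locale_tag: str) -> str:
    """Map each character into the target script; unknown characters pass through."""
    table = SIMP_TO_TRAD if target_script(locale_tag) == "traditional" else TRAD_TO_SIMP
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(normalise_script("讀書筆記 and 读书", "zh-Hans-CN"))
    print(normalise_script("读书笔记", "zh_TW"))
```

Characters outside the table, including Latin text, are left untouched, so a mixed English/Chinese transcript only has its hanzi rewritten.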

Measured on medium (streaming auto, M4 / 16 GB), A/B against the old lock:

| Case | Old (lock) | New (per-chunk) |
| --- | --- | --- |
| Monolingual English (49 s) | WER 3.7% | 3.7% (byte-identical, no regression) |
| Monolingual Mandarin (37 s) | hanzi | byte-identical, no regression |
| Code-switch EN→ZH (53 s) | Chinese tail → English | Chinese tail → correct hanzi |

Known limitation: a switch that happens only in the final <10 s of a recording lands in the short trailing chunk and is missed (that chunk inherits the previous language). If you keep talking in the new language after the switch, it is caught.
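The difference between the old lock and per-chunk detection, including the trailing-chunk limitation, can be sketched roughly as follows. This is a minimal model under stated assumptions, not the engine's code: `assign_languages`, the `detect` callback, and the `MIN_DETECT_SECONDS` threshold (taken from the "<10 s" figure above) are all hypothetical names chosen for illustration.

```python
from typing import Callable, List, Tuple

# Assumed threshold, taken from the "<10 s" limitation; the real value may differ.
MIN_DETECT_SECONDS = 10.0


def assign_languages(
    chunks: List[Tuple[float, object]],
    detect: Callable[[object], str],
    lock_first: bool = False,
) -> List[str]:
    """Return one language code per (duration_seconds, audio) chunk.

    lock_first=True models the old behaviour: the first chunk's language is
    reused for the whole recording. Otherwise each sufficiently long chunk
    re-detects, and a short chunk inherits the previous chunk's language.
    """
    langs: List[str] = []
    for duration, audio in chunks:
        if not langs:
            langs.append(detect(audio))   # first chunk always detects
        elif lock_first:
            langs.append(langs[0])        # old lock: never re-detect
        elif duration < MIN_DETECT_SECONDS:
            langs.append(langs[-1])       # short chunk: inherit previous language
        else:
            langs.append(detect(audio))   # full chunk: re-detect
    return langs


if __name__ == "__main__":
    # Stand-in detector: each fake "audio" payload is already its language code.
    chunks = [(30.0, "en"), (20.0, "zh"), (3.0, "en")]
    print(assign_languages(chunks, detect=lambda a: a, lock_first=True))
    print(assign_languages(chunks, detect=lambda a: a))
```

In this model the 3-second trailing chunk keeps the previous chunk's language under per-chunk detection, which mirrors the limitation described above: a late switch confined to a short final chunk is not picked up.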

See SPEC-035 and #69.