feat: extend accuracy report to cover wavekat-zh backend by wavekat-eason · Pull Request #21 · wavekat/wavekat-turn

wavekat-eason · 2026-05-11T10:18:57Z

Summary

- Reference probabilities in `tests/fixtures/reference.json` are now keyed by `(backend, file)`, so multiple Smart Turn variants can share one fixture file. Older flat entries still load: `backend` defaults to `pipecat` via serde.
- `scripts/gen_reference.py` scores both upstream Pipecat v3.2 and the WaveKat zh fine-tune (`wavekat/smart-turn-ONNX`, `zh` subfolder) and writes one entry per `(backend, clip)` pair.
- `crates/wavekat-turn/tests/accuracy.rs` gains a `wavekat` module gated on `wavekat-smart-turn` that loads the zh variant once and contributes rows for the `zh_*.wav` fixtures. The shared `load_wav_f32` / `raw_prob` helpers are lifted out of the `pipecat` module.
- `make accuracy` now builds with `--features wavekat-smart-turn`; the zh weights are pulled from HuggingFace on first run and cached under `$HF_HOME/hub/`.
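The `(backend, file)` keying with a `pipecat` fallback for older flat entries can be sketched as below. This is a minimal, dependency-free illustration, not the code in this PR: the PR handles the default through serde, whereas the sketch uses a plain `Option` so it stays std-only. `RawEntry`, `index_references`, `lookup`, and the field names are hypothetical.

```rust
use std::collections::HashMap;

/// Backend assumed for legacy entries that carry no backend field.
const DEFAULT_BACKEND: &str = "pipecat";

/// One reference entry after parsing the fixture file (hypothetical shape).
/// `backend` is `None` for older flat entries.
#[derive(Debug, Clone)]
pub struct RawEntry {
    pub backend: Option<String>,
    pub file: String,
    pub probability: f32,
}

/// Index entries by (backend, file), filling in the default backend
/// wherever an entry does not name one.
pub fn index_references(entries: &[RawEntry]) -> HashMap<(String, String), f32> {
    entries
        .iter()
        .map(|e| {
            let backend = e
                .backend
                .clone()
                .unwrap_or_else(|| DEFAULT_BACKEND.to_string());
            ((backend, e.file.clone()), e.probability)
        })
        .collect()
}

/// Look up the reference probability for one (backend, clip) pair.
pub fn lookup(
    index: &HashMap<(String, String), f32>,
    backend: &str,
    file: &str,
) -> Option<f32> {
    index.get(&(backend.to_string(), file.to_string())).copied()
}
```

Because the key is the pair rather than the file name alone, the same clip can hold a separate reference value for each backend without the entries colliding.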

Result

make accuracy (v0.0.8):

| Backend | Clip | Python | Rust | Diff | Status |
|---|---|---|---|---|---|
| pipecat | `silence_2s.wav` | 0.9870 | 0.9870 | 0.0000 | PASS |
| pipecat | `speech_finished.wav` | 0.9849 | 0.9849 | 0.0000 | PASS |
| pipecat | `speech_mid.wav` | 0.0477 | 0.0426 | 0.0051 | PASS |
| pipecat | `zh_speech_finished.wav` | 0.9865 | 0.9865 | 0.0000 | PASS |
| pipecat | `zh_speech_finished_short.wav` | 0.9823 | 0.9843 | 0.0019 | PASS |
| pipecat | `zh_speech_mid.wav` | 0.0717 | 0.0717 | 0.0000 | PASS |
| wavekat-zh | `zh_speech_finished.wav` | 0.7660 | 0.7498 | 0.0162 | PASS |
| wavekat-zh | `zh_speech_finished_short.wav` | 0.8751 | 0.8751 | 0.0000 | PASS |
| wavekat-zh | `zh_speech_mid.wav` | 0.1640 | 0.1640 | 0.0000 | PASS |

All rows are within the existing 0.02 tolerance (max Δ = 0.0162).
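The pass/fail rule behind the table can be sketched as follows, assuming the check is a plain absolute difference compared against the tolerance. The `Row` type and helper names are hypothetical, and whether the real test uses `<=` or `<` is not stated in this PR.

```rust
/// Tolerance quoted in the report.
pub const TOLERANCE: f32 = 0.02;

/// One row of the accuracy report (hypothetical shape).
pub struct Row {
    pub backend: &'static str,
    pub clip: &'static str,
    pub python: f32,
    pub rust: f32,
}

/// Absolute difference between the Python reference and the Rust output.
pub fn abs_diff(row: &Row) -> f32 {
    (row.python - row.rust).abs()
}

/// A row passes when its difference does not exceed the tolerance
/// (assumption: inclusive comparison).
pub fn passes(row: &Row) -> bool {
    abs_diff(row) <= TOLERANCE
}

/// Largest difference across all rows; 0.0 for an empty report.
pub fn max_diff(rows: &[Row]) -> f32 {
    rows.iter().map(abs_diff).fold(0.0, f32::max)
}
```

For the `wavekat-zh` / `zh_speech_finished.wav` row, this gives |0.7660 − 0.7498| = 0.0162, which is under 0.02 and matches the reported maximum.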

Test plan

- `make accuracy`: 9/9 PASS
- `cargo test -p wavekat-turn --no-default-features --features ""`: clean
- `cargo test -p wavekat-turn --no-default-features --features pipecat --test accuracy`: 3 pipecat regression tests pass against the new schema
- `cargo fmt --check`, `cargo clippy --features wavekat-smart-turn -- -D warnings`: clean
- CI green

🤖 Generated with Claude Code

Reference probabilities are now keyed by `(backend, file)` so multiple Smart Turn variants can coexist. `make accuracy` builds with `wavekat-smart-turn` and emits one row per pair, covering upstream Pipecat on both English and Mandarin fixtures plus the WaveKat zh fine-tune on the Mandarin fixtures. All 9 rows pass within the 0.02 tolerance; pipecat-only and no-feature builds remain green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

## 🤖 New release

* `wavekat-turn`: 0.0.8 -> 0.0.9 (✓ API compatible changes)

<details><summary>Changelog</summary>
<blockquote>

## [0.0.9](v0.0.8...v0.0.9) - 2026-05-11

### Added

- extend accuracy report to cover wavekat-zh backend ([#21](#21))

</blockquote>
</details>

---

This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

wavekat-eason merged commit c6dbebd into main May 11, 2026
5 checks passed

wavekat-eason deleted the feat/accuracy-wavekat-zh branch May 11, 2026 10:21

github-actions Bot mentioned this pull request May 11, 2026

chore: release v0.0.9 #22

Merged

wavekat-eason merged 1 commit into main from feat/accuracy-wavekat-zh
