feat(audio-lab): wire wavekat-asr backend (M1) by wavekat-eason · Pull Request #46 · wavekat/wavekat-lab

wavekat-eason · 2026-05-14T23:08:22Z

Summary

M1 of the ASR plan — backend wiring only.

New backend/src/asr.rs module: AsrConfig, AsrServerEvent, run_asr_pipeline. Per-config OS thread owns a SherpaOnnxAsr; tokio task bridges the existing audio broadcast in, blocking_send bridges transcript events back.
WS surface: new ListAsrBackends / SetAsrConfigs client messages, AsrBackends + Asr server messages. Asr carries a kind discriminator (ready / speech_started / speech_ended / partial / final / warning) with the relevant optional payload fields.
Wired into both StartRecording (live mic) and LoadFile (WAV upload) paths so the same runner serves both flows.
Adds wavekat-asr = "0.0.4" with the sherpa-onnx feature. The first recording after pulling downloads the ~75 MB bilingual Zipformer model to $HF_HOME; the worker emits a ready event once the model is loaded.
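
The threading model described in the first bullet can be sketched roughly as below. This is a std-only illustration, not the code in asr.rs: `ToyRecognizer`, `AsrEvent`, and `spawn_asr_worker` are hypothetical names, the recognizer is a stand-in for `SherpaOnnxAsr`, and `std::sync::mpsc` replaces the tokio broadcast and mpsc channels that the PR actually uses.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical, simplified event type; the real `AsrServerEvent` carries more.
#[derive(Debug, Clone, PartialEq)]
pub enum AsrEvent {
    Ready,
    Partial(String),
    Final(String),
}

/// Stand-in for a synchronous, stateful recognizer such as `SherpaOnnxAsr`.
/// It emits one fake token per non-empty frame and finalizes on an empty frame.
struct ToyRecognizer {
    words: Vec<String>,
}

impl ToyRecognizer {
    fn new() -> Self {
        Self { words: Vec::new() }
    }

    fn feed(&mut self, frame: &[i16]) -> Option<AsrEvent> {
        if frame.is_empty() {
            if self.words.is_empty() {
                return None;
            }
            let text = self.words.join(" ");
            self.words.clear();
            Some(AsrEvent::Final(text))
        } else {
            self.words.push(format!("w{}", frame.len()));
            Some(AsrEvent::Partial(self.words.join(" ")))
        }
    }
}

/// Spawns one OS thread that owns the recognizer for a single config.
/// Audio frames go in through the returned sender; events come back out.
pub fn spawn_asr_worker() -> (Sender<Vec<i16>>, Receiver<AsrEvent>, JoinHandle<()>) {
    let (audio_tx, audio_rx) = channel::<Vec<i16>>();
    let (event_tx, event_rx) = channel::<AsrEvent>();
    let handle = thread::spawn(move || {
        // The recognizer is built on the worker thread, so a slow model load
        // never blocks the caller; `Ready` is sent once it is usable.
        let mut asr = ToyRecognizer::new();
        if event_tx.send(AsrEvent::Ready).is_err() {
            return;
        }
        // `recv` fails once every audio sender is dropped, which ends the loop.
        while let Ok(frame) = audio_rx.recv() {
            if let Some(event) = asr.feed(&frame) {
                if event_tx.send(event).is_err() {
                    break; // the consumer went away
                }
            }
        }
    });
    (audio_tx, event_rx, handle)
}
```

Keeping the recognizer on a dedicated thread means the blocking decode calls never run on an async executor; in the PR the same role is played by `blocking_send` on the event side.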

Out of scope (later milestones, per the plan):

Frontend AsrConfigPanel and AsrTranscript (M2)
README "ASR" section, log-panel batching, cold-start loading UI (M3)
Two-channel ASR, per-config preprocessing, benchmark table

Test plan

cargo check --workspace
cargo clippy --workspace -- -D warnings
cargo test --workspace (5 existing tests pass)
Manual: make dev-backend, send set_asr_configs + start_recording via wscat, confirm partials/finals print in logs

🤖 Generated with Claude Code

Adds a sherpa-onnx ASR backend that fans out alongside the existing VAD and turn-detection pipelines. Each AsrConfig runs in its own worker thread (sherpa-onnx is sync + holds model state); a tokio task bridges the audio broadcast in, and a blocking_send loop bridges transcript events back to the websocket.

WS surface: ListAsrBackends / SetAsrConfigs client messages, AsrBackends + Asr server messages. Asr events carry a `kind` field (ready, speech_started, speech_ended, partial, final, warning) with optional ts_ms/end_ms/text/confidence/message.

M1 scope: backend only, no frontend yet. cargo check + clippy clean, existing tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
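
The `Asr` event shape described in this commit message could be modelled along the following lines. The `kind` values and the field names come from the text above; everything else is an assumption made for illustration: the struct name `AsrEvent`, the field types, the flat JSON layout, and the hand-rolled `to_json` (the real backend may well use a serialization crate and a different wire format).

```rust
/// Values of the `kind` discriminator listed in the PR text.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AsrKind {
    Ready,
    SpeechStarted,
    SpeechEnded,
    Partial,
    Final,
    Warning,
}

impl AsrKind {
    pub fn as_str(self) -> &'static str {
        match self {
            AsrKind::Ready => "ready",
            AsrKind::SpeechStarted => "speech_started",
            AsrKind::SpeechEnded => "speech_ended",
            AsrKind::Partial => "partial",
            AsrKind::Final => "final",
            AsrKind::Warning => "warning",
        }
    }
}

/// Hypothetical event struct: one discriminator plus optional payload fields.
#[derive(Debug, Clone, PartialEq)]
pub struct AsrEvent {
    pub kind: AsrKind,
    pub ts_ms: Option<u64>,
    pub end_ms: Option<u64>,
    pub text: Option<String>,
    pub confidence: Option<f32>,
    pub message: Option<String>,
}

impl AsrEvent {
    pub fn new(kind: AsrKind) -> Self {
        Self { kind, ts_ms: None, end_ms: None, text: None, confidence: None, message: None }
    }

    /// Serializes to a flat JSON object, omitting fields that are `None`.
    /// Assumes `confidence` is finite (NaN or infinity would not be valid JSON).
    pub fn to_json(&self) -> String {
        let mut fields = vec![format!("\"kind\":\"{}\"", self.kind.as_str())];
        if let Some(v) = self.ts_ms {
            fields.push(format!("\"ts_ms\":{}", v));
        }
        if let Some(v) = self.end_ms {
            fields.push(format!("\"end_ms\":{}", v));
        }
        if let Some(v) = &self.text {
            fields.push(format!("\"text\":\"{}\"", escape(v)));
        }
        if let Some(v) = self.confidence {
            fields.push(format!("\"confidence\":{}", v));
        }
        if let Some(v) = &self.message {
            fields.push(format!("\"message\":\"{}\"", escape(v)));
        }
        format!("{{{}}}", fields.join(","))
    }
}

/// Minimal JSON string escaping: quotes, backslashes, and control characters.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}
```

A single struct with optional fields keeps the message type uniform on the wire; the trade-off is that nothing in the type system ties a given `kind` to the fields it is expected to carry.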

wavekat-eason · 2026-05-14T23:40:07Z

Superseded by #49 (consolidated ASR work).

## Summary

Implements the [ASR plan](https://github.com/wavekat/wavekat-lab/blob/main/docs/05-plan-asr.md) end-to-end. Supersedes the original stacked PRs (#46, #47, #48, all closed).

### Backend

- New `tools/audio-lab/backend/src/asr.rs` module: `AsrConfig`, `AsrServerEvent`, `run_asr_pipeline`. Each config gets a dedicated OS worker thread that owns a `SherpaOnnxAsr`; a tokio task bridges the audio broadcast in, and `blocking_send` bridges transcript events back onto a tokio mpsc.
- New WS messages: `ListAsrBackends` / `SetAsrConfigs` (client) and `AsrBackends` / `Asr` (server). `Asr` carries a `kind` discriminator: `ready` / `speech_started` / `speech_ended` / `partial` / `final` / `warning`, with optional `ts_ms` / `end_ms` / `text` / `confidence` / `message` fields populated per kind.
- Wired into both `StartRecording` (live mic) and `LoadFile` (WAV upload) paths.
- Adds `wavekat-asr = "0.0.4"` with the `sherpa-onnx` feature.

### Frontend

- New `AsrConfigPanel` mirrors `TurnConfigPanel` (backend + preset dropdowns, editable label, add / clone / remove).
- New `AsrTranscript` card per active ASR config: committed finals with `[mm:ss.s–mm:ss.s]` prefix, dimmed trailing partial that gets overwritten until the final lands, footer with last confidence / count / avg segment duration. Shows `loading model…` until the backend's `ready` event arrives.
- `websocket.ts` types + log-panel batching of `asr.partial` messages so the log doesn't drown in partials. Finals and warnings still log inline.
- `App.tsx`: `asrConfigs` persisted to `localStorage` (`lab-asr-configs`), pushed to backend on change + before every start / load_file, transcripts reset on each new session.
- **2-column layout**: all config panels (VAD / Turn / Pipeline / ASR) moved into a left aside (`w-80` on lg+); waveform / spectrum / timelines / ASR transcript / preprocessed sections fill a flex-1 main column. Matches the layout sketch in `docs/05-plan-asr.md`. Single-column on narrower screens.

### Docs

- `tools/audio-lab/README.md`: new "ASR" subsection with the sherpa-onnx preset table and a NOTE about the first-run ~75 MB HF model download. "Live transcripts" added to What It Does.
- Top-level `README.md`: ASR mentioned in the audio-lab one-liner + tool-layout blurb.

### Out of scope (follow-up)

- Loom / screenshot in the README video table: needs a recording session.
- Transcript ticks on `VadTimeline` / `PipelineTimeline` at each `final`.
- Two-channel ASR (`Channel::Remote`).
- WER / latency benchmarking: wait for a second ASR backend.
- Audio-lab release tag: release-please will cut it automatically on merge.

## Test plan

- [x] `cargo check --workspace` (backend)
- [x] `cargo clippy --workspace -- -D warnings` (backend, when M1 landed)
- [x] `cargo test --workspace` (5 pre-existing tests still pass)
- [x] `npm run lint` (no new warnings beyond pre-existing 7 in `FrequencySpectrum` / `Waveform`)
- [x] `npm run build` (clean)
- [ ] Manual smoke test: `make dev`, add an ASR config (sherpa-onnx · bilingual), record / load a WAV, confirm partials roll in and finals commit; toggle preset between bilingual / en / zh and verify model reload.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
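
The transcript behaviour described under Frontend (finals committed with a `[mm:ss.s–mm:ss.s]` prefix, a trailing partial that is overwritten until the final lands, `loading model…` until `ready`) amounts to a small state machine. The sketch below illustrates that logic only. It is written in Rust to match the backend, whereas the actual `AsrTranscript` card is a frontend component; `Transcript`, `Update`, and `format_ts` are hypothetical names, the parentheses stand in for the dimmed styling, and truncating to tenths of a second (rather than rounding) is an assumption.

```rust
/// Formats a millisecond offset as `mm:ss.s`, truncating to tenths of a second.
pub fn format_ts(ms: u64) -> String {
    let minutes = ms / 60_000;
    let rem = ms % 60_000;
    let seconds = rem / 1_000;
    let tenths = (rem % 1_000) / 100;
    format!("{:02}:{:02}.{}", minutes, seconds, tenths)
}

/// Hypothetical subset of the events a transcript card reacts to.
pub enum Update {
    Ready,
    Partial(String),
    Final { ts_ms: u64, end_ms: u64, text: String },
}

/// Per-config transcript state: committed finals plus at most one live partial.
#[derive(Debug, Default)]
pub struct Transcript {
    ready: bool,
    finals: Vec<String>,
    partial: Option<String>,
}

impl Transcript {
    pub fn apply(&mut self, update: Update) {
        match update {
            Update::Ready => self.ready = true,
            // Each new partial replaces the previous one.
            Update::Partial(text) => self.partial = Some(text),
            // A final commits a line and clears the trailing partial.
            Update::Final { ts_ms, end_ms, text } => {
                self.partial = None;
                self.finals
                    .push(format!("[{}–{}] {}", format_ts(ts_ms), format_ts(end_ms), text));
            }
        }
    }

    /// Lines to display; the live partial is shown in parentheses.
    pub fn render(&self) -> Vec<String> {
        if !self.ready {
            return vec!["loading model…".to_string()];
        }
        let mut lines = self.finals.clone();
        if let Some(partial) = &self.partial {
            lines.push(format!("({})", partial));
        }
        lines
    }
}
```

Holding a single `Option` for the partial, rather than appending partials to the list, is what makes the "overwritten until the final lands" behaviour fall out naturally.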

wavekat-eason mentioned this pull request May 14, 2026

feat(audio-lab): live ASR transcript panel (M2) #47

Closed

3 tasks

wavekat-eason closed this May 14, 2026

wavekat-eason deleted the feat/asr-backend branch May 14, 2026 23:40

wavekat-eason mentioned this pull request May 14, 2026

feat(audio-lab): live ASR backend + transcript UI #50

Merged

6 tasks

wavekat-eason wants to merge 1 commit into main from feat/asr-backend
