feat(audio-lab): live ASR transcript panel (M2) by wavekat-eason · Pull Request #47 · wavekat/wavekat-lab

wavekat-eason · 2026-05-14T23:19:31Z

Summary

M2 of the ASR plan — the frontend that makes ASR visible to a user. Stacks on top of M1 (#46); base = `feat/asr-backend`.

New `AsrConfigPanel` mirrors `TurnConfigPanel` (backend dropdown, preset dropdown, editable label, add / clone / remove).
New `AsrTranscript` card per active config: committed finals with `[mm:ss.s–mm:ss.s]` prefix, dimmed trailing partial that gets overwritten until the final lands, footer with last confidence / count of finals / avg segment duration. Shows "loading model…" until the backend's `ready` event arrives, and "⚠" for warnings. "Copy all" concatenates final text to the clipboard.
Auto-scrolls the transcript list to the bottom unless the user has scrolled up.
`websocket.ts`: new `AsrConfig` / `AsrEventKind` types, `asr_backends` + `asr` server messages, `list_asr_backends` + `set_asr_configs` client messages. Log panel batches the spammy `partial` events (matching how `vad` is batched) and inlines finals / warnings verbatim.
`App.tsx`: `asrConfigs` persisted to `localStorage` (key `lab-asr-configs`), pushed to backend on change + before every `start_recording` / `load_file`, transcripts reset on each new session.
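
The transcript behaviour described above (finals accumulate, the trailing partial is overwritten until a final lands, and the footer is derived from the finals) can be sketched as a small pure reducer. This is a minimal illustration, not the code in this PR: the type names `AsrEvent`, `FinalSegment`, `TranscriptState` and the helpers below are hypothetical, the `ts_ms` / `end_ms` field names are assumed, and the space separator used for "Copy all" is a guess.

```typescript
// Hypothetical event shape; only the kinds the transcript card reacts to.
type AsrEvent =
  | { kind: "ready" }
  | { kind: "partial"; text: string }
  | { kind: "final"; ts_ms: number; end_ms: number; text: string; confidence?: number }
  | { kind: "warning"; message: string };

interface FinalSegment {
  startMs: number;
  endMs: number;
  text: string;
  confidence: number | null;
}

interface TranscriptState {
  ready: boolean;        // false => show "loading model…"
  finals: FinalSegment[];
  partial: string;       // dimmed trailing partial, "" when none
  warnings: string[];
}

const emptyTranscript = (): TranscriptState => ({
  ready: false,
  finals: [],
  partial: "",
  warnings: [],
});

// Formats milliseconds as mm:ss.s. Rounding to tenths first avoids "60.0" seconds.
function formatTimestamp(ms: number): string {
  const totalTenths = Math.round(ms / 100);
  const minutes = Math.floor(totalTenths / 600);
  const seconds = (totalTenths % 600) / 10;
  return `${String(minutes).padStart(2, "0")}:${seconds.toFixed(1).padStart(4, "0")}`;
}

function formatFinal(seg: FinalSegment): string {
  return `[${formatTimestamp(seg.startMs)}–${formatTimestamp(seg.endMs)}] ${seg.text}`;
}

function reduceTranscript(state: TranscriptState, ev: AsrEvent): TranscriptState {
  switch (ev.kind) {
    case "ready":
      return { ...state, ready: true };
    case "partial":
      // Each partial replaces the previous one rather than appending.
      return { ...state, partial: ev.text };
    case "final":
      // A final commits the segment and clears the trailing partial.
      return {
        ...state,
        partial: "",
        finals: [
          ...state.finals,
          { startMs: ev.ts_ms, endMs: ev.end_ms, text: ev.text, confidence: ev.confidence ?? null },
        ],
      };
    case "warning":
      return { ...state, warnings: [...state.warnings, ev.message] };
  }
}

// Footer values: confidence of the last final, count of finals, average segment duration.
function footerStats(state: TranscriptState) {
  const count = state.finals.length;
  const last = count > 0 ? state.finals[count - 1] : undefined;
  const totalMs = state.finals.reduce((sum, f) => sum + (f.endMs - f.startMs), 0);
  return {
    count,
    lastConfidence: last?.confidence ?? null,
    avgSegmentMs: count > 0 ? totalMs / count : 0,
  };
}

// Text that a "Copy all" button could place on the clipboard.
function copyAllText(state: TranscriptState): string {
  return state.finals.map((f) => f.text).join(" ");
}
```

Keeping the reducer pure means resetting transcripts on a new session is just replacing the state with `emptyTranscript()`.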

Out of scope (M3 follow-up):

README "ASR" section + first-run download note
Loom / screenshot in the lab README's video table
Cut a `v0.0.x` audio-lab release
Optional v2: transcript ticks on `VadTimeline` / `PipelineTimeline` at each `final`

Test plan

`npm run lint` (clean — no new warnings beyond the 7 pre-existing ones in `FrequencySpectrum` / `Waveform`)
`npm run build` (clean, 1.1s)
Manual smoke test: `make dev`, add an ASR config, record / load a WAV, confirm a transcript appears

🤖 Generated with Claude Code

Frontend half of the ASR integration:

- New AsrConfigPanel mirrors TurnConfigPanel: backend + preset + label.
- New AsrTranscript card renders finals (with [mm:ss.s–mm:ss.s] prefix) plus a dimmed trailing partial that overwrites until the final lands. Footer shows last confidence, count of finals, average segment duration. "loading model…" until the backend's `ready` event arrives. Copy-all button concatenates final text to the clipboard.
- App.tsx wires list_asr_backends on connect, persists asr configs to localStorage, pushes set_asr_configs on change + before start / load_file, resets transcripts on new session.
- websocket.ts: new AsrConfig / AsrEventKind types, asr_backends + asr server messages, list_asr_backends + set_asr_configs client messages. Log panel batches `partial` events (matching how `vad` is batched) and inlines finals / warnings.

cargo isn't touched; backend already merged on feat/asr-backend.
npm run lint clean (no new warnings); npm run build clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wavekat-eason · 2026-05-14T23:40:10Z

Superseded by #49 (consolidated ASR work).

## Summary

Implements the [ASR plan](https://github.com/wavekat/wavekat-lab/blob/main/docs/05-plan-asr.md) end-to-end. Supersedes the original stacked PRs (#46, #47, #48, all closed).

### Backend

- New `tools/audio-lab/backend/src/asr.rs` module: `AsrConfig`, `AsrServerEvent`, `run_asr_pipeline`. Each config gets a dedicated OS worker thread that owns a `SherpaOnnxAsr`; a tokio task bridges the audio broadcast in, and `blocking_send` bridges transcript events back onto a tokio mpsc.
- New WS messages: `ListAsrBackends` / `SetAsrConfigs` (client) and `AsrBackends` / `Asr` (server). `Asr` carries a `kind` discriminator: `ready` / `speech_started` / `speech_ended` / `partial` / `final` / `warning`, with optional `ts_ms` / `end_ms` / `text` / `confidence` / `message` fields populated per kind.
- Wired into both `StartRecording` (live mic) and `LoadFile` (WAV upload) paths.
- Adds `wavekat-asr = "0.0.4"` with the `sherpa-onnx` feature.

### Frontend

- New `AsrConfigPanel` mirrors `TurnConfigPanel` (backend + preset dropdowns, editable label, add / clone / remove).
- New `AsrTranscript` card per active ASR config: committed finals with `[mm:ss.s–mm:ss.s]` prefix, dimmed trailing partial that gets overwritten until the final lands, footer with last confidence / count / avg segment duration. Shows `loading model…` until the backend's `ready` event arrives.
- `websocket.ts` types + log-panel batching of `asr.partial` messages so the log doesn't drown in partials. Finals and warnings still log inline.
- `App.tsx`: `asrConfigs` persisted to `localStorage` (`lab-asr-configs`), pushed to backend on change + before every start / load_file, transcripts reset on each new session.
- **2-column layout**: all config panels (VAD / Turn / Pipeline / ASR) moved into a left aside (`w-80` on lg+); waveform / spectrum / timelines / ASR transcript / preprocessed sections fill a flex-1 main column. Matches the layout sketch in `docs/05-plan-asr.md`. Single-column on narrower screens.
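
The log-panel batching mentioned above can be sketched as follows: runs of `partial` events collapse into one counted line, while finals and warnings are logged inline. This is an illustrative sketch only. The `AsrMessage` interface is an assumed flat shape built from the `kind` values and optional fields listed in the Backend section, and `toLogLines` and its line format are hypothetical, not taken from `websocket.ts`.

```typescript
// The six kinds named in the PR description.
type AsrKind = "ready" | "speech_started" | "speech_ended" | "partial" | "final" | "warning";

// Assumed wire shape: one discriminator plus optional per-kind fields.
interface AsrMessage {
  kind: AsrKind;
  ts_ms?: number;
  end_ms?: number;
  text?: string;
  confidence?: number;
  message?: string;
}

// Collapses consecutive partials into a single counted line;
// every other kind gets its own line.
function toLogLines(messages: AsrMessage[]): string[] {
  const lines: string[] = [];
  let pendingPartials = 0;

  const flushPartials = (): void => {
    if (pendingPartials > 0) {
      lines.push(`asr.partial (x${pendingPartials})`);
      pendingPartials = 0;
    }
  };

  for (const m of messages) {
    if (m.kind === "partial") {
      pendingPartials += 1;
      continue;
    }
    flushPartials();
    if (m.kind === "final") {
      lines.push(`asr.final ${m.text ?? ""}`);
    } else if (m.kind === "warning") {
      lines.push(`asr.warning ${m.message ?? ""}`);
    } else {
      lines.push(`asr.${m.kind}`);
    }
  }
  flushPartials(); // emit a trailing run of partials, if any
  return lines;
}
```

A real log panel would likely batch over a time window rather than a finished array, but the collapse rule would be the same.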
### Docs

- `tools/audio-lab/README.md`: new "ASR" subsection with the sherpa-onnx preset table and a NOTE about the first-run ~75 MB HF model download. "Live transcripts" added to What It Does.
- Top-level `README.md`: ASR mentioned in the audio-lab one-liner + tool-layout blurb.

### Out of scope (follow-up)

- Loom / screenshot in the README video table: needs a recording session.
- Transcript ticks on `VadTimeline` / `PipelineTimeline` at each `final`.
- Two-channel ASR (`Channel::Remote`).
- WER / latency benchmarking: wait for a second ASR backend.
- Audio-lab release tag: release-please will cut it automatically on merge.

## Test plan

- [x] `cargo check --workspace` (backend)
- [x] `cargo clippy --workspace -- -D warnings` (backend, when M1 landed)
- [x] `cargo test --workspace` (5 pre-existing tests still pass)
- [x] `npm run lint` (no new warnings beyond pre-existing 7 in `FrequencySpectrum` / `Waveform`)
- [x] `npm run build` (clean)
- [ ] Manual smoke test: `make dev`, add an ASR config (sherpa-onnx · bilingual), record / load a WAV, confirm partials roll in and finals commit; toggle preset between bilingual / en / zh and verify model reload.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wavekat-eason mentioned this pull request May 14, 2026

docs(audio-lab): README updates for ASR (M3) #48

Closed

2 tasks

wavekat-eason deleted the branch feat/asr-backend May 14, 2026 23:40

wavekat-eason closed this May 14, 2026

wavekat-eason deleted the feat/asr-frontend branch May 14, 2026 23:40

wavekat-eason mentioned this pull request May 14, 2026

feat(audio-lab): live ASR backend + transcript UI #50

Merged

6 tasks

wavekat-eason wants to merge 1 commit into feat/asr-backend from feat/asr-frontend
