fix(local-asr): Qwen3-ASR drops trailing content in long speech + long recordings time out by appergb · Pull Request #434 · Open-Less/openless

appergb · 2026-05-13T14:43:25Z

User description

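Background

Two independent defects compound each other, causing local Qwen3-ASR to drop content or fail the whole segment for long speech or when the key is released immediately.

Defect 1 (primary: lost content)

The comment at qwen_engine.rs:94 states explicitly that transcribe_stream slices audio into 2s chunks internally. When the user presses the hotkey immediately after the last word, the recording buffer has no trailing silence → the final chunk, shorter than 2s, contains no silent frames → the C engine does not treat it as "speech has ended" → that chunk's transcription is discarded and the trailing content disappears.

Verified on the repro side: wait 5–10 seconds in silence before pressing the hotkey, and that silence enters the buffer along with the recording → the C engine sees silence → the final chunk finishes normally → nothing is lost.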
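To make the chunk arithmetic concrete, here is a minimal sketch of how a recording splits into 2s chunks at 16kHz. The helper names are hypothetical and not part of the PR; only the 2s chunk size and the 16kHz sample rate come from the description above.

```rust
/// Sample rate stated in the PR description (16 kHz).
const SAMPLE_RATE_HZ: usize = 16_000;
/// Chunk length the PR attributes to `transcribe_stream` (2 s).
const CHUNK_SAMPLES: usize = 2 * SAMPLE_RATE_HZ;

/// Hypothetical helper: length in samples of the final, possibly partial, chunk.
/// Returns 0 for an empty buffer.
fn last_chunk_len(total_samples: usize) -> usize {
    if total_samples == 0 {
        return 0;
    }
    let rem = total_samples % CHUNK_SAMPLES;
    if rem == 0 { CHUNK_SAMPLES } else { rem }
}

/// Hypothetical helper: true when the recording ends in a chunk shorter than 2 s,
/// which is the case the PR describes as at risk when no trailing silence exists.
fn ends_with_partial_chunk(total_samples: usize) -> bool {
    total_samples > 0 && total_samples % CHUNK_SAMPLES != 0
}
```

For example, a 5.3s recording is 84,800 samples: two full chunks of 32,000 samples followed by a partial chunk of 20,800 samples (1.3s).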

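Defect 2 (secondary: long-recording timeout)

COORDINATOR_GLOBAL_TIMEOUT_SECS = 15 (coordinator.rs:3593). The local Qwen path takes the default true branch of asr_transcribe_uses_global_timeout (coordinator.rs:87-93) and therefore hits the 15s global timeout.

The user measured RTF ≈ 0.3, and slow machines can reach 0.5 → a 60s recording needs about 18s to transcribe at RTF 0.3 → it times out and the entire result is discarded.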
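The arithmetic behind this can be sketched as follows: estimated transcription time is audio length multiplied by the real-time factor (RTF), compared against the fixed 15s limit. The function names are hypothetical and only illustrate the calculation; they are not code from the PR.

```rust
use std::time::Duration;

/// Fixed global timeout quoted in the PR description (15 s).
const COORDINATOR_GLOBAL_TIMEOUT_SECS: u64 = 15;

/// Hypothetical helper: estimated transcription time = audio length * RTF.
/// Note: `Duration::from_secs_f64` panics on negative or non-finite input.
fn estimated_transcribe_time(audio_secs: f64, rtf: f64) -> Duration {
    Duration::from_secs_f64(audio_secs * rtf)
}

/// Hypothetical helper: would the estimate exceed the fixed 15 s timeout?
fn exceeds_fixed_timeout(audio_secs: f64, rtf: f64) -> bool {
    estimated_transcribe_time(audio_secs, rtf)
        > Duration::from_secs(COORDINATOR_GLOBAL_TIMEOUT_SECS)
}
```

With these assumptions, a 60s recording exceeds the limit at both RTF 0.3 (about 18s) and RTF 0.5 (30s), while a 30s recording at RTF 0.3 (about 9s) does not.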

Fix

Defect 1 (`local_provider.rs`)

let mut samples_f32 = i16_le_bytes_to_f32(&pcm_bytes);
// End-of-speech signal for the last chunk: append 0.5s of silence = 8000 f32 zeros @ 16kHz
samples_f32.extend(std::iter::repeat(0.0f32).take(8_000));

duration_ms is still computed from the original buffer length; the padding is not counted.
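A self-contained sketch of the padding step, showing that the duration is taken before the silence is appended. The body of `i16_le_bytes_to_f32` is not shown in this PR, so the version below is an assumed stand-in (normalising by 32768), and `pad_for_transcription` is a hypothetical wrapper used only for illustration.

```rust
/// Sample rate stated in the PR description (16 kHz).
const SAMPLE_RATE_HZ: u64 = 16_000;
/// 0.5 s of silence at 16 kHz, as in the PR.
const PAD_SAMPLES: usize = 8_000;

/// Assumed stand-in for the PR's `i16_le_bytes_to_f32`: decodes little-endian
/// 16-bit PCM and scales it to [-1.0, 1.0). A trailing odd byte is ignored.
fn i16_le_bytes_to_f32(pcm: &[u8]) -> Vec<f32> {
    pcm.chunks_exact(2)
        .map(|b| i16::from_le_bytes([b[0], b[1]]) as f32 / 32_768.0)
        .collect()
}

/// Hypothetical wrapper: returns the padded samples together with the
/// duration of the ORIGINAL audio only (padding excluded).
fn pad_for_transcription(pcm: &[u8]) -> (Vec<f32>, u64) {
    let mut samples = i16_le_bytes_to_f32(pcm);
    // Measure before padding so the reported duration stays truthful.
    let duration_ms = samples.len() as u64 * 1_000 / SAMPLE_RATE_HZ;
    samples.extend(std::iter::repeat(0.0f32).take(PAD_SAMPLES));
    (samples, duration_ms)
}
```

Computing `duration_ms` before the `extend` call is what keeps the padding out of the reported length: 1s of input yields 24,000 samples for the engine but still reports 1000 ms.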

Defect 2 (`coordinator.rs` + `coordinator/dictation.rs`)

Add a module-level helper:

fn local_qwen_transcribe_timeout(audio_secs: f64) -> std::time::Duration {
    let secs = ((audio_secs * 0.6).ceil() as u64)
        .saturating_add(10)
        .max(COORDINATOR_GLOBAL_TIMEOUT_SECS);
    std::time::Duration::from_secs(secs)
}

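The ActiveAsr::Local branch in dictation.rs reads local.buffer_duration_ms() before calling transcribe(), derives audio_secs from it, and uses the new helper to choose the timeout. All other ASR paths (Volcengine / Whisper / Bailian / Foundry / QA) are unchanged and keep the fixed 15s.

Also adds LocalQwenAsr::buffer_duration_ms() -> u64, which takes &self and does not consume the buffer.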
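A rough sketch of what a non-consuming duration reader could look like. The struct layout (a `Mutex<Vec<u8>>` of raw PCM), the mono assumption, and the `from_pcm`, `buffered_bytes`, and `audio_secs_from_ms` helpers are all hypothetical; only the `buffer_duration_ms(&self) -> u64` signature, the 16kHz rate, and the 16-bit sample format come from the PR.

```rust
use std::sync::Mutex;

/// Sample rate stated in the PR description (16 kHz).
const SAMPLE_RATE_HZ: u64 = 16_000;
/// 16-bit PCM, as implied by `i16_le_bytes_to_f32`; mono is an assumption.
const BYTES_PER_SAMPLE: u64 = 2;

/// Hypothetical layout: the real struct is not shown in the PR.
struct LocalQwenAsr {
    buffer: Mutex<Vec<u8>>,
}

impl LocalQwenAsr {
    /// Hypothetical constructor for the example.
    fn from_pcm(pcm: Vec<u8>) -> Self {
        Self { buffer: Mutex::new(pcm) }
    }

    /// Reads the buffered audio length without draining the buffer.
    fn buffer_duration_ms(&self) -> u64 {
        let bytes = self.buffer.lock().expect("buffer mutex poisoned").len() as u64;
        (bytes / BYTES_PER_SAMPLE) * 1_000 / SAMPLE_RATE_HZ
    }

    /// Hypothetical accessor, used only to show the buffer is left intact.
    fn buffered_bytes(&self) -> usize {
        self.buffer.lock().expect("buffer mutex poisoned").len()
    }
}

/// Hypothetical conversion from milliseconds to the `audio_secs` argument
/// expected by the timeout helper.
fn audio_secs_from_ms(duration_ms: u64) -> f64 {
    duration_ms as f64 / 1_000.0
}
```

Because the method only takes `&self` and reads the length under the lock, calling it before `transcribe()` leaves the queued audio untouched for the transcription itself.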

Test plan

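cargo check --manifest-path openless-all/app/src-tauri/Cargo.toml passes
cargo test --manifest-path openless-all/app/src-tauri/Cargo.toml --lib local_qwen_timeout: all 4 new unit tests pass:
- 5s short recording falls back to the 15s floor
- 60s recording returns 46s (matches the formula example given by the user)
- 10.1s recording rounds up (ceil) to 17s
- 0s boundary returns 15s
cargo test ... --lib coordinator: all 39 existing tests pass, zero regressions
Real-device regression (macOS, required):
- Speak for 60s in one go and release the key immediately → the trailing segment should not lose words (verifies defect 1)
- 90s long recording → no longer fails with a timeout (verifies defect 2)
- Short recordings (under 5s) behave as before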
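The four formula cases listed above can be reproduced with a standalone sketch like the one below. The helper body matches the one shown in this PR; the test function names are hypothetical and may differ from the tests actually added in coordinator.rs.

```rust
use std::time::Duration;

/// Fixed global timeout quoted in the PR description (15 s).
const COORDINATOR_GLOBAL_TIMEOUT_SECS: u64 = 15;

/// Helper as shown in the PR: max(15, ceil(audio_s * 0.6) + 10) seconds.
fn local_qwen_transcribe_timeout(audio_secs: f64) -> Duration {
    let secs = ((audio_secs * 0.6).ceil() as u64)
        .saturating_add(10)
        .max(COORDINATOR_GLOBAL_TIMEOUT_SECS);
    Duration::from_secs(secs)
}

#[test]
fn short_audio_falls_back_to_floor() {
    // ceil(5 * 0.6) + 10 = 13, below the 15 s floor.
    assert_eq!(local_qwen_transcribe_timeout(5.0), Duration::from_secs(15));
}

#[test]
fn long_audio_scales_linearly() {
    // ceil(60 * 0.6) + 10 = 46.
    assert_eq!(local_qwen_transcribe_timeout(60.0), Duration::from_secs(46));
}

#[test]
fn fractional_seconds_round_up() {
    // 10.1 * 0.6 = 6.06, ceil gives 7, plus 10 = 17.
    assert_eq!(local_qwen_transcribe_timeout(10.1), Duration::from_secs(17));
}

#[test]
fn zero_length_returns_floor() {
    assert_eq!(local_qwen_transcribe_timeout(0.0), Duration::from_secs(15));
}
```

The 0.6 multiplier leaves headroom over the slow-machine RTF of 0.5 quoted earlier, and the `max` keeps short recordings on the existing 15s behavior.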

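Related

Two independent defects found during the extended investigation triggered by @aeoform's feedback in the issue #420 comments; they are not directly related to the main thread of #420 (Wayland CLI trigger).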

PR Type

Bug fix, Tests

Description

Fix local Qwen trailing audio loss
Replace fixed timeout with dynamic scaling
Add buffer duration reader for coordinator
Cover timeout rules with unit tests

Diagram Walkthrough

flowchart LR
  A["Local audio buffer"] --> B["LocalQwenAsr transcribe"]
  B --> C["Append 0.5s silence padding"]
  A --> D["Read buffered duration"]
  D --> E["Dynamic Qwen timeout"]
  E --> F["Coordinator end_session"]
  F --> G["Transcribe with timeout"]

File Walkthrough

Relevant files

Bug fix

local_provider.rs `Add silence padding and buffer duration` openless-all/app/src-tauri/src/asr/local/local_provider.rs Added `buffer_duration_ms()` to read queued audio length without consuming it. Appended 0.5 seconds of silence before local Qwen transcription. Kept returned `duration_ms` based on original audio only. Documented why padding helps the C engine finish the last chunk.	+14/-1
coordinator.rs `Add dynamic local Qwen timeout logic` openless-all/app/src-tauri/src/coordinator.rs Added `local_qwen_transcribe_timeout(audio_secs)` helper. Uses `max(15, ceil(audio_s × 0.6) + 10)` for local Qwen ASR. Added unit tests for short audio, long audio, rounding, and zero length. Preserved existing timeout behavior for other ASR paths.	+47/-0
dictation.rs `Use audio-based timeout during transcription` openless-all/app/src-tauri/src/coordinator/dictation.rs Switched local Qwen transcription from fixed timeout to dynamic timeout. Read buffered audio duration before calling `transcribe()`. Added runtime logging for audio length and computed timeout. Updated timeout error handling to report dynamic values.	+15/-6

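Fix two independent defects together:

Defect 1 (primary: lost content): transcribe_stream slices audio into 2s chunks internally. When the user releases the key immediately after the last word, the recording buffer has no trailing silence, so the final chunk (< 2s) contains no silent frames → the C engine does not finalize it → that chunk's transcription is discarded. Waiting 5-10 seconds in silence before releasing the key works, because the trailing silence is recorded into the buffer.
Fix: in local_provider.rs, transcribe() converts PCM to f32 and then appends 0.5s of silence (8000 f32 zeros @ 16kHz) to give the C engine an end-of-speech signal. duration_ms is still computed from the original buffer length; the padding is not counted.

Defect 2 (secondary: long-recording timeout): COORDINATOR_GLOBAL_TIMEOUT_SECS is fixed at 15s. The user's RTF is ≈ 0.3 and slow machines can reach 0.5, so a 60s recording that needs ~18s to transcribe times out.
Fix: add local_qwen_transcribe_timeout(audio_secs) -> Duration with the formula max(15, ceil(audio_s × 0.6) + 10), used only on the Local Qwen path (the Volcengine / Whisper / Bailian / Foundry / QA paths are untouched). The accompanying LocalQwenAsr::buffer_duration_ms() reads the audio duration without consuming the buffer.

Add 4 unit tests covering the formula: short recordings fall back to 15s, long recordings scale linearly, fractional seconds round up (ceil), and the 0s boundary. All 39 coordinator tests pass.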

github-actions · 2026-05-13T14:46:44Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis ❌ 420 - Not compliant Non-compliant requirements: Global recording shortcut support on Debian Wayland. Script/command guidance for configuring the shortcut in system settings. End-to-end behavior of the shortcut in the global desktop environment. Requires further human verification: Runtime validation on an actual Debian Wayland desktop environment.
⏱️ Estimated effort to review: 2 🔵🔵⚪⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ No major issues detected

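- PR #433: complete the Wayland callout with --toggle-qa / --cancel-dictation so all three commands are listed side by side, plus synchronized translations in five languages
- PR #434: pad the tail of local Qwen3-ASR audio with silence to fix the dropped final chunk; dynamic timeout max(15, ceil(audio_s × 0.6) + 10) to fix long-recording timeouts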

github-actions Bot added the Review effort 2/5 label May 13, 2026

appergb merged commit 1ab0807 into beta May 13, 2026
4 checks passed

appergb deleted the fix/qwen-asr-long-audio-loss branch May 13, 2026 15:08
