
fix(asr): ASR no-data-loss trio: partial cache + last-frame queuing + keep PCM on Whisper failure (closes #54 #55 #56) #74

Merged
appergb merged 1 commit into develop from fix/asr-no-data-loss
Apr 30, 2026
@appergb appergb commented Apr 30, 2026

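One principle: the user's words must never be lost.

Three independent bugs with a shared root, all dropping text at the ASR boundary. They are bundled into one PR because they share the "asr/" file area and the same test scenarios, and do not conflict with one another.

#54 partial cache fallback (volcengine.rs)

When the server closed the connection before the final frame, or the network dropped, the old code called `signal_error(NoFinalResult)`, the upper layer received an empty result, and everything already recognized was lost.
This PR adds `last_partial_text`, and the connection-close path now goes through `fallback_to_partial_or_error`: if a partial exists it returns success(partial); only when there is none does it return an error.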
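
A minimal sketch of this fallback, under simplifying assumptions: `AsrError`, the one-field `RawTranscript`, and returning a `Result` instead of calling `signal_success` / `signal_error` are illustrative stand-ins, not the actual code in volcengine.rs.

```rust
// Illustrative error type; the real code uses VolcengineASRError.
#[derive(Debug, Clone, PartialEq)]
enum AsrError {
    NoFinalResult,
    ConnectionFailed,
}

// Simplified transcript; the real RawTranscript also carries a duration.
#[derive(Debug, Clone, PartialEq)]
struct RawTranscript {
    text: String,
}

#[derive(Default)]
struct SyncState {
    last_partial_text: String,
}

impl SyncState {
    /// Cache the accumulated text of every non-final, non-empty result.
    fn on_transcript_message(&mut self, full_text: &str, has_final: bool) {
        if !has_final && !full_text.is_empty() {
            self.last_partial_text = full_text.to_string();
        }
    }

    /// On close or network error, prefer the cached partial over the error.
    fn fallback_to_partial_or_error(&self, err: AsrError) -> Result<RawTranscript, AsrError> {
        if self.last_partial_text.is_empty() {
            Err(err)
        } else {
            Ok(RawTranscript {
                text: self.last_partial_text.clone(),
            })
        }
    }
}
```

With a partial of "hello wor" cached, a close before the final frame yields `Ok` carrying that text; with nothing cached, the original error is passed through unchanged.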

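#55 last-frame queuing (volcengine.rs)

`consume_pcm_chunk` spawns a fire-and-forget task to send each chunk, while `send_last_frame` sent the NegativeSequence last frame directly. The last frame could therefore reach the server before the trailing chunks, which the server then discarded as surplus data arriving after the stream had ended, swallowing the end of the utterance.
Fix: `pending_sends: AtomicUsize` plus a `Notify`; before sending the last frame, wait for all chunk sends to complete (capped at 800 ms).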
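
The drain-before-last-frame idea can be sketched with the standard library alone. Note the substitution: this example uses a `Mutex<usize>` and `std::sync::Condvar` on OS threads in place of the PR's `AtomicUsize` + `tokio::sync::Notify`, so that it runs without external crates. `PendingSends`, `send_stream`, and the simulated wire are hypothetical names for illustration only.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Counts chunk sends that have been spawned but not yet completed.
#[derive(Default)]
struct PendingSends {
    count: Mutex<usize>,
    drained: Condvar,
}

impl PendingSends {
    /// Call before spawning a send task.
    fn begin(&self) {
        *self.count.lock().unwrap() += 1;
    }

    /// Call when a send task finishes; wakes waiters once the count hits zero.
    fn finish(&self) {
        let mut count = self.count.lock().unwrap();
        *count -= 1;
        if *count == 0 {
            self.drained.notify_all();
        }
    }

    /// Block until every pending send has finished or `limit` elapses.
    /// Returns true if the counter reached zero in time.
    fn wait_drained(&self, limit: Duration) -> bool {
        let guard = self.count.lock().unwrap();
        let (guard, _timed_out) = self
            .drained
            .wait_timeout_while(guard, limit, |pending| *pending > 0)
            .unwrap();
        *guard == 0
    }
}

/// Hypothetical driver: sends `chunks` on background threads, then the last frame.
/// Returns the order in which frames reached the simulated wire.
fn send_stream(chunks: Vec<i64>) -> Vec<i64> {
    let pending = Arc::new(PendingSends::default());
    let wire = Arc::new(Mutex::new(Vec::new()));

    for seq in chunks {
        pending.begin(); // increment before spawning the send
        let (pending, wire) = (Arc::clone(&pending), Arc::clone(&wire));
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(10)); // simulated network latency
            wire.lock().unwrap().push(seq);
            pending.finish();
        });
    }

    // Wait up to 800 ms for in-flight chunks, then send the terminal frame anyway.
    if !pending.wait_drained(Duration::from_millis(800)) {
        eprintln!("warning: chunk sends still pending, sending last frame anyway");
    }
    wire.lock().unwrap().push(-1); // a negative sequence marks the last frame
    let order = wire.lock().unwrap().clone();
    order
}
```

The sketch only guarantees that the terminal frame follows every chunk that completes within the limit; it says nothing about the order of the chunks relative to one another. The timeout keeps a stalled send from blocking stream termination indefinitely.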

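#56 keep the buffer on Whisper failure (whisper.rs)

`transcribe` emptied the buffer with `mem::take` as soon as it was entered, so a bad credential, a network failure, or a parse failure made the PCM vanish with no chance of a retry.
It now clones the buffer, calls `transcribe_inner`, and clears the buffer only on success.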
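
A minimal sketch of the clone-then-clear-on-success pattern. The types are assumptions made for illustration: `WhisperBatch`, `WhisperError`, the `String` return value, and returning an empty string for an empty buffer are not taken from whisper.rs, and `transcribe_inner` here only simulates the WAV encoding and HTTP request.

```rust
use std::sync::Mutex;

// Illustrative error type; the real code has its own error enum.
#[derive(Debug, PartialEq)]
enum WhisperError {
    MissingApiKey,
}

struct WhisperBatch {
    buffer: Mutex<Vec<u8>>,
    api_key: Option<String>,
}

impl WhisperBatch {
    /// Transcribe the buffered PCM; the buffer is cleared only on success.
    fn transcribe(&self) -> Result<String, WhisperError> {
        // Clone rather than mem::take, so a failure leaves the audio in place.
        let pcm = self.buffer.lock().unwrap().clone();
        if pcm.is_empty() {
            // Assumed short-circuit behaviour for an empty buffer.
            return Ok(String::new());
        }
        let text = self.transcribe_inner(&pcm)?; // early return keeps the buffer
        self.buffer.lock().unwrap().clear();
        Ok(text)
    }

    /// Stand-in for the key check, WAV encoding, and HTTP POST.
    fn transcribe_inner(&self, pcm: &[u8]) -> Result<String, WhisperError> {
        let _key = self.api_key.as_ref().ok_or(WhisperError::MissingApiKey)?;
        Ok(format!("<{} bytes transcribed>", pcm.len()))
    }
}
```

With `api_key: None`, `transcribe` returns `Err(WhisperError::MissingApiKey)` and the buffer keeps its bytes, so the caller can retry after fixing the credential. The cost is one extra copy of the audio for the duration of the request.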

Test plan

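  • cargo check passes
  • Manual: unplug the network cable while recording → check that the capsule text and the history keep the partial
  • Manual: record a longer clip (10 s+) and confirm the tail is not truncated
  • Manual: clear the Whisper API key and record → an error is returned but the PCM stays in memory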

Summary by Sourcery

Ensure ASR never silently drops recognized or recorded user speech by adding fallbacks and send-order safeguards to Volcengine streaming ASR and preserving audio on Whisper batch failures.

Bug Fixes:

  • Return the latest partial transcript when the Volcengine ASR connection closes or fails before a final result instead of treating it as no result.
  • Prevent loss of trailing speech in Volcengine streaming ASR by waiting for all pending audio chunks to be sent before transmitting the terminal frame.
  • Keep Whisper batch ASR PCM buffers on transcription failure so recordings are not discarded and can be retried or logged.

Enhancements:

  • Track in-flight Volcengine audio sends with a pending counter and notification mechanism to coordinate orderly stream termination.

closes #54, #55, #56

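Three changes, one goal: the user's dictated speech must not disappear because of
network jitter, a server-side connection close, or a credential error.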

## #54 partial cache (volcengine.rs)
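- Add last_partial_text to SyncState; every non-final result caches the current accumulated text
- The server-close (no final frame) and receive-loop Err (network interruption) paths no longer signal NoFinalResult directly
- Add fallback_to_partial_or_error: signal_success(RawTranscript) when a partial exists,
  signal_error only when there is none; the outer await_final_result receives the recognized text instead of an empty result
- Observed in practice: when network jitter drops the connection, the user at least gets what was recognized before the drop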

## #55 audio finalize race (volcengine.rs)
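- Add pending_sends: AtomicUsize + send_done: Notify
- consume_pcm_chunk calls fetch_add(1) for each spawn; the spawned task calls fetch_sub(1) + notify once the send completes
- send_last_frame first awaits pending_sends == 0 on entry (capped at 800ms to cover extreme network conditions),
  then sends the leftover audio + the NegativeSequence last frame
- Prevents: fire-and-forget chunk sends racing with the last frame, the last frame reaching the server first → later chunks
  discarded as surplus frames after "stream ended" → tail of the utterance swallowed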

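## #56 Whisper buffer retention (whisper.rs)
- Previously transcribe emptied the buffer with mem::take on entry, so a bad credential / network failure / parse failure
  made the PCM vanish, with no retry and no fallback
- Now: clone first (~960 KB per 30 s of audio) → transcribe_inner → clear the buffer only on success
- On failure the buffer is kept, leaving room for the upper layer to "try again" or "keep a failed-history record"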
@appergb appergb merged commit fea402d into develop Apr 30, 2026

sourcery-ai Bot commented Apr 30, 2026

Reviewer's Guide

Adds robustness to ASR pipelines to ensure recognized speech is not lost by caching partial transcripts in Volcengine streaming ASR, enforcing send ordering for the last frame, and preserving Whisper batch PCM buffers on failures, with minimal API surface changes.

Sequence diagram for Volcengine fallback_to_partial_or_error on close or network error

sequenceDiagram
    participant Service as VolcengineASRService
    participant Volcengine as VolcengineStreamingASR
    participant Coordinator

    loop partial_results
        Service-->>Volcengine: partial_transcript(text, has_final=false)
        Volcengine->>Volcengine: on_transcript_message(text, has_final=false)
        Volcengine->>Volcengine: state.last_partial_text = text
    end

    alt normal_final
        Service-->>Volcengine: final_transcript(text, has_final=true)
        Volcengine->>Volcengine: on_transcript_message(text, has_final=true)
        Volcengine->>Coordinator: signal_success(RawTranscript{text, duration_ms})
    else close_without_final
        Service-->>Volcengine: websocket_close
        Volcengine->>Volcengine: fallback_to_partial_or_error(NoFinalResult)
        alt last_partial_text is not empty
            Volcengine->>Coordinator: signal_success(RawTranscript{last_partial_text, duration_ms})
        else no_partial_cached
            Volcengine->>Coordinator: signal_error(NoFinalResult)
        end
    else network_error
        Service--xVolcengine: connection_error(e)
        Volcengine->>Volcengine: fallback_to_partial_or_error(ConnectionFailed)
        alt last_partial_text is not empty
            Volcengine->>Coordinator: signal_success(RawTranscript{last_partial_text, duration_ms})
        else no_partial_cached
            Volcengine->>Coordinator: signal_error(ConnectionFailed)
        end
    end

Class diagram for updated ASR components (VolcengineStreamingASR, SyncState, WhisperBatchASR)

classDiagram
    class SyncState {
        +bool is_connected
        +Option~Sender_RawTranscript_or_VolcengineASRError~~ final_tx
        +Option~Handle~ runtime
        +Option~Instant~ start
        +String last_partial_text
    }

    class VolcengineStreamingASR {
        +ParkingMutex~SyncState~ state
        +SharedWriter writer
        +ParkingMutex~Option~Receiver_RawTranscript_or_VolcengineASRError~~ final_rx
        +Arc~AtomicUsize~ pending_sends
        +Arc~Notify~ send_done
        +start_stream() Result
        +handle_incoming() void
        +send_last_frame() Result
        +on_transcript_message(full_text String, has_final bool) void
        +signal_success(transcript RawTranscript) void
        +signal_error(err VolcengineASRError) void
        -fallback_to_partial_or_error(err VolcengineASRError) void
    }

    class WhisperBatchASR {
        +Mutex~Vec_u8~ buffer
        +String base_url
        +Option~String~ api_key
        +transcribe() Result~RawTranscript~
        -transcribe_inner(pcm BytesRef) Result~RawTranscript~
    }

    class AudioConsumer {
        <<trait>>
        +consume_pcm_chunk(pcm Vec_u8, seq u64) void
        +send_last_frame() Result
    }

    VolcengineStreamingASR ..|> AudioConsumer
    VolcengineStreamingASR --> SyncState

File-Level Changes

Change: Cache and return last partial transcript when Volcengine streaming ASR ends abruptly without a final result.
Details:
  • Extend SyncState with a last_partial_text field and clear it when starting a new stream.
  • Update transcript handling to store the latest non-final, non-empty partial transcript in last_partial_text.
  • Replace direct error signaling on server close or connection error with a fallback_to_partial_or_error helper that prefers returning the cached partial transcript as a successful RawTranscript, including duration, and only errors when no partial is cached.
  • Mark the connection as disconnected after fallback handling to keep SyncState consistent.
Files: openless-all/app/src-tauri/src/asr/volcengine.rs

Change: Ensure all audio chunks are sent before the Volcengine streaming ASR sends the final (negative sequence) frame, to avoid tail truncation.
Details:
  • Add pending_sends and send_done fields to VolcengineStreamingASR, initialize/reset pending_sends on start_stream.
  • In consume_pcm_chunk, increment pending_sends before spawning the send task, and decrement it when the send completes, notifying waiters when the count reaches zero.
  • In send_last_frame, wait (up to 800ms) for pending_sends to drain to zero using a Notify + timeout loop, logging a warning if pending sends remain before proceeding to drain leftover audio and send the last frame.
Files: openless-all/app/src-tauri/src/asr/volcengine.rs

Change: Preserve Whisper batch ASR PCM buffer on transcription failures to allow retries or later inspection.
Details:
  • Change transcribe to clone the buffered PCM instead of mem::take so the original buffer is retained during the request, while still short-circuiting on empty buffers.
  • Introduce transcribe_inner that takes a PCM slice and performs duration calculation, key check, WAV encoding, and HTTP POST, reusing the previous logic with a borrowed slice.
  • After calling transcribe_inner, clear the internal buffer only on success; on error, keep the PCM data intact for potential retries.
  • Adjust encode_wav_16k_mono invocation to accept a slice reference instead of an owned Vec.
Files: openless-all/app/src-tauri/src/asr/whisper.rs


@appergb appergb deleted the fix/asr-no-data-loss branch April 30, 2026 05:33

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've reviewed your changes and they look great!

