fix(bailian): harden the DashScope ASR accumulated-text duplication fix #535
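- Skip heartbeat events (they carry no recognized text)
- Use the API-documented sentence_end field to decide finality, with end_time > 0 as a compatibility fallback
- Add a partial_segments BTreeMap to track interim results, cleared automatically when the final arrives
- Final results with sentence_id == 0 are no longer dropped; they are stored in final_segments
- Add merge_segments(), a concatenation guard with a minimum 2-character overlap check
- finish_with_partial_or_error now also checks partial_segments
- Tests expanded from 3 to 20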
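The first two bullets can be sketched as a small classifier. This is only an illustration, not the code in bailian.rs: the `Sentence` struct, the `Kind` enum, and `classify` are hypothetical names, and it assumes the `end_time > 0` fallback applies only when `sentence_end` is absent from the payload.

```rust
/// Hypothetical, simplified view of one sentence payload.
/// Field names follow the PR text; the real struct in bailian.rs may differ.
#[derive(Debug, Clone, Default)]
pub struct Sentence {
    pub sentence_id: u64,
    pub text: String,
    pub heartbeat: Option<bool>,
    pub sentence_end: Option<bool>,
    pub end_time: Option<i64>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Kind {
    Heartbeat,
    Partial,
    Final,
}

/// Classifies an event: heartbeats are skipped by the caller,
/// everything else is either an interim or a final result.
pub fn classify(s: &Sentence) -> Kind {
    // Heartbeat events carry no recognized text, so they are filtered first.
    if s.heartbeat == Some(true) {
        return Kind::Heartbeat;
    }
    // Assumption: sentence_end is authoritative when present;
    // end_time > 0 is consulted only when sentence_end is missing.
    let is_final = match s.sentence_end {
        Some(flag) => flag,
        None => s.end_time.map_or(false, |t| t > 0),
    };
    if is_final {
        Kind::Final
    } else {
        Kind::Partial
    }
}
```

Under this assumption, an event with `sentence_end: Some(false)` and a positive `end_time` is still treated as interim. If the actual code treats either signal as sufficient, the `match` would become a logical OR instead.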
User description
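Background
Issue #530 reported that DashScope ASR produces duplicated text while accumulating segments. The earlier commit 54f07d8 already fixed the core end_time check and the BTreeMap overwrite. This PR hardens that fix further.
Changes
- Skip sentence.heartbeat events
- sentence_end check: use the sentence_end boolean field, with end_time > 0 as a compatibility fallback
- partial_segments tracking
- sentence_id == 0 is kept: stored in final_segments, no longer dropped
- merge_segments() function with a minimum 2-character overlap check
Tests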
Review
The code went through an automated Claude Code review plus an independent sub-agent review; no serious issues were found.
PR Type
Bug fix, Tests
Description
- Skip heartbeat-only ASR events
- Commit finals by sentence_end
- Cache partials per sentence_id
- Merge overlaps and expand tests
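The "cache partials per sentence_id" and "commit finals" steps can be pictured with two ordered maps. The following is a minimal sketch under assumptions: the `Segments` type and its method names are hypothetical, and only the field names partial_segments and final_segments come from the PR text.

```rust
use std::collections::BTreeMap;

/// Hypothetical container; the real state in bailian.rs may hold more fields.
#[derive(Default)]
pub struct Segments {
    partial_segments: BTreeMap<u64, String>,
    final_segments: BTreeMap<u64, String>,
}

impl Segments {
    /// An interim result overwrites the previous interim text for the same id,
    /// so cumulative partials never pile up.
    pub fn on_partial(&mut self, sentence_id: u64, text: &str) {
        self.partial_segments.insert(sentence_id, text.to_string());
    }

    /// A final result replaces any cached partial for that id.
    /// sentence_id == 0 is stored like any other id.
    pub fn on_final(&mut self, sentence_id: u64, text: &str) {
        self.partial_segments.remove(&sentence_id);
        self.final_segments.insert(sentence_id, text.to_string());
    }

    /// Final texts in ascending sentence_id order (BTreeMap iteration order).
    pub fn finals(&self) -> Vec<String> {
        self.final_segments.values().cloned().collect()
    }

    /// Number of sentences that only have an interim result so far.
    pub fn pending(&self) -> usize {
        self.partial_segments.len()
    }
}
```

Keying both maps by sentence_id means a repeated or revised event overwrites its earlier entry rather than appending to it, and iteration yields segments in id order without a separate sort.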
File Walkthrough
bailian.rs
Harden DashScope ASR deduplication flow
openless-all/app/src-tauri/src/asr/bailian.rs
- Skips heartbeat events before processing transcript text.
- Uses sentence_end as the primary finality signal, with end_time as fallback.
- Tracks interim results in partial_segments and clears them when a final arrives.
- Adds merge_segments() to avoid duplicated overlap when assembling the final transcript.
- Extends tests to cover finals, zero sentence_id, and ordered multi-segment assembly.
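The overlap guard in merge_segments() could look roughly like the sketch below. The signature, the `MIN_OVERLAP_CHARS` constant, and the `overlap_bytes` helper are assumptions made for illustration; only the function name and the 2-character minimum come from the PR text. The overlap is counted in characters, not bytes, so that CJK text is handled correctly.

```rust
/// Assumed threshold from the PR text: overlaps shorter than this are ignored.
const MIN_OVERLAP_CHARS: usize = 2;

/// Concatenates segments in order, dropping the prefix of each segment
/// that duplicates the tail of the text assembled so far.
pub fn merge_segments(segments: &[&str]) -> String {
    let mut out = String::new();
    for seg in segments {
        let skip = overlap_bytes(&out, seg);
        out.push_str(&seg[skip..]);
    }
    out
}

/// Byte length of the longest prefix of `next` that is also a suffix of `acc`,
/// provided it spans at least MIN_OVERLAP_CHARS characters; otherwise 0.
fn overlap_bytes(acc: &str, next: &str) -> usize {
    let mut best = 0;
    for (n, (idx, ch)) in next.char_indices().enumerate() {
        // `end` is always a char boundary in `next`.
        let end = idx + ch.len_utf8();
        if end > acc.len() {
            break; // a longer prefix cannot fit inside `acc`
        }
        if n + 1 >= MIN_OVERLAP_CHARS && acc.ends_with(&next[..end]) {
            best = end;
        }
    }
    best
}
```

For example, `merge_segments(&["你好世界", "世界和平"])` yields `"你好世界和平"`, while `merge_segments(&["abc", "cde"])` yields `"abccde"` because a single shared character is below the threshold. The minimum keeps coincidental one-character matches from deleting legitimate text.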