Fix streaming SSE UTF-8 splitting and partial input loss #415
Merged
appergb merged 1 commit into … on May 12, 2026
Streaming LLM output can split UTF-8 codepoints across HTTP chunks, and unicode typing can fail after a prefix has already reached the focused app. The stream path now decodes SSE bytes incrementally and records only confirmed typed prefixes so history and clipboard match what appeared on screen.

Constraint: Issue Open-Less#413 requires fixing both streaming polish and QA SSE paths plus partial-delta typing loss.
Rejected: Decode each response chunk independently | valid UTF-8 can be split by HTTP frame boundaries.
Rejected: Keep full delta on typing error | it records text that may never have reached the target app.
Confidence: high
Scope-risk: moderate
Directive: Do not re-enable mutating final postprocess for already_streamed text without proving screen/history/clipboard consistency.
Tested: cargo test --manifest-path src-tauri/Cargo.toml --lib
Tested: cargo check --manifest-path src-tauri/Cargo.toml
Tested: git diff --check
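To illustrate why per-chunk decoding was rejected, here is a minimal sketch of incremental decoding using only the Rust standard library. The `Utf8Pending` type and its `push`/`finish` methods are hypothetical names chosen for this example; the PR's own helpers may be shaped differently.

```rust
/// Hypothetical decoder (not the PR's code): holds back bytes that do not yet
/// form a complete UTF-8 sequence so a split codepoint is decoded whole.
#[derive(Default)]
struct Utf8Pending {
    pending: Vec<u8>,
}

impl Utf8Pending {
    /// Appends one chunk and returns all text that is complete so far.
    /// Fails only on bytes that can never become valid UTF-8.
    fn push(&mut self, chunk: &[u8]) -> Result<String, std::str::Utf8Error> {
        self.pending.extend_from_slice(chunk);
        let valid = match std::str::from_utf8(&self.pending) {
            Ok(_) => self.pending.len(),
            // error_len() == None means the buffer ends mid-codepoint:
            // everything before valid_up_to() is safe to emit now.
            Err(e) if e.error_len().is_none() => e.valid_up_to(),
            Err(e) => return Err(e),
        };
        // Keep the incomplete tail buffered, hand the validated head out.
        let tail = self.pending.split_off(valid);
        let head = std::mem::replace(&mut self.pending, tail);
        Ok(String::from_utf8(head).expect("prefix was validated above"))
    }

    /// Call at end of stream: leftover bytes mean a truncated codepoint.
    fn finish(&self) -> Result<(), String> {
        if self.pending.is_empty() {
            Ok(())
        } else {
            Err(format!("{} dangling byte(s) at end of stream", self.pending.len()))
        }
    }
}
```

Feeding the first two bytes of a three-byte character yields an empty string rather than an error, and the character appears once the remaining byte arrives in a later chunk.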
User description
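Change description
… non-utf8 issue. type_unicode_chunk now returns the number of characters confirmed as typed; if typing fails midway, only the prefix that actually reached the screen is recorded, keeping history, clipboard, and on-screen content consistent.
Closes #413
Verification
cargo test --manifest-path src-tauri/Cargo.toml --lib
cargo check --manifest-path src-tauri/Cargo.toml
git diff --check
PR Type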
Bug fix, Enhancement, Tests
Description
Fix UTF-8 multi-byte splitting across HTTP chunks by incremental decoding in streaming polish/QA/Codex
Return typed character count from type_unicode_chunk to prevent history/clipboard mismatch on partial failure
Skip Chinese script convergence and correction rules for already-streamed text to avoid rewriting displayed content
Add comprehensive tests for UTF-8 edge cases and partial typing recovery
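The "record only what was typed" rule can be sketched as a small helper that cuts a delta by character count. This is an assumption-laden illustration: `append_confirmed_prefix` is a hypothetical name, and it assumes the typed count is measured in Rust `char`s (Unicode scalar values), which the PR text does not spell out.

```rust
/// Hypothetical helper: append only the first `typed_chars` characters of
/// `delta` to `recorded`. Cutting by characters rather than bytes keeps the
/// slice on a codepoint boundary, so it cannot panic on multi-byte text.
fn append_confirmed_prefix(recorded: &mut String, delta: &str, typed_chars: usize) {
    // Byte offset where character number `typed_chars` starts, or the end of
    // the string if the whole delta was typed.
    let end = delta
        .char_indices()
        .nth(typed_chars)
        .map(|(byte_idx, _)| byte_idx)
        .unwrap_or(delta.len());
    recorded.push_str(&delta[..end]);
}
```

With a delta of four CJK characters and a typed count of 2, only the first two characters (six bytes) are appended, so the recorded text matches what reached the screen.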
File Walkthrough
dictation.rs
Streamed text consistency and post-processing bypass
openless-all/app/src-tauri/src/coordinator/dictation.rs
append_typed_prefix() and finalize_dictation_text() helpers
typed_chars from type_unicode_chunk to record only confirmed prefix
… content
polish.rs
Incremental UTF-8 decoding for SSE streaming endpoints
openless-all/app/src-tauri/src/polish.rs
utf8_pending buffer and append_utf8_sse_chunk() for incremental decoding
from_utf8() calls in three streaming methods
finish_utf8_sse_chunks() to validate final pending bytes
unicode_keystroke.rs
Return typed char count on partial keystroke failure
openless-all/app/src-tauri/src/unicode_keystroke.rs
type_unicode_chunk return type to Result
TypeError::Partial variant carrying successfully typed char count
Partial error reporting
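As a rough sketch of how a partial-failure result might look, the following models a typing routine that reports how many characters were delivered before an error. The field layout of `TypeError::Partial`, the `type_chunk` function, and the `send` callback (a stand-in for the platform keystroke call) are all assumptions made for this example, not the PR's actual definitions.

```rust
/// Hypothetical error type. The PR names a `TypeError::Partial` variant; the
/// fields shown here are assumed for illustration.
#[derive(Debug, PartialEq)]
enum TypeError {
    Partial { typed_chars: usize, reason: String },
}

/// Types `text` one character at a time through `send`. Returns the number of
/// characters delivered, or `Partial` carrying the count delivered before the
/// first failure.
fn type_chunk<F>(text: &str, mut send: F) -> Result<usize, TypeError>
where
    F: FnMut(char) -> Result<(), String>,
{
    let mut typed = 0;
    for ch in text.chars() {
        if let Err(reason) = send(ch) {
            return Err(TypeError::Partial { typed_chars: typed, reason });
        }
        typed += 1;
    }
    Ok(typed)
}
```

A caller can match on `Partial` and record just the confirmed prefix instead of the full delta, which is the consistency property the description asks for.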