
Fix streaming SSE UTF-8 splitting and partial input loss #415

Merged
appergb merged 1 commit into Open-Less:beta from H-Chris233:fix/issue-413-streaming-sse-edges
May 12, 2026

Conversation

@H-Chris233
Collaborator

@H-Chris233 H-Chris233 commented May 12, 2026

User description

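Changes

  • Fix OpenAI-compatible streaming polish, QA streaming, and Codex OAuth streaming SSE falsely reporting non-utf8 when an HTTP chunk boundary splits a multi-byte UTF-8 character.
  • type_unicode_chunk now returns the number of confirmed typed characters; on a mid-way failure only the prefix that actually reached the screen is recorded, keeping history / clipboard / on-screen content consistent.
  • Text already streamed to the screen skips the final Simplified/Traditional Chinese convergence and correction rules, so post-processing does not rewrite content the user has already seen.
  • The Linux enigo path now types character by character and, on failure, reports partial typed chars as far as possible.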

Closes #413

Verification

  • cargo test --manifest-path src-tauri/Cargo.toml --lib
  • cargo check --manifest-path src-tauri/Cargo.toml
  • git diff --check

PR Type

Bug fix, Enhancement, Tests


Description

  • Fix UTF-8 multi-byte splitting across HTTP chunks by incremental decoding in streaming polish/QA/Codex

  • Return typed character count from type_unicode_chunk to prevent history/clipboard mismatch on partial failure

  • Skip Chinese script convergence and correction rules for already-streamed text to avoid rewriting displayed content

  • Add comprehensive tests for UTF-8 edge cases and partial typing recovery


File Walkthrough

Relevant files
Bug fix
dictation.rs
Streamed text consistency and post-processing bypass         

openless-all/app/src-tauri/src/coordinator/dictation.rs

  • Add append_typed_prefix() and finalize_dictation_text() helpers
  • Use typed_chars from type_unicode_chunk to record only confirmed
    prefix
  • Skip mutating post-processing for streamed text to preserve on-screen
    content
  • Add unit tests for prefix truncation and post-processing bypass
+135/-43
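The prefix-recording and post-processing bypass described for dictation.rs can be illustrated with a minimal sketch. The PR names append_typed_prefix() and finalize_dictation_text() but does not show their signatures, so the function names, parameters, and the `post_process` callback below are assumptions made for illustration only, not the merged code.

```rust
/// Hypothetical helper: append only the first `typed_chars` characters of
/// `delta` to the recorded text. The cut is made on a character boundary,
/// never on a raw byte offset, so multi-byte text is not split.
pub fn append_typed_prefix_sketch(recorded: &mut String, delta: &str, typed_chars: usize) {
    // Byte offset of the character at index `typed_chars`, or the end of
    // `delta` when the whole delta was typed.
    let end = delta
        .char_indices()
        .nth(typed_chars)
        .map(|(byte_idx, _)| byte_idx)
        .unwrap_or(delta.len());
    recorded.push_str(&delta[..end]);
}

/// Hypothetical helper: text that was already streamed to the screen is
/// returned untouched; only non-streamed text goes through the mutating
/// post-processing step (passed in here as a closure).
pub fn finalize_text_sketch<F>(text: String, already_streamed: bool, post_process: F) -> String
where
    F: FnOnce(&str) -> String,
{
    if already_streamed {
        text
    } else {
        post_process(&text)
    }
}
```

Counting in characters rather than bytes matters here because the typing layer reports progress per character, while Rust string slices are indexed by byte.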
polish.rs
Incremental UTF-8 decoding for SSE streaming endpoints     

openless-all/app/src-tauri/src/polish.rs

  • Introduce utf8_pending buffer and append_utf8_sse_chunk() for
    incremental decoding
  • Replace direct from_utf8() calls in three streaming methods
  • Add finish_utf8_sse_chunks() to validate final pending bytes
  • Add integration tests simulating split UTF-8 across HTTP chunks
+220/-16
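The incremental decoding idea behind the utf8_pending buffer can be sketched as follows. This is an illustration using only `std::str::from_utf8` and `Utf8Error`; the struct name, method names, and error strings are assumptions and may differ from the actual append_utf8_sse_chunk() and finish_utf8_sse_chunks() in polish.rs.

```rust
/// Hypothetical incremental decoder. Bytes of an incomplete trailing
/// character are kept in `pending` until the next chunk completes them.
#[derive(Default)]
pub struct Utf8PendingSketch {
    pending: Vec<u8>,
}

impl Utf8PendingSketch {
    /// Append one raw HTTP chunk and return the text that is now fully
    /// decodable. An incomplete trailing sequence is held back, while
    /// bytes that can never be valid UTF-8 produce an error.
    pub fn append_chunk(&mut self, chunk: &[u8]) -> Result<String, String> {
        self.pending.extend_from_slice(chunk);
        let valid_len = match std::str::from_utf8(&self.pending) {
            Ok(_) => self.pending.len(),
            // `error_len() == None` means the input ended in the middle of
            // a character, which is expected at a chunk boundary.
            Err(e) if e.error_len().is_none() => e.valid_up_to(),
            Err(_) => return Err("non-utf8 bytes in stream".to_string()),
        };
        // Keep the incomplete tail, hand out the validated prefix.
        let tail = self.pending.split_off(valid_len);
        let ready = std::mem::replace(&mut self.pending, tail);
        Ok(String::from_utf8(ready).expect("prefix was validated above"))
    }

    /// Call once the stream has ended: leftover bytes mean the stream was
    /// cut inside a character.
    pub fn finish(&self) -> Result<(), String> {
        if self.pending.is_empty() {
            Ok(())
        } else {
            Err("stream ended inside a UTF-8 character".to_string())
        }
    }
}
```

Decoding each chunk on its own would reject the first chunk of a split character, which matches the false non-utf8 report described in this PR; buffering the tail avoids that while still rejecting invalid input.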
Enhancement
unicode_keystroke.rs
Return typed char count on partial keystroke failure         

openless-all/app/src-tauri/src/unicode_keystroke.rs

  • Change type_unicode_chunk return type to a Result that carries the typed character count
  • Add TypeError::Partial variant carrying successfully typed char count
  • Implement per-character accounting on macOS, Windows, and Linux
  • Add tests for Partial error reporting
+127/-22
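Per-character accounting with a partial-failure error can be sketched as below. The PR confirms that a TypeError::Partial variant carries the successfully typed char count, but the field names, the `reason` field, the function name, and the `send_char` callback (standing in for the platform keystroke call) are assumptions for illustration.

```rust
/// Hypothetical error type: the real TypeError may have other variants
/// and a different payload shape.
#[derive(Debug, PartialEq)]
pub enum TypeErrorSketch {
    /// Typing failed after `typed_chars` characters had been delivered.
    Partial { typed_chars: usize, reason: String },
}

/// Type `text` one character at a time through `send_char` and report how
/// many characters were confirmed. On failure the count of characters
/// delivered before the error is returned inside `Partial`.
pub fn type_chars_counted_sketch<F>(text: &str, mut send_char: F) -> Result<usize, TypeErrorSketch>
where
    F: FnMut(char) -> Result<(), String>,
{
    let mut typed = 0;
    for ch in text.chars() {
        if let Err(reason) = send_char(ch) {
            return Err(TypeErrorSketch::Partial { typed_chars: typed, reason });
        }
        typed += 1;
    }
    Ok(typed)
}
```

A caller can then record only the first `typed_chars` characters of the delta, so history and clipboard do not contain text that never reached the focused app.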

Streaming LLM output can split UTF-8 codepoints across HTTP chunks, and unicode typing can fail after a prefix has already reached the focused app. The stream path now decodes SSE bytes incrementally and records only confirmed typed prefixes so history and clipboard match what appeared on screen.

Constraint: Issue Open-Less#413 requires fixing both streaming polish and QA SSE paths plus partial-delta typing loss.

Rejected: Decode each response chunk independently | valid UTF-8 can be split by HTTP frame boundaries.

Rejected: Keep full delta on typing error | it records text that may never have reached the target app.

Confidence: high

Scope-risk: moderate

Directive: Do not re-enable mutating final postprocess for already_streamed text without proving screen/history/clipboard consistency.

Tested: cargo test --manifest-path src-tauri/Cargo.toml --lib

Tested: cargo check --manifest-path src-tauri/Cargo.toml

Tested: git diff --check
@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🎫 Ticket compliance analysis ✅

413 - Fully compliant

Compliant requirements:

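  • UTF-8 chunk-split compatibility: supported in streaming polish, QA streaming, and Codex OAuth streaming.
  • When input only partially succeeds, only the typed prefix is recorded, avoiding a mismatch between history / clipboard and the screen.
  • Text already displayed via streaming skips final post-processing.
  • Related regression tests have been added.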
⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ No major issues detected

@appergb appergb merged commit 71f2792 into Open-Less:beta May 12, 2026
4 checks passed
@H-Chris233 H-Chris233 deleted the fix/issue-413-streaming-sse-edges branch May 12, 2026 09:31
Development

Successfully merging this pull request may close these issues.

streaming SSE: UTF-8 chunk parsing + partial-delta loss (follow-up to PR #412)

2 participants