fix(polish): 过滤模型深度思考输出 by H-Chris233 · Pull Request #45 · Open-Less/openless

H-Chris233 · 2026-04-30T04:02:26Z

摘要

Fixes #25。

本 PR 解决 MiniMax 等带深度思考输出的模型会把 reasoning / thinking 内容一并插入到最终文本里的问题。

部分 OpenAI-compatible 思考模型会在最终回答前返回显式的 <think>...</think> 内容。此前 polish 流程会把这部分内容当作普通文本继续处理，导致用户输入框中被插入很长的思考块，影响实际输入体验。

本次改动在 response parsing 之后、最终插入文本之前，增加保守的清洗逻辑：只移除明确的 <think>...</think> 块，保留最终润色文本。

修复 / 新增 / 改进

在 clean_polish_output 中增加 thinking block 清洗步骤。
新增 strip_thinking_blocks：
- 只处理显式 <think>...</think> 块
- 支持移除模型返回的 reasoning 内容
- 保留 thinking block 后面的最终文本
保持现有 provider 请求逻辑不变。
保持现有 UI 行为不变。
保持现有 fallback 行为不变。
新增测试覆盖：
- 验证 <think>...</think> 内容会被移除
- 验证最终润色文本会被保留

兼容

不包含：
- 不关闭模型的 thinking 能力。
- 不新增 provider-specific 开关。
- 不修改模型配置 UI。
- 不修改请求参数。
- 不按中文标题、自然语言段落或非稳定文本模式做激进清洗。
对现有用户 / 本地环境 / 构建流程的影响：
- 对使用 MiniMax 等思考模型的用户，插入文本中不再包含显式 <think>...</think> 思考块。
- 对不返回 thinking block 的模型无影响。
- 对现有 provider 配置、fallback 逻辑和 UI 使用方式无影响。
- 构建流程无变化。

测试计划

命令：cargo test polish::tests -- --nocapture
结果：通过
证据路径：本地测试输出
命令：cargo check
结果：通过
证据路径：本地检查输出

主要改动文件

openless-all/app/src-tauri/src/polish.rs

备注

本 PR 采用保守清洗策略，只移除有稳定标签边界的 <think>...</think> 内容。没有选择按“深度思考”“思考过程”等本地化标题文本清洗，避免误删正常用户可见内容。

Summary by Sourcery

Filter out model reasoning blocks from polish output so only the final user-facing text is inserted.

Bug Fixes:

Remove explicit <think>...</think> reasoning sections from polish results to prevent deep-thinking model internals from being inserted into the user input field.

Tests:

Add unit tests covering stripping of simple, attributed, mixed-case, multiple, and malformed <think> blocks while preserving the final polished text.

Thinking-capable OpenAI-compatible models can return tagged reasoning before the final polished answer. The cleanup layer now strips only explicit think-tag blocks after response parsing so existing provider requests, UI, and fallback behavior stay unchanged. Constraint: Issue Open-Less#25 asks to adapt thinking output rather than disabling model thinking. Rejected: Add a provider-specific thinking toggle | broader UI and settings change than needed for the bug. Rejected: Strip localized heading text | too likely to remove normal user-facing content without a stable provider contract. Confidence: medium Scope-risk: narrow Tested: cargo test polish::tests -- --nocapture Tested: cargo check

sourcery-ai · 2026-04-30T04:02:32Z

Reviewer's Guide

Adds conservative stripping of <think>...</think> reasoning blocks from polish outputs so only the final user-facing text is kept, plus tests covering these cases.

Flow diagram for clean_polish_output with strip_thinking_blocks

flowchart TD
  A[Start_clean_polish_output_with_model_content] --> B[Call_strip_thinking_blocks]
  B --> C{Any_think_blocks_found?}
  C -- Yes --> D[Build_output_without_think_blocks]
  D --> E[Return_cleaned_text_from_strip_thinking_blocks]
  C -- No --> F[Return_original_text_as_borrowed_Cow]
  E --> G[Trim_whitespace]
  F --> G[Trim_whitespace]
  G --> H[Strip_markdown_fence_if_present]
  H --> I[Remove_leading_trailing_boilerplate_sentences]
  I --> J[Return_final_polish_output_without_reasoning_blocks]
  J --> K[End]

File-Level Changes

Change	Details	Files
Strip `<think>...</think>` reasoning blocks from polish output before further cleanup so only final user text is inserted.	Call a new `strip_thinking_blocks` helper at the start of `clean_polish_output` before trimming and markdown fence stripping. Return a borrowed `Cow` when no `<think>` tags are found to avoid unnecessary allocation, and an owned string when cleaning is applied. Ensure existing polish post-processing (trimming, markdown fence stripping, boilerplate removal) runs on the cleaned text.	`openless-all/app/src-tauri/src/polish.rs`
Implement robust parsing for `<think>` open/close tags, supporting attributes, ASCII case-insensitivity, and multiple blocks.	Scan the text for `<think>` open tags with `find_think_open`, then pair them with closing tags via `find_think_close`, copying only non-thinking segments into the output buffer. Use `parse_think_open_at` / `parse_think_close_at` to validate positions as actual `<think>` tags and skip non-matching `<` occurrences. Implement `parse_think_tag_end` to match `think` in a case-insensitive way, allow optional attributes on the opening tag, and require proper `>` termination while ignoring malformed or unclosed tags.	`openless-all/app/src-tauri/src/polish.rs`
Add unit tests verifying stripping behavior and safety around `<think>` blocks.	Test that a simple `<think>...</think>` block before the final sentence is removed while the final Chinese text is preserved. Test that `<THINK>` tags with attributes and mixed case are removed correctly, leaving only the final text. Test handling of multiple `<think>` blocks interleaved with visible text. Test that non-`<think>` tags and unclosed `<think>` tags are left untouched, and that `strip_thinking_blocks` can return a borrowed `Cow` for unchanged input.	`openless-all/app/src-tauri/src/polish.rs`

Assessment against linked issues

Issue	Objective	Addressed	Explanation
#25	Filter out MiniMax and other models' deep-thinking / reasoning content (e.g., ... blocks) so that only the final polished answer is inserted into the input field.	✅

Possibly linked issues

[macos] MiniMax 模型会把深度思考也一并输入进去 #25: PR 在 clean_polish_output 中移除 <think>...</think> 块，避免 MiniMax 深度思考内容插入输入框

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey - I've left some high level feedback:

In strip_thinking_blocks, consider operating on a mutable String passed in (or using a single pass over &str with indices) to avoid repeated to_string allocations and potential O(n²) behavior when many <think> blocks are present.
If you expect models to sometimes include attributes or casing variants on the thinking tag (e.g. <think reason="true"> or <THINK>), you may want to extend strip_thinking_blocks to handle those forms as well while still keeping the conservative behavior.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `strip_thinking_blocks`, consider operating on a mutable `String` passed in (or using a single pass over `&str` with indices) to avoid repeated `to_string` allocations and potential O(n²) behavior when many `<think>` blocks are present.
- If you expect models to sometimes include attributes or casing variants on the thinking tag (e.g. `<think reason="true">` or `<THINK>`), you may want to extend `strip_thinking_blocks` to handle those forms as well while still keeping the conservative behavior.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

Review feedback pointed out that repeated replace_range passes could become quadratic and that provider tags may vary by casing or attributes. The parser now scans the response once, allocates only when a complete think block is removed, and recognizes explicit think tags with optional attributes and ASCII casing variants. Constraint: Keep cleanup conservative and avoid localized heading stripping. Rejected: Regex dependency | unnecessary for one small tag parser. Rejected: Strip broad reasoning headings | can remove valid user-facing prose. Confidence: medium Scope-risk: narrow Tested: cargo test polish::tests -- --nocapture Tested: cargo check Tested: npm run build

H-Chris233 · 2026-04-30T05:11:15Z

@sourcery-ai review

sourcery-ai

Hey - I've reviewed your changes and they look great!

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai Bot reviewed Apr 30, 2026

View reviewed changes

H-Chris233 merged commit af9b2b6 into Open-Less:main Apr 30, 2026
2 checks passed

appergb pushed a commit that referenced this pull request Apr 30, 2026

chore: merge origin/main into develop — 拉回 PR #45 的 polish 推理清理

d0d2da2

appergb mentioned this pull request Apr 30, 2026

release(v1.2.0): i18n + ASR 数据不丢失 + 状态机修复 + 单实例 + UI 动画补齐 #78

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(polish): 过滤模型深度思考输出#45

fix(polish): 过滤模型深度思考输出#45
H-Chris233 merged 2 commits into
Open-Less:mainfrom
H-Chris233:main

H-Chris233 commented Apr 30, 2026 •

edited by sourcery-ai Bot

Loading

Uh oh!

sourcery-ai Bot commented Apr 30, 2026 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

H-Chris233 commented Apr 30, 2026

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

H-Chris233 commented Apr 30, 2026 • edited by sourcery-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

摘要

修复 / 新增 / 改进

兼容

测试计划

主要改动文件

备注

Summary by Sourcery

Uh oh!

sourcery-ai Bot commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Flow diagram for clean_polish_output with strip_thinking_blocks

File-Level Changes

Assessment against linked issues

Possibly linked issues

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

H-Chris233 commented Apr 30, 2026

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

H-Chris233 commented Apr 30, 2026 •

edited by sourcery-ai Bot

Loading

sourcery-ai Bot commented Apr 30, 2026 •

edited

Loading