Skip to content

fix(polish): 过滤模型深度思考输出#45

Merged
H-Chris233 merged 2 commits into
Open-Less:mainfrom
H-Chris233:main
Apr 30, 2026
Merged

fix(polish): 过滤模型深度思考输出#45
H-Chris233 merged 2 commits into
Open-Less:mainfrom
H-Chris233:main

Conversation

@H-Chris233
Copy link
Copy Markdown
Collaborator

@H-Chris233 H-Chris233 commented Apr 30, 2026

摘要

Fixes #25

本 PR 解决 MiniMax 等带深度思考输出的模型会把 reasoning / thinking 内容一并插入到最终文本里的问题。

部分 OpenAI-compatible 思考模型会在最终回答前返回显式的 <think>...</think> 内容。此前 polish 流程会把这部分内容当作普通文本继续处理,导致用户输入框中被插入很长的思考块,影响实际输入体验。

本次改动在 response parsing 之后、最终插入文本之前,增加保守的清洗逻辑:只移除明确的 <think>...</think> 块,保留最终润色文本。

修复 / 新增 / 改进

  • clean_polish_output 中增加 thinking block 清洗步骤。

  • 新增 strip_thinking_blocks

    • 只处理显式 <think>...</think>
    • 支持移除模型返回的 reasoning 内容
    • 保留 thinking block 后面的最终文本
  • 保持现有 provider 请求逻辑不变。

  • 保持现有 UI 行为不变。

  • 保持现有 fallback 行为不变。

  • 新增测试覆盖:

    • 验证 <think>...</think> 内容会被移除
    • 验证最终润色文本会被保留

兼容

  • 不包含:

    • 不关闭模型的 thinking 能力。
    • 不新增 provider-specific 开关。
    • 不修改模型配置 UI。
    • 不修改请求参数。
    • 不按中文标题、自然语言段落或非稳定文本模式做激进清洗。
  • 对现有用户 / 本地环境 / 构建流程的影响:

    • 对使用 MiniMax 等思考模型的用户,插入文本中不再包含显式 <think>...</think> 思考块。
    • 对不返回 thinking block 的模型无影响。
    • 对现有 provider 配置、fallback 逻辑和 UI 使用方式无影响。
    • 构建流程无变化。

测试计划

  • 命令:cargo test polish::tests -- --nocapture

  • 结果:通过

  • 证据路径:本地测试输出

  • 命令:cargo check

  • 结果:通过

  • 证据路径:本地检查输出

主要改动文件

  • openless-all/app/src-tauri/src/polish.rs

备注

本 PR 采用保守清洗策略,只移除有稳定标签边界的 <think>...</think> 内容。没有选择按“深度思考”“思考过程”等本地化标题文本清洗,避免误删正常用户可见内容。

Summary by Sourcery

Filter out model reasoning blocks from polish output so only the final user-facing text is inserted.

Bug Fixes:

  • Remove explicit <think>...</think> reasoning sections from polish results to prevent deep-thinking model internals from being inserted into the user input field.

Tests:

  • Add unit tests covering stripping of simple, attributed, mixed-case, multiple, and malformed <think> blocks while preserving the final polished text.

Thinking-capable OpenAI-compatible models can return tagged reasoning before the final polished answer. The cleanup layer now strips only explicit think-tag blocks after response parsing so existing provider requests, UI, and fallback behavior stay unchanged.

Constraint: Issue Open-Less#25 asks to adapt thinking output rather than disabling model thinking.

Rejected: Add a provider-specific thinking toggle | broader UI and settings change than needed for the bug.

Rejected: Strip localized heading text | too likely to remove normal user-facing content without a stable provider contract.

Confidence: medium

Scope-risk: narrow

Tested: cargo test polish::tests -- --nocapture

Tested: cargo check
@sourcery-ai
Copy link
Copy Markdown

sourcery-ai Bot commented Apr 30, 2026

Reviewer's Guide

Adds conservative stripping of <think>...</think> reasoning blocks from polish outputs so only the final user-facing text is kept, plus tests covering these cases.

Flow diagram for clean_polish_output with strip_thinking_blocks

flowchart TD
  A[Start_clean_polish_output_with_model_content] --> B[Call_strip_thinking_blocks]
  B --> C{Any_think_blocks_found?}
  C -- Yes --> D[Build_output_without_think_blocks]
  D --> E[Return_cleaned_text_from_strip_thinking_blocks]
  C -- No --> F[Return_original_text_as_borrowed_Cow]
  E --> G[Trim_whitespace]
  F --> G[Trim_whitespace]
  G --> H[Strip_markdown_fence_if_present]
  H --> I[Remove_leading_trailing_boilerplate_sentences]
  I --> J[Return_final_polish_output_without_reasoning_blocks]
  J --> K[End]
Loading

File-Level Changes

Change Details Files
Strip <think>...</think> reasoning blocks from polish output before further cleanup so only final user text is inserted.
  • Call a new strip_thinking_blocks helper at the start of clean_polish_output before trimming and markdown fence stripping.
  • Return a borrowed Cow when no <think> tags are found to avoid unnecessary allocation, and an owned string when cleaning is applied.
  • Ensure existing polish post-processing (trimming, markdown fence stripping, boilerplate removal) runs on the cleaned text.
openless-all/app/src-tauri/src/polish.rs
Implement robust parsing for <think> open/close tags, supporting attributes, ASCII case-insensitivity, and multiple blocks.
  • Scan the text for <think> open tags with find_think_open, then pair them with closing tags via find_think_close, copying only non-thinking segments into the output buffer.
  • Use parse_think_open_at / parse_think_close_at to validate positions as actual <think> tags and skip non-matching < occurrences.
  • Implement parse_think_tag_end to match think in a case-insensitive way, allow optional attributes on the opening tag, and require proper > termination while ignoring malformed or unclosed tags.
openless-all/app/src-tauri/src/polish.rs
Add unit tests verifying stripping behavior and safety around <think> blocks.
  • Test that a simple <think>...</think> block before the final sentence is removed while the final Chinese text is preserved.
  • Test that <THINK> tags with attributes and mixed case are removed correctly, leaving only the final text.
  • Test handling of multiple <think> blocks interleaved with visible text.
  • Test that non-<think> tags and unclosed <think> tags are left untouched, and that strip_thinking_blocks can return a borrowed Cow for unchanged input.
openless-all/app/src-tauri/src/polish.rs

Assessment against linked issues

Issue Objective Addressed Explanation
#25 Filter out MiniMax and other models' deep-thinking / reasoning content (e.g., ... blocks) so that only the final polished answer is inserted into the input field.

Possibly linked issues


Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • In strip_thinking_blocks, consider operating on a mutable String passed in (or using a single pass over &str with indices) to avoid repeated to_string allocations and potential O(n²) behavior when many <think> blocks are present.
  • If you expect models to sometimes include attributes or casing variants on the thinking tag (e.g. <think reason="true"> or <THINK>), you may want to extend strip_thinking_blocks to handle those forms as well while still keeping the conservative behavior.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `strip_thinking_blocks`, consider operating on a mutable `String` passed in (or using a single pass over `&str` with indices) to avoid repeated `to_string` allocations and potential O(n²) behavior when many `<think>` blocks are present.
- If you expect models to sometimes include attributes or casing variants on the thinking tag (e.g. `<think reason="true">` or `<THINK>`), you may want to extend `strip_thinking_blocks` to handle those forms as well while still keeping the conservative behavior.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Review feedback pointed out that repeated replace_range passes could become quadratic and that provider tags may vary by casing or attributes. The parser now scans the response once, allocates only when a complete think block is removed, and recognizes explicit think tags with optional attributes and ASCII casing variants.

Constraint: Keep cleanup conservative and avoid localized heading stripping.

Rejected: Regex dependency | unnecessary for one small tag parser.

Rejected: Strip broad reasoning headings | can remove valid user-facing prose.

Confidence: medium

Scope-risk: narrow

Tested: cargo test polish::tests -- --nocapture

Tested: cargo check

Tested: npm run build
@H-Chris233
Copy link
Copy Markdown
Collaborator Author

@sourcery-ai review

Copy link
Copy Markdown

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@H-Chris233 H-Chris233 merged commit af9b2b6 into Open-Less:main Apr 30, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[macos] MiniMax 模型会把深度思考也一并输入进去

1 participant