[BugFix] Fix limit_thinking early return logic in CUDA kernels by Copilot · Pull Request #5471 · PaddlePaddle/FastDeploy

Copilot · 2025-12-09T13:41:20Z

Motivation

Cherry-pick of the fix from PR #5469. The limit_thinking kernels used AND (&&) in their early-return condition, so a thread returned early only when current_limit_think_status matched the checked value and stop_flags[bid] was also true. As a result, the kernel did not return early when only one of the two conditions held.

Modifications

Changed condition logic from AND to OR in 4 CUDA kernel files:

custom_ops/gpu_ops/limit_thinking_content_length_v1.cu
custom_ops/gpu_ops/limit_thinking_content_length_v2.cu
custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v1.cu
custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v2.cu

Before (v1 kernels shown; per the review summary, the v2 kernels compare against status 3):

if (current_limit_think_status == 2 && stop_flags[bid]) {
    return;  // Only returns if BOTH conditions true
}

After:

if (current_limit_think_status == 2 || stop_flags[bid]) {
    return;  // Returns if EITHER condition true
}

Updated Chinese comments from "且" (AND) to "或者" (OR) to match corrected logic.
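
To make the behavioral difference concrete, here is a minimal host-side C++ sketch. It is not the kernel code itself. Only `current_limit_think_status`, `stop_flags`, `bid`, and the status value 2 come from the snippets above; `Predicate`, `should_return_early`, and `count_processed` are hypothetical names introduced for illustration, and the "kernel body" is reduced to a counter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical selector for the old (AND) and new (OR) condition.
enum class Predicate { kAnd, kOr };

// Returns true when the thread handling one sequence should return early.
// The value 2 is the v1 status checked in the PR snippets.
inline bool should_return_early(int current_limit_think_status,
                                bool stop_flag,
                                Predicate predicate) {
    const bool status_matches = (current_limit_think_status == 2);
    return predicate == Predicate::kAnd ? (status_matches && stop_flag)
                                        : (status_matches || stop_flag);
}

// Counts how many sequences would fall through to the kernel body.
// Each loop iteration stands in for one thread indexed by bid.
inline std::size_t count_processed(const std::vector<int>& status,
                                   const std::vector<bool>& stop_flags,
                                   Predicate predicate) {
    assert(status.size() == stop_flags.size());
    std::size_t processed = 0;
    for (std::size_t bid = 0; bid < status.size(); ++bid) {
        if (should_return_early(status[bid], stop_flags[bid], predicate)) {
            continue;  // emulates `return;` for this thread
        }
        ++processed;
    }
    return processed;
}
```

With statuses {0, 0, 2, 2} and stop flags {false, true, false, true}, the AND form lets three sequences fall through to the body, while the OR form lets only the sequence that is neither at status 2 nor stopped proceed.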

Usage or Command

No API changes. Bug fix only affects internal kernel behavior.

Accuracy Tests

No unit tests added - cherry-pick of verified fix from #5469.

Checklist

Add at least a tag in the PR title.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

Original prompt

This section details the original issue you should resolve.

<issue_title>Also submit the change from #5469 to develop and to release/online/20251131, separately</issue_title>
<issue_description></issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes "Also submit the change from https://github.com/PaddlePaddle/FastDeploy/pull/5469 to develop and to release/online/20251131, separately" #5470


CLAassistant · 2025-12-09T13:41:30Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

paddle-bot · 2025-12-09T13:41:34Z

Thanks for your contribution!

Co-authored-by: yuanlehome <23653004+yuanlehome@users.noreply.github.com>

Copilot

Pull request overview

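This PR applies the bug fix from PR #5469 to the develop and release branches. It fixes a conditional logic error in 4 GPU CUDA kernels by changing the AND operator to OR, and updates the corresponding Chinese comments to match the new logic.

Fixed the early-exit condition of the thinking content length limit, changing && (AND) to || (OR)
Applied the fix consistently across all 4 affected files
Updated the Chinese comments to accurately reflect the OR logic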

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File	Description
custom_ops/gpu_ops/limit_thinking_content_length_v1.cu	Fix early-exit condition logic (status == 2 case)
custom_ops/gpu_ops/limit_thinking_content_length_v2.cu	Fix early-exit condition logic (status == 3 case)
custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v1.cu	Fix early-exit condition logic in the speculative decoding version (status == 2)
custom_ops/gpu_ops/speculate_decoding/speculate_limit_thinking_content_length_v2.cu	Fix early-exit condition logic in the speculative decoding version (status == 3)
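
The table indicates that the v1 and v2 kernels share the same corrected OR condition and differ only in the status value they compare against. A minimal host-side C++ sketch of that relationship follows. It is an illustration under stated assumptions, not the repository's code: `kTerminalStatusV1`, `kTerminalStatusV2`, `early_exit`, and `check_early_exit_variants` are hypothetical names, and the values 2 and 3 are taken from the table above.

```cpp
#include <cassert>

// Status values compared by each kernel family, as listed in the review table.
constexpr int kTerminalStatusV1 = 2;
constexpr int kTerminalStatusV2 = 3;

// Corrected early-exit condition, parameterized by the compared status value.
template <int TerminalStatus>
constexpr bool early_exit(int current_limit_think_status, bool stop_flag) {
    return current_limit_think_status == TerminalStatus || stop_flag;
}

// Small self-check covering both variants.
inline void check_early_exit_variants() {
    // The status match alone is enough to exit.
    assert(early_exit<kTerminalStatusV1>(2, false));
    assert(early_exit<kTerminalStatusV2>(3, false));
    // The stop flag alone is enough to exit.
    assert(early_exit<kTerminalStatusV1>(0, true));
    assert(early_exit<kTerminalStatusV2>(0, true));
    // Neither condition holds: the thread continues into the kernel body.
    assert(!early_exit<kTerminalStatusV1>(0, false));
    assert(!early_exit<kTerminalStatusV2>(2, false));
}
```

Expressing the compared value as a template parameter is only one way to show the shared structure; the PR itself edits the four files individually.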

Initial plan

20a17c4

Copilot AI assigned Copilot and yuanlehome Dec 9, 2025

paddle-bot bot added the contributor External developers label Dec 9, 2025

Copilot started work on behalf of yuanlehome December 9, 2025 13:41 View session

Copilot AI and others added 2 commits December 9, 2025 13:43

[BugFix] Fix limit_thinking bug - change AND to OR in condition checks

bee8308

Co-authored-by: yuanlehome <23653004+yuanlehome@users.noreply.github.com>

Update Chinese comments to reflect OR logic instead of AND

e07b93f

Co-authored-by: yuanlehome <23653004+yuanlehome@users.noreply.github.com>

yuanlehome marked this pull request as ready for review December 9, 2025 13:45

Copilot AI review requested due to automatic review settings December 9, 2025 13:45

Copilot started reviewing on behalf of yuanlehome December 9, 2025 13:45 View session

Copilot AI changed the title ~~[WIP] Apply changes from PR 5469 to develop and release branches~~ [BugFix] Fix limit_thinking early return logic in CUDA kernels Dec 9, 2025

Copilot AI requested a review from yuanlehome December 9, 2025 13:46

Copilot finished work on behalf of yuanlehome December 9, 2025 13:46

Copilot AI reviewed Dec 9, 2025

View reviewed changes

freeliuzc approved these changes Dec 9, 2025

View reviewed changes

yuanlehome merged commit e38709b into develop Dec 10, 2025
26 of 38 checks passed

yuanlehome deleted the copilot/update-develop-and-release-branches branch December 16, 2025 07:44

yuanlehome merged 3 commits into develop from copilot/update-develop-and-release-branches
