[Speculate Decoding] Fix reasoning_phase_token_constraint call args in SpeculativeSampler #7402
Merged
freeliuzc merged 1 commit into PaddlePaddle:develop on Apr 15, 2026
PaddlePaddle-bot
left a comment
🤖 AI Code Review | 2026-04-14
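📋 Review Summary
PR overview: Fixes a bug in SpeculativeSampler where the arguments passed to reasoning_phase_token_constraint did not match the function signature
Scope of change: fastdeploy/model_executor/layers/sample/sampler.py
Impact Tag: [Speculative Decoding]
📝 PR Convention Check
- ✅ Title contains a valid Tag: [Speculative Decoding]
- ✅ PR description includes Motivation and Modifications
- ℹ️ Usage/Command and Accuracy Tests are empty, which is acceptable for a bug fix
Issues
No blocking issues found.
Overall Assessment
The PR correctly fixes the mismatched call arguments. The updated arguments match the reasoning_phase_token_constraint function signature exactly, so the fix is sound and effective.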
freeliuzc
approved these changes
Apr 14, 2026
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## develop #7402 +/- ##
==========================================
Coverage ? 74.44%
==========================================
Files ? 383
Lines ? 53617
Branches ? 8412
==========================================
Hits ? 39917
Misses ? 10980
Partials ? 2720
lonelygsh added a commit to lonelygsh/FastDeploy that referenced this pull request on Apr 16, 2026
…n SpeculativeSampler (PaddlePaddle#7402)
freeliuzc pushed a commit that referenced this pull request on Apr 16, 2026
…7402, #7445 to release/online/20260415 (#7447)
* [Speculate Decoding] Fix step_idx semantics in limit_thinking and set_stop_value kernels (#7166)
- speculate_limit_thinking_content_length: update current_base_step to step_idx+1 (step_idx now records history count before current round); remove incorrect step_idx decrement on accept_num truncation; mark step_idx param as const.
- speculate_set_stop_value_multi_seqs: fix can_stop gate to use step_idx_now+accept_num>=min_token_limit; fix skip check and pre_ids_idx formula (remove stale -accept_num offset); use <= condition so accept_idx maps directly to the accepted token that ends the stop sequence; fix accept_tokens index (remove -1).
- Update unit tests for speculate_set_stop_value_multi_seqs kernel.
* [Speculate Decoding] Fix bug of reasoning_phase_token_constraint kernel (#7349)
Co-authored-by: guanshihui] <guanshihui@baidu.com>
* [Speculate Decoding] Fix reasoning_phase_token_constraint call args in SpeculativeSampler (#7402)
* [Interrupt reasoning] Add interrupt_requests control command support
---------
Co-authored-by: guanshihui] <guanshihui@baidu.com>
Motivation
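When SpeculativeSampler calls reasoning_phase_token_constraint, the arguments passed do not match the function signature. The mismatch involves sampling_metadata.pre_token_ids and prompt_lens. As a result, the reasoning constraint logic cannot execute correctly, and the build failed when submitted for testing.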
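To illustrate the kind of call-site mismatch described above, here is a minimal sketch in plain Python. It is not the actual fix: the real signature of reasoning_phase_token_constraint is not shown in this PR text, so `constraint_stub`, its parameter names, and the `check_call` helper are all hypothetical stand-ins.

```python
import inspect


# Hypothetical stand-in: the real reasoning_phase_token_constraint signature
# is not shown in this PR text, so these parameter names are illustrative only.
def constraint_stub(logits, token_ids_all, prompt_lens, step_idx):
    return logits


def check_call(func, *args, **kwargs):
    """Return None if the arguments bind to func's signature, else the error text."""
    try:
        inspect.signature(func).bind(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


# A call that omits one required argument fails to bind.
missing = check_call(constraint_stub, "logits", "pre_token_ids", step_idx=0)

# A call that supplies every parameter binds cleanly.
ok = check_call(constraint_stub, "logits", "token_ids_all", "prompt_lens", step_idx=0)
```

Note that `inspect.signature(...).bind` only checks arity and parameter names. It cannot detect that the wrong tensor was passed in a correctly filled slot, which is one reason passing such arguments by keyword can make mismatches easier to spot in review.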
Modifications
Fix the arguments passed to reasoning_phase_token_constraint in SpeculativeSampler: sampling_metadata.pre_token_ids is replaced with token_ids_all, and the prompt_lens parameter is corrected.
Usage or Command
No new functionality; this is a bug fix.
Accuracy Tests
No change to model output; only an argument-passing error is fixed.
Checklist
- … pre-commit before commit.
- … release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.