[Cherry-Pick][BugFix] prevent requests from entering running state without a slot(#7141) by liyonghua0910 · Pull Request #7181 · PaddlePaddle/FastDeploy

liyonghua0910 · 2026-04-03T06:21:04Z

Motivation

Cherry-pick the waiting-list preempted-task counting fix from develop PR #7141 into release/2.6.

This cherry-pick keeps the release branch aligned with the scheduler slot-accounting fix that prevents requests from being admitted when effective occupied slots have already reached max_num_seqs.

Modifications

Update fastdeploy/engine/sched/resource_manager_v1.py
Count RequestStatus.PREEMPTED requests in self.waiting when checking max_num_seqs
Preserve the existing release/2.6 slot accounting for running, abort-pending, and reschedule-pending requests
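
The accounting described above can be sketched roughly as follows. This is a minimal illustration, not the code in resource_manager_v1.py: the class `SlotAccounting`, its fields, and the `can_admit` helper are hypothetical names, and the sketch assumes the running, reschedule-pending, abort-pending, and preempted-waiting groups do not overlap.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class RequestStatus(Enum):
    # Hypothetical subset of statuses, for illustration only.
    WAITING = auto()
    RUNNING = auto()
    PREEMPTED = auto()


@dataclass
class Request:
    request_id: str
    status: RequestStatus = RequestStatus.WAITING


@dataclass
class SlotAccounting:
    """Hypothetical stand-in for the scheduler's slot bookkeeping."""

    max_num_seqs: int
    running: list = field(default_factory=list)
    waiting: list = field(default_factory=list)
    to_be_rescheduled: set = field(default_factory=set)
    to_be_aborted: set = field(default_factory=set)

    def occupied_slots(self) -> int:
        # Preempted requests sit in the waiting list but still hold a slot.
        preempted_waiting = sum(
            1 for r in self.waiting if r.status is RequestStatus.PREEMPTED
        )
        return (
            len(self.running)
            + len(self.to_be_rescheduled)
            + len(self.to_be_aborted)
            + preempted_waiting
        )

    def can_admit(self) -> bool:
        # Admit a new request only while a slot is effectively free.
        return self.occupied_slots() < self.max_num_seqs
```

With `max_num_seqs=2`, one running request, and one preempted request in the waiting list, `can_admit()` returns `False` in this sketch, whereas a check that ignored preempted requests would still see one free slot.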

Usage or Command

No new usage is introduced.

Validation performed during git cherry-pick --continue via pre-commit hooks:

black
isort
flake8
ruff
check for merge conflicts
fix end of files
trim trailing whitespace
detect private key
check for added large files

Accuracy Tests

This change only affects scheduler accounting logic and does not change model forward, kernel logic, or numerical outputs.

No accuracy impact is expected.

Checklist

Add at least a tag in the PR title.
Format your code, run pre-commit before commit.
Add unit tests. No dedicated unit test is added because this is a targeted scheduler guard fix being cherry-picked to the release branch.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.


paddle-bot · 2026-04-03T06:21:13Z

Thanks for your contribution!

PaddlePaddle-bot

🤖 AI Code Review | 2026-04-03 15:33 CST

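📋 Review Summary

PR overview: Cherry-pick the scheduler slot-accounting fix into the release/2.6 branch
Scope of changes: resource_manager_v1.py (scheduler), common_engine.py (engine), mooncake_store.py (cache)
Impact tags: Scheduler Engine

📝 PR Convention Check

The PR description does not match the actual changes:

Description states: only resource_manager_v1.py is modified
Actual changes: also include modifications to common_engine.py and mooncake_store.py

Please confirm whether these additional changes belong in this cherry-pick PR; if not, consider splitting them into a separate PR.

Issues

Severity	File	Summary
🔴 Bug	`common_engine.py:1155`	The exception-handling change alters shutdown behavior and will cause an existing test to fail

Overall Assessment

The core fix in resource_manager_v1.py is correct: it properly counts to_be_rescheduled, to_be_aborted, and PREEMPTED requests in the slot-occupancy check. The change to common_engine.py is problematic, however: it replaces the previous graceful shutdown handling with an unconditional raise, which conflicts with the expected behavior of the existing test test_schedule_request_to_worker_v1_threadpool_shutdown_breaks. This needs to be fixed or confirmed as intentional.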

PaddlePaddle-bot · 2026-04-03T07:33:34Z

            except RuntimeError as e:
-                if "cannot schedule new futures after shutdown" in str(e):
-                    break
+                raise e


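🔴 Bug: The exception-handling change introduces incompatible behavior

Analysis:

Original code: catches RuntimeError and, if the message contains "cannot schedule new futures after shutdown", exits the loop gracefully (break)

New code: unconditionally executes raise e, so the exception propagates upward

Impact:

The existing test test_schedule_request_to_worker_v1_threadpool_shutdown_breaks expects this exception to be handled gracefully (not raised); this change will cause the test to fail

In production, the behavior during ThreadPool shutdown changes from "graceful exit" to "exception raised"

Recommendation:
This change is unrelated to the PR's stated goal of a "scheduler slot-accounting fix" and appears to have been introduced while resolving conflicts during the cherry-pick. Please confirm:

If the change is intentional, the test case must be updated accordingly

If it was introduced by mistake, restore the original logic:

    except RuntimeError as e:
        if "cannot schedule new futures after shutdown" in str(e):
            break
        raise e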
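
For reference, the graceful-exit pattern can be reproduced in isolation. The sketch below is not the FastDeploy scheduling loop: `schedule_until_shutdown` is a hypothetical helper, and it relies only on the standard-library behavior that `ThreadPoolExecutor.submit` raises a RuntimeError containing this message once `shutdown()` has been called.

```python
from concurrent.futures import ThreadPoolExecutor

SHUTDOWN_MESSAGE = "cannot schedule new futures after shutdown"


def schedule_until_shutdown(executor, tasks):
    """Submit tasks until the pool is shut down; return how many were submitted.

    Hypothetical helper: a shutdown RuntimeError ends the loop quietly,
    while any other RuntimeError is re-raised to the caller.
    """
    submitted = 0
    for task in tasks:
        try:
            executor.submit(task)
        except RuntimeError as e:
            if SHUTDOWN_MESSAGE in str(e):
                break  # pool is gone: stop scheduling without raising
            raise  # unrelated failure: keep the original traceback
        submitted += 1
    return submitted


if __name__ == "__main__":
    pool = ThreadPoolExecutor(max_workers=1)
    pool.shutdown(wait=True)
    # The loop exits on the first submit instead of propagating the error.
    print(schedule_until_shutdown(pool, [lambda: None]))
```

A bare `raise` is used here instead of `raise e`; both re-raise the same exception, and the bare form is the more common idiom.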

codecov-commenter · 2026-04-03T07:52:26Z

Codecov Report

❌ Patch coverage is 33.33333% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (release/2.6@b24765a). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
.../transfer_factory/mooncake_store/mooncake_store.py	0.00%	1 Missing ⚠️
fastdeploy/engine/common_engine.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@              Coverage Diff               @@
##             release/2.6    #7181   +/-   ##
==============================================
  Coverage               ?   73.75%           
==============================================
  Files                  ?      376           
  Lines                  ?    52886           
  Branches               ?     8249           
==============================================
  Hits                   ?    39007           
  Misses                 ?    11152           
  Partials               ?     2727

Flag	Coverage Δ
GPU	`73.75% <33.33%> (?)`
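
The figures in the report are internally consistent, which a few lines of arithmetic can confirm. The sketch assumes the displayed percentage is truncated (not rounded) to two decimals, and it infers a 3-line patch from "33.33333% with 2 lines missing"; neither assumption is stated in the report itself.

```python
import math

# Totals copied from the coverage diff above.
hits, misses, partials, lines = 39007, 11152, 2727, 52886

# Hits, misses, and partials should partition all tracked lines.
total = hits + misses + partials

# Project coverage, truncated to two decimals (assumed display rule).
coverage_pct = math.floor(hits / lines * 10000) / 100

# Patch coverage: 2 of an inferred 3 changed lines are uncovered.
patch_lines, patch_missing = 3, 2
patch_pct = (patch_lines - patch_missing) / patch_lines * 100

print(total == lines, coverage_pct, round(patch_pct, 5))
```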


…thout a slot(PaddlePaddle#7141) (PaddlePaddle#7181)
* [BugFix] Set MC_MAX_MR_SIZE to avoid register hang (PaddlePaddle#7163)
* Set MC_MAX_MR_SIZE to avoid register hang
* up
* [fix] prevent requests from entering running state without a slot
* [fix] count abort set
* [fix] count preempted task in waiting list
---------
Co-authored-by: jc <52520497+juncaipeng@users.noreply.github.com>


… glm model (#6660) (#7228) * enable trtllm_all_reduce fusion kernel in glm model * update flashinfer paddle version * format update modify test modify test support empty tensor and modify test fix test_linear config issues modify test name add edge test case modify format fix conflict modify default max token num in trtllm_allreduce_fusion add max token num branch for trtllm_allreduce_fusion fix format fix rmsnorm config issue modify 2025 to 2026 enable trtllm_allreduce fusion Revert "[Cherry-Pick][CI] Use GPU-Build-RL runner for _build_linux_rl.yml (#7186) (#7195)" This reverts commit ca2f38b. Revert "[Cherry-Pick][BugFix] prevent requests from entering running state without a slot(#7141) (#7181)" This reverts commit 80f4a72. clean flashinfer cache and modify test fix dumpy patch issue fix some issues * remove redundent * enable moe reduce fusion * fix test * fix cuda context issue * update flashinfer version

juncaipeng and others added 4 commits April 3, 2026 10:51

[BugFix] Set MC_MAX_MR_SIZE to avoid register hang (PaddlePaddle#7163)

acdb89c

* Set MC_MAX_MR_SIZE to avoid register hang * up

[fix] prevent requests from entering running state without a slot

1017523

[fix] count abort set

e1a74fd

[fix] count preempted task in waiting list

7d4324e

liyonghua0910 had a problem deploying to Metax_ci April 3, 2026 06:21 — with GitHub Actions Error

Merge branch 'release/2.6' into release/2.6+20260403_fix_schedule

597a76d

liyonghua0910 had a problem deploying to Metax_ci April 3, 2026 06:25 — with GitHub Actions Failure

PaddlePaddle-bot suggested changes Apr 3, 2026


Jiang-Jia-Jun merged commit 55dbc83 into PaddlePaddle:release/2.6 Apr 3, 2026
34 of 38 checks passed

BingooYang added a commit to BingooYang/FastDeploy that referenced this pull request Apr 11, 2026

Revert "[Cherry-Pick][BugFix] prevent requests from entering running …

3bdfbde

…state without a slot(PaddlePaddle#7141) (PaddlePaddle#7181)" This reverts commit 80f4a72.

Jiang-Jia-Jun merged 5 commits into PaddlePaddle:release/2.6 from liyonghua0910:release/2.6+20260403_fix_schedule

5 participants
