
[Optimization] Auto set num_max_dispatch_tokens_per_rank#7237

Merged
freeliuzc merged 6 commits into PaddlePaddle:develop from RichardWooSJTU:auto_dispatch_tokens on Apr 15, 2026

Conversation

RichardWooSJTU (Collaborator) commented Apr 8, 2026

Motivation

Currently, the num_max_dispatch_tokens_per_rank parameter required by low-latency EP communication can only be specified via config.json in the model directory, and it must be updated manually whenever max_num_seqs changes, which is not user-friendly.

💡 If this PR is a Cherry Pick, the PR title needs to follow the format by adding the [Cherry-Pick] label at the very beginning and appending the original PR ID at the end. For example, [Cherry-Pick][CI] Add check trigger and logic(#5191)


Modifications

The only scenarios that currently require changing this parameter are:

  1. Speculative decoding disabled: set it to max_num_seqs.
  2. Speculative decoding enabled: set it to max_num_seqs * (num_spec_tokens + 1).
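
The two cases above can be sketched as a small helper. This is only an illustration of the rule; compute_max_dispatch_tokens is a hypothetical name, not the PR's actual code, which lives in fastdeploy/config.py:

```python
def compute_max_dispatch_tokens(max_num_seqs, num_spec_tokens=None):
    """Illustrative sketch of the auto-set rule (hypothetical helper).

    - Speculative decoding off (num_spec_tokens is None): each request
      dispatches at most one token per step, so max_num_seqs is enough.
    - Speculative decoding on: each request may dispatch the verified
      token plus num_spec_tokens draft tokens per step.
    """
    if num_spec_tokens is None:
        return max_num_seqs
    return max_num_seqs * (num_spec_tokens + 1)


# e.g. max_num_seqs=128 without speculative decoding -> 128
# e.g. max_num_seqs=128 with num_spec_tokens=3 -> 128 * 4 = 512
```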

Usage or Command

Accuracy Tests

Checklist

  • Add at least one tag in the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, please explain why in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.


paddle-bot bot commented Apr 8, 2026

Thanks for your contribution!



codecov-commenter commented Apr 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@e0a1653).

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #7237   +/-   ##
==========================================
  Coverage           ?   73.52%           
==========================================
  Files              ?      383           
  Lines              ?    53644           
  Branches           ?     8421           
==========================================
  Hits               ?    39440           
  Misses             ?    11524           
  Partials           ?     2680           
Flag | Coverage Δ
GPU  | 73.52% <100.00%> (?)

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.


PaddlePaddle-bot left a comment:

🤖 AI Code Review | 2026-04-15

📋 Review Summary

PR overview: automatically computes and sets the num_max_dispatch_tokens_per_rank parameter, adjusting it dynamically based on max_num_seqs and the speculative decoding configuration.

Scope of change: fastdeploy/config.py

Impact tag: [FDConfig]

Issues

Severity | File | Summary
🔴 Bug | fastdeploy/config.py:2173 | Scoping error: num_spec_tokens is undefined when speculative decoding is disabled, risking a NameError

Overall assessment

The optimization intent of this PR is sound and improves the user experience (no manual config syncing needed), but there is one blocking bug to fix: num_spec_tokens is not defined when speculative decoding is disabled, which can cause a runtime error. After fixing, unit tests covering the different scenarios are recommended.

Comment thread on fastdeploy/config.py:
f"Auto-setting num_max_dispatch_tokens_per_rank from "
f"{self.model_config.num_max_dispatch_tokens_per_rank} to {auto_dispatch_tokens} "
f"(max_num_seqs={self.scheduler_config.max_num_seqs}"
f"{f', num_speculative_tokens={num_spec_tokens}' if self.speculative_config is not None and self.speculative_config.method is not None else ''})."

🔴 Bug: variable scoping error

num_spec_tokens is only assigned when speculative decoding is enabled (around line 2161), but the log message at line 2173 references it. When speculative decoding is disabled, num_spec_tokens is undefined, which risks a NameError.

Although the f-string wraps the reference in a conditional if ... else '', a name that is only assigned inside the speculative-decoding branch is not guaranteed to exist on other code paths, so num_spec_tokens should be defined before it is used.

Suggested fix

# Define num_spec_tokens on every path before it is used
if self.speculative_config is not None and self.speculative_config.method is not None:
    num_spec_tokens = self.speculative_config.num_speculative_tokens
    auto_dispatch_tokens = self.scheduler_config.max_num_seqs * (num_spec_tokens + 1)
else:
    num_spec_tokens = 0
    auto_dispatch_tokens = self.scheduler_config.max_num_seqs

Alternatively, use a simpler expression in the log message:

logger.info(
    f"Auto-setting num_max_dispatch_tokens_per_rank from "
    f"{self.model_config.num_max_dispatch_tokens_per_rank} to {auto_dispatch_tokens} "
    f"(max_num_seqs={self.scheduler_config.max_num_seqs}"
    f"{', num_speculative_tokens=' + str(self.speculative_config.num_speculative_tokens) if self.speculative_config is not None and self.speculative_config.method is not None else ''})."
)
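
Since the review recommends unit tests covering both scenarios, a minimal sketch could look like the following. The auto_dispatch_tokens function here is a standalone stand-in for the logic in fastdeploy/config.py; the real code operates on FDConfig objects, so the actual wiring would differ:

```python
import unittest


def auto_dispatch_tokens(max_num_seqs, speculative_method=None, num_speculative_tokens=0):
    # Mirrors the suggested fix: num_spec_tokens is defined on every path.
    if speculative_method is not None:
        num_spec_tokens = num_speculative_tokens
        return max_num_seqs * (num_spec_tokens + 1)
    return max_num_seqs


class TestAutoDispatchTokens(unittest.TestCase):
    def test_speculative_decoding_disabled(self):
        self.assertEqual(auto_dispatch_tokens(64), 64)

    def test_speculative_decoding_enabled(self):
        # e.g. speculation with one draft token per step doubles the budget
        self.assertEqual(auto_dispatch_tokens(64, "mtp", 1), 128)


if __name__ == "__main__":
    unittest.main(exit=False)
```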

@freeliuzc freeliuzc merged commit dec0b06 into PaddlePaddle:develop Apr 15, 2026
35 of 38 checks passed
RichardWooSJTU added a commit to RichardWooSJTU/FastDeploy that referenced this pull request Apr 16, 2026
…e#7237)

* auto set num_max_dispatch_tokens_per_rank

* fix ci

* fix ci

* fix ci
RichardWooSJTU added a commit that referenced this pull request Apr 16, 2026
)(#7426) (#7436)

* [Optimization] Auto set num_max_dispatch_tokens_per_rank (#7237)

* auto set num_max_dispatch_tokens_per_rank

* fix ci

* fix ci

* fix ci

* fix deep gemm import (#7425)

* allow parallel dp starting (#7426)
4 participants