[Feature] Add explicit hidden_size parameter support to FusedMoE #7361
chang-wenbin merged 2 commits into PaddlePaddle:develop
PaddlePaddle-bot
left a comment
🤖 AI Code Review | 2026-04-13
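📋 Review Summary
PR overview: Adds a hidden_size parameter to the FusedMoE class so that callers specify hidden_size explicitly instead of having it read automatically from fd_config.model_config.hidden_size.
Scope of change: model_executor/layers/moe/moe.py, 6 MoE model files under model_executor/models/, and 4 test files
Impact tags: [Models] [OP]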
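📝 PR Convention Check
Issues:
- The PR title is missing an official tag
- The PR description does not fill in Motivation and Modifications
Suggested titles (can be copied directly):
[Feature] Add explicit hidden_size parameter support to FusedMoE
[OP] Modify the FusedMoE constructor to accept a hidden_size parameter
Description template (can be copied directly):
Motivation
[State the reason for the change, e.g.: decouple FusedMoE from its hard dependency on fd_config.model_config.hidden_size, making the class design more flexible and easier to extend in the future...]
Modifications
- Add a hidden_size parameter to FusedMoE.__init__, with a default value of -1
- Remove the original logic that reads hidden_size from fd_config.model_config.hidden_size
- Update all callers to pass hidden_size=fd_config.model_config.hidden_size explicitly
- Update the related test files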
Issues
| Severity | File | Summary |
|---|---|---|
| 🔴 Bug | moe.py:156 | The hidden_size parameter defaults to -1 (a magic number) and has no validation logic |
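Overall assessment
The intent of the PR is sound: switching hidden_size from an automatically derived value to an explicitly passed one makes the interface more flexible. All existing callers have been updated correctly, and the test files were changed accordingly.
There is, however, a potential safety issue: the hidden_size parameter defaults to -1 and is not validated. If new code later forgets to pass it, self.hidden_size will be -1, which may cause errors or unexpected behavior in downstream code (such as paddle.empty in forward_chunked_moe).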
    def __init__(
        self,
        fd_config,
        hidden_size: int = -1,
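🔴 Bug: the hidden_size parameter defaults to -1 (a magic number) and has no validation logic.
If future code forgets to pass hidden_size, self.hidden_size will be -1, which may cause errors or unexpected behavior in downstream code (such as paddle.empty(shape=[0, self.hidden_size], ...) in forward_chunked_moe).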
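The failure mode described here can be illustrated with a small, self-contained sketch. `ToyMoE` and `allocate` are hypothetical stand-ins, not FastDeploy or Paddle code; in particular, `allocate` is only assumed to reject negative dimensions, and nothing is claimed about how `paddle.empty` itself reacts to a -1 dimension.

```python
from typing import List


class ToyMoE:
    """Hypothetical stand-in for a layer whose hidden_size defaults to -1."""

    def __init__(self, hidden_size: int = -1):
        # No validation: the sentinel is stored as if it were a real size.
        self.hidden_size = hidden_size

    def empty_output_shape(self) -> List[int]:
        # Mirrors the shape=[0, self.hidden_size] pattern quoted in the review.
        return [0, self.hidden_size]


def allocate(shape: List[int]) -> List[int]:
    """Toy allocator (assumption): rejects negative dimensions."""
    if any(dim < 0 for dim in shape):
        raise ValueError(f"negative dimension in shape {shape}")
    return shape


layer = ToyMoE()                    # caller forgot hidden_size; no error here
shape = layer.empty_output_shape()  # the sentinel has leaked into a shape
try:
    allocate(shape)
    failed_late = False
except ValueError:
    failed_late = True              # the mistake only surfaces at allocation time
```

In this sketch the constructor call succeeds, so the error appears far from the call site that caused it, which is the debugging cost the review is warning about.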
Suggested fixes:
Option 1 (recommended): remove the default value and force every caller to pass it explicitly
    def __init__(
        self,
        fd_config,
        hidden_size: int,  # default value removed
        ...
    ):
Option 2: add validation logic
    def __init__(
        self,
        fd_config,
        hidden_size: int = -1,
        ...
    ):
        super().__init__()
        ...
        assert hidden_size > 0, f"hidden_size must be positive, got {hidden_size}"
        self.hidden_size = hidden_size
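One caveat about the assert-based check: Python strips `assert` statements when run with `-O`, so the validation would disappear in optimized runs. A possible variant, sketched below under the assumption that raising `ValueError` is acceptable for this class, combines both options: no default value, plus an explicit check. `ToyFusedMoE` and `try_build` are hypothetical names used only for illustration; the real constructor takes more arguments.

```python
class ToyFusedMoE:
    """Hypothetical stand-in; the real FusedMoE takes many more arguments."""

    def __init__(self, fd_config, hidden_size: int):
        # Explicit check instead of `assert`, so it still runs under `python -O`.
        if not isinstance(hidden_size, int) or hidden_size <= 0:
            raise ValueError(
                f"hidden_size must be a positive int, got {hidden_size!r}"
            )
        self.fd_config = fd_config
        self.hidden_size = hidden_size


def try_build(hidden_size):
    """Return "ok" on success, or the validation message on failure."""
    try:
        ToyFusedMoE(fd_config=None, hidden_size=hidden_size)
        return "ok"
    except ValueError as exc:
        return str(exc)
```

With this shape, omitting `hidden_size` fails immediately with a `TypeError` at the call site, and passing a non-positive value fails in the constructor rather than later during a forward pass.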
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage for develop #7361:
| Metric | Value |
|---|---|
| Coverage | 74.10% |
| Files | 383 |
| Lines | 53591 |
| Branches | 8406 |
| Hits | 39713 |
| Misses | 11181 |
| Partials | 2697 |
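Motivation
Decouple FusedMoE from its hard dependency on fd_config.model_config.hidden_size, making the class design more flexible and easier to extend in the future.
Modifications
- Add a hidden_size parameter to FusedMoE.__init__, with a default value of -1
- Remove the original logic that reads hidden_size from fd_config.model_config.hidden_size
- Update all callers to pass hidden_size=fd_config.model_config.hidden_size explicitly
- Update the related test files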
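The call-site change listed under Modifications might look roughly like the sketch below. `ToyFusedMoE` is a hypothetical stand-in for the real class, `fd_config` is mocked with `SimpleNamespace`, and the values 4096 and 2048 are arbitrary examples rather than numbers from the PR.

```python
from types import SimpleNamespace


class ToyFusedMoE:
    """Hypothetical stand-in that only records what the caller passed."""

    def __init__(self, fd_config, hidden_size: int):
        self.fd_config = fd_config
        self.hidden_size = hidden_size


# Mocked config object; only the attribute path used in the PR is modeled.
fd_config = SimpleNamespace(model_config=SimpleNamespace(hidden_size=4096))

# Typical call site after the change: forward the model-level value explicitly.
default_moe = ToyFusedMoE(
    fd_config, hidden_size=fd_config.model_config.hidden_size
)

# The decoupling also permits a layer-specific width that differs from the
# model-level setting (2048 is an arbitrary illustrative value).
custom_moe = ToyFusedMoE(fd_config, hidden_size=2048)
```

The second construction is the case the Motivation section points at: the layer no longer has to share the model-wide hidden size.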