[BugFix] Fix deep gemm import#7425
Thanks for your contribution!
PaddlePaddle-bot left a comment
🤖 AI Code Review | 2026-04-16
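📋 Review Summary
PR overview: Fixes the deep_gemm import problem in the DeepSeekV3 model by consistently using the deep_gemm module loaded in fp8_utils.
Scope of changes: model_executor/models/deepseek_v3.py
Impact tags: [Models] [Quantization]
📝 PR Convention Check
The PR description is incomplete. Consider adding the following:
Suggested Motivation (can be copied directly):
The original code imported the third-party deep-gemm package directly with `import deep_gemm`, which may conflict with FastDeploy's built-in deep_gemm implementation. Importing from `fp8_utils` instead ensures that FastDeploy's built-in implementation is used. The redundant `paddle.enable_compat(scope={"deep_gemm"})` call is also removed, because `fp8_utils.load_deep_gemm()` already handles the compatibility settings correctly based on the SM version.
Suggested Modifications (can be copied directly):
1. Remove the module-level `paddle.enable_compat(scope={"deep_gemm"})` call.
2. Change `import deep_gemm` inside forward() to `from fastdeploy.model_executor.layers.quantization.fp8_utils import deep_gemm`.
Suggested Checklist: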
- Add at least a tag in the PR title.
- Format your code, run `pre-commit` before commit.
- Add unit tests. (Suggested note: this PR mainly corrects an import path, so no new tests are needed.)
- Provide accuracy results.
- N/A (not submitted to the release branch)
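Issues
No blocking issues found.
Overall Assessment
This is a reasonable bug fix. The change correctly unifies how deep_gemm is imported and avoids the conflicts that directly importing the third-party deep-gemm package could cause. Removing the redundant enable_compat call is also correct, because the fp8_utils module already handles the compatibility settings according to the hardware architecture when it loads deep_gemm. Recommend merging once the PR description is completed.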
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #7425 +/- ##
==========================================
Coverage ? 74.15%
==========================================
Files ? 394
Lines ? 54760
Branches ? 8578
==========================================
Hits ? 40610
Misses ? 11411
Partials ? 2739
Motivation
Modifications
Usage or Command
Accuracy Tests
Checklist
- Tag list: [FDConfig],[APIServer],[Engine],[Scheduler],[PD Disaggregation],[Executor],[Graph Optimization],[Speculative Decoding],[RL],[Models],[Quantization],[Loader],[OP],[KVCache],[DataProcessor],[BugFix],[Docs],[CI],[Optimization],[Feature],[Benchmark],[Others],[XPU],[HPU],[GCU],[DCU],[Iluvatar],[Metax]
- Format your code, run `pre-commit` before commit.
- If submitting to the `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.