Contributor

@gzy19990617 gzy19990617 commented Mar 10, 2025

Before submitting

  • Lint code. If there are lint issues, please format the code first.
# Install and register `pre-commit` in the project folder
pip install pre-commit && pre-commit install

# Process previous code files separately
pre-commit run --files XXXX.py
  • Add test cases into the tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description

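  • The PR that removes MoE from Paddle: PR#71610

  • Migrates the MoE operators from the framework into paddle_nlp custom operators and trims some of the code. This helps speed up framework compilation, makes it easier to adopt CUTLASS 3.x, makes it easier to support MoE kernels for other precisions on this basis, makes precision profiling easier, and so on.

  • Some header files include Paddle internal headers.

  • Unit-test precision can be aligned.

  • The deepseek-v2-lite model produces normal output with wint4.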

image
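  • After modifying the tune logic, performance on long inputs improves by about 20-30%.
image
  • Static-graph serving throughput is on par between the Paddle internal implementation and the paddle_nlp custom MoE operator, which shows that migrating the operator out of the framework does not affect static-graph serving throughput.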

Qwen/Qwen1.5-MoE-A2.7B model, single H card / 128 concurrency / 1000 samples / wint8 / input-1152-output-201 / block-bs 28
Internal MoE operator:

Pasted Graphic 64

Migrated operator:

Pasted Graphic 65
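
For readers who want to see what "unit-test precision can be aligned" could mean in practice, here is a minimal pure-Python sketch of a reference MoE FFN and a max-absolute-difference check. It is not the CUDA kernel from this PR: the function names (`moe_ffn_reference`, `expert_ffn`, `max_abs_diff`) are hypothetical, the expert is a simplified two-matrix SiLU FFN, and renormalizing the top-k gate probabilities is an assumption rather than something this PR states.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def matvec(w, x):
    # w is a list of rows; each row has len(x) entries.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def silu(v):
    return v / (1.0 + math.exp(-v))


def expert_ffn(x, w_up, w_down):
    # Simplified expert (assumption): down(silu(up(x))).
    hidden = [silu(v) for v in matvec(w_up, x)]
    return matvec(w_down, hidden)


def moe_ffn_reference(x, gate_w, experts, top_k):
    # Route one token to its top_k experts and combine their outputs,
    # weighted by gate probabilities renormalized over the selected experts.
    probs = softmax(matvec(gate_w, x))
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = order[:top_k]
    norm = sum(probs[i] for i in chosen)
    out = [0.0] * len(x)
    for i in chosen:
        w_up, w_down = experts[i]
        y = expert_ffn(x, w_up, w_down)
        for j, v in enumerate(y):
            out[j] += (probs[i] / norm) * v
    return out


def max_abs_diff(a, b):
    # The metric a precision-alignment test would compare to a tolerance.
    return max(abs(p - q) for p, q in zip(a, b))
```

A precision test would then compute `max_abs_diff(kernel_output, moe_ffn_reference(...))` for each token and assert that it stays below a tolerance chosen for the data type under test.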


paddle-bot bot commented Mar 10, 2025

Thanks for your contribution!


codecov bot commented Mar 10, 2025

Codecov Report

Attention: Patch coverage is 0% with 3 lines in your changes missing coverage. Please review.

Project coverage is 49.93%. Comparing base (79dbdbb) to head (45b5e0e).
Report is 265 commits behind head on develop.

Files with missing lines | Patch % | Lines
...erimental/transformers/fused_transformer_layers.py | 0.00% | 3 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop   #10063      +/-   ##
===========================================
- Coverage    49.97%   49.93%   -0.04%     
===========================================
  Files          757      757              
  Lines       122498   122586      +88     
===========================================
+ Hits         61217    61218       +1     
- Misses       61281    61368      +87     



CLAassistant commented Mar 11, 2025

CLA assistant check
All committers have signed the CLA.

@gzy19990617 gzy19990617 changed the title Migration moe ffn 【Inference】Migration moe ffn Mar 12, 2025
@gzy19990617 gzy19990617 changed the title 【Inference】Migration moe ffn 【Inference】Migration MoE Kernel from Paddle Inner Mar 12, 2025
@gzy19990617 gzy19990617 changed the title 【Inference】Migration MoE Kernel from Paddle Inner 【Inference】Migrate MoE Kernel from Paddle Inner Mar 12, 2025
Collaborator

@yuanlehome yuanlehome left a comment


LGTM

@ZHUI ZHUI merged commit 63c4ea4 into PaddlePaddle:develop Mar 18, 2025
9 of 12 checks passed


4 participants