@LMX-xin LMX-xin commented Aug 21, 2025

  • Change 1: Convert the down-projection weight in deepseek_v2_decoder_layer.cpp to NZ format.
  • Change 2: In the ATB plugin, select the operator based on the shape of the input activations.
  • Performance:

With groupmatmul enabled and dp=1, ep=1, TPOT decreased by 5%, TTFT decreased by 5%, and throughput improved by 5%.
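
For readers unfamiliar with shape-based dispatch, Change 2 can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `MatmulOp`, `ActivationShape`, `select_matmul_op`, and the threshold value are all hypothetical names and numbers chosen for illustration, and the real ATB plugin may use a different criterion entirely.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operator kinds. These are illustrative labels,
// not identifiers from ATB or xllm.
enum class MatmulOp { kGroupedMatmul, kDenseMatmul };

// Hypothetical description of the activation tensor fed to the
// MoE down projection.
struct ActivationShape {
  int64_t num_tokens;   // rows of the activation matrix
  int64_t hidden_size;  // columns of the activation matrix
};

// Assumed placeholder threshold. The PR does not state the actual
// rule used to pick an operator.
constexpr int64_t kAssumedTokenThreshold = 64;

// Picks an operator from the activation shape alone: large batches go
// to the grouped kernel, small ones to the dense kernel.
inline MatmulOp select_matmul_op(const ActivationShape& shape) {
  assert(shape.num_tokens >= 0 && shape.hidden_size > 0);
  return shape.num_tokens >= kAssumedTokenThreshold
             ? MatmulOp::kGroupedMatmul
             : MatmulOp::kDenseMatmul;
}
```

Under these assumptions, `select_matmul_op({128, 7168})` picks the grouped kernel and `select_matmul_op({1, 7168})` picks the dense one. Keeping the decision in one small function makes the selection rule easy to tune or benchmark separately from the kernels themselves.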

@read-the-docs-community

Documentation build overview

📚 xllm_pub | 🛠️ Build #29273701 | 📁 Comparing 82d1feb against latest (43befa4)


🔍 Preview build

Show files changed (3 files in total): 📝 2 modified | ➕ 0 added | ➖ 1 deleted
File Status
404.html 📝 modified
index.html 📝 modified
zh/features/moe_params/index.html ➖ deleted

@yq33victor yq33victor left a comment

LGTM

@LMX-xin LMX-xin merged commit 822fe34 into jd-opensource:main Aug 21, 2025
1 check passed
@LMX-xin LMX-xin deleted the feat/groupgemm branch August 21, 2025 08:56