【Infer】MLA matrix absorption separation #10249
Thanks for your contribution!
Codecov Report
❌ Your patch status has failed because the patch coverage (0.00%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

Coverage Diff

| | develop | #10249 | +/- |
|---|---|---|---|
| Coverage | 49.70% | 49.96% | +0.25% |
| Files | 761 | 761 | |
| Lines | 124218 | 124105 | -113 |
| Hits | 61744 | 62009 | +265 |
| Misses | 62474 | 62096 | -378 |
Please also apply the corresponding changes to the bf16/wint8 network construction code.
Force-pushed from 80fea35 to 63f3e2f.
LGTM
* bf16 batch gemm
* bf16 and wint8 matrix_absorption
Before submitting
Add test cases into the `tests` folder. If there are codecov issues, please add test cases first.
PR types
Performance optimization
PR changes
Others
Description
Separates the matrix absorption step in DeepSeek MLA, reducing GPU memory usage and improving peak throughput.
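
For readers unfamiliar with the term, the following is a minimal NumPy sketch of the general matrix absorption identity in MLA (Multi-head Latent Attention), not the code from this PR. The function names, tensor shapes, and the `w_uk` key up-projection weight are assumptions chosen for illustration; how this PR separates the absorbed weights, and its bf16/wint8 kernels, are not shown on this page and are not modeled here.

```python
import numpy as np


def naive_scores(q, c_kv, w_uk):
    """Attention logits computed by first expanding the latent cache into keys.

    q:    (heads, d_head)            one query token, per head
    c_kv: (seq, d_latent)            compressed (latent) KV cache
    w_uk: (heads, d_latent, d_head)  hypothetical per-head key up-projection
    """
    # Materializes a (heads, seq, d_head) key tensor.
    k = np.einsum("sl,hld->hsd", c_kv, w_uk)
    return np.einsum("hd,hsd->hs", q, k)


def absorbed_scores(q, c_kv, w_uk):
    """Same logits, with w_uk absorbed into the query side."""
    # One small batched GEMM over heads maps the query into latent space.
    q_latent = np.einsum("hd,hld->hl", q, w_uk)
    # Scores are then taken directly against the latent cache.
    return np.einsum("hl,sl->hs", q_latent, c_kv)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((4, 8))
    c_kv = rng.standard_normal((5, 3))
    w_uk = rng.standard_normal((4, 3, 8))
    print(np.allclose(naive_scores(q, c_kv, w_uk), absorbed_scores(q, c_kv, w_uk)))
```

Both functions compute the same sum over the latent and head dimensions, only in a different order. The absorbed form never builds the per-head key tensor, which is where the memory saving of this family of optimizations generally comes from.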