Conversation

nikhil-arm
Collaborator

@nikhil-arm nikhil-arm commented Jan 2, 2025

The KleidiAI groupwise GEMM kernel was not 2D blocked. This change adds support for 2D blocking of the GEMM kernel, so that the workload is split efficiently across multiple threads and the kernel runs faster.

Performance improvements:
7B model prefill throughput improves from 145 t/s to 175 t/s (about 21% faster).
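
For readers unfamiliar with the term, the idea of 2D blocking can be sketched as follows. This is a minimal illustration, not the KleidiAI kernel or the code in this PR: the names `tile_ranges` and `gemm_2d_blocked`, the block sizes, and the use of plain Python lists and a thread pool are all assumptions made for the example. The output matrix is cut into a grid of (row block, column block) tiles, and each tile becomes one independent task.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def tile_ranges(extent, block):
    """Split [0, extent) into consecutive half-open ranges of at most `block`."""
    return [(start, min(start + block, extent)) for start in range(0, extent, block)]


def gemm_2d_blocked(a, b, m_block=2, n_block=2, num_threads=4):
    """Compute C = A @ B with one task per (row block, column block) tile.

    Hypothetical illustration only: plain lists stand in for tensors, and the
    block sizes are arbitrary rather than tuned for any real kernel.
    """
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]

    def compute_tile(tile):
        (m0, m1), (n0, n1) = tile
        # Each task writes a disjoint rectangle of C, so no locking is needed.
        for i in range(m0, m1):
            for j in range(n0, n1):
                c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))

    # The 2D grid: every pairing of a row range with a column range.
    tiles = list(product(tile_ranges(m, m_block), tile_ranges(n, n_block)))
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(compute_tile, tiles))  # drain to surface any exception
    return c, len(tiles)


a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 x 2
b = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]     # 2 x 3
c, n_tiles = gemm_2d_blocked(a, b)
print(n_tiles)  # 4 tiles: 2 row blocks x 2 column blocks
print(c)        # [[1.0, 2.0, 8.0], [3.0, 4.0, 18.0], [5.0, 6.0, 28.0]]
```

Splitting along both dimensions generally yields more independent work items than splitting along one, which helps keep every thread busy. In CPython the global interpreter lock prevents this sketch from showing a real speedup; it only demonstrates the partitioning.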

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov @BoyuanFeng

pytorch-bot bot commented Jan 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/144074

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 85a3252 with merge base 0431d47:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Signed-off-by: Nikhil Gupta <nikhil.gupta2@arm.com>
Change-Id: I2cb782e8a8414adbee6bfe317bee5bb040f4f982
@nikhil-arm nikhil-arm force-pushed the groupwise_gemm_multithreading branch from b8f0a14 to 1dfb724 January 2, 2025 15:03
@nikhil-arm nikhil-arm changed the title from "Add Multithreading support for kleidiai groupwise GEMM kernels" to "[Feat]: Add Multithreading support for kleidiai groupwise GEMM kernels" Jan 2, 2025
@nikhil-arm nikhil-arm requested review from digantdesai and removed request for IvanYashchuk and lezcano January 3, 2025 11:22
@pytorch pytorch deleted a comment from github-actions bot Jan 3, 2025
@cpuhrsch cpuhrsch added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jan 7, 2025
@digantdesai digantdesai added the ciflow/linux-aarch64 linux aarch64 CI workflow label Jan 9, 2025
Contributor

@digantdesai digantdesai left a comment

Looks good to me, left some nit comments. Thanks.

@nikhil-arm nikhil-arm requested a review from cfRod January 13, 2025 14:46
Signed-off-by: Nikhil Gupta <nikhil.gupta2@arm.com>
Change-Id: I307c21fbe0fad0dd9793f39cef55167d553c091b
@digantdesai
Contributor

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jan 13, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

Labels

ciflow/inductor
ciflow/linux-aarch64 (linux aarch64 CI workflow)
ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
module: cpu (CPU specific problem (e.g., perf, algorithm))
module: dynamo
module: inductor
open source
release notes: linalg_frontend (release notes category)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)

5 participants