Conversation

kimishpatel (Contributor) commented:

Summary: Dequantize the GEMM when doing a prefill-like op; otherwise use the custom kernel.

Reviewed By: metascroy

Differential Revision: D71833065
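The summary describes a dispatch between two execution paths: for a prefill-like op (many rows in the activation matrix), dequantize the weights up front and run a plain float GEMM; for decode-like ops (one or a few rows), call a custom kernel that works on the quantized weights directly. A minimal sketch of that dispatch is below; the function names, the threshold value, and the use of numpy are all hypothetical illustrations, not code from this PR.

```python
import numpy as np

# Hypothetical cutoff between "prefill-like" and "decode-like" shapes;
# the real dispatch criterion is an implementation detail of the PR.
PREFILL_SEQ_LEN_THRESHOLD = 32

def dequantize(q_weight, scale):
    # Per-output-channel dequantization: int8 weights -> float32
    return q_weight.astype(np.float32) * scale[:, None]

def quantized_linear(x, q_weight, scale):
    seq_len = x.shape[0]
    if seq_len >= PREFILL_SEQ_LEN_THRESHOLD:
        # Prefill-like op: many activation rows, so paying the one-time
        # dequantization cost and running a plain GEMM is the faster path.
        w = dequantize(q_weight, scale)
        return x @ w.T
    # Decode-like op: few rows. The PR would invoke a custom kernel that
    # computes directly on the quantized weights here; this sketch just
    # emulates it with the same math so both paths agree numerically.
    return x @ dequantize(q_weight, scale).T
```

Both branches compute the same values; the split exists purely for performance, since dequantize-then-GEMM amortizes well only when there are many activation rows.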


pytorch-bot bot commented Apr 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10108

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 056a4b7 with merge base a073668:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label on Apr 11, 2025 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).
facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D71833065

@kimishpatel added the "release notes: examples" label on Apr 11, 2025 (changes to any of our example LLM integrations, such as Llama3 and Llava).
@facebook-github-bot merged commit 4022ff1 into pytorch:main on Apr 14, 2025.
83 of 87 checks passed
keyprocedure pushed a commit to keyprocedure/executorch that referenced this pull request on Apr 21, 2025:
Differential Revision: D71833065

Pull Request resolved: pytorch#10108
