Add support for returning LSE from FlexAttention (and also differentiating through it) #133159
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133159
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 46c331c with merge base e890d88.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
… differentiating through it)" This PR changes the "contract" of `flex_attention_hop` to return LSE in base 2. However, we undo that and return LSE in base e from the `flex_attention` frontend. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
… differentiating through it)" This PR changes the "contract" of `flex_attention_hop` to return LSE in base 2. However, we undo that and return LSE in base e from the `flex_attention` frontend. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
… differentiating through it)" This PR changes the "contract" of `flex_attention_hop` to return LSE in base 2. However, we undo that and return LSE in base e from the `flex_attention` frontend. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
```python
        kernel_options,
    )

def sdpa_hop(q, k, v, score_mod):
    return flex_attention(q, k, v, score_mod, return_lse=True)
```
Finally fixed these annoying tests, lol!
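For context, here is a minimal usage sketch of the `return_lse=True` flag exercised by the test above. It assumes the `torch.nn.attention.flex_attention` frontend; the tensor shapes and the identity `score_mod` are illustrative, not taken from the PR:

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

# Illustrative shapes: (batch, heads, seq_len, head_dim).
q, k, v = (
    torch.randn(2, 4, 128, 64, requires_grad=True) for _ in range(3)
)

def score_mod(score, b, h, q_idx, kv_idx):
    # Identity modification; any pointwise score_mod works the same way.
    return score

# With return_lse=True, flex_attention returns (output, logsumexp).
# Per this PR, the LSE handed back at this frontend level is in base e.
out, lse = flex_attention(q, k, v, score_mod=score_mod, return_lse=True)

# The LSE itself is differentiable, so a loss may depend on it directly.
(out.sum() + lse.sum()).backward()
```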
… differentiating through it)" This PR changes the "contract" of `flex_attention_hop` to return LSE in base 2. However, we undo that and return LSE in base e from the `flex_attention` frontend. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang [ghstack-poisoned]
@pytorchbot merge
Merge started: your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Stack from ghstack (oldest at bottom):

This PR changes the "contract" of `flex_attention_hop` to return LSE in base 2. However, we undo that and return LSE in base e from the `flex_attention` frontend.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang
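To make the base change concrete, here is a small sketch of the relationship the description implies. The identity lse_e = lse_2 · ln(2) is just a change of logarithm base; that the kernel keeps its running softmax accounting in base 2 (via exp2) is an assumption about the motivation, not stated in the PR:

```python
import math
import torch

scores = torch.randn(8)  # illustrative attention scores for one query row

# Base-e LSE: what the flex_attention frontend returns to users.
lse_e = torch.logsumexp(scores, dim=-1)

# Base-2 LSE: the internal flex_attention_hop contract,
# log2(sum(exp(s))) = ln(sum(exp(s))) / ln(2).
lse_2 = lse_e / math.log(2.0)

# The frontend's "undo" step: multiply by ln(2) to recover base e.
assert torch.allclose(lse_2 * math.log(2.0), lse_e)
```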