Require less alignment for attn bias (#114173) #114837
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/114837
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 6009c5e with merge base 138e289. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Failures are from a bad cherry pick; I will fix up this morning.
@@ -6441,6 +6441,52 @@ def fn_or(x, y):
            (torch.randn(32), torch.randn(32)),
        )

    @requires_cuda()
    @unittest.skipIf(
        not PLATFORM_SUPPORTS_FUSED_SDPA,
On the main PR this is called PLATFORM_SUPPORTS_MEM_EFF_ATTENTION; why is it named PLATFORM_SUPPORTS_FUSED_SDPA on the cherry pick?
That flag was added in between releases and is not in this cherry pick. That being said, PLATFORM_SUPPORTS_FUSED_SDPA is equivalent to PLATFORM_SUPPORTS_MEM_EFF_ATTENTION in the 2.1 release.
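For reference, a minimal sketch of how the gate is used around the new test on the 2.1 release branch. The import path and the test body are assumptions here, not the actual test file; on main the flag was later renamed to PLATFORM_SUPPORTS_MEM_EFF_ATTENTION.

```python
# Sketch only: gating an SDPA test on platform support in the 2.1 branch.
# The import path below is assumed; adjust to wherever the flag lives
# in this branch's test utilities.
import unittest

import torch
from torch.testing._internal.common_cuda import PLATFORM_SUPPORTS_FUSED_SDPA


class TestSDPAPatternRewriter(unittest.TestCase):
    @unittest.skipIf(not torch.cuda.is_available(), "requires CUDA")
    @unittest.skipIf(
        not PLATFORM_SUPPORTS_FUSED_SDPA,
        "Platform does not support fused scaled_dot_product_attention",
    )
    def test_sdpa_unaligned_mask(self):
        # Body elided; the real test is added in the diff above.
        ...
```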
Merging this commit into the patch:
Improved Fix for Attention Mask Alignment Issue (#112577)
This PR addresses Issue #112577 by refining the previously implemented fix, which was found to be incorrect and caused unneeded memory regressions. The update simplifies how the alignment of the attention mask is handled for mem-eff attention.
Alignment Check and Padding: Initially, the alignment of the attention mask is checked. If misalignment is detected, padding is applied, followed by slicing. During this process, a warning is raised to alert users.
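As a rough illustration of the pad-then-slice idea (a minimal sketch, not the actual ATen code; the alignment constant, function name, and warning text are assumptions):

```python
# Minimal sketch of padding a misaligned attention bias and slicing back.
# ALIGNMENT and pad_bias_to_alignment are illustrative names, not PyTorch API.
import warnings

import torch
import torch.nn.functional as F

ALIGNMENT = 8  # elements; assumed last-dim alignment required by the kernel


def pad_bias_to_alignment(attn_bias: torch.Tensor) -> torch.Tensor:
    last = attn_bias.size(-1)
    if attn_bias.stride(-1) == 1 and attn_bias.stride(-2) % ALIGNMENT == 0:
        return attn_bias  # already aligned, no copy needed
    warnings.warn("attn_bias is not aligned; materializing a padded copy")
    padded_len = (last + ALIGNMENT - 1) // ALIGNMENT * ALIGNMENT
    # Pad the last dim up to the aligned length, then slice back to the
    # original size: the view keeps the original shape while the underlying
    # row stride becomes a multiple of ALIGNMENT.
    padded = F.pad(attn_bias, (0, padded_len - last))
    return padded[..., :last]


mask = torch.randn(2, 8, 128, 127)      # last dim 127 -> misaligned
aligned = pad_bias_to_alignment(mask)
print(aligned.shape, aligned.stride())  # same shape, aligned row stride
```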
Should this be warn_once?
We only call expand once, on the aligned mask.
Reference: https://github.com/facebookresearch/xformers/blob/main/xformers/ops/fmha/cutlass.py#L115
@albanD, @mruberry, @jbschlosser, @walterddr, and @mikaylagawarecki.
Pull Request resolved: #114173
Approved by: https://github.com/danthe3rd
Fixes #ISSUE_NUMBER
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler