[2.0.1] Disable SDPA FlashAttention backward and mem eff attention on sm86+ for head_dim above 64 #99736
Conversation
🔗 Helpful Links: see artifacts and rendered test results at hud.pytorch.org/pr/99736.
⏳ No failures, 2 pending as of commit d3c53c4 (status generated by Dr. CI, updated every 15 minutes).
Please make sure to remove the pull.yml change, but otherwise this looks good to me.
.github/workflows/pull.yml
Outdated
```yaml
linux-bionic-cuda11_8-py3_10-gcc7-sm89-build:
  name: linux-bionic-cuda11.8-py3.10-gcc7-sm89
  uses: ./.github/workflows/_linux-build.yml
  with:
    build-environment: linux-bionic-cuda11.8-py3.10-gcc7-sm89
    docker-image-name: pytorch-linux-bionic-cuda11.8-cudnn8-py3-gcc7
    cuda-arch-list: '8.9'
    test-matrix: |
      { include: [
        { config: "default", shard: 1, num_shards: 4, runner: "linux.gcp.l4" },
        { config: "default", shard: 2, num_shards: 4, runner: "linux.gcp.l4" },
        { config: "default", shard: 3, num_shards: 4, runner: "linux.gcp.l4" },
        { config: "default", shard: 4, num_shards: 4, runner: "linux.gcp.l4" },
      ]}

linux-bionic-cuda11_8-py3_10-gcc7-sm89-test:
  name: linux-bionic-cuda11.8-py3.10-gcc7-sm89
  uses: ./.github/workflows/_linux-test.yml
  needs: linux-bionic-cuda11_8-py3_10-gcc7-sm89-build
  with:
    build-environment: linux-bionic-cuda11.8-py3.10-gcc7-sm89
    docker-image: ${{ needs.linux-bionic-cuda11_8-py3_10-gcc7-sm89-build.outputs.docker-image }}
    test-matrix: ${{ needs.linux-bionic-cuda11_8-py3_10-gcc7-sm89-build.outputs.test-matrix }}
    use-gha: anything-non-empty-to-use-gha
```
This needs to be removed before cherry-pick is landed
Oh, I thought you wanted to include this so we can get sm89 tests on the release branch. I'll remove it.
…or head_dim above 64 (pytorch#99105)

Expand the sdpa_utils.h check to disable FlashAttention when using autograd, and memory-efficient attention, in the following cases:

- head_dim > 64
- sm86 or newer

Previously we only disabled these kernels on sm86 and only for head_dim equal to 128.

Pull Request resolved: pytorch#99105
Approved by: https://github.com/malfet
The new workflow has been removed; the branch has been rebased and force-pushed.
2.0.1 submission of #99105, plus sm89 CI enablement.

Expand the sdpa_utils.h check to disable FlashAttention when using autograd, and memory-efficient attention, in the following cases:

- head_dim > 64
- sm86 or newer

Previously we only disabled these kernels on sm86 and only for head_dim equal to 128.
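To make the change concrete, here is a rough Python sketch of the expanded eligibility check (the real logic is C++ in sdpa_utils.h; the function name and signature here are invented for illustration, not PyTorch API):

```python
# Hypothetical sketch of the check this PR expands; names are invented,
# the real implementation lives in C++ in sdpa_utils.h.

def fused_sdpa_backward_allowed(sm_major: int, sm_minor: int, head_dim: int) -> bool:
    """Return True if FlashAttention backward / mem-efficient attention
    may be used for this (compute capability, head_dim) combination.

    New check (this PR): disabled on sm86 or newer when head_dim > 64.
    Old check: disabled only on exactly sm86 and only for head_dim == 128.
    """
    is_sm86_or_newer = (sm_major, sm_minor) >= (8, 6)
    return not (is_sm86_or_newer and head_dim > 64)
```

For example, under the old check an sm89 GPU (such as the L4 runners this PR adds to CI) with head_dim 128 would still have been allowed to take the fused path; under the new check it falls back.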