[SDPA] Update SDPA API and make function Public #92189

Closed

drisspg wants to merge 15 commits into pytorch:master from drisspg:update_sdpa_api

Contributor

drisspg commented Jan 14, 2023 •

edited by pytorch-bot bot

Summary

In preparation for the PT 2.0 launch, this PR updates SDPA's API and makes the function a public nn.functional function.

Changes

API

Previously, the function signature was:
scaled_dot_product_attention(query, key, value, attn_mask=None, need_attn_weights=False, dropout_p=0.0, is_causal=False) -> (Tensor, Tensor)
Updated signature:
scaled_dot_product_attention(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False) -> Tensor

This PR removes the optional need_attn_weights boolean argument and changes the return type from a tuple of two tensors to a single tensor.
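For callers, the practical effect is that the function no longer returns a tuple. The following is a minimal pure-Python sketch of the computation behind the updated signature, written for illustration only. `sdpa_reference` is a hypothetical name, it handles a single head on nested lists rather than tensors, and it assumes a boolean `attn_mask` where `True` means the position may be attended to, plus a causal mask aligned to the top-left corner. It is not the implementation in this PR, which calls into fused kernels.

```python
import math
import random


def sdpa_reference(query, key, value, attn_mask=None, dropout_p=0.0, is_causal=False):
    """Toy reference for one attention head (hypothetical helper).

    query: L x E nested list, key: S x E, value: S x Ev.
    attn_mask: optional L x S nested list of booleans (assumed: True = may attend).
    Returns only the L x Ev output, mirroring the single-tensor return.
    """
    scale = 1.0 / math.sqrt(len(query[0]))
    output = []
    for i, q in enumerate(query):
        # Scaled dot-product score of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in key]
        for j in range(len(key)):
            masked = (is_causal and j > i) or (
                attn_mask is not None and not attn_mask[i][j]
            )
            if masked:
                scores[j] = float("-inf")
        # Numerically stable softmax over the key dimension.
        # Assumes every row keeps at least one unmasked key.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        if dropout_p > 0.0:
            # Inverted dropout on the weights: drop with probability p, rescale the rest.
            weights = [
                0.0 if random.random() < dropout_p else w / (1.0 - dropout_p)
                for w in weights
            ]
        output.append(
            [sum(w * v[d] for w, v in zip(weights, value)) for d in range(len(value[0]))]
        )
    return output


q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0]]

# A single return value: there is no second element holding attention weights.
out = sdpa_reference(q, k, v, is_causal=True)
# With the causal mask, the first query can only see the first key, so out[0] equals v[0].
```

Code that previously unpacked two values from the call would need to take the single return value instead.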

Reasoning:

The main goal of this function is to provide an easy interface for users to call into fused attention kernels (e.g., FlashAttention). The fused kernels do not currently support an arbitrary attn_mask or dropout, but there is a PR against mem-efficient attention to enable these. We want the API surface to be ready when the backing kernels are updated.

The fused kernels save memory by not materializing the attention weights, and it is unlikely that a fast fused implementation will ever return them, so we are removing need_attn_weights.
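To make the memory argument concrete, here is a toy sketch of the online-softmax idea that fused kernels such as FlashAttention build on. It processes one key at a time and never holds the full weight vector, so there is nothing to hand back as attention weights. The function names and the single-query, nested-list setup are assumptions for illustration. Real kernels work on tiles of tensors on the GPU and are far more involved.

```python
import math


def attend_naive(q, keys, values):
    """Computes attention for one query by storing one weight per key."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # the materialized attention weights
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(len(values[0]))]


def attend_streaming(q, keys, values):
    """Online softmax: visits keys one by one and keeps only running statistics."""
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")
    denom = 0.0
    acc = [0.0] * len(values[0])
    for k, v in zip(keys, values):
        score = scale * sum(a * b for a, b in zip(q, k))
        new_max = max(running_max, score)
        # Rescale everything accumulated under the old maximum.
        correction = math.exp(running_max - new_max)
        p = math.exp(score - new_max)
        denom = denom * correction + p
        acc = [a * correction + p * x for a, x in zip(acc, v)]
        running_max = new_max
    # Normalize once at the end; the per-key weights were never stored.
    return [a / denom for a in acc]


q = [0.5, -1.0, 2.0]
keys = [[1.0, 0.0, 1.0], [0.0, 2.0, -1.0], [3.0, 1.0, 0.5]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

naive = attend_naive(q, keys, values)
streaming = attend_streaming(q, keys, values)
```

Both functions should agree up to floating-point rounding. Only the naive one has a `weights` list it could return, which is the trade-off behind dropping need_attn_weights.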

Discussed with folks at FAIR/xFormers, who +1'd this API change.

Make function Public

In preparation for the PT 2.0 launch, we make the function public so that we can start gathering user feedback.

cc @mcarilli @ptrblck @leslie-fang-intel @jgong5 @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305 @EikanWang @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx @peterbell10 @desertfire

pytorch-bot bot commented Jan 14, 2023 •

edited

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/92189

✅ No Failures

As of commit 0587d66:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

drisspg force-pushed the update_sdpa_api branch from c6d2e44 to 13112a6 Compare

January 17, 2023 17:09

drisspg marked this pull request as ready for review

January 17, 2023 17:10

drisspg requested review from mruberry, ngimel, albanD and jbschlosser as code owners

January 17, 2023 17:10

drisspg force-pushed the update_sdpa_api branch from 13112a6 to 1128121 Compare

January 17, 2023 17:44

drisspg removed request for albanD, ngimel, mruberry and jbschlosser

January 17, 2023 21:35

github-actions bot added the module: amp (automated mixed precision) label

drisspg requested a review from zou3519 as a code owner

January 18, 2023 00:03

drisspg requested review from ezyang, Chillee, mrshenli, zhaojuanmao, rohan-varma, H-Huang, awgu, kwen2501 and wanchaol as code owners

January 18, 2023 18:12

github-actions bot added the module: inductor label

drisspg removed request for zou3519, ezyang, Chillee, mrshenli, zhaojuanmao and rohan-varma

January 18, 2023 18:33

drisspg added 9 commits

January 20, 2023 17:59


          closer


          test refinments

315a971


          Make function public and update all the call sites

5e1acfe


          batch_registrations

413c702


          update public bindings

14b1153


          skip aot dispatch

edaed99


          add skip

ee96ed4


          guard nt on multiple ragged dims to fallback to math

c4d8d2c


          nits

19d219c

drisspg force-pushed the update_sdpa_api branch from b8fd513 to 19d219c Compare

January 20, 2023 18:24

cpuhrsch approved these changes

drisspg added the ciflow/trunk label


          not sure why this was green before

827cd73

Contributor

facebook-github-bot commented Jan 20, 2023

@drisspg has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


          add dummy function for pr purposes

b6e1952


          why was this included

55b9ba9


          skip out test since we can't xfail on runner arch

09af794


          add back in builtin func so that models still work

0587d66

drisspg commented

aten/src/ATen/native/native_functions.yaml

@@ -13943,21 +13943,27 @@
                   CUDA, NestedTensorCUDA: native_multi_head_attention_cuda
                 autogen: _native_multi_head_attention.out
+              # TODO: THIS NEEDS TO BE REMOVED BUT PEOPLE HAVE TRAINED THEIR MODELS WITH THIS OP BUILTIN
               - func: _scaled_dot_product_attention(Tensor query, Tensor key, Tensor value, Tensor? attn_mask=None, float dropout_p=0.0, bool need_attn_weights=False, bool is_causal=False) -> (Tensor, Tensor)

Contributor Author

drisspg Jan 22, 2023

@cpuhrsch I added this back in since your review; it appears some models may have been packaged with this builtin aten op.

drisspg commented

aten/src/ATen/native/transformers/attention.cpp

Contributor Author

drisspg commented Jan 23, 2023

@pytorchbot merge

Collaborator

pytorchmergebot commented Jan 23, 2023

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

pytorchmergebot added the Merged label

pytorchmergebot closed this in df14650

drisspg added the release notes: nn label
