
Disable check for dropout in MultiheadAttention fast_path #88831

Closed

Conversation

ecly (Contributor) commented Nov 10, 2022

Since we already enforce eval mode for the fast_path, we do not need to also check for a zero dropout value: a model trained with dropout will still have a non-zero dropout attribute in eval mode, even though dropout is not applied there.

Fixes #88806

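To make the change concrete, here is a minimal pure-Python sketch of the fast-path gate before and after this PR. `SimpleMHA` is a stand-in for `nn.MultiheadAttention`, and the two gate functions are illustrative only; they are not the exact internal fast_path condition, which checks many more properties.

```python
class SimpleMHA:
    """Hypothetical stand-in for nn.MultiheadAttention (illustration only)."""
    def __init__(self, dropout: float, training: bool):
        self.dropout = dropout      # configured dropout probability
        self.training = training    # False when the module is in eval mode

def fast_path_ok_before(m: SimpleMHA) -> bool:
    # Old gate: any non-zero configured dropout disabled the fast path,
    # even in eval mode where dropout is never actually applied.
    return (not m.training) and m.dropout == 0.0

def fast_path_ok_after(m: SimpleMHA) -> bool:
    # New gate: eval mode alone suffices, since eval already
    # guarantees dropout is not applied.
    return not m.training

# A model trained with dropout=0.1, now switched to eval mode:
mha = SimpleMHA(dropout=0.1, training=False)
print(fast_path_ok_before(mha))  # False: fast path wrongly rejected
print(fast_path_ok_after(mha))   # True: fast path taken
```

The point of the fix is exactly this gap: the old condition conflated "dropout is configured" with "dropout will be applied", and only the latter matters for fast-path eligibility.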
pytorch-bot bot commented Nov 10, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/88831

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit a7396dc:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla bot commented Nov 10, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: ecly / name: Emil Lynegaard (a7396dc)

ecly (Contributor, author) commented Nov 10, 2022

Will sign CLA ASAP.

ecly (Contributor, author) commented Nov 10, 2022

I didn't think this path was unit-test-worthy, but if you want, I can port over the test from the related issue.

ngimel (Collaborator) commented Nov 10, 2022

@drisspg please reassign if someone else from BT should review this

ngimel added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Nov 10, 2022
erichan1 (Member) commented Nov 10, 2022

Looks fine to me as per discussion in #88806. Will leave for @drisspg to approve since I've been out of the loop for BT.

Edit: the only possible issue would be a case where we are in eval mode yet still allow dropout. Is there such a case?

ecly (Contributor, author) commented Nov 10, 2022

@erichan1 looking at the implementation in F.multi_head_attention_forward, which I suppose should serve as the behavioral reference for the fast path, this does not seem to be supported, as dropout_p is forced to 0 in eval mode:

pytorch/torch/nn/functional.py (lines 5167 to 5168 at 1ae772a):

    if not training:
        dropout_p = 0.0
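That guard can be sketched in isolation. The helper below is a hypothetical extraction (not a real PyTorch function) that mirrors the two quoted lines from F.multi_head_attention_forward, showing why the module's configured dropout value is irrelevant at inference time.

```python
def effective_dropout_p(dropout_p: float, training: bool) -> float:
    # Mirrors the guard quoted above from F.multi_head_attention_forward:
    # in eval mode (training=False) dropout_p is forced to 0.0, so the
    # configured dropout never reaches the attention computation.
    if not training:
        dropout_p = 0.0
    return dropout_p

print(effective_dropout_p(0.1, training=False))  # 0.0 (eval: dropout suppressed)
print(effective_dropout_p(0.1, training=True))   # 0.1 (train: dropout applied)
```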

drisspg (Contributor) left a comment

Yup, nice catch LGTM!

ngimel (Collaborator) commented Nov 11, 2022

@pytorchbot merge

pytorch-bot bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) on Nov 11, 2022
pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


kulinseth pushed a commit to kulinseth/pytorch that referenced this pull request Dec 10, 2022

Pull Request resolved: pytorch#88831
Approved by: https://github.com/drisspg
Labels: ciflow/trunk · Merged · open source · triaged
Development

Successfully merging this pull request may close these issues.

Fast path for MultiheadAttention does not work with instances with dropout despite being in eval mode
6 participants