[optim] Adam defaults to fused when CUDA + differentiable=False #90865
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/90865
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 5df2847.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from c292b78 to e379ec5.
Force-pushed from e379ec5 to 5df2847.
Sounds good!
@@ -105,9 +105,12 @@ class Adam(Optimizer):
     capturable (bool, optional): whether this instance is safe to capture in a CUDA graph.
         Passing True can impair ungraphed performance, so if you don't intend to
         graph capture this instance, leave it False (default: False)
-    fused (bool, optional): whether fused implementation of optimizer is used.
+    fused (bool, optional): whether the fused implementation (CUDA only) is used.
nit: it is not really CUDA-only, is it? You could use it for CPU, and other devices like XLA/MPS/XPU might benefit from it as well.
Or do you prefer to mention this for now and update it when we get more implementations?
Thanks for the review! I thought it was CUDA-only since fused_adam only has a CUDA dispatch, but yeah, we can update this when there are more implementations.
https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/native_functions.yaml#L14471-L14476
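For illustration only (not code from this PR), a minimal sketch of opting into the fused path explicitly; the toy model and sizes are assumptions, and the expectation that non-CUDA parameters get rejected follows from the point above that fused_adam currently only has a CUDA dispatch:

```python
import torch

# Hypothetical toy model; any module whose parameters are CUDA float tensors would do.
model = torch.nn.Linear(128, 64).cuda()

# Explicitly request the fused CUDA implementation of Adam.
opt = torch.optim.Adam(model.parameters(), lr=1e-3, fused=True)

loss = model(torch.randn(32, 128, device="cuda")).sum()
loss.backward()
opt.step()  # a single fused kernel updates all parameters

# With CPU parameters the same request is expected to be rejected
# (at the time of this PR, only a CUDA kernel is registered for fused Adam).
try:
    torch.optim.Adam(torch.nn.Linear(8, 8).parameters(), fused=True)
except RuntimeError as err:
    print("fused with CPU params rejected:", err)
```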
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 additional job has failed, the first few of them are: build. Details for Dev Infra team: raised by workflow job.
@pytorchbot merge -f "build ios failure not related"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…#91896) Following up to #90865 and #92048. Pull Request resolved: #91896. Approved by: https://github.com/albanD
…g to fused (#92181) A "fix" following #90865. Realized that fused is not compatible with torch.jit.is_scripting() when looking at a later line. Took the opportunity to make the code cleaner/slightly more performant (with the extends) as well. Pull Request resolved: #92181. Approved by: https://github.com/albanD
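A rough sketch (illustrative names, not the actual PyTorch source) of the gating conditions described in this thread and the follow-ups: prefer the fused path only when every parameter is a CUDA floating-point tensor, differentiable=False, and the code is not being scripted with TorchScript:

```python
import torch

def _should_default_to_fused(params, differentiable: bool) -> bool:
    """Illustrative heuristic; the real decision lives inside torch.optim.Adam."""
    # Fused kernels are not compatible with torch.jit.is_scripting().
    if torch.jit.is_scripting():
        return False
    # The fused path does not support differentiable optimizer steps.
    if differentiable:
        return False
    # Only a CUDA kernel is registered, and it expects floating-point params.
    return all(p.is_cuda and torch.is_floating_point(p) for p in params)
```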
@janeyx99 what would be the expected behavior when
Step 1 in faster default optimizers.
Preliminary benchmarks show gaps in improvement on CUDA for BERT_pytorch and resnet18:
![image](https://user-images.githubusercontent.com/31798555/207707118-14221802-77ce-4ee0-96e3-04638c07924c.png)
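As a rough way to reproduce numbers in the same spirit as the screenshot above, a minimal timing sketch (assumed synthetic model and sizes, not the benchmark behind the plot, which used BERT_pytorch and resnet18) comparing the default and fused Adam steps on CUDA:

```python
import time
import torch

def time_adam(fused: bool, steps: int = 200) -> float:
    # Small synthetic model standing in for the real benchmark workloads.
    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 1024), torch.nn.ReLU(), torch.nn.Linear(1024, 10)
    ).cuda()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, fused=fused)
    x = torch.randn(64, 1024, device="cuda")

    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(steps):
        opt.zero_grad(set_to_none=True)
        model(x).sum().backward()
        opt.step()
    torch.cuda.synchronize()
    return time.perf_counter() - start

print("default Adam:", time_adam(fused=False))
print("fused Adam:  ", time_adam(fused=True))
```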