Optim foreach cleanup for NAdam #70229
Conversation
torch/optim/nadam.py (Outdated)

```python
mus = [beta1 * (1. - 0.5 * (0.96 ** (step * momentum_decay))) for step in state_steps]
mu_nexts = [beta1 * (1. - 0.5 * (0.96 ** ((step + 1) * momentum_decay)))
            for step in state_steps]
mu_products = [mu * mu_product for mu, mu_product in zip(mus, mu_products)]
```
The multi-tensor NAdam class updates mu_product before the call to nadam and then operates on mu_products directly, whereas the single-tensor class updates mu_product after the call to nadam and computes the new mu_products within the nadam function. In this combined class I preserved the single-tensor behavior, so I added line 245 here to ensure the same behavior.
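A minimal sketch of the two orderings being described; the helper and constants below are illustrative, not the PR's exact code:

```python
def mu_at(step, beta1=0.9, momentum_decay=0.004):
    # NAdam's per-step momentum coefficient, as in the snippet above.
    return beta1 * (1. - 0.5 * (0.96 ** (step * momentum_decay)))

# Single-tensor ordering: the functional computes mu * mu_product itself,
# and step() repeats the same update on state['mu_product'] after the call.
def single_tensor_step(state, step):
    mu = mu_at(step)
    used_in_update = state['mu_product'] * mu       # recomputed inside the functional
    state['mu_product'] = state['mu_product'] * mu  # duplicated outside it
    return used_in_update

# Multi-tensor ordering: step() updates state['mu_product'] first, and the
# functional consumes the already-updated value, so nothing is computed twice.
def multi_tensor_step(state, step):
    state['mu_product'] *= mu_at(step)
    return state['mu_product']
```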
This could be done with a foreach op, no?
Also, the existing code modifies mu_products in place; don't we want to preserve that?
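For reference, a hedged sketch of what a foreach version might look like if the running products were held as tensors rather than floats (the tensor representation here is an assumption, not the PR's code):

```python
import torch

# Hypothetical: mu_products stored as 0-dim tensors instead of Python floats.
mu_products = [torch.tensor(1.0), torch.tensor(1.0)]
mus = [0.81, 0.85]  # made-up per-parameter mu values

# One fused, in-place multiply over the whole list, replacing the
# per-element list comprehension in the snippet above.
torch._foreach_mul_(mu_products, mus)
```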
Hm, I didn't do this in place because each mu_product is a float rather than a tensor, so even if it's done in place I don't think it updates the underlying state['mu_product']. But there is an update step within the NAdam.step function on line 146 that updates state['mu_product'], which is preserved from the single-tensor version. Am I thinking about this correctly?
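The point about floats can be checked with a standalone example (illustrative values):

```python
state = {'mu_product': 1.0}

mu_product = state['mu_product']   # copies the float's value; floats are immutable
mu_product = mu_product * 0.9      # rebinds the local name only
assert state['mu_product'] == 1.0  # the stored state never changed

# so step() must write the new value back explicitly:
state['mu_product'] = mu_product
```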
Aren't you computing the same thing twice here then?
I am a bit confused now about where this value is updated in each case.
In general, I think we want to stay as close to the original code as possible, even if we have to fold some of the state update into the functional form.
Yes, I agree that the same thing is being computed twice, but that is what the original torch/optim/nadam.py was doing. The single-tensor version updates mu_product after calling F.nadam (and the same computation is done inside the function on lines 224 and 226), whereas the multi-tensor version does it before calling F.nadam, so the logic isn't repeated in the function. So I think we need to change one of the two functional forms: for example, if I preserve the initial multi-tensor version and drop this change to _multi_tensor_nadam, I think I would have to remove line 226 of _single_tensor_nadam (mu_product = mu_product * mu). Does that make sense?
The two options I see are:
- We keep the exact same behavior as before, so we fold the second update for the single-tensor implementation into its functional version, and fold the multi-tensor computation into its functional version.
- We remove the duplicate code and modify one of the two to match the other. In that case, I think we should keep the version that does not do the computation twice, as it is better. Also, the functional version should perform the full step, so I would argue it is a bug that part of it is done in the optimizer outside of the functional call. A sketch of this option follows below.
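A sketch of the second option under the simplest assumption, where the functional owns the full mu_product update and returns it so the caller writes it back exactly once (the signature below is illustrative, not the PR's final API):

```python
def nadam_mu_update(mu_product, step, beta1=0.9, momentum_decay=0.004):
    # The functional performs the whole update; step() no longer repeats it.
    mu = beta1 * (1. - 0.5 * (0.96 ** (step * momentum_decay)))
    return mu_product * mu

# In step(), the returned value is stored back once:
# state['mu_product'] = nadam_mu_update(state['mu_product'], step)
```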
Hm, which option do you think I should go with? It seems that ASGD and SGD have the same bug, where part of the state is updated outside the functional form in the single-tensor versions as well. I think this was done (for ASGD and NAdam) because the arguments were passed as floats rather than singleton tensors, so the state couldn't be updated within the function.
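The singleton-tensor remark suggests one way out: if state['mu_product'] were stored as a 0-dim tensor, the functional could update it in place and the write would be visible through the state dict (a sketch of that idea, not necessarily what the PR does):

```python
import torch

state = {'mu_product': torch.tensor(1.0)}  # 0-dim tensor instead of a float

def functional_update(mu_product, mu):
    # In-place multiply mutates the tensor the state dict already holds.
    mu_product.mul_(mu)

functional_update(state['mu_product'], 0.9)
assert torch.isclose(state['mu_product'], torch.tensor(0.9))
```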
Much cleaner!
@mikaylagawarecki has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Differential Revision: [D33767873](https://our.internmc.facebook.com/intern/diff/D33767873)
Summary: Pull Request resolved: pytorch/pytorch#70229
Test Plan: Imported from OSS
Reviewed By: anjali411
Differential Revision: D33767873
Pulled By: mikaylagawarecki
fbshipit-source-id: 833ead14c1d1659351ebfbeb41045a3c7eb96dad
(cherry picked from commit 9415df6b5c9620c9d53036c28fe3f297c6d4906c)
Stack from ghstack:
Add foreach flag to NAdam optimizer + cleanup
Differential Revision: D33767873