
[optim] be explicit about CPU scalar tensor dtypes #111008


Conversation

jon-chuang (Collaborator) opened this pull request on Oct 11, 2023.

pytorch-bot (bot) commented on Oct 11, 2023:

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/111008

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 18d6b92 with merge base b5dd37f:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jon-chuang changed the title from "[optim] be explicit about scalar tensor dtypes to respect contract" to "[optim] be explicit about scalar tensor dtypes" on Oct 11, 2023.
jon-chuang (Collaborator, Author) commented:

@pytorchbot label "release notes: optim"

jon-chuang changed the title from "[optim] be explicit about scalar tensor dtypes" to "[optim] be explicit about CPU scalar tensor dtypes" on Oct 11, 2023.
@@ -736,7 +736,7 @@ def _test_derived_optimizers_varying_tensors(self, optimizer_with_kwargs, kwarg)
                 actual = mt_p_state[k]
                 self.assertEqual(st_p_state[k], actual, rtol=rtol, atol=atol)

-    def _test_derived_optimizers(self, optimizer_pairs_with_flags, flag):
+    def _test_derived_optimizers(self, optimizer_pairs_with_flags, flag, reduced_precision=False):
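Note: below is a minimal, hypothetical sketch of the kind of comparison a `reduced_precision` switch like the one added in this hunk could loosen. The helper name, tolerance values, and structure are illustrative assumptions, not the actual PyTorch test code.

```python
import torch

def assert_optimizer_states_close(single_tensor_state, multi_tensor_state, reduced_precision=False):
    """Compare per-parameter state from the single-tensor (for-loop) and
    multi-tensor (foreach) implementations of an optimizer.

    Hypothetical helper: the tolerance values are illustrative assumptions,
    widened when state is accumulated in a reduced-precision dtype.
    """
    rtol, atol = (1e-4, 1e-5) if reduced_precision else (1e-7, 1e-8)
    for key, expected in single_tensor_state.items():
        actual = multi_tensor_state[key]
        # assert_close raises an AssertionError if values differ beyond the tolerances.
        torch.testing.assert_close(actual, expected, rtol=rtol, atol=atol)
```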
jon-chuang (Collaborator, Author) commented on this diff:

I don't think this tests sparseadam. I think it would have failed previously too.


janeyx99 (Contributor) commented Oct 11, 2023:

This doesn't test sparseadam or lbfgs, since they do not have foreach/multi-tensor impls.

Moreover, it looks like sparseadam doesn't keep its step as a tensor the way the majority of our optimizers do. Moving step to be a tensor is a separate change that includes adding warnings and tests (as it's BC-breaking), so I would recommend moving the sparseadam changes to a separate PR and focusing only on the foreach impls in this one.

If you're up for it, Adadelta, RMSProp, and Rprop were also accidentally left out of the step-to-Tensor migration (done by @mikaylagawarecki almost 2 years ago) and should be brought into the fold. 😄

jon-chuang (Collaborator, Author) replied:

Sure, I recall this is actually high-prio as well. Let me migrate Adadelta, RMSProp, and Rprop next.
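Note: for readers following the step-to-Tensor discussion above, here is a minimal, hypothetical sketch of the difference in question: keeping an optimizer's step count as a 0-dim CPU tensor with an explicit float32 dtype rather than as a plain Python number. The toy optimizer below is an illustration only, not PyTorch's implementation.

```python
import torch

class ToySGD(torch.optim.Optimizer):
    """Toy optimizer used only to illustrate an explicit CPU float32 `step` tensor."""

    def __init__(self, params, lr=1e-2):
        super().__init__(params, defaults=dict(lr=lr))

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                state = self.state[p]
                if len(state) == 0:
                    # Be explicit about the dtype of the CPU scalar `step` tensor,
                    # rather than relying on the global default dtype or a Python int.
                    state["step"] = torch.tensor(0.0, dtype=torch.float32)
                state["step"] += 1
                p.add_(p.grad, alpha=-group["lr"])
```

With an explicit dtype, `step` stays float32 even if, for example, the global default dtype changes, which appears to be the contract the original title ("... to respect contract") referred to.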

cpuhrsch added the triaged label (this issue has been looked at by a team member and triaged/prioritized into an appropriate module) on Oct 11, 2023.
janeyx99 (Contributor) left a review:

Thanks for taking this on!


janeyx99 (Contributor) left a review:

Hey, I am back. @jon-chuang, how can I help get this over the finish line? I'd want this in sooner rather than later!

jon-chuang (Collaborator, Author) commented:

Hello @janeyx99, great! Apologies for the lack of movement on this. Let me prioritize this today.

albanD removed their request for review on November 16, 2023, 14:25.
jon-chuang (Collaborator, Author) commented:

@janeyx99 It should be as requested now.

janeyx99 (Contributor) left a review:

Thanks! Let's ship this thing :D

janeyx99 (Contributor) commented:

@pytorchbot merge -r

pytorch-bot added the ciflow/trunk label (trigger trunk jobs on your pull request) on Nov 21, 2023.
pytorchmergebot (Collaborator) commented:

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.

pytorchmergebot (Collaborator) commented:

Successfully rebased jon-chuang/explicit-float32-step-optim onto refs/remotes/origin/viable/strict. Please pull locally before adding more changes (for example, via git checkout jon-chuang/explicit-float32-step-optim && git pull --rebase).

pytorchmergebot (Collaborator) commented:

The merge job was canceled. If you believe this is a mistake, you can re-trigger it through pytorch-bot.

pytorchmergebot force-pushed the jon-chuang/explicit-float32-step-optim branch from 7f17207 to 18d6b92 on November 21, 2023, 19:24.
janeyx99 (Contributor) commented:

@pytorchbot merge

pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team.

Advanced Debugging: check the merge workflow status here.

seemethere added a commit that referenced this pull request on Nov 21, 2023:

Rely on built-in bash conditionals for doing the if statement rather than relying on $?

To avoid issues observed in #111008 (comment)

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request on Nov 21, 2023:

Rely on built-in bash conditionals for doing the if statement rather than relying on $?

To avoid issues observed in #111008 (comment)

Signed-off-by: Eli Uriegas <eliuriegas@meta.com>
Pull Request resolved: #114295
Approved by: https://github.com/huydhn, https://github.com/malfet
Labels
ciflow/trunk (trigger trunk jobs on your pull request), Merged, open source, release notes: optim, triaged (this issue has been looked at by a team member and triaged/prioritized into an appropriate module)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optim.Adam 'step' default setting bug.
6 participants