Skip to content

Conversation

thomas-woehrle
Copy link

Adds documentation as discussed in #38853

Copy link

pytorch-bot bot commented May 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154113

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 21a5219 with merge base d7a83ab (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@albanD albanD removed their request for review May 22, 2025 17:43
@albanD albanD added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label May 22, 2025
Copy link
Contributor

@janeyx99 janeyx99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two concerns, but maybe it's okay.

  1. While I like having some mention of this over none at all, I'm not sure the doc reader will necessarily know what this means from just the phrase. It may be better to include a bigger note:: and include the mathematical difference like mentioned in the issue.
  2. We should mention this in all the *AdamWs, NAdam + RAdam as they have decoupled_weight_decay options.

@thomas-woehrle
Copy link
Author

  1. While I like having some mention of this over none at all, I'm not sure the doc reader will necessarily know what this means from just the phrase. It may be better to include a bigger note:: and include the mathematical difference like mentioned in the issue.

One could link to the comment of the original implementation issue, here. It explains the "problem" nicely. Would this be preferable over a description in the docstring?

  1. We should mention this in all the *AdamWs, NAdam + RAdam as they have decoupled_weight_decay options.

I could add the note to all of them. This would make it even more desirable to link to an existing explanation compared to explaining the issue in detail in the docstring itself.

Also, if #22343 would be implemented things would/could change again, just mentioning that

@janeyx99
Copy link
Contributor

janeyx99 commented Jun 27, 2025

#22343 will likely not be implemented any time soon.

A description in the docstr is still preferred (as a digestible description) and I'm not sure we typically link to issues in docs (though I'm not theoretically opposed). You can define a docstr in optimizer.py and import it in all the relevant files for deduplication.

Copy link
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Aug 26, 2025
@github-actions github-actions bot closed this Sep 25, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

open source release notes: optim Stale triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants