-
Notifications
You must be signed in to change notification settings - Fork 24.7k
Add AdamW
to C++ frontend
#40009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add AdamW
to C++ frontend
#40009
Conversation
💊 CI failures summary and remediationsAs of commit 3ea1f90 (more details on the Dr. CI page): 💚 💚 Looks good so far! There are no failures yet. 💚 💚 This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group. This comment has been revised 15 times. |
@sotlampr |
Yeah, I rebased to pass some broken tests. Should be good now. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good to go now.
@sotlampr
Thank you so much for doing this work !
Awesome, thanks @glaringlee |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glaringlee has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@glaringlee merged this pull request in 41f2dbd. |
Summary: Slightly modified Adam, following the python implementation, and the `ProducesPyTorchValues` tests pass. I had a problem with another test though (see commit c1a6241), it seems that optimizing for two steps with the same optimizer vs optimizing for two steps using freshly initialized objects will produce the same output. Pull Request resolved: pytorch#40009 Differential Revision: D22096053 Pulled By: glaringlee fbshipit-source-id: a31a8f5488cb37c53752ddf15436efabdba67dc4
Slightly modified Adam, following the python implementation, and the
ProducesPyTorchValues
tests pass. I had a problem with another test though (see commit c1a6241), it seems that optimizing for two steps with the same optimizer vs optimizing for two steps using freshly initialized objects will produce the same output.