To add Nesterov Adam algorithm for multi-tensor optimizers API #59165
torch/optim/_functional.py
Outdated
Does this belong in this PR? It looks like #59009 got added here.
it belongs to #59009 :
Let's look at #59009 first
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
LGTM. Please use 4e-5 for the tolerance on GPU so that the test passes.
@iramazanli merged this pull request in f0e972a.
…ch#59165) Summary: PR pytorch#59009 previously added NAdam to the optimizers. This PR proposes a multi-tensor version of NAdam for PyTorch. NAdam was proposed by Timothy Dozat in the paper https://openreview.net/forum?id=OM0jvwB8jIp57ZJjtNEZ and the report http://cs229.stanford.edu/proj2015/054_report.pdf. It is widely used in the deep learning community. It is worth noting that this implementation of NAdam is inspired by the Keras implementation: https://github.com/tensorflow/tensorflow/blob/f9d386849581d15d72f6f1f96f12aac230a8edbe/tensorflow/python/keras/optimizer_v2/nadam.py Pull Request resolved: pytorch#59165 Reviewed By: vincentqb Differential Revision: D29360577 Pulled By: iramazanli fbshipit-source-id: 0fe14016303b2df2cb8cc31912a2674acf63d1e5
PR #59009 previously added NAdam to the optimizers. This PR proposes a multi-tensor version of NAdam for PyTorch.
NAdam was proposed by Timothy Dozat in the paper https://openreview.net/forum?id=OM0jvwB8jIp57ZJjtNEZ and the report http://cs229.stanford.edu/proj2015/054_report.pdf.
It is widely used in the deep learning community.
It is worth noting that this implementation of NAdam is inspired by the Keras implementation:
https://github.com/tensorflow/tensorflow/blob/f9d386849581d15d72f6f1f96f12aac230a8edbe/tensorflow/python/keras/optimizer_v2/nadam.py
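To make the update rule concrete, here is a minimal sketch of one NAdam step in plain Python. It is not the code from this PR. It assumes the momentum schedule used in the linked Keras implementation (decay base 0.96); the function name `nadam_step`, the state layout, and the default hyperparameters are hypothetical choices made for illustration. Each phase runs over the whole parameter list, which loosely mirrors how a multi-tensor optimizer groups work across parameters instead of finishing one parameter at a time.

```python
import math


def nadam_step(params, grads, state, lr=2e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, momentum_decay=4e-3):
    """Apply one NAdam update in place to a list of scalar parameters.

    Illustrative only: `state` is a hypothetical dict holding the step
    count, the running product of the momentum schedule, and the two
    moment estimates (one entry per parameter).
    """
    state["step"] += 1
    step = state["step"]

    # Momentum schedule (assumed to follow the Keras-style 0.96 decay base).
    mu = beta1 * (1.0 - 0.5 * 0.96 ** (step * momentum_decay))
    mu_next = beta1 * (1.0 - 0.5 * 0.96 ** ((step + 1) * momentum_decay))
    state["mu_product"] *= mu
    mu_product = state["mu_product"]
    mu_product_next = mu_product * mu_next
    bias_correction2 = 1.0 - beta2 ** step

    exp_avgs = state["exp_avgs"]
    exp_avg_sqs = state["exp_avg_sqs"]

    # Phase 1: update first and second moment estimates for all parameters.
    for i, g in enumerate(grads):
        exp_avgs[i] = beta1 * exp_avgs[i] + (1.0 - beta1) * g
        exp_avg_sqs[i] = beta2 * exp_avg_sqs[i] + (1.0 - beta2) * g * g

    # Phase 2: apply the Nesterov-style update to all parameters.
    for i, g in enumerate(grads):
        denom = math.sqrt(exp_avg_sqs[i] / bias_correction2) + eps
        params[i] -= lr * (1.0 - mu) / (1.0 - mu_product) * g / denom
        params[i] -= lr * mu_next / (1.0 - mu_product_next) * exp_avgs[i] / denom


# Usage: a few steps on f(p) = sum(p_i ** 2), whose gradient is 2 * p_i.
params = [1.0, -2.0]
state = {"step": 0, "mu_product": 1.0,
         "exp_avgs": [0.0, 0.0], "exp_avg_sqs": [0.0, 0.0]}
for _ in range(100):
    grads = [2.0 * p for p in params]
    nadam_step(params, grads, state)
```

In an actual multi-tensor implementation the two phases would be expressed as grouped tensor operations over lists of tensors rather than Python loops over floats; the arithmetic per element is intended to be the same.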