To add Rprop documentation #63866
💊 CI failures summary and remediations
As of commit 1bf12e5 (more details on the Dr. CI page):
🕵️ 3 new failures recognized by patterns; they do not appear to be due to upstream breakages.
e5a2a98 to 72d1b62
Codecov Report
@@            Coverage Diff             @@
##           master   #63866      +/-   ##
==========================================
- Coverage   66.84%   66.74%   -0.11%
==========================================
  Files         695      698       +3
  Lines       90736    90881     +145
==========================================
+ Hits        60656    60658       +2
- Misses      30080    30223     +143
739b5ca to d030bdb
c755bbb to e6a3eaa
The code is not equivalent if lr is not inside step_sizes range. Is that a problem?
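To make the question above concrete, here is a minimal pure-Python sketch of the two behaviours on the very first step, when the previous gradient is still zero. It assumes (this is not confirmed anywhere in this thread) that the code scales the step size and then clamps it to step_sizes on every step, whereas the pseudo-code only applies a bound in the branches where the gradient product is non-zero. Both function names are hypothetical and exist only for this illustration.

```python
def step_size_clamp_always(step, g_prev, g, etas=(0.5, 1.2), step_sizes=(1e-6, 50.0)):
    """Assumed code behaviour: scale the step size, then clamp on every step."""
    etaminus, etaplus = etas
    lo, hi = step_sizes
    prod = g_prev * g
    factor = etaplus if prod > 0 else etaminus if prod < 0 else 1.0
    return min(max(step * factor, lo), hi)


def step_size_clamp_in_branches(step, g_prev, g, etas=(0.5, 1.2), step_sizes=(1e-6, 50.0)):
    """Pseudo-code behaviour: a bound is applied only when the product is non-zero."""
    etaminus, etaplus = etas
    lo, hi = step_sizes
    prod = g_prev * g
    if prod > 0:
        return min(step * etaplus, hi)
    if prod < 0:
        return max(step * etaminus, lo)
    return step  # product is zero: step size left untouched, even if out of range


# lr deliberately outside step_sizes: the two variants disagree on the first step.
outside_a = step_size_clamp_always(100.0, g_prev=0.0, g=1.0)
outside_b = step_size_clamp_in_branches(100.0, g_prev=0.0, g=1.0)

# lr inside step_sizes: the two variants agree.
inside_a = step_size_clamp_always(0.01, g_prev=0.0, g=1.0)
inside_b = step_size_clamp_in_branches(0.01, g_prev=0.0, g=1.0)
```

Under these assumptions the difference only appears when lr lies outside the step_sizes range, which matches the condition named in the comment.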
torch/optim/rprop.py
Outdated
&\rule{110mm}{0.4pt} \\
&\textbf{input} : \theta_0 \in \mathbf{R}^d \text{ (params)},f(\theta)
\text{ (objective)}, \\
&\hspace{13mm} \eta_{+/-} \text{ (etaplus, etaminus)}, \Gamma_{plus/minus}
This is missing the initialization for g_{prev}, no?
done
torch/optim/rprop.py
Outdated
&\textbf{input} : \theta_0 \in \mathbf{R}^d \text{ (params)},f(\theta)
\text{ (objective)}, \\
&\hspace{13mm} \eta_{+/-} \text{ (etaplus, etaminus)}, \Gamma_{plus/minus}
\text{ (boundaries for lr)} \\[-1.ex]
Is that the "step_sizes" argument?
Yes, exactly!
It should be mentioned explicitly then.
done!
\text{ (objective)}, \\
&\hspace{13mm} \eta_{+/-} \text{ (etaplus, etaminus)}, \Gamma_{plus/minus}
\text{ (boundaries for lr)} \\[-1.ex]
&\rule{110mm}{0.4pt} \\
Also, the \eta_t initialization is missing, no?
done!
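For reference, one way the two initializations requested in this thread could be written, in the same style as the reviewed docstring lines. This is only a sketch: the superscript on g_{prev} and the "(learning rate)" label are assumptions, and the wording in the merged docstring may differ.

```latex
&\textbf{initialize} : g^0_{prev} \leftarrow 0, \: \eta_0 \leftarrow \text{lr (learning rate)} \\
```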
e6a3eaa to 7199546
7199546 to 1bf12e5
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@iramazanli merged this pull request in 54b72a9.
It has been discussed before that adding descriptions of optimization algorithms to the PyTorch core documentation may result in a nice optimization research tutorial. Tracking issue #63236 lists all the necessary algorithms, with links to the originally published papers.
In this PR we add a description of Rprop to the documentation. For more details, we refer to the paper http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.1417
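For readers who prefer code to pseudo-code, the update described in the docstring can be sketched in plain Python roughly as follows. This is an illustration of the algorithm as read from the excerpts in this thread, not the torch/optim/rprop.py implementation; the function name rprop_minimize, the n_steps parameter, and the test objective are hypothetical, and the default values for lr, etas, and step_sizes are assumptions.

```python
def rprop_minimize(grad_fn, theta, lr=0.01, etas=(0.5, 1.2),
                   step_sizes=(1e-6, 50.0), n_steps=100):
    """Per-coordinate Rprop sketch: only the sign of the gradient is used."""
    etaminus, etaplus = etas
    step_min, step_max = step_sizes
    theta = list(theta)
    g_prev = [0.0] * len(theta)   # g_prev initialized to zero
    eta = [lr] * len(theta)       # per-coordinate step size initialized to lr
    for _ in range(n_steps):
        g = list(grad_fn(theta))
        for i in range(len(theta)):
            prod = g_prev[i] * g[i]
            if prod > 0:
                # Same sign as last step: grow the step size, bounded above.
                eta[i] = min(eta[i] * etaplus, step_max)
            elif prod < 0:
                # Sign flipped: shrink the step size, bounded below,
                # and zero the gradient so this coordinate does not move.
                eta[i] = max(eta[i] * etaminus, step_min)
                g[i] = 0.0
            sign = (g[i] > 0) - (g[i] < 0)
            theta[i] -= eta[i] * sign
        g_prev = g
    return theta


def quad_grad(theta):
    # Gradient of the hypothetical objective (x - 3)^2 + (y + 1)^2.
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]


result = rprop_minimize(quad_grad, [0.0, 0.0], n_steps=200)
```

Because only the sign of each gradient component enters the update, the magnitude of the move is controlled entirely by the per-coordinate step size, which is why the bounds in step_sizes matter.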