[doc][hackathon] To add Adadelta Optimizer to the documentation #63255
Conversation
💊 CI failures summary and remediations

As of commit 05b6943 (more details on the Dr. CI page):

🕵️ 2 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages:
- linux-xenial-cuda11.3-py3.6-gcc7 / test (default, 1, 2, linux.8xlarge.nvidia.gpu) (1/2), Step: "Test PyTorch" (full log | diagnosis details | 🔁 rerun)
Codecov Report

@@ Coverage Diff @@
##           master   #63255      +/-   ##
==========================================
- Coverage   66.81%   63.72%    -3.10%
==========================================
  Files         695      698        +3
  Lines       90845    90881       +36
==========================================
- Hits        60701    57912     -2789
- Misses      30144    32969     +2825
torch/optim/adadelta.py
Outdated
&\rule{110mm}{0.4pt} \\
&\textbf{input} : \gamma \text{ (lr)}, \: \theta_0 \text{ (params)},
\: f(\theta) \text{ (objective)}, \: \rho \text{ (decay)}, \: weightdecay \\
&\textbf{initialize} : E[g^2]_0 \leftarrow 0, \: E[\Delta x^2]_0 \leftarrow 0
Why not name these like the other optimizers (e.g. exp_avg)? I feel like that would be easier to read.
Sounds good! Done.
torch/optim/adadelta.py
Outdated
&\textbf{input} : \gamma \text{ (lr)}, \: \theta_0 \text{ (params)},
\: f(\theta) \text{ (objective)}, \: \rho \text{ (decay)},
\: \lambda \text{ (weight decay)} \\
&\textbf{initialize} : square\_avg_0 \leftarrow 0,
Why not use v_0, like the other algorithms (https://pytorch.org/docs/master/generated/torch.optim.RMSprop.html#torch.optim.RMSprop)?
Done!
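For context on what these input/initialize lines lead into: the complete Adadelta pseudocode in the published PyTorch documentation (https://pytorch.org/docs/stable/generated/torch.optim.Adadelta.html) continues with update lines roughly as follows, using the v/u naming discussed in this review. This is a sketch from the paper and the rendered docs; the exact rendering merged in this PR may differ.

```latex
&\rule{110mm}{0.4pt} \\
&\textbf{for} \: t = 1 \: \textbf{to} \: \ldots \: \textbf{do} \\
&\hspace{5mm} g_t \leftarrow \nabla_{\theta} f_t(\theta_{t-1}) \\
&\hspace{5mm} v_t \leftarrow \rho v_{t-1} + (1 - \rho) g^2_t \\
&\hspace{5mm} \Delta x_t \leftarrow
    \frac{\sqrt{u_{t-1} + \epsilon}}{\sqrt{v_t + \epsilon}} g_t \\
&\hspace{5mm} u_t \leftarrow \rho u_{t-1} + (1 - \rho) \Delta x^2_t \\
&\hspace{5mm} \theta_t \leftarrow \theta_{t-1} - \gamma \Delta x_t \\
&\rule{110mm}{0.4pt}
```

Here v_t is the running average of squared gradients (square_avg) and u_t the running average of squared updates (acc_delta), matching Zeiler's E[g^2]_t and E[\Delta x^2]_t.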
@iramazanli has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@iramazanli merged this pull request in dafa0a5.
It has previously been discussed that adding descriptions of optimization algorithms to the PyTorch core documentation could serve as a useful optimization research tutorial. Tracking issue #63236 lists all the necessary algorithms, with links to the papers in which they were originally published.
In this PR we add a description of the Adadelta algorithm to the documentation. For details, see the original paper: https://arxiv.org/abs/1212.5701
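To make the documented update rule concrete, here is an illustrative scalar sketch in plain Python. This is not the actual torch/optim/adadelta.py implementation (which operates on tensors); the `adadelta_step` helper is hypothetical, though its defaults mirror the documented `torch.optim.Adadelta` defaults (lr=1.0, rho=0.9, eps=1e-6).

```python
# Illustrative scalar sketch of the Adadelta update (Zeiler, 2012).
# NOT the real torch/optim/adadelta.py; defaults chosen to mirror the
# documented torch.optim.Adadelta defaults (lr=1.0, rho=0.9, eps=1e-6).
import math

def adadelta_step(theta, grad, v, u, lr=1.0, rho=0.9, eps=1e-6):
    """One Adadelta step for a single scalar parameter.

    v: running average of squared gradients (square_avg)
    u: running average of squared updates (acc_delta)
    """
    v = rho * v + (1 - rho) * grad * grad
    delta = math.sqrt(u + eps) / math.sqrt(v + eps) * grad
    u = rho * u + (1 - rho) * delta * delta
    theta = theta - lr * delta
    return theta, v, u

# Minimize f(theta) = theta**2 (gradient is 2 * theta). Early steps are
# tiny because u starts at 0, so theta creeps toward the minimum at 0.
theta, v, u = 5.0, 0.0, 0.0
for _ in range(50):
    theta, v, u = adadelta_step(theta, 2.0 * theta, v, u)
```

Note that, unlike most optimizers, Adadelta's effective step size is driven by the ratio of the two running averages rather than by lr alone, which is why the docs keep both state buffers in the pseudocode.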
cc @vincentqb @iramazanli