Highlighting in the documentation that square root comes before adding epsilon. #23796

ghstack-source-id: 6c4dbd396edeb987c422ec69fa32b60840b3d108
Pull Request resolved: #26735
vincentqb committed Sep 25, 2019
1 parent 5001ec4 commit 014019d
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion torch/optim/rmsprop.py
@@ -3,14 +3,20 @@


class RMSprop(Optimizer):
-"""Implements RMSprop algorithm.
+r"""Implements RMSprop algorithm.
Proposed by G. Hinton in his
`course <http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf>`_.
The centered version first appears in `Generating Sequences
With Recurrent Neural Networks <https://arxiv.org/pdf/1308.0850v5.pdf>`_.
+The implementation here takes the square root of the gradient average before
+adding epsilon (note that TensorFlow interchanges these two operations). The effective
+learning rate is thus :math:`\alpha/(\sqrt{v} + \epsilon)` where :math:`\alpha`
+is the scheduled learning rate and :math:`v` is the weighted moving average
+of the squared gradient.
Arguments:
params (iterable): iterable of parameters to optimize or dicts defining
parameter groups
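The ordering difference the new docstring text describes can be illustrated with a minimal scalar sketch. The function name, signature, and defaults below are illustrative only, not PyTorch's actual implementation:

```python
import math

def rmsprop_step(param, grad, v, lr=0.01, alpha=0.99, eps=1e-8,
                 eps_outside_sqrt=True):
    """One RMSprop update for a scalar parameter (illustrative sketch)."""
    # Weighted moving average of the squared gradient.
    v = alpha * v + (1 - alpha) * grad * grad
    if eps_outside_sqrt:
        # Ordering described in the docstring: take the square root first,
        # then add epsilon, so the effective learning rate is
        # lr / (sqrt(v) + eps).
        param -= lr * grad / (math.sqrt(v) + eps)
    else:
        # TensorFlow-style ordering: epsilon is added inside the square root.
        param -= lr * grad / math.sqrt(v + eps)
    return param, v

# The two orderings give slightly different updates for identical inputs.
p_outside, v1 = rmsprop_step(1.0, 0.1, 0.0)
p_inside, _ = rmsprop_step(1.0, 0.1, 0.0, eps_outside_sqrt=False)
```

The difference only matters when `v` is small relative to `eps` (early in training, or with vanishing gradients), but it means the two libraries are not bit-for-bit interchangeable even with identical hyperparameters.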
