Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

potential bug in adadelta #29

Closed
yorkerlin opened this issue Aug 18, 2015 · 2 comments
Closed

potential bug in adadelta #29

yorkerlin opened this issue Aug 18, 2015 · 2 comments

Comments

@yorkerlin
Copy link

@bayerj
It seems there is a bug in adadelta.py when momentum is used.
The momentum correction can be applied to adadelta, rmrprop and others stochastic updates.
The potential bug is at the 110-th line of adadelta.py

    def _iterate(self):
        for args, kwargs in self.args:
            step_m1 = self.step
            d = self.decay
            o = self.offset
            m = self.momentum
            step1 = step_m1 * m * self.step_rate
            self.wrt -= step1

            gradient = self.fprime(self.wrt, *args, **kwargs)

            self.gms = (d * self.gms) + (1 - d) * gradient ** 2
            step2 = sqrt(self.sms + o) / sqrt(self.gms + o) * gradient * self.step_rate
            self.wrt -= step2

            self.step = step1 + step2
            self.sms = (d * self.sms) + (1 - d) * self.step ** 2

            self.n_iter += 1

            yield {
                'n_iter': self.n_iter,
                'gradient': gradient,
                'args': args,
                'kwargs': kwargs,
            }

I think it should be step1 = step_m1 * m instead of step1 = step_m1 * m * self.step_rate.
Correct me if I am wrong.

Note that in rmsprop.py, the 160-th line is

 step1 = step_m1 * self.momentum

, which is correct.

@yorkerlin
Copy link
Author

What is more,
I think the adadelta with momentum correction in http://climin.readthedocs.org/en/latest/adadelta.html, is not correct.

The formula about Δθ should be similar to v in http://climin.readthedocs.org/en/latest/rmsprop.html

@bayerj
Copy link
Contributor

bayerj commented Sep 21, 2015

You indeed found a bug there, thanks. I fixed it.

I could not find an error in the docs. Note that the use of \theta_{t + 1 \over 2} can be confusing, as it is described differently for rmsprop.

@bayerj bayerj closed this as completed Sep 21, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants