Difference in the implementation of rmsprop with Tensor Flow #4754

Closed
MahdiKalayeh opened this issue May 11, 2018 · 2 comments


@MahdiKalayeh

commented May 11, 2018

There is a small difference between Chainer's implementation of RMSprop and the one in TensorFlow that I thought was worth pointing out. In TF, epsilon is added inside the sqrt, whereas in Chainer's implementation it is added outside. This makes little difference when epsilon is very small compared to the mean of grad**2, but it does matter when epsilon is large (e.g. eps=1.0), as is used when training Inception V3 and V4.
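
For reference, the two variants described above can be written out roughly as follows. This is a sketch that ignores momentum and centering; the symbols ($m_t$ for the running mean of squared gradients, $\alpha$ for the decay rate, $\eta$ for the learning rate, $g_t$ for the gradient) are notation assumed here, not taken from either library.

```latex
% Running mean of squared gradients (same in both variants)
m_t = \alpha\, m_{t-1} + (1 - \alpha)\, g_t^2

% eps outside the sqrt (Chainer's behavior as described above)
\theta_t = \theta_{t-1} - \eta\, \frac{g_t}{\sqrt{m_t} + \epsilon}

% eps inside the sqrt (TensorFlow's behavior as described above)
\theta_t = \theta_{t-1} - \eta\, \frac{g_t}{\sqrt{m_t + \epsilon}}
```

When $\epsilon \ll m_t$ the two denominators are nearly equal. When $m_t \ll \epsilon$, the first is about $\epsilon$ and the second about $\sqrt{\epsilon}$; these coincide at $\epsilon = 1$, so with eps=1.0 the visible gap appears mainly when $m_t$ is comparable to $\epsilon$.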

@kmaehashi

Member

commented May 14, 2018

@kmaehashi

Member

commented Jul 25, 2018

Following the discussion, we will provide an eps_inside_sqrt option that can be used to switch this behavior. The default value is False to keep backward compatibility, but users who expect computation equivalent to TensorFlow or Caffe2 can set the option to True.
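
To illustrate what such a switch changes, here is a minimal pure-Python sketch of a single scalar RMSprop step. The function `rmsprop_step` and its defaults are hypothetical and written only for this illustration; it is not Chainer's actual implementation, and only the flag name `eps_inside_sqrt` comes from the comment above.

```python
import math


def rmsprop_step(param, grad, ms, lr=0.01, alpha=0.99, eps=1e-8,
                 eps_inside_sqrt=False):
    """One scalar RMSprop update (hypothetical helper, no momentum).

    Returns the updated parameter and the updated running mean of
    squared gradients.
    """
    # Exponential moving average of the squared gradient.
    ms = alpha * ms + (1.0 - alpha) * grad * grad
    if eps_inside_sqrt:
        # Variant with eps added before taking the square root.
        denom = math.sqrt(ms + eps)
    else:
        # Variant with eps added after taking the square root.
        denom = math.sqrt(ms) + eps
    return param - lr * grad / denom, ms


# Compare the two variants with a large eps, as in the Inception setting.
p_outside, _ = rmsprop_step(1.0, 0.1, 0.0, eps=1.0, eps_inside_sqrt=False)
p_inside, _ = rmsprop_step(1.0, 0.1, 0.0, eps=1.0, eps_inside_sqrt=True)
print(p_outside, p_inside)
```

With eps=1.0 the two calls produce slightly different parameters, while rerunning the comparison with the default eps=1e-8 gives values that are very close, which matches the observation in the original report.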
