MNIST learning rate #35

Closed
glangford opened this issue Dec 8, 2017 · 1 comment

glangford commented Dec 8, 2017

The paper says in Section 4:

Our implementation is in TensorFlow (Abadi et al. [2016]) and we use the Adam optimizer (Kingma and Ba [2014]) with its TensorFlow default parameters, including the exponentially decaying learning rate

The TensorFlow defaults for Adam are described here:
https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer

The current capsulenet.py uses lr_decay as a callback to modify the learning rate, but there isn't any evidence that the paper follows this method. Should the lr_decay callback be removed since Adam already decays the learning rate?
(update: the TensorFlow and Keras defaults for Adam appear to be the same)
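
For anyone reading along, here is a minimal sketch (assuming Keras 2.x; this is not the repository's actual code) of the two setups being compared: Adam with its defaults only, versus the same optimizer with an extra per-epoch exponential decay applied through a `LearningRateScheduler` callback. The toy model, the random data, and the 0.9 decay factor are placeholders.

```python
import numpy as np
from keras import callbacks
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import Adam

# Toy stand-in for the real network; shapes and data are random placeholders.
model = Sequential([Dense(10, activation='softmax', input_shape=(784,))])

# Option 1: rely on the Keras Adam defaults only (lr=0.001, beta_1=0.9, beta_2=0.999).
model.compile(optimizer=Adam(), loss='categorical_crossentropy')

# Option 2: additionally decay the learning rate once per epoch via a callback:
# lr(epoch) = base_lr * decay_factor ** epoch   (0.9 here is a placeholder factor)
lr_decay = callbacks.LearningRateScheduler(
    schedule=lambda epoch: 0.001 * (0.9 ** epoch))

x = np.random.rand(256, 784).astype('float32')
y = np.eye(10)[np.random.randint(0, 10, 256)]

# Training with the callback corresponds to Option 2; drop it from the
# callbacks list to train with Option 1.
model.fit(x, y, batch_size=32, epochs=3, callbacks=[lr_decay], verbose=0)
```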

XifengGuo (Owner) commented

@glangford As I said in README.md, I'm not sure whether the paper used this learning rate decay method. I found that adopting lr_decay can lead to faster convergence. You can remove it and train for more epochs; it's your choice.
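
If it helps when deciding between the two, below is a rough sketch (again assuming Keras 2.x; the `LrLogger` callback, the model, and the data are made up for illustration) of how one could log the optimizer's `lr` variable each epoch and compare runs with and without the lr_decay callback.

```python
import numpy as np
from keras import backend as K
from keras.callbacks import Callback
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import Adam

class LrLogger(Callback):
    """Print the value of the optimizer's `lr` variable at the start of each epoch."""
    def on_epoch_begin(self, epoch, logs=None):
        print('epoch %d: lr = %.6f' % (epoch, K.get_value(self.model.optimizer.lr)))

# Toy model and random data, purely for illustration.
model = Sequential([Dense(10, activation='softmax', input_shape=(784,))])
model.compile(optimizer=Adam(), loss='categorical_crossentropy')

x = np.random.rand(128, 784).astype('float32')
y = np.eye(10)[np.random.randint(0, 10, 128)]

# Add an lr-decay scheduler to the callbacks list (or leave it out) to see
# what learning-rate value each setup sets for Adam at every epoch.
model.fit(x, y, batch_size=32, epochs=3, callbacks=[LrLogger()], verbose=0)
```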
