Add backpropagation optimizers #298

Open · 3 of 4 tasks
mratsim opened this issue Sep 29, 2018 · 1 comment

Comments

mratsim (Owner) commented Sep 29, 2018

Currently only stochastic gradient descent is supported. At the very minimum, it would be nice to support the following (textbook update rules for RMSProp and Adam are sketched after the list):

  • RMSProp
  • Adam
  • SGD with Momentum
  • SGD with Nesterov Momentum
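
For reference, the standard per-element update rules for RMSProp and Adam look roughly like the sketch below. It is written against plain `seq[float]` rather than Arraymancer tensors, and the proc names and parameters are illustrative only, not a proposed API:

```nim
import math

proc rmspropStep(theta, cache: var seq[float]; grad: seq[float];
                 lr = 0.001, rho = 0.9, eps = 1e-8) =
  for i in 0 ..< theta.len:
    # Exponential moving average of squared gradients.
    cache[i] = rho * cache[i] + (1.0 - rho) * grad[i] * grad[i]
    theta[i] -= lr * grad[i] / (sqrt(cache[i]) + eps)

proc adamStep(theta, m, v: var seq[float]; grad: seq[float]; t: int;
              lr = 0.001, beta1 = 0.9, beta2 = 0.999, eps = 1e-8) =
  # t is the 1-based step count, needed for bias correction.
  for i in 0 ..< theta.len:
    # Biased first and second moment estimates.
    m[i] = beta1 * m[i] + (1.0 - beta1) * grad[i]
    v[i] = beta2 * v[i] + (1.0 - beta2) * grad[i] * grad[i]
    # Correct the bias introduced by zero-initialising m and v.
    let mHat = m[i] / (1.0 - pow(beta1, float(t)))
    let vHat = v[i] / (1.0 - pow(beta2, float(t)))
    theta[i] -= lr * mHat / (sqrt(vHat) + eps)
```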
dylanagreen (Contributor) commented Jul 17, 2019

@mratsim I'm curious about adding momentum to SGD (largely to avoid doing any actual work in my own Nim projects, ha). Would you want to do it the same way as PyTorch/TensorFlow? That is, both libraries provide a single "SGD" optimizer with a momentum parameter: momentum is applied when momentum > 0, and the optimizer acts as plain SGD when momentum == 0. They also expose a "nesterov" boolean which, when true, applies Nesterov momentum instead of regular momentum. Or do you envision a different implementation?

See:
https://github.com/pytorch/pytorch/blob/5911cb8e5cdc24218f57480b6647d37d86e77620/torch/optim/sgd.py#L51-L52

and:
https://github.com/tensorflow/tensorflow/blob/59217f581fdef4e5469a98b62e38f851eac88688/tensorflow/python/keras/optimizers.py#L172
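
For concreteness, here is a minimal sketch of that single-optimizer design in Nim, following the PyTorch formulation (velocity `v <- mu*v + g`; Nesterov steps along `g + mu*v`). Plain `seq[float]` stands in for Arraymancer tensors, and the `Sgd` type and proc names are illustrative only, not Arraymancer's actual API:

```nim
type
  Sgd = object
    lr: float
    momentum: float       # 0.0 reduces update() to vanilla SGD
    nesterov: bool
    velocity: seq[float]  # one moment buffer entry per parameter

proc initSgd(nParams: int; lr: float; momentum = 0.0; nesterov = false): Sgd =
  Sgd(lr: lr, momentum: momentum, nesterov: nesterov,
      velocity: newSeq[float](nParams))

proc update(opt: var Sgd; params: var seq[float]; grads: seq[float]) =
  for i in 0 ..< params.len:
    if opt.momentum == 0.0:
      # Plain SGD step.
      params[i] -= opt.lr * grads[i]
    else:
      # Classical momentum buffer update.
      opt.velocity[i] = opt.momentum * opt.velocity[i] + grads[i]
      if opt.nesterov:
        # Nesterov: step along the "look-ahead" gradient g + mu * v.
        params[i] -= opt.lr * (grads[i] + opt.momentum * opt.velocity[i])
      else:
        params[i] -= opt.lr * opt.velocity[i]
```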

mratsim pushed a commit that referenced this issue Jul 23, 2019
* Add momentum to SGD.

- I've retained an older non-momentum version of SGD for backwards compatibility. Storing moments requires a variable SGD object, and most code written for Arraymancer prior to this more than likely defines its optimizers with `let`, since that is how it is done in the examples.

* Move the old moment update to before the weight update.

- This reordering doesn't change the behaviour of update(), but it does make it easier to implement Nesterov momentum.

* Separate SGD with momentum into its own object.

- This preserves backwards compatibility with old `let optim` declared SGD optimizers.

* Add (optional) Nesterov momentum to the SGDMomentum optimizer.

* Add learning rate decay to SGDMomentum.

* Add tests for SGD with momentum.

* Add documentation to SGD.

* Remove an extraneous modification to newSGD.
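
One of the bullets above adds learning-rate decay to SGDMomentum. A common per-update formulation is the inverse-time decay used by Keras's legacy SGD, shown here only as a generic sketch, not necessarily the schedule the commit implements:

```nim
# Inverse-time learning-rate decay: lr_t = lr_0 / (1 + decay * t).
# A generic sketch; `decayedLr` is an illustrative name, not Arraymancer's API.
proc decayedLr(lr0, decay: float; step: int): float =
  lr0 / (1.0 + decay * float(step))
```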