Additional optimizers for differentiable functions #68
Are there any optimizers that anyone would like to see implemented? In particular, optimizers for differentiable functions.
@zoq, any ideas or requests? @rcurtin has suggested Nesterov's Accelerated Gradient Descent method as a possibility.

Comments
Nesterov's Accelerated Gradient Descent sounds like a good idea to me; ND-Adam (Normalized Direction-preserving Adam) might be an alternative.
To clarify: for Nesterov's Accelerated Gradient Descent, are you referring to the algorithm described at https://blogs.princeton.edu/imabandit/2013/04/01/acceleratedgradientdescent/, as opposed to the algorithm described at http://ruder.io/optimizing-gradient-descent/index.html#nesterovacceleratedgradient, which looks like it's already implemented in sgd/update_policies/nesterov_momentum_update.hpp?
I was thinking about http://ruder.io/optimizing-gradient-descent/index.html#nesterovacceleratedgradient, which, as you pointed out, is already implemented.
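(For reference, here is a minimal sketch of the Nesterov momentum update being discussed, in the formulation from Ruder's post. The function and variable names are illustrative only and are not ensmallen's actual API.)

```c++
// Sketch of one Nesterov momentum update: the gradient is evaluated at the
// "look-ahead" point theta - gamma * v, then v and theta are updated.
#include <armadillo>
#include <functional>

using GradientFn = std::function<arma::vec(const arma::vec&)>;

void NesterovStep(const GradientFn& gradient,
                  arma::vec& theta,      // current parameters
                  arma::vec& velocity,   // accumulated momentum term
                  const double stepSize, // eta
                  const double momentum) // gamma
{
  // Gradient at the look-ahead point theta - gamma * v.
  const arma::vec lookAheadGrad = gradient(theta - momentum * velocity);

  // v <- gamma * v + eta * grad(theta - gamma * v);  theta <- theta - v.
  velocity = momentum * velocity + stepSize * lookAheadGrad;
  theta -= velocity;
}
```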
Ah, I suppose that we do have http://ensmallen.org/docs.html#nesterov-momentum-sgd, but that doesn't work for functions that are just differentiable; it only works on differentiable separable functions. We could add a variant for plain differentiable functions. Otherwise, I might also suggest ND-Adam, or maybe https://arxiv.org/pdf/1711.05101.pdf (AdamW), or, if you want a distributed challenge (I guess you could do it with OpenMP), there is also https://papers.nips.cc/paper/5761-deep-learning-with-elastic-averaging-sgd.pdf, for instance. All of those are differentiable separable optimizers, though; maybe you can find some other type of optimizer to implement, if you like?
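(For readers unfamiliar with the distinction: a differentiable function exposes only an objective and its gradient over the full parameter set, while a differentiable separable function is a sum of individual terms that SGD-style optimizers evaluate in batches. The sketch below shows roughly the shape of the two function types; see the ensmallen documentation for the exact required signatures.)

```c++
// Rough sketch of the two function types being contrasted; check the
// ensmallen function-type documentation for the precise requirements.
#include <armadillo>
#include <cstddef>

// Differentiable function: the whole objective at once.
class DifferentiableFunctionType
{
 public:
  double Evaluate(const arma::mat& x);                    // f(x)
  void Gradient(const arma::mat& x, arma::mat& gradient); // df/dx
};

// Differentiable separable function: f(x) = sum_i f_i(x), evaluated in
// batches, which is what SGD-style optimizers (including Nesterov momentum
// SGD) expect.
class DifferentiableSeparableFunctionType
{
 public:
  std::size_t NumFunctions(); // number of terms f_i
  void Shuffle();             // reorder the f_i (e.g. at the start of an epoch)
  double Evaluate(const arma::mat& x, const std::size_t begin,
                  const std::size_t batchSize);
  void Gradient(const arma::mat& x, const std::size_t begin,
                arma::mat& gradient, const std::size_t batchSize);
};
```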
@lukasmack0 I was porting mlpack issues to ensmallen today and opened #73, which I think might be interesting to you. 👍
How about Non-linear Conjugate Gradient, BFGS, and perhaps even Newton-Raphson (if the Hessian is available)?
Absolutely. There is already L-BFGS, so personally I don't see BFGS as a high priority, but I agree each one would be a nice addition.
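(For concreteness, a single Newton-Raphson step is sketched below. As noted later in the thread, ensmallen has no second-order function type, so the Gradient() and Hessian() callbacks here are hypothetical placeholders.)

```c++
// Sketch of one Newton-Raphson step: x <- x - H(x)^{-1} g(x).
// The gradient and hessian callbacks are hypothetical; ensmallen currently
// has no function type that provides a Hessian.
#include <armadillo>
#include <functional>

void NewtonRaphsonStep(
    const std::function<void(const arma::vec&, arma::vec&)>& gradient,
    const std::function<void(const arma::vec&, arma::mat&)>& hessian,
    arma::vec& x)
{
  arma::vec g;
  arma::mat h;
  gradient(x, g);
  hessian(x, h);

  // Solve H * step = g rather than forming the inverse explicitly.
  const arma::vec step = arma::solve(h, g);
  x -= step;
}
```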
@rcurtin, can I do AdamWR / SGDWR? I read the paper and I think I can do it, and I didn't see an implementation in the library.
@niteya-shah: sure, I think it could be nice to add those also.
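(The "WR" part of AdamWR / SGDWR is cosine annealing with warm restarts from the Loshchilov & Hutter paper linked above; the other half is decoupled weight decay, which subtracts lambda * w directly from the weights after the gradient step. Below is a sketch of the restart schedule; the class and member names are illustrative, not an existing ensmallen API.)

```c++
// Sketch of the cosine-annealing-with-warm-restarts step size schedule:
// eta_t = etaMin + 0.5 * (etaMax - etaMin) * (1 + cos(pi * T_cur / T_i)),
// with the cycle length T_i multiplied by periodMult after each restart.
#include <cmath>
#include <cstddef>

class CosineWarmRestarts
{
 public:
  CosineWarmRestarts(const double etaMin, const double etaMax,
                     const std::size_t restartPeriod, const double periodMult) :
      etaMin(etaMin), etaMax(etaMax), period(restartPeriod),
      periodMult(periodMult), epochInPeriod(0) { }

  // Step size for the current epoch; call once per epoch.
  double NextStepSize()
  {
    const double pi = std::acos(-1.0);
    const double eta = etaMin + 0.5 * (etaMax - etaMin) *
        (1.0 + std::cos(pi * static_cast<double>(epochInPeriod) /
                             static_cast<double>(period)));

    // After the current cycle ends, restart and lengthen the next cycle.
    if (++epochInPeriod >= period)
    {
      epochInPeriod = 0;
      period = static_cast<std::size_t>(period * periodMult);
    }

    return eta;
  }

 private:
  double etaMin, etaMax;
  std::size_t period;
  double periodMult;
  std::size_t epochInPeriod;
};
```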
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
Sad :(
We can reopen it if you like. It might be better to open issues written for first-time contributors for BFGS, nonlinear CG, and N-R (although we don't have any abstractions for second-order differentiable functions at the moment). "Written for first-time contributors" basically just means that they have enough detail to get started even if they are not familiar with the internals of ensmallen. If you'd like to do that, I could mark those issues accordingly.
I'd be interested in having it re-opened. I should have a lot more time to look at it in the next month or so.
Personally, I don't see the need to open an issue that asks for new optimisers in a general sense; contributions in that or any other direction are always welcome. Also, if somebody would like to see a specific method implemented, opening a new issue for it is just fine and much appreciated.