
Lookahead Optimizer: k steps forward, 1 step back #138

Merged: 22 commits merged into mlpack:master on Dec 25, 2019

Conversation

@zoq (Member) commented Sep 25, 2019

Implementation of "Lookahead Optimizer: k steps forward, 1 step back" by Michael R. Zhang, James Lucas, Geoffrey Hinton, Jimmy Ba.
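
The core of the algorithm is a two-level loop: an inner ("fast") optimizer takes k steps from the current slow weights, and the slow weights then move a fraction of the way toward the result. A minimal sketch of one outer iteration, assuming an Armadillo parameter matrix and an ensmallen-style base optimizer already configured to run k steps (the names here are illustrative, not this PR's API):

```cpp
#include <armadillo>

// One outer Lookahead iteration: the fast weights theta start at the slow
// weights phi, the base optimizer advances theta ("k steps forward"), and
// phi then interpolates toward the result ("1 step back"):
//   phi <- phi + alpha * (theta - phi)
template<typename FunctionType, typename OptimizerType>
double LookaheadStep(FunctionType& function,
                     OptimizerType& baseOptimizer, // Configured for k steps.
                     arma::mat& phi,               // Slow weights, updated.
                     const double alpha)           // Slow step size in (0, 1].
{
  arma::mat theta = phi;                                // Fast weights.
  const double objective = baseOptimizer.Optimize(function, theta);
  phi += alpha * (theta - phi);
  return objective;
}
```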

zoq added 4 commits Sep 25, 2019
zoq added 5 commits Sep 25, 2019
@mlpack-bot (bot) commented Oct 29, 2019

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍

@mlpack-bot (bot) added the s: stale label Oct 29, 2019
@zoq added the s: keep open label and removed the s: stale label Oct 29, 2019
@zoq (Member, Author) commented Oct 29, 2019

Let us keep this open.

@conradsnicta (Contributor) commented Oct 30, 2019

This optimiser seems related (at least in spirit) to the Late Acceptance Hill-Climbing (LAHC) optimiser, which has been around since about 2008 in one form or another. A recent-ish journal article on LAHC is here: https://doi.org/10.1016/j.ejor.2016.07.012
Recent improvements on LAHC are here: https://doi.org/10.1007/978-3-030-03991-2_29
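
For context, the core of LAHC is a cost history of length L: a candidate is accepted if it is no worse than the current cost or than the accepted cost from L iterations ago. A minimal sketch of that acceptance rule, written from the papers' description rather than any reference code (the helper and its names are illustrative):

```cpp
#include <cstddef>
#include <vector>

// Late Acceptance Hill-Climbing acceptance rule. `history` has length L and
// is typically initialised with the starting cost; slot v = iteration % L
// holds the accepted cost from L iterations ago.
bool LahcAccept(std::vector<double>& history,
                const std::size_t iteration,
                double& currentCost,
                const double candidateCost)
{
  const std::size_t v = iteration % history.size();
  const bool accept =
      (candidateCost <= currentCost) || (candidateCost <= history[v]);
  if (accept)
    currentCost = candidateCost;
  history[v] = currentCost; // Record the (possibly updated) current cost.
  return accept;
}
```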

Curiously, the arxiv pre-print for the Lookahead optimiser doesn't seem to mention LAHC at all. Maybe this is a case of convergent evolution.

@zoq (Member, Author) commented Oct 31, 2019

Thanks for the reference; I agree it seems related. The improvements from the "Diversified Late Acceptance Search" paper are interesting as well. It looks like there is a Python implementation of LAHC: https://github.com/Gunnstein/lahc

zoq added 3 commits Dec 14, 2019
@rcurtin (Member) left a comment

The paper idea seems interesting and I enjoyed reading it, but I agree with @conradsnicta that it's related to things that are already out there. But I think that's the authors' problem and not so much ours. :)

The implementation looks correct to me---it's quite simple and easy to read both the paper and the code---but I do have a few comments about documentation and how we handle the inner optimizer type. Let me know what you think. 👍

doc/optimizers.md (resolved)

```diff
   const static bool value =
-      HasBatchSize<OptimizerType, BatchSizeForm>::value;
+      HasBatchSize<OptimizerType, BatchSizeForm>::value ||
+      HasBatchSize<OptimizerType, BatchSizeConstForm>::value;
```

@rcurtin (Member) commented Dec 14, 2019

Ahh, nice fix 👍
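
For background on why both forms are needed: some optimizers expose size_t BatchSize() const rather than a non-const accessor, so a trait that matches a single member-function signature silently reports false for them. A standalone sketch of detection that accepts both forms; it uses expression SFINAE for brevity and is not ensmallen's actual traits machinery:

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Detects both `size_t& BatchSize()` and `size_t BatchSize() const`; a
// check against one member-pointer signature would miss one of the two.
template<typename T, typename = void>
struct HasAnyBatchSize : std::false_type { };

template<typename T>
struct HasAnyBatchSize<T,
    typename std::enable_if<!std::is_void<
        decltype(std::declval<T&>().BatchSize())>::value>::type>
  : std::true_type { };

struct NonConstAccessor
{
  std::size_t batchSize;
  std::size_t& BatchSize() { return batchSize; }
};

struct ConstAccessor
{
  std::size_t BatchSize() const { return 32; }
};

static_assert(HasAnyBatchSize<NonConstAccessor>::value, "non-const form");
static_assert(HasAnyBatchSize<ConstAccessor>::value, "const form");
static_assert(!HasAnyBatchSize<int>::value, "no BatchSize() at all");
```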

```diff
- * algorithm, and the final objective value is returned.
+ * Optimize the given function using Eve. The given starting point will be
+ * modified to store the finishing point of the algorithm, and the final
+ * objective value is returned.
```

@rcurtin (Member) commented Dec 14, 2019

Nice catch 👍

* journal = {CoRR},
* year = {2019},
* url = {http://arxiv.org/abs/1907.08610}
* }

@rcurtin (Member) commented Dec 14, 2019

I think we need an @endcode here too.

@zoq (Member, Author) commented Dec 18, 2019

Nice catch; it looks like there are a couple more optimizers where the @endcode or @code is missing, so I will fix those as well.
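
For reference, the convention being restored is Doxygen's @code/@endcode pair around the BibTeX entry, so the block quoted above should close roughly like this (fields as shown in the hunk, the rest of the entry elided):

```cpp
/**
 * @code
 * @article{...
 *   journal = {CoRR},
 *   year    = {2019},
 *   url     = {http://arxiv.org/abs/1907.08610}
 * }
 * @endcode
 */
```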


// Check if the optimizer implements HasMaxIterations() and override the
// parameter with k.
if (callbacks::traits::HasMaxIterationsSignature<BaseOptimizerType>::value)

@rcurtin (Member) commented Dec 14, 2019

These traits generally don't have to do with the callbacks themselves; if you wanted, maybe we should move them out of the callbacks namespace and put them into function/traits.hpp or something? I am not picky either way, and no need to do that in this PR, just wanted to see what you think of the idea.

@zoq (Member, Author) commented Dec 18, 2019

Agreed, not sure why I put them into the callbacks traits.

@rcurtin (Member) commented Dec 20, 2019

Yeah, it's not a huge deal either way---if it's more convenient I think it's fine to open an issue or do it in a future PR.

else
{
Warn << "The BaseOptimizerType does not have a definition of "
<< "MaxIterations(), don't override the number of iteration.";

@rcurtin (Member) commented Dec 14, 2019

Maybe instead of "don't override the number of iteration", it's more clear to say "the given value of k will be ignored" or something like this? The user should probably be aware that in a situation like this, the inner optimizer will probably run until convergence and that will make this whole Lookahead strategy take way too long. In fact, I wouldn't be opposed to forcing the existence of a MaxIterations() function and failing if it's not there.

@zoq (Member, Author) commented Dec 18, 2019

There are some optimizers that don't implement MaxIterations(), like Differential Evolution, which could work with the Lookahead optimizer without running forever; so unless you have a strong opinion here, I would go with the rewording.

@rcurtin (Member) commented Dec 24, 2019

Agreed, that sounds good to me. I'd suggest the base optimizer will have its configuration unchanged perhaps?
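
To make the compile-time branch concrete, here is a sketch of SFINAE dispatch on the trait quoted in the hunk above; the helper OverrideK is hypothetical, while Warn and HasMaxIterationsSignature are the names from this PR, and the merged code may differ:

```cpp
#include <cstddef>
#include <type_traits>

// Chosen when BaseOptimizerType exposes MaxIterations(): override the inner
// iteration budget so that one Optimize() call runs exactly k steps.
template<typename BaseOptimizerType>
typename std::enable_if<callbacks::traits::
    HasMaxIterationsSignature<BaseOptimizerType>::value>::type
OverrideK(BaseOptimizerType& baseOptimizer, const std::size_t k)
{
  baseOptimizer.MaxIterations() = k;
}

// Fallback: the base optimizer's configuration is left unchanged.
template<typename BaseOptimizerType>
typename std::enable_if<!callbacks::traits::
    HasMaxIterationsSignature<BaseOptimizerType>::value>::type
OverrideK(BaseOptimizerType& /* baseOptimizer */, const std::size_t /* k */)
{
  Warn << "The base optimizer will have its configuration unchanged; "
       << "the given value of k is ignored.";
}
```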

{
Warn << "Parameters are reset before every Optimize call; set "
<< "ResetPolicy() to false.";
baseOptimizer.ResetPolicy() = false;

@rcurtin (Member) commented Dec 14, 2019

Maybe it's worth adding a boolean parameter to Lookahead that will set whether or not ResetPolicy() is called at the start of every outer iteration? It's not clear to me what they do in the paper implementation, though, which is probably what we should choose as the default if we offer an option like this.

@zoq (Member, Author) commented Dec 18, 2019

Good idea.
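
If such a flag is added, usage might look roughly like this; the Lookahead constructor arguments, function, and coordinates are hypothetical placeholders rather than the final merged API:

```cpp
// Hypothetical usage: the inner Adam keeps its policy state across outer
// Lookahead iterations instead of being reset on every Optimize() call.
ens::Adam baseOptimizer(0.001 /* step size */, 32 /* batch size */);
baseOptimizer.ResetPolicy() = false;

ens::Lookahead<ens::Adam> lookahead(baseOptimizer,
                                    0.5 /* slow step size */,
                                    5 /* k */);
const double objective = lookahead.Optimize(function, coordinates);
```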

include/ensmallen_bits/lookahead/lookahead_impl.hpp (outdated, resolved)
zoq and others added 6 commits Dec 14, 2019
Co-Authored-By: Ryan Curtin <ryan@ratml.org>
…functions traits.
… optimizer call.
zoq added 2 commits Dec 23, 2019
@rcurtin (Member) left a comment

Awesome, looks good to me. Only two more comments; I can fix them during merge, up to you. :)

```diff
@@ -561,6 +561,7 @@ The following optimizers can be used with differentiable functions:
 - [Big Batch SGD](#big-batch-sgd)
 - [IQN](#iqn)
 - [Katyusha](#katyusha)
+- [Lookahead](#lookahead)
```

@rcurtin (Member) commented Dec 24, 2019

Just noticed this; I can fix it during merge. The sentence before this list says "differentiable functions" but it should say "differentiable separable functions".

@zoq (Member, Author) commented Dec 24, 2019

Nice catch, done.


zoq added 2 commits Dec 24, 2019
…xIterations() and update the function types documentation.
@mlpack-bot (bot) approved these changes Dec 25, 2019

Second approval provided automatically after 24 hours. 👍

@rcurtin rcurtin merged commit 22269e4 into mlpack:master Dec 25, 2019
3 checks passed:
- Static Code Analysis Checks: Build finished.
- continuous-integration/travis-ci/pr: The Travis CI build passed.
- mlpack master build test: Build finished.

@rcurtin (Member) commented Dec 25, 2019

Awesome, I'll release a new version of ensmallen with this and #137 now. :)
