
New lr policies, MultiStep and StepEarly #190

Merged · merged 1 commit into BVLC:dev on Oct 16, 2014

Conversation

@sguada (Contributor) commented Mar 6, 2014

  • MultiStep: see lenet_multistep_solver.prototxt
    Allows multiple steps to be defined in solver.prototxt by setting lr_policy: multistep and listing a stepvalue for each iteration at which the learning rate should be decreased. This makes unevenly spaced steps possible; the stepvalue entries should be given in increasing order. (A minimal config sketch follows this list.)
  • StepEarly: see lenet_stepearly_solver.prototxt
    Decreases the learning rate dynamically based on the behaviour of the test accuracy: the rate is decreased when the maximum accuracy has not increased for a number of test runs defined by stepearly.
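For illustration only (the values below are hypothetical, not copied from lenet_multistep_solver.prototxt), a solver.prototxt fragment using the multistep policy could look like this:

base_lr: 0.01
lr_policy: "multistep"
gamma: 0.1         # multiply the rate by gamma at each stepvalue
stepvalue: 5000    # stepvalue entries listed in increasing order
stepvalue: 7000
stepvalue: 8000
max_iter: 10000

With these settings the learning rate would drop by a factor of 10 at iterations 5000, 7000, and 8000, so the intervals between drops need not be equal.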

@shelhamer (Member)

Nice policies Sergio. Thanks for the examples. Could you also include tests?

Learning rate policies and termination criteria (#76) are both scheduled parts of the solver, and the conversation kind of stalled on the best way to add these. The options were observer/notify classes, coding them right into the solver, or making learning rate and termination factories like layer_factory.

I think refactoring to a LearningRateFactory could be nice and orderly, and then the solver would call the LearningRate for any updates. What do you think?
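To make that concrete, here is a purely hypothetical C++ sketch of such a factory; none of these class or function names exist in Caffe, they only illustrate the layer_factory analogy:

#include <cstddef>
#include <string>
#include <vector>

#include <boost/shared_ptr.hpp>

// Hypothetical interface: the solver would only talk to this base class.
template <typename Dtype>
class LearningRatePolicy {
 public:
  virtual ~LearningRatePolicy() {}
  // Learning rate to use at the given iteration.
  virtual Dtype GetRate(int iteration) const = 0;
};

// Hypothetical multistep policy: multiply the rate by gamma each time a
// stepvalue is passed, so the drops need not be evenly spaced.
template <typename Dtype>
class MultiStepPolicy : public LearningRatePolicy<Dtype> {
 public:
  MultiStepPolicy(Dtype base_lr, Dtype gamma, const std::vector<int>& steps)
      : base_lr_(base_lr), gamma_(gamma), steps_(steps) {}
  virtual Dtype GetRate(int iteration) const {
    Dtype rate = base_lr_;
    for (std::size_t i = 0; i < steps_.size(); ++i) {
      if (iteration >= steps_[i]) { rate *= gamma_; }
    }
    return rate;
  }
 private:
  Dtype base_lr_, gamma_;
  std::vector<int> steps_;
};

// Hypothetical factory, analogous to layer_factory: map the lr_policy
// string from the solver prototxt to a policy object.
template <typename Dtype>
boost::shared_ptr<LearningRatePolicy<Dtype> > CreateLearningRatePolicy(
    const std::string& policy, Dtype base_lr, Dtype gamma,
    const std::vector<int>& stepvalues) {
  if (policy == "multistep") {
    return boost::shared_ptr<LearningRatePolicy<Dtype> >(
        new MultiStepPolicy<Dtype>(base_lr, gamma, stepvalues));
  }
  // fixed, step, exp, inv, ... would be dispatched here as well.
  return boost::shared_ptr<LearningRatePolicy<Dtype> >();
}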

Re: naming, StepPlateau or StepFlat might be more descriptive than StepEarly. Or as you suggested elsewhere, EarlyStep has a nice relationship to early stopping.

@kloudkl (Contributor) commented Mar 11, 2014

@shelhamer, I agree with you, since this PR increases the number of learning rate policies to @Yangqing's refactoring threshold.

I will use AdaptiveLearningRateFactory and AdaptiveLearningRate when I get the time to solve #30. AdaptiveLearningRate cannot be mixed with LearningRate because of the different APIs.

// Declarations only; shared_ptr is boost::shared_ptr and Blob is caffe::Blob.
#include <boost/shared_ptr.hpp>

#include "caffe/blob.hpp"

using boost::shared_ptr;
using caffe::Blob;

template <typename Dtype>
class LearningRate {
 public:
  // Returns the global learning rate for the given iteration.
  Dtype schedule(const int iteration);
};

template <typename Dtype>
class AdaptiveLearningRate {
 public:
  // Returns a parameter-wise learning rate, shaped like the gradient blob.
  shared_ptr<Blob<Dtype> > schedule(const int iteration,
                                    const shared_ptr<Blob<Dtype> > gradient);
};

@tdomhan (Contributor) commented Mar 15, 2014

Having a multistep decrease is definitely useful. The only thing I'd like to add is that having an unlimited number of steps makes parametrizing Caffe more difficult. (The reason I bring this up is that I am running hyperparameter optimization on Caffe.) So maybe instead of having to set each step, the step size could, just like the learning rate, follow a parametric function, e.g. decay exponentially or linearly. Let me know what you think.
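As a hypothetical illustration of that suggestion (not part of this PR), the stepvalue list could itself be generated from a couple of parameters, with the gap between steps following a geometric progression:

#include <vector>

// Hypothetical helper: generate step positions whose spacing follows a
// geometric sequence, so only first_gap, gap_factor, and a count are tuned.
std::vector<int> GenerateSteps(int first_gap, double gap_factor, int num_steps) {
  std::vector<int> steps;
  double gap = first_gap;
  double position = 0.0;
  for (int i = 0; i < num_steps; ++i) {
    position += gap;
    steps.push_back(static_cast<int>(position));
    gap *= gap_factor;  // gap_factor < 1: gaps shrink; > 1: gaps grow
  }
  return steps;
}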

@sguada (Contributor, Author) commented Mar 16, 2014

@tdomhan I will first fix the current new_lr_decay policies and will add more later.

@shelhamer (Member)

@sguada it'd be great to include these policies, and multistep would simplify the cifar-10 example.

@beniz commented Oct 9, 2014

Hi Caffe! Just a heads up to say that I tried the merge on my own repo and ran the tests just as Travis does, and I am not getting any errors.

FYI Travis reports this after all tests are successful:
src/caffe/solver.cpp:392: Line ends in whitespace. Consider deleting these extra spaces. [whitespace/end_of_line] [4]

Conflicts:
	include/caffe/solver.hpp
	src/caffe/proto/caffe.proto
	src/caffe/solver.cpp
@wendlerc

When will this commit be available approximately?

sguada added a commit that referenced this pull request Oct 16, 2014
New lr policies, MultiStep and StepEarly
@sguada sguada merged commit bdd0a00 into BVLC:dev Oct 16, 2014
@sguada (Contributor, Author) commented Oct 16, 2014

@Mezn it is available, let me know if you have any problems.

@beniz commented Oct 16, 2014

@sguada my understanding is that stepearly is not part of the commit. Also, the *.prototxt files for mnist are in examples/lenet instead of examples/mnist. My 2 cents :) and thanks for this.

@sguada (Contributor, Author) commented Oct 16, 2014

@beniz I removed stepearly, since it was complicating the solver now that there are many outputs during test.
The examples for mnist should be back in examples/mnist; fixed after #1293 and #1308.

@beniz commented Oct 16, 2014

@sguada thanks for the explanations. I'm into stochastic optimization, and I'd be interested in looking at the old stepearly code. FYI, I am experimenting with a 'stagnation' policy relying on the median losses and/or tests in order to speed up the overall training time.

RazvanRanca pushed a commit to RazvanRanca/caffe that referenced this pull request Nov 4, 2014
New lr policies, MultiStep and StepEarly
@sguada sguada mentioned this pull request Dec 21, 2014
@ronghanghu (Member)

Let's remove examples/lenet/lenet_stepearly_solver.prototxt.

ronghanghu added a commit to ronghanghu/caffe that referenced this pull request Nov 26, 2015
This `examples/lenet/lenet_stepearly_solver.prototxt` was introduced in BVLC#190 by mistake, since stepearly was never actually merged.
lukeyeager added a commit to lukeyeager/caffe that referenced this pull request Aug 15, 2016
Fix Python installation with CMake install target
acmiyaguchi pushed a commit to acmiyaguchi/caffe that referenced this pull request Nov 13, 2017
This `examples/lenet/lenet_stepearly_solver.prototxt` was introduced in BVLC#190 by mistake, since stepearly was never actually merged.