Add "plateau" LR policy #4606
base: master
Conversation
Force-pushed from 024fcb0 to f22fdb4
This method has been used in experiments for the following article.
@@ -181,6 +183,8 @@ message SolverParameter {
  optional int32 stepsize = 13;
  // the stepsize for learning rate policy "multistep"
  repeated int32 stepvalue = 34;
  // the stepsize for learning rate policy "plateau"
  repeated int32 plateau_winsize = 41;
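Based on the diff above and the PR description, a solver configuration using this policy might look like the following sketch. The parameter names come from the PR's diff; the numeric values are purely illustrative.

```
# Hypothetical solver.prototxt fragment for the "plateau" policy.
# plateau_winsize is a repeated field, so several window sizes may be listed,
# one per successive LR drop (just like stepvalue for "multistep").
base_lr: 0.1
lr_policy: "plateau"
gamma: 0.1
plateau_winsize: 10000
plateau_winsize: 5000
```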
Why not just re-use `stepsize` and/or `stepvalue`?

Also, did you mean to use `plateau_winsize` here? Your PR body mentions `plateau_stepsize`.
Thank you for your comment. I just updated the PR body.

Regarding `plateau_winsize`: I agree that it would not be a bad idea to re-use `stepvalue` or `stepsize`. However, those specify the exact number of iterations between LR drops, while `plateau_winsize` is more like a minimum bound on the number of iterations between two consecutive LR drops. I thought sharing one parameter name might cause confusion.
Force-pushed from ed02f67 to 7d57157
Applied changes from the upstream.
Hi,

Hi @JM-MP
Adds plateau LR policy to solver
Is this only supported in single-GPU mode? In multi-GPU mode,
This PR adds a new LR policy called "plateau".

It is believed that as long as the loss keeps decreasing, it is better to keep the higher LR, and this policy helps you do that without repeated trials or continuous monitoring.

With this policy, the LR is lowered when the minimum loss has not been updated for a certain number of iterations (plateau_winsize). Conversely, the LR never drops as long as the loss keeps decreasing.

You should set one or more window sizes (just like with the "multistep" policy) in solver.prototxt.
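The behavior described above can be sketched in Python. This is a minimal illustration of the idea, not Caffe's actual C++ implementation; the function name and the single fixed `plateau_winsize` (rather than a list of window sizes) are simplifications for clarity.

```python
def plateau_lr_schedule(losses, base_lr=0.1, gamma=0.1, plateau_winsize=3):
    """Return the LR used at each iteration, given the loss history.

    The LR is multiplied by gamma only when the minimum loss has not
    improved for plateau_winsize consecutive iterations; while the loss
    keeps reaching new minima, the LR never drops.
    """
    lr = base_lr
    best_loss = float("inf")
    iters_since_best = 0
    lrs = []
    for loss in losses:
        if loss < best_loss:
            best_loss = loss          # new minimum: reset the window
            iters_since_best = 0
        else:
            iters_since_best += 1
            if iters_since_best >= plateau_winsize:
                lr *= gamma           # plateau detected: lower the LR
                iters_since_best = 0
        lrs.append(lr)
    return lrs

# Loss keeps improving: the LR is never lowered.
print(plateau_lr_schedule([5, 4, 3, 2, 1]))
# Loss stalls for plateau_winsize iterations: the LR drops by gamma.
print(plateau_lr_schedule([5, 4, 4, 4, 4]))
```

Note that unlike "step" or "multistep", the drop points here depend on the observed losses, which matches the discussion above about `plateau_winsize` being a lower bound on the gap between drops rather than an exact schedule.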