Fix error in keras Learning Rate Scheduler #3135
Conversation
- When the LearningRateScheduleCallback is initialized with multiplier = constant, it does not decay the learning rate correctly: the learning rates are initial_lr * multiplier, initial_lr * multiplier, initial_lr * multiplier, ..., whereas the expectation is that they decay as initial_lr, initial_lr * multiplier, initial_lr * multiplier^2, and so on.
- Another nit: when multiplier = constant, there is no need to enforce staircase = True. The change made to fix the above mistake handles staircase = False as well.

Signed-off-by: Dinesh Ramasamy <89654805+iitmdinesh@users.noreply.github.com>
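The bug described above can be reproduced in isolation. This is a minimal sketch (the variable names are illustrative, not Horovod's actual internals) contrasting the buggy constant-multiplier lambda with the fixed exponential form:

```python
initial_lr = 0.1
multiplier = 0.5

# Buggy formulation: the lambda returns the same constant every epoch,
# so the learning rate is stuck at initial_lr * multiplier (no decay).
buggy = lambda epoch: multiplier

# Fixed formulation: exponentiating by the epoch index yields true
# exponential decay: initial_lr, initial_lr * m, initial_lr * m**2, ...
fixed = lambda epoch: multiplier ** epoch

buggy_lrs = [initial_lr * buggy(e) for e in range(4)]
fixed_lrs = [initial_lr * fixed(e) for e in range(4)]
print(buggy_lrs)  # [0.05, 0.05, 0.05, 0.05]
print(fixed_lrs)  # [0.1, 0.05, 0.025, 0.0125]
```

Note that the fixed form also recovers initial_lr itself at epoch 0, which the buggy form never does.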
if not callable(multiplier):
    self.staircase = True
Why remove it? Aren't we supposed to apply multiplier only once per epoch?
The idea is that we should not override what the user has configured. I can set staircase = True or staircase = False, and both work fine with this new formulation.
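The claim that both staircase settings work with the new formulation can be illustrated with a small sketch. This is a hypothetical simplification (Horovod's callback derives a possibly fractional epoch from batch progress; the function name here is made up), assuming staircase = True floors the epoch before exponentiation:

```python
import math

multiplier = 0.5

def lr_multiplier(epoch, staircase):
    # With multiplier ** epoch, both modes are meaningful:
    # staircase=True floors the (possibly fractional) epoch, so the
    # learning rate changes in discrete jumps once per epoch;
    # staircase=False uses the fractional epoch for smooth decay.
    e = math.floor(epoch) if staircase else epoch
    return multiplier ** e

print(lr_multiplier(1.5, staircase=True))   # 0.5
print(lr_multiplier(1.5, staircase=False))  # 0.5 ** 1.5, about 0.354
```

With the old constant formulation, staircase = False was meaningless (the multiplier was the same regardless of epoch), which is why the code forced staircase = True; the exponential formulation removes that need.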
-    self.staircase = True
-    self.multiplier = lambda epoch: multiplier
+    # If multiplier is a constant, it corresponds to exponential decay
+    self.multiplier = lambda epoch: multiplier ** epoch
This is a good catch.
In the future, instead of applying the multiplier each epoch, I think we should also add a "step" option so that we can change it every N steps.
Good point. That is, however, currently supported by passing in a callable instead of a number.
Unit Test Results (with flaky tests)
843 files ±0  843 suites ±0  6h 29m 2s ⏱️ ±0s
For more details on these failures, see this check.
Results for commit 719c495. ± Comparison against base commit 719c495.
♻️ This comment has been updated with latest results.
Checklist before submitting
Description
Fixes an issue identified at Uber, which uses this repository. If multiplier is set to a constant, we realized by looking at the learning rate logs that the learning rate remained constant, equal to initial_lr * multiplier.
Review process to land