Update learning rate exponentially and maybe, add smoothing to loss plot #85
Conversation
In the meantime, I've noticed something else: the linear update to the learning rate means that we have very little representation of low learning rates, and vice versa. After changing to a log-spaced (exponential) update (again, following Sylvain Gugger's post https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html), it looks like this:
For a simple convnet on MNIST, I now have the following plots:
Actually, comparing both plots now, the smoothing looks less important. So if you prefer, we can also just update the logic (today's commit), and forget about the smoothing.
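For illustration only (a sketch with made-up range and step count, not the code from today's commit), the difference between a linear and a log-spaced sequence of candidate learning rates can be seen directly:

```r
# With a linear update, most candidate values cluster at the high end of the
# range; with a log-spaced (exponential) update, every order of magnitude
# gets the same number of points. Values here are hypothetical.
lr_min <- 1e-7
lr_max <- 1e-1
n_steps <- 100

# Linear spacing: only a handful of points fall below, say, 1e-2.
lr_linear <- seq(lr_min, lr_max, length.out = n_steps)

# Log spacing: each step multiplies the learning rate by a constant factor.
lr_log <- exp(seq(log(lr_min), log(lr_max), length.out = n_steps))

# Compare how many candidates lie below 1e-2 under each scheme.
sum(lr_linear < 1e-2)  # very few
sum(lr_log < 1e-2)     # the large majority
```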
Hi @dfalbel as discussed :-)
Here are examples of how the plot looks now - one for uniform splits, one for log-spaced splits. What do you think?
First proposal of implementing a smoothed display of the loss, as done here: https://sgugger.github.io/how-do-you-find-a-good-learning-rate.html and (1:1 translation to R) here: https://blogs.rstudio.com/ai/posts/2020-10-19-torch-image-classification/.
From practical experience, this should be more helpful to most users.
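To make the smoothing concrete, here is a minimal sketch of the computation described in Sylvain Gugger's post: an exponentially weighted moving average of the loss with bias correction. The standalone function and its argument names are placeholders of mine, not the code added to `plot.lr_records`:

```r
# Exponentially weighted moving average of the recorded losses.
# Larger beta gives a smoother but more lagged curve.
smooth_losses <- function(losses, beta = 0.98) {
  avg <- 0
  smoothed <- numeric(length(losses))
  for (i in seq_along(losses)) {
    avg <- beta * avg + (1 - beta) * losses[i]
    # Bias correction so early values are not pulled toward zero.
    smoothed[i] <- avg / (1 - beta^i)
  }
  smoothed
}

# Example: smooth a noisy, roughly decreasing loss curve.
set.seed(1)
raw <- exp(-seq(0, 3, length.out = 200)) + rnorm(200, sd = 0.05)
plot(raw, type = "l", col = "grey")
lines(smooth_losses(raw), col = "red", lwd = 2)
```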
Implementation-wise, it seems like it has to go into `plot.lr_records`, in which case I don't really know how we would want to make it configurable (in a useful way) ... For now, just so one can compare, I've added arguments `smoothed_loss` and `beta` to this method. Here are example plots of smoothed vs. non-smoothed loss:
What do you think Daniel?