Do a couple of runs with a constant learning rate, varied systematically between runs, and find out whether there is some learning rate that consistently reduces the loss. After that we can experiment with more complicated schemes.
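The sweep proposed above could be sketched roughly as follows. This is a hypothetical illustration: the toy one-parameter objective and the candidate learning rates stand in for the real model and grid, which are not specified here.

```python
# Sketch of the proposed sweep: train with several constant learning
# rates and record which ones consistently reduce the loss.
# The toy objective (w - 3)^2 is a placeholder for the real model.

def train_constant_lr(lr, steps=100):
    """Plain gradient descent with a constant learning rate."""
    w = 0.0
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)   # gradient of (w - 3)^2
        w -= lr * grad
    return (w - 3.0) ** 2        # final loss

# Vary the learning rate systematically between runs.
results = {lr: train_constant_lr(lr) for lr in (1e-3, 1e-2, 1e-1, 1.0)}
for lr, loss in sorted(results.items()):
    print(f"lr={lr:g}  final loss={loss:.3e}")
```

Even this toy version shows the expected pattern: too small a rate barely moves the loss, too large a rate oscillates, and a middle value converges.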
I started a run, effortless-cosmos-19, using the cyclic scheduler with the maximum and minimum learning rates set to values that may be more sensible for this model (based on this article). So far it is at least outperforming all previous runs. I will let it run until Monday 8 a.m. (by my estimate) and then we'll see how it went.
One interesting feature already visible is the correlation between the periodic pattern in the loss and the learning rate. As the article also suggests, this might mean that some kind of decreasing maximum would help press down the big (seemingly useless) peaks in the loss and make it converge even faster.
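The "decreasing maximum" idea could look something like the schedule below: a triangular cycle whose peak decays each cycle. All constants here (`base_lr`, `max_lr`, `cycle_len`, `decay`) are placeholder assumptions, not the values used in the run.

```python
def cyclic_lr(step, base_lr=1e-4, max_lr=1e-2, cycle_len=2000, decay=0.5):
    """Triangular cyclic schedule whose maximum decays each cycle.

    The learning rate rises linearly from base_lr to the current peak
    over the first half of a cycle, then falls back; the peak itself
    shrinks by `decay` every cycle, damping the loss spikes at the top.
    """
    cycle = step // cycle_len
    pos = (step % cycle_len) / cycle_len          # position in cycle, 0..1
    peak = base_lr + (max_lr - base_lr) * decay ** cycle
    if pos < 0.5:
        return base_lr + (peak - base_lr) * (2 * pos)
    return peak - (peak - base_lr) * (2 * (pos - 0.5))
```

With these numbers the first cycle peaks at 1e-2 and the second at roughly half that, so the periodic loss spikes should shrink cycle by cycle.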