LR too high for gradient accumulation #3040

Merged
jph00 merged 1 commit into fastai:master on Nov 26, 2020

Conversation

marii-moe
Collaborator

We were not dividing by the number of batches to accumulate, so this was effectively increasing the learning rate. Added a test to make sure this stays fixed. I think this thread was lost when fastai/fastai2 got moved to fastai/fastai: fastai/fastai2#194

fixes #3023
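
A minimal sketch of what the fix amounts to, written as a plain-PyTorch training loop rather than the actual fastai callback (`model`, `opt`, `loss_fn`, and `batches` are placeholder names): if the loss (or the accumulated gradient) is not divided by the number of accumulated batches, each optimizer step sees a gradient that is `n_acc` times larger, which is equivalent to multiplying the learning rate by `n_acc`.

```python
# Illustrative gradient accumulation with the loss scaled down by the number
# of accumulated mini-batches, so the summed gradient (and therefore the
# effective learning rate) matches a single large batch.
# NOTE: `model`, `opt`, `loss_fn`, and `batches` are assumed to exist;
# this is a sketch of the idea, not fastai's GradientAccumulation callback.
n_acc = 4  # mini-batches to accumulate before each optimizer step

opt.zero_grad()
for i, (xb, yb) in enumerate(batches):
    # Without this division the gradient per step is n_acc times too large,
    # i.e. the effective LR is n_acc times higher than intended.
    loss = loss_fn(model(xb), yb) / n_acc
    loss.backward()                    # gradients accumulate in .grad across mini-batches
    if (i + 1) % n_acc == 0:
        opt.step()                     # one weight update per n_acc mini-batches
        opt.zero_grad()
```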

@jph00
Member

jph00 commented Nov 26, 2020

Nice one!

@jph00 jph00 merged commit 35e5303 into fastai:master Nov 26, 2020
@jph00 jph00 added the bug label Nov 26, 2020
@jph00 jph00 changed the title from "Fixed lr too high for gradient accumulation fixed #3023" to "LR too high for gradient accumulation" Nov 26, 2020
@muellerzr
Contributor

muellerzr commented Nov 26, 2020

I know this is fixed and merged now, but I'm hesitant to use it. I was seeing better results with the old implementation's effectively higher LR and got worse results here. Do you have any advice @marii-moe as to how I should adapt my old LR to work with this new adjustment?

@marii-moe
Collaborator Author

> I know this is fixed and merged now, but I'm hesitant to use it. I was seeing better results with the old implementation's effectively higher LR and got worse results here. Do you have any advice @marii-moe as to how I should adapt my old LR to work with this new adjustment?

I am not familiar with your particular example, but you should get approximately the same results if you set your learning rate like so:
new_lr = old_lr*n_acc/bs
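
A quick worked example of that conversion with made-up numbers, assuming `n_acc` counts accumulated items (fastai's GradientAccumulation convention) and `bs` is the batch size, so `n_acc/bs` is the number of accumulated batches the old code effectively scaled the LR by:

```python
# Hypothetical values for illustration only.
old_lr = 1e-3        # LR tuned against the old (pre-fix) behaviour
n_acc, bs = 128, 32  # accumulate 128 items with batch size 32 -> 4 batches per step
new_lr = old_lr * n_acc / bs   # 1e-3 * 128 / 32 = 4e-3
```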

Successfully merging this pull request may close these issues.

Gradient Accumulation causes lower lr to be same as non-gradient accumulation