Learning rate scheduler should step after each optimizer step #1476

Closed
rmrao opened this issue Apr 13, 2020 · 3 comments · Fixed by #1477
Labels: bug (Something isn't working), help wanted (Open to be worked on)

Comments

rmrao (Contributor) commented Apr 13, 2020

🐛 Bug

I'm not sure whether this is a bug or a deliberate design decision, but right now the learning rate scheduler gets updated at every "step", which actually corresponds to every forward pass. A more standard implementation would have the scheduler's "step" interval correspond to every optimizer step. This has caused me a lot of instability problems, because I did not realize that a standard learning rate warmup of, say, 16000 steps would actually only warm up for 1000 steps when I set accumulate_grad_batches=16.
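
To make the interaction concrete, here is a minimal plain-PyTorch sketch (not Lightning's internal code; the model, data, and loop below are just placeholders) of a linear warmup stepped once per optimizer step under gradient accumulation. The 16000-step warmup and accumulate_grad_batches=16 values are the ones from this report; stepping the scheduler once per forward pass instead (the commented-out call) would finish the warmup after only 1000 optimizer steps.

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Linear warmup over 16000 *optimizer* steps (values from the issue text).
warmup_steps = 16000
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps)
)

accumulate_grad_batches = 16

for batch_idx in range(160):  # toy loop; real training runs much longer
    x = torch.randn(4, 10)
    loss = model(x).pow(2).mean() / accumulate_grad_batches
    loss.backward()

    if (batch_idx + 1) % accumulate_grad_batches == 0:
        optimizer.step()
        optimizer.zero_grad()
        scheduler.step()  # expected: one scheduler step per optimizer step

    # scheduler.step()    # stepping here instead -- once per forward pass -- is
    #                     # effectively what the Trainer does today, so the
    #                     # 16000-step warmup ends after only 1000 optimizer steps
```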

williamFalcon (Contributor) commented:
Good point. It should be every backward pass, as you mention.

williamFalcon (Contributor) commented:
Mind submitting a PR?

rmrao (Contributor, Author) commented Apr 13, 2020

Sure, will do.
