Support None from training_step in LRFinder #18129
Conversation
Looks great! Only minor comments :)
cb0b3e2 to f7e76a9
Lightning modules may return None from the training_step in order to skip the current batch when using automatic optimization. However, the learning rate finder failed in that case. This fix
* modifies the `_LRCallback.on_train_batch_end` method to support `None` from the training step by also handling an empty dict (since `_AutomaticOptimization.run` turns a `None` `STEP_OUTPUT` into an empty dict)
* makes sure that the lists of losses and lrs have the same length, as otherwise the selected learning rate would not be correct. If the LightningModule returns `None` in the training step, `nan` is added to the list of losses; NaN entries are ignored when computing the suggestion.
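The two points above can be sketched in a few lines. This is a minimal illustration of the fixed callback logic, not the actual Lightning source; the function signature and argument names here are assumptions made for the example.

```python
import math

def on_train_batch_end(losses, lrs, outputs, current_lr):
    """Sketch of the fixed _LRCallback behavior (illustrative only).

    `outputs` is an empty dict when training_step returned None, because
    automatic optimization converts a None STEP_OUTPUT into {}.
    """
    lrs.append(current_lr)                 # always record the lr, so that
    if not outputs:                        # when the batch is skipped we
        losses.append(math.nan)            # pad losses with NaN and keep
    else:                                  # both lists the same length
        losses.append(float(outputs["loss"]))
```

With one skipped batch and one regular batch, both lists end up the same length, and the skipped slot holds NaN instead of a bogus loss value.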
In the last build, pl-cpu (macOS-11, lightning, 3.10, 2.0), pl-cpu (ubuntu-20.04, lightning, 3.10, 2.0) and probot all failed simply with:
awaelchli left a comment
Thanks <3
for more information, see https://pre-commit.ci
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> (cherry picked from commit 28c401c)
What does this PR do?
Lightning modules may return None from the training_step in order to skip the current batch when using automatic optimization. However, the learning rate finder failed in that case.
This fix
* modifies the `_LRCallback.on_train_batch_end` method to support `None` from the training step by also handling an empty dict (since `_AutomaticOptimization.run` turns a `None` `STEP_OUTPUT` into an empty dict)
* makes sure that the lists of losses and lrs have the same length, as otherwise the selected learning rate would not be correct. If the LightningModule returns `None` in the training step, `nan` is added to the list of losses; NaN entries are ignored when computing the suggestion.

Fixes #17992
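The claim that NaN placeholders are safe rests on the suggestion step masking out non-finite losses before picking a learning rate. The toy function below illustrates that idea; it is a simplified stand-in, not Lightning's actual `_LRFinder.suggestion` implementation.

```python
import numpy as np

def suggest_lr(lrs, losses):
    """Toy suggestion: pick the lr at the steepest downward slope of the
    loss curve, after masking non-finite entries (illustrative only)."""
    lrs = np.asarray(lrs)
    losses = np.asarray(losses)
    mask = np.isfinite(losses)         # drops the NaN entries from skipped batches
    grad = np.gradient(losses[mask])   # slope of the (filtered) loss curve
    return lrs[mask][np.argmin(grad)]  # lr at the steepest descent
```

Because the mask is applied to both arrays, the NaN padding keeps `lrs` and `losses` aligned without ever influencing the selected learning rate.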
Before submitting
PR review
Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:
Reviewer checklist