Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove NaNs from loss in LRFinder #1862

Merged
merged 5 commits into from
May 19, 2020
Merged

Remove NaNs from loss in LRFinder #1862

merged 5 commits into from
May 19, 2020

Conversation

rohitgr7
Copy link
Contributor

@rohitgr7 rohitgr7 commented May 17, 2020

Before submitting

  • Was this discussed/approved via a Github issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

What does this PR do?

Fixes #1850.

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in Github issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 馃檭

@mergify mergify bot requested a review from a team May 17, 2020 13:10
@mergify mergify bot requested a review from a team May 17, 2020 13:36
@rohitgr7 rohitgr7 requested review from awaelchli and removed request for a team May 17, 2020 14:36
@mergify mergify bot requested a review from a team May 17, 2020 14:36
@williamFalcon williamFalcon changed the title Remove NaNs from loss in LRFinder WIP Remove NaNs from loss in LRFinder May 17, 2020
Copy link
Member

@awaelchli awaelchli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I wouldn't know how to test this however. Which kind of models have this behaviour?
Let's also add a changelog entry?

@mergify mergify bot requested a review from a team May 17, 2020 22:35
@awaelchli
Copy link
Member

notifiying also @SkafteNicki

@SkafteNicki
Copy link
Member

Nan can happens at the very end of the search if the loss/gradient explodes too much before we stop the search. I think we could add a simple test, something like

lr_finder = trainer.lr_find(...)
lr_before_nan = lr_finder.suggestion()
lr_finder.results['loss'][-1] = np.nan
lr_after_nan = lr_finder.suggestion()
assert lr_before_nan == lr_after_nan

Do simple run, run the suggestion algorithm, introduce a nan artificially, recompute suggestion. If this works the two suggestions should be the same.

@Borda
Copy link
Member

Borda commented May 18, 2020

just a note for speed, could we do the filtering in PyTorch not converting to NumPy?

@SkafteNicki
Copy link
Member

@Borda the values are stored as a list (of floats), so i guess that the penalty of converting to numpy vs pytorch is small

@williamFalcon
Copy link
Contributor

the conversion is at the full end of the lr finder and before training begins no?
this means the cost is negligible unless i'm missing something

@williamFalcon
Copy link
Contributor

Is this ready?

@rohitgr7
Copy link
Contributor Author

Is this ready?

I will add the test case now.

@Borda Borda added the feature Is an improvement or enhancement label May 18, 2020
@mergify
Copy link
Contributor

mergify bot commented May 18, 2020

This pull request is now in conflict... :(

@rohitgr7 rohitgr7 requested review from awaelchli and removed request for a team May 18, 2020 17:32
@mergify mergify bot requested a review from a team May 18, 2020 17:32
Copy link
Member

@justusschock justusschock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SkafteNicki
https://github.com/rohitgr7/pytorch-lightning/blob/fix_lr_finder/pytorch_lightning/trainer/lr_finder.py#L383 still introduces gpu_syncs. Could the same behavior be achieved with tensors as well to avoid gpu-syncs? (can probably be neglected, just asking, while we're on it :))

Otherwise this looks fine to me :)

@mergify mergify bot requested a review from a team May 19, 2020 05:54
@Borda Borda changed the title WIP Remove NaNs from loss in LRFinder Remove NaNs from loss in LRFinder May 19, 2020
@Borda Borda merged commit ac76dfc into Lightning-AI:master May 19, 2020
@rohitgr7 rohitgr7 deleted the fix_lr_finder branch May 19, 2020 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

lr_find doesn't return the correct suggestion if some losses are nan
6 participants