
Move NaN/Inf detection to a separate utilities file #6834

Merged · 17 commits · Apr 8, 2021

Conversation

ananthsub
Contributor

@ananthsub ananthsub commented Apr 5, 2021

What does this PR do?

Fixes #6815

Before submitting

  • Was this discussed/approved via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or internal minor changes/refactorings)

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, see the following checklist:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

Did you have fun?

Make sure you had fun coding 🙃

@pep8speaks

pep8speaks commented Apr 5, 2021

Hello @ananthsub! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-04-08 16:42:06 UTC

@codecov

codecov bot commented Apr 5, 2021

Codecov Report

Merging #6834 (d34889b) into master (22a266d) will decrease coverage by 7%.
The diff coverage is 96%.

❗ Current head d34889b differs from the pull request's most recent head b7f4229. Consider uploading reports for commit b7f4229 to get more accurate results.

@@           Coverage Diff            @@
##           master   #6834     +/-   ##
========================================
- Coverage      91%     84%     -7%     
========================================
  Files         192     194      +2     
  Lines       12191   13129    +938     
========================================
- Hits        11145   11041    -104     
- Misses       1046    2088   +1042     

@ananthsub ananthsub changed the title [WIP] Move NaN/Inf detection to a separate utilities file Move NaN/Inf detection to a separate utilities file Apr 5, 2021
@ananthsub ananthsub marked this pull request as ready for review April 5, 2021 09:34
Contributor

@awaelchli awaelchli left a comment


Maybe add a TODO on top of the TrainingTricksMixin class noting that it should be removed in v1.5?

@awaelchli awaelchli added this to the 1.3 milestone Apr 5, 2021
pytorch_lightning/trainer/training_tricks.py (outdated, resolved)
log = logging.getLogger(__name__)


def print_nan_gradients(model: nn.Module) -> None:
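For context, the `print_nan_gradients` utility being moved walks a model's parameters and logs any whose gradients contain NaNs. A minimal, torch-free sketch of the same logic (the `named_grads` mapping and both function names here are hypothetical stand-ins, not the Lightning API):

```python
import math


def nan_gradient_params(named_grads):
    """Return the names of parameters whose gradients contain a NaN.

    `named_grads` is a hypothetical mapping of parameter name to a flat
    list of gradient values (or None), standing in for iterating
    model.named_parameters() and inspecting each param.grad.
    """
    return [
        name
        for name, grad in named_grads.items()
        if grad is not None and any(math.isnan(g) for g in grad)
    ]


def print_nan_gradients(named_grads):
    # Mirror of the utility's behavior: log each offending parameter.
    for name in nan_gradient_params(named_grads):
        print(f"{name} grad contains NaN: {named_grads[name]}")
```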
Contributor

What do you think of these improvements? (my code)

https://github.com/jpuigcerver/PyLaia/blob/master/laia/utils/checks.py

  • Optional exception
  • Include non finite percentages
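The two suggested improvements above (an optional exception and reporting non-finite percentages) can be sketched as follows; this is a hedged, torch-free illustration, and the function and parameter names are hypothetical rather than taken from PyLaia or Lightning:

```python
import math


def check_finite(name, values, raise_on_nonfinite=False):
    """Report the percentage of non-finite (NaN/Inf) entries in `values`.

    Returns the percentage; optionally raises instead of just logging,
    which lets callers opt in to hard failure on unstable losses.
    """
    total = len(values)
    bad = sum(1 for v in values if not math.isfinite(v))
    pct = 100.0 * bad / total if total else 0.0
    if bad:
        msg = f"{name}: {bad}/{total} ({pct:.1f}%) non-finite values"
        if raise_on_nonfinite:
            raise ValueError(msg)
        print(msg)
    return pct


# prints "loss: 2/4 (50.0%) non-finite values" and returns 50.0
check_finite("loss", [1.0, float("inf"), float("nan"), 2.0])
```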

Contributor

It could be useful for some users with unstable losses. Worth considering.

Contributor Author

I think it'd be useful! Maybe we can add it in a follow-up PR? This one is mainly moving code around for parity.

Contributor

If you prefer, sure

Contributor

@tchaton tchaton left a comment

LGTM!


pytorch_lightning/trainer/training_loop.py (outdated, resolved)
pytorch_lightning/trainer/training_tricks.py (outdated, resolved)
pytorch_lightning/utilities/nan.py (outdated, resolved)
ananthsub and others added 3 commits April 7, 2021 12:29
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

[trainer] Simplify Trainer dependencies by making TrainerTrainingTricksMixin a utils class
6 participants