[Bugfix] Add LightningOptimizer parity test and resolve AMP bug #5191

Merged: 59 commits, Dec 23, 2020

@tchaton tchaton commented Dec 19, 2020

What does this PR do?

Fixes #5165
Fixes #5159

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together? Otherwise, we ask you to create a separate PR for every change.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?
  • Did you verify new and existing tests pass locally with your changes?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

PR review

Anyone in the community is free to review the PR once the tests have passed.
Before you start reviewing, make sure you have read the Review guidelines. In short, check the following:

  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified; bug fixes should be included in bug-fix release milestones (m.f.X) and features should be included in (m.X.b) releases.

Did you have fun?

Make sure you had fun coding 🙃

@tchaton tchaton added this to the 1.1.x milestone Dec 19, 2020
@tchaton tchaton changed the title from "Bugfix/5165 enable pl optimizer refactor" to "[Bugfix] Add LightningOptimizer parity test and resolve AMP bug" Dec 19, 2020
@tchaton tchaton added the labels "priority: 0" (High priority task) and "bug" (Something isn't working) Dec 19, 2020
@Borda Borda mentioned this pull request Dec 20, 2020
11 tasks
pytorch_lightning/core/lightning.py (outdated)
@@ -489,6 +489,10 @@ def optimizer_step(self, optimizer, opt_idx, batch_idx, train_step_and_backward_
                'native PyTorch amp and lbfgs are not compatible.'
                ' To request, please file a Github issue in PyTorch and tag @mcarilli')

        if not isinstance(optimizer, LightningOptimizer):
            # wraps into LightningOptimizer only for running step
            optimizer = LightningOptimizer.to_lightning_optimizer(optimizer, self.trainer)

@awaelchli awaelchli Dec 20, 2020

@tchaton you can move the if statement inside the to_lightning_optimizer function to achieve idempotence and avoid code duplication.
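
The suggestion above can be sketched as follows. This is a minimal illustration, not the code merged in this PR: `LightningOptimizerSketch` and `PlainOptimizer` are hypothetical stand-ins, and only the `to_lightning_optimizer` name and its `(optimizer, trainer)` arguments come from the diff. Moving the `isinstance` check into the conversion function makes repeated calls harmless, so call sites no longer need their own guard.

```python
class LightningOptimizerSketch:
    """Hypothetical stand-in for LightningOptimizer (not the library class)."""

    def __init__(self, optimizer, trainer=None):
        self._optimizer = optimizer
        self._trainer = trainer

    @classmethod
    def to_lightning_optimizer(cls, optimizer, trainer=None):
        # Already wrapped: return it unchanged, so calling this twice
        # never produces a wrapper around a wrapper.
        if isinstance(optimizer, cls):
            return optimizer
        return cls(optimizer, trainer)


class PlainOptimizer:
    """Placeholder for a raw optimizer such as torch.optim.SGD."""


raw = PlainOptimizer()
once = LightningOptimizerSketch.to_lightning_optimizer(raw)
twice = LightningOptimizerSketch.to_lightning_optimizer(once)
```

With the check inside the function, a call site such as `optimizer_step` could call the conversion unconditionally.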

@awaelchli awaelchli left a comment

the tests are a bit complex, I don't really understand them but I was able to verify that they fail on the master branch with the wrong loss values so I suppose the scaling bug is fixed 👍

@tchaton tchaton enabled auto-merge (squash) December 22, 2020 08:21

tchaton commented Dec 22, 2020

> the tests are a bit complex, I don't really understand them but I was able to verify that they fail on the master branch with the wrong loss values so I suppose the scaling bug is fixed 👍

The tests perform a parity comparison against vanilla training in several scenarios and make sure everything matches when using enable_pl_optimizer=True/False. We will need to add more parity tests, such as for Apex, DDP modes, and multiple optimizers.

Best,
T.C
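
The parity idea described in this comment can be sketched in plain Python. This is an illustration under stated assumptions, not the tests added in this PR: `ScalarSGD`, `WrappedOptimizer`, and `train` are hypothetical names, and the `use_wrapper` flag only plays the role that `enable_pl_optimizer=True/False` plays in the real tests. The same seeded training run is executed with and without an optimizer wrapper, and the final weights and per-step losses are required to match exactly.

```python
import random


class ScalarSGD:
    """Minimal SGD over one-element list parameters (illustrative only)."""

    def __init__(self, params, lr):
        self.params = params
        self.lr = lr

    def step(self, grads):
        for param, grad in zip(self.params, grads):
            param[0] -= self.lr * grad


class WrappedOptimizer:
    """Hypothetical wrapper that only delegates to the inner optimizer."""

    def __init__(self, optimizer):
        self._optimizer = optimizer

    def step(self, grads):
        self._optimizer.step(grads)


def train(use_wrapper, steps=20, seed=0):
    # Fit y = 2x + 1 with a scalar linear model; the seed fixes the data.
    rng = random.Random(seed)
    w, b = [0.0], [0.0]
    optimizer = ScalarSGD([w, b], lr=0.1)
    if use_wrapper:
        optimizer = WrappedOptimizer(optimizer)
    losses = []
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x + 1.0
        err = (w[0] * x + b[0]) - y
        losses.append(err * err)
        # Gradients of the squared error with respect to w and b.
        optimizer.step([2.0 * err * x, 2.0 * err])
    return w[0], b[0], losses


vanilla = train(use_wrapper=False)
wrapped = train(use_wrapper=True)
# Parity: the wrapper must not change the training trajectory.
assert vanilla == wrapped
```

A wrapper that altered the update, for example by rescaling the gradients, would make the final assertion fail, which is the kind of divergence a parity test is meant to catch.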

@tchaton tchaton enabled auto-merge (squash) December 23, 2020 07:50
@tchaton tchaton merged commit ae04311 into master Dec 23, 2020
@Borda Borda deleted the bugfix/5165_enable_pl_optimizer_refactor branch December 23, 2020 22:15
Labels: bug (Something isn't working), priority: 0 (High priority task)
6 participants