Explicitly enable grad in closure #18268

0x404 · 2023-08-10T01:57:54Z

What does this PR do?

Some third-party optimizers (such as Hugging Face's AdamW) decorate their optimizer.step with torch.no_grad. Lightning, however, wraps hooks such as training_step and backward in a closure that is executed inside optimizer.step. As a result, any computation in training_step that needs gradients runs with gradient tracking disabled.

To address this, Closure.closure is now decorated with torch.enable_grad. Gradients are computed correctly inside optimizer.step, while the rest of the optimizer's step still runs under the no_grad context.
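The interaction can be reproduced outside Lightning with a minimal sketch. `NoGradSGD` and `make_closure` are hypothetical names made up for this illustration and are not part of Lightning or Hugging Face; the sketch only assumes that `torch.no_grad` and `torch.enable_grad` can be used as decorators, which is how the description above says they are applied.

```python
import torch


class NoGradSGD(torch.optim.Optimizer):
    """Hypothetical optimizer whose whole step(), closure included, runs under no_grad."""

    def __init__(self, params, lr=0.1):
        super().__init__(params, defaults={"lr": lr})

    @torch.no_grad()
    def step(self, closure=None):
        # The closure is called while gradient tracking is disabled.
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None:
                    p.add_(p.grad, alpha=-group["lr"])
        return loss


def make_closure(model, optimizer, batch, enable_grad):
    """Build a forward/backward closure, optionally wrapped in torch.enable_grad."""

    def closure():
        optimizer.zero_grad()
        loss = model(batch).sum()  # stands in for training_step
        loss.backward()            # stands in for the backward hook
        return loss

    # Wrapping re-enables gradient tracking only for the duration of the closure.
    return torch.enable_grad()(closure) if enable_grad else closure


model = torch.nn.Linear(2, 1)
optimizer = NoGradSGD(model.parameters())
batch = torch.ones(4, 2)

try:
    optimizer.step(make_closure(model, optimizer, batch, enable_grad=False))
except RuntimeError as err:
    print("without enable_grad:", err)

loss = optimizer.step(make_closure(model, optimizer, batch, enable_grad=True))
print("with enable_grad, loss.requires_grad =", loss.requires_grad)
```

Wrapping only the closure keeps the parameter update itself under no_grad, which is the behaviour the optimizer author intended.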

Fixes #18254

Before submitting

Was this discussed/agreed via a GitHub issue? (not for typos and docs)
Did you read the contributor guideline, Pull Request section?
Did you make sure your PR does only one thing, instead of bundling different changes together?
Did you make sure to update the documentation with your changes? (if necessary)
Did you write any new necessary tests? (not for typos and docs)
Did you verify new and existing tests pass locally with your changes?
Did you list all the breaking changes introduced by this pull request?
Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist

Is this pull request ready for review? (if not, please submit in draft mode)
Check that all items from Before submitting are resolved
Make sure the title is self-explanatory and the description concisely explains the PR
Add labels and milestones (and optionally projects) to the PR so it can be classified

src/lightning/pytorch/loops/optimization/automatic.py

tests/tests_pytorch/loops/optimization/test_closure.py

awaelchli · 2023-08-10T20:59:07Z

Thanks @0x404 for jumping on this so quickly. Great improvement.

0x404 added 2 commits August 9, 2023 04:59

test: add a test case for testing closure with no_grad optimizer

9fac094

fix: enable grad when running closure

80281f5

0x404 requested review from awaelchli, carmocca, justusschock, Borda and williamFalcon as code owners August 10, 2023 01:57

github-actions bot added the pl (Generic label for PyTorch Lightning package) label Aug 10, 2023

Merge branch 'master' into bugfix/no-grad

77bb6b0

0x404 mentioned this pull request Aug 10, 2023

"element 0 of tensors does not require grad and does not have a grad_fn" when using AdamW from Hugging Face #18254

Closed

0x404 changed the title ~~fix "no grad" error when using optimizer with @torch.no_grad step~~ fix no grad error when using optimizer with @torch.no_grad step Aug 10, 2023

awaelchli reviewed Aug 10, 2023

src/lightning/pytorch/loops/optimization/automatic.py

tests/tests_pytorch/loops/optimization/test_closure.py

awaelchli changed the title ~~fix no grad error when using optimizer with @torch.no_grad step~~ Explicitly enable grad in closure Aug 10, 2023

awaelchli added the optimization, bug (Something isn't working), and community (This PR is from the community) labels Aug 10, 2023

awaelchli added this to the 2.0.x milestone Aug 10, 2023

0x404 added 3 commits August 10, 2023 08:07

fix: simply test case

9e9aa9f

test: assert torch.is_grad_enabled during training step

43b422a

Merge branch 'bugfix/no-grad' of https://github.com/0x404/lightning i…

b1cce73

…nto bugfix/no-grad

carmocca approved these changes Aug 10, 2023

Borda requested a review from awaelchli August 10, 2023 08:25

Borda approved these changes Aug 10, 2023

mergify bot added the ready (PRs ready to be merged) label Aug 10, 2023

awaelchli and others added 5 commits August 10, 2023 22:16

extend the test to reproduce the error on master

2821829

more concise test description

deb1d2d

[pre-commit.ci] auto fixes from pre-commit.com hooks

715c374

for more information, see https://pre-commit.ci

add changelog

76b2ca5

Merge branch 'bugfix/no-grad' of github.com:0x404/lightning into bugf…

2c8a2c2

…ix/no-grad

awaelchli approved these changes Aug 10, 2023

awaelchli merged commit b88b8b3 into Lightning-AI:master Aug 10, 2023
79 checks passed

This was referenced Aug 10, 2023

“element 0 of tensors does not require grad and does not have a grad_fn” #18222

Closed

Precision 32 disabling grad #18062

Closed

Borda pushed a commit that referenced this pull request Aug 14, 2023

Explicitly enable grad in closure (#18268)

cdaec95

Co-authored-by: awaelchli <aedu.waelchli@gmail.com> (cherry picked from commit b88b8b3)

lexierule pushed a commit that referenced this pull request Aug 14, 2023

Explicitly enable grad in closure (#18268)

28e0bdd

Co-authored-by: awaelchli <aedu.waelchli@gmail.com> (cherry picked from commit b88b8b3)
