@carmocca carmocca commented Oct 4, 2023

What does this PR do?

ExitStack will only __exit__ its context managers after it has been __enter__ed itself. This means that in

stack = ExitStack()
ctx1 = ContextManager()
stack.enter_context(ctx1)  # `ctx1.__enter__` is called
ctx2 = FailingContextManager()  # raises exception
stack.enter_context(ctx2)  # this is not reached

ctx1.__exit__ never gets called.
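The leak can be reproduced with the standard library alone. This is a minimal sketch, not the Fabric code: `TrackingContext`, `FailingContext`, and `build_stack_leaky` are hypothetical names made up for illustration, and the assumption is that the caller would normally enter the returned stack with `with`.

```python
from contextlib import ExitStack


class TrackingContext:
    """Hypothetical context manager that records its enter/exit calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append(f"enter {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append(f"exit {self.name}")
        return False  # do not swallow exceptions


class FailingContext:
    """Hypothetical context manager whose constructor raises."""

    def __init__(self):
        raise RuntimeError("failed to construct")


def build_stack_leaky(log):
    stack = ExitStack()
    stack.enter_context(TrackingContext("ctx1", log))  # ctx1.__enter__ runs here
    stack.enter_context(FailingContext())  # constructor raises, stack is lost
    return stack


leaky_log = []
try:
    # The exception is raised while building the stack, so the `with`
    # statement never gets a stack to enter or exit.
    with build_stack_leaky(leaky_log):
        pass
except RuntimeError:
    pass

print(leaky_log)  # ctx1 was entered but never exited
```

Because the stack object is discarded before anything exits it, whatever `ctx1` set up on entry stays in place.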

The solution is to reorder the code so that every context manager is instantiated before any of them is entered:

stack = ExitStack()
ctx1 = ContextManager()
ctx2 = FailingContextManager()  # raises exception
stack.enter_context(ctx1)  # this is not reached
stack.enter_context(ctx2)  # this is not reached
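A small runnable sketch of the reordered version follows. The `Recorder` class and the `build_stack` helper are hypothetical and exist only to make the ordering observable; they are not part of Lightning.

```python
from contextlib import ExitStack


class Recorder:
    """Hypothetical context manager that logs enter/exit and can fail on construction."""

    def __init__(self, name, log, fail=False):
        if fail:
            raise RuntimeError(f"{name} failed to construct")
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append(f"enter {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append(f"exit {self.name}")
        return False  # do not swallow exceptions


def build_stack(log, fail_second):
    # Instantiate everything first: a failing constructor raises
    # before any context manager has been entered.
    ctx1 = Recorder("ctx1", log)
    ctx2 = Recorder("ctx2", log, fail=fail_second)
    stack = ExitStack()
    stack.enter_context(ctx1)
    stack.enter_context(ctx2)
    return stack


failed_log = []
try:
    with build_stack(failed_log, fail_second=True):
        pass
except RuntimeError:
    pass

ok_log = []
with build_stack(ok_log, fail_second=False):
    ok_log.append("body")

print(failed_log)  # nothing was entered, so nothing leaks
print(ok_log)      # contexts are exited in reverse order of entry
```

Note that this ordering only guards against constructors that raise. If a context manager's `__enter__` raised instead, the ones entered earlier would still not be exited.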

Fixes #18705.

This is unreleased code, so there is no need to mention it in the CHANGELOG.

cc @Borda @carmocca @justusschock @awaelchli

@carmocca carmocca added the bug Something isn't working label Oct 4, 2023
@carmocca carmocca added this to the 2.1 milestone Oct 4, 2023
@carmocca carmocca self-assigned this Oct 4, 2023
@github-actions github-actions bot added the fabric lightning.fabric.Fabric label Oct 4, 2023
github-actions bot commented Oct 4, 2023

⚡ Required checks status: All passing 🟢

Groups summary

🟢 pytorch_lightning: Tests workflow
Check ID Status
pl-cpu (macOS-11, lightning, 3.8, 1.12, oldest) success
pl-cpu (macOS-11, lightning, 3.9, 1.12) success
pl-cpu (macOS-11, lightning, 3.10, 1.13) success
pl-cpu (macOS-11, lightning, 3.10, 2.0) success
pl-cpu (ubuntu-20.04, lightning, 3.8, 1.12, oldest) success
pl-cpu (ubuntu-20.04, lightning, 3.9, 1.12) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 1.13) success
pl-cpu (ubuntu-20.04, lightning, 3.10, 2.0) success
pl-cpu (windows-2022, lightning, 3.8, 1.12, oldest) success
pl-cpu (windows-2022, lightning, 3.9, 1.12) success
pl-cpu (windows-2022, lightning, 3.10, 1.13) success
pl-cpu (windows-2022, lightning, 3.10, 2.0) success
pl-cpu (macOS-11, pytorch, 3.8, 1.13) success
pl-cpu (ubuntu-20.04, pytorch, 3.8, 1.13) success
pl-cpu (windows-2022, pytorch, 3.8, 1.13) success
pl-cpu (macOS-12, pytorch, 3.11, 2.0) success
pl-cpu (ubuntu-22.04, pytorch, 3.11, 2.0) success
pl-cpu (windows-2022, pytorch, 3.11, 2.0) success

These checks are required after the changes to src/lightning/fabric/plugins/precision/bitsandbytes.py, src/lightning/fabric/plugins/precision/transformer_engine.py, src/lightning/fabric/strategies/deepspeed.py, src/lightning/fabric/strategies/fsdp.py, src/lightning/fabric/strategies/strategy.py, src/lightning/fabric/strategies/xla_fsdp.py.

🟢 pytorch_lightning: Azure GPU
Check ID Status
[pytorch-lightning (GPUs) (testing Lightning latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=177624&view=logs&jobId=47e66f3c-897a-5428-da11-bf5c7745762e) success
[pytorch-lightning (GPUs) (testing PyTorch latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=177624&view=logs&jobId=3f274fac-2e11-54ca-487e-194c91f3ae9f) success

🟢 pytorch_lightning: Benchmarks
Check ID Status
lightning.Benchmarks success

🟢 fabric: Docs
Check ID Status
docs-make (fabric, doctest) success
docs-make (fabric, html) success

🟢 lightning_fabric: CPU workflow
Check ID Status
fabric-cpu (macOS-11, lightning, 3.8, 1.12, oldest) success
fabric-cpu (macOS-11, lightning, 3.9, 1.12) success
fabric-cpu (macOS-11, lightning, 3.10, 1.13) success
fabric-cpu (macOS-11, lightning, 3.10, 2.0) success
fabric-cpu (ubuntu-20.04, lightning, 3.8, 1.12, oldest) success
fabric-cpu (ubuntu-20.04, lightning, 3.9, 1.12) success
fabric-cpu (ubuntu-20.04, lightning, 3.10, 1.13) success
fabric-cpu (ubuntu-20.04, lightning, 3.10, 2.0) success
fabric-cpu (windows-2022, lightning, 3.8, 1.12, oldest) success
fabric-cpu (windows-2022, lightning, 3.9, 1.12) success
fabric-cpu (windows-2022, lightning, 3.10, 1.13) success
fabric-cpu (windows-2022, lightning, 3.10, 2.0) success
fabric-cpu (macOS-11, fabric, 3.8, 1.13) success
fabric-cpu (ubuntu-20.04, fabric, 3.8, 1.13) success
fabric-cpu (windows-2022, fabric, 3.8, 1.13) success
fabric-cpu (macOS-12, fabric, 3.11, 2.0) success
fabric-cpu (ubuntu-22.04, fabric, 3.11, 2.0) success
fabric-cpu (windows-2022, fabric, 3.11, 2.0) success

🟢 lightning_fabric: Azure GPU
Check ID Status
[lightning-fabric (GPUs) (testing Fabric latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=177626&view=logs&jobId=3f274fac-2e11-54ca-487e-194c91f3ae9f) success
[lightning-fabric (GPUs) (testing Lightning latest)](https://dev.azure.com/Lightning-AI/72ab7ed8-b00f-4b6e-b131-3388f7ffafa7/_build/results?buildId=177626&view=logs&jobId=47e66f3c-897a-5428-da11-bf5c7745762e) success

🟢 mypy
Check ID Status
mypy success

🟢 install
Check ID Status
install-pkg (ubuntu-22.04, app, 3.8) success
install-pkg (ubuntu-22.04, app, 3.11) success
install-pkg (ubuntu-22.04, fabric, 3.8) success
install-pkg (ubuntu-22.04, fabric, 3.11) success
install-pkg (ubuntu-22.04, pytorch, 3.8) success
install-pkg (ubuntu-22.04, pytorch, 3.11) success
install-pkg (ubuntu-22.04, lightning, 3.8) success
install-pkg (ubuntu-22.04, lightning, 3.11) success
install-pkg (ubuntu-22.04, notset, 3.8) success
install-pkg (ubuntu-22.04, notset, 3.11) success
install-pkg (macOS-12, app, 3.8) success
install-pkg (macOS-12, app, 3.11) success
install-pkg (macOS-12, fabric, 3.8) success
install-pkg (macOS-12, fabric, 3.11) success
install-pkg (macOS-12, pytorch, 3.8) success
install-pkg (macOS-12, pytorch, 3.11) success
install-pkg (macOS-12, lightning, 3.8) success
install-pkg (macOS-12, lightning, 3.11) success
install-pkg (macOS-12, notset, 3.8) success
install-pkg (macOS-12, notset, 3.11) success
install-pkg (windows-2022, app, 3.8) success
install-pkg (windows-2022, app, 3.11) success
install-pkg (windows-2022, fabric, 3.8) success
install-pkg (windows-2022, fabric, 3.11) success
install-pkg (windows-2022, pytorch, 3.8) success
install-pkg (windows-2022, pytorch, 3.11) success
install-pkg (windows-2022, lightning, 3.8) success
install-pkg (windows-2022, lightning, 3.11) success
install-pkg (windows-2022, notset, 3.8) success
install-pkg (windows-2022, notset, 3.11) success


Thank you for your contribution! 💜

Note
This comment is automatically generated and is updated every 180 seconds for 60 minutes. If you have any other questions, contact carmocca for help.

@mergify mergify bot added the ready PRs ready to be merged label Oct 4, 2023
@awaelchli

Should be safe to merge. I'll handle the docs failure separately, e.g. in #18718.

@carmocca carmocca merged commit 384c5b3 into master Oct 4, 2023
@carmocca carmocca deleted the carmocca/exitstack-fix branch October 4, 2023 23:25
module_sharded_ctx = self.module_sharded_context()
stack = ExitStack()
if not self.zero_stage_3:
    stack.enter_context(super().module_init_context(empty_init=empty_init))
Contributor

For completeness, it would probably be better to apply it to this line as well, right?

Contributor Author

I don't think this one matters because, in the super() call, all the context managers are instantiated before any of them is entered.

    # These operations are applied to each submodule 'bottom up' in the module hierarchy.
    stack.enter_context(torch.device("meta"))
elif _TORCH_GREATER_EQUAL_1_13:
    stack.enter_context(_EmptyInit(enabled=bool(empty_init)))
Contributor

Why isn't it applied in the other places here?

Contributor Author

torch.device won't raise an exception, but I missed this _EmptyInit.

Contributor Author

Can you include it in #18734?
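As this thread shows, it is easy to miss a context manager that can raise. For comparison only, and not what this PR does, the standard library's `ExitStack.pop_all()` allows a construction that cleans up regardless of which step fails. The sketch below assumes a hypothetical `build_stack_safely` helper that receives zero-argument factories; `Recorder` is likewise made up for illustration.

```python
from contextlib import ExitStack


class Recorder:
    """Hypothetical context manager that logs enter/exit and can fail on construction."""

    def __init__(self, name, log, fail=False):
        if fail:
            raise RuntimeError(f"{name} failed to construct")
        self.name = name
        self.log = log

    def __enter__(self):
        self.log.append(f"enter {self.name}")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.log.append(f"exit {self.name}")
        return False  # do not swallow exceptions


def build_stack_safely(factories):
    # The temporary stack is entered right away, so any failure while
    # creating or entering a context unwinds the ones already entered.
    with ExitStack() as stack:
        for make_ctx in factories:
            stack.enter_context(make_ctx())
        # Success: move the registered exits to a new stack owned by the caller.
        return stack.pop_all()


safe_log = []
try:
    with build_stack_safely([
        lambda: Recorder("ctx1", safe_log),
        lambda: Recorder("ctx2", safe_log, fail=True),
    ]):
        pass
except RuntimeError:
    pass

print(safe_log)  # ctx1 is exited even though ctx2 failed to construct
```

The trade-off is that the caller has to pass factories rather than ready-made instances, which may or may not fit the strategy code discussed here.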

Labels

bug Something isn't working fabric lightning.fabric.Fabric ready PRs ready to be merged

Development

Successfully merging this pull request may close these issues.

Fabric leaks the default device on exception

3 participants