
load_from_checkpoint support for LightningCLI when using dependency injection #18105

Merged
merged 10 commits into Lightning-AI:master from mauvilsa:cli-load-from-checkpoint on Feb 23, 2024

Conversation


@mauvilsa mauvilsa commented Jul 18, 2023

What does this PR do?

For now this pull request is intended to start the discussion on how to add full support for load_from_checkpoint for models trained using LightningCLI.

LightningCLI is designed to allow the use of dependency injection, which is a good programming pattern. Since this means that class instances are given to the module's __init__, the current implementation of save_hyperparameters is not enough: there is no reliable way to know how the objects were instantiated, which prevents load_from_checkpoint from working correctly.
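To make the problem concrete, here is a minimal, torch-free sketch (class names are hypothetical, not from this PR) of what dependency injection into __init__ looks like, and why a plain save_hyperparameters cannot record how the injected object was built:

```python
# Hypothetical sketch: an Encoder *instance* is injected into the model's
# __init__, so the hyperparameters alone cannot describe how to rebuild it.
class Encoder:
    def __init__(self, hidden_dim: int):
        self.hidden_dim = hidden_dim


class MyModel:  # stands in for a LightningModule
    def __init__(self, encoder: Encoder, learning_rate: float = 1e-3):
        # A plain save_hyperparameters() would see only the Encoder *object*,
        # not the class path and init args used to construct it.
        self.encoder = encoder
        self.learning_rate = learning_rate


model = MyModel(encoder=Encoder(hidden_dim=64))
```

Restoring this model from a checkpoint requires knowing that `encoder` was built as `Encoder(hidden_dim=64)`, which only the CLI knows at instantiation time.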

The main idea is that LightningCLI provides to save_hyperparameters, through a context variable, the actual parameters used for instantiation. These get saved as normal in the hparams.yaml file. Then load_from_checkpoint uses a custom instantiator that makes use of jsonargparse to support the configuration of subclasses (class_path and init_args) in configs.
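The mechanism above can be sketched roughly as follows (a simplified illustration under assumed names — `given_hyperparameters_context` and the `_instantiator` key are stand-ins, not the exact implementation): the CLI sets a context variable with the resolved parameters in class_path/init_args form, and save_hyperparameters reads it when active:

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Optional

# Context variable through which the CLI provides the actual instantiation
# parameters, plus the import path of the instantiator to use on load.
_given_hparams: ContextVar[Optional[dict]] = ContextVar("_given_hparams", default=None)


@contextmanager
def given_hyperparameters_context(hparams: dict, instantiator: str):
    token = _given_hparams.set({**hparams, "_instantiator": instantiator})
    try:
        yield
    finally:
        _given_hparams.reset(token)


class MyModel:
    def __init__(self, encoder_cfg: Optional[dict] = None):
        # Stand-in for save_hyperparameters(): prefer the CLI-provided
        # parameters (class_path/init_args style) when the context is active.
        given = _given_hparams.get()
        self.hparams = given if given is not None else {"encoder_cfg": encoder_cfg}


params = {"encoder": {"class_path": "my_pkg.Encoder", "init_args": {"hidden_dim": 64}}}
with given_hyperparameters_context(params, "lightning.pytorch.cli.instantiate_module"):
    model = MyModel()
```

The saved hparams then carry enough structure (class_path and init_args) for a jsonargparse-based instantiator to re-create the injected objects on load.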

The issue #15427 asks for a way to instantiate a model from a config file. However, load_from_checkpoint does both instantiation and loading of weights. Is there really a need to only instantiate from config? Like having a load_from_config which would get the path to the hparams.yaml file?

Also #15427 asks about dataloaders. From what I understand save_hyperparameters can also be used for data modules. However, I don't know how that is intended to work.

Fixes #15427
Fixes #13279

Before submitting
  • Was this discussed/agreed via a GitHub issue? (not for typos and docs)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure your PR does only one thing, instead of bundling different changes together?
  • Did you make sure to update the documentation with your changes? (if necessary)
  • Did you write any new necessary tests? (not for typos and docs)
  • Did you verify new and existing tests pass locally with your changes?
  • Did you list all the breaking changes introduced by this pull request?
  • Did you update the CHANGELOG? (not for typos, docs, test updates, or minor internal changes/refactors)

PR review

Anyone in the community is welcome to review the PR.
Before you start reviewing, make sure you have read the review guidelines. In short, see the following bullet-list:

Reviewer checklist
  • Is this pull request ready for review? (if not, please submit in draft mode)
  • Check that all items from Before submitting are resolved
  • Make sure the title is self-explanatory and the description concisely explains the PR
  • Add labels and milestones (and optionally projects) to the PR so it can be classified

@github-actions github-actions bot added the pl Generic label for PyTorch Lightning package label Jul 18, 2023
requirements/pytorch/extra.txt Outdated Show resolved Hide resolved
src/lightning/pytorch/cli.py Outdated Show resolved Hide resolved
tests/tests_pytorch/test_cli.py Outdated Show resolved Hide resolved
@mauvilsa mauvilsa marked this pull request as ready for review July 18, 2023 05:32
@awaelchli awaelchli added lightningcli pl.cli.LightningCLI feature Is an improvement or enhancement community This PR is from the community labels Jul 19, 2023
@mergify mergify bot removed the has conflicts label Aug 8, 2023
@Borda Borda left a comment


Can we make this optional, so that users can still use an older version of jsonargparse and we don't end up with too narrow a version range?

requirements/pytorch/extra.txt Outdated Show resolved Hide resolved
@carmocca carmocca left a comment


Interesting solution, but I think we should iterate on the UX for this first.

tests/tests_pytorch/test_cli.py Outdated Show resolved Hide resolved
@Borda Borda changed the title load_from_checkpoint support for LightningCLI when using dependency injection load_from_checkpoint support for LightningCLI when using dependency injection Aug 21, 2023
@mergify mergify bot removed the has conflicts label Aug 22, 2023
@mauvilsa mauvilsa requested a review from Borda October 3, 2023 09:43
@carmocca carmocca left a comment


Overall looks fine. Given the complexity and the lack of docs, I don't expect users to try this yet. Will you add those later?

tests/tests_pytorch/test_cli.py Outdated Show resolved Hide resolved
src/lightning/pytorch/core/mixins/hparams_mixin.py Outdated Show resolved Hide resolved
src/lightning/pytorch/core/mixins/hparams_mixin.py Outdated Show resolved Hide resolved
tests/tests_pytorch/test_cli.py Outdated Show resolved Hide resolved
@awaelchli awaelchli added this to the 2.3 milestone Feb 13, 2024
@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Feb 13, 2024
codecov bot commented Feb 13, 2024

Codecov Report

Merging #18105 (c71406f) into master (39a86f8) will decrease coverage by 35%.
Report is 3 commits behind head on master.
The diff coverage is 100%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #18105      +/-   ##
==========================================
- Coverage      84%      48%     -35%     
==========================================
  Files         450      442       -8     
  Lines       38154    38050     -104     
==========================================
- Hits        31893    18430   -13463     
- Misses       6261    19620   +13359     

@github-actions github-actions bot added the docs Documentation related label Feb 20, 2024
@carmocca carmocca left a comment


@Borda you have changes requested for this PR. Can you check it again please?

requirements/pytorch/extra.txt Show resolved Hide resolved
src/lightning/pytorch/CHANGELOG.md Outdated Show resolved Hide resolved
@mergify mergify bot removed the has conflicts label Feb 21, 2024
@mergify mergify bot added the ready PRs ready to be merged label Feb 23, 2024
@carmocca carmocca merged commit 623ec58 into Lightning-AI:master Feb 23, 2024
103 of 104 checks passed
@mauvilsa mauvilsa deleted the cli-load-from-checkpoint branch February 23, 2024 12:43
def __call__(self, class_type: Type[ModuleType], *args: Any, **kwargs: Any) -> ModuleType:
    with _given_hyperparameters_context(
        hparams=self.cli.config_dump.get(self.key, {}),
        instantiator="lightning.pytorch.cli.instantiate_module",
    ):
        # instantiate the module while the context provides the CLI hyperparameters
        return class_type(*args, **kwargs)
Contributor Author

@carmocca @awaelchli @Borda I changed it to this to avoid having to provide an instantiator during the call to load_from_checkpoint. The tests run fine, but now I am wondering what happens when the pytorch-lightning package is installed instead of lightning. This line could be changed so that the import path is different for each package. But then, how would this be tested? Also, if someone saves a checkpoint using pytorch-lightning, should the load work when using lightning?

Member


We have a CI step where lightning.pytorch gets rewritten to pytorch_lightning for the CI jobs that build pytorch_lightning. If tests pass (they did) it should be a reliable signal that this worked as expected.

Also, if someone saves a checkpoint using pytorch-lightning, should the load work when using lightning?

This most likely doesn't work, but mixing the two packages is strongly discouraged anyway. It could be a problem for different users who share checkpoints, though.

Member


If it becomes an issue, we could write a checkpoint migration that will rewrite the instantiator import path on the fly:
https://github.com/Lightning-AI/pytorch-lightning/blob/master/src/lightning/pytorch/utilities/migration/migration.py

Checkpoint migrations are applied automatically when lightning loads a checkpoint.
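A minimal sketch of what such a migration might look like (function name, checkpoint keys, and the `_instantiator` field are assumptions for illustration, not Lightning's actual migration API):

```python
# Hypothetical migration: rewrite the stored instantiator import path so a
# checkpoint saved with `lightning` can be loaded under `pytorch_lightning`.
def migrate_instantiator_path(checkpoint: dict, target_pkg: str = "pytorch_lightning") -> dict:
    hparams = checkpoint.get("hyper_parameters", {})
    instantiator = hparams.get("_instantiator")
    if isinstance(instantiator, str) and instantiator.startswith("lightning.pytorch."):
        # Replace only the package prefix, keeping the module/function suffix.
        hparams["_instantiator"] = instantiator.replace("lightning.pytorch.", target_pkg + ".", 1)
    return checkpoint


ckpt = {"hyper_parameters": {"_instantiator": "lightning.pytorch.cli.instantiate_module"}}
migrated = migrate_instantiator_path(ckpt)
```

Since migrations run automatically on load, a transformation like this would make the rewrite transparent to users sharing checkpoints across the two packages.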

Contributor Author


Do migrations only go one way, pytorch-lightning -> lightning, or are they bidirectional?

Member


Migrations are just one way for us to define transformations to "upgrade" checkpoints from one version of lightning to another, for example when breaking changes were made so people can still load their old checkpoints.

It could also be used to define a transformation that replaces the instantiator string if the user is running from within the other package.

Contributor Author

@mauvilsa mauvilsa Feb 27, 2024


Is it known how common it is for people to share checkpoints and use them interchangeably between pytorch-lightning and lightning? If it is not common, or it is not known how common, we could leave it like this and improve it when someone raises the issue. Or should this be considered now, before the feature is included in a release?

Member


I'm ok leaving it like this for now.

speediedan added a commit to speediedan/finetuning-scheduler that referenced this pull request May 6, 2024