Add LightningDataModule.load_from_checkpoint to load datamodules directly from checkpoint (#12550)

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: otaj <ota@grid.ai>
4 people committed May 3, 2022
1 parent 1c25ab8 commit cd01856
Showing 6 changed files with 279 additions and 133 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

### Added

- Added `LightningDataModule.load_from_checkpoint` to support loading datamodules directly from checkpoint ([#12550](https://github.com/PyTorchLightning/pytorch-lightning/pull/12550))


- Added a friendly error message when attempting to call `Trainer.save_checkpoint()` without a model attached ([#12772](https://github.com/PyTorchLightning/pytorch-lightning/pull/12772))

1 change: 1 addition & 0 deletions docs/source/common/checkpointing_basic.rst
@@ -36,6 +36,7 @@ Inside a Lightning checkpoint you'll find:
- State of all callbacks (for stateful callbacks)
- State of datamodule (for stateful datamodules)
- The hyperparameters used for that model if passed in as hparams (Argparse.Namespace)
- The hyperparameters used for that datamodule if passed in as hparams (Argparse.Namespace)
- State of Loops (if using Fault-Tolerant training)
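
The list above says a checkpoint can now carry datamodule hyperparameters. As a minimal sketch, assuming the checkpoint has already been loaded into a plain dictionary, the helper below reports which of the datamodule hyperparameter keys introduced in this commit are present. The helper name and the example dictionary are hypothetical and are not part of Lightning.

```python
from typing import Any, Dict, List

# Key names as introduced in this commit's datamodule.py diff.
DATAMODULE_HPARAM_KEYS = (
    "datamodule_hyper_parameters",
    "datamodule_hparams_name",
    "datamodule_hparams_type",
)


def datamodule_hparam_keys_present(checkpoint: Dict[str, Any]) -> List[str]:
    """Return the datamodule hyperparameter keys found in a checkpoint dict."""
    return [key for key in DATAMODULE_HPARAM_KEYS if key in checkpoint]


# Hypothetical checkpoint contents, used here only for illustration.
example_checkpoint = {
    "state_dict": {},
    "datamodule_hyper_parameters": {"batch_size": 32},
}
found = datamodule_hparam_keys_present(example_checkpoint)
print(found)
```

An empty result would suggest the datamodule's hyperparameters were not saved, which is the situation the `hparams_file` argument is meant to cover.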

----
76 changes: 75 additions & 1 deletion pytorch_lightning/core/datamodule.py
@@ -13,13 +13,15 @@
# limitations under the License.
"""LightningDataModule for loading DataLoaders with ease."""
from argparse import ArgumentParser, Namespace
from typing import Any, Dict, List, Mapping, Optional, Sequence, Tuple, Union
from typing import Any, Dict, IO, List, Mapping, Optional, Sequence, Tuple, Union

from torch.utils.data import DataLoader, Dataset, IterableDataset

from pytorch_lightning.core.hooks import CheckpointHooks, DataHooks
from pytorch_lightning.core.mixins import HyperparametersMixin
from pytorch_lightning.core.saving import _load_from_checkpoint
from pytorch_lightning.utilities.argparse import add_argparse_args, from_argparse_args, get_init_arguments_and_types
from pytorch_lightning.utilities.types import _PATH


class LightningDataModule(CheckpointHooks, DataHooks, HyperparametersMixin):
@@ -52,6 +54,9 @@ def teardown(self):
    """

    name: str = ...
    CHECKPOINT_HYPER_PARAMS_KEY = "datamodule_hyper_parameters"
    CHECKPOINT_HYPER_PARAMS_NAME = "datamodule_hparams_name"
    CHECKPOINT_HYPER_PARAMS_TYPE = "datamodule_hparams_type"

    def __init__(self) -> None:
        super().__init__()
@@ -158,3 +163,72 @@ def load_state_dict(self, state_dict: Dict[str, Any]) -> None:
            state_dict: the datamodule state returned by ``state_dict``.
        """
        pass

    @classmethod
    def load_from_checkpoint(
        cls,
        checkpoint_path: Union[_PATH, IO],
        hparams_file: Optional[_PATH] = None,
        **kwargs,
    ):
        r"""
        Primary way of loading a datamodule from a checkpoint. When Lightning saves a checkpoint
        it stores the arguments passed to ``__init__`` in the checkpoint under ``"datamodule_hyper_parameters"``.

        Any arguments specified through \*\*kwargs will override args stored in ``"datamodule_hyper_parameters"``.

        Args:
            checkpoint_path: Path to checkpoint. This can also be a URL or a file-like object.
            hparams_file: Optional path to a ``.yaml`` or ``.csv`` file with hierarchical structure
                as in this example::

                    dataloader:
                        batch_size: 32

                You most likely won't need this since Lightning will always save the hyperparameters
                to the checkpoint.
                However, if your checkpoint doesn't have the hyperparameters saved,
                use this argument to pass in a ``.yaml`` file with the hparams you'd like to use.
                These will be converted into a :class:`~dict` and passed into your
                :class:`LightningDataModule` for use.

                If your datamodule's ``hparams`` argument is :class:`~argparse.Namespace`
                and ``.yaml`` file has hierarchical structure, you need to refactor your datamodule to treat
                ``hparams`` as :class:`~dict`.
            \**kwargs: Any extra keyword args needed to init the datamodule. Can also be used to override saved
                hyperparameter values.

        Return:
            :class:`LightningDataModule` instance with hyperparameters restored (if available).

        Note:
            ``load_from_checkpoint`` is a **class** method. You should use your :class:`LightningDataModule`
            **class** to call it instead of the :class:`LightningDataModule` instance.

        Example::

            # load the datamodule from a checkpoint ...
            datamodule = MyLightningDataModule.load_from_checkpoint('path/to/checkpoint.ckpt')

            # or load hyperparameters from a separate file.
            datamodule = MyLightningDataModule.load_from_checkpoint(
                'path/to/checkpoint.ckpt',
                hparams_file='/path/to/hparams_file.yaml'
            )

            # override some of the params with new values
            datamodule = MyLightningDataModule.load_from_checkpoint(
                PATH,
                batch_size=32,
                num_workers=10,
            )
        """
        return _load_from_checkpoint(
            cls,
            checkpoint_path,
            map_location=None,
            hparams_file=hparams_file,
            strict=None,
            **kwargs,
        )
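
The docstring states that keyword arguments override the values stored under ``"datamodule_hyper_parameters"``. The sketch below illustrates only that override rule with plain Python, so it runs without Lightning or PyTorch installed. `ToyDataModule` and `load_toy_datamodule` are hypothetical stand-ins, and the merge logic is an assumption about the described behavior, not a copy of `_load_from_checkpoint`.

```python
from typing import Any, Dict

# Key name taken from CHECKPOINT_HYPER_PARAMS_KEY in the diff above.
HPARAMS_KEY = "datamodule_hyper_parameters"


class ToyDataModule:
    """Hypothetical stand-in for a LightningDataModule subclass."""

    def __init__(self, batch_size: int = 16, num_workers: int = 0) -> None:
        self.batch_size = batch_size
        self.num_workers = num_workers


def load_toy_datamodule(checkpoint: Dict[str, Any], **kwargs: Any) -> ToyDataModule:
    """Rebuild a ToyDataModule from saved init args, letting kwargs win."""
    init_args = dict(checkpoint.get(HPARAMS_KEY, {}))
    init_args.update(kwargs)  # explicit keyword arguments override saved values
    return ToyDataModule(**init_args)


# Hypothetical in-memory checkpoint; a real one would be read from disk.
checkpoint = {HPARAMS_KEY: {"batch_size": 64, "num_workers": 4}}
restored = load_toy_datamodule(checkpoint)
overridden = load_toy_datamodule(checkpoint, batch_size=32)
print(restored.batch_size, restored.num_workers)
print(overridden.batch_size, overridden.num_workers)
```

Copying the saved arguments before updating them keeps the checkpoint dictionary unchanged, so the same checkpoint can be reused with different overrides.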