
replace Hparams by init args #1896

Merged
merged 101 commits into master on May 24, 2020

Conversation

@williamFalcon (Contributor) commented May 19, 2020

Problem

hparams was a temporary fix for not auto-storing the args passed in by users. It's something everyone hacks around, it is not intuitive, and it makes the LightningModule somehow less like a plain PyTorch module.

end of hparams!

This PR

This PR removes that and instead:

  • Stores all the args passed to init automatically so checkpoints can carry this information.
  • Doesn't store things like losses, etc.; only primitives, lists, dicts, tuples and Namespaces.
  • Auto-saves this info into checkpoints.
  • It DOES NOT assign properties automatically.

Backward compatibility

  • this PR is still backward compatible for people who want to continue using hparams directly.

Summary

Before:

# hparams is a dict or an argparse.Namespace

class LitModel(pl.LightningModule):
    def __init__(self, hparams, my_pretrained_nn_module):
        super().__init__()
        self.hparams = hparams
        self.l1 = nn.Linear(hparams.in_dim, hparams.out_dim)
        self.feature_extractor = my_pretrained_nn_module()

# old way had a ton of problems with this
model = LitModel.load_from_checkpoint(PATH)

New:

class LitModel(pl.LightningModule):
    def __init__(self, in_dim, out_dim, my_pretrained_nn_module):
        super().__init__()
        self.in_dim = in_dim
        self.out_dim = out_dim
        
        # self.in_dim, etc were auto registered to the module
        self.l1 = nn.Linear(in_dim, out_dim)
        self.feature_extractor = my_pretrained_nn_module()

# load from checkpoint still works as normal, but objects and such need to be specified
model = LitModel.load_from_checkpoint(PATH, my_pretrained_nn_module=MyModule)

# or can overwrite the old settings as well
model = LitModel.load_from_checkpoint(PATH, in_dim=some_new_dim, my_pretrained_nn_module=MyModule)
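As a minimal sketch of what this buys you (the 'module_arguments' checkpoint key comes from the tests further down in this PR; the path is hypothetical):

import torch

# load the raw checkpoint dict and inspect the auto-stored init args
ckpt = torch.load('example.ckpt', map_location='cpu')
print(ckpt['module_arguments'])  # e.g. {'in_dim': 128, 'out_dim': 10}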

@mergify mergify bot requested a review from a team May 19, 2020 20:03
@williamFalcon (Contributor, Author)
@awaelchli @justusschock

Need to figure out the loading/saving stuff and update tests

@williamFalcon changed the title from "No hparams" to "[WIP] No hparams" May 19, 2020
pep8speaks commented May 19, 2020

Hello @williamFalcon! Thanks for updating this PR.

Line 289:52: W504 line break after binary operator

Comment last updated at 2020-05-24 17:00:58 UTC

@awaelchli (Member) commented May 19, 2020

if someone renames "self" to "this" for example, it could break, right?
need to pull out the first one in the arg list and only consider the rest

@williamFalcon (Contributor, Author)
> if someone renames "self" to "this" for example, it could break, right?
> need to pull out the first one in the arg list and only consider the rest

Is it an ordered dict? Otherwise, what if someone names it not "this" but something else?

"""
# two frames back is the init of the child module
frame = inspect.currentframe()
args = frame.f_back.f_back.f_locals
Contributor:

I mentioned this in #1735, but if my LightningModule was a subclass of another LightningModule, this wouldn't work right? We have to dynamically determine how many levels we should go since we always need to get to the leaf level.

Contributor Author:

good point. any suggestions?
I guess we could always backtrack right up to the nn.Module?
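A rough sketch of that backtracking idea (the '__class__' check mirrors the _collect_init_args helper that shows up later in this PR; the names here are illustrative):

import inspect

def collect_init_args(frame, path_args):
    # a frame whose function calls super() has '__class__' among its locals,
    # so we can use it to detect __init__ frames of the class hierarchy
    if '__class__' in frame.f_locals:
        local_args = dict(frame.f_locals)
        local_args.pop('self', None)
        path_args.append(local_args)
        # keep walking up; the leaf subclass __init__ is reached last
        return collect_init_args(frame.f_back, path_args)
    return path_args

# called from inside LightningModule.__init__ via something like:
# collect_init_args(inspect.currentframe().f_back, [])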

Contributor Author:

@yukw777 hadn't read that carefully actually haha. Good suggestion!

Contributor Author:

(also, happy to co-author this since it's fairly involved)

@yukw777 (Contributor) commented May 19, 2020:

haha no worries. I'd be happy to help out.

I had to implement something similar at my job, and I ended up going with #1735 (comment). I tried to see if I could somehow automate the whole thing, but it was more trouble than it's worth, so I decided to keep things more "declarative". This does mean that a PL user would need to implement that abstract property, which makes LightningModule less transparent... We could make LightningModule a data class, as @mateuszpieniak suggested, but dataclasses are only available in Python 3.7+, and that also makes LightningModule less transparent. It does seem like we need to add something like this though, as it's impossible for PL to figure out whose __init__() args to save automatically...

Contributor Author:

how about i take a stab at a v1 and ping you to finish it haha.

Contributor:

sounds good! looking forward to it!

Contributor:

Re the dataclass, we could always depend on https://pypi.org/project/dataclasses/ to get dataclasses in 3.6.

I think adding a serialize_args abstract method makes sense. That's what I use at work (on a non-Lightning training pipeline) and it works pretty well. We have sensible serialization defaults, so it only needs to be overridden if the training module has custom, non-serializable types.
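A rough sketch of that declarative idea (the property name serializable_init_args is hypothetical, not part of this PR):

import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self, in_dim, out_dim, vocab):
        super().__init__()
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.vocab = vocab  # e.g. a custom, non-yaml-friendly object

    @property
    def serializable_init_args(self) -> dict:
        # each module declares which init args should go into the checkpoint,
        # leaving out anything that is hard to serialize
        return {'in_dim': self.in_dim, 'out_dim': self.out_dim}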

# set module_arguments in child
setattr(child, 'module_arguments', module_arguments)

def _is_allowed_hparam_value(self, value):
Contributor Author:

@yukw777 this should be good no? allows for basically anything except objects (but allows dicts, lists, tuples)

Contributor:

Previously any picklable objects could be in hparams, so I think we should keep that behavior, which is actually quite useful for things like custom vocabulary dictionaries. It also makes it easy to invert dependencies to write tests.
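If picklability were the criterion instead, a minimal check might look like this (a sketch, not what this PR does):

import pickle

def is_picklable(value) -> bool:
    # True if the value survives a round trip through pickle
    try:
        pickle.dumps(value)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False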

Contributor:

@williamFalcon Other examples: this would not allow using dataclasses or OmegaConf, which are easy to use now.

@Borda (Member) commented May 19, 2020

> if someone renames "self" to "this" for example, it could break, right?
> need to pull out the first one in the arg list and only consider the rest
>
> is it an ordered dict? otherwise what if someone names it not this but something else

In Python it does not matter whether you call it self or king; the first argument always receives the instance for that particular method. self is just a convention, the same way this is in Java.
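For example, this is perfectly valid Python; the first parameter receives the instance no matter what it is called:

class Counter:
    def __init__(this, start=0):  # 'this' plays the role of 'self'
        this.count = start

    def bump(king):  # any name works for the first parameter
        king.count += 1

c = Counter()
c.bump()
print(c.count)  # 1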

Comment on lines 194 to 202
replay_size,
warm_start_steps,
gamma, eps_start,
eps_end,
eps_last_frame,
sync_rate,
lr,
episode_length,
batch_size) -> None:
Member:

let's add types...
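For instance, a type-annotated version of that signature could look like this (a sketch; the types are guesses based on the argument names, and the class name is illustrative):

from pytorch_lightning import LightningModule

class DQNLightning(LightningModule):
    def __init__(
        self,
        replay_size: int,
        warm_start_steps: int,
        gamma: float,
        eps_start: float,
        eps_end: float,
        eps_last_frame: int,
        sync_rate: int,
        lr: float,
        episode_length: int,
        batch_size: int,
    ) -> None:
        super().__init__()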

@mergify mergify bot requested a review from a team May 19, 2020 21:55
@@ -185,7 +182,7 @@ def main(hparams):
# ------------------------
# 1 INIT LIGHTNING MODEL
# ------------------------
model = SegModel(hparams)
model = SegModel(**hparams)
Member:

we shall be careful: if the hparams contain any argument (key) not listed in the init, it will crash
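One defensive pattern, as a sketch, is to drop any keys the constructor does not accept before unpacking (hparams is assumed to be a plain dict here; use vars(hparams) if it is a Namespace):

import inspect

accepted = inspect.signature(SegModel.__init__).parameters
filtered = {k: v for k, v in hparams.items() if k in accepted}
model = SegModel(**filtered)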

Comment on lines 40 to 47
drop_prob=0.2,
batch_size=2,
in_features=28 * 28,
learning_rate=0.001 * 8,
optimizer_name='adam',
data_root='./datasets',
out_features=10,
hidden_dim=1000,
Member:

let's add types...

@mergify mergify bot requested a review from a team May 19, 2020 21:56
@Borda Borda added the feature Is an improvement or enhancement label May 19, 2020
@Borda Borda added this to the 0.8.0 milestone May 19, 2020
@Borda (Member) commented May 19, 2020

for sure, with this change and the previously merged metrics we have to go with v0.8 as the next release :]
correct me if I am wrong, but I feel this is quite a major API change... @PyTorchLightning/core-contributors

@williamFalcon (Contributor, Author) commented May 19, 2020

@yukw777 ok... the last issue to handle is what you mentioned.
I need to log off for today. Want to take a look at it?
The issue is definitely with the subclassing. Check the test_auto_hparams test in test_trainer.py

=========================================================================================== short test summary info ===========================================================================================
FAILED tests/trainer/test_trainer.py::test_auto_hparams - TypeError: __init__() got an unexpected keyword argument 'batch_size'
FAILED tests/trainer/test_trainer.py::test_dict_namespace_param_save_load - TypeError: __init__() got an unexpected keyword argument 'drop_prob'

@justusschock fixed all the checkpoint and yaml stuff. Take a look?

Comment on lines 22 to 58
def test_auto_hparams(tmpdir):
    class SubClassEvalModelTemplate(EvalModelTemplate):
        def __init__(self, subclass_arg=1200):
            super().__init__()

    class SubSubClassEvalModelTemplate(SubClassEvalModelTemplate):
        pass

    classes = [SubClassEvalModelTemplate, EvalModelTemplate, SubSubClassEvalModelTemplate]

    for CLASS in classes:
        # test that the model automatically sets the args passed into init as attrs
        model = CLASS()
        assert model.batch_size == 32
        model = CLASS(batch_size=179)
        assert model.batch_size == 179

        if isinstance(model, SubClassEvalModelTemplate):
            assert model.subclass_arg == 1200

        # verify that the checkpoint saved the correct values
        trainer = Trainer(max_steps=20)
        trainer.fit(model)
        raw_checkpoint_path = os.listdir(trainer.checkpoint_callback.dirpath)
        raw_checkpoint_path = [x for x in raw_checkpoint_path if '.ckpt' in x][0]
        raw_checkpoint_path = os.path.join(trainer.checkpoint_callback.dirpath, raw_checkpoint_path)
        raw_checkpoint = torch.load(raw_checkpoint_path)
        assert 'module_arguments' in raw_checkpoint
        assert raw_checkpoint['module_arguments']['batch_size'] == 179

        # verify that model loads correctly
        model = CLASS.load_from_checkpoint(raw_checkpoint_path)
        assert model.batch_size == 179

        # verify that we can overwrite whatever we want
        model = CLASS.load_from_checkpoint(raw_checkpoint_path, batch_size=99)
        assert model.batch_size == 99
Contributor Author:

@justusschock added this test. Any cases missing?

Member:

Yes: you only pass yaml-serializable stuff in there. For example, there may be users who pass their loss functions that way if they experiment with them, but you can't serialize things like torch.nn.MSELoss with yaml.

@williamFalcon (Contributor, Author)

@yukw777 @tullie this is what i added. Why do we need the datamodules?

    def _auto_register_hparams(self):
        """
        Removes the need to pass in hparams. Instead, we register every argument in init
        to the module with some caveats:
        1. we don't overwrite the property if it already exists
        2. we also store a module_arguments property for model loading and saving
        """
        # two frames back is the init of the child module
        frame = inspect.currentframe()
        frame_args = frame.f_back.f_back.f_locals

        # we'll save hparams automatically (renamed to module_arguments)
        module_arguments = {}

        # pull out the child itself to make sure we have no issues
        child = frame_args['self']

        # auto set the attr which enables self.attr anywhere in the code
        for name, value in frame_args.items():

            # don't add self
            if name not in ['self']:

                # only track some things
                is_trackable = self._is_allowed_hparam_value(value)

                # don't overwrite something already set
                if not hasattr(child, name) and is_trackable:
                    setattr(child, name, value)

                if is_trackable:
                    module_arguments[name] = value

        # set module_arguments in child
        setattr(child, 'module_arguments', module_arguments)

    def _is_allowed_hparam_value(self, value):
        if isinstance(value, Namespace):
            return True
        return not hasattr(value, '__dict__')

@mergify mergify bot requested a review from a team May 20, 2020 13:47
@Borda (Member) commented May 20, 2020

have you seen this recently?
AttributeError: module 'tensorflow' has no attribute 'io'

pass


class AggSubClassEvalModel(SubClassEvalModel):
Member:

@williamFalcon here it is as init arg

@mergify mergify bot requested a review from a team May 24, 2020 17:51
@@ -119,6 +120,12 @@
else:
HOROVOD_AVAILABLE = True

PRIMITIVE_TYPES = (
Member:

@williamFalcon @awaelchli @yukw777 @festeh shall any other primitives be stored in the checkpoint?

@festeh (Contributor) commented May 24, 2020:

I'm using dataclasses. They are actually great in that they give you auto-completion (plain config classes would allow it too) and can provide some validation. But I have never heard of anybody else doing that, so I could probably just patch this variable in my code or invent some other hack.

In general I think we cannot handle all cases here, so it would be beneficial to allow the user to manually save some picklable argument, maybe via a @should_pickle(argument) decorator. I'll try to design this feature.
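A very rough sketch of what such a decorator might look like (entirely hypothetical; this is the feature being proposed, not anything in this PR):

from pytorch_lightning import LightningModule

def should_pickle(*arg_names):
    # class decorator marking extra init args that should be pickled into
    # the checkpoint even though they are not primitives
    def wrap(cls):
        existing = getattr(cls, '_extra_pickled_args', ())
        cls._extra_pickled_args = tuple(existing) + tuple(arg_names)
        return cls
    return wrap

@should_pickle('vocab')
class LitModel(LightningModule):
    def __init__(self, vocab, hidden_dim=128):
        super().__init__()
        self.vocab = vocab
        self.hidden_dim = hidden_dim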

Member:

that sounds interesting; for this PR I would just stay with the complete list so we can get it done, and kindly ask you to make a follow-up PR with your suggestion?

Contributor:

yeah, sure

@mergify mergify bot requested a review from a team May 24, 2020 18:08

def _collect_init_args(frame, path_args: list) -> list:
"""Recursive search for all children."""
if '__class__' in frame.f_locals:
Member:

What does this do?
Here is an example we should try to handle, or at least give a warning for:

from pytorch_lightning import LightningModule
class Example(LightningModule):

    def __init__(this, arg):
        super().__init__()
        this.arg = arg

    def forward(self, x):
        pass

x = Example(1)
print(x.module_arguments)
#  {'this': Example(), 'arg': 1}

module_arguments contains the object itself.
If this PR needs to be merged asap, I hope we can at least put it on the list of TODOs.

Member:

true, it would contain the instance (aka self), but it will be filtered out as it is not a primitive...
so you recommend doing the primitive filtering already here?

@awaelchli (Member) commented May 24, 2020:

not sure, but we need to also inspect the constructor to get a list of accepted args and filter based on that, because of this example:

from pytorch_lightning import LightningModule

class Example(LightningModule):

    def __init__(self, arg):
        my_local_var = 2
        super().__init__()

    def forward(self, x):
        pass

x = Example(1)
print(x.module_arguments)
# {'arg': 1, 'my_local_var': 2}

This will fail when we try to restore and pass in the local var which is not an argument in the constructor.

We should probably filter based on
inspect.signature(Example.__init__).parameters
and only save these locals, not the others.
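As a sketch, that filtering could look roughly like this inside the frame-inspection helper (names are illustrative):

import inspect

def filter_to_init_args(cls, local_vars: dict) -> dict:
    # keep only the locals that correspond to parameters of cls.__init__
    init_params = inspect.signature(cls.__init__).parameters
    return {
        name: value
        for name, value in local_vars.items()
        if name in init_params and name != 'self'
    }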

Member:

x = Example(1)
print(x.module_arguments)
# {'arg': 1, 'my_local_var': 2}

this works, as in the frame it always appears under the arg name...
and since it is saved you can call x = Example(arg=1)

Member:

but there is an extra arg.
Example.load_from_checkpoint(...)
will load the module arguments
module_arguments = {'arg': 1, 'my_local_var': 2}
and then call
Example(**module_arguments)
which means my_local_var is passed in, but this name is not accepted,
thus yielding TypeError: __init__() got an unexpected keyword argument 'my_local_var'.
I am 99% certain.

Member:

well, wait: if my_local_var is not among the __init__ arguments, it won't be present in module_arguments either...

Member:

@awaelchli mind adding an edge case to the test so we know the exact case and also we can truly test it... :]

Member:

in a follow-up or here? because William said he wants to merge asap


Contributor Author:

let’s add it as a follow up.

ready to merge? happy to merge now

@mergify mergify bot requested a review from a team May 24, 2020 18:43
@williamFalcon williamFalcon merged commit caa9c67 into master May 24, 2020
@williamFalcon (Contributor, Author)

Great job @Borda, thanks for the feedback @yukw777 @awaelchli!

@PyTorchLightning/core-contributors play with this for a bit? also verify that hydra still works? @tullie

@yukw777 (Contributor) commented May 24, 2020

Thank you for pushing through this @Borda ! Glad I was able to help out here.

@awaelchli (Member)

Really cool idea this PR! It will simplify checkpointing a lot :)

I compiled a list to keep track of unresolved issues discussed in this thread here. Feel free to add anything I have missed.
See #1937

@DKandrew (Contributor) commented May 28, 2020

Received a warning when running the Trainer UserWarning: Did not find hyperparameters at model hparams. Saving checkpoint without hyperparameters. Does it relate to the legacy code of hparams?

@Borda (Member) commented May 28, 2020

> Received a warning when running the Trainer UserWarning: Did not find hyperparameters at model hparams. Saving checkpoint without hyperparameters. Does it relate to the legacy code of hparams?

mind shooting an issue?

@DKandrew (Contributor)

> Received a warning when running the Trainer UserWarning: Did not find hyperparameters at model hparams. Saving checkpoint without hyperparameters. Does it relate to the legacy code of hparams?
>
> mind shooting an issue?

Sure!

@drozzy commented Jun 1, 2020

This is awesome, when can we expect this in the release?

@drozzy commented Jun 1, 2020

Wait, does this solve the issue of using params in jupyter notebooks? In other words, can I omit argparse with this?

williamFalcon pushed a commit that referenced this pull request Jun 17, 2020
* Misleading exception raised during batch scaling

Use batch_size from `model.hparams.batch_size` instead of `model.batch_size`

* Improvements considering #1896

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
williamFalcon pushed a commit that referenced this pull request Jul 29, 2020
* Misleading exception raised during batch scaling

Use batch_size from `model.hparams.batch_size` instead of `model.batch_size`

* Improvements considering #1896

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

@austinmw commented Apr 2, 2021

Hi, what's the recommended way to use this with argparse?

model = LitModel(**vars(args))?

@awaelchli (Member)

@austinmw Yes
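For example (a minimal sketch reusing the LitModel from the PR description; the argument names are illustrative):

from argparse import ArgumentParser

parser = ArgumentParser()
parser.add_argument('--in_dim', type=int, default=128)
parser.add_argument('--out_dim', type=int, default=10)
args = parser.parse_args()

# Namespace -> dict -> keyword args matching LitModel.__init__
model = LitModel(**vars(args))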
