Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Callbacks continued #889

Merged
merged 7 commits into from Feb 26, 2020
Merged

Callbacks continued #889

merged 7 commits into from Feb 26, 2020

Conversation

hadim
Copy link
Contributor

@hadim hadim commented Feb 18, 2020

This PR is based on #849.

A new callback mechanism is added where a user can pass a list of callbacks to the trainer that is called during the various steps of the trainer's life.

from pytorch_lightning.callbacks import Callback

class PrintCallback(Callback):
    def on_fit_begin(self):
        print("Training is started!")

    def on_fit_end(self):
        print(f"Training is done. The logs are: {self.trainer.logs}")

# a list of callbacks
callbacks = [PrintCallback()]
trainer = Trainer(callbacks=callbacks)

It allows inserting custom logic in the training loop outside of the LightningModule.

Let me know what do you guys think.

@Borda Borda added the feature Is an improvement or enhancement label Feb 18, 2020
environment.yml Outdated Show resolved Hide resolved
@hadim hadim force-pushed the callbacks branch 2 times, most recently from 8a7a829 to b9a04f5 Compare February 19, 2020 15:06
@pep8speaks
Copy link

pep8speaks commented Feb 19, 2020

Hello @hadim! Thanks for updating this PR.

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-02-26 00:59:24 UTC

@Borda Borda changed the title [WIP] Callbacks [WIP] Callbacks [dependence on #849] Feb 19, 2020
@hadim hadim force-pushed the callbacks branch 3 times, most recently from 21fbe62 to aa40e34 Compare February 19, 2020 16:26
@Borda Borda changed the title [WIP] Callbacks [dependence on #849] [blocked by #849] Callbacks [wip] Feb 21, 2020
@hadim hadim force-pushed the callbacks branch 2 times, most recently from 1e1729e to 5328e7b Compare February 22, 2020 13:40
@hadim hadim changed the title [blocked by #849] Callbacks [wip] Callbacks [wip] Feb 23, 2020
@hadim
Copy link
Contributor Author

hadim commented Feb 23, 2020

@williamFalcon about immutability, I had a look and I don't think this is something easy todo in Python without heavy hacks. See https://discuss.python.org/t/immutability-in-python-is-really-hard/2536/2

I think we should just make it very clear in the doc/docstring that callbacks MUST NOT modify the pl module and the trainer.

@williamFalcon
Copy link
Contributor

why do we want to add this limitation? shouldn’t the user be ok with messing around with the trainer or model if they want to in a callback?

@williamFalcon
Copy link
Contributor

also, should we rethink the name “Callback”?

this ALWAYS ends up needing a lengthy explanation about what it is. why not call it something more intuitive?

Maybe something like:
“FunctionBlock”
“CodeBlock”
“Snippet”

just brainstorming here lol. i just think it’s super confusing for newcomers and non engineers.

@hadim
Copy link
Contributor Author

hadim commented Feb 23, 2020

I haven't totally incorporated the 3 already existant callbacks in the new system (they still use the new Callback class). I think the merge of the existant callbacks should be done in another PR to not make this one too large in terms of changes.

About the name, I like Callback but I don't have strong feelings about it so whatever you decide is good for me.

@jeremyjordan
Copy link
Contributor

I haven't totally incorporated the 3 already existant callbacks in the new system (they still use the new Callback class). I think the merge of the existant callbacks should be done in another PR to not make this one too large in terms of changes.

Regarding this, should we update the early_stopping and model_checkpointing args to serve as simple flags to include the callbacks (or whatever we call them) in the list?

Trainer(..., checkpoint_callback=True)  # use default
Trainer(..., checkpoint_callback=False)  # disable

checkpoint_callback = ModelCheckpoint(filepath='my_path')
Trainer(..., checkpoint_callback=checkpoint_callback)  # warn user to pass via callbacks arg
Trainer(..., callbacks=[checkpoint_callback])  # use checkpoint callback with user provided spec

@jeremyjordan
Copy link
Contributor

why do we want to add this limitation? shouldn’t the user be ok with messing around with the trainer or model if they want to in a callback?

Yes, but if the user can modify the trainer then the callback quickly transitions from non-essential to essential for reproducibility.

However, there are cases when it would be desirable to modify objects in the trainer. For example, fast.ai has a callback for a learning rate finder which is quite useful. This would be something that the user would want to enable when first building a model but then disable once they find a good range and want to start training.

@Borda
Copy link
Member

Borda commented Feb 23, 2020

Regarding this, should we update the early_stopping and model_checkpointing args to serve as simple flags to include the callbacks (or whatever we call them) in the list?

Then we shall check that they are not duplicated, meaning when they are added via simple flag and later among callbacks?

@jeremyjordan
Copy link
Contributor

Ah yes, good point 😄 Perhaps in that case, we notice that an instance of ModelCheckpoint was provided in the callback list and issue a warning that the user provided instance will be used in place of the default instance.

checkpoint_callback = ModelCheckpoint(filepath='my_path')

Trainer(..., checkpoint_callback=True)  # use default
Trainer(..., callbacks=[checkpoint_callback])  # use checkpoint callback with user provided spec
Trainer(..., checkpoint_callback=True, callbacks=[checkpoint_callback])  # use checkpoint callback with user provided spec and alert user via a warning

What do you think? Does this resolve any potential confusion?

@Borda
Copy link
Member

Borda commented Feb 23, 2020

I think that the question is very hard because you do not have much clues if double callback (e.g. CheckPoint) is mistake or desired state... Maybe you would need to match also parameters of the ChP to determine whether or not they are identical...

@hadim
Copy link
Contributor Author

hadim commented Feb 23, 2020

What do you think if we finish the design of the callback system and discuss the way we integrate the existing callbacks in a new PR/issue?

Do you have any comments about the one implemented in this PR?

pytorch_lightning/trainer/trainer.py Outdated Show resolved Hide resolved
pytorch_lightning/trainer/trainer.py Outdated Show resolved Hide resolved
Copy link
Contributor

@jeremyjordan jeremyjordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work on this! just a few comments

hparams = tutils.get_hparams()
model = CurrentTestModel(hparams)

class TestCallback(Callback):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also assert that the trainer and pl_module arg values are an instance of the expected class?

pytorch_lightning/callbacks/base.py Show resolved Hide resolved
@@ -64,7 +64,7 @@ def __init__(self, monitor: str = 'val_loss', min_delta: float = 0.0, patience:
self.monitor_op = mode_dict[mode]
self.min_delta *= 1 if self.monitor_op == np.greater else -1

self.on_train_begin()
self.on_train_begin(None, None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's weird for a callback to invoke one of its own methods. i know this is how the EarlyStopping callback was previously written, but do you happen to know why this is the case? i would expect on_train_begin to be called when the TrainerCallbackHookMixin invokes it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's weird and should be fixed.

@jeremyjordan
Copy link
Contributor

@hadim could we consolidate the code such that model hooks and callbacks are invoked at the same time, and within the context of the a profiler?

with self.profiler.profile('on_epoch_start'):
    self.on_epoch_start()  # callbacks
    if self.is_function_implemented('on_epoch_start'):
        model = self.get_model()
        model.on_epoch_start()  # model hook

@hadim
Copy link
Contributor Author

hadim commented Feb 25, 2020

Yes, we should. Not sure when I'll have time to work on this PR this week. So we could merge it as it is and improve it in other PRs or if someone wants to push some commits here, feel free (I can add you to my fork).

@williamFalcon williamFalcon added this to the 0.6.1 milestone Feb 25, 2020
@williamFalcon
Copy link
Contributor

@hadim why don't we merge this asap so we can get it in the release.
@jeremyjordan maybe open a new PR to implement your suggestions? i do like the profiler addition here.

Great job everyone!

@hadim
Copy link
Contributor Author

hadim commented Feb 25, 2020

I agree @williamFalcon

Will rebase soon.

@hadim
Copy link
Contributor Author

hadim commented Feb 26, 2020

Rebasing wasn't a very funny moment here!

Please merge whenever you can @williamFalcon!

I'll open a new issue for what's next.

@hadim hadim mentioned this pull request Feb 26, 2020
5 tasks
@jeremyjordan
Copy link
Contributor

happy to pick up the work on a separate PR

@hadim
Copy link
Contributor Author

hadim commented Feb 26, 2020

@jeremyjordan look at #947

@williamFalcon williamFalcon merged commit be24456 into Lightning-AI:master Feb 26, 2020
@Borda Borda changed the title Callbacks [wip] Callbacks continued Feb 26, 2020
@Borda
Copy link
Member

Borda commented Feb 26, 2020

@hadim @jeremyjordan GREAT job! Pls could you update also the change log...

@hadim hadim deleted the callbacks branch February 26, 2020 13:05
@jeremyjordan jeremyjordan mentioned this pull request Feb 27, 2020
5 tasks
@Borda Borda mentioned this pull request Mar 4, 2020
tullie pushed a commit to tullie/pytorch-lightning that referenced this pull request Apr 3, 2020
* Add callback system + associated test

* Add trainer and pl_module args to callback methods

* typing

* typo in docstring

* Switch to on_.*_start()

* fix on_test_start

* fix the mess after rebasing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants