Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Design of DataModule #2707

Closed
awaelchli opened this issue Jul 25, 2020 · 3 comments 路 Fixed by #3344
Closed

Improve Design of DataModule #2707

awaelchli opened this issue Jul 25, 2020 · 3 comments 路 Fixed by #3344
Assignees
Labels
feature Is an improvement or enhancement help wanted Open to be worked on

Comments

@awaelchli
Copy link
Member

awaelchli commented Jul 25, 2020

馃殌 Feature

Motivation

Since the introduction of datamodules we have duplicated code. For example, the train/val/test_dataloader methods in LightningModule and DataModule have the same name, same docs, same signature but live in two files.
Also the argparse methods are copy pasted from Trainer, it will become impossible to maintain this in the future.

Pitch

Factor out the hook definitions:

class DataHooks:

    def prepare_data(...):
    """ docs... """

    def setup(...):
    """ docs... """

    def train_dataloader(...):
    """ docs... """

    etc.

and then in the other classes, inherit from it:

class LightningModule(..., ModelHooks, DataHooks, ...)
    ...

class DataModule(..., DataHooks, ...)
    ...

Note also:

  • we even have duplicated tests

Alternatives

I see no alternatives. This is the best way I know.

@PyTorchLightning/core-contributors makes sense?

@awaelchli awaelchli added feature Is an improvement or enhancement help wanted Open to be worked on labels Jul 25, 2020
@Borda
Copy link
Member

Borda commented Jul 25, 2020

yes, just make them abstract...

@williamFalcon
Copy link
Contributor

A few things to keep in mind:

  1. We want datamodules to be usable even if you're not using lightning for training.
  2. We are moving away from Mixins.

@awaelchli
Copy link
Member Author

We want datamodules to be usable even if you're not using lightning for training.

This won't be changed by my suggestion. It's just refactoring, nothing else.

We are moving away from Mixins.

Which mixins? I am aware that in Lightning some classes are named mixins but conceptually they are actually not and should not be called that. Mixins have their place and one should not confuse multiple inheritance with mixins. Could you clarify what you mean?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Is an improvement or enhancement help wanted Open to be worked on
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants