
Add Support for multiple train loaders #1959

Merged
merged 38 commits into from Jan 4, 2021

Conversation

justusschock
Member

@justusschock justusschock commented May 26, 2020

Before submitting

  • Was this discussed/approved via a GitHub issue? (no need for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
  • If you made a notable change (that affects users), did you update the CHANGELOG?

What does this PR do?

When finished, this PR adds support for drawing batches from multiple train loaders at once. If the loaders are specified as a Mapping (dict), the resulting batch will contain one batch per loader, stored under the same keys as the loaders. For example:

loaders = {"x": loader_x, "y": loader_y, "z": loader_z}

will result in a batch like this:

{"x": batch_from_loader_x, "y": batch_from_loader_y, "z": batch_from_loader_z}

and loaders specified as a sequence will produce a sequence batch built from the separate batches in the same order:

loaders = [loader_0, loader_1, loader_2]

will result in a batch like this:

[batch_from_loader_0, batch_from_loader_1, batch_from_loader_2]
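
The combining behavior above can be sketched torch-free, with plain iterables standing in for DataLoaders. `combined_batches` is a hypothetical helper for illustration only, not the PR's implementation; for simplicity this sketch ends the epoch when the shortest loader runs out (the handling of unequal lengths is debated later in this thread).

```python
def combined_batches(loaders):
    """Yield one combined batch per step, mirroring the loaders' structure."""
    if isinstance(loaders, dict):
        iterators = {key: iter(loader) for key, loader in loaders.items()}
        while True:
            try:
                # One batch per loader, under the loader's key.
                yield {key: next(it) for key, it in iterators.items()}
            except StopIteration:
                return
    else:
        # A sequence of loaders yields sequence batches, in order;
        # zip stops at the shortest loader.
        yield from zip(*loaders)

batches = list(combined_batches({"x": [1, 2], "y": [10, 20]}))
# batches == [{'x': 1, 'y': 10}, {'x': 2, 'y': 20}]
```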

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@justusschock justusschock added the feature Is an improvement or enhancement label May 26, 2020
@justusschock justusschock self-assigned this May 26, 2020
@pep8speaks

pep8speaks commented May 26, 2020

Hello @justusschock! Thanks for updating this PR.

Line 260:93: W291 trailing whitespace
Line 310:1: W293 blank line contains whitespace

Line 45:1: W293 blank line contains whitespace

Line 177:1: W391 blank line at end of file

Comment last updated at 2021-01-04 19:58:04 UTC

@mergify mergify bot requested a review from a team May 26, 2020 15:01
@awaelchli
Member

In the eval loop we pass in a dataloader_idx, what are your thoughts on that? Does it make sense or not to have that in training_step as well?

@justusschock
Member Author

I don't think so, because in the validation phase we use the dataloaders sequentially, whereas in training we use them in parallel.

@awaelchli
Member

Yup, I like your idea. Just wanted to raise this because in Slack there was talk about making it consistent with the eval loops.

@Borda Borda modified the milestones: 0.7.7, 0.8.0 May 26, 2020
@williamFalcon
Contributor

How do we handle different length datasets?

Option 1:
Cycle through smaller one

(L is a long dataset, 0,1,2,3,4 are cycles of the smaller datasets)
LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL
0000000 1111111111 22222222 33333333 44

Option 2:
Cycle through the min length of all datasets.

I personally think it should be option 1.

@justusschock one very simple way to get this feature right now is just to add all training datasets to a concat dataset for the user:

import torch


class ConcatDataset(torch.utils.data.Dataset):
    """Combines datasets side by side, cycling the shorter ones (option 1)."""

    def __init__(self, *datasets):
        self.datasets = datasets

    def __getitem__(self, i):
        # Shorter datasets wrap around via modulo indexing.
        result = []
        for dataset in self.datasets:
            cycled_i = i % len(dataset)
            result.append(dataset[cycled_i])

        return tuple(result)

    def __len__(self):
        # The longest dataset determines the epoch length.
        return max(len(d) for d in self.datasets)
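
The cycling logic of the ConcatDataset above can be checked in isolation with a torch-free stand-in, where plain lists play the role of datasets (`CycledConcat` is an illustrative rename, not code from the PR):

```python
class CycledConcat:
    """Torch-free version of the max-length-with-cycling behavior (option 1)."""

    def __init__(self, *datasets):
        self.datasets = datasets

    def __getitem__(self, i):
        # Shorter datasets wrap around via modulo indexing.
        return tuple(d[i % len(d)] for d in self.datasets)

    def __len__(self):
        # The epoch length is governed by the longest dataset.
        return max(len(d) for d in self.datasets)

ds = CycledConcat([0, 1, 2], list(range(7)))
assert len(ds) == 7       # longest dataset wins
assert ds[5] == (2, 5)    # 5 % 3 == 2 cycles the shorter dataset
```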

@justusschock
Member Author

justusschock commented May 26, 2020

@williamFalcon I think it should be the minimum length, since this would be the most explicit version.
IMO the cycling should be done by the user within the dataset. (I don't favor this version, but just wanted to state it here)

Another option would be to sample from each loader as long as they did not yet run out of samples and to omit the loaders which are already exhausted.
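
That third option can be sketched torch-free, with dicts of iterables standing in for loaders (`batches_until_all_exhausted` is a hypothetical helper for illustration, not the PR's implementation):

```python
def batches_until_all_exhausted(loaders):
    """Keep drawing from every loader that still has samples,
    dropping loaders from the batch as they run out."""
    iterators = {key: iter(loader) for key, loader in loaders.items()}
    while iterators:
        batch = {}
        for key in list(iterators):
            try:
                batch[key] = next(iterators[key])
            except StopIteration:
                # This loader is exhausted; omit it from now on.
                del iterators[key]
        if batch:
            yield batch

out = list(batches_until_all_exhausted({"a": [1, 2], "b": [10]}))
# out == [{'a': 1, 'b': 10}, {'a': 2}]
```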

@williamFalcon
Contributor

But it seems weird to me that I wouldn't use the full dataset if I had a smaller one.
Maybe we need to support a few modes?

@Borda
Member

Borda commented Jun 9, 2020

@justusschock any progress here?
or someone from @PyTorchLightning/core-contributors can help...

@Borda Borda added help wanted Open to be worked on waiting on author Waiting on user action, correction, or update labels Jun 9, 2020
@justusschock
Member Author

I'll get this done, once the metrics are finally finished :D Sorry for the delay on my side

@reactivetype

@justusschock any progress here?

I could get around this problem using a custom PyTorch BatchSampler which I pass to the dataloader. My dataset's __getitem__ takes a tuple of integers as the index; each integer fetches a data item from the corresponding dataset. The only drawback is that the data items need to have the same shape for all 3 datasets, but the batch size can be different for each dataset's items.
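
A simplified, torch-free sketch of that workaround: a batch sampler that yields batches of index *tuples*, one integer per underlying dataset, so a single __getitem__ can fetch one item from each. The class name, sizes, and single shared batch size here are illustrative assumptions, not details from the comment above.

```python
import random

class TupleBatchSampler:
    """Yields batches of per-dataset index tuples (hypothetical sketch)."""

    def __init__(self, dataset_sizes, batch_size):
        self.dataset_sizes = dataset_sizes
        self.batch_size = batch_size

    def __len__(self):
        # One epoch is bounded by the smallest dataset in this sketch.
        return min(self.dataset_sizes) // self.batch_size

    def __iter__(self):
        for _ in range(len(self)):
            # Each sample in the batch is a tuple with one random index
            # per dataset, each valid for its own dataset.
            yield [
                tuple(random.randrange(size) for size in self.dataset_sizes)
                for _ in range(self.batch_size)
            ]

batches = list(TupleBatchSampler([100, 50, 200], batch_size=4))
assert len(batches) == 12                     # 50 // 4 steps per epoch
assert all(len(b) == 4 for b in batches)      # batch_size tuples per batch
```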

@justusschock justusschock marked this pull request as ready for review June 29, 2020 13:35
@justusschock justusschock changed the title WIP: Add Support for multiple train loaders Add Support for multiple train loaders Jun 29, 2020
@justusschock
Member Author

justusschock commented Jun 29, 2020

Okay, this is almost finished. Currently there is still a bug if a loader has no length. Any ideas how we should proceed with this? Shall we simply set the overall length to inf?

@awaelchli @williamFalcon @Borda

@Borda
Member

Borda commented Jun 29, 2020

I guess we made a similar "hotfix" for the validation dataloader and set the length to inf:
https://github.com/PyTorchLightning/pytorch-lightning/blob/f1c96930b19e608f9875df642235cc48dea2f8ee/pytorch_lightning/trainer/data_loading.py#L288-L291
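
The "length = inf" fallback discussed here can be sketched in a few lines (`loader_length` is an illustrative helper, not the linked implementation): loaders that define no __len__, e.g. ones backed by iterable-style datasets, get an infinite length instead of raising.

```python
def loader_length(loader):
    """Return len(loader), falling back to inf for loaders without __len__."""
    try:
        return len(loader)
    except TypeError:
        return float("inf")

assert loader_length([1, 2, 3]) == 3
assert loader_length(iter([1, 2, 3])) == float("inf")
```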

@christofer-f
Contributor

Hi, @omiita spotted this error...
The following code gives the wrong number of iterations in a training cycle:

import os
import torch
from torch.nn import functional as F
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision import transforms
import pytorch_lightning as pl

class MNISTModel(pl.LightningModule):
    def __init__(self):
        super(MNISTModel, self).__init__()
        self.l_mnist = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return torch.relu(self.l_mnist(x.view(x.size(0), -1)))

    def training_step(self, batch, batch_idx):
        x, y = batch['mnist']
        y_hat = torch.relu(self.l_mnist(x.view(x.size(0), -1)))
        loss_mnist = F.cross_entropy(y_hat, y)
        tensorboard_logs = {'train_loss': loss_mnist}
        return {'loss': loss_mnist, 'log': tensorboard_logs}

    def configure_optimizers(self):
        opt_mnist = torch.optim.Adam(self.l_mnist.parameters(), lr=0.02)
        return opt_mnist

    def train_dataloader(self):
        loader_mnist = DataLoader(
            MNIST(os.getcwd(), train=True, download=True, transform=transforms.ToTensor()),
            batch_size=32,
        )
        loaders = {"mnist": loader_mnist}
        return loaders

def main():
    mnist_model = MNISTModel()
    trainer = pl.Trainer(gpus=1, fast_dev_run=False, max_epochs=1)
    trainer.fit(mnist_model)

if __name__ == "__main__":
    main()

@Borda Borda modified the milestones: 1.1.x, 1.2 Dec 31, 2020
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
pytorch_lightning/trainer/supporters.py (outdated, resolved)
pytorch_lightning/utilities/data.py (outdated, resolved)
tests/base/model_train_dataloaders.py (outdated, resolved)
Borda and others added 2 commits December 31, 2020 10:53
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
pytorch_lightning/trainer/data_loading.py (resolved)
pytorch_lightning/trainer/supporters.py (resolved)
pytorch_lightning/trainer/supporters.py (outdated, resolved)
    length = all_lengths

elif isinstance(all_lengths, Mapping):
    length = compute_func(all_lengths.values())
Contributor

What happens if something defines something like

{"a":{"b":...}}

Member Author

good point, this would currently fail. Is this something we want to enable?

Contributor

@tchaton tchaton Jan 4, 2021

Not at this point I think. People will open an issue if needed.
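
For reference, supporting nested structures like the one raised above would amount to recursing over the lengths before reducing them. This is a hypothetical sketch of what that could look like, not the merged implementation (which handles only one level):

```python
from collections.abc import Mapping

def compute_length(lengths, compute_func=min):
    """Recursively reduce a nested Mapping/sequence of loader lengths."""
    if isinstance(lengths, Mapping):
        return compute_func(compute_length(v, compute_func) for v in lengths.values())
    if isinstance(lengths, (list, tuple)):
        return compute_func(compute_length(v, compute_func) for v in lengths)
    # Base case: a plain integer length.
    return lengths

assert compute_length({"a": {"b": 3, "c": 5}, "d": 4}) == 3
```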

pytorch_lightning/trainer/supporters.py Outdated Show resolved Hide resolved
pytorch_lightning/trainer/supporters.py Show resolved Hide resolved
pytorch_lightning/trainer/train_loader_patch.py Outdated Show resolved Hide resolved
Contributor

@tchaton tchaton left a comment

Awesome work !

@Borda Borda merged commit d88cf4a into release/1.2-dev Jan 4, 2021
@Borda Borda deleted the train_loaders branch January 4, 2021 19:57
@Borda
Member

Borda commented Jan 4, 2021

@justusschock it seems that this breaks the checks, mind checking it ASAP 🐰

@adamgayoso

It doesn't look like the documentation for fit() was updated. Should fit() be able to take multiple train dataloaders as well?

Labels
design (Includes a design discussion), feature (Is an improvement or enhancement), help wanted (Open to be worked on), priority: 0 (High priority task), ready (PRs ready to be merged)
Development

Successfully merging this pull request may close these issues.

Using multiple dataloaders in the training_step?