Dict specified val_dataloader, test_dataloader and predict_dataloader #16015
Comments
Hi, thanks for your interest in this feature!
Yes, sort of. Multiple dataloader support for validation existed before it was added to the training loop. Until that point, validation ran sequentially over the multiple dataloaders, which made the most sense because validation was typically just about collecting metrics. When multiple dataloader support was considered for training_step, we found that the most practical use would be to fetch the data from the dataloaders individually and give the collated batches to the training_step. This is why there is a difference. As you say, perhaps in some cases it also makes sense to do the same for validation, where you could log the outputs side-by-side as we iterate through the dataloaders jointly. @justusschock probably knows a bit more about this and the edge cases here.
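To illustrate the difference described above, here is a small pure-Python sketch (no Lightning involved; the lists are hypothetical stand-ins for dataloaders):

```python
# Hypothetical mini-batches produced by two named dataloaders.
loader_a = [1, 2]
loader_b = [10, 20]

# Training-style: fetch from the dataloaders jointly and hand
# training_step one collated batch per step.
collated_steps = [{"a": a, "b": b} for a, b in zip(loader_a, loader_b)]
# -> [{"a": 1, "b": 10}, {"a": 2, "b": 20}]

# Evaluation-style: run sequentially, exhausting one dataloader at a time.
sequential_steps = [("a", x) for x in loader_a] + [("b", x) for x in loader_b]
# -> [("a", 1), ("a", 2), ("b", 10), ("b", 20)]
```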
This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions - the Lightning Team!
It's not clear to me what this issue requests. Support for multiple dataloaders has existed for a while, but they are run sequentially during evaluation. #17163 removes this sequential constraint. But the top post doesn't explicitly state that the problem is running sequentially.
@carmocca this issue requests the same dict-based dataloader support for val_dataloader, test_dataloader and predict_dataloader as exists for train_dataloader. Specifically, when setting up a LightningDataModule, train_dataloader can return a dict, where each key names a dataloader. For example, if I want my val_dataloader to contain 3 different dataloaders "val_dataloader A", "val_dataloader B" and "val_dataloader C", the only thing I can get is a list of dataloaders; I can't attach a name or any definition of what each dataloader is, so if by any chance the order of the dataloaders is changed, this can't be automatically detected or accounted for by the LightningModule. This is why the dict-based train_dataloader is convenient: it allows returning a named dict instead of a positional list.

The question is mostly why this dict-based dataloader organization is only available for train_dataloader and not the other methods.

Edit: This is also reflected in the documentation for the LightningDataModule, where the val_dataset, test_dataset and predict_dataset can't be of the dict type.

Edit 2: this person is also reporting the limitation I'm talking about: #16830 (comment)
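The reordering hazard described above can be sketched in plain Python (strings stand in for real DataLoader objects; all names here are hypothetical):

```python
# With a list, consumers can only address dataloaders by position.
val_loaders_list = ["loader_A", "loader_B", "loader_C"]
first = val_loaders_list[0]  # silently wrong if the order changes upstream

# With a dict, consumers address dataloaders by name, so a reordering
# upstream cannot mislabel logged metrics.
val_loaders_dict = {"A": "loader_A", "B": "loader_B", "C": "loader_C"}
loader_a = val_loaders_dict["A"]  # robust to ordering changes
```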
A dict can be returned, at least in version 2.0. Try playing around with this script:

```python
import os

import torch
from torch.utils.data import DataLoader, Dataset

from lightning.pytorch import LightningModule, Trainer


class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return self.len


class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def test_step(self, batch, batch_idx, dataloader_idx=0):
        print(dataloader_idx, batch_idx, batch)

    def test_dataloader(self):
        return {
            "A": DataLoader(RandomDataset(32, 64), batch_size=2),
            "B": DataLoader(RandomDataset(32, 64), batch_size=2),
        }


def run():
    model = BoringModel()
    trainer = Trainer(
        default_root_dir=os.getcwd(),
        limit_test_batches=1,
        barebones=True,
    )
    trainer.test(model)


if __name__ == "__main__":
    run()
```

But the dataloaders will be consumed sequentially (during validate, test and predict), and the dictionary keys are not included in the batch. Perhaps you meant that you are interested in consuming them using a different mode: https://lightning.ai/docs/pytorch/stable/data/iterables.html#multiple-iterables

Example:

```python
from lightning.pytorch.utilities import CombinedLoader


def test_dataloader(self):
    dataloaders = {
        "A": DataLoader(RandomDataset(32, 64), batch_size=2),
        "B": DataLoader(RandomDataset(32, 64), batch_size=2),
    }
    return CombinedLoader(dataloaders, mode="max_size_cycle")
```

Which in 2.0.1 raises:

```
ValueError: `trainer.test()` only supports the `CombinedLoader(mode="sequential")` mode.
```

But I just implemented support in master with #17163 if this is blocking you.
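As a rough pure-Python analogy of the `max_size_cycle` semantics (this is not Lightning's implementation, just a sketch of the joint-iteration behavior under the assumption that every iterable has a length):

```python
from itertools import cycle, islice


def max_size_cycle(iterables):
    """Yield one dict of batches per step, cycling shorter iterables
    until the longest one is exhausted."""
    max_len = max(len(it) for it in iterables.values())
    cycled = {name: islice(cycle(it), max_len) for name, it in iterables.items()}
    for _ in range(max_len):
        yield {name: next(gen) for name, gen in cycled.items()}


batches = list(max_size_cycle({"A": [1, 2, 3], "B": [10, 20]}))
# "B" is cycled so every step sees a batch from both dataloaders
```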
@carmocca it seems your proposal goes in the direction I'm pitching. If I can indeed consume the validation data and test data as ("dataloader name", batch), that should be fine. This helps when validating and testing on multiple dataloaders, since you know which dataloader is provided every time and the logging can reuse the name of the dataloader. Thank you for your help.

Edit: we may close this issue as resolved.
Description & Motivation
Conveniently, a DataModule can return a `train_dataloader()` which is a dict, where the name of a dataset is the key and the value is the dataloader. This is unfortunately not possible with `val_dataloader()`, `test_dataloader()` and `predict_dataloader()`. Yet this could be particularly interesting from a logging perspective, allowing the DataModule to name the datasets and reuse those names when logging the data from the LitModule. Perhaps there is a specific reason why this hasn't been implemented or can't be implemented, but it's not obvious to me.

I have tested the dict-based approaches above, and the documentation in `DataModule.from_datasets()` also reflects that limitation.
Pitch
No response
Alternatives
No response
Additional context
No response
Edit: typo
cc @Borda @justusschock @awaelchli