
Why provide a BatchMetaDataLoader if meta-sets have the same API as normal pytorch data-sets? #76

Closed
renesax14 opened this issue Jul 10, 2020 · 1 comment

@renesax14

I was reading the very helpful paper for the library and saw this paragraph, which confused me about the implementation decisions and the intended usage of the library:

2.4 Meta Data-loaders
The objects presented in Sections 2.1 & 2.2 can be iterated over to generate datasets from the meta-training set; these datasets are PyTorch Dataset objects, and as such can be included as part of any standard data pipeline (combined with DataLoader). Nonetheless, most meta-learning algorithms operate better on batches of tasks. Similar to how examples are batched together with DataLoader in PyTorch, Torchmeta exposes a MetaDataLoader that can produce batches of tasks when iterated over.

In particular, it says that the meta-sets (whether they inherit from CombinationMetaDataset or MetaDataset) are just normal PyTorch datasets.
If they have the same API as normal PyTorch datasets, then why not just always pass them directly to the standard PyTorch DataLoader? Why provide this interface at all:

dataloader = torchmeta.utils.data.BatchMetaDataLoader(dataset, batch_size=16)
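
For concreteness, the standard alternative I have in mind would be something like this (a sketch, assuming dataset is one of the meta-sets above):

from torch.utils.data import DataLoader

# the standard PyTorch dataloader, with its default (integer-based) sampler
dataloader = DataLoader(dataset, batch_size=16)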

I think a comment about this somewhere (probably in the paper) would be good.

I've of course read the paper (twice now), and I hope I didn't miss this detail if it was mentioned.

@tristandeleu
Owner

BatchMetaDataLoader is just syntactic sugar for torch.utils.data.DataLoader, with a special collate function and sampler. The reason you'd want to use BatchMetaDataLoader for Torchmeta's datasets over torch.utils.data.DataLoader is that the defaults for torch.utils.data.DataLoader were made specifically for standard supervised learning, not for the episodes necessary in meta-learning: elements of Torchmeta's datasets are indexed with tuples of classes, as opposed to integers for standard PyTorch datasets.
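
To illustrate the indexing difference (a sketch using the omniglot helper from torchmeta.datasets.helpers; the exact structure of a task depends on the dataset transform):

from torchmeta.datasets.helpers import omniglot

# a 5-way, 1-shot meta-training set; each element is a whole task
dataset = omniglot("data", ways=5, shots=1, test_shots=15,
                   meta_train=True, download=True)

# a task is indexed by a tuple of 5 class indices, not a single integer
task = dataset[(0, 1, 2, 3, 4)]

# indexing with an integer, as the default sampler of
# torch.utils.data.DataLoader would do, raises an error:
# dataset[0]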

To summarize, the datasets are indeed normal PyTorch datasets, in the sense that they have the same API, but they use a different indexing scheme, which requires different functions in the DataLoader. This could be made more explicit in the documentation.
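
Putting it together, the intended usage looks something like this (a sketch; the shapes in the comments assume the 5-way, 1-shot Omniglot set above):

from torchmeta.datasets.helpers import omniglot
from torchmeta.utils.data import BatchMetaDataLoader

dataset = omniglot("data", ways=5, shots=1, test_shots=15,
                   meta_train=True, download=True)

# same arguments as torch.utils.data.DataLoader, but with a sampler that
# draws tuples of classes and a collate function that stacks whole tasks
dataloader = BatchMetaDataLoader(dataset, batch_size=16, num_workers=4)

for batch in dataloader:
    train_inputs, train_targets = batch["train"]
    # train_inputs: (16, 5, 1, 28, 28) -- 16 tasks, 5 * 1 support examples each
    test_inputs, test_targets = batch["test"]
    # test_inputs: (16, 75, 1, 28, 28) -- 16 tasks, 5 * 15 query examples each
    break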
