
Length check fails for IterableDataset #1076

Closed
snie2012 opened this issue May 27, 2020 · 4 comments · Fixed by #1077

snie2012 commented May 27, 2020

I am using an IterableDataset of unknown length. The engine fails inside the following if clause:

if hasattr(data, "__len__"):

Here is why it fails:

  1. The data is a PyTorch DataLoader, which has the __len__ attribute, so execution enters the if clause
  2. It runs epoch_length = len(data), which leads to length = self._IterableDataset_len_called = len(self.dataset) in dataloader.py in PyTorch
  3. Since the dataset is an IterableDataset, it has no length, so this raises something like TypeError: object of type 'IterableDataset' has no len()

My concern is that we shouldn't expect an IterableDataset to have __len__, and the code should not fail because of it. Any thoughts on this?
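The failure described above can be reproduced in a few lines (a minimal sketch; StreamDataset is a hypothetical stand-in for any IterableDataset without __len__):

```python
from torch.utils.data import DataLoader, IterableDataset

# Hypothetical minimal IterableDataset of unknown length (no __len__).
class StreamDataset(IterableDataset):
    def __iter__(self):
        return iter(range(10))

loader = DataLoader(StreamDataset(), batch_size=4)

# The DataLoader class itself defines __len__, so the hasattr check passes...
assert hasattr(loader, "__len__")

# ...but calling it delegates to len(self.dataset), which raises TypeError.
try:
    len(loader)
except TypeError as e:
    print(e)  # object of type 'StreamDataset' has no len()
```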

@vfdev-5 vfdev-5 added the bug label May 28, 2020

vfdev-5 commented May 28, 2020

@snie2012 thanks for the report! Yes, you are right about that. We have a test with an IterableDataset, but with a defined epoch_length (test_engine_with_iterable_dataloader). We definitely need to support IterableDataset with unknown length!
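A guard along these lines would cover the case (a minimal sketch of the idea only, not necessarily the actual code of the fix in #1077): treat a TypeError from len() the same as a missing __len__, leaving the epoch length unknown.

```python
from torch.utils.data import DataLoader, IterableDataset

# Sketch: hasattr alone is not enough, because a DataLoader wrapping an
# IterableDataset *has* __len__ but raises TypeError when it is called,
# so the call itself must also be guarded.
def try_get_length(data):
    if hasattr(data, "__len__"):
        try:
            return len(data)
        except TypeError:
            pass  # unknown-length data, e.g. an IterableDataset
    return None

# Hypothetical unknown-length stream for demonstration.
class StreamDataset(IterableDataset):
    def __iter__(self):
        return iter(range(10))

print(try_get_length([1, 2, 3]))                    # 3
print(try_get_length(DataLoader(StreamDataset())))  # None
```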

@vfdev-5 vfdev-5 self-assigned this May 28, 2020
vfdev-5 added a commit to vfdev-5/ignite that referenced this issue May 28, 2020
@vfdev-5 vfdev-5 mentioned this issue May 28, 2020
3 tasks
sdesrozis pushed a commit that referenced this issue May 28, 2020

vfdev-5 commented May 28, 2020

@snie2012 the bug should be fixed in the next nightly release.

@SantoshGuptaML

Should DistributedSampler also support IterableDataset? I get this error message:

    train_sampler = DistributedSampler(TrainDataset(config))
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/distributed.py", line 63, in __init__
    self.num_samples = int(math.ceil(len(self.dataset) * 1.0 / self.num_replicas))
TypeError: object of type 'TrainDataset' has no len()

where TrainDataset is a subclass of IterableDataset.


vfdev-5 commented Apr 29, 2021

@SantoshGuptaML no, torch's DistributedSampler does not support iterable datasets, only map-style datasets.

There are various ways to route samples depending on the process (distributed or simply multiprocessing).

In the case of distributed training, the simplest way to distribute the data (if possible) is to set up your iterable dataset depending on the rank:

import ignite.distributed as idist
rank = idist.get_rank()
ws = idist.get_world_size()
dataset = get_iterable_dataset(..., rank=rank, world_size=ws, ...)

where get_iterable_dataset is your custom method that fetches unique data for a rank.
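One way such a get_iterable_dataset could shard the stream (a hypothetical sketch; in a real distributed run, rank and world_size would come from idist.get_rank() / idist.get_world_size() as above):

```python
from torch.utils.data import IterableDataset

# Hypothetical sketch: each process keeps every world_size-th sample
# starting at its own rank, so ranks see disjoint slices of the stream
# without needing DistributedSampler or a __len__.
class RankShardedDataset(IterableDataset):
    def __init__(self, source, rank, world_size):
        self.source = source
        self.rank = rank
        self.world_size = world_size

    def __iter__(self):
        for i, sample in enumerate(self.source):
            if i % self.world_size == self.rank:
                yield sample

# Simulating two ranks in a single process:
shard0 = list(RankShardedDataset(range(10), rank=0, world_size=2))
shard1 = list(RankShardedDataset(range(10), rank=1, world_size=2))
print(shard0)  # [0, 2, 4, 6, 8]
print(shard1)  # [1, 3, 5, 7, 9]
```

Note that with this interleaving scheme every rank must be able to iterate the full source; for large streams, a per-rank data source (separate files per rank, for example) avoids the redundant iteration.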
