Description
❓ Questions/Help/Support
I've been successfully using ignite with regular `Dataset`/`TensorDataset` classes in the past. These have a fixed length and are tied to a `DataLoader` with a `DistributedSampler`. Keeping all other training hyper-parameters equal, if I increase the number of nodes/GPUs, I've always noticed that the ETA displayed by the `ProgressBar` decreases.
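Roughly what that map-style setup looks like (the tensor shapes, batch size, and variable names here are placeholders, not my actual values):

```python
import torch
from torch.utils.data import TensorDataset, DataLoader
from torch.utils.data.distributed import DistributedSampler

# Fixed-length, map-style dataset.
features = torch.randn(10_000, 16)
targets = torch.randint(0, 2, (10_000,))
dataset = TensorDataset(features, targets)

# DistributedSampler partitions the indices across ranks, so each rank sees
# roughly len(dataset) / world_size samples per epoch. This assumes
# torch.distributed is already initialized; otherwise num_replicas/rank
# must be passed explicitly.
sampler = DistributedSampler(dataset)
loader = DataLoader(dataset, batch_size=32, sampler=sampler)
```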
Then, I switched to an `IterableDataset` where the length was computable in advance, so `__len__` was defined. There is no `DistributedSampler` in this case because the dataset is iterable: the data files are grouped into distinct subsets in advance and assigned to different ranks. In this scenario too, I noticed that, keeping all else equal, the ETA displayed by `ProgressBar` decreases when the number of nodes/GPUs increases. Some earlier discussion on this is here: #1263.
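The sized `IterableDataset` is along these lines (the file list handling and the `parse_file` helper are simplified stand-ins for my actual data pipeline):

```python
import torch.distributed as dist
from torch.utils.data import IterableDataset

def parse_file(path):
    # Stand-in for reading one data file and yielding its samples.
    yield from []

class ShardedFileDataset(IterableDataset):
    def __init__(self, all_files, samples_per_file):
        rank = dist.get_rank() if dist.is_initialized() else 0
        world_size = dist.get_world_size() if dist.is_initialized() else 1
        # Each rank gets a distinct subset of files, so no DistributedSampler is needed.
        self.files = all_files[rank::world_size]
        self.samples_per_file = samples_per_file

    def __len__(self):
        # The total number of samples is computable in advance in this scenario.
        return len(self.files) * self.samples_per_file

    def __iter__(self):
        for path in self.files:
            yield from parse_file(path)
```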
Finally, I came across a setting with a massive dataset where the length (i.e., the number of data points) was not computable in advance. So I removed the `__len__` definition, making the `IterableDataset` more generic.
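The more generic variant is essentially the same class with `__len__` removed (again a simplified sketch):

```python
from torch.utils.data import IterableDataset

def parse_file(path):
    # Same stand-in helper as in the previous sketch.
    yield from []

class UnsizedShardedFileDataset(IterableDataset):
    def __init__(self, files):
        # `files` is already this rank's subset, assigned in advance as before.
        self.files = files

    def __iter__(self):
        for path in self.files:
            yield from parse_file(path)

    # No __len__: the DataLoader built on top of this has no length either,
    # so the trainer needs an explicit epoch_length.
```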
Unfortunately, in this final setting, I find that the ETA displayed by `ProgressBar` doesn't decrease when the number of nodes/GPUs increases. I tried training for a fixed 50000 iterations, i.e., an `epoch_length` of 50000. I notice that if I train on 1 GPU, the ETA is much shorter than if I train on more than 1 GPU. I also notice that the overall time taken per iteration is much lower when 1 GPU is used.
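The run itself looks roughly like this (`train_step` is a placeholder for my actual update function, and the generator stands in for the DataLoader over the unsized dataset):

```python
from ignite.engine import Engine
from ignite.contrib.handlers import ProgressBar

def train_step(engine, batch):
    # Placeholder for the real forward/backward/optimizer step.
    return 0.0

def batches():
    # Stand-in for iterating the DataLoader over the unsized IterableDataset.
    while True:
        yield None

trainer = Engine(train_step)
ProgressBar().attach(trainer)

# epoch_length is given explicitly because the data has no usable __len__;
# each rank runs the full 50000 iterations itself.
trainer.run(batches(), max_epochs=1, epoch_length=50000)
```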
I'm confused by this behavior; it doesn't seem like I'm doing anything incorrect. Could you please explain what may be happening?