# Difference between iterable-style and map-style DataPipes

## Iterable-style DataPipes
An iterable-style DataPipe is an instance of a subclass of `IterableDataset` that implements the `__iter__()` protocol and represents an iterable over data samples. This type of dataset is particularly suitable when random reads are expensive or even impractical, and when the batch size depends on the fetched data.
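To make the `__iter__()` protocol concrete, here is a minimal plain-Python sketch. `CountingStream` is a hypothetical class, not part of torchdata or PyTorch; it only illustrates that an iterable-style source yields samples in order and offers no indexing.

```python
class CountingStream:
    """Hypothetical iterable-style source: yields samples one at a time."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        # Samples are produced lazily, in order; there is no __getitem__.
        for i in range(self.n):
            yield i + 1


stream = CountingStream(5)
samples = list(stream)  # iterating is the only way to read the data

try:
    stream[2]
    indexable = True
except TypeError:
    # A plain Python object without __getitem__ raises TypeError here.
    indexable = False
```

A plain object fails with `TypeError` when indexed; the exact exception raised by a real DataPipe may differ, since it depends on the base class.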

In [1]:
import torchdata.datapipes as dp

In [3]:
pipeline = (
    dp.iter.IterableWrapper(range(10))
    .map(lambda x: x + 1)
    .shuffle()
    .batch(batch_size=2)
    .shuffle()
)

for batch in pipeline:
    print(batch, end=' ')

[6, 5] [3, 10] [8, 2] [4, 9] [1, 7] 

An iterable-style DataPipe can't be accessed by index:

In [4]:
pipeline[2]

NotImplementedError: 

But you can force it into a list (the order differs from the loop above because the pipeline reshuffles on each pass):

In [5]:
list(pipeline)

[[6, 2], [10, 4], [5, 1], [9, 3], [7, 8]]

In [6]:
list(pipeline)[2]

[5, 1]
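If only one element is needed, building the whole list can be avoided. The following is a minimal sketch using the standard library's `itertools.islice`; `nth` is a hypothetical helper, and a plain generator stands in for the batched pipeline. It relies only on iteration, so it should apply to any iterable, but it still consumes every item before position `n`.

```python
from itertools import islice


def nth(iterable, n):
    """Return the n-th item (0-based) of an iterable, or None if it is shorter."""
    return next(islice(iterable, n, None), None)


# Stand-in for a batched pipeline: yields [1, 2], [3, 4], [5, 6], ...
batches = ([i, i + 1] for i in range(1, 11, 2))
third = nth(batches, 2)
```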

## Map-style DataPipes
A Map-style DataPipe is one that implements the `__getitem__()` and `__len__()` protocols, and represents a map from (possibly non-integral) indices/keys to data samples.

For example, when accessed with `mapdatapipe[idx]`, such a DataPipe could read the `idx`-th image and its corresponding label from a folder on disk.
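The two protocols can be sketched in plain Python as follows. `LabeledFolder` and its records are hypothetical and held in memory rather than read from disk; the point is only that `__getitem__()` gives random access by key (here a non-integer one) and `__len__()` reports the size.

```python
class LabeledFolder:
    """Hypothetical map-style source backed by an in-memory dict."""

    def __init__(self, records):
        # records maps a key (here a file name) to a (data, label) pair.
        self._records = dict(records)

    def __getitem__(self, key):
        return self._records[key]

    def __len__(self):
        return len(self._records)


folder = LabeledFolder({
    "cat.png": ("<cat pixels>", "cat"),
    "dog.png": ("<dog pixels>", "dog"),
})
sample = folder["dog.png"]  # random access by a non-integer key
size = len(folder)
```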

In [7]:
pipeline = (
    dp.map.SequenceWrapper(range(10))
    .map(lambda x: x + 1)
    .shuffle()
    .batch(batch_size=2)
    .shuffle()
)
for batch in pipeline:
    print(batch, end=' ')

[8, 5] [3, 9] [2, 10] [4, 1] [6, 7] 



In [8]:
pipeline[2]

[2, 10]