# "TorchData: PyTorch Data loading utility library"

> Learn how to load image data with TorchData

- toc: true 
- badges: true
- comments: true
- categories: [pytorch]
- image: https://images.pexels.com/photos/1029635/pexels-photo-1029635.jpeg?auto=compress
- keywords: PyTorch, deep learning, data


# In this tutorial, we will learn how to build a CIFAR-10 image loading pipeline with TorchData


![Photo by Scott Webb from Pexels](https://images.pexels.com/photos/1029635/pexels-photo-1029635.jpeg?auto=compress&cs=tinysrgb&w=1260&h=750&dpr=2)

In [1]:
# DataPipes were removed in torchdata 0.10.0, so pin an earlier release
!pip install "torchdata<0.10" -q

In [1]:
from PIL import Image
from torch.utils.data import DataLoader
from torchdata.datapipes.iter import FileLister, FileOpener, Filter
from torchvision.transforms.functional import to_tensor

In [2]:
ROOT = "/Users/aniket/datasets/cifar-10/train"

In [3]:
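# Read trainLabels.csv and yield one [id, label] row per image
csv_dp = FileLister(f"{ROOT}/../trainLabels.csv")
csv_dp = FileOpener(csv_dp)
csv_dp = csv_dp.parse_csv()
# Drop the header row, whose second column is the literal string "label"
csv_dp = Filter(csv_dp, lambda x: x[1] != "label")

# Map each class name to an integer index; sorting keeps the mapping stable across runs
labels = {e: i for i, e in enumerate(sorted({e[1] for e in csv_dp}))}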



In [4]:
x = iter(csv_dp)
next(x)

['1', 'frog']
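
To make the CSV step easier to follow, here is a minimal plain-Python sketch of what the pipeline above yields and how the label mapping is built. It uses only the standard library; `SAMPLE_CSV`, `read_rows`, and `label_to_index` are hypothetical names, and the rows are illustrative stand-ins for the real `trainLabels.csv`.

```python
import csv
import io

# Illustrative stand-in for trainLabels.csv: a header plus a few sample rows
SAMPLE_CSV = "id,label\n1,frog\n2,truck\n3,truck\n4,deer\n"


def read_rows(text):
    """Yield [id, label] rows, skipping the header row."""
    for row in csv.reader(io.StringIO(text)):
        # Same test as the Filter step: the header has "label" in column 1
        if row[1] != "label":
            yield row


rows = list(read_rows(SAMPLE_CSV))

# Sort the distinct class names so the indices do not depend on set ordering
label_to_index = {name: i for i, name in enumerate(sorted({r[1] for r in rows}))}

print(rows[0])
print(label_to_index)
```

With these sample rows, the first row is `['1', 'frog']` and the mapping is `{'deer': 0, 'frog': 1, 'truck': 2}`. Sorting before enumerating is a design choice: a bare `set` of strings can iterate in a different order from one Python process to the next, which would silently change the class indices.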

In [7]:
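def get_filename(data):
    # Turn a CSV row [id, label] into (path to the PNG, label)
    idx, label = data
    return f"{ROOT}/{idx}.png", label

def load_image(data):
    # Open the image file with PIL; the label passes through unchanged
    file, label = data
    return Image.open(file), label

def process(data):
    # Convert the PIL image to a float tensor and the class name to its integer index
    img, label = data
    return to_tensor(img), labels[label]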

In [8]:
dp = csv_dp.map(get_filename)
dp = dp.map(load_image)
dp = dp.map(process)

In [9]:
dl = DataLoader(
    dp,
    batch_size=4,
    shuffle=True,
)

In [10]:
next(iter(dl))[0].shape

torch.Size([4, 3, 32, 32])
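
One point worth making explicit is shuffling. For a DataPipe, `shuffle=True` in `DataLoader` switches `Shuffler` steps in the pipeline on or off, and how a pipeline with no `Shuffler` is handled appears to vary between PyTorch releases. The sketch below, which assumes a toy integer source and a hypothetical `double` function in place of image loading, adds an explicit `.shuffle()` so the intent does not depend on that behavior.

```python
from torch.utils.data import DataLoader
from torch.utils.data.datapipes.iter import IterableWrapper


def double(x):
    # Stand-in for the expensive per-sample work (opening and decoding an image)
    return x * 2


dp = IterableWrapper(range(8))
dp = dp.shuffle()    # explicit Shuffler, placed before the expensive map
dp = dp.map(double)

dl = DataLoader(dp, batch_size=4, shuffle=True)

batches = [batch.tolist() for batch in dl]
seen = sorted(v for batch in batches for v in batch)
print(len(batches), seen)
```

Every element still appears exactly once across the two batches; only the order changes. Shuffling the lightweight rows before the `map` steps, rather than the decoded tensors after them, keeps the shuffle buffer small.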