# Reading images from disk for a classification task
## Problem
You want to read images from disk for a classification task. Each image belongs to exactly one class.

## Solution TensorFlow

Images are read from the filesystem with the `ImageDataGenerator` class from the `tensorflow.keras.preprocessing.image` module. The class expects the following layout on the filesystem so that it can assign labels correctly.
```
root/class1/aaa.png
root/class1/bbb.png
...
root/class2/aaa.png
root/class2/bbb.png
```
The root directory contains one subfolder per class, and each subfolder holds the images of that class. When the class is instantiated, image augmentations and transformations can be specified (e.g. `rescale=1/255` to scale the pixel values to the range 0.0 to 1.0). The `flow_from_directory` method creates a generator that is used to train the model with the model's `fit_generator` method. Since TensorFlow 2.1, `fit_generator` is deprecated and the model's `fit` method accepts the generator directly. Among other arguments, `flow_from_directory` takes the root directory of the dataset and the batch size.

In [1]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1/255)
train_generator = datagen.flow_from_directory('/tmp/dataset', batch_size=16)

# The generator is passed to the model's fit_generator method
# (deprecated since TensorFlow 2.1, where model.fit accepts generators directly)
# model.fit_generator(train_generator)

Found 5 images belonging to 2 classes.
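
Before handing a directory to a loader, it can help to check that the layout matches what is expected. The following is a minimal sketch using only the standard library; it is not how `ImageDataGenerator` is implemented. The function name `count_images_per_class` and the extension set `IMAGE_EXTENSIONS` are assumptions made for this illustration, and the placeholder files are empty rather than real images.

```python
import tempfile
from pathlib import Path

# Hypothetical extension set for this sketch; real loaders keep their own list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Return {class_name: image_count} for a root/class/image layout."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue  # files directly under root carry no label
        counts[class_dir.name] = sum(
            1
            for path in class_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


# Build a throwaway layout with empty placeholder files to try the function.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("class1/aaa.png", "class1/bbb.png", "class2/aaa.png"):
        target = Path(tmp, name)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    demo_counts = count_images_per_class(tmp)

print(f"Found {sum(demo_counts.values())} images "
      f"belonging to {len(demo_counts)} classes.")
```

A class folder with a count of zero, or a total that differs from what the loader reports, usually points to a misplaced folder or an unexpected file extension.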


## Solution PyTorch
The `torchvision` package provides the `ImageFolder` class, which loads images from disk and assigns labels to them. The class expects the following layout on the filesystem.
```
root/class1/aaa.png
root/class1/bbb.png
...
root/class2/aaa.png
root/class2/bbb.png
```
The root directory contains one subfolder per class, and each subfolder holds the images of that class. The supported file formats are .jpg, .jpeg, .png, .ppm, .bmp, .pgm, and .tif.


In [2]:
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
import torchvision.transforms as transforms

dataset = ImageFolder('/tmp/dataset', transform=transforms.ToTensor())
dataloader = DataLoader(dataset)

for images, labels in dataloader:
    # Training loop
    pass

The labels are integer tensors; each class is assigned an index according to the alphabetical order of its folder name. The `transform` argument specifies which transformations to apply to the images. In this example, it only converts each image into a tensor, but more complex transformation chains can be defined (e.g. with `transforms.Compose`).
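
The label assignment can be illustrated without loading any images. The sketch below assumes that class names are simply sorted and enumerated; the helper `build_class_index` and the folder names are hypothetical and only serve this example. On a real `ImageFolder` dataset, the corresponding mapping can be inspected through its `class_to_idx` attribute.

```python
import tempfile
from pathlib import Path


def build_class_index(root):
    """Map each class subfolder name to an integer, in sorted order."""
    class_names = sorted(p.name for p in Path(root).iterdir() if p.is_dir())
    return {name: index for index, name in enumerate(class_names)}


# Create empty class folders in a temporary root to try the helper.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("dogs", "cats", "birds"):
        Path(tmp, name).mkdir()
    demo_index = build_class_index(tmp)

print(demo_index)  # {'birds': 0, 'cats': 1, 'dogs': 2}
```

Note that Python's default string sort is case-sensitive, so in this sketch a folder named `Zebra` would be ordered before `ant`.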