# Dataset Creation
This notebook covers the creation of a simple annotated dataset for image classification. The dataset is built with PyTorch so that it is compatible with pre-trained torch models.

In [1]:
# Imports
import os
import torch
import torchvision
import numpy as np
import matplotlib.pyplot as plt
from torchvision import datasets, transforms


In [2]:
# Normalization parameters (computed from numerous images)
mean = np.array([0.5, 0.5, 0.5])
std = np.array([0.25, 0.25, 0.25])

# Define transforms for the dataset 
data_transforms = {
    'train': transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean,std)
    ]),
    'valid': transforms.Compose([
        transforms.ToTensor(), 
        transforms.Normalize(mean,std)
    ])
}
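The comment above says the normalization parameters were computed from numerous images, but the notebook does not show how. A minimal sketch of one way such per-channel statistics could be estimated is given below. The helper `channel_stats` and the random stand-in images are assumptions made for illustration, not part of the original notebook. The images are assumed to be float arrays in [0, 1], which is the scale `ToTensor` produces.

```python
import numpy as np

def channel_stats(images):
    """Per-channel mean and std over a list of H x W x 3 float arrays in [0, 1]."""
    # Flatten every image to (num_pixels, 3) and pool the pixels of all images,
    # so that images of different sizes can be combined.
    pixels = np.concatenate([img.reshape(-1, 3) for img in images], axis=0)
    return pixels.mean(axis=0), pixels.std(axis=0)

# Hypothetical stand-in data: two random "images" of different sizes.
rng = np.random.default_rng(0)
images = [rng.random((4, 6, 3)), rng.random((5, 3, 3))]

mean_est, std_est = channel_stats(images)
print(mean_est.shape, std_est.shape)
```

Pooling pixels weights large images more heavily than small ones. Averaging per-image statistics instead would be an equally reasonable choice.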

We will create a tiny custom dataset of images from four cartoons (Looney Tunes, The Simpsons, Tom & Jerry, Scooby-Doo). The images are stored in a local folder with one subfolder per class; the name of each subfolder serves as the label.

The images are loaded with the datasets.ImageFolder class. The resulting datasets are then read in batches by torch.utils.data.DataLoader, where the batch size is set.
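The folder-to-label convention can be sketched with the standard library alone. The helper `find_classes` below is a hypothetical stand-in that imitates how ImageFolder is understood to derive labels: subfolder names are sorted and each is assigned an integer index. The temporary directory is only there to make the example self-contained.

```python
import os
import tempfile

def find_classes(root):
    """Map sorted subfolder names to integer labels (stand-in for ImageFolder's behavior)."""
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx

# Build an empty folder structure with one subfolder per cartoon.
with tempfile.TemporaryDirectory() as root:
    for name in ["Tom & Jerry", "Looney Tunes", "The Simpsons", "Scooby Doo"]:
        os.makedirs(os.path.join(root, name))
    classes, class_to_idx = find_classes(root)

print(classes)       # ['Looney Tunes', 'Scooby Doo', 'The Simpsons', 'Tom & Jerry']
print(class_to_idx)
```

Because the order is alphabetical, adding or renaming a folder can shift the integer labels of the other classes.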

In [3]:
# Image folder object to load images from folders
train_datasets = datasets.ImageFolder(os.path.join('../../data/custom_dataset', 'train'), data_transforms['train'])
valid_dataset = datasets.ImageFolder(os.path.join('../../data/custom_dataset', 'valid'), data_transforms['valid'])
datasets_dict = {'train': train_datasets, 'valid': valid_dataset}

# Image dataloaders to load data in batches
train_loader = torch.utils.data.DataLoader(datasets_dict['train'], batch_size=4, shuffle=True)
valid_loader = torch.utils.data.DataLoader(datasets_dict['valid'], batch_size=4, shuffle=True)
dataloaders_dict = {'train': train_loader, 'valid': valid_loader}

# Get size of datasets
for key, dataset in datasets_dict.items():
    print(f"In the {key} dataset are {len(dataset)} images")

# Get classes
class_names = datasets_dict['train'].classes
print("These are the classes: " + str(class_names))

In the train dataset are 8 images
In the valid dataset are 4 images
These are the classes: ['Looney Tunes', 'Scooby Doo', 'The Simpsons', 'Tom & Jerry']


The following function displays an image together with its label.

In [4]:
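def imshow(inp, title):
    # Convert the tensor from (C, H, W) to (H, W, C), as matplotlib expects
    inp = inp.numpy().transpose((1,2,0))
    # Undo the normalization applied by transforms.Normalize
    inp = std * inp + mean
    # Clip to the valid display range [0, 1]
    inp = np.clip(inp,0,1)
    plt.imshow(inp)
    plt.title(title)
    plt.show()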
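The two array operations in imshow can be checked without any image data. The sketch below uses NumPy only and a random array as a stand-in for an image tensor (an assumption for illustration). It normalizes the array per channel with the formula (x - mean) / std, then applies the same transpose and de-normalization steps as imshow and confirms that the original values come back.

```python
import numpy as np

mean = np.array([0.5, 0.5, 0.5])
std = np.array([0.25, 0.25, 0.25])

# Hypothetical stand-in for an image tensor: channels first, values in [0, 1].
rng = np.random.default_rng(0)
original = rng.random((3, 4, 5))

# Per-channel normalization: (x - mean) / std, broadcast over height and width.
normalized = (original - mean[:, None, None]) / std[:, None, None]

# Same steps as imshow: channels last, undo the normalization, clip to [0, 1].
hwc = normalized.transpose((1, 2, 0))
restored = np.clip(std * hwc + mean, 0, 1)

print(hwc.shape)                                             # (4, 5, 3)
print(np.allclose(restored, original.transpose((1, 2, 0))))  # True
```

After the transpose the channel axis is last, so `std` and `mean` broadcast over it without any reshaping.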

In [5]:
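# Get a batch from the dataset
inputs, labels = next(iter(train_loader))

print(inputs[0])

# Select the device (GPU if available, otherwise CPU) and move the batch to it
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
inputs = inputs.to(device)
labels = labels.to(device)

# Make a grid (on the CPU, because imshow converts the tensor to NumPy)
out = torchvision.utils.make_grid(inputs.cpu(), nrow=1) # organizing the images in a grid structure

# Show the grid with the class name of each image as title
imshow(out, title=[class_names[label] for label in labels.tolist()])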

RuntimeError: stack expects each tensor to be equal size, but got [3, 360, 480] at entry 0 and [3, 720, 1280] at entry 1