# 6. A Quick Word on Data Augmentation

### About this notebook

This notebook was used in the 50.039 Deep Learning course at the Singapore University of Technology and Design.

**Author:** Matthieu DE MARI (matthieu_demari@sutd.edu.sg)

**Version:** 1.0 (10/02/2023)

**Requirements:**
- Python 3 (tested on v3.9.6)
- Torch (tested on v1.12.1)
- Torchvision (tested on v0.13.1)

### Imports and CUDA

In [1]:
# Torch
import torch
import torchvision
from torch.utils.data import Dataset
from torchvision import datasets
import torch.optim as optim
from torchvision.transforms import ToTensor, Compose, Normalize, RandomHorizontalFlip
from torchvision.datasets import MNIST
import torch.nn.functional as F
import torch.nn as nn

In [2]:
# Use GPU if available, else use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


### MNIST Dataset

This is the same original MNIST dataset as before, with no data augmentation.

In [3]:
# Define transform to convert images to tensors and normalize them
transform_data = Compose([ToTensor(),
                          Normalize((0.1307,), (0.3081,))])

# Load the data
batch_size = 256
train_dataset = MNIST(root='./mnist/', train = True, download = True, transform = transform_data)
test_dataset = MNIST(root='./mnist/', train = False, download = True, transform = transform_data)
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size = batch_size, shuffle = True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size = batch_size, shuffle = False)

### Using Data Augmentation on the MNIST Dataset

Data augmentation is a technique used to artificially increase the size of the training dataset by applying various transformations to the existing data. This helps to expose the model to more variations of the data, making it more robust and less prone to overfitting.

By applying data augmentation, the model can learn to recognize patterns and features that are supposed to be invariant to certain changes in the data, such as translations, rotations, and scaling. This allows the model to generalize better to new data, which is particularly useful when the amount of available training data is limited.

Data augmentation can also be used to balance the distribution of classes in the dataset by artificially increasing the number of examples of the under-represented classes. This can help to improve the performance of the model on these classes.
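
One possible way to do this in PyTorch, shown here as a sketch rather than the method used in this course, is to oversample the minority classes with a WeightedRandomSampler. On its own, oversampling only repeats the same images; combined with random transforms in the dataset, each repeated draw becomes a different variant. The toy data below (`features`, `labels`) is synthetic and hypothetical:

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler

torch.manual_seed(0)

# Hypothetical imbalanced toy data: 90 samples of class 0 and 10 samples of class 1
features = torch.randn(100, 1, 28, 28)
labels = torch.cat([torch.zeros(90, dtype=torch.long),
                    torch.ones(10, dtype=torch.long)])
toy_dataset = TensorDataset(features, labels)

# Weight each sample by the inverse frequency of its class
class_counts = torch.bincount(labels)
sample_weights = 1.0 / class_counts[labels].float()

# Draw with replacement, so that minority samples are picked more often
sampler = WeightedRandomSampler(sample_weights, num_samples=1000, replacement=True)
balanced_loader = DataLoader(toy_dataset, batch_size=100, sampler=sampler)

drawn_labels = torch.cat([batch_labels for _, batch_labels in balanced_loader])
minority_fraction = (drawn_labels == 1).float().mean().item()
print(minority_fraction)  # close to 0.5, instead of 0.1 in the original data
```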

In summary, data augmentation is a powerful technique that can help to improve the generalization ability of a model by increasing the diversity of the training data.

Some possible ways to perform data augmentation on this dataset include:
- Randomly translating images by a certain number of pixels,
- Randomly rotating the images by a small angle,
- Randomly scaling the images by a small factor,
- Randomly cropping the images,
- Randomly flipping the images horizontally,
- Adding random noise to the images,
- Randomly changing the brightness or contrast of the images,
- Etc.

These can be accomplished with pre-built transforms from the torchvision.transforms module, such as RandomRotation, RandomAffine, RandomCrop, RandomHorizontalFlip, or ColorJitter, or with other libraries such as imgaug or albumentations. Feel free to have a look at: https://pytorch.org/vision/stable/transforms.html

You can also chain multiple transforms together using the Compose class from torchvision.transforms, which applies them in sequence so that several data augmentations are performed at once.

For instance, to flip images horizontally, we simply add RandomHorizontalFlip() to the list of transforms, as in the code below. By default, each image is flipped with probability 0.5.

**Quick question:** Does it make sense to perform this data augmentation on all pictures? Can all digits be flipped horizontally to generate new valid images?

In [4]:
# Training transform: random horizontal flip, then convert to tensor and normalize
transform_aug = Compose([RandomHorizontalFlip(), ToTensor(),
                         Normalize((0.1307,), (0.3081,))])
# Test transform: no augmentation, so that evaluation uses the original images
transform_test = Compose([ToTensor(),
                          Normalize((0.1307,), (0.3081,))])

# Load the data
batch_size = 256
train_dataset_aug = MNIST(root = './mnist/', train = True, download = True, transform = transform_aug)
test_dataset_aug = MNIST(root = './mnist/', train = False, download = True, transform = transform_test)
train_loader_aug = torch.utils.data.DataLoader(train_dataset_aug, batch_size = batch_size, shuffle = True)
test_loader_aug = torch.utils.data.DataLoader(test_dataset_aug, batch_size = batch_size, shuffle = False)

### What's next?

In the next notebook, we will discuss some important pre-trained models that were state-of-the-art in the past, along with their underlying techniques.