# Data Augmentation

Data Augmentation is a technique to artificially increase the size of the training set by applying transformations to the original images. This is a very useful technique to improve the performance of deep learning models, especially when the training set is not large enough to train a large model. For example, in the case of image classification, we can apply transformations such as rotation, translation, scaling, flipping, etc. to the original images to generate new images. In this notebook, we will see how to apply data augmentation to the CIFAR-10 dataset.

But it is important to understand that augmentation is not done to make the dataset larger, but to make the model more robust. So essentially the next batch to be processed in augmented on the CPU while the current batch is being processed on the GPU. This is done to make the model more robust and generalize better. We should introduce randomness in the augmentation process and not make it deterministic.

You have to be careful that the data augmentation doesnt destroy the label of the image.

Many a times data augmentation makes it really diffcult for the model to learn, due to the introduction of a lot of variety in data. So in addition to L2, Dropout data augmentation is also a good way to regularize the model, to prevent overfitting.

## 1. Imports and Configuration

In [1]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import tensorflow_datasets as tfds

# Configure GPU memory growth to be dynamic instead of allocating all memory at once
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)

  from .autonotebook import tqdm as notebook_tqdm


## 2. Data Loading and Preprocessing

In [2]:
(ds_train, ds_test), ds_info = tfds.load(
    "cifar10",
    split=["train", "test"],
    shuffle_files=True,
    as_supervised=True,  # will return tuple (img, label) otherwise dict
    with_info=True,  # able to get info about dataset
)

In [3]:
def normalize_img(image, label):
    """Normalizes images"""
    return tf.cast(image, tf.float32) / 255.0, label


AUTOTUNE = tf.data.experimental.AUTOTUNE
BATCH_SIZE = 32

"""
augment() function is applied to (image, label) pairs. 

Augmentation is done because it helps the model to learn better. 
For example, if you want to classify a dog, you want your model to be robust to 
different angles, positions, lighting conditions etc. So, you can artificially
introduce those variations in your dataset by applying random transformations to your images,
like flipping, rotating, changing brightness, etc. This helps the model to get trained on
different variations of the same image and thus be robust to them.

In the below code, we are doing the following augmentation:
1. Resize the images to a bigger height and width.
2. Convert some images to grayscale randomly.
3. Introduce random brightness and contrast to the images.
4. Flip the images horizontally randomly.

"""
def augment(image, label):
    # If we want to resize an image to a specific size, we can use tf.image.resize().
    new_height = new_width = 32
    image = tf.image.resize(image, (new_height, new_width))

    # If we want to convert the image to grayscale, we can use tf.image.rgb_to_grayscale()
    # 10% chance to convert image to grayscale. So that our model 
    # can do prediction on grayscale images as well along with color images.
    if tf.random.uniform((), minval=0, maxval=1) < 0.1:
        # Copy the grayscale channel 3 times so we can concat it back to RGB channels
        # so that we can have 3 channels for grayscale image as well to pass it to our model.
        image = tf.tile(tf.image.rgb_to_grayscale(image), [1, 1, 3])
    

    # Random brightness and contrast
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, lower=0.1, upper=0.2)

    # a left upside down flipped is still a dog ;)
    image = tf.image.random_flip_left_right(image)  # 50%
    # image = tf.image.random_flip_up_down(image) #%50%

    return image, label


# Setup for train dataset
ds_train = ds_train.map(normalize_img, num_parallel_calls=AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits["train"].num_examples)
# Augment images in parallel using our custom augment function defined above.
ds_train = ds_train.map(augment, num_parallel_calls=AUTOTUNE)
ds_train = ds_train.batch(BATCH_SIZE)
ds_train = ds_train.prefetch(AUTOTUNE)

# Setup for test Dataset
ds_test = ds_train.map(normalize_img, num_parallel_calls=AUTOTUNE)
ds_test = ds_train.batch(BATCH_SIZE)
ds_test = ds_train.prefetch(AUTOTUNE)


# TF >= 2.3.0
# We can add data augmentation as part of our model. Not sure it is done while
# training the above custom method is done in parallel while the model is training.
# Doing it this way, we don't have to do the above custom method but we might loose
# some performance.

# data_augmentation = keras.Sequential(
#     [
#         layers.experimental.preprocessing.Resizing(height=32, width=32,),
#         layers.experimental.preprocessing.RandomFlip(mode="horizontal"),
#         layers.experimental.preprocessing.RandomContrast(factor=0.1,),
#     ]
# )

## 3. Model Definition

In [4]:
model = keras.Sequential(
    [
        keras.Input((32, 32, 3)),
        # data_augmentation,
        layers.Conv2D(4, 3, padding="same", activation="relu"),
        layers.Conv2D(8, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(16, 3, padding="same", activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(10),
    ]
)

## 4. Compile Model

In [5]:
model.compile(
    optimizer=keras.optimizers.Adam(3e-4),
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

## 5. Model Training and Evaluation

In [6]:
model.fit(ds_train, epochs=5, verbose=2)
model.evaluate(ds_test)

Epoch 1/5
1563/1563 - 30s - loss: 2.0951 - accuracy: 0.2179
Epoch 2/5
1563/1563 - 4s - loss: 1.9100 - accuracy: 0.3035
Epoch 3/5
1563/1563 - 3s - loss: 1.7779 - accuracy: 0.3533
Epoch 4/5
1563/1563 - 4s - loss: 1.6882 - accuracy: 0.3907
Epoch 5/5
1563/1563 - 4s - loss: 1.6472 - accuracy: 0.4094


[1.6219048500061035, 0.4206799864768982]