<!--TITLE:Data Augmentation-->

# Introduction #

**TODO**

# Understanding Invariance #

An idea we touched on when looking at pooling was **invariance**. Broadly, to say that a machine learning model is invariant to some property means that it will treat two examples differing only by that property as being exactly the same. We said that pooling made a network invariant to small shifts in the position of a feature. The network simply ignores that kind of difference.
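As a small illustration of this kind of invariance, the sketch below applies non-overlapping max pooling to a one-dimensional feature map in NumPy. The helper `max_pool_1d` and the toy arrays are made up for this example and are not part of the lesson's code. A feature that moves by one position inside the same pooling window gives the same pooled output.

```python
import numpy as np

def max_pool_1d(x, size=2):
    """Non-overlapping max pooling; len(x) must be a multiple of `size`."""
    return x.reshape(-1, size).max(axis=1)

a = np.array([0, 0, 1, 0, 0, 0])  # a single detected feature
b = np.array([0, 0, 0, 1, 0, 0])  # same feature, shifted right by one
c = np.array([0, 0, 0, 0, 1, 0])  # same feature, shifted right by two

pooled_a = max_pool_1d(a)
pooled_b = max_pool_1d(b)
pooled_c = max_pool_1d(c)

print(np.array_equal(pooled_a, pooled_b))  # the small shift is ignored
print(np.array_equal(pooled_a, pooled_c))  # the larger shift is not
```

The second comparison shows the limit of the idea: once the feature crosses into another pooling window, the output changes, which is why the invariance only covers small shifts.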

To solve a classification problem means to find a model that is invariant to everything except the class: it should assign the same label to all images in the same class, however else those images differ.

Giving our model translation invariance is helpful because things in the same class tend to have the same features, even if those features are shifted around. If our classifier detects a beak and feathers, our picture is almost certainly of a bird and not a fish or a horse, regardless of where the beak and feathers happen to be.

# Learning Invariance #

So one way to improve a model is to build an invariance into the model itself, like we did with pooling. The other way is to *teach* the model the invariance. This is what you are doing when you show the model many different images with the same class label. It learns to treat those images the same.

<!--TODO: images of the same class-->
<figure>
<img src="" width=400 alt="Images of the same class.">
</figure>

A typical dataset, however, won't have enough examples to fully teach the model all the ways in which images from a class might be different. If you need to improve the accuracy of your model, the best thing is to collect more data. Another (much less expensive) way is to *modify* the data you already have.

The idea behind **data augmentation** is that you can teach your model an invariance by transforming your existing data in the same ways that images of a class might vary in unseen data. For instance, if you are classifying cars, your classifier should know that whether a car is facing left or right doesn't affect what class the car is in. If a car is a Honda in one direction, it's also a Honda in any other direction.

<!--TODO: cars in different directions -->
<figure>
<img src="" width=400 alt="Same class of car in different directions.">
</figure>

So, if you augment your dataset by flipping all of your images, you help your classifier to learn that this is a distinction it should ignore.

<!--TODO: data for free-->
<figure>
<img src="" width=400 alt="Augmented cars.">
</figure>
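A minimal NumPy sketch of this idea follows. It assumes images are stored as an array of shape `(batch, height, width, channels)`; the helper `augment_with_flips` and the random stand-in data are hypothetical and only serve to show the mechanics.

```python
import numpy as np

def augment_with_flips(images, labels):
    """Append a horizontally flipped copy of every image, keeping its label.

    `images` has shape (batch, height, width, channels).
    """
    flipped = images[:, :, ::-1, :]  # reverse the width axis
    return np.concatenate([images, flipped]), np.concatenate([labels, labels])

rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))  # stand-in for four small RGB images
labels = np.array([0, 1, 1, 0])

aug_images, aug_labels = augment_with_flips(images, labels)
print(aug_images.shape, aug_labels.shape)
```

Storing flipped copies like this doubles the size of the dataset. The Keras layers used in the example later in this lesson take a different route and apply random transformations on the fly during training.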

# Example - Training with Data Augmentation #

Keras lets you augment your data in two ways. The first way is to include it in the data pipeline with a class like [`ImageDataGenerator`](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator). The second way is to include it in the model definition by using Keras's **preprocessing layers**. This is the approach that we'll take. The primary advantage for us is that the image transformations will be computed on the GPU instead of the CPU, which can speed up training.

Let's see how we can use the data augmentations provided by Keras to improve the performance of the classifier from Lesson 1.

## Step 1 - Load Data ##

In [None]:
#$HIDE_INPUT$
from cv_prelude import *

## Step 2 - Define Model ##

We'll continue with the VGG16 model we've used throughout this course. In the head, we've increased the number of hidden units from 6 to 8. Since there is now more variation in the data, we can use a little extra capacity to help our model generalize.

In [None]:
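import tensorflow as tf
from tensorflow.keras import Sequential
import tensorflow.keras.layers as layers
# these are a new feature in TF 2.2
# (newer TF releases also expose these layers directly under tensorflow.keras.layers)
import tensorflow.keras.layers.experimental.preprocessing as preprocessing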


pretrained_base = tf.keras.models.load_model(
    '/kaggle/input/cv-course-models/cv-course-models/vgg16-pretrained-base',
)
pretrained_base.trainable = False

model = Sequential([
    # Preprocessing
    # randomly flip left to right
    preprocessing.RandomFlip(mode='horizontal'),
    # randomly rotate by up to 20% of a full turn (72 degrees) in either direction
    preprocessing.RandomRotation(factor=0.20),
    # randomly adjust the contrast by factors between 0.5 and 1.5
    preprocessing.RandomContrast(factor=0.5),

    # Base
    pretrained_base,

    # Head
    layers.Flatten(),
    layers.Dense(8, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
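Conceptually, a random preprocessing layer transforms each batch as it passes through the model during training and leaves it alone at inference time. The NumPy sketch below imitates that behavior for horizontal flips. `random_flip_horizontal` is a hypothetical helper written for illustration; it is not how Keras implements `RandomFlip`.

```python
import numpy as np

def random_flip_horizontal(images, rng, training=True):
    """Flip each image left-to-right with probability 0.5, but only while training."""
    if not training:
        return images  # at inference time the batch passes through unchanged
    flip = rng.random(len(images)) < 0.5  # one coin toss per image in this sketch
    flipped = images[:, :, ::-1, :]       # reverse the width axis
    return np.where(flip[:, None, None, None], flipped, images)

rng = np.random.default_rng(0)
batch = rng.random((16, 4, 4, 3))  # stand-in for a batch of 16 tiny RGB images

train_out = random_flip_horizontal(batch, rng, training=True)
infer_out = random_flip_horizontal(batch, rng, training=False)
```

Because the transformation is redrawn for every batch, the model rarely sees exactly the same version of an image twice, while validation and prediction still run on the unmodified images.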

## Step 3 - Train and Evaluate ##

In [None]:
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['binary_accuracy'],
)

history = model.fit(
    ds_train,
    validation_data=ds_valid,
    epochs=15,
)

In [None]:
import pandas as pd

history_frame = pd.DataFrame(history.history)

history_frame.loc[:, ['loss', 'val_loss']].plot()
history_frame.loc[:, ['binary_accuracy', 'val_binary_accuracy']].plot();
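The same curves can also be read numerically. The sketch below uses made-up numbers in place of a real `history.history` (your own run will produce different values) to find the epoch with the lowest validation loss and to track the gap between validation and training loss.

```python
import pandas as pd

# Hypothetical values standing in for `history.history`; a real run will differ.
example_history = {
    'loss': [0.70, 0.55, 0.48, 0.44, 0.41],
    'val_loss': [0.66, 0.52, 0.47, 0.49, 0.50],
}
frame = pd.DataFrame(example_history)

# Zero-based index of the epoch with the lowest validation loss
best_epoch = int(frame['val_loss'].idxmin())

# A gap that keeps growing can be a hint that the model is starting to overfit
gap = frame['val_loss'] - frame['loss']

print(best_epoch)
print(gap.round(2).tolist())
```

If augmentation is helping, one would hope to see this gap grow more slowly than it did without augmentation, though that is something to check against your own training runs rather than assume.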

# Conclusion #

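In this lesson, we used Keras preprocessing layers to randomly flip, rotate, and adjust the contrast of our training images. The `albumentations` library is another source of augmentations and has been popular with Kagglers in competitions.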