<a href="https://colab.research.google.com/github/nyp-sit/sdaai-pdc2-students/blob/master/iti107/session-3/data_augmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" align="left"/></a>

# Image Data Augmentation (Optional Exercise)

Welcome to this week's programming exercise. You will learn to use the Keras ImageDataGenerator to apply different transformations to the image data and observe the effects of the transformations. 

In [None]:
from __future__ import print_function

import os
import numpy as np

from utils import prepare_data

from tensorflow.keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

import matplotlib
import matplotlib.pyplot as plt

%matplotlib inline

## Get the data

The `prepare_data()` function below downloads a dataset (in zip format) of pictures that depict Positive and Negative emotions. It automatically unzips the archive and copies the image files into the 'Negative' and 'Positive' subfolders of the folder specified by the `data_path` variable below.

In [None]:
data_path = "data"
valid_size = 0.2
FORCED_DATA_REWRITE = True

In [None]:
train_path, valid_path = prepare_data(data_path=data_path, 
                                      valid_size=valid_size, 
                                      FORCED_DATA_REWRITE=FORCED_DATA_REWRITE)

In [None]:
train_neg_path = os.path.join(train_path, "Negative")
train_pos_path = os.path.join(train_path, "Positive")
valid_neg_path = os.path.join(valid_path, "Negative")
valid_pos_path = os.path.join(valid_path, "Positive")

In [None]:
n_examples = 5

In [None]:
np.random.seed(42)
positive_examples = np.random.choice(os.listdir(train_pos_path), size=n_examples, replace=False)
negative_examples = np.random.choice(os.listdir(train_neg_path), size=n_examples, replace=False)

In [None]:
plt.figure(figsize=(5, n_examples * 2))
for i in range(n_examples):
    plt.subplot(n_examples, 2, i * 2 + 1)
    img = load_img(os.path.join(train_pos_path, positive_examples[i]))
    plt.imshow(img)
    plt.axis("off")
    if i == 0:
        plt.title("Positive", fontsize=18)
    plt.subplot(n_examples, 2, i * 2 + 2)
    img = load_img(os.path.join(train_neg_path, negative_examples[i]))
    plt.imshow(img)
    plt.axis("off")
    if i == 0:
        plt.title("Negative", fontsize=18)

In [None]:
img_height, img_width = 400, 500

In [None]:
def compare_images(img1, img2):
    # Convert NumPy arrays to PIL images so both inputs are displayed the same way
    if isinstance(img1, np.ndarray):
        img1 = array_to_img(img1)
    if isinstance(img2, np.ndarray):
        img2 = array_to_img(img2)
    plt.figure(figsize=(14, 6))
    plt.subplot(121)
    plt.imshow(img1)
    plt.axis("off")
    plt.title("Original", fontsize=18)
    plt.subplot(122)
    plt.imshow(img2)
    plt.axis("off")
    plt.title("Transformed", fontsize=18)

### Rescaling 

Images are usually stored in RGB (Red Green Blue) format, in which the image is represented as a three-dimensional (three-channel) array. 
One dimension holds the channels (red, green, and blue) and the other two are the spatial dimensions. Thus, every pixel is encoded by three numbers, each usually stored as an 8-bit unsigned integer (0 to 255).

Rescaling is an operation that moves your data from one numerical range to another by multiplying by a predefined constant (here 1/255, which is the same as dividing by 255). In deep neural networks you usually want to restrict your input to a small range such as 0 to 1, because large input values can cause overflow and make optimization less stable.
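
As a minimal sketch of the arithmetic, the snippet below rescales a small, made-up array of 8-bit pixel values with plain NumPy. The `pixels` array is a hypothetical example, not part of the dataset, and the snippet only mirrors the multiplication by `1/255`, not the generator itself.

```python
import numpy as np

# A hypothetical 2x2 RGB image stored as 8-bit unsigned integers (0 to 255).
pixels = np.array([[[0, 128, 255], [64, 64, 64]],
                   [[255, 0, 0], [10, 20, 30]]], dtype=np.uint8)

# Convert to float first so the result is not truncated to integers,
# then multiply by the same constant that is passed as `rescale`.
rescaled = pixels.astype(np.float32) * (1.0 / 255.0)

print(rescaled.min(), rescaled.max())
```

The shape of the array is unchanged; only the value range moves from 0..255 to 0..1.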

In [None]:
datagen_rescaled = ImageDataGenerator(rescale=1. / 255.)
datagen_default = ImageDataGenerator()

In [None]:
# class_mode=None makes the generator yield only image batches, without target labels

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1, 
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)
gen_rescaled = datagen_rescaled.flow_from_directory(train_path, 
                                                    target_size=(img_height, img_width), 
                                                    batch_size=1, 
                                                    classes=['Positive'],
                                                    shuffle=False, 
                                                    class_mode=None)

In [None]:
np.random.seed(1)
sample_default = next(gen_default)
#print(sample_default)
sample_rescaled = next(gen_rescaled)
compare_images(sample_default[0], sample_rescaled[0])

Visually, both images are identical, but that is only because Python image tools rescale pixel values for display. If you look at the raw data, which are arrays, you can see that they differ by exactly a factor of 255.

In [None]:
sample_default[0][:2, :2, 0]   # examine only the top-left 2x2 pixel values of the first channel

In [None]:
sample_rescaled[0][:2, :2, 0]

### Rotation

This transformation rotates the image in a certain direction (clockwise or counterclockwise).

The parameter that enables rotation is called `rotation_range`. It specifies the range, in degrees, from which a random rotation angle is drawn uniformly. Note that the size of the image remains the same during the rotation. Thus, some regions of the original image are cropped out, and some regions of the new image need to be filled.
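
To make the cropping and filling concrete, here is a rough NumPy sketch of a rotation that keeps the output size fixed. The helper `rotate_nearest` is hypothetical and uses simple nearest-neighbour sampling; it is not how Keras implements the transformation. The sketch also assumes the angle is drawn uniformly from `[-rotation_range, rotation_range]`.

```python
import numpy as np

def rotate_nearest(image, angle_deg, fill_value=0):
    """Rotate a 2-D array about its centre, keeping the output shape."""
    h, w = image.shape
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    # Inverse mapping: for every output pixel, find the source pixel.
    src_x = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    src_y = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    src_x = np.rint(src_x).astype(int)
    src_y = np.rint(src_y).astype(int)
    # Output pixels whose source falls outside the image must be filled.
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.full_like(image, fill_value)
    out[inside] = image[src_y[inside], src_x[inside]]
    return out

rotation_range = 45
rng = np.random.default_rng(21)
angle = rng.uniform(-rotation_range, rotation_range)  # assumed sampling rule

image = np.ones((5, 5), dtype=int)
rotated = rotate_nearest(image, angle)
print(angle)
print(rotated)
```

Entries that keep the fill value mark the regions of the new image that have no counterpart in the original.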

In [None]:
datagen_rotated = ImageDataGenerator(rotation_range=45, fill_mode="constant")
datagen_default = ImageDataGenerator()

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1, 
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)
gen_rotated = datagen_rotated.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1, 
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)

In [None]:
np.random.seed(21)
sample_default = next(gen_default)
sample_rotated = next(gen_rotated)
compare_images(sample_default[0], sample_rotated[0])

### Horizontal shift

This transformation shifts the image in a certain direction along the horizontal axis (left or right). The maximum size of the shift is set with the `width_shift_range` parameter and is measured as a fraction of the total width.
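
As a small illustration, a fraction of 0.4 on an image 500 pixels wide allows shifts of up to 200 pixels. The sketch below uses a hypothetical helper, `shift_horizontal`, that shifts whole pixels and pads with a constant, similar in spirit to `fill_mode="constant"`; it is not the Keras implementation.

```python
import numpy as np

def shift_horizontal(image, shift_px, fill_value=0):
    """Shift columns right (positive) or left (negative), padding with a constant."""
    out = np.full_like(image, fill_value)
    w = image.shape[1]
    if abs(shift_px) >= w:
        return out  # the whole image has moved out of the frame
    if shift_px > 0:
        out[:, shift_px:] = image[:, :w - shift_px]
    elif shift_px < 0:
        out[:, :w + shift_px] = image[:, -shift_px:]
    else:
        out[...] = image
    return out

img_width = 500
width_shift_range = 0.4
max_shift_px = int(width_shift_range * img_width)  # largest shift in pixels

row = np.arange(1, 6).reshape(1, 5)
shifted_right = shift_horizontal(row, 2)
shifted_left = shift_horizontal(row, -2)
print(max_shift_px, shifted_right, shifted_left)
```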

In [None]:
datagen_hshifted = ImageDataGenerator(width_shift_range=0.4, fill_mode="constant")
datagen_default = ImageDataGenerator()

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1, 
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)
gen_hshifted = datagen_hshifted.flow_from_directory(train_path, 
                                                    target_size=(img_height, img_width), 
                                                    batch_size=1, 
                                                    classes=['Positive'],
                                                    shuffle=False, 
                                                    class_mode=None)

In [None]:
np.random.seed(21)
sample_default = next(gen_default)
sample_hshifted = next(gen_hshifted)
compare_images(sample_default[0], sample_hshifted[0])

### Vertical shift

It shifts the image along the vertical axis (up or down). The parameter that controls the range of the shift is called `height_shift_range`, and it is measured as a fraction of the total height.
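
The vertical-shift generator in this section leaves `fill_mode` at its default, which in Keras is `"nearest"`, so the emptied rows are filled by repeating the closest edge pixels. The sketch below imitates that idea with `np.pad(..., mode="edge")`; the helper `shift_vertical_nearest` is hypothetical and only approximates the behaviour.

```python
import numpy as np

def shift_vertical_nearest(image, shift_px):
    """Shift rows down (positive) or up (negative), repeating the edge row."""
    h = image.shape[0]
    if shift_px == 0:
        return image.copy()
    if shift_px > 0:
        # Pad on top with copies of the first row, then keep the first h rows.
        padded = np.pad(image, ((shift_px, 0), (0, 0)), mode="edge")
        return padded[:h]
    # Pad at the bottom with copies of the last row, then keep the last h rows.
    padded = np.pad(image, ((0, -shift_px), (0, 0)), mode="edge")
    return padded[-h:]

column = np.array([[1], [2], [3], [4]])
shifted_down = shift_vertical_nearest(column, 2)
shifted_up = shift_vertical_nearest(column, -2)
print(shifted_down.ravel(), shifted_up.ravel())
```

Repeated edge rows are what produce the streaked look at the border of a shifted photo.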


In [None]:
datagen_vshifted = ImageDataGenerator(height_shift_range=0.5)
datagen_default = ImageDataGenerator()

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1,
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)
gen_vshifted = datagen_vshifted.flow_from_directory(train_path, 
                                                    target_size=(img_height, img_width), 
                                                    batch_size=1,
                                                    classes=['Positive'],
                                                    shuffle=False, 
                                                    class_mode=None)

In [None]:
np.random.seed(21)
sample_default = next(gen_default)
sample_vshifted = next(gen_vshifted)
compare_images(sample_default[0], sample_vshifted[0])

### Shearing

Shear mapping, or shearing, displaces each point in the vertical direction by an amount proportional to its distance from an edge of the image. Note that in general the direction does not have to be vertical and can be arbitrary. The parameter that controls the displacement rate is called `shear_range` and corresponds to the deviation angle (in degrees) between a horizontal line in the original picture and the image (in the mathematical sense) of this line in the transformed image.
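
The geometry can be checked with a few points. The sketch below applies a textbook vertical shear, `y' = y + tan(angle) * x`, to a horizontal line and measures the angle of the result. The angle is given in degrees here, which is the unit the Keras documentation states for `shear_range`. The helper `shear_vertically` is hypothetical, and the exact matrix Keras uses may differ.

```python
import numpy as np

def shear_vertically(points, angle_deg):
    """Apply y' = y + tan(angle) * x to an (N, 2) array of (x, y) points."""
    k = np.tan(np.deg2rad(angle_deg))
    matrix = np.array([[1.0, 0.0],
                       [k, 1.0]])
    return points @ matrix.T

# Three points on a horizontal line, y = 0.
line = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
sheared = shear_vertically(line, 30.0)

# Angle between the original horizontal line and its sheared image.
dx = sheared[1, 0] - sheared[0, 0]
dy = sheared[1, 1] - sheared[0, 1]
deviation_deg = np.degrees(np.arctan2(dy, dx))
print(sheared, deviation_deg)
```

The x coordinates are unchanged, and the vertical displacement grows in proportion to x.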

In [None]:
datagen_sheared = ImageDataGenerator(shear_range=30.0)
datagen_default = ImageDataGenerator()

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                   target_size=(img_height, img_width), 
                                   batch_size=1,
                                   classes=['Positive'],
                                   shuffle=False, 
                                   class_mode=None)
gen_sheared = datagen_sheared.flow_from_directory(train_path,
                                   target_size=(img_height, img_width), 
                                   batch_size=1,
                                   classes=['Positive'],
                                   shuffle=False, 
                                   class_mode=None)

In [None]:
np.random.seed(21)
sample_default = next(gen_default)
sample_sheared = next(gen_sheared)
compare_images(sample_default[0], sample_sheared[0])

### Zoom

This transformation zooms the initial image in or out. The `zoom_range` parameter controls the zooming factor. It is either a float or \[lower, upper\]. If it is a float, then \[lower, upper\] = \[1-zoom_range, 1+zoom_range\].
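
The rule for turning `zoom_range` into an interval can be written down directly. In this sketch, `zoom_bounds` is a hypothetical helper, and the uniform draw of zoom factors is an assumption made for illustration.

```python
import numpy as np

def zoom_bounds(zoom_range):
    """Return (lower, upper) for a float or a [lower, upper] pair."""
    if np.isscalar(zoom_range):
        return (1.0 - zoom_range, 1.0 + zoom_range)
    lower, upper = zoom_range
    return (float(lower), float(upper))

lower, upper = zoom_bounds(0.5)           # the value used in this section
explicit = zoom_bounds([0.8, 1.2])        # an explicit [lower, upper] pair

rng = np.random.default_rng(21)
factors = rng.uniform(lower, upper, size=5)  # assumed uniform sampling
print((lower, upper), explicit, factors)
```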

In [None]:
datagen_zoomed = ImageDataGenerator(zoom_range=0.5, fill_mode='constant')
datagen_default = ImageDataGenerator()

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1, 
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)
gen_zoomed = datagen_zoomed.flow_from_directory(train_path, 
                                                target_size=(img_height, img_width), 
                                                batch_size=1, 
                                                classes=['Positive'],
                                                shuffle=False, 
                                                class_mode=None)

In [None]:
np.random.seed(21)
sample_default = next(gen_default)
sample_zoomed = next(gen_zoomed)
compare_images(sample_default[0], sample_zoomed[0])

### Horizontal flip

It flips the image about the vertical axis, that is, it mirrors left and right. The Boolean `horizontal_flip` parameter turns it on or off. When it is on, each image is flipped at random rather than every time.
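
On an array, a horizontal flip is simply a reversal of the column order. The sketch below shows this with NumPy, together with a hypothetical helper, `random_horizontal_flip`, that applies the flip with a given probability. The probability of 0.5 is an assumption made for illustration.

```python
import numpy as np

def random_horizontal_flip(image, rng, p=0.5):
    """Mirror the columns of an (H, W, C) array with probability p."""
    if rng.random() < p:
        return image[:, ::-1, :]
    return image

# A tiny hypothetical image: 2 rows, 2 columns, 3 channels.
image = np.arange(12).reshape(2, 2, 3)
flipped = image[:, ::-1, :]  # left and right columns swap places

rng = np.random.default_rng(21)
maybe_flipped = random_horizontal_flip(image, rng)
print(flipped[0, 0], image[0, 1])
```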

In [None]:
datagen_hflipped = ImageDataGenerator(horizontal_flip=True)
datagen_default = ImageDataGenerator()

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1, 
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)
gen_hflipped = datagen_hflipped.flow_from_directory(train_path, 
                                                    target_size=(img_height, img_width),
                                                    classes=['Positive'],
                                                    batch_size=1, 
                                                    shuffle=False, 
                                                    class_mode=None)

In [None]:
np.random.seed(21)
sample_default = next(gen_default)
sample_hflipped = next(gen_hflipped)
compare_images(sample_default[0], sample_hflipped[0])

### Vertical flip

It flips the image about the horizontal axis, that is, it mirrors top and bottom. The Boolean `vertical_flip` parameter controls the presence of this transformation.
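
In array terms, a vertical flip reverses the row order. The short NumPy sketch below, on a made-up array, also checks two properties that are handy when reasoning about augmentations: flipping twice restores the image, and a vertical flip followed by a horizontal flip equals a 180-degree rotation.

```python
import numpy as np

# A tiny hypothetical single-channel image: 3 rows, 2 columns.
image = np.arange(6).reshape(3, 2)

flipped = np.flipud(image)               # same as image[::-1]
restored = np.flipud(flipped)            # flipping twice undoes the flip
both = np.fliplr(np.flipud(image))       # vertical then horizontal flip
half_turn = np.rot90(image, 2)           # rotation by 180 degrees

print(flipped)
print(np.array_equal(both, half_turn))
```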

In [None]:
datagen_vflipped = ImageDataGenerator(vertical_flip=True)
datagen_default = ImageDataGenerator()

In [None]:
gen_default = datagen_default.flow_from_directory(train_path, 
                                                  target_size=(img_height, img_width), 
                                                  batch_size=1, 
                                                  classes=['Positive'],
                                                  shuffle=False, 
                                                  class_mode=None)
gen_vflipped = datagen_vflipped.flow_from_directory(train_path, 
                                                    target_size=(img_height, img_width), 
                                                    batch_size=1,
                                                    classes=['Positive'],
                                                    shuffle=False, 
                                                    class_mode=None)

In [None]:
np.random.seed(21)
sample_default = next(gen_default)
sample_vflipped = next(gen_vflipped)
compare_images(sample_default[0], sample_vflipped[0])

## Combination

Let’s try to apply all the described augmentation transformations simultaneously and see what happens. Recall that the parameters of each of the transformations are chosen randomly from the specified range; thus, we should have a considerably diverse set of samples.
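
The random draw can be sketched as follows. The ranges are the ones used for the combined generator in this section, and `sample_transform` is a hypothetical helper: the uniform sampling and the 50% flip probability are assumptions made for illustration, not a copy of the Keras internals.

```python
import numpy as np

# Ranges taken from the combined ImageDataGenerator in this section.
RANGES = {
    "rotation_range": 45,
    "width_shift_range": 0.2,
    "height_shift_range": 0.2,
    "shear_range": 0.2,
    "zoom_range": 0.3,
}

def sample_transform(rng, img_height=400, img_width=500):
    """Draw one hypothetical set of transformation parameters."""
    r = RANGES
    return {
        "rotation_deg": rng.uniform(-r["rotation_range"], r["rotation_range"]),
        "shift_x_px": rng.uniform(-r["width_shift_range"], r["width_shift_range"]) * img_width,
        "shift_y_px": rng.uniform(-r["height_shift_range"], r["height_shift_range"]) * img_height,
        "shear_deg": rng.uniform(-r["shear_range"], r["shear_range"]),
        "zoom": rng.uniform(1 - r["zoom_range"], 1 + r["zoom_range"]),
        "flip_horizontal": bool(rng.random() < 0.5),
        "flip_vertical": bool(rng.random() < 0.5),
    }

rng = np.random.default_rng(21)
samples = [sample_transform(rng) for _ in range(3)]
for params in samples:
    print({k: round(v, 2) if isinstance(v, float) else v for k, v in params.items()})
```

Each call yields a different combination of parameters, which is why repeated passes over the same image produce different augmented versions.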

Let’s initialize our ImageDataGenerator with all of the geometric transformations described above turned on and test it on an image of a red hydrant.

In [None]:
datagen = ImageDataGenerator(rotation_range=45, 
                             width_shift_range=0.2, 
                             height_shift_range=0.2, 
                             shear_range=0.2, 
                             zoom_range=0.3, 
                             horizontal_flip=True, 
                             vertical_flip=True, 
                             fill_mode="nearest")

In [None]:
# The train/validation split is random, so the image may be in either folder
try:
    img = load_img(os.path.join(train_pos_path, "Firehydrant2.jpg"))
except FileNotFoundError:
    img = load_img(os.path.join(valid_pos_path, "Firehydrant2.jpg"))

In [None]:
img

In [None]:
img = img_to_array(img)
img = img.reshape((1,) + img.shape)

In [None]:
n_augmentations = 8

In [None]:
import shutil

# Start from an empty preview folder so old augmented images do not accumulate
save_dir = os.path.join(data_path, "augmentation_preview")
if os.path.exists(save_dir):
    shutil.rmtree(save_dir)
os.mkdir(save_dir)

In [None]:
plt.figure(figsize=(15, 6))    
i = 0

for batch in datagen.flow(img, 
                          batch_size=1, 
                          seed=21, 
                          save_to_dir=save_dir, 
                          save_prefix="hydrant", 
                          save_format="jpeg"):
    
    plt.subplot(2, int(np.ceil(n_augmentations * 1. / 2)), i + 1)
    plt.imshow(array_to_img(batch[0]))
    plt.axis("off")
    
    i += 1
    if i >= n_augmentations:
        break