# ECBM E4040 - Assignment 2 - Task 4: Data Augmentation

One important factor in neural network training is the size of the training set. Since it is often not possible to get a clean and large enough dataset for training, one way to improve the network's robustness and generalization ability is to create 'fake' data by injecting random noise into the available data or applying random transformations to it. This technique is called __data augmentation__, and it has been shown to be very effective.

One thing to remember when you augment your data is that a transformation must never change the correct label of a sample. For example, in a dataset of hand-written letters, flipping a letter 'b' produces something that looks like a 'd' while the sample still carries the label 'b', so flipping is a poor choice there. Choose augmentation methods that suit your dataset.

In [1]:
%matplotlib inline
%load_ext autoreload
%autoreload 2

# Import modules
from __future__ import print_function
import numpy as np
from ecbm4040.cifar_utils import load_data
import matplotlib.pyplot as plt

## Load Data

In [2]:
X_train, y_train, X_test, y_test = load_data()

num_train = 49000 
num_val = 1000
num_test = 10000
num_dev = 128

# The development set is used for practicing augmentation.
mask = np.random.choice(num_train, num_dev, replace=False)
X_dev = X_train[mask]
y_dev = y_train[mask]

# Separate the training data into a training set and a validation set
X_val = X_train[num_train:]
y_val = y_train[num_train:]
X_train = X_train[:num_train]
y_train = y_train[:num_train]

# Preprocessing: subtract the per-pixel mean image of the training set from the training and validation data
mean_image = np.mean(X_train, axis=0).astype(np.float32)
X_train = X_train.astype(np.float32) - mean_image
X_val = X_val.astype(np.float32) - mean_image

#X_val = X_val.reshape([-1,32,32,3])/255
#X_train = X_train.reshape([-1,32,32,3])/255


print(X_train.shape, X_val.shape, X_test.shape, X_dev.shape)

./data/cifar-10-python.tar.gz already exists. Begin extracting...
(49000, 3072) (1000, 3072) (10000, 3072) (128, 3072)


## Part 1: Visualization

### Visualize some original images

<span style="color:red">__TODO:__</span> Use Pyplot to draw any 16 samples from the __development set__ in a 4-by-4 grid.

__Hint__: The original data is vectorized, so you need to reshape each sample into a 32×32 RGB image.

In [None]:
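r = 4
f, axarr = plt.subplots(r, r, figsize=(8,8))

# Pick r*r distinct samples so that no image is drawn twice
idx = np.random.choice(X_dev.shape[0], r*r, replace=False)

for i in range(r):
    for j in range(r):
        # Each row holds 1024 red, then 1024 green, then 1024 blue values
        img_flat = X_dev[idx[i*r + j], :]
        img_R = img_flat[0:1024].reshape((32, 32))
        img_G = img_flat[1024:2048].reshape((32, 32))
        img_B = img_flat[2048:3072].reshape((32, 32))
        # Stack the three planes along the last axis to get a 32x32x3 image
        img = np.dstack((img_R, img_G, img_B))
        axarr[i][j].imshow(img)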


## Part 2: Automatic batch generator

We want you to create an automatic image generator that applies several kinds of data augmentation and produces a batch of randomly sampled data every time you call it.

<span style="color:red">__TODO__:</span> Finish the functions of class __ImageGenerator__ in __ecbm4040/image_generator.py__. The code is fully commented with instructions.

__Hint__: The Python keyword __yield__ and the built-in function __next__ can help here.
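
To illustrate the hint, here is a minimal sketch of a batch generator that is built with `yield` and consumed with `next`. The function name `batch_generator` and its parameters are hypothetical and are not taken from `ecbm4040/image_generator.py`; the real `ImageGenerator` may be organized differently.

```python
import numpy as np

def batch_generator(x, y, batch_size, shuffle=True, seed=None):
    """Yield (x_batch, y_batch) pairs forever. Hypothetical helper, not the course API."""
    rng = np.random.RandomState(seed)
    num_samples = x.shape[0]
    while True:
        # Draw a new sample order at the start of every pass over the data
        order = rng.permutation(num_samples) if shuffle else np.arange(num_samples)
        # Only full batches are produced; a trailing partial batch is skipped
        for start in range(0, num_samples - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            yield x[idx], y[idx]

# Toy data: 10 "images" of 4 values each, labelled 0..9
x_toy = np.arange(40).reshape(10, 4)
y_toy = np.arange(10)
gen = batch_generator(x_toy, y_toy, batch_size=4, shuffle=False)
x_batch, y_batch = next(gen)  # runs the generator body up to the first yield
```

Because the generator loops forever, a training loop can simply call `next(gen)` once per step without tracking epoch boundaries itself.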

In [3]:
from ecbm4040.image_generator import ImageGenerator

<span style="color:red">__TODO__:</span> Create an ImageGenerator object using the __development set__, and use its __show__ function to plot the first 16 original images.

In [4]:
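def reshapeImg(X, idx):
    # Rebuild sample idx as a 32x32x3 image from its flat row of 3072 values
    # (1024 red, then 1024 green, then 1024 blue) and scale it by 1/255.
    # Dividing by 255.0 keeps the result floating-point.
    img_flat = X[idx, :].flatten()
    img_R = img_flat[0:1024].reshape((32, 32)) / 255.0
    img_G = img_flat[1024:2048].reshape((32, 32)) / 255.0
    img_B = img_flat[2048:3072].reshape((32, 32)) / 255.0
    img = np.dstack((img_R, img_G, img_B))
    return img

def reshapeArray(X):
    # Convert every row of X into an image; the result has shape (N, 32, 32, 3)
    container = np.empty((X.shape[0], 32, 32, 3))
    for n in range(X.shape[0]):
        container[n] = reshapeImg(X, n)

    return container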
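
The per-image loop above can also be written as a single vectorized operation. The sketch below assumes the same layout that `reshapeImg` relies on (1024 red values, then 1024 green, then 1024 blue per row); `reshape_array_vectorized` is a hypothetical name. It returns `float32`, which needs half the memory of a default `float64` array, and that may matter when the full training set is converted.

```python
import numpy as np

def reshape_array_vectorized(X):
    """Turn rows of 3072 values into (N, 32, 32, 3) images scaled by 1/255."""
    # (N, 3072) -> (N, 3, 32, 32): channel planes first, as stored in each row
    planes = X.reshape(-1, 3, 32, 32)
    # Move the channel axis to the end: (N, 32, 32, 3)
    images = planes.transpose(0, 2, 3, 1)
    return images.astype(np.float32) / np.float32(255)

# Quick check on synthetic data: every red value is 255, green 0, blue 51
X_fake = np.zeros((2, 3072), dtype=np.uint8)
X_fake[:, :1024] = 255
X_fake[:, 2048:] = 51
out = reshape_array_vectorized(X_fake)
```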



In [None]:
dev_gen = ImageGenerator(reshapeArray(X_dev),y_dev)
dev_gen.show()

### Translation

<span style="color:red">__TODO:__</span> Translate the original __development set__ by several pixels in both directions, and plot the top 16 images like you just did.

In [None]:
del dev_gen
dev_gen = ImageGenerator(reshapeArray(X_dev), y_dev)

dev_gen.translate(shift_height=10, shift_width= -10)

dev_gen.show()
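
For reference, a translation of a whole batch can be sketched with `np.roll`, which shifts the pixels and wraps those that fall off one edge around to the opposite edge. Whether `ImageGenerator.translate` wraps or pads is not specified here, so treat the wrap-around as an assumption; `translate_batch` is a hypothetical stand-in, not the course implementation.

```python
import numpy as np

def translate_batch(x, shift_height, shift_width):
    """Shift a batch of shape (N, H, W, C) along height and width with wrap-around."""
    # Axis 1 is height and axis 2 is width; positive shifts move content down / right
    return np.roll(x, shift=(shift_height, shift_width), axis=(1, 2))

# One 4x4 single-channel image with a single bright pixel at row 0, column 0
x = np.zeros((1, 4, 4, 1))
x[0, 0, 0, 0] = 1.0
shifted = translate_batch(x, shift_height=1, shift_width=-1)
```

In this toy case the bright pixel moves one row down and, because of the wrap-around, from the first column to the last one.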

### Rotation

<span style="color:red">__TODO:__</span> Rotate the original __development set__ by several degrees, and plot the top 16 images like you just did. 

In [None]:
# YOUR CODE HERE
del dev_gen
dev_gen = ImageGenerator(reshapeArray(X_dev), y_dev)

dev_gen.rotate(angle=-1)

dev_gen.show()

### Flipping (horizontal and vertical)

<span style="color:red">__TODO:__</span> Flip the original __development set__ as you like (horizontal, vertical, or both), and plot the top 16 images like you just did. 

In [None]:
# YOUR CODE HERE
del dev_gen
dev_gen = ImageGenerator(reshapeArray(X_dev), y_dev)

dev_gen.flip(mode= 'hv')

dev_gen.show()
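
A flip amounts to reversing one or both spatial axes. The sketch below assumes that `'h'` means a left-right mirror, `'v'` a top-bottom mirror, and `'hv'` both, which matches the wording of the task but is not confirmed against `ImageGenerator.flip`; `flip_batch` is a hypothetical name.

```python
import numpy as np

def flip_batch(x, mode='h'):
    """Flip a batch of shape (N, H, W, C): 'h' left-right, 'v' top-bottom, 'hv' both."""
    if mode not in ('h', 'v', 'hv'):
        raise ValueError("mode must be 'h', 'v' or 'hv'")
    if 'h' in mode:
        x = x[:, :, ::-1, :]   # reverse the width axis
    if 'v' in mode:
        x = x[:, ::-1, :, :]   # reverse the height axis
    return x

# One 2x2 single-channel image with rows [0, 1] and [2, 3]
x = np.arange(4).reshape(1, 2, 2, 1)
flipped_h = flip_batch(x, 'h')[0, :, :, 0]
flipped_hv = flip_batch(x, 'hv')[0, :, :, 0]
```

Slicing with a negative step returns a view, so call `.copy()` on the result if the flipped batch will be modified in place.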

### Add Noise

<span style="color:red">__TODO:__</span> Inject random noise into the original __development set__, and plot the top 16 images like you just did.

In [None]:
# YOUR CODE HERE
del dev_gen
dev_gen = ImageGenerator(reshapeArray(X_dev), y_dev)

# Normally distributed noise
dev_gen.add_noise(portion=.05, amplitude=.5)
dev_gen.show()
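
As an illustration, noise injection could look like the sketch below. It assumes that `portion` is the fraction of images that receive noise and that `amplitude` scales standard normal noise; both interpretations are guesses about `ImageGenerator.add_noise`, and `add_noise_batch` is a hypothetical name.

```python
import numpy as np

def add_noise_batch(x, portion, amplitude, seed=None):
    """Add Gaussian noise to a random fraction `portion` of the images in x."""
    rng = np.random.RandomState(seed)
    num_noisy = int(round(portion * x.shape[0]))
    # Choose which images receive noise, without repetition
    idx = rng.choice(x.shape[0], num_noisy, replace=False)
    noisy = x.astype(np.float32)   # astype returns a copy, so x is left untouched
    noise = rng.standard_normal(noisy[idx].shape).astype(np.float32)
    noisy[idx] += amplitude * noise
    return noisy, idx

# 20 blank images; a quarter of them (5) should end up with noise
x = np.zeros((20, 32, 32, 3), dtype=np.float32)
noisy, idx = add_noise_batch(x, portion=0.25, amplitude=0.5, seed=0)
```

Returning the chosen indices alongside the batch makes it easy to check which samples were altered; the labels of those samples stay the same.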

## Part 3: Data Augmentation + LeNet

<span style="color:red">__TODO__:</span> You now have your own data generator. At the end of __ecbm4040/neuralnets/cnn.py__, there is an unimplemented function __my_training_task4()__. Copy the __my_training()__ function above it and modify the copy so that it uses your data generator for training. Train the network again; whether you see an improvement or a drop, record it and analyze why.

In [6]:
Train_X_Gen = ImageGenerator(reshapeArray(X_train), y_train)
X_val       = reshapeArray(X_val)


# Apply the augmentations to the training images

Train_X_Gen.add_noise(portion = .02, amplitude = .04)
Train_X_Gen.rotate(angle = -1)
Train_X_Gen.translate(shift_height=5, shift_width= -3)


MemoryError: 

In [5]:
from ecbm4040.neuralnets.cnn import my_training_task4
import tensorflow as tf
tf.reset_default_graph()



result, cache = my_training_task4(Train_X_Gen, X_val, y_val,
                                  conv_featmap=[10],
                                  fc_units=[84, 84],
                                  conv_kernel_size=[5],
                                  pooling_size=[2],
                                  l2_norm=.0001,
                                  seed=235,
                                  use_adam=True,
                                  learning_rate=.001,
                                  epoch=20,
                                  batch_size=245,
                                  verbose=False,
                                  pre_trained_model='lenet_1509670317')

# YOUR CODE HERE

MemoryError: 