# Deep Learning week - Day 3 - Exercise 2

In this notebook, we define a simple baseline CNN to classify the 10 categories of the CIFAR-10 dataset. Each image has shape (32, 32, 3): 32 × 32 pixels with three color channels.

⚠️ **Warning** ⚠️ For now, computations run on your CPU: keep in mind that training a model takes about 10 minutes in this notebook, so don't waste your training runs!

## The data

❓ **Question** ❓ Load the data and the associated labels.

In [None]:
# YOUR CODE HERE

❓ **Question** ❓ Normalize your data by dividing by the maximum value.

In [None]:
# YOUR CODE HERE
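
As a minimal sketch of this step, assuming the images are stored as `uint8` arrays with values in [0, 255] (the toy array `images` below is a made-up stand-in, not the real dataset), dividing by the maximum value maps every pixel into [0, 1]:

```python
import numpy as np

# Hypothetical stand-in for a batch of images: 2 images of 4x4 pixels, 3 channels
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)
images[0, 0, 0, 0] = 255  # make sure the largest possible value is present

# Divide by the maximum value; the result is a float array in [0, 1]
images_normalized = images / images.max()
```

The same scaling has to be applied to the training and the test images so that both live in the same range.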

❓ **Question** ❓ Display some of the images using the `imshow` function from matplotlib - and print the corresponding class.

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

# YOUR PLOT HERE

❓ **Question** ❓ Convert the current labels into one-hot encoded labels - stored in `y_train` and `y_test`.

In [None]:
# YOUR CODE HERE
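
One way to picture one-hot encoding, sketched on made-up labels rather than the real ones: each integer class `k` becomes a vector of length 10 with a single 1 at position `k`. The helper `one_hot` below is hypothetical; in Keras, `tensorflow.keras.utils.to_categorical` does the same job.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (n,) or (n, 1) into an (n, num_classes) array."""
    # Flatten first: this sketch assumes labels may arrive as a column of shape (n, 1)
    labels = np.asarray(labels).reshape(-1)
    # Row k of the identity matrix is the one-hot vector for class k
    return np.eye(num_classes, dtype="float32")[labels]

toy_labels = np.array([[3], [0], [9]])  # made-up labels, not taken from the dataset
toy_one_hot = one_hot(toy_labels, num_classes=10)
print(toy_one_hot.shape)
```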

## The Convolutional Neural Network

Now, let's define the Convolutional Neural Network - CNN. 

❓ **Question** ❓ Define a CNN that is composed of:
- a Conv2D layer with 16 filters, a kernel size of (4, 4), the relu activation function, and a padding equal to `same`
- a MaxPooling2D layer with a pool size of (2, 2)
- a Conv2D layer with 32 filters, a kernel size of (3, 3), the relu activation function, and a padding equal to `same`
- a MaxPooling2D layer with a pool size of (3, 3)
- a Conv2D layer with 64 filters, a kernel size of (3, 3), the relu activation function, and a padding equal to `same`
- a MaxPooling2D layer with a pool size of (3, 3)
- a Flatten layer
- a Dense layer with 75 neurons and the `relu` activation function
- a Dense layer suited to your task
 
 PS: Do not include the compilation here.
 
 ⚠️ **Warning** ⚠️ Do not forget to add the input shape of your data to the first layer. And do not forget that it has three colors ;)

In [None]:
def initialize_model():
    # YOUR CODE HERE

❓ **Question** ❓ What is the number of parameters of your model? 

Hint: `model.summary()`

In [None]:
# YOUR CODE HERE
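
To cross-check the summary by hand, the count can be sketched in plain Python. This assumes the architecture listed above, an input of shape (32, 32, 3), Keras' default pooling behavior (stride equal to the pool size, no padding), and a final Dense layer with 10 units, one per class. The helper names are made up for this illustration.

```python
def conv_params(kernel, in_channels, filters):
    # One kernel of size kernel x kernel x in_channels per filter, plus one bias each
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    # One weight per (input, unit) pair, plus one bias per unit
    return inputs * units + units

def pooled(size, pool):
    # Assumed default pooling: stride equal to the pool size, 'valid' padding
    return (size - pool) // pool + 1

size, channels = 32, 3
total = 0
# (kernel size, number of filters, pool size) for the three Conv2D + MaxPooling2D pairs
for kernel, filters, pool in [(4, 16, 2), (3, 32, 3), (3, 64, 3)]:
    total += conv_params(kernel, channels, filters)  # 'same' padding keeps height and width
    size, channels = pooled(size, pool), filters

flat = size * size * channels                        # length of the Flatten output
total += dense_params(flat, 75) + dense_params(75, 10)
print(size, flat, total)
```

If the total printed by `model.summary()` differs, comparing the two counts layer by layer usually points to the layer whose arguments are off.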

❓ **Question** ❓ Write a function to compile your model. 

[ Advanced ] This is optional, but you can try the `adam` optimizer with a learning rate of 0.005.

In [None]:
# Import

def compile_model(model):
    # YOUR CODE HERE

❓ **Question** ❓ Initialize a model and compile it. Then fit it on your training data with early stopping: set `patience=5` (to keep computations fast), `restore_best_weights=True`, and `min_delta=1e-2` (check the documentation if you want to know what these do).

Store the output of the fit in a `history` variable.

In [None]:
# YOUR CODE HERE
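
To get a feel for `patience` and `min_delta`, here is a simplified pure-Python sketch of the stopping rule, run on made-up validation losses. It is not Keras' implementation, and `simulate_early_stopping` is a hypothetical helper: an epoch counts as an improvement only if the loss drops more than `min_delta` below the best value so far, and training stops after `patience` epochs in a row without improvement.

```python
def simulate_early_stopping(val_losses, patience=5, min_delta=1e-2):
    """Return (epoch at which training stops, epoch with the best loss)."""
    best = float("inf")
    best_epoch = None
    wait = 0  # number of consecutive epochs without a sufficient improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # The losses ran out before the criterion triggered
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: a clear gain at epoch 1, then only tiny changes
toy_losses = [1.0, 0.8, 0.795, 0.792, 0.81, 0.80]
stop_epoch, best_epoch = simulate_early_stopping(toy_losses, patience=3, min_delta=1e-2)
print(stop_epoch, best_epoch)
```

In this toy run, the small decreases at epochs 2 and 3 do not count because they are below `min_delta`. With `restore_best_weights=True`, the weights kept at the end are those of the best epoch rather than those of the last one.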

❓ **Question** ❓ Run the following function on the previous history.

In [None]:
def plot_history(history, title='', axs=None, exp_name=""):
    if axs is not None:
        ax1, ax2 = axs
    else:
        f, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    
    if len(exp_name) > 0 and exp_name[0] != '_':
        exp_name = '_' + exp_name
    ax1.plot(history.history['loss'], label='train' + exp_name)
    ax1.plot(history.history['val_loss'], label='val' + exp_name)
    ax1.set_ylim(0., 2.2)
    ax1.set_title('loss')
    ax1.legend()

    ax2.plot(history.history['accuracy'], label='train accuracy'  + exp_name)
    ax2.plot(history.history['val_accuracy'], label='val accuracy'  + exp_name)
    ax2.set_ylim(0.25, 1.)
    ax2.set_title('Accuracy')
    ax2.legend()
    return (ax1, ax2)

In [None]:
# YOUR PLOT HERE

❓ **Question** ❓ Evaluate your model on the test data. Are you satisfied with this performance? What is the chance level on this task?

In [None]:
# YOUR CODE HERE

## Data augmentation

Next, we introduce data augmentation: a method that enlarges the training set by slightly altering the training images (mirroring, cropping, intensity changes, etc.) while keeping the same labels. This technique is intended to improve the generalization of the model and thus its performance.
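
As a rough illustration of the idea (not of what Keras does internally), two of these transformations can be written in a few lines of NumPy. The image below is random noise standing in for a real picture, and the label is made up; the point is that the pixels change while the label does not.

```python
import numpy as np

rng = np.random.default_rng(42)
image = rng.random((32, 32, 3))  # stand-in for one normalized RGB image
label = 7                        # made-up class index

# Mirroring: reverse the width axis (axis 1 of a height x width x channels array)
mirrored = image[:, ::-1, :]

# Intensity change: darken the image by a random factor in [0.8, 1.0]
factor = rng.uniform(0.8, 1.0)
darkened = np.clip(image * factor, 0.0, 1.0)

# Both augmented images keep the label of the original
augmented_samples = [(mirrored, label), (darkened, label)]
```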

Here, we will augment the data on the fly, at each epoch, using Keras utilities directly. Many kinds of augmentation are available (see [documentation](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator)); let's look at some of them.

The functions might be a little confusing, but don't be put off: they are designed to create the images on the fly instead of loading them all into memory, which could be very heavy.
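
The on-the-fly idea can be sketched with a plain Python generator: instead of building an augmented copy of the whole dataset, it produces one freshly augmented batch at a time. This is only an illustration with random stand-in data and a single transformation (random horizontal flips); `augmented_batches` is a hypothetical helper, not part of Keras.

```python
import numpy as np

def augmented_batches(X, y, batch_size, rng):
    """Yield endless (images, labels) batches, flipping each image with probability 0.5."""
    n = len(X)
    while True:
        order = rng.permutation(n)                    # reshuffle at every pass over the data
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch = X[idx].copy()                     # only this batch is duplicated in memory
            flip = rng.random(len(idx)) < 0.5
            batch[flip] = batch[flip][:, :, ::-1, :]  # mirror the selected images along the width
            yield batch, y[idx]

rng = np.random.default_rng(0)
X_toy = rng.random((10, 32, 32, 3))                   # stand-in images
y_toy = np.arange(10)                                 # stand-in labels
batches = augmented_batches(X_toy, y_toy, batch_size=4, rng=rng)
first_images, first_labels = next(batches)
```

Because the random draws are repeated at every pass, the model sees a slightly different version of each image at each epoch.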

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    featurewise_center=False,
    featurewise_std_normalization=False,
    rotation_range=10,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    brightness_range=(0.8, 1.),
    zoom_range=(0.8, 1.2),
    rescale=1./255.) 

datagen.fit(X_train)

Let's now visualize the input data and what the ImageDataGenerator generates.

In [None]:
import numpy as np

viz_flow = datagen.flow(X_train, shuffle=False, batch_size=1)

for i, (raw_image, augmented_image) in enumerate(zip(X_train, viz_flow)):
    _, (ax1, ax2) = plt.subplots(1, 2, figsize=(6, 2))
    ax1.imshow(raw_image)
    ax2.imshow(augmented_image[0])
    plt.show()
    
    if i > 10:
        break

Now, run the model.

❓ **Question** ❓ Do you expect training to be faster in terms of number of epochs? In terms of computation time?

In [None]:
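from tensorflow.keras.callbacks import EarlyStopping

# The model (initialize_model and compile_model are the functions defined above)
model_2 = initialize_model()
model_2 = compile_model(model_2)

# The data generator: first 30,000 images for training, the rest for validation
X_tr = X_train[:30000]
y_tr = y_train[:30000]
X_val = X_train[30000:]
y_val = y_train[30000:]
train_flow = datagen.flow(X_tr, y_tr, batch_size=16)

# The early stopping criterion
es = EarlyStopping(patience=5, verbose=1, restore_best_weights=True, min_delta=1e-2)

# The fit (Model.fit accepts generators directly; fit_generator is deprecated)
history_2 = model_2.fit(train_flow,
                        epochs=100,
                        callbacks=[es],
                        validation_data=(X_val, y_val))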


In [None]:
# YOUR ANSWER HERE

❓ **Question** ❓ Now, let's plot the previous and current run histories. What do you think of the data augmentation?

In [None]:
axs = plot_history(history_2, exp_name='data_augmentation')
plot_history(history, axs=axs, exp_name='baseline')
plt.show()

In [None]:
# YOUR ANSWER HERE

❓ **Question** ❓ Evaluate the model on the test data. Do you see an improvement?

In [None]:
test_flow = datagen.flow(X_test, y_test)
model_2.evaluate(test_flow)

In [None]:
model.evaluate(X_test, y_test, verbose=0)

In [None]:
# YOUR ANSWER HERE

## Remark

In these experiments, we stopped training early to keep the experiments fast. In practice, training should be allowed to run longer, with a stopping criterion that has a lower `min_delta` and a higher patience.

Let's see how to make it quicker in the next exercise.