<h1><center><font size="6">Final Lab: Transfer Learning with [MobileNet](https://arxiv.org/pdf/1704.04861.pdf) and [Cifar-10](https://www.cs.toronto.edu/~kriz/cifar.html)</font></center></h1>

<img src="https://cdn-images-1.medium.com/max/800/1*QN007xhxgDTPBdNT0pnZ2g.png" width="400"></img>

# Main task

In this notebook, we apply transfer learning to fine-tune the [MobileNet](https://arxiv.org/pdf/1704.04861.pdf) CNN on the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset.

# Procedures

In general, the main steps that we will follow are:

- <a href='#1'>Load data, analyze and split into *training*/*validation*/*testing* sets</a>  
- <a href='#2'>Load CNN and analyze architecture</a>  
- <a href='#3'>Adapt this CNN to our problem</a>  
- <a href='#4'>Set up data augmentation techniques</a>
- <a href='#5'>Add some Keras callbacks</a>   
- <a href='#6'>Set up the optimization algorithm and its hyperparameters</a>  
- <a href='#7'>Train model!</a>
- <a href='#8'>Choose best model/snapshot</a>
- <a href='#9'>Evaluate final model on the *testing* set</a>
- <a href='#10'>Conclusions</a>

# Students

* Maximiliano Armesto
* Martín Gonella

# Link to our DiploDatos Repo

* https://github.com/tyncho08/DiploDatos

---

In [4]:
# Setup one GPU for tensorflow (don't be greedy).
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# The GPU id to use, "0", "1", etc.
os.environ["CUDA_VISIBLE_DEVICES"] = "1" #Select only one GPU 
# Use ---> watch -n 1 nvidia-smi (to see GPU use)

# https://keras.io/applications/#documentation-for-individual-models
from keras.applications.mobilenet import MobileNet
from keras.datasets import cifar10
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Dropout
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
import cv2
import numpy as np
import tensorflow as tf
from IPython.display import SVG
import warnings
warnings.filterwarnings('ignore')
import matplotlib.pyplot as plt
%matplotlib inline

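# Limit TensorFlow GPU memory usage.
# Comment out these lines if you run TensorFlow on CPU.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.4  # Fraction of GPU memory to use
sess = tf.Session(config=config)
# Register the session with Keras; otherwise Keras creates its own session
# and this configuration is ignored.
from keras import backend as K
K.set_session(sess)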

---

# <a id="1">Load data, analyze and split into *training*/*validation*/*testing* sets</a>  

In [3]:
# CIFAR-10 class names
# Dictionary that maps each integer class id
# to its corresponding string class name
LABELS = {
    0: "airplane",
    1: "automobile",
    2: "bird",
    3: "cat",
    4: "deer",
    5: "dog",
    6: "frog",
    7: "horse",
    8: "ship",
    9: "truck"
}

# Load dataset from keras
(x_train_data, y_train_data), (x_test_data, y_test_data) = cifar10.load_data()

############
# [COMPLETED] 
# Some prints to see the loaded data dimensions
print("CIFAR-10 x_train shape: {}".format(x_train_data.shape))
print("CIFAR-10 y_train shape: {}".format(y_train_data.shape))
print("CIFAR-10 x_test shape: {}".format(x_test_data.shape))
print("CIFAR-10 y_test shape: {}".format(y_test_data.shape))
############

CIFAR-10 x_train shape: (50000, 32, 32, 3)
CIFAR-10 y_train shape: (50000, 1)
CIFAR-10 x_test shape: (10000, 32, 32, 3)
CIFAR-10 y_test shape: (10000, 1)


In [None]:
# Some constants
IMG_ROWS = 128
IMG_COLS = 128
NUM_CLASSES = 10
TEST_SIZE = 0.2
RANDOM_STATE = 2018
#Model
NO_EPOCHS = 50
BATCH_SIZE = 32

### Class distribution: Train set images 

In [None]:
############
# [COMPLETED] 
# Analyze the amount of images for each class
# Plot some images to explore how they look
############

def get_classes_distribution(y_data):
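    # Get the count for each label
    y = np.bincount(y_data)
    ii = np.nonzero(y)[0]
    # Materialize the pairs in a list: a bare zip() is exhausted by the loop
    # below, so the function would return an empty iterator
    label_counts = list(zip(ii, y[ii]))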

    # Get total number of samples
    total_samples = len(y_data)

    # Count the number of items in each class
    for label, count in label_counts:
        class_name = LABELS[label]
        percent = (count / total_samples) * 100
        print("{:<15s}:  {} or {:.2f}%".format(class_name, count, percent))
        
    return label_counts

train_label_counts = get_classes_distribution(np.reshape(y_train_data, len(y_train_data)))

In [None]:
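import seaborn as sns  # needed for sns.countplot below

def plot_label_per_class(y_data):
    # Flatten (N, 1) label arrays to (N,) so countplot sees a single variable
    y_data = np.ravel(y_data)
    classes = sorted(np.unique(y_data))
    f, ax = plt.subplots(1, 1, figsize=(12, 4))
    g = sns.countplot(x=y_data, order=classes, ax=ax)
    g.set_title("Number of labels for each class")
    
    # Annotate each bar with its class name
    for p, label in zip(g.patches, classes):
        g.annotate(LABELS[int(label)], (p.get_x(), p.get_height() + 0.2))
    
    plt.show()
    
plot_label_per_class(y_train_data)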

### Class distribution: Test set images 

In [None]:
get_classes_distribution(np.reshape(y_test_data, len(y_test_data)))

In [None]:
plot_label_per_class(y_test_data)

### Images: Train set

Let's plot a few sample images from the training set, each titled with its CIFAR-10 class name.

In [None]:
def sample_images_data(x_data, y_data):
    # An empty list to collect some samples
    sample_images = []
    sample_labels = []

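    # Flatten labels from shape (N, 1) to (N,) so each label is a plain scalar
    y_flat = np.ravel(y_data)

    # Iterate over the keys of the labels dictionary defined in the above cell
    for k in LABELS.keys():
        # Get four samples for each category
        samples = np.where(y_flat == k)[0][:4]
        # Append the samples to the samples list
        for s in samples:
            img = x_data[s]
            sample_images.append(img)
            # Store a plain int so it can be used as a key of LABELS
            sample_labels.append(int(y_flat[s]))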

    print("Total number of sample images to plot: ", len(sample_images))
    return sample_images, sample_labels

train_sample_images, train_sample_labels = sample_images_data(x_train_data, y_train_data)

In [None]:
def plot_sample_images(data_sample_images, data_sample_labels, cmap="gray"):
    # Plot the sample images now
    f, ax = plt.subplots(5, 8, figsize=(16, 10))

    for i, img in enumerate(data_sample_images):
        ax[i//8, i%8].imshow(img, cmap=cmap)
        ax[i//8, i%8].axis('off')
        ax[i//8, i%8].set_title(LABELS[data_sample_labels[i]])
    plt.show()    
    
plot_sample_images(train_sample_images, train_sample_labels)

### Images: Test set

In [None]:
test_sample_images, test_sample_labels = sample_images_data(x_test_data, y_test_data)
plot_sample_images(test_sample_images, test_sample_labels)

### Data preprocessing

In [None]:
def data_preprocessing(x_data, y_data):
    # One-hot encode the integer labels
    out_y = to_categorical(y_data, len(np.unique(y_data)))
    # CIFAR-10 images already have a channel axis (32, 32, 3), so no extra
    # dimension is added; we only scale pixel values to [0, 1]
    out_x = x_data.astype('float32') / 255.
    
    return out_x, out_y

In [None]:
# prepare the data
X, y = data_preprocessing(x_train_data, y_train_data)
X_test, y_test = data_preprocessing(x_test_data, y_test_data)
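
To make the label encoding concrete, here is a minimal NumPy sketch of what one-hot encoding produces. The helper `one_hot` is a hypothetical stand-in written for this illustration, not the Keras `to_categorical` implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype='float32')[np.ravel(labels)]

# Toy labels with shape (3, 1), like the CIFAR-10 label arrays
toy_labels = np.array([[3], [0], [9]])
encoded = one_hot(toy_labels, 10)

print(encoded.shape)               # (3, 10)
print(np.argmax(encoded, axis=1))  # [3 0 9]
```

Taking `np.argmax(..., axis=1)` recovers the integer labels, which is how the class distributions of the split sets are inspected later in the notebook.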

## Split train in train and validation set

We further split the original training set into training and validation subsets. The validation subset takes 20% of the original training set (`TEST_SIZE = 0.2`), which gives a 0.8/0.2 train/validation split, i.e. 40,000 and 10,000 images.
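
A plain random split keeps class proportions only approximately equal. If exact proportions matter, `train_test_split` accepts a `stratify` argument. The NumPy sketch below illustrates the idea on toy labels; the function `stratified_split_indices` and the toy data are assumptions made for this example, not part of the lab code.

```python
import numpy as np

def stratified_split_indices(labels, val_fraction, seed):
    """Split sample indices so each class keeps the same validation fraction."""
    rng = np.random.RandomState(seed)
    train_idx, val_idx = [], []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)  # indices of this class
        rng.shuffle(idx)                     # shuffle within the class
        n_val = int(round(len(idx) * val_fraction))
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return np.array(train_idx), np.array(val_idx)

# 10 balanced classes with 50 samples each (500 samples in total)
toy_labels = np.repeat(np.arange(10), 50)
train_idx, val_idx = stratified_split_indices(toy_labels, val_fraction=0.2, seed=2018)

print(len(train_idx), len(val_idx))      # 400 100
print(np.bincount(toy_labels[val_idx]))  # 10 validation samples per class
```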

In [None]:
############
# [COMPLETED] 
# Split training set in train/val sets
# Use the sampling method that you want
############

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=TEST_SIZE, random_state=RANDOM_STATE)

print("CIFAR-10 X_train  -  rows: {}  columns: {}".format(X_train.shape[0], X_train.shape[1:4]))
print("CIFAR-10 X_valid  -  rows: {}  columns: {}".format(X_val.shape[0], X_val.shape[1:4]))
print("CIFAR-10 X_test   -  rows: {}  columns: {}".format(X_test.shape[0], X_test.shape[1:4]))
print("-----------------------------------------------------------")
print("CIFAR-10 y_train  -  rows: {}  columns: {}".format(y_train.shape[0], y_train.shape[1]))
print("CIFAR-10 y_valid  -  rows: {}  columns: {}".format(y_val.shape[0], y_val.shape[1]))
print("CIFAR-10 y_test   -  rows: {}  columns: {}".format(y_test.shape[0], y_test.shape[1]))

In [None]:
get_classes_distribution(np.argmax(y_train, axis=1))
plot_label_per_class(np.argmax(y_train, axis=1))

In [None]:
get_classes_distribution(np.argmax(y_val, axis=1))
plot_label_per_class(np.argmax(y_val, axis=1))

In [None]:
# Save the dataset to disk

def save_to_disk(x_data, y_data, usage, output_dir='cifar10_images'):
    """    
    x_data : np.ndarray
        Array with images.
    
    y_data : np.ndarray
        Array with labels.
    
    usage : str
        One of ['train', 'val', 'test'].

    output_dir : str
        Path to save data.
    """
    assert usage in ['train', 'val', 'test']
    
    # Set paths 
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    for label in np.unique(y_data):
        label_path = os.path.join(output_dir, usage, str(label))
        if not os.path.exists(label_path):
            os.makedirs(label_path)
    
    for idx, img in enumerate(x_data):
        bgr_img = img[..., ::-1]  # RGB -> BGR
        label = y_data[idx][0]
        img_path = os.path.join(output_dir, usage, str(label), 'img_{}.jpg'.format(idx))

        retval = cv2.imwrite(img_path, bgr_img)
        assert retval, 'Problem saving image at index:{}'.format(idx)


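############
# [COMPLETED] 
# Use the above function to save all your data.
# save_to_disk expects raw uint8 images and integer labels of shape (N, 1),
# so we split the raw arrays here. Reusing TEST_SIZE and RANDOM_STATE
# reproduces the same train/validation partition as the split above.
x_train, x_val, y_train_raw, y_val_raw = train_test_split(
    x_train_data, y_train_data, test_size=TEST_SIZE, random_state=RANDOM_STATE)

# Images are written to data/<usage>/<label>/, the layout read by flow_from_directory
save_to_disk(x_train, y_train_raw, 'train', 'data')
save_to_disk(x_val, y_val_raw, 'val', 'data')
save_to_disk(x_test_data, y_test_data, 'test', 'data')
############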
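
`flow_from_directory` expects one subdirectory per class and assigns class indices by sorting the subdirectory names alphanumerically. The sketch below shows the resulting layout and index mapping; `image_path` is a hypothetical helper and the root folder name `data` is an assumption made for the example.

```python
import os

def image_path(root, usage, label, idx):
    # <root>/<usage>/<label>/img_<idx>.jpg, the same pattern used by save_to_disk
    return os.path.join(root, usage, str(label), 'img_{}.jpg'.format(idx))

example = image_path('data', 'train', 3, 42)

# Class indices follow the sorted directory names
class_dirs = sorted(str(label) for label in range(10))
class_indices = {name: i for i, name in enumerate(class_dirs)}

# Pitfall with more than 10 classes: string sorting puts '10' before '2'
twelve = sorted(str(label) for label in range(12))

print(example)
print(class_indices['7'])
print(twelve[:4])
```

With the ten CIFAR-10 classes, the string order `'0'` to `'9'` matches the numeric order, so directory `'7'` maps to class index 7. With more classes, zero-padded names such as `'07'` would avoid the ordering pitfall.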

---

# <a id="2">Load CNN and analyze architecture</a> 

In [None]:
############
# [COMPLETED] 
# Use the MobileNet class from Keras to load your base model, pre-trained on imagenet.
# We want to load the pre-trained weights, but without the classification layer.
# Check the notebook '3_transfer-learning' or https://keras.io/applications/#mobilenet to get more
# info about how to load this network properly.
############

base_model = MobileNet(input_shape=(IMG_ROWS, IMG_COLS, 3),  # Input image size (128x128 RGB)
                       alpha=1.0,                            # Width multiplier: 1.0 keeps the paper's default number of filters per layer
                       weights='imagenet',                   # Use ImageNet pre-trained weights
                       include_top=False,                    # Drop classification layer
                       pooling='avg')                        # Global average pooling for the output feature vector

### Inspect the model

In [None]:
base_model.summary()
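
MobileNet is built from depthwise separable convolutions: a per-channel (depthwise) filter followed by a 1x1 (pointwise) convolution. As a rough illustration of why this keeps the parameter count low, the sketch below counts the weights of a single layer. The helper names and the 3x3, 512-to-512-channel layer are assumptions chosen for the example, and biases and batch-normalization parameters are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    # Every output channel has its own kernel x kernel filter over all inputs
    return kernel * kernel * c_in * c_out

def separable_conv_params(kernel, c_in, c_out):
    depthwise = kernel * kernel * c_in  # one spatial filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise

std = standard_conv_params(3, 512, 512)
sep = separable_conv_params(3, 512, 512)
ratio = sep / std

print(std, sep, round(ratio, 3))  # 2359296 266752 0.113
```

The ratio equals `1/c_out + 1/kernel**2`, so for 3x3 kernels the separable layer needs roughly 8 to 9 times fewer weights.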

In [None]:
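from keras.utils import plot_model
from keras.utils.vis_utils import model_to_dot

output_dir_fig = 'Models_Figs/'
if not os.path.exists(output_dir_fig):
    os.makedirs(output_dir_fig)

# Plot the base model (the adapted `model` is only defined in the next section)
plot_model(base_model, to_file=os.path.join(output_dir_fig, 'base_model.png'))
SVG(model_to_dot(base_model).create(prog='dot', format='svg'))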

---

# <a id="3">Adapt this CNN to our problem</a> 

### We add some layers

In [None]:
############
# [COMPLETED] 
# With the CNN loaded, we now have to add some layers to adapt this network to our
# classification problem.
# We can choose to fine-tune just the newly added layers, some particular layers or all the layers
# of the model. Play with different settings and compare the results.
############

# get the output feature vector from the base model
x = base_model.output

# let's add a fully-connected layer with dropout
x = Dense(1024, activation='relu')(x)
x = Dropout(0.5)(x)

# let's add a fully-connected layer  with dropout
x = Dense(512, activation='relu')(x)
x = Dropout(0.25)(x)

# let's add a fully-connected layer  with dropout
x = Dense(256, activation='relu')(x)
x = Dropout(0.25)(x)

# and a softmax classification layer with one output per class
predictions = Dense(NUM_CLASSES, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

In [None]:
# Check the final model
model.summary()
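
As a cross-check of the summary, the size of the new classification head can be computed by hand. The sketch below assumes the base model outputs a 1024-dimensional feature vector (the MobileNet output size with `alpha=1` after global average pooling); `dense_params` is a hypothetical helper. Dropout layers add no parameters.

```python
def dense_params(n_in, n_out):
    # Weight matrix plus one bias per output unit
    return n_in * n_out + n_out

FEATURE_DIM = 1024  # assumed size of the pooled MobileNet feature vector
layer_sizes = [FEATURE_DIM, 1024, 512, 256, 10]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
head_total = sum(per_layer)

print(per_layer)   # [1049600, 524800, 131328, 2570]
print(head_total)  # 1708298
```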

In [None]:
plot_model(model, to_file='Models_Figs/adapted_model.png')
SVG(model_to_dot(model).create(prog='dot', format='svg'))

---

# <a id="4">Set up data augmentation techniques</a> 

In [None]:
############
# [COMPLETED] 
# Use data augmentation to train your model.
# Use the Keras ImageDataGenerator class for this purpose.
# Note: Given that we want to load our images from disk, instead of using 
# ImageDataGenerator.flow method, we have to use ImageDataGenerator.flow_from_directory 
# method in the following way:
#    generator_train = datagen_train.flow_from_directory('resized_images/train', 
#                                                        target_size=(128, 128), batch_size=32)
#    generator_val = datagen_val.flow_from_directory('resized_images/val', 
#                                                    target_size=(128, 128), batch_size=32)
# Note that we have to resize our images to finetune the MobileNet CNN, this is done using 
# the target_size argument in flow_from_directory. Remember to set the target_size to one of
# the valid listed here: [(128, 128), (160, 160), (192, 192), or (224, 224)].
############

# Training data generator
datagen_train = ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    vertical_flip=False,
    rotation_range=15,
    width_shift_range=0.2,
    height_shift_range=0.2
    # Statistics-based normalization (featurewise_* and zca_whitening) needs a
    # prior datagen_train.fit(...) call, and samplewise_* would make training
    # inputs differ from validation inputs, which are only rescaled.
    # Both are therefore left out.
    )

# Validation data generator
datagen_val = ImageDataGenerator(
    rescale=1./255)

---

# <a id="5">Add some Keras callbacks</a> 

In [None]:
############
# [COMPLETE] 
# Load and set some Keras callbacks here!
############

from keras.callbacks import ModelCheckpoint, EarlyStopping, TensorBoard, ReduceLROnPlateau

EXP_ID = 'logs/experiment_000/'

if not os.path.exists(EXP_ID):
    os.makedirs(EXP_ID)

callbacks = [ModelCheckpoint(filepath=os.path.join(EXP_ID, 'weights.{epoch:02d}-{val_loss:.2f}.hdf5'),
                    monitor='val_loss', 
                    verbose=1, 
                    save_best_only=False, 
                    save_weights_only=False, 
                    mode='auto'),
             EarlyStopping(monitor='val_loss', 
                  min_delta=0, 
                  patience=2, 
                  verbose=1, 
                  mode='auto', 
                  baseline=None, 
                  restore_best_weights=False),
             TensorBoard(log_dir=os.path.join(EXP_ID, 'logs'), 
                write_graph=True, 
                write_images=False),
             ReduceLROnPlateau(monitor='val_loss', 
                      factor=0.1, 
                      patience=10, 
                      verbose=0, 
                      mode='auto')
]

If you have installed TensorFlow with pip, you should be able to launch TensorBoard from the command line:

```
tensorboard --logdir=/full_path_to_your_logs
```
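
Both `EarlyStopping` and `ReduceLROnPlateau` count epochs without improvement of the monitored value, so their `patience` settings interact. The pure-Python sketch below is a simplified model of that counting logic (not the Keras source), run on made-up validation losses. With the patience values used above (2 and 10), early stopping is reached first, so the learning-rate reduction would not get a chance to fire.

```python
def first_trigger_epoch(val_losses, patience, min_delta=0.0):
    """Return the 1-based epoch at which `patience` epochs without
    improvement have accumulated, or None if that never happens."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss  # improvement: reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses: improvement stops after epoch 3
val_losses = [1.00, 0.80, 0.75, 0.76, 0.77, 0.78, 0.79]

stop_epoch = first_trigger_epoch(val_losses, patience=2)
reduce_lr_epoch = first_trigger_epoch(val_losses, patience=10)

print(stop_epoch, reduce_lr_epoch)  # 5 None
```

If the learning-rate schedule is meant to act before training stops, its patience would have to be smaller than the early-stopping patience.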

---

# <a id="6">Set up the optimization algorithm and its hyperparameters</a> 

In [None]:
############
# [COMPLETE] 
# Choose some optimization algorithm and explore different hyperparameters.
# Compile your model.
############

from keras import optimizers

#Choose one optimizer
optimizador = optimizers.SGD(lr=0.0001, momentum=0.9)
#optimizador = optimizers.RMSprop(lr=0.001, rho=0.9, epsilon=None, decay=0.0)
#optimizador = optimizers.Adagrad(lr=0.01, epsilon=None, decay=0.0)
#optimizador = optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=None, decay=0.0)
#optimizador = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
#optimizador = optimizers.Adamax(lr=0.002, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0)

model.compile(loss='categorical_crossentropy',
              optimizer=optimizador,
              metrics=['accuracy'])
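
For intuition about the selected optimizer, the sketch below applies the classical (non-Nesterov) SGD-with-momentum update, `v = momentum * v - lr * grad` followed by `w = w + v`, to a one-dimensional toy problem. The function `sgd_momentum_step` and the objective `f(w) = w**2` are assumptions made for this illustration.

```python
def sgd_momentum_step(w, v, grad, lr=0.0001, momentum=0.9):
    # The velocity accumulates a decaying sum of past gradients
    v = momentum * v - lr * grad
    return w + v, v

w, v = 1.0, 0.0
history = []
for _ in range(3):
    grad = 2.0 * w  # gradient of f(w) = w**2
    w, v = sgd_momentum_step(w, v, grad)
    history.append(w)

print(history)
```

With such a small learning rate each step moves the weight only slightly, which suits fine-tuning: the pre-trained weights are adjusted gently instead of being overwritten.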

---

# <a id="7">Train model!</a> 

In [None]:
############
# [COMPLETE] 
# Use fit_generator to train your model.
# e.g.:
#    model.fit_generator(
#        generator_train,
#        epochs=50,
#        validation_data=generator_val,
#        steps_per_epoch=generator_train.n // 32,
#        validation_steps=generator_val.n // 32)
############

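# Alternative kept for reference: feeding the in-memory arrays with `flow`.
# Note that `flow` has no `target_size` argument and that X_train and X_val are
# already scaled to [0, 1], so they would first have to be resized to
# (IMG_ROWS, IMG_COLS) and fed through generators without `rescale`.
#
# train_model = model.fit_generator(
#     datagen_train.flow(X_train, y_train, batch_size=BATCH_SIZE),
#     epochs=NO_EPOCHS,
#     validation_data=datagen_val.flow(X_val, y_val, batch_size=BATCH_SIZE),
#     steps_per_epoch=X_train.shape[0] // BATCH_SIZE,
#     validation_steps=X_val.shape[0] // BATCH_SIZE,
#     callbacks=callbacks)

# Generators that read the images from disk and resize them to the network input size
generator_train = datagen_train.flow_from_directory('data/train',
                                                    target_size=(IMG_ROWS, IMG_COLS),
                                                    batch_size=BATCH_SIZE)
generator_val = datagen_val.flow_from_directory('data/val',
                                                target_size=(IMG_ROWS, IMG_COLS),
                                                batch_size=BATCH_SIZE)

train_model = model.fit_generator(
    generator_train,
    epochs=NO_EPOCHS,
    validation_data=generator_val,
    steps_per_epoch=generator_train.n // BATCH_SIZE,
    validation_steps=generator_val.n // BATCH_SIZE,
    callbacks=callbacks)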
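
The step counts passed to `fit_generator` follow from integer division of the set sizes by the batch size. The small sketch below works through the arithmetic for the 40,000/10,000 split and a batch size of 32; `steps_and_leftover` is a hypothetical helper.

```python
import math

def steps_and_leftover(n_samples, batch_size):
    # Full batches per epoch, and samples that do not fill a last batch
    return n_samples // batch_size, n_samples % batch_size

train_steps, train_left = steps_and_leftover(40000, 32)
val_steps, val_left = steps_and_leftover(10000, 32)

# Rounding up instead would cover every validation sample once per pass
val_steps_ceil = math.ceil(10000 / 32)

print(train_steps, train_left)  # 1250 0
print(val_steps, val_left)      # 312 16
print(val_steps_ceil)           # 313
```

With floor division, 16 validation images are left out of each validation pass; rounding up is one way to include them.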

---

# <a id="8">Choose best model/snapshot</a> 

Using TensorBoard, I evaluated the model and its results, and I consider the following snapshot to be the best choice:

In [None]:
############
# [COMPLETE] 
# Analyze and compare your results. Choose the best model and snapshot, 
# and justify your choice. 
############


---

# <a id="9">Evaluate final model on the *testing* set</a> 

In [None]:
############
# [COMPLETED] 
# Evaluate your model on the testing set.
############

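# The model expects IMG_ROWS x IMG_COLS inputs, so we evaluate on the test images
# saved to disk, resized and rescaled exactly like the validation data.
generator_test = datagen_val.flow_from_directory('data/test',
                                                 target_size=(IMG_ROWS, IMG_COLS),
                                                 batch_size=BATCH_SIZE,
                                                 shuffle=False)
score = model.evaluate_generator(generator_test, steps=len(generator_test))
print('Test loss:', score[0])
print('Test accuracy:', score[1])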

### Validation accuracy and loss

Let's plot the training and validation accuracy and loss recorded in the training history.

In [None]:
def plot_accuracy_and_loss(train_model):
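    hist = train_model.history
    # Older Keras versions log 'acc'/'val_acc'; newer ones log 'accuracy'/'val_accuracy'
    acc = hist.get('acc', hist.get('accuracy'))
    val_acc = hist.get('val_acc', hist.get('val_accuracy'))
    loss = hist['loss']
    val_loss = hist['val_loss']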
    epochs = range(len(acc))
    f, ax = plt.subplots(1,2, figsize=(14,6))
    ax[0].plot(epochs, acc, 'g', label='Training accuracy')
    ax[0].plot(epochs, val_acc, 'r', label='Validation accuracy')
    ax[0].set_title('Training and validation accuracy')
    ax[0].legend()
    ax[1].plot(epochs, loss, 'g', label='Training loss')
    ax[1].plot(epochs, val_loss, 'r', label='Validation loss')
    ax[1].set_title('Training and validation loss')
    ax[1].legend()
    plt.show()

plot_accuracy_and_loss(train_model)

In [None]:
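# get the predictions for the test data
# The functional `Model` has no `predict_classes`, so we take the argmax of the
# predicted probabilities. The raw 32x32 test images are resized to the network
# input size and rescaled batch by batch to keep memory usage low.
probs = []
for start in range(0, len(x_test_data), BATCH_SIZE):
    batch = np.stack([cv2.resize(img, (IMG_COLS, IMG_ROWS))
                      for img in x_test_data[start:start + BATCH_SIZE]])
    probs.append(model.predict(batch / 255.))
predicted_classes = np.argmax(np.concatenate(probs), axis=1)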

In [None]:
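p = predicted_classes
# y_test_data has shape (N, 1); flatten it so the comparison is element-wise
# instead of broadcasting to an (N, N) matrix
y = np.ravel(y_test_data)
correct = np.nonzero(p == y)[0]
incorrect = np.nonzero(p != y)[0]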

print("Correct predicted classes:", correct.shape[0])
print("Incorrect predicted classes:", incorrect.shape[0])
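
A detail worth checking when counting correct predictions: the CIFAR-10 labels are loaded with shape `(N, 1)`, while class predictions obtained with `argmax` have shape `(N,)`. Comparing the two directly broadcasts to an `(N, N)` matrix. The toy example below, with made-up values, shows the difference.

```python
import numpy as np

pred = np.array([0, 1, 2, 1])              # shape (4,)
true_col = np.array([[0], [1], [1], [1]])  # shape (4, 1), like y_test_data

wrong = (pred == true_col)          # broadcasts to shape (4, 4)
right = (pred == true_col.ravel())  # element-wise, shape (4,)
n_correct = int(right.sum())

print(wrong.shape)  # (4, 4)
print(right.shape)  # (4,)
print(n_correct)    # 3
```

Flattening the label array before the comparison gives one boolean per test image, which is what the correct and incorrect counts need.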

In [None]:
from sklearn.metrics import classification_report

target_names = ["Class {} ({}) :".format(i, LABELS[i]) for i in range(NUM_CLASSES)]
print(classification_report(np.ravel(y_test_data), predicted_classes, target_names=target_names))
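
The per-class precision and recall in the report can be derived from a confusion matrix. The NumPy sketch below shows the computation on a made-up three-class example; the helper `confusion_matrix` is written for this illustration and is not the scikit-learn function of the same name.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    # cm[i, j] counts samples of true class i predicted as class j
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(y_true, y_pred, 3)
precision = np.diag(cm) / cm.sum(axis=0)  # correct / everything predicted as the class
recall = np.diag(cm) / cm.sum(axis=1)     # correct / everything truly in the class

print(cm)
print(precision)
print(recall)
```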

### Visualize classified images

#### Correctly classified images

Let's visualize a few correctly classified images.

In [None]:
def plot_predicted_images(predictions, data_index, x_data, y_data, size=16, cmap="gray"):
    # Flatten labels from shape (N, 1) to (N,) so they can index the LABELS dictionary
    y_data = np.ravel(y_data)
    # Plot the sample images now
    f, ax = plt.subplots(4, 4, figsize=(15, 15))

    for i, indx in enumerate(np.random.choice(data_index, size=size, replace=False)):
        # Images are RGB arrays of shape (32, 32, 3) and are shown as-is; cmap is ignored for RGB data
        ax[i//4, i%4].imshow(x_data[indx], cmap=cmap)
        ax[i//4, i%4].axis('off')
        ax[i//4, i%4].set_title("True:{}  Pred:{}".format(LABELS[int(y_data[indx])], LABELS[int(predictions[indx])]))
    plt.show()    
    
plot_predicted_images(predicted_classes, correct, x_test_data, y_test_data, cmap="Greens")

#### Incorrectly classified images

Let's also look at a few incorrectly classified images.

In [None]:
plot_predicted_images(predicted_classes, incorrect, x_test_data, y_test_data, cmap="Reds")

---

# <a id="10">Conclusions</a> 

Here come the bright conclusions that I could extract from the results ;)