# Final Lab

## Main task

In this notebook, we will apply transfer learning techniques to fine-tune the [MobileNet](https://arxiv.org/pdf/1704.04861.pdf) CNN on the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset.

## Procedures

In general, the main steps that we will follow are:

1. Load data, analyze and split into *training*/*validation*/*testing* sets.
2. Load CNN and analyze architecture.
3. Adapt this CNN to our problem.
4. Set up data augmentation techniques.
5. Add some Keras callbacks.
6. Set up the optimization algorithm and its hyperparameters.
7. Train model!
8. Choose best model/snapshot.
9. Evaluate final model on the *testing* set.


In [None]:
# Set up a single GPU for TensorFlow (don't be greedy).
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# The GPU id to use, "0", "1", etc.
os.environ["CUDA_VISIBLE_DEVICES"] = "0" 

# https://keras.io/applications/#documentation-for-individual-models
from keras.applications.mobilenet import MobileNet
from keras.datasets import cifar10
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
import cv2
import numpy as np
import tensorflow as tf

# Limit TensorFlow GPU memory usage.
# Comment out these lines if you run TensorFlow on CPU.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.3
sess = tf.Session(config=config)


## 1. Load data, analyze and split into *training*/*validation*/*testing* sets

In [None]:
# Cifar-10 class names
# Map each integer class label to its
# corresponding string class name.
LABELS = {
    0: "airplane",
    1: "automobile",
    2: "bird",
    3: "cat",
    4: "deer",
    5: "dog",
    6: "frog",
    7: "horse",
    8: "ship",
    9: "truck"
}

# Load dataset from keras
(x_train_data, y_train_data), (x_test_data, y_test_data) = cifar10.load_data()

############
# [COMPLETE] 
# Add some prints here to see the loaded data dimensions
############



In [None]:
############
# [COMPLETE] 
# Analyze the amount of images for each class
# Plot some images to explore how they look
############
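
One way to count the images per class is `np.unique` with `return_counts=True`. The sketch below assumes labels shaped `(N, 1)`, as returned by `cifar10.load_data()`. The helper name `count_per_class` and the tiny synthetic label array are illustrative only and stand in for `y_train_data` and `LABELS`.

```python
import numpy as np


def count_per_class(y_data, label_names):
    """Return {class name: number of images} for integer labels of shape (N, 1) or (N,)."""
    classes, counts = np.unique(np.asarray(y_data).ravel(), return_counts=True)
    return {label_names[int(c)]: int(n) for c, n in zip(classes, counts)}


# Synthetic stand-in for y_train_data (assumption: the real array has shape (50000, 1)).
demo_labels = np.array([[0], [1], [1], [3], [3], [3]])
demo_names = {0: "airplane", 1: "automobile", 3: "cat"}
print(count_per_class(demo_labels, demo_names))
# {'airplane': 1, 'automobile': 2, 'cat': 3}
```

With the real data, the call would look like `count_per_class(y_train_data, LABELS)`.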



In [None]:
############
# [COMPLETE] 
# Split training set in train/val sets
# Use the sampling method that you want
############
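
A stratified split keeps the class proportions the same in the training and validation sets. Below is a minimal NumPy-only sketch of that idea. The function name `stratified_split`, the 20% validation fraction, and the synthetic arrays are assumptions made for illustration. `train_test_split(..., stratify=y)` from scikit-learn, already imported above, is an alternative that does the same job.

```python
import numpy as np


def stratified_split(x_data, y_data, val_fraction=0.2, seed=0):
    """Split arrays into train/val parts, sampling the same fraction from every class."""
    rng = np.random.RandomState(seed)
    labels = np.asarray(y_data).ravel()
    val_idx = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)  # positions of this class
        rng.shuffle(cls_idx)                     # random order within the class
        n_val = int(round(len(cls_idx) * val_fraction))
        val_idx.extend(cls_idx[:n_val])
    val_mask = np.zeros(len(labels), dtype=bool)
    val_mask[np.array(val_idx, dtype=int)] = True
    return x_data[~val_mask], x_data[val_mask], y_data[~val_mask], y_data[val_mask]


# Synthetic stand-in for the CIFAR-10 arrays: 100 blank "images", 10 per class.
demo_x = np.zeros((100, 32, 32, 3), dtype=np.uint8)
demo_y = np.repeat(np.arange(10), 10).reshape(-1, 1)
x_train, x_val, y_train, y_val = stratified_split(demo_x, demo_y, val_fraction=0.2)
print(x_train.shape, x_val.shape)  # (80, 32, 32, 3) (20, 32, 32, 3)
```

The return order matches `train_test_split`, so the two approaches can be swapped without touching the rest of the notebook.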



In [None]:
# To use the MobileNet CNN pre-trained on ImageNet, we have to resize our images
# to one of the following square shapes: (128, 128), (160, 160), (192, 192), or (224, 224).
# Resizing the whole dataset up front would not fit in memory, so we save all
# the images to disk at their original size; when loading them, our data generator
# resizes them to the desired shape on the fly.

def save_to_disk(x_data, y_data, usage, output_dir='cifar10_images'):
    """
    Save the images to output_dir as JPEG files, grouped in one
    sub-directory per usage and label. The images are not resized here.
    
    x_data : np.ndarray
        Array with images.
    
    y_data : np.ndarray
        Array with labels.
    
    usage : str
        One of ['train', 'val', 'test'].

    output_dir : str
        Path to save data.
    """
    assert usage in ['train', 'val', 'test']
    
    # Set paths 
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    for label in np.unique(y_data):
        label_path = os.path.join(output_dir, usage, str(label))
        if not os.path.exists(label_path):
            os.makedirs(label_path)
    
    for idx, img in enumerate(x_data):
        bgr_img = img[..., ::-1]  # RGB -> BGR
        label = y_data[idx][0]
        img_path = os.path.join(output_dir, usage, str(label), 'img_{}.jpg'.format(idx))

        retval = cv2.imwrite(img_path, bgr_img)
        assert retval, 'Problem saving image at index:{}'.format(idx)


############
# [COMPLETE] 
# Use the above function to save all your data, e.g.:
#    save_to_disk(x_train, y_train, 'train', 'cifar10_images')
#    save_to_disk(x_val, y_val, 'val', 'cifar10_images')
#    save_to_disk(x_test, y_test, 'test', 'cifar10_images')
############


## 2. Load CNN and analyze architecture

In [None]:
############
# [COMPLETE] 
# Use the MobileNet class from Keras to load your base model, pre-trained on ImageNet.
# We want to load the pre-trained weights, but without the classification layer.
# Check the notebook '3_transfer-learning' or https://keras.io/applications/#mobilenet for more
# information on how to load this network properly.
############


## 3. Adapt this CNN to our problem

In [None]:
############
# [COMPLETE] 
# With the CNN loaded, we now have to add some layers to adapt this network to our
# classification problem.
# We can choose to fine-tune just the newly added layers, some particular layers, or all the
# layers of the model. Play with different settings and compare the results.
############
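
As a rough sketch of one possible setting, the function below loads MobileNet without its classification layer, freezes the convolutional base, and adds a global-average-pooling layer followed by a 10-way softmax. The function name `build_finetune_model`, the `(128, 128, 3)` input shape, and the choice to freeze every base layer are assumptions for illustration, not the required solution. Calling it with `weights="imagenet"` downloads the pre-trained weights.

```python
from keras.applications.mobilenet import MobileNet
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model


def build_finetune_model(num_classes=10, input_shape=(128, 128, 3),
                         weights="imagenet", train_base=False):
    """MobileNet base plus a new softmax head for `num_classes` classes."""
    # Convolutional base without the ImageNet classification layer.
    base_model = MobileNet(input_shape=input_shape, include_top=False, weights=weights)

    # Freeze (train_base=False) or unfreeze (train_base=True) every base layer.
    for layer in base_model.layers:
        layer.trainable = train_base

    # New head: pool the feature maps into a vector, then classify.
    x = GlobalAveragePooling2D()(base_model.output)
    outputs = Dense(num_classes, activation="softmax")(x)
    return Model(inputs=base_model.input, outputs=outputs)
```

A typical call would be `model = build_finetune_model()`, followed by `model.summary()` to check which layers are trainable. To fine-tune only part of the base, set `layer.trainable = True` on the chosen layers before compiling.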


## 4. Set up data augmentation techniques

In [None]:
############
# [COMPLETE] 
# Use data augmentation to train your model.
# Use the Keras ImageDataGenerator class for this purpose.
# Note: Since we want to load our images from disk, instead of the
# ImageDataGenerator.flow method we have to use the ImageDataGenerator.flow_from_directory
# method in the following way:
#    generator_train = datagen_train.flow_from_directory('cifar10_images/train', 
#                                                        target_size=(128, 128), batch_size=32)
#    generator_val = datagen_val.flow_from_directory('cifar10_images/val', 
#                                                    target_size=(128, 128), batch_size=32)
# Here datagen_val is a second ImageDataGenerator without augmentation, so that the
# validation images are not distorted.
# We have to resize our images to fine-tune the MobileNet CNN; this is done with
# the target_size argument of flow_from_directory. Remember to set target_size to one of
# the valid shapes: (128, 128), (160, 160), (192, 192), or (224, 224).
############


## 5. Add some Keras callbacks

In [None]:
############
# [COMPLETE] 
# Load and set some Keras callbacks here!
############


## 6. Set up the optimization algorithm and its hyperparameters

In [None]:
############
# [COMPLETE] 
# Choose an optimization algorithm and explore different hyperparameters.
# Compile your model.
############


## 7. Train model!

In [None]:
############
# [COMPLETE] 
# Use fit_generator to train your model.
# e.g.:
#    model.fit_generator(
#        generator_train,
#        epochs=50,
#        validation_data=generator_val,
#        steps_per_epoch=generator_train.n // 32,
#        validation_steps=generator_val.n // 32)
############
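
The `n // 32` expressions in the example use floor division, so a final partial batch is skipped whenever the number of images is not a multiple of the batch size. The small helper below makes that arithmetic explicit. The name `steps_for` and the 40,000/10,000 split sizes are hypothetical and only illustrate the calculation.

```python
import math


def steps_for(num_samples, batch_size, drop_remainder=False):
    """Number of generator batches needed to pass over num_samples images once."""
    if drop_remainder:
        return num_samples // batch_size        # skip the final partial batch
    return math.ceil(num_samples / batch_size)  # include the final partial batch


# Assumed split: 40,000 training and 10,000 validation images, batch size 32.
print(steps_for(40000, 32))                       # 1250
print(steps_for(10000, 32))                       # 313
print(steps_for(10000, 32, drop_remainder=True))  # 312
```

For validation it is usually preferable to include the partial batch, so that every validation image contributes to the reported metrics.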


## 8. Choose best model/snapshot

In [None]:
############
# [COMPLETE] 
# Analyze and compare your results. Choose the best model and snapshot, and
# justify your choice.
############
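
One simple selection criterion is to pick the epoch with the lowest validation loss (or the highest validation accuracy). The sketch below works on a plain dictionary with the same layout as the `history` attribute of the object returned by `fit_generator`. The function name `best_epoch` and all numbers are made up for illustration. The accuracy key is `val_acc` in older Keras versions and `val_accuracy` in newer ones.

```python
def best_epoch(history, monitor="val_loss", mode="min"):
    """Return (1-based epoch, value) of the best entry in a Keras-style history dict."""
    values = history[monitor]
    pick = min if mode == "min" else max
    idx = pick(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]


# Made-up numbers for illustration only; real values come from training.
demo_history = {
    "val_loss": [1.20, 0.85, 0.79, 0.83, 0.90],
    "val_acc": [0.58, 0.70, 0.73, 0.72, 0.71],
}
print(best_epoch(demo_history))                    # (3, 0.79)
print(best_epoch(demo_history, "val_acc", "max"))  # (3, 0.73)
```

Comparing the selected epoch across different settings (frozen base, partially unfrozen base, different optimizers) gives a concrete basis for justifying the final choice.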


## 9. Evaluate final model on the *testing* set

In [None]:
############
# [COMPLETE] 
# Evaluate your model on the testing set.
############
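
Beyond overall accuracy, a confusion matrix shows which classes get mixed up. Below is a minimal NumPy sketch. The function name `confusion_matrix` and the toy 3-class labels are assumptions for illustration. In the lab, `y_pred` would be the argmax of the model predictions on the test generator, and `y_true` the generator's `classes` attribute, with the generator created using `shuffle=False` so that both stay aligned.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions: rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)  # unbuffered add handles repeated index pairs
    return matrix


# Toy 3-class example standing in for real test-set labels and predictions.
demo_true = np.array([0, 0, 1, 1, 2, 2])
demo_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(demo_true, demo_pred, num_classes=3)
accuracy = np.trace(cm) / cm.sum()               # correct predictions / all predictions
per_class_recall = np.diag(cm) / cm.sum(axis=1)  # correct per class / images per class
print(cm)
print(accuracy, per_class_recall)
```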
