In this hands-on lab we will work through a few activities related to Convolutional Neural Networks (CNNs): their key operations, their main layers, and the results they produce.

Before starting, complete the following requirements:

1) Deploy one AWS EC2 instance (p2.8xlarge type) to be used as a sandbox (it can be terminated after the lab)

2) After logging in to the instance, run 'source activate tensorflow_p36'

3) Create a directory with 'mkdir -p /models/ai-conference' and enter it with 'cd /models/ai-conference'

4) Clone the GitHub repository containing the labs with 'git clone github link'

This notebook includes the following activities:

- basic CNN operations and experimenting with kernels
- building a sample CNN for image classification using the fruits dataset
- training the neural network on the fruits dataset and evaluating its performance
- reporting the performance metrics for that model, including precision, recall, F1 score and support
- performing transfer learning to speed up the model creation process

## Part I - CNN basics

In [None]:
# make sure the required Python modules are installed before starting

!conda install -y seaborn Pillow scikit-learn

In [None]:
# importing the python modules to be used across the notebook

import os
import keras
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.callbacks import ModelCheckpoint
from keras.utils.training_utils import multi_gpu_model
from sklearn.metrics import classification_report
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
from PIL import Image

In [None]:
# For reproducibility

np.random.seed(1000)

# model configuration -- number of GPUs and training option (Yes or No)

n_gpus = 8 # knob to make the model parallel or not
train_model = False # knob to decide if the model will be trained or imported

In [None]:
# a few utility functions to assist with the labs
# convolve2D() applies a kernel to an image, mimicking what a convolution layer does in a convolutional neural network

def convolve2D(img, kernel):
    is_gray_scale = len(img.shape) == 2
    
    if is_gray_scale:
        img = np.expand_dims(img, axis=2)

    img = pad_img(img, kernel)

    height, width = img.shape[:2]

    new_img = []

    for i in range(height - kernel.shape[0] + 1):
        row = []
        for j in range(width - kernel.shape[1] + 1):
            channels = []
            for k in range(img.shape[2]):
                # named 'window' to avoid shadowing the built-in slice()
                window = img[i:i+kernel.shape[0], j:j+kernel.shape[1], k]
                channels.append(np.expand_dims(np.sum(window * kernel, keepdims=True), axis=0))
            row.append(np.concatenate(channels, axis=2))
        new_img.append(np.concatenate(row, axis=1))

    # clip to the valid 0-255 range before casting, so bright values do not wrap around in uint8
    res = np.clip(np.concatenate(new_img, axis=0), 0, 255).astype('uint8')

    if is_gray_scale:
        return res[:, :, 0]

    return res

# pad_img() zero-pads an image for convolve2D() so the output keeps the input size (for odd-sized kernels)

def pad_img(img, kernel):
    pad_height = (kernel.shape[0] - 1) // 2
    pad_width = (kernel.shape[1] - 1) // 2

    if len(img.shape) == 2:
        return np.pad(img, ((pad_height, pad_height),
                            (pad_width, pad_width)), 'constant')
    return np.pad(img, ((pad_height, pad_height),
                        (pad_width, pad_width),
                        (0, 0)), 'constant')

# get_kernels() defines a few sample CNN kernels (filters) to visualize how they work

def get_kernels():
    kernels = []
    kernels.append(('Identity',
                     np.array([[0, 0, 0],
                               [0, 1, 0],
                               [0, 0, 0]])))
    kernels.append(('Edge Detection1', 
                     np.array([[1, 0, -1],
                               [0, 0, 0],
                               [-1, 0, 1]])))
    kernels.append(('Edge Detection2', 
                     np.array([[0, 1, 0],
                               [1, -4, 1],
                               [0, 1, 0]])))
    kernels.append(('Edge Detection3', 
                     np.array([[-1, -1, -1],
                               [-1, 8, -1],
                               [-1, -1, -1]])))
    kernels.append(('Sharpen', 
                     np.array([[0, -1, 0],
                               [-1, 5, -1],
                               [0, -1, 0]])))
    kernels.append(('Box Blur', 
                     np.array([[1/9, 1/9, 1/9],
                             [1/9, 1/9, 1/9],
                             [1/9, 1/9, 1/9]])))
    kernels.append(('Gaussian Blur', 
                     np.array([[1/16, 1/8, 1/16],
                               [1/8, 1/4, 1/8],
                               [1/16, 1/8, 1/16]])))
    return kernels

# plot_with_kernels() takes a sample image and shows the result of applying each sample kernel to it

def plot_with_kernels(img):
    kernels = get_kernels()
    n_sub_plots = len(kernels)
    
    plt.figure('kernels', figsize=(20, 20))

    for i, kernel in enumerate(kernels):
        plt.subplot(n_sub_plots, 3, (i*3) + 1)
        plt.text(0.5, 0.5, kernel[0],
                 horizontalalignment='center',
                 verticalalignment='center',
                 fontsize=15)
        plt.axis('off')

        plt.subplot(n_sub_plots, 3, (i * 3) + 2)
        sns.heatmap(kernel[1], annot=True, cmap='YlGnBu')
        plt.axis('off')

        plt.subplot(n_sub_plots, 3, (i+1) * 3)
        
        img_ = convolve2D(img, kernel[1])

        if len(img_.shape) == 2:
            plt.imshow(img_, cmap='gray')
        else:
            plt.imshow(img_)

        plt.axis('off')
    plt.show()

# imshow() plots an image with matplotlib.pyplot

def imshow(img):
    plt.imshow(img)
    plt.axis('off')
    plt.show()

# get_data_from_dir() loads the image files from a directory (one sub-directory per class) into numpy arrays

def get_data_from_dir(path, size=(100, 100)):
    imgs = []
    labels = []
    labels_list = []

    files = os.listdir(path)

    for i, file in enumerate(files):
        img_names = os.listdir(os.path.join(path, file))
        
        labels.append(np.ones(len(img_names)) * i)
        labels_list.append(file)

        for img_name in img_names:
            with Image.open(os.path.join(path, file, img_name)) as img:
                imgs.append(np.asarray(img.resize(size)))

    imgs = np.asarray(imgs)
    print('Found {} images belonging to {} different classes'.format(imgs.shape[0], len(labels_list)))

    return imgs, np.hstack(labels), labels_list

# train_test_split() manually splits the given data into train and test datasets

def train_test_split(X, y, val_split=0.2):
    index = int(X.shape[0] * (1 - val_split))
    return X[:index], y[:index], X[index:], y[index:]

# shuffle() is used to shuffle datasets :-)

def shuffle(X, y):
    indices = np.arange(X.shape[0])
    np.random.shuffle(indices)
    return X[indices], y[indices]

# one_hot() converts integer class labels into one-hot encoded vectors

def one_hot(y):
    return np.eye(int(np.max(y)) + 1)[y.astype('int')]

# plot_some() plots a few random images from a dataset, along with their labels

def plot_some(X, y, y_hat=None, labels_list=None):
    indices = np.random.randint(0, X.shape[0], size=10)
    print(indices)

    if labels_list is None:
        labels_list = np.sort(np.unique(y))

    plt.figure(figsize=(15, 5))

    for i, index in enumerate(indices):
        plt.subplot(2, 5, i+1)
        plt.imshow(X[index])
        plt.title('Y: {}    Y_hat: {}'. \
                  format(labels_list[int(y[index])], 
                         'N/A' if y_hat is None else labels_list[y_hat[index]]))
        plt.axis('off')

    plt.tight_layout(pad=0.4, w_pad=0.5, h_pad=1.0)
    plt.show()
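
As a quick sanity check of the label helpers above, the sketch below repeats the logic of one_hot() and train_test_split() on made-up data. The names `toy_y`, `toy_X`, `toy_hot` and `split_index` are hypothetical and only serve this illustration.

```python
import numpy as np

# made-up float labels, similar to what get_data_from_dir() builds with np.ones(...) * i
toy_y = np.array([0., 2., 1., 2., 0.])

# same expression as one_hot(): pick rows of an identity matrix by label
toy_hot = np.eye(int(np.max(toy_y)) + 1)[toy_y.astype('int')]
print(toy_hot)

# same arithmetic as train_test_split() with val_split=0.2 on 10 made-up samples
toy_X = np.arange(20).reshape(10, 2)
split_index = int(toy_X.shape[0] * (1 - 0.2))
print(toy_X[:split_index].shape, toy_X[split_index:].shape)
```

Note that the width of the encoding comes from the largest label present in the array, so encoding a subset that happens to miss the highest class would yield fewer columns than the full number of classes.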

In [None]:
# load a sample 128x128x3 image to experiment with CNN kernels

img = np.array(Image.open('sample-128x128.jpg'))
imshow(img)

A quick illustration of how a kernel slides over a given image during convolution:
![FilterUrl](https://mlnotebook.github.io/img/CNN/convSobel.gif "CNN kernels")
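
The sliding-window operation in the animation can be sketched on a tiny array. This is a minimal illustration, not the notebook's convolve2D(): it uses no padding, a stride of 1, and, like CNN layers, it does not flip the kernel. The function `convolve2d_valid` and the arrays `toy_image` and `sobel_x` are hypothetical names introduced for this example.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)
    and sum the element-wise products at each position."""
    k_h, k_w = kernel.shape
    out_h = image.shape[0] - k_h + 1
    out_w = image.shape[1] - k_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + k_h, j:j + k_w]
            out[i, j] = np.sum(window * kernel)
    return out

# made-up 4x4 image with a vertical edge: dark on the left, bright on the right
toy_image = np.array([[0, 0, 10, 10],
                      [0, 0, 10, 10],
                      [0, 0, 10, 10],
                      [0, 0, 10, 10]])

# Sobel-style kernel that responds to horizontal intensity changes
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

toy_result = convolve2d_valid(toy_image, sobel_x)
print(toy_result)  # every 3x3 window straddles the edge, so every value is 40
```

A 4x4 input and a 3x3 kernel give a 2x2 output, which is why pad_img() is used in this notebook to keep the output the same size as the input.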

In [None]:
# using plot_with_kernels() on the sample image to visualize how CNN kernels are defined and what is the result
# of each kernel on the image

plot_with_kernels(img)

## Part II - Creating a CNN for image classification

In [None]:
# use get_data_from_dir() to import the images from the dataset,
# then shuffle them and split them into train and test datasets

if train_model is True:
    X, y, labels_list = get_data_from_dir('Fruit-Images-Dataset/Training')
    X, y = shuffle(X, y)
    X_train, y_train, X_test, y_test = train_test_split(X, y)
    
    # performs one_hot_encoding on all labels
    y_train_hot = one_hot(y_train)
    y_test_hot = one_hot(y_test)

    print('Number of training Examples: {} \nNumber of testing Examples {}'.format(X_train.shape[0], X_test.shape[0]))

In [None]:
# use get_data_from_dir() to import the images from the dataset,
# then shuffle them and use them as the validation dataset

X_val, y_val, labels_list = get_data_from_dir('Fruit-Images-Dataset/Val')
X_val, y_val = shuffle(X_val, y_val)

# performs one_hot_encoding on all labels
y_val_hot = one_hot(y_val)

In [None]:
# plot a few samples from the dataset, including their labels

if train_model is True:
    plot_some(X_train, y_train, labels_list=labels_list)
else:
    plot_some(X_val, y_val, labels_list=labels_list)

Here we start training the CNN. Training proceeds as illustrated below:

![ANNTraining](https://devblogs.nvidia.com/parallelforall/wp-content/uploads/2015/08/training_inference1.png "CNN training")

In [None]:
# define a few hyperparameters for the CNN

input_shape = [100, 100, 3]
n_classes = 93
batch_size = 32
epochs = 2

# start defining the CNN -- input, convolutions, maxpooling, flatten and FC layers

if train_model is True:
    
    input = Input(input_shape)

    model = Conv2D(32, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(input)
    model = Conv2D(32, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = Conv2D(32, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = MaxPooling2D()(model)

    model = Conv2D(64, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = Conv2D(64, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = Conv2D(64, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = MaxPooling2D()(model)

    model = Conv2D(128, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = Conv2D(128, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = Conv2D(128, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = MaxPooling2D()(model)

    model = Conv2D(256, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = Conv2D(256, kernel_size=(3, 3), strides=(1, 1), padding='same', activation='relu')(model)
    model = MaxPooling2D()(model)

    model = Flatten()(model)
    model = Dense(1024, activation='relu')(model)

    preds = Dense(n_classes, activation='softmax')(model)
    model = Model(inputs=input, outputs=preds)

    # if we are using a multi-GPU EC2 instance, make the model multi-GPU aware

    if n_gpus > 1:
        final_model = multi_gpu_model(model, gpus=n_gpus)
    else:
        final_model = model

    # compile the model

    final_model.compile(optimizer='adam',
                        loss='categorical_crossentropy', 
                        metrics=['accuracy'])

    # show a summary of the model -- including layers and tunable parameters
    final_model.summary()
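
The numbers reported by summary() can be cross-checked by hand. The sketch below traces the spatial size and the trainable parameter count through the architecture defined above, assuming 3x3 kernels with 'same' padding and the default 2x2 max pooling (which halves the size, rounding down). The function `trace_cnn` and its arguments are hypothetical names used only for this illustration.

```python
def trace_cnn(size=100, channels=3,
              blocks=((32, 3), (64, 3), (128, 3), (256, 2)),
              dense_units=1024, n_classes=93):
    """Return (final spatial size, flattened length, total parameters)."""
    params = 0
    for filters, n_convs in blocks:
        for _ in range(n_convs):
            # each filter has 3*3*channels weights plus one bias
            params += (3 * 3 * channels + 1) * filters
            channels = filters
        # 'same' padding keeps the size; 2x2 max pooling halves it (floor)
        size = size // 2
    flat = size * size * channels
    params += (flat + 1) * dense_units       # Dense(1024)
    params += (dense_units + 1) * n_classes  # softmax output layer
    return size, flat, params

final_size, flat_len, total_params = trace_cnn()
print(final_size, flat_len, total_params)
```

The spatial size goes 100, 50, 25, 12, 6, so the Flatten layer produces 6 * 6 * 256 values, and most of the parameters sit in the first Dense layer rather than in the convolutions.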

In [None]:
# train the CNN using the train dataset

if train_model is True:
    history = final_model.fit(X_train,
                              y_train_hot,
                              batch_size=batch_size,
                              epochs=epochs,
                              validation_data=(X_test, y_test_hot))

In [None]:
# save the model details

if train_model is True:
    
    # save the template model (not the multi-GPU wrapper); it shares its weights with final_model
    model.save('cnn_model.h5')

In [None]:
# summarize history for accuracy

if train_model is True:
    
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

In [None]:
# summarize history for loss

if train_model is True:
    
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'test'], loc='upper left')
    plt.show()

In [None]:
# evaluating the model accuracy

if train_model is not True:

    from keras.models import load_model
    final_model = load_model('cnn_model.h5')
    final_model.compile(optimizer='adam',
                        loss='categorical_crossentropy', 
                        metrics=['accuracy'])

results = final_model.evaluate(X_val, y_val_hot)
print('Loss on Validation set: {}  and Accuracy on Validation Set: {}'.format(results[0], results[1]))

In [None]:
# report the model's precision, recall, F1 score and support per class

predictions = final_model.predict(X_val)
report = classification_report(y_val_hot.argmax(axis=1), predictions.argmax(axis=1))
print(report)
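
To make the columns of the report concrete, the sketch below computes precision, recall, F1 score and support per class on a handful of made-up labels. It is a simplified illustration of the definitions, not scikit-learn's implementation; `per_class_metrics`, `toy_true` and `toy_pred` are hypothetical names.

```python
import numpy as np

def per_class_metrics(y_true, y_pred, n_classes):
    """Return a list of (precision, recall, f1, support) tuples, one per class."""
    rows = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))  # correctly predicted as c
        fp = np.sum((y_pred == c) & (y_true != c))  # wrongly predicted as c
        fn = np.sum((y_pred != c) & (y_true == c))  # missed samples of c
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        support = int(np.sum(y_true == c))          # true samples of class c
        rows.append((precision, recall, f1, support))
    return rows

# made-up ground truth and predictions for three classes
toy_true = np.array([0, 0, 1, 1, 1, 2])
toy_pred = np.array([0, 1, 1, 1, 2, 2])

toy_metrics = per_class_metrics(toy_true, toy_pred, n_classes=3)
for c, (p, r, f1, s) in enumerate(toy_metrics):
    print('class {}: precision={:.2f} recall={:.2f} f1={:.2f} support={}'.format(c, p, r, f1, s))
```

Precision answers "of the samples predicted as this class, how many were right?", recall answers "of the samples that truly belong to this class, how many were found?", and support is simply the number of true samples per class.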

## Part III - Transfer Learning

To avoid memory and/or CPU usage issues, restart the Jupyter Notebook kernel before continuing.

To do so:

- go to the Jupyter notebook menu (at the top of the page)
- click on 'Kernel'
- click on 'Restart'
- wait for the kernel to restart

Once the restart has finished, continue with the next steps.

In [None]:
# we will use the pre-trained VGG16 model that ships with the Keras framework
# import the module that provides the pre-trained models

from keras import applications

In [None]:
# load the pre-trained weights and model definition, including the FC layer
# this pre-trained model is configured to use 224x224x3 images

vgg_model = applications.VGG16(weights='imagenet', include_top=True)

In [None]:
# visualize the model configuration

vgg_model.summary()

In [None]:
# import the modules needed for this exercise

import numpy as np
from PIL import Image
from matplotlib import pyplot as plt
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions

In [None]:
img = load_img('sample-224x224.jpg', target_size=(224, 224))
img

In [None]:
# preparing the image to be used with the model

img = img_to_array(img)
img = img.reshape((1, img.shape[0], img.shape[1], img.shape[2]))
img = preprocess_input(img)
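
For intuition, the two preparation steps can be approximated with plain numpy. The sketch below assumes the "caffe"-style preprocessing that Keras documents for VGG16: channels are reordered from RGB to BGR and the ImageNet per-channel mean is subtracted, with no scaling. Treat the constants as an assumption to verify against your installed Keras version; `vgg_style_preprocess`, `IMAGENET_BGR_MEAN` and `toy_image` are hypothetical names.

```python
import numpy as np

# ImageNet channel means in BGR order (assumed values, see the note above)
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68])

def vgg_style_preprocess(batch_rgb):
    """Approximate the VGG16 preprocessing on a batch shaped (n, h, w, 3)."""
    batch_bgr = batch_rgb[..., ::-1].astype('float64')  # RGB -> BGR
    return batch_bgr - IMAGENET_BGR_MEAN                # zero-center each channel

# made-up 2x2 RGB image: red = 255, green = 128, blue = 128
toy_image = np.full((2, 2, 3), 128.0)
toy_image[..., 0] = 255.0

# add the batch dimension, as the reshape() call above does
toy_batch = toy_image.reshape((1,) + toy_image.shape)
toy_out = vgg_style_preprocess(toy_batch)

print(toy_batch.shape)   # (1, 2, 2, 3)
print(toy_out[0, 0, 0])  # blue, green, red values after mean subtraction
```

The model expects a batch of images, which is why a single image is reshaped to (1, height, width, channels) before prediction.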

In [None]:
# predict the probability across all output classes

prediction = vgg_model.predict(img)

In [None]:
# convert the probabilities to class labels

label = decode_predictions(prediction)

# retrieve the most likely result, i.e. the one with the highest probability

label = label[0][0]

# print the classification

print('%s (%.2f%%)' % (label[1], label[2]*100))
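
Conceptually, decoding a prediction means ranking the class probabilities and looking up the matching names. The sketch below shows that idea with numpy on a made-up four-class probability vector; `toy_probs` and `toy_names` are hypothetical and unrelated to the real ImageNet labels.

```python
import numpy as np

# made-up model output for one image: one probability per class
toy_probs = np.array([[0.05, 0.70, 0.10, 0.15]])
toy_names = ['apple', 'banana', 'cherry', 'grape']  # hypothetical class names

# sort class indices by probability, highest first, and keep the top 3
top3 = np.argsort(toy_probs[0])[::-1][:3]

for rank, idx in enumerate(top3, start=1):
    print('%d. %s (%.2f%%)' % (rank, toy_names[idx], toy_probs[0][idx] * 100))
```

Looking at the top few classes rather than only the first one is often useful, since visually similar classes tend to share the probability mass.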

## Cleaning things up

Little needs to be done to clean up the environment used in this lab.

Since a new EC2 instance was created for this purpose, simply terminate the instance.