## Transfer Learning: Using Pre-Trained Xception Network
## Using Xception for Feature Extraction (with Data Augmentation)

This notebook contains transfer learning code for extracting features using a pre-trained Xception network. The code assumes that our custom dataset is small. Because of that, we use Keras data augmentation methods to apply a set of random transformations to each of the original images at training time, without increasing the number of images. Data augmentation is a form of regularization that reduces overfitting and enables the trained model to generalize better to unseen data. This code should ONLY be run on a GPU, as it is too expensive to run on a CPU.
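
To make the idea of on-the-fly augmentation concrete, here is a minimal pure-Python sketch. It is not what Keras does internally. It only illustrates that each image is randomly transformed when it is drawn, so the number of images per epoch stays the same. The function `random_horizontal_flip` and the toy image values are hypothetical.

```python
import random

def random_horizontal_flip(image, rng, p=0.5):
    """Return the image mirrored left-to-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image

# A toy 2x3 single-channel "image" (hypothetical values)
toy_image = [[1, 2, 3],
             [4, 5, 6]]
dataset = [toy_image, toy_image]

rng = random.Random(42)
# One "epoch": every image is transformed as it is drawn, so the epoch
# contains exactly as many images as the original dataset
epoch_batch = [random_horizontal_flip(img, rng) for img in dataset]
print(len(epoch_batch) == len(dataset))
```

Across epochs the model sees different variants of the same images, which is where the regularizing effect comes from.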

## Procedure

1) Add the convolutional base model to a Sequential model
2) Flatten the convolutional base outputs (before feeding them to the densely-connected classifier)
3) Add a densely-connected classifier on top of the flattened convolutional base model
4) Freeze the convolutional base model
5) Compile the model
6) Train the model [end-to-end training with data augmentation]
7) Save the trained model

## Import the Necessary Libraries

In [None]:
# Import the necessary libraries/packages
import os
import random
import time
from pathlib import Path

import matplotlib
# Uncomment the next line to use the non-interactive "Agg" backend,
# so figures can be saved in the background
#matplotlib.use("Agg")
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
from imutils import paths

from keras import layers, models, optimizers
from keras.applications.xception import Xception, preprocess_input, decode_predictions
from keras.layers import Input, Flatten, Dense, Dropout
from keras.models import Model
from keras.optimizers import Adam
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img
from keras.utils import to_categorical

from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer, LabelEncoder
from sklearn.utils import shuffle

## Load the Convolutional Base Model

In [None]:
#Get the feature extraction part of the Xception network trained on ImageNet (convolutional base)

Xception_conv_base = Xception(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
Xception_conv_base.summary()  # summary() prints the table itself
print(Xception_conv_base.output_shape)

## Load Custom Data

In [None]:
# Load the custom dataset
# Use Python's pathlib library to enable the use of forward slashes even on Windows
# For details on pathlib, See: https://bit.ly/2HTbaEY 

dataset_path = Path('Path/to/custom/dataset/directory')
data_dir_list = os.listdir(dataset_path)

# initialize the data and labels
data = []
labels = []

## Pre-Process the Input Data (Imagery)

In [None]:
# Get the list of the image paths in our custom dataset and randomly 
# shuffle them to allow for easy training and testing splits via
# array slicing during training time

imagePaths = sorted(list(paths.list_images(dataset_path)))
random.seed(42)
random.shuffle(imagePaths)
print(len(imagePaths))

# Loop over the input images
for imagePath in imagePaths:
    # Load the image, pre-process it, and store it in the data list
    img = image.load_img(imagePath, target_size=(299, 299))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    data.append(x)
    # Extract the class label from the image path, assuming that
    # our image paths have the format shown below:
    # /path/to/dataset/{class}/{image}.jpg
    label = imagePath.split(os.path.sep)[-2]
    # Store the label in the labels list
    labels.append(label)
print(len(labels))
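
The label extraction above relies on the directory layout `/path/to/dataset/{class}/{image}.jpg`. As a small illustration of that convention, the sketch below pulls the class name out of a path with `pathlib` instead of splitting on the separator. The helper name `label_from_path` and the example path are hypothetical.

```python
from pathlib import PurePosixPath

def label_from_path(image_path):
    """Return the name of the directory that directly contains the image."""
    return PurePosixPath(image_path).parent.name

# Hypothetical path following the {class}/{image}.jpg layout
example_path = "/path/to/dataset/cats/img_001.jpg"
print(label_from_path(example_path))
```

`PurePosixPath` is used here only so the example behaves the same on every operating system. With real local paths, `Path(image_path).parent.name` would be the natural choice.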


## Reshape the Data and Encode the Labels

In [None]:
#convert data and labels into NumPy array
data = np.array(data)
print(data.shape)

# Drop the singleton batch axis so that the shape changes from
# (number of images, 1, 299, 299, 3) to (number of images, 299, 299, 3)
data = np.squeeze(data, axis=1)
print(data.shape)

labels = np.array(labels)
print(labels.shape)

# Encode the labels (from class-name strings to integers)
le = LabelEncoder()
labels = le.fit_transform(labels)
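
To show what this encoding step produces, here is a pure-Python sketch that mimics the mapping under the assumption that classes are numbered in sorted order, which is how scikit-learn's `LabelEncoder` orders its `classes_`. The function `encode_labels` and the class names are hypothetical.

```python
def encode_labels(labels):
    """Map each distinct label to an integer index, in sorted order."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    return classes, [index[name] for name in labels]

# Hypothetical class names, as they would be read from the directory names
classes, encoded = encode_labels(["dog", "cat", "bird", "cat"])
print(classes)   # ['bird', 'cat', 'dog']
print(encoded)   # [2, 1, 0, 1]
```

Keeping the list of classes around matters later: it is the only way to turn a predicted index back into a class name.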

## Split the Data

In [None]:
# split the data into training and testing splits using 80% of
# the data for training and the remaining 20% for testing (validation)
(trainX, testX, trainY, testY) = train_test_split(data, labels,
test_size=0.2, random_state=42)

# optionally check the shapes of the data splits
print('Shape of training images is:', trainX.shape)
print('Shape of validation images is:', testX.shape)
print('Shape of training labels is:', trainY.shape)
print('Shape of validation labels is:', testY.shape)

#get the length (number of samples) of the training and validation data
nTrain = len(trainX)
nVal = len(testX)

print('Total number of training images/samples is:', nTrain)
print('Total number of validation images/samples is:',nVal)

# convert the labels from integers to vectors (one-hot encoding)
# This is necessary because the .flow method does not accept the class_mode parameter
# as is the case with .flow_from_directory method

# One-Hot Encoding of the labels
trainY = to_categorical(trainY, num_classes=3)
testY = to_categorical(testY, num_classes=3)
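
For readers unfamiliar with one-hot encoding, the following sketch shows the transformation on plain Python lists. It illustrates the idea only and is not the Keras implementation. The helper `one_hot` is hypothetical.

```python
def one_hot(indices, num_classes):
    """Turn integer class indices into one-hot rows of length num_classes."""
    return [[1.0 if j == i else 0.0 for j in range(num_classes)]
            for i in indices]

print(one_hot([2, 0, 1], num_classes=3))
# [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

Each row has a single 1.0 at the position of the class index, which is the target format that `categorical_crossentropy` expects.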

## Add a custom Densely-Connected classifier on Top of the Convolutional Base

In [None]:
#Create a custom classification layer
num_classes = 3

#Create custom Xception Classifier Model 
model = models.Sequential()
model.add(Xception_conv_base) 
model.add(layers.Flatten()) # Flatten convolutional base
model.add(layers.Dense(256, activation='relu')) 
model.add(layers.Dropout(0.5))
model.add(layers.Dense(num_classes, activation='softmax'))
model.summary()  # summary() prints the table itself
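
The parameter counts in the summary can be checked by hand. The sketch below assumes the convolutional base outputs a tensor of shape (10, 10, 2048) for 299x299 inputs. Compare this with the `output_shape` printed earlier, and adjust the tuple if it differs.

```python
# Assumed output shape of the convolutional base (height, width, channels);
# compare with Xception_conv_base.output_shape
conv_output_shape = (10, 10, 2048)
hidden_units = 256
num_classes = 3

# Flatten multiplies all dimensions together
flattened = 1
for dim in conv_output_shape:
    flattened *= dim

# A Dense layer has one weight per input-output pair plus one bias per output
dense_1_params = flattened * hidden_units + hidden_units
dense_2_params = hidden_units * num_classes + num_classes

print(flattened)        # 204800
print(dense_1_params)   # 52429056
print(dense_2_params)   # 771
```

Under that assumption, almost all of the classifier's parameters sit in the first Dense layer. This is one reason the Dropout layer is placed right after it.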

## Create an ImageDataGenerator Object Class to Load the Images and Use the flow Method to Generate Batches of Images and Labels

In [None]:
# Create a data augmentation configuration to prevent overfitting due to our small custom dataset
# Note: the images were already scaled by preprocess_input when they were loaded,
# so no additional rescale factor is applied here
train_datagen = ImageDataGenerator(rotation_range=40,
                            width_shift_range=0.2,
                            height_shift_range=0.2,
                            shear_range=0.2,
                            zoom_range=0.2,
                            horizontal_flip=True,
                            fill_mode='nearest')

# Create the validation data generator (validation dataset is not augmented)
val_datagen = ImageDataGenerator()

#Set the batch size
batch_size = 30

# Create the Python image generators to generate batches of images and labels
train_generator = train_datagen.flow(trainX, trainY,
                                    batch_size=batch_size,
                                    shuffle=True)
# No shuffling for the validation set
val_generator = val_datagen.flow(testX, testY,
                                batch_size=batch_size,
                                shuffle=False)
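
When combining `preprocess_input` with a generator, it is worth checking what value range actually reaches the network. The sketch below assumes that Xception's `preprocess_input` computes `x / 127.5 - 1`, mapping pixels from [0, 255] to [-1, 1]. It then shows what an extra factor of 1/255 would do to that range. The function name `xception_style_scale` is hypothetical.

```python
def xception_style_scale(pixel):
    """Map a pixel in [0, 255] to [-1, 1] (assumed preprocess_input behaviour)."""
    return pixel / 127.5 - 1.0

for pixel in (0, 127.5, 255):
    scaled = xception_style_scale(pixel)
    # A further 1/255 factor applied on top of the scaled value
    rescaled_again = scaled * (1.0 / 255)
    print(pixel, scaled, round(rescaled_again, 5))
```

Applying both steps squeezes every input into roughly [-0.004, 0.004], far from the range the pre-trained weights were trained on, so only one of the two scalings should be active.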

## Freeze the Convolutional Base

In [None]:
# the following gives a sense of the change in the number of trainable weights after freezing the convolutional base of the Xception model
print('This is the number of trainable weights before freezing the convolutional base:', len(model.trainable_weights))

Xception_conv_base.trainable = False

print('This is the number of trainable weights after freezing the convolutional base:', len(model.trainable_weights))
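
The second number printed above can be predicted from the classifier head alone. The sketch assumes that `len(model.trainable_weights)` counts weight tensors (not individual parameters), and that each Dense layer holds a kernel and a bias while Flatten and Dropout hold none. The layer names are hypothetical.

```python
# Weight tensors per layer of the classifier head (layer names are illustrative)
head_layers = {
    "flatten": [],
    "dense_1": ["kernel", "bias"],
    "dropout": [],
    "dense_2": ["kernel", "bias"],
}

# With the convolutional base frozen, only the head contributes
trainable_after_freezing = sum(len(w) for w in head_layers.values())
print(trainable_after_freezing)  # 4
```

If the printed value after freezing is larger than this, the base was probably not frozen before the model was compiled.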

## Compile the Model

In [None]:
model.compile(optimizer=optimizers.Adam(lr=2e-5),
              loss='categorical_crossentropy',
              metrics=['acc'])

## Train the Model

In [None]:
%%time
history = model.fit_generator(train_generator,
                    steps_per_epoch=nTrain // batch_size,
                    epochs=20,
                    validation_data= val_generator,
                    validation_steps=nVal // batch_size)
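
The floor division used for `steps_per_epoch` and `validation_steps` silently drops the last partial batch. The sketch below compares floor and ceiling division for some made-up sample counts. The helper `steps_for` and the numbers are hypothetical.

```python
import math

def steps_for(num_samples, batch_size, keep_partial_batch=False):
    """Number of batches per epoch, with or without the last partial batch."""
    if keep_partial_batch:
        return math.ceil(num_samples / batch_size)
    return num_samples // batch_size

batch_size = 30
print(steps_for(100, batch_size))                           # 3
print(steps_for(100, batch_size, keep_partial_batch=True))  # 4
print(steps_for(20, batch_size))                            # 0
```

The last case is the one to watch with a small dataset: if the validation split has fewer samples than one batch, floor division yields zero validation steps.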

## Save the Model

In [None]:
# save model architecture and weights to HDF5

# Convert the Path to a string, since some Keras versions expect a plain string path
model.save(str(Path('Path/to/where/to/save/the/model/model_name.h5')))

## Plot the Results of Training and Validation Accuracies & Losses

In [None]:
# This code will plot the curves of loss and accuracy during training

# Get values that were specified during model compilation which are saved in the 
# history object
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

# Get the number of epochs from the values in the 'acc' list
epochs = range(1, len(acc) + 1)

# Training and validation accuracy plot [Accuracy at each epoch]
plt.plot(epochs, acc, 'b', label='Training acc')       
plt.plot(epochs, val_acc, 'r', label='Validation acc')  
plt.title('Training and Validation Accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(loc='lower right')
plt.figure()

# Training and validation loss plot [Loss at each epoch]
plt.plot(epochs, loss, 'b', label='Training loss')
plt.plot(epochs, val_loss, 'r', label='Validation loss')
plt.title('Training and Validation Loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(loc='upper right')
plt.show()
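
Besides reading the curves by eye, the epoch with the lowest validation loss can be picked out programmatically. The sketch below uses a made-up list of loss values purely for illustration. In the notebook, the list would be `history.history['val_loss']`. The helper `best_epoch` is hypothetical.

```python
def best_epoch(val_loss):
    """Return (1-based epoch number, loss) for the lowest validation loss."""
    best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
    return best_index + 1, val_loss[best_index]

# Made-up values for illustration only, not results from this notebook
toy_val_loss = [0.9, 0.6, 0.5, 0.55, 0.7]
print(best_epoch(toy_val_loss))  # (3, 0.5)
```

A validation loss that bottoms out and then rises while the training loss keeps falling is the usual sign of overfitting.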
