<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#F21DL-Data-Mining-&amp;-Machine-Learning" data-toc-modified-id="F21DL-Data-Mining-&amp;-Machine-Learning-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>F21DL Data Mining &amp; Machine Learning</a></span><ul class="toc-item"><li><span><a href="#Coursework-2-:--Convolutional-Network-for-Image-Classification-with-Transfer-Learning" data-toc-modified-id="Coursework-2-:--Convolutional-Network-for-Image-Classification-with-Transfer-Learning-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Coursework 2 :  Convolutional Network for Image Classification with Transfer Learning</a></span><ul class="toc-item"><li><span><a href="#Imports-&amp;-constants" data-toc-modified-id="Imports-&amp;-constants-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>Imports &amp; constants</a></span></li></ul></li></ul></li></ul></div>

# F21DL Data Mining & Machine Learning

## Coursework 2 :  Convolutional Network for Image Classification with Transfer Learning
---
*ACCAD Dimitri, AUZIMOUR Antoine, DELTEL Clarence, DI MARTINO Thomas*

This notebook presents our work on the research question. 
In this research work, we wanted to benchmark the performance of a common Deep Learning algorithm for data that has an inherent structure, whether spatial or temporal: the convolutional neural network (CNN).

<figure style="margin-top: 10px;margin-bottom: 10px;text-align:center">
        <img src="https://www.learnopencv.com/wp-content/uploads/2017/11/convolution-example-matrix.gif">
    <figcaption style="display:block;margin-left:auto;margin-right:auto"><u>Animation of the convolutional process (source: <a href="https://www.learnopencv.com/image-classification-using-convolutional-neural-networks-in-keras/">link</a>)</u></figcaption>
</figure>

These neural networks use the convolution operation (cf. the animation above) to generate <b>feature maps</b> that are dense in information and make it easier for a classifier to find the correct class.
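
As a minimal sketch of how a feature map is produced, the snippet below slides a small kernel over a toy image with NumPy. The image, the kernel and the `conv2d_valid` helper are illustrative assumptions, not part of our pipeline; like most deep learning libraries, the sketch does not flip the kernel (strictly speaking, it computes a cross-correlation).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            # element-wise product of the patch and the kernel, then sum
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# Toy 4x4 image: dark on the left half, bright on the right half.
image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

# Hypothetical 2x2 kernel that responds to dark-to-bright vertical edges.
vertical_edge = np.array([[-1.0, 1.0],
                          [-1.0, 1.0]])

feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)
```

The feature map is non-zero only in the column where the kernel straddles the edge, which is the kind of localized pattern response that a classifier can then build on.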

This process of extracting features from images is transferable to many different tasks: whether the dataset asks us to discriminate cats from dogs or cars from motorcycles, a CNN always looks for patterns, more or less complex depending on the depth of the network, that make up any object (a line, a curve, a circle, a color contrast...). It is the combination of these patterns that leads a CNN to detect a given object in an image. 
Hence, the first part of the network, which finds these patterns, is easily transferable from one task to another: this common process is called "**Transfer Learning**".
<br>
<figure style="margin-top: 10px;margin-bottom: 10px;text-align:center">
        <img src="https://ruder.io/content/images/2017/03/transfer_learning_setup.png">
    <figcaption><u>Illustration of the general concept of Transfer Learning (source: <a href="https://ruder.io/transfer-learning/">link</a>)</u></figcaption>
    
</figure>

For our research question, we will use the **InceptionV3 architecture** to detect the class of our street signs.
<figure style="margin-top: 10px;margin-bottom: 10px; text-align:center">
    <img src="https://s3-ap-south-1.amazonaws.com/av-blog-media/wp-content/uploads/2018/10/googlenet.png">
    <figcaption><u>Illustration of the architecture of the Inception network (source: <a href="https://www.analyticsvidhya.com/blog/2018/10/understanding-inception-network-from-scratch/">link</a>)</u></figcaption>
    
</figure>

We can see here that the Inception architecture is made of two parts:
 - A convolutional part used to extract features from the input image;
 - A fully-connected part used to classify the extracted features into the adequate classes.
 
In our situation, we will use the InceptionV3 network that was pre-trained on the '*imagenet*' dataset. However, this dataset has 1000 classes: hence, we will not keep the fully connected layers of the original network, and we will build our own classifier on top of the convolutional part to predict our 10 classes of street signs. Discarding the fully connected layers of a network to keep only its convolutional part is very common when applying transfer learning.
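
To make the idea of a new classifier on top of a frozen convolutional base more concrete, here is a small NumPy sketch of the forward pass of such a head: global average pooling followed by one dense softmax layer. The feature-map shape `(5, 5, 2048)`, the random weights and every variable name are assumptions made for illustration only; the actual head used in this notebook is defined in the model class further down.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical output of a frozen convolutional base for one image:
# a 5x5 grid of 2048-channel feature vectors (shape assumed for illustration).
feature_maps = rng.normal(size=(5, 5, 2048))

# Global average pooling: collapse the spatial grid to one value per channel.
pooled = feature_maps.mean(axis=(0, 1))  # shape (2048,)

# New classification head: a single dense layer with 10 outputs.
# In real training, only these weights would be updated at first.
num_classes = 10
weights = rng.normal(scale=0.01, size=(pooled.shape[0], num_classes))
bias = np.zeros(num_classes)
logits = pooled @ weights + bias

# Softmax turns the logits into class probabilities (shifted for stability).
exp = np.exp(logits - logits.max())
probabilities = exp / exp.sum()

print(probabilities.shape, probabilities.sum())
```

The pretrained part only has to produce `feature_maps`; replacing the original 1000-class output by a 10-class one just changes the shape of `weights`.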

Now for the code!

### Imports & constants

In [1]:
# MAIN IMPORTS

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import cv2
import os
import numpy as np
import sklearn.preprocessing
import sklearn.metrics as metrics
from sklearn.model_selection import train_test_split
from tqdm import tqdm
import math
import PIL
import tensorflow as tf

# KERAS IMPORTS

from keras.preprocessing import image
from keras.applications.inception_v3 import InceptionV3
from keras_preprocessing.image import ImageDataGenerator
from keras.models import Sequential, Model
from keras.layers import Dense, Activation, Flatten, Input, Conv2D, Dropout, BatchNormalization, MaxPooling2D, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras_tqdm import TQDMNotebookCallback

# SKLEARN IMPORTS

from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import OneHotEncoder

NUM_FOLDS = 10
ORIGINAL_IMAGE_SHAPE = (48, 48)
VGG_IMAGE_SHAPE = (224,224)
NUM_CLASSES = 10
LR = 0.001
BATCH_SIZE = 4
SEED = 42
IMG_FOLDER = "data"
ROOT_FOLDER = os.getcwd()
NB_EPOCHS = 20

Using TensorFlow backend.


To make it easier for Keras to work with the images, we will save all of them as actual .png files in a `data` folder.

In [2]:
def from_csv_to_disk(PATH_TO_CSV="x_train_gr_smpl.csv", PATH_TO_DISK="data"):
    print("Opening csv..")
    data = pd.read_csv(PATH_TO_CSV)
    print("Csv opened..")
    out_dir = os.path.join(os.getcwd(), PATH_TO_DISK)
    os.makedirs(out_dir, exist_ok=True)  # make sure the target folder exists
    print("Saving images to {}".format(out_dir))
    for index, row in tqdm(data.iterrows(), total=len(data)):
        # each row holds the flattened pixel values of one 48x48 grayscale image
        np_row = row.to_numpy()
        img = np_row.reshape(ORIGINAL_IMAGE_SHAPE).astype(np.float32)
        # replicate the gray channel three times so the file is a 3-channel image
        cv2_img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
        cv2.imwrite(os.path.join(out_dir, f"img{index}.png"), cv2_img)

In [3]:
#from_csv_to_disk()
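
What the conversion above does to a single CSV row can be sketched with NumPy alone. The synthetic `flat_row` below is a made-up stand-in for one row of the CSV, and stacking the gray channel three times is meant to mimic the gray-to-BGR conversion; this is a sketch of the array shapes, not a replacement for the function above.

```python
import numpy as np

ORIGINAL_IMAGE_SHAPE = (48, 48)

# Hypothetical CSV row: 48 * 48 = 2304 grayscale intensities in [0, 255].
flat_row = (np.arange(48 * 48) % 256).astype(np.float32)

# Row-major reshape: the first 48 values become the first image row, and so on.
gray = flat_row.reshape(ORIGINAL_IMAGE_SHAPE)   # shape (48, 48)

# A gray-to-BGR conversion copies the single channel into three identical ones.
bgr = np.stack([gray, gray, gray], axis=-1)     # shape (48, 48, 3)

print(gray.shape, bgr.shape)
```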

We now define the building blocks for our problem:
 - The Keras generators that will provide data to our network
 - The model class that we will use to interact with our model

In [19]:
def get_generator(dataframe, path_column_name="imgpath", class_column_name="class", is_test = False):
    """
    Dataframe should contain two columns:
     - img path column (default name: 'imgpath')
     - img class column (default name: 'class')
     
    If is_test: return a single generator
    If is_test is False: return a tuple of train and val datasets
    """
    
    
    if (is_test):
        
        generator = ImageDataGenerator(rescale=1./255)
        
        test_generator = generator.flow_from_dataframe(
            directory=os.path.join(ROOT_FOLDER, IMG_FOLDER),
            dataframe=dataframe,  
            x_col=path_column_name, 
            y_col=class_column_name, 
            class_mode="categorical", 
            target_size=VGG_IMAGE_SHAPE,
            shuffle=False,
            color_mode="rgb", #the last two parameters induce the resizing to (224,224,3)
            seed=SEED
        )
        
        return test_generator
    
    else:      
    
        generator = ImageDataGenerator(
            rescale=1./255,
            validation_split=0.2, # set validation split
            #data augmentation
            shear_range=0.2,
            zoom_range=0.2,
            horizontal_flip=True
        )
        train_generator = generator.flow_from_dataframe(
            directory=os.path.join(ROOT_FOLDER, IMG_FOLDER),
            dataframe=dataframe,  
            x_col=path_column_name, 
            y_col=class_column_name, 
            class_mode="categorical", 
            target_size=VGG_IMAGE_SHAPE,
            color_mode="rgb", #the last two parameters induce the resizing to (224,224,3)
            batch_size=BATCH_SIZE,
            seed=SEED,
            subset='training') 

        validation_generator = generator.flow_from_dataframe(
            directory=os.path.join(ROOT_FOLDER, IMG_FOLDER),
            dataframe=dataframe,  
            x_col=path_column_name, 
            y_col=class_column_name, 
            class_mode="categorical", 
            target_size=VGG_IMAGE_SHAPE,
            color_mode="rgb", #the last two parameters induce the resizing to (224,224,3)
            batch_size=BATCH_SIZE,
            seed=SEED,
            subset='validation') 
    
        return train_generator, validation_generator

In [5]:
class PreTrainedCNN:
    
    def __init__(self):
        
        base_model = InceptionV3(weights='imagenet', include_top=False)
        for layer in base_model.layers:
            layer.trainable = False
        
        x = base_model.output
        x = GlobalAveragePooling2D()(x)
        
        x = Dense(1024, activation="relu")(x)
        x = Dropout(0.3)(x)
        x = Dense(256, activation="relu")(x)
        x = Dropout(0.3)(x)
        x = Dense(64, activation="relu")(x)
        x = Dropout(0.3)(x)
        x = Dense(NUM_CLASSES, activation="softmax")(x)
        
        
        self.model = Model(inputs= base_model.input, outputs=x)
        
        opt = Adam(learning_rate=LR)
        
        self.model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
        
    def train(self, datagen):
        """
        Make sure to create the datagen object with the 'get_generator' method: it should be a tuple of training and validation generators
        """
        train_gen, val_gen = datagen
        
        filepath="toplayer_training-{epoch:02d}-{val_accuracy:.2f}.hdf5"
        checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', save_best_only=True, mode='max')
        
        return self.model.fit_generator(    
            train_gen,
            epochs=NB_EPOCHS,
            steps_per_epoch = train_gen.samples,
            validation_data = val_gen, 
            validation_steps = val_gen.samples,
            verbose=0, 
            callbacks=[
                TQDMNotebookCallback(), 
                checkpoint,
                EarlyStopping(monitor='val_accuracy', mode='max', min_delta=0.1)
            ]
        )
    
    def finetune(self,datagen):
        """
        Make sure to create the datagen object with the 'get_generator' method: it should be a tuple of training and validation generators
        """
        train_gen, val_gen = datagen
        
        filepath="finetune_training-{epoch:02d}-{val_accuracy:.2f}.hdf5"
        checkpoint = ModelCheckpoint(filepath, monitor='val_accuracy', save_best_only=True, mode='max')
        
        for layer in self.model.layers[:249]:
            layer.trainable = False
        for layer in self.model.layers[249:]:
            layer.trainable = True
            
        opt = Adam(learning_rate=LR)

        self.model.compile(optimizer=opt, loss='categorical_crossentropy',
              metrics=['accuracy'])

        # we train our model again (this time fine-tuning the top 2 inception blocks
        # alongside the top Dense layers)
        return self.model.fit_generator(
            train_gen,
            epochs=NB_EPOCHS,
            steps_per_epoch = train_gen.samples,
            validation_data = val_gen, 
            validation_steps = val_gen.samples,
            verbose=0, 
            callbacks=[
                TQDMNotebookCallback(), 
                checkpoint,
                EarlyStopping(monitor='val_accuracy', mode='max', min_delta=0.025)
            ]
        )
    
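    def test(self, datagen, steps=1):
        """
        Make sure to create the datagen object with the 'get_generator' method.
        Note: 'steps' is the number of batches evaluated, so the default steps=1 scores a single batch only.
        """
        return self.model.evaluate_generator(datagen, steps=steps)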
    
    def predict(self, datagen):
        """
        Make sure to create the datagen object with the 'get_generator' method        
        """
        return self.model.predict_generator(datagen)
    
    def load_from_fileweights(self, path):
        """
        Load model weights from 'path', given relative to the current working directory.
        """
        self.model.load_weights(os.path.join(os.getcwd(), path))
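
In Keras, `steps_per_epoch` and `validation_steps` count batches, not images. The short sketch below works out how many batches cover the data exactly once, using the batch size of 4 defined above and the image counts reported by the generators later in this notebook (8103 training and 2025 validation images). The helper name `steps_for_one_pass` is ours, and this is only an arithmetic illustration, not the setting used in the class above.

```python
import math

def steps_for_one_pass(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)

BATCH_SIZE = 4
train_samples = 8103   # training images reported by the generator
val_samples = 2025     # validation images reported by the generator

train_steps = steps_for_one_pass(train_samples, BATCH_SIZE)
val_steps = steps_for_one_pass(val_samples, BATCH_SIZE)

# Images drawn per epoch if the number of samples is passed as the number of steps.
images_if_steps_equal_samples = train_samples * BATCH_SIZE

print(train_steps, val_steps, images_if_steps_equal_samples)
```

With `steps_per_epoch = train_gen.samples`, as in the `train` and `finetune` methods, each epoch therefore draws about four times as many (augmented) images as one pass over the training set.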

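We will now load the data and launch the training sessions. The cells below rely on a single 80/20 train/test split (plus a 20% validation split inside the generator) rather than on 10-fold cross-validation.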

In [7]:
images_path = np.array(
                sorted(os.listdir(os.path.join(ROOT_FOLDER, IMG_FOLDER)), 
                     key=lambda x: int(x[3:-4])# we sort images by their index number
                    )
                )
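
The sort key above strips the `img` prefix and the `.png` extension so that files are ordered by their numeric index. The sketch below, on a made-up list of file names, shows why a plain alphabetical sort would not be enough to keep the images aligned with the rows of the label file.

```python
# Hypothetical file names following the img<index>.png pattern used above.
filenames = ["img10.png", "img2.png", "img1.png", "img0.png"]

# Plain string sort: "img10.png" comes before "img2.png".
lexicographic = sorted(filenames)

# Numeric sort on the index between the "img" prefix and the ".png" suffix.
numeric = sorted(filenames, key=lambda name: int(name[3:-4]))

print(lexicographic)
print(numeric)
```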

In [8]:
classes = pd.read_csv("y_train_smpl.csv").to_numpy()

In [9]:
c = pd.DataFrame(classes)
c = c.astype(str)
df = pd.concat([pd.DataFrame(np.expand_dims(images_path, axis=1)), c], axis=1)
df.columns = ["imgpath", "class"]
df.head()

| | imgpath | class |
|---|---|---|
| 0 | img0.png | 0 |
| 1 | img1.png | 0 |
| 2 | img2.png | 0 |
| 3 | img3.png | 0 |
| 4 | img4.png | 0 |


Our dataframe now holds the correct information, so we can create the generators.

In [10]:
X_train, X_test, y_train, y_test = train_test_split(df["imgpath"], df["class"], test_size=0.20, random_state=SEED)

train_df = pd.concat([X_train, y_train], axis=1)
test_df = pd.concat([X_test, y_test], axis=1)

#with tf.device('/cpu:0'):
model = PreTrainedCNN()

datagen_train = get_generator(train_df) # tuple (train_gen, val_gen)
datagen_test = get_generator(test_df, is_test=True) # single generator for test

model.load_from_fileweights("toplayer_training-02-0.70.hdf5")

#hist = model.train(datagen_train)

Found 8103 validated image filenames belonging to 10 classes.
Found 2025 validated image filenames belonging to 10 classes.
Found 2532 validated image filenames belonging to 10 classes.


In [11]:
print("\t\tTested metrics are:", model.model.metrics_names)
print("\t\tPerf: ",model.test(datagen_test))

		Tested metrics are: ['loss', 'accuracy']
		Perf:  [1.400138258934021, 0.75]


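We just tested our network after training only the fully connected layers on top. We will now see its performance after a fine-tuning session in which the layers from index 249 onward (the top two Inception blocks and our dense classifier) are trained, while the first 249 layers stay frozen.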
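
The `finetune` method splits the layer list at index 249: everything before the cut is frozen, everything from the cut onward is trainable. The toy sketch below reproduces that slicing on plain Python objects. `ToyLayer` and the total of 300 layers are hypothetical placeholders, not the real Keras layers or the real depth of our model.

```python
class ToyLayer:
    """Minimal stand-in for a Keras layer: a name and a `trainable` flag."""
    def __init__(self, name):
        self.name = name
        self.trainable = True

TOTAL_LAYERS = 300   # hypothetical depth, chosen for illustration only
FREEZE_UNTIL = 249   # same cut index as in the finetune method

layers = [ToyLayer(f"layer_{i}") for i in range(TOTAL_LAYERS)]

# Freeze the bottom of the network, which detects generic patterns.
for layer in layers[:FREEZE_UNTIL]:
    layer.trainable = False

# Unfreeze the top of the network, which is more task-specific.
for layer in layers[FREEZE_UNTIL:]:
    layer.trainable = True

frozen = sum(not layer.trainable for layer in layers)
trainable = sum(layer.trainable for layer in layers)
print(frozen, trainable)
```

Note that the cut index gives the number of frozen layers; the number of trainable layers is whatever remains above it.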

In [17]:
#hist_bis = model.finetune(datagen_train)
model.load_from_fileweights("finetune_0.875_acc_test_set.hdf5")

In [13]:
print("\t\tTested metrics are:", model.model.metrics_names)
print("\t\tPerf: ",model.test(datagen_test))

		Tested metrics are: ['loss', 'accuracy']
		Perf:  [0.4105914235115051, 0.875]


In [21]:
y_pred = model.predict(datagen_test)

In [22]:
pd.DataFrame(metrics.classification_report(y_test,np.argmax(y_pred, axis=1).astype(str) ,  output_dict=True))

| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | accuracy | macro avg | weighted avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| precision | 0.355641 | 0.635838 | 1.0 | 0.97992 | 0.786948 | 0.997691 | 0.986014 | 1.0 | 0.778022 | 0.0 | 0.755134 | 0.752007 | 0.775413 |
| recall | 0.648084 | 0.284974 | 0.114943 | 0.924242 | 0.997567 | 0.929032 | 0.898089 | 0.568182 | 0.969863 | 0.0 | 0.755134 | 0.633498 | 0.755134 |
| f1-score | 0.459259 | 0.39356 | 0.206186 | 0.951267 | 0.879828 | 0.962138 | 0.94 | 0.724638 | 0.863415 | 0.0 | 0.755134 | 0.638029 | 0.733179 |
| support | 287.0 | 386.0 | 87.0 | 264.0 | 411.0 | 465.0 | 157.0 | 44.0 | 365.0 | 66.0 | 0.755134 | 2532.0 | 2532.0 |
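
As a sanity check on how the summary columns of this report relate to the per-class rows, the sketch below recomputes a few of them from the precision, recall and support values listed above. The helper `f1_score` is our own small function, not the scikit-learn one, and small differences in the last digit come from the rounding of the reported values.

```python
# Per-class values copied from the classification report above (classes 0 to 9).
precision = [0.355641, 0.635838, 1.0, 0.97992, 0.786948,
             0.997691, 0.986014, 1.0, 0.778022, 0.0]
recall = [0.648084, 0.284974, 0.114943, 0.924242, 0.997567,
          0.929032, 0.898089, 0.568182, 0.969863, 0.0]
support = [287, 386, 87, 264, 411, 465, 157, 44, 365, 66]

def f1_score(p, r):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

f1 = [f1_score(p, r) for p, r in zip(precision, recall)]

# Macro average: every class counts the same, whatever its size.
macro_recall = sum(recall) / len(recall)

# Weighted average: classes count in proportion to their support.
weighted_recall = sum(r * s for r, s in zip(recall, support)) / sum(support)

print([round(value, 6) for value in f1])
print(round(macro_recall, 6), round(weighted_recall, 6))
```

The weighted recall coincides with the overall accuracy, while the macro average is pulled down by the small classes that the model handles poorly, such as class 9, for which both precision and recall are 0.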
