# **Convolutional Neural Network** 
# **Binary classification of face images: comics or real?**

**Task**: implement a deep-learning system that discriminates between real faces and comics faces, using the «Comic faces» Kaggle dataset.

**Dataset**: The «[Comic faces](https://www.kaggle.com/datasets/defileroff/comic-faces-paired-synthetic-v2)» dataset is published on Kaggle and released under the CC-BY 4.0 license, with attribution required. It contains 10,000 real face images and 10,000 comics face images in 1024x1024 format.




### **1) Importing the required libraries**

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import zipfile
import pandas as pd
from sklearn.model_selection import train_test_split
import random
import pickle
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
import time
import sklearn
import seaborn as sn

This command lets us display **TensorBoard inline** in this Google Colab notebook.

In [None]:
%load_ext tensorboard

### **2) Loading the Dataset**

This cell is needed for **Kaggle API authentication**.
You have to upload your kaggle.json authentication token.

In [None]:
from google.colab import files

uploaded = files.upload()

for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(
      name=fn, length=len(uploaded[fn])))
  
# Then move kaggle.json into the folder where the API expects to find it.
!mkdir -p ~/.kaggle/ && mv kaggle.json ~/.kaggle/ && chmod 600 ~/.kaggle/kaggle.json

Here are the commands for **Kaggle API installation** and authentication.

In [None]:
!pip install kaggle
import kaggle
from kaggle.api.kaggle_api_extended import KaggleApi
api = KaggleApi()
api.authenticate()

Now we can **download the dataset** and unzip it.

In [None]:
!kaggle datasets download -d defileroff/comic-faces-paired-synthetic-v2

In [None]:
with zipfile.ZipFile('comic-faces-paired-synthetic-v2.zip','r') as zipref:
     zipref.extractall()

With the next line we remove the folder containing images we do not need.
Keeping only the two folders with real faces and comics faces will simplify the code.

In [None]:
!rm -rf /content/face2comics_v2.0.0_by_Sxela/face2comics_v2.0.0_by_Sxela/samples

Let's display our data!

In [None]:
DATADIR = "/content/face2comics_v2.0.0_by_Sxela/face2comics_v2.0.0_by_Sxela"

CATEGORIES = ["faces", "comics"]


for category in CATEGORIES:  
    path = os.path.join(DATADIR,category)  
    for img in os.listdir(path):  
        img_array = cv2.imread(os.path.join(path,img) ,cv2.IMREAD_GRAYSCALE)  
        plt.imshow(img_array, cmap='gray')  
        plt.show()  

        break  
    break  

In [None]:
print(img_array)

In [None]:
print(img_array.shape)

### **3) Preprocessing the images**
In this section I resize the images and convert them to **gray scale**.
**150x150** seems a reasonable size: the content is still recognizable, while storage, computation and time are saved.
The gray scale has been applied for the same purpose.
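To make the saving concrete, here is a rough back-of-the-envelope sketch. It assumes the originals would otherwise be kept as 3-channel colour images (an assumption, not something checked in this notebook) and counts one byte per value for `uint8` and eight bytes per value for `float64`, which is the dtype NumPy normally produces when a `uint8` array is divided by `255.0`.

```python
# Rough storage estimate for the preprocessing choices (illustrative arithmetic only).
N_IMAGES = 20_000               # 10,000 real faces + 10,000 comics faces
ORIG_SIDE, ORIG_CH = 1024, 3    # assumption: originals kept as 3-channel colour images
IMG_SIZE, GRAY_CH = 150, 1      # size and channel count used in this notebook


def values_per_image(side, channels):
    """Number of pixel values in one square image."""
    return side * side * channels


orig_values = values_per_image(ORIG_SIDE, ORIG_CH)
small_values = values_per_image(IMG_SIZE, GRAY_CH)
reduction = orig_values / small_values

# One byte per value as uint8, eight bytes per value as float64.
uint8_gb = N_IMAGES * small_values * 1 / 1e9
float64_gb = N_IMAGES * small_values * 8 / 1e9

print(f"values per image: {orig_values:,} -> {small_values:,} (about {reduction:.0f}x fewer)")
print(f"whole dataset as uint8:   {uint8_gb:.2f} GB")
print(f"whole dataset as float64: {float64_gb:.2f} GB")
```

Even at 150x150 in gray scale, the scaled array can take a few gigabytes of RAM, so on a small Colab runtime casting to `float32` might be worth considering.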

In [None]:
IMG_SIZE = 150

new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
plt.imshow(new_array, cmap='gray')
plt.show()

Then I created the **complete labeled dataset**

In [None]:
all_data = []

def create_all_data():
    for category in CATEGORIES:  

        path = os.path.join(DATADIR,category)  
        class_num = CATEGORIES.index(category)  

        for img in os.listdir(path):  
            try:
                img_array = cv2.imread(os.path.join(path,img) ,cv2.IMREAD_GRAYSCALE)  
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))  
                all_data.append([new_array, class_num])  
            except Exception as e:  # in the interest in keeping the output clean...
                pass
            #except OSError as e:
            #    print("OSErrroBad img most likely", e, os.path.join(path,img))
            #except Exception as e:
            #    print("general exception", e, os.path.join(path,img))

create_all_data()

print(len(all_data))

**Shuffling** the data is needed. If the samples stayed in class order, the classifier would likely perform badly: it would learn to always predict one label and then, at some point, to switch to the other.
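A second reason to shuffle is that Keras takes the `validation_split` samples from the end of the arrays, before its own shuffling. The small sketch below uses made-up labels (not the real dataset) to show what a 20% tail looks like with and without shuffling; the helper `tail_classes` is hypothetical and exists only for this illustration.

```python
import random

# Labels in the order the loading loop produces them: all real faces (0), then all comics (1).
labels = [0] * 500 + [1] * 500


def tail_classes(seq, tail_len):
    """Classes present in the last `tail_len` items of the sequence."""
    return set(seq[-tail_len:])


ordered_tail = tail_classes(labels, tail_len=200)

shuffled = labels[:]
random.Random(0).shuffle(shuffled)   # fixed seed only to make the sketch reproducible
shuffled_tail = tail_classes(shuffled, tail_len=200)

print("classes in the held-out tail, ordered: ", ordered_tail)
print("classes in the held-out tail, shuffled:", shuffled_tail)
```

With ordered data the held-out tail contains only comics faces, so a validation score computed on it would say nothing about real faces.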

In [None]:
random.shuffle(all_data)

In [None]:
for sample in all_data[:10]:
    print(sample[1])

Separating images from labels. 

In [None]:
X = []

y = []

for features,label in all_data:
    X.append(features)
    y.append(label)

print(X[0].reshape(-1, IMG_SIZE, IMG_SIZE, 1)) #if I use colours instead of gray scale, I have to change from 1 to 3

X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1) #if I use colours instead of gray scale, I have to change from 1 to 3
y = np.array(y)

**Scaling** the images so that their pixel values are normalized between 0 and 1 makes training much easier for the CNN.

In [None]:
X = X/255.0  # assign back to X: the train/test split below reads X, not x

### **4) Training set and Test set**
I randomly sampled a portion of the complete dataset and set it aside for the final test of my trained classifiers.

In [None]:
x_train,x_test,y_train,y_test=train_test_split(X,y,test_size=0.2)
print(len(x_test))

In [None]:
x_test[1:10]

In [None]:
type(x_train)

In [None]:
type(y_train)

In [None]:
print(len(x_train))
print(len(y_train))

### **5) First round of model selection**
In this section I used only a small portion of the training set (**4,800 images**) in order to compare the performance of a large number of models.
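The sample counts follow directly from the split fractions used in the cells of this notebook. The sketch below is plain arithmetic and assumes all 20,000 images were read successfully; the libraries' own rounding may move a single sample across a boundary.

```python
# Sample counts implied by the split fractions used in this notebook (plain arithmetic).
TOTAL = 20_000

test = round(TOTAL * 0.2)             # train_test_split(..., test_size=0.2)
train = TOTAL - test

round1 = train - round(train * 0.7)   # test_size=0.7 keeps 30% for the first round
round1_val = round(round1 * 0.3)      # validation_split=0.3 inside model.fit
round1_fit = round1 - round1_val

print("train / test:", train, "/", test)
print("first-round subset:", round1)
print("fitted on / validated on:", round1_fit, "/", round1_val)
```

So each first-round model is actually fitted on roughly 3,360 images and validated on the remaining 1,440.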

In [None]:
x_train_try1, x_train_notused1, y_train_try1, y_train_notused1 = train_test_split(x_train,y_train,test_size=0.7)

In [None]:
print(len(x_train_try1))
print(len(y_train_try1))

This first round of model selection takes into account **3 hyperparameters**: the **number of convolutional layers**, the **number of dense layers** and the **size of the layers**. Having already tried some models individually, I know that a layer size larger than 128 would take a very long time to train. Furthermore, 0, 1 and 2 dense layers and 1, 2 and 3 convolutional layers seem reasonable choices.
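As a quick sanity check on the size of this search, the grid can be enumerated without training anything. This is only a sketch: it rebuilds the run names in the same format used for the TensorBoard logs, leaving out the timestamp suffix.

```python
from itertools import product

dense_layers = [0, 1, 2]
layer_sizes = [32, 64, 128]
conv_layers = [1, 2, 3]

# Every combination of the three hyperparameters, in the same nesting order
# as the training loop (dense layers outermost, convolutional layers innermost).
grid = list(product(dense_layers, layer_sizes, conv_layers))

# Run names without the timestamp suffix that the training cell appends.
run_names = [f"{conv}-conv-{size}-nodes-{dense}-dense" for dense, size, conv in grid]

print(len(grid), "models to train")
print(run_names[0], "...", run_names[-1])
```

Three values for each of three hyperparameters give 27 models, which is why only a small subset of the training set is used in this round.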

In [None]:
dense_layers = [0, 1, 2]
layer_sizes = [32, 64, 128]
conv_layers = [1, 2, 3]

for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for conv_layer in conv_layers:
            NAME = "{}-conv-{}-nodes-{}-dense-{}".format(conv_layer, layer_size, dense_layer, int(time.time()))
            print(NAME)

            model = Sequential()

            model.add(Conv2D(layer_size, (3, 3), input_shape=x_train_try1.shape[1:]))
            model.add(Activation('relu'))
            model.add(MaxPooling2D(pool_size=(2, 2)))

            for l in range(conv_layer-1):
                model.add(Conv2D(layer_size, (3, 3)))
                model.add(Activation('relu'))
                model.add(MaxPooling2D(pool_size=(2, 2)))

            model.add(Flatten())

            for _ in range(dense_layer):
                model.add(Dense(layer_size))
                model.add(Activation('relu'))

            model.add(Dense(1))
            model.add(Activation('sigmoid'))

            tensorboard = TensorBoard(log_dir="logs_opt/{}".format(NAME))

            model.compile(loss='binary_crossentropy',
                          optimizer='adam',
                          metrics=['accuracy'],
                          )

            model.fit(x_train_try1, y_train_try1,
                      batch_size=32,
                      epochs=10,
                      validation_split=0.3,
                      callbacks=[tensorboard])


With the next line we can display the performance of our models inline with TensorBoard.

In [None]:
%tensorboard --logdir logs_opt

#### **First round considerations**:
Looking at the best 8 models in terms of validation loss, we can draw some conclusions.

1.   Models without any dense layer are completely unstable
2.   Models with only 1 convolutional layer perform badly, and models with 3 convolutional layers are clearly the best
3.   Models with layer size 128 take very long to train without much improvement
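One way to see why the layer size weighs so much on training is to count the trainable parameters. The helper below is a hypothetical sketch, not part of the notebook: it assumes the same stack as the training loop (3x3 convolutions with Keras' default `'valid'` padding, each followed by 2x2 max pooling, then the dense layers and one sigmoid unit) on 150x150 gray-scale input.

```python
def count_params(conv_blocks, size, dense_layers, img_size=150, channels=1):
    """Trainable parameters of the Conv(3x3) -> ReLU -> MaxPool(2x2) stacks used here."""
    total, side, in_ch = 0, img_size, channels
    for _ in range(conv_blocks):
        total += (3 * 3 * in_ch + 1) * size   # kernel weights plus one bias per filter
        side = (side - 2) // 2                # 'valid' 3x3 convolution, then 2x2 pooling
        in_ch = size
    features = side * side * in_ch            # length of the flattened vector
    for _ in range(dense_layers):
        total += (features + 1) * size        # weights plus one bias per dense unit
        features = size
    total += features + 1                     # single sigmoid output unit
    return total


for size in (32, 64, 128):
    print(size, count_params(conv_blocks=3, size=size, dense_layers=1))
```

Under these assumptions the count grows roughly with the square of the layer size, so going from 64 to 128 multiplies the number of weights by about four. Parameter count is only a proxy for training time, but it points in the same direction as the durations observed here.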





### **6) Second round of model selection**
In this round I increased the portion of the training set used for the Grid Search (**10,400 images**).
A larger number of observations (*n*) increases the training duration. However, it is needed to check whether the ranking of the models changes when *n* changes.

In [None]:
x_train_try2, x_train_notused2, y_train_try2, y_train_notused2 = train_test_split(x_train,y_train,test_size=0.35)
print(len(x_train_try2))
print(len(y_train_try2))

Now, the Grid Search is performed on a smaller number of models based on previous conclusions.
The number of convolutional layers is fixed to 3 and I also excluded models with 0 dense layers.


In [None]:
dense_layers = [1, 2]
layer_sizes = [32, 64, 128]
conv_layers = [3]

for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for conv_layer in conv_layers:
            NAME = "{}-conv-{}-nodes-{}-dense-{}".format(conv_layer, layer_size, dense_layer, int(time.time()))
            print(NAME)

            model = Sequential()

            model.add(Conv2D(layer_size, (3, 3), input_shape=x_train_try2.shape[1:]))
            model.add(Activation('relu'))
            model.add(MaxPooling2D(pool_size=(2, 2)))

            for l in range(conv_layer-1):
                model.add(Conv2D(layer_size, (3, 3)))
                model.add(Activation('relu'))
                model.add(MaxPooling2D(pool_size=(2, 2)))

            model.add(Flatten())

            for _ in range(dense_layer):
                model.add(Dense(layer_size))
                model.add(Activation('relu'))

            model.add(Dense(1))
            model.add(Activation('sigmoid'))

            tensorboard = TensorBoard(log_dir="logs_opt_2r/{}".format(NAME))

            model.compile(loss='binary_crossentropy',
                          optimizer='adam',
                          metrics=['accuracy'],
                          )

            model.fit(x_train_try2, y_train_try2,
                      batch_size=32,
                      epochs=8,
                      validation_split=0.2,
                      callbacks=[tensorboard])


In [None]:
%tensorboard --logdir logs_opt_2r

#### **Second round considerations**:
Not including models with zero dense layers was definitely a good choice: all the models are now stable.

The best 4 models in terms of validation loss are:

1.   3 convolutional layers, layer size 128 and 1 dense layer. Very long training duration: 1 h, 34 min and 18 sec
2.   3 convolutional layers, layer size 64 and 2 dense layers. Quick training duration: 18 min and 25 sec
3.   3 convolutional layers, layer size 128 and 2 dense layers. Long training duration: 1 h, 5 min and 33 sec
4.   3 convolutional layers, layer size 32 and 1 dense layer. By far the quickest to train: 6 min and 36 sec




### **7) Final training of the best models**
Now it's time to train these 4 models on the entire training set (**16,000 images**) for the final evaluation.
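The four training cells in this section do not share the same batch size and number of epochs, so the models receive different numbers of weight updates. The sketch below only does the arithmetic, using the settings from those cells and assuming 16,000 training images with `validation_split=0.2`; the dictionary keys are labels chosen for this illustration.

```python
import math

TRAIN_IMAGES = 16_000
fit_images = TRAIN_IMAGES - round(TRAIN_IMAGES * 0.2)   # validation_split=0.2 holds out 20%

# (batch_size, epochs) as set in the four training cells of this section.
settings = {
    "3conv-32size-1dense": (128, 10),
    "3conv-64size-2dense": (32, 10),
    "3conv-128size-1dense": (128, 5),
    "3conv-128size-2dense": (128, 5),
}

updates = {}
for name, (batch_size, epochs) in settings.items():
    steps_per_epoch = math.ceil(fit_images / batch_size)
    updates[name] = steps_per_epoch * epochs
    print(f"{name}: {steps_per_epoch} steps/epoch x {epochs} epochs = {updates[name]} updates")
```

Differences in the final validation curves may therefore partly reflect the training budget and not only the architecture.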

In [None]:

NAME = "faces-3-32-1-{}".format(int(time.time()))

model1 = Sequential()

model1.add(Conv2D(32, (3, 3), input_shape=x_train.shape[1:]))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2, 2)))


model1.add(Conv2D(32, (3, 3)))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2, 2)))

model1.add(Conv2D(32, (3, 3)))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2, 2)))

model1.add(Flatten())


model1.add(Dense(32))
model1.add(Activation('relu'))

model1.add(Dense(1))
model1.add(Activation('sigmoid'))

tensorboard = TensorBoard(log_dir="logs_train_finalmodels/{}".format(NAME))

model1.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model1.fit(x_train, y_train,
          batch_size=128,
          epochs=10,
          validation_split=0.2,
          callbacks=[tensorboard])

model1.save('faces3-32-1.model')

In [None]:
NAME = "faces-3-64-2-{}".format(int(time.time()))

model2 = Sequential()

model2.add(Conv2D(64, (3, 3), input_shape=x_train.shape[1:]))
model2.add(Activation('relu'))
model2.add(MaxPooling2D(pool_size=(2, 2)))


model2.add(Conv2D(64, (3, 3)))
model2.add(Activation('relu'))
model2.add(MaxPooling2D(pool_size=(2, 2)))

model2.add(Conv2D(64, (3, 3)))
model2.add(Activation('relu'))
model2.add(MaxPooling2D(pool_size=(2, 2)))

model2.add(Flatten())


model2.add(Dense(64))
model2.add(Activation('relu'))

model2.add(Dense(64))
model2.add(Activation('relu'))

model2.add(Dense(1))
model2.add(Activation('sigmoid'))

tensorboard = TensorBoard(log_dir="logs_train_finalmodels/{}".format(NAME))

model2.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model2.fit(x_train, y_train,
          batch_size=32,
          epochs=10,
          validation_split=0.2,
          callbacks=[tensorboard])

model2.save('faces3-64-2.model')

In [None]:
NAME = "faces-3-128-1-{}".format(int(time.time()))

model3 = Sequential()

model3.add(Conv2D(128, (3, 3), input_shape=x_train.shape[1:]))
model3.add(Activation('relu'))
model3.add(MaxPooling2D(pool_size=(2, 2)))

model3.add(Conv2D(128, (3, 3)))
model3.add(Activation('relu'))
model3.add(MaxPooling2D(pool_size=(2, 2)))

model3.add(Conv2D(128, (3, 3)))
model3.add(Activation('relu'))
model3.add(MaxPooling2D(pool_size=(2, 2)))

model3.add(Flatten())

model3.add(Dense(128))
model3.add(Activation('relu'))

model3.add(Dense(1))
model3.add(Activation('sigmoid'))

tensorboard = TensorBoard(log_dir="logs_train_finalmodels/{}".format(NAME))

model3.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model3.fit(x_train, y_train,
          batch_size=128,
          epochs=5,
          validation_split=0.2,
          callbacks=[tensorboard])

model3.save('faces3-128-1.model')

In [None]:
NAME = "faces-3-128-2-{}".format(int(time.time()))

model4 = Sequential()

model4.add(Conv2D(128, (3, 3), input_shape=x_train.shape[1:]))
model4.add(Activation('relu'))
model4.add(MaxPooling2D(pool_size=(2, 2)))


model4.add(Conv2D(128, (3, 3)))
model4.add(Activation('relu'))
model4.add(MaxPooling2D(pool_size=(2, 2)))

model4.add(Conv2D(128, (3, 3)))
model4.add(Activation('relu'))
model4.add(MaxPooling2D(pool_size=(2, 2)))

model4.add(Flatten())


model4.add(Dense(128))
model4.add(Activation('relu'))

model4.add(Dense(128))
model4.add(Activation('relu'))

model4.add(Dense(1))
model4.add(Activation('sigmoid'))

tensorboard = TensorBoard(log_dir="logs_train_finalmodels/{}".format(NAME))

model4.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model4.fit(x_train, y_train,
          batch_size=128,
          epochs=5,
          validation_split=0.2,
          callbacks=[tensorboard])

model4.save('faces3-128-2.model')

In [None]:
%tensorboard --logdir logs_train_finalmodels

### **8) Can we further improve our models?**
A **larger dataset** would most likely have improved our trained models. However, we don't have one.<br><br>
What we could have done instead is a deeper hyperparameter tuning.<br>
I performed model selection on only a few hyperparameter values. For this project we have neither the computational capacity nor the time to perform a larger Grid Search.<br>
However, based on what we learned from the previous steps, there is something we can try.<br><br>
We know that the **layer sizes** cannot be enlarged because of the training time, and larger sizes did not seem to improve performance much.<br>
Apart from the case with zero **dense layers**, the number of dense layers alone did not seem to change performance.<br>
The **number of convolutional layers**, instead, could be a leading factor in model performance.<br> <br>
We can try our best-performing model (among the quick ones) with 4 or 5 convolutional layers!
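Before adding blocks, it helps to check how quickly the feature maps shrink. The hypothetical helper below assumes the block used throughout this notebook (a 3x3 convolution with the default `'valid'` padding followed by 2x2 max pooling) on 150x150 input, and tracks the side length after each block.

```python
def feature_map_sides(n_blocks, img_size=150):
    """Side length after each Conv2D(3x3, 'valid') + MaxPooling2D(2x2) block."""
    sides, side = [], img_size
    for _ in range(n_blocks):
        if side < 3:
            raise ValueError(f"a {side}x{side} map is too small for a 3x3 'valid' convolution")
        side = (side - 2) // 2   # the convolution trims 2 pixels, the pooling halves (floor)
        sides.append(side)
    return sides


for n_blocks in (3, 4, 5):
    side = feature_map_sides(n_blocks)[-1]
    print(f"{n_blocks} blocks: {side}x{side} map, {side * side * 64} flattened features at 64 filters")
```

Under these assumptions the fifth block leaves a 2x2 map, so a sixth identical block would not fit without changing something, for example `padding='same'` or a larger input size. The flattened vector also gets much shorter with each extra block, which shrinks the first dense layer considerably.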

In [None]:
NAME = "faces-4-64-2-{}".format(int(time.time()))

model5 = Sequential()

model5.add(Conv2D(64, (3, 3), input_shape=x_train.shape[1:]))
model5.add(Activation('relu'))
model5.add(MaxPooling2D(pool_size=(2, 2)))


model5.add(Conv2D(64, (3, 3)))
model5.add(Activation('relu'))
model5.add(MaxPooling2D(pool_size=(2, 2)))

model5.add(Conv2D(64, (3, 3)))
model5.add(Activation('relu'))
model5.add(MaxPooling2D(pool_size=(2, 2)))

model5.add(Conv2D(64, (3, 3)))
model5.add(Activation('relu'))
model5.add(MaxPooling2D(pool_size=(2, 2)))


model5.add(Flatten())


model5.add(Dense(64))
model5.add(Activation('relu'))

model5.add(Dense(64))
model5.add(Activation('relu'))

model5.add(Dense(1))
model5.add(Activation('sigmoid'))

tensorboard = TensorBoard(log_dir="logs_train_manyconv/{}".format(NAME))

model5.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model5.fit(x_train, y_train,
          batch_size=128,
          epochs=10,
          validation_split=0.2,
          callbacks=[tensorboard])

model5.save('faces4-64-2.model')

In [None]:
NAME = "faces-5-64-2-{}".format(int(time.time()))

model6 = Sequential()

model6.add(Conv2D(64, (3, 3), input_shape=x_train.shape[1:]))
model6.add(Activation('relu'))
model6.add(MaxPooling2D(pool_size=(2, 2)))


model6.add(Conv2D(64, (3, 3)))
model6.add(Activation('relu'))
model6.add(MaxPooling2D(pool_size=(2, 2)))

model6.add(Conv2D(64, (3, 3)))
model6.add(Activation('relu'))
model6.add(MaxPooling2D(pool_size=(2, 2)))

model6.add(Conv2D(64, (3, 3)))
model6.add(Activation('relu'))
model6.add(MaxPooling2D(pool_size=(2, 2)))

model6.add(Conv2D(64, (3, 3)))
model6.add(Activation('relu'))
model6.add(MaxPooling2D(pool_size=(2, 2)))

model6.add(Flatten())


model6.add(Dense(64))
model6.add(Activation('relu'))

model6.add(Dense(64))
model6.add(Activation('relu'))

model6.add(Dense(1))
model6.add(Activation('sigmoid'))

tensorboard = TensorBoard(log_dir="logs_train_manyconv/{}".format(NAME))

model6.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model6.fit(x_train, y_train,
          batch_size=128,
          epochs=10,
          validation_split=0.2,
          callbacks=[tensorboard])

model6.save('faces5-64-2.model')

In [None]:
%tensorboard --logdir logs_train_manyconv

The models with 4 and 5 convolutional layers do seem to perform better.<br><br>
As a last attempt, I want to see whether reducing the layer sizes of the model with 5 convolutional layers reduces the training duration without worsening the accuracy.

In [None]:
NAME = "faces-5-32-2-{}".format(int(time.time()))

model7 = Sequential()

model7.add(Conv2D(32, (3, 3), input_shape=x_train.shape[1:]))
model7.add(Activation('relu'))
model7.add(MaxPooling2D(pool_size=(2, 2)))


model7.add(Conv2D(32, (3, 3)))
model7.add(Activation('relu'))
model7.add(MaxPooling2D(pool_size=(2, 2)))

model7.add(Conv2D(32, (3, 3)))
model7.add(Activation('relu'))
model7.add(MaxPooling2D(pool_size=(2, 2)))

model7.add(Conv2D(32, (3, 3)))
model7.add(Activation('relu'))
model7.add(MaxPooling2D(pool_size=(2, 2)))

model7.add(Conv2D(32, (3, 3)))
model7.add(Activation('relu'))
model7.add(MaxPooling2D(pool_size=(2, 2)))

model7.add(Flatten())


model7.add(Dense(32))
model7.add(Activation('relu'))

model7.add(Dense(32))
model7.add(Activation('relu'))

model7.add(Dense(1))
model7.add(Activation('sigmoid'))

tensorboard = TensorBoard(log_dir="logs_train_manyconv/{}".format(NAME))

model7.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'],
              )

model7.fit(x_train, y_train,
          batch_size=128,
          epochs=10,
          validation_split=0.2,
          callbacks=[tensorboard])

model7.save('faces5-32-2.model')

Yes. The model with 5 convolutional layers, 2 dense layers and layer size 32 is much faster, losing only a little validation accuracy.

We definitely have to include these 3 models in the testing part.

#### **In conclusion**
A larger number of convolutional layers could be tested.<br>
Other techniques could have been added to try to improve performance.<br>
A typical choice for this kind of model is to add a **Dropout layer**, which is used to prevent overfitting. I didn't include it because I saw no overfitting issues.
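For reference, this is roughly what such a layer does during training. The function below is a plain-Python sketch of the common "inverted dropout" scheme, not the Keras implementation: each activation is zeroed with probability `rate` and the survivors are rescaled so that the expected value stays the same. The function name and the toy activations are made up for the illustration.

```python
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the survivors."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


rng = random.Random(0)               # fixed seed only for reproducibility
activations = [1.0] * 10_000         # toy activations, all equal to 1
dropped = dropout(activations, rate=0.5, rng=rng)

zeroed = dropped.count(0.0)
mean_after = sum(dropped) / len(dropped)
print(f"zeroed {zeroed} of {len(dropped)} activations, mean after dropout = {mean_after:.3f}")
```

In this notebook the equivalent would be a `Dropout(rate)` layer, which is already imported in section 1, placed for example between the dense layers. Dropout is only active during training, not at prediction time.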

*The next cell can be used to connect your Google Drive account with Google Colab. It is very useful for saving your trained models, because when the Colab runtime is stopped you lose all your variables and models.*

In [None]:
#from google.colab import drive
#drive.mount('/content/gdrive')

### **9) Testing final models**
Now it's time to use the test set (**4,000 images**) to evaluate the performance of our final models.
For each model I display the **test loss**, the **test accuracy** and the **confusion matrix**.
To interpret the confusion matrix: 0 corresponds to real faces and 1 to comics faces.
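The bookkeeping behind a confusion matrix can be sketched in a few lines. The example below uses made-up labels and sigmoid outputs, not results from this notebook, and a hypothetical helper that thresholds the probabilities at 0.5 before counting. Rows are true labels and columns are predicted labels, which is the layout `tf.math.confusion_matrix` uses as well.

```python
def confusion_matrix(y_true, y_prob, threshold=0.5):
    """2x2 counts: rows are true labels, columns are predicted labels."""
    matrix = [[0, 0], [0, 0]]
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob > threshold else 0   # probability -> hard class label
        matrix[truth][predicted] += 1
    return matrix


# Made-up labels and sigmoid outputs, only to show the bookkeeping.
y_true = [0, 0, 0, 1, 1, 1]
y_prob = [0.02, 0.40, 0.73, 0.91, 0.55, 0.08]

cm = confusion_matrix(y_true, y_prob)
accuracy = (cm[0][0] + cm[1][1]) / len(y_true)
print(cm, f"accuracy = {accuracy:.2f}")
```

In this toy case `cm[0][1]` counts real faces predicted as comics and `cm[1][0]` counts comics predicted as real faces; the diagonal holds the correct predictions.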

#### **4conv-64size-2dense Model test performances**

In [None]:
model5 = tf.keras.models.load_model("faces4-64-2.model")

results5 = model5.evaluate(x_test, y_test, batch_size=128)
print("test loss, test acc:", results5)

In [None]:
mod5_pred=model5.predict(x_test)
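# Threshold the sigmoid outputs at 0.5 to get integer class labels (0 = real, 1 = comics)
mod5_prediction=(mod5_pred > 0.5).astype(int).ravel()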

In [None]:
cm5=tf.math.confusion_matrix(
    y_test,
    mod5_prediction)
plt.figure(figsize = (5,4))
sn.heatmap(cm5, annot=True, annot_kws={"size": 16},cmap='Blues')

#### **5conv-64size-2dense Model test performances**

In [None]:
model6 = tf.keras.models.load_model("faces5-64-2.model")

results6 = model6.evaluate(x_test, y_test, batch_size=128)
print("test loss, test acc:", results6)

In [None]:
mod6_pred=model6.predict(x_test)
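# Threshold the sigmoid outputs at 0.5 to get integer class labels (0 = real, 1 = comics)
mod6_prediction=(mod6_pred > 0.5).astype(int).ravel()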

In [None]:
cm6=tf.math.confusion_matrix(
    y_test,
    mod6_prediction)
plt.figure(figsize = (5,4))
sn.heatmap(cm6, annot=True, annot_kws={"size": 16},cmap='Blues')

#### **5conv-32size-2dense Model test performances**

In [None]:
model7 = tf.keras.models.load_model("faces5-32-2.model")

results7 = model7.evaluate(x_test, y_test, batch_size=128)
print("test loss, test acc:", results7)

In [None]:
mod7_pred=model7.predict(x_test)
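# Threshold the sigmoid outputs at 0.5 to get integer class labels (0 = real, 1 = comics)
mod7_prediction=(mod7_pred > 0.5).astype(int).ravel()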

In [None]:
cm7=tf.math.confusion_matrix(
    y_test,
    mod7_prediction)
plt.figure(figsize = (5,4))
sn.heatmap(cm7, annot=True, annot_kws={"size": 16},cmap='Blues')