The goal of this notebook is to present simple approaches for classifying images.

I will start with a very simple CNN and make it deeper and more complex as I move forward. Feel free to leave comments: I am still a student in ML, so there may be mistakes.

---

# 1. Import Libraries

In [None]:
# Basics
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import time
import IPython
import random
from random import getrandbits
import os

# Tensorflow
import tensorflow as tf
from tensorflow import keras
import tensorflow.keras.layers as layers
from tensorflow.keras import datasets
from tensorflow.keras import callbacks

from sklearn.model_selection import train_test_split

# Metrics
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score

# Graphs
import matplotlib.pyplot as plt
import seaborn as sns # heatmap for the confusion matrix

print("Numpy: " + np.__version__)
print("Tensorflow: " + tf.__version__)
print("Keras: " + keras.__version__)

---

# 2. Loading Data

In [None]:
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to be between 0 and 1
train_images, test_images = train_images / 255.0, test_images / 255.0

## Verify Data

In [None]:
class_names = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer',
               'Dog', 'Frog', 'Horse', 'Ship', 'Truck']

plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    # The CIFAR labels happen to be arrays, 
    # which is why you need the extra index
    plt.xlabel(class_names[train_labels[i][0]])
plt.show()

---

# 3. Re-Create a Basic CNN from TensorFlow

This first and simplest model comes from an architecture found on the TensorFlow website. You can find the model [and the notebook here](https://www.tensorflow.org/tutorials/images/cnn).

In [None]:
model1 = keras.Sequential([
    # First Convolutional Block
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPool2D((2, 2)),

    # Second Convolutional Block
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPool2D((2, 2)),

    # Third Convolutional Block
    layers.Conv2D(64, (3, 3), activation="relu"),

    # Classifier Head
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10)
])

model1.summary()
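
To make the output of `model1.summary()` easier to read, here is a small sketch that recomputes the feature-map sizes and parameter counts by hand. It assumes the Keras defaults used above (no padding and stride 1 for `Conv2D`, stride 2 for the 2x2 pooling). The helper names `conv_out`, `conv_params`, and `dense_params` are made up for this illustration.

```python
def conv_out(size, kernel):
    """Spatial size after a 'valid' (no padding), stride-1 convolution."""
    return size - kernel + 1

def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return (kernel * kernel * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out

size = 32                 # CIFAR-10 images are 32x32
size = conv_out(size, 3)  # first Conv2D
size = size // 2          # first MaxPool2D
size = conv_out(size, 3)  # second Conv2D
size = size // 2          # second MaxPool2D
size = conv_out(size, 3)  # third Conv2D
flat = size * size * 64   # Flatten: height * width * channels

total = (conv_params(3, 3, 32) + conv_params(3, 32, 64)
         + conv_params(3, 64, 64)
         + dense_params(flat, 64) + dense_params(64, 10))
print(size, flat, total)
```

Most of the parameters sit in the first `Dense` layer, which is one reason the size of the flattened feature map matters.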

In [None]:
model1.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history1 = model1.fit(
    train_images, train_labels, 
    epochs=10, 
    validation_split=0.2
)
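
The last layer, `Dense(10)`, has no activation, so the model outputs raw scores (logits); `from_logits=True` tells the loss to apply the softmax itself. The NumPy sketch below reproduces that computation on made-up logits. `sparse_ce_from_logits` is a hypothetical helper written for this illustration, not a Keras function.

```python
import numpy as np

def sparse_ce_from_logits(logits, labels):
    """Mean cross-entropy between integer labels and raw logits."""
    # Subtract the row maximum for numerical stability before exponentiating
    shifted = logits - logits.max(axis=1, keepdims=True)
    # Log-softmax: log of each class probability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each sample
    return -log_probs[np.arange(len(labels)), labels].mean()

# Made-up logits for 2 samples and 3 classes
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])
loss = sparse_ce_from_logits(logits, labels)
print(loss)
```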

---

# 4. Evaluate Model 1

In [None]:
plt.plot(history1.history['accuracy'], label='accuracy')
plt.plot(history1.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
plt.show()

---

# 5. Create a CNN from Kaggle Course

In this second model, I make the CNN a bit wider and deeper. I also took the architecture from a tutorial, this time from the Kaggle course; [the notebook can be found here](https://www.kaggle.com/ryanholbrook/custom-convnets).

In [None]:
model2 = keras.Sequential([
    # First Convolutional Block
    layers.Conv2D(32, kernel_size=5, activation="relu", padding='same',
                  # give the input dimensions in the first layer
                  # [height, width, color channels(RGB)]
                  input_shape=[32, 32, 3]),
    layers.MaxPool2D(),

    # Second Convolutional Block
    layers.Conv2D(64, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),

    # Third Convolutional Block
    layers.Conv2D(128, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),

    # Classifier Head
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10)
])

model2.summary()

In [None]:
model2.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history2 = model2.fit(
    train_images, train_labels, 
    epochs=10, 
    validation_split=0.2
)

---

# 6. Evaluate Model 2

In [None]:
plt.plot(history2.history['accuracy'], label='accuracy')
plt.plot(history2.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
plt.show()

## First Conclusion
* We can see that the second model is overfitting: the validation accuracy doesn't follow the training accuracy.
* First, create one version with Dropout and BatchNormalization layers.
* Then, create one version with augmentation techniques.
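
One rough way to put a number on the overfitting described above is the gap between training and validation accuracy at each epoch. The sketch below uses a made-up dictionary in the same format as `history2.history`. The values are illustrative only, not results from this notebook, and `accuracy_gap` is a hypothetical helper.

```python
def accuracy_gap(history):
    """Per-epoch difference between training and validation accuracy."""
    return [train - val
            for train, val in zip(history["accuracy"], history["val_accuracy"])]

# Illustrative values only (not measured in this notebook)
fake_history = {
    "accuracy":     [0.45, 0.60, 0.70, 0.80],
    "val_accuracy": [0.50, 0.60, 0.65, 0.66],
}
gaps = accuracy_gap(fake_history)
print([round(g, 2) for g in gaps])
```

On a real run this would be called as `accuracy_gap(history2.history)`; a gap that keeps growing from epoch to epoch suggests overfitting.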
 
---

# 7. Create a Model with Dropout Techniques

In this model, I added Dropout and BatchNormalization layers. The Kaggle course introducing them [can be found here](https://www.kaggle.com/ryanholbrook/dropout-and-batch-normalization).

In [None]:
model3 = keras.Sequential([
    # First Convolutional Block
    layers.Conv2D(32, kernel_size=5, activation="relu", padding='same',
                  # give the input dimensions in the first layer
                  # [height, width, color channels(RGB)]
                  input_shape=[32, 32, 3]),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Second Convolutional Block
    layers.Conv2D(64, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Third Convolutional Block
    layers.Conv2D(128, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Classifier Head
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    layers.Dense(10)
])

model3.summary()

In [None]:
model3.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history3 = model3.fit(
    train_images, train_labels, 
    epochs=10, 
    validation_split=0.2
)

---

# 8. Evaluate Model 3

In [None]:
plt.plot(history3.history['accuracy'], label='accuracy')
plt.plot(history3.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
plt.show()

---

# 9. Create a Model with Augmentation Techniques

For this model, I tried to implement Augmentation Layers. I used two notebooks to learn how to implement them:
* [The notebook from TensorFlow](https://www.tensorflow.org/tutorials/images/data_augmentation)
* [The notebook from Kaggle's Course](https://www.kaggle.com/ryanholbrook/data-augmentation)

In [None]:
model4 = keras.Sequential([
    # First Convolutional Block
    layers.InputLayer(input_shape=(32, 32, 3)),
    # On TensorFlow < 2.6 these layers live under layers.experimental.preprocessing
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.2),
    layers.Conv2D(32, kernel_size=5, activation="relu", padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Second Convolutional Block
    layers.Conv2D(64, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Third Convolutional Block
    layers.Conv2D(128, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Classifier Head
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    layers.Dense(10)
])

model4.summary()

In [None]:
model4.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history4 = model4.fit(
    train_images, train_labels, 
    epochs=10, 
    validation_split=0.2
)

---

# 10. Evaluate Model 4

In [None]:
plt.plot(history4.history['accuracy'], label='accuracy')
plt.plot(history4.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0, 1])
plt.legend(loc='lower right')
plt.show()

## Second Conclusion

* The Dropout layers made the model generalize better and it no longer overfits, but we probably need to train it for more epochs.
* We lost accuracy when using augmentation techniques.
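
One possible contributor to the accuracy drop, offered as a hypothesis rather than a measured result: in Keras, the `factor` of `RandomRotation` is a fraction of a full turn (2π), so `0.2` allows rotations of up to 20% of 360° in either direction. The short calculation below makes that range explicit; `max_rotation_degrees` is a hypothetical helper.

```python
def max_rotation_degrees(factor):
    """Largest rotation allowed by RandomRotation(factor), in degrees."""
    return factor * 360.0

print(max_rotation_degrees(0.2))   # factor used in model4
print(max_rotation_degrees(0.05))  # a gentler alternative to try
```

Rotations of roughly 72° are large for 32x32 images, so a smaller factor may be worth trying before concluding that augmentation does not help.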

---

# 11. Creation of a Model with more epochs

Here, I made a few small modifications to my previous model and trained it for more epochs.

In [None]:
model5 = keras.Sequential([
    # First Convolutional Block
    layers.Conv2D(32, kernel_size=5, activation="relu", padding='same',
                  # give the input dimensions in the first layer
                  # [height, width, color channels(RGB)]
                  input_shape=[32, 32, 3]),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Second Convolutional Block
    layers.Conv2D(64, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Third Convolutional Block
    layers.Conv2D(128, kernel_size=3, activation="relu", padding='same'),
    layers.MaxPool2D(),
    layers.Dropout(0.3),
    layers.BatchNormalization(),
    
    # Classifier Head
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10)
])

model5.summary()

In [None]:
model5.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history5 = model5.fit(
    train_images, train_labels, 
    epochs=50, 
    validation_split=0.2
)

---

# 12. Evaluate Model 5

In [None]:
plt.plot(history5.history['accuracy'], label='accuracy')
plt.plot(history5.history['val_accuracy'], label = 'val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0, 1])
plt.legend(loc='lower right')
plt.show()

## Third Conclusion

* We successfully improved the accuracy of our model.
* Removing the Dropout and BatchNormalization layers from the head of the model also helped: 71% -> 75% in 10 epochs.
* Training for more epochs also helped: 10 -> 50.
* Now, let's move on to another technique: **Transfer Learning**.
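
With 50 epochs, it can be worth checking when the validation accuracy stopped improving. The sketch below shows the idea behind early stopping in plain Python on made-up numbers. `best_epoch_with_patience` is a hypothetical helper; Keras offers a ready-made version of this logic as `callbacks.EarlyStopping`.

```python
def best_epoch_with_patience(val_acc, patience):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops once `patience` consecutive epochs fail to beat
    the best validation accuracy seen so far.
    """
    best, best_epoch, waited = float("-inf"), -1, 0
    for epoch, acc in enumerate(val_acc):
        if acc > best:
            best, best_epoch, waited = acc, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Patience never ran out: training went through every epoch
    return len(val_acc) - 1, best_epoch

# Illustrative values only (not measured in this notebook)
val_acc = [0.55, 0.63, 0.70, 0.72, 0.71, 0.72, 0.70, 0.69]
stop, best = best_epoch_with_patience(val_acc, patience=3)
print(stop, best)
```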

---

# 13. Creation of a Model using a Pre-trained Model

For this model, I tried transfer learning for the first time. For that, I used two notebooks:
* [The notebook from TensorFlow](https://www.tensorflow.org/tutorials/images/transfer_learning)
* [The notebook from Kaggle](https://www.kaggle.com/ryanholbrook/data-augmentation)

In [None]:
preprocess_input = tf.keras.applications.mobilenet_v2.preprocess_input

In [None]:
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(32, 32, 3),
    include_top=False,
    weights='imagenet'
)

base_model.trainable = False

In [None]:
base_model.summary()

In [None]:
model6 = keras.Sequential([
    layers.InputLayer(input_shape=(32, 32, 3)),
    
    # Preprocessing
    #preprocess_input,
    
    # Base Model: MobileNetV2 (frozen)
    base_model,
    
    # Classifier Head
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10)
])

model6.summary()

In [None]:
model6.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"]
)

history6 = model6.fit(
    train_images, train_labels, 
    epochs=50, 
    validation_split=0.2
)

## Fourth Conclusion

* Our first attempt at using Transfer Learning was a failure. This is probably due to the small size of our images, since most large pre-trained models are trained on bigger images.
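
Besides image size, input scaling may also play a role. The images were divided by 255, so they lie in [0, 1], while `mobilenet_v2.preprocess_input` maps pixel values from [0, 255] to [-1, 1], and that step is commented out in `model6`. The NumPy sketch below shows the equivalent rescaling for data already in [0, 1]. `to_mobilenet_range` is a hypothetical helper, and whether this actually improves accuracy here is untested.

```python
import numpy as np

def to_mobilenet_range(images_01):
    """Map images from [0, 1] to the [-1, 1] range MobileNetV2 expects."""
    return images_01 * 2.0 - 1.0

# Random stand-in for a batch of CIFAR-10 images already scaled to [0, 1]
rng = np.random.default_rng(0)
batch = rng.random((4, 32, 32, 3))
scaled = to_mobilenet_range(batch)
print(scaled.min(), scaled.max())
```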

---

# 14. Performances on the Test Set

To check the performance on the test set, we reuse our best model, model 5.

## A. Performances on the Training Set


In [None]:
train_labels.shape

In [None]:
# Predict
train_pred = model5.predict(train_images)
#train_pred = train_pred.reshape(-1)
train_pred.shape
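
`train_pred` has one column per class, whereas `train_labels` has a single column of class indices. Here is a minimal sketch of how the two could be aligned, shown on made-up logits for three classes rather than on the real predictions:

```python
import numpy as np

# Made-up logits for 4 samples and 3 classes (the real model has 10 classes)
logits = np.array([[ 2.1, -0.3,  0.4],
                   [ 0.2,  1.7, -1.0],
                   [-0.5,  0.1,  3.2],
                   [ 1.0,  0.9,  0.8]])
labels = np.array([[0], [1], [2], [1]])  # same (N, 1) layout as train_labels

pred_classes = logits.argmax(axis=1)     # shape (N,): most likely class
true_classes = labels.reshape(-1)        # shape (N,): flatten (N, 1)
accuracy = (pred_classes == true_classes).mean()
print(pred_classes, accuracy)
```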

## B. Evaluate the Test Set

In [None]:
test_loss, test_acc = model5.evaluate(test_images, test_labels, verbose=2)

## C. Performances on the Test Set

The network outputs one score (logit) per class, so the predictions have shape `(10000, 10)`, while `test_labels` holds class indices with shape `(10000, 1)`. Taking the argmax over the class axis turns the logits into predicted class indices that can be compared with the labels.

In [None]:
import seaborn as sns

# Predict: keep the most likely class for each image
test_pred = model5.predict(test_images).argmax(axis=1)
test_true = test_labels.reshape(-1)

# Confusion Matrix (each row normalized by its number of true samples)
cm_test = confusion_matrix(test_true, test_pred, normalize='true')

# Graph
plt.figure(figsize=(12, 7))
plt.title(f'Accuracy Score on the Test Set: {accuracy_score(test_true, test_pred):.4f}', size=25)
sns.heatmap(cm_test, annot=True, fmt='.2%', cmap='Blues',
            xticklabels=class_names, yticklabels=class_names)
plt.xlabel('Predicted Values', size=20)
plt.ylabel('True Values', size=20)
plt.show()

# To be Continued...