# Image Classification Model using CNN [Tutorial]

### Problem Statement

Use the CIFAR10 data (from [`keras.datasets.cifar10`](https://keras.io/api/datasets/cifar10/)) to build an image classification model using a Convolutional Neural Network (CNN).

### Load and prepare data for modeling

In [None]:
# labels
class_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 
                'dog', 'frog', 'horse', 'ship', 'truck']

In [None]:
from tensorflow.keras.datasets import cifar10

(X_train_raw, y_train_raw), (X_test_raw, y_test_raw) = cifar10.load_data()

X_train_raw.shape, y_train_raw.shape, X_test_raw.shape, y_test_raw.shape

In [None]:
# reshape the target labels into vector form
y_train_raw = y_train_raw.reshape(-1, )
y_test_raw = y_test_raw.reshape(-1, )

y_train_raw.shape, y_test_raw.shape

In [None]:
# create dummies for the target labels
from tensorflow.keras.utils import to_categorical

y_train = to_categorical(y_train_raw, 10)
y_test = to_categorical(y_test_raw, 10)

y_train.shape, y_test.shape
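
As a rough illustration of what `to_categorical` produces, the NumPy sketch below one-hot encodes a few integer labels. The helper name `one_hot` and the sample labels are made up for this example, and this is not how Keras implements the function internally.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (n_samples, num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype="float32")[labels]

# hypothetical labels: frog, truck, truck, deer
sample_labels = [6, 9, 9, 4]
encoded = one_hot(sample_labels, 10)

print(encoded.shape)   # (4, 10)
print(encoded[0])      # the only 1 sits at index 6
```

Each row has exactly one non-zero entry, which is why the label arrays go from shape `(n,)` to `(n, 10)` after this step.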

In [None]:
# normalize the training and test images
from tensorflow.keras.utils import normalize

X_train = normalize(X_train_raw, axis=1)
X_test = normalize(X_test_raw, axis=1)

X_train.shape, y_train.shape, X_test.shape, y_test.shape

* Interesting link:
    * [What is the purpose of `keras utils normalize`?](https://stackoverflow.com/questions/52571752/what-is-the-purpose-of-keras-utils-normalize)
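
To make the effect of `normalize(..., axis=1)` more concrete, here is a small NumPy sketch of L2 normalization along one axis, next to the more common approach of dividing pixel values by 255. The helper `l2_normalize` and the fake image batch are assumptions for illustration only; the sketch mirrors the documented behaviour of `normalize` (default `order=2`) but is not the Keras source.

```python
import numpy as np

def l2_normalize(x, axis=1):
    """Divide x by its L2 norm along `axis`, leaving all-zero slices unchanged."""
    x = np.asarray(x, dtype="float64")
    norms = np.linalg.norm(x, ord=2, axis=axis, keepdims=True)
    norms[norms == 0] = 1.0   # avoid division by zero
    return x / norms

# A fake batch of 2 RGB images of size 4x4; values drawn from 1..255
# so that no slice is entirely zero
rng = np.random.default_rng(0)
fake_images = rng.integers(1, 256, size=(2, 4, 4, 3))

normed = l2_normalize(fake_images, axis=1)
scaled = fake_images / 255.0   # alternative: simple scaling to [0, 1]

# After L2 normalization, every slice along axis 1 has unit length
print(np.linalg.norm(normed, axis=1).round(6))
print(scaled.min(), scaled.max())
```

Note that the two approaches are not equivalent: L2 normalization rescales each slice relative to its own length, while dividing by 255 applies the same factor to every pixel.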

For each label (target value), the training set has an equal number of records (5,000).
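
One way to check the class balance is to count the occurrences of each label with `np.unique`. The sketch below uses a small stand-in array (`toy_labels`, an assumption for this example) so that it runs on its own; in the notebook you would pass the flattened `y_train_raw` instead. The helper name `class_counts` is also made up here.

```python
import numpy as np

class_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                'dog', 'frog', 'horse', 'ship', 'truck']

def class_counts(labels, names):
    """Map each class name to the number of records carrying that label."""
    values, counts = np.unique(labels, return_counts=True)
    return {names[v]: int(c) for v, c in zip(values, counts)}

# Stand-in for y_train_raw: 10 classes with 5 records each
toy_labels = np.repeat(np.arange(10), 5)
counts = class_counts(toy_labels, class_labels)
print(counts)
```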

### Visualize data

In [None]:
import matplotlib.pyplot as plt

fig, axes = plt.subplots(4, 4, 
                         figsize=(6, 6),
                         subplot_kw={'xticks':[], 'yticks':[]})

for i, ax in enumerate(axes.flatten()):
    ax.imshow(X_train_raw[i])
    
    act = class_labels[y_train_raw[i]]
    ax.text(0.05, 0.05, act, color='white', fontsize=10,
            weight='semibold', transform=ax.transAxes)

plt.show();

### Build a NN model with hidden layers

Before we start, a small digression. Training results for Keras neural network models are not easily reproducible, because training involves shuffling and random weight initialization. To keep results consistent, we need to initialize the random seeds before every model run. We will create a function to do this.

In [None]:
from tensorflow import random as tf_random
import numpy as np
import random

def init_seeds(s):
    '''
    Initializes random seeds prior to model training 
    to ensure reproducibility of training results.
    '''
    tf_random.set_seed(s)
    np.random.seed(s)
    random.seed(s)

Let's build a NN model with three hidden layers.

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# initialize seeds
init_seeds(314)

# prepare the model architecture
mlp1 = Sequential()

#--

# fit and validate the model
#--

These results don't look very good. Let's try adding more neurons to the hidden layers.

* Supplementary resource:
    * [Early-stopping](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping)
* Interesting link:
    * [What is the difference between `sparse_categorical_crossentropy` and `categorical_crossentropy`?](https://stackoverflow.com/questions/58565394/what-is-the-difference-between-sparse-categorical-crossentropy-and-categorical-c)
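
The distinction raised in the last link can be sketched in a few lines of NumPy: both losses compute the same value, and they differ only in whether the targets are one-hot vectors or integer class indices. The two functions below are simplified hand-written versions for illustration (the Keras losses also clip probabilities and average over the batch), and the toy probabilities are made up.

```python
import numpy as np

def categorical_crossentropy(y_onehot, probs):
    """Per-sample loss when targets are one-hot vectors."""
    return -np.sum(y_onehot * np.log(probs), axis=1)

def sparse_categorical_crossentropy(y_int, probs):
    """Per-sample loss when targets are integer class indices."""
    return -np.log(probs[np.arange(len(y_int)), y_int])

# Toy predictions for 2 samples over 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1]])
y_int = np.array([0, 1])        # integer labels
y_onehot = np.eye(3)[y_int]     # the same labels, one-hot encoded

loss_cat = categorical_crossentropy(y_onehot, probs)
loss_sparse = sparse_categorical_crossentropy(y_int, probs)
print(loss_cat, loss_sparse)
```

Since this notebook one-hot encodes the targets with `to_categorical`, `categorical_crossentropy` is the matching choice.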

In [None]:
# initialize seeds
init_seeds(314)

mlp2 = Sequential(
    [
        Flatten(input_shape=(32, 32, 3)),
        Dense(1024, activation='relu'),
        Dense(1024, activation='relu'),
        Dense(64, activation='relu'),
        Dense(10, activation='softmax')
    ], 
    name='mlp_3hidden_v2')

# pass metrics as a list; recent Keras versions reject a bare string
mlp2.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

mlp2.fit(X_train, y_train, epochs=5, shuffle=True);

Again, the NN model does not achieve high accuracy. We could try more complicated fully connected models, but there is a better way to improve on this: a CNN.
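
To get a feel for how large the fully connected model already is, we can count its trainable parameters by hand. The sketch below does this for the layer sizes used in `mlp2`; the helper name `dense_param_count` is made up for this example.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Flatten(32*32*3) -> Dense(1024) -> Dense(1024) -> Dense(64) -> Dense(10)
mlp2_sizes = [32 * 32 * 3, 1024, 1024, 64, 10]
total = dense_param_count(mlp2_sizes)
print(f"{total:,}")   # 4,262,602
```

Most of these parameters sit in the first layer, which connects every one of the 3,072 input values to every hidden neuron and ignores the spatial structure of the image.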

### Build a Convolutional Neural Network (CNN) model

In [None]:
from tensorflow.keras.layers import Conv2D, MaxPooling2D

# initialize seeds
init_seeds(314)

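cnn = Sequential()
# NOTE: layers (e.g. Conv2D, MaxPooling2D, Flatten, Dense) must be added before training

# pass metrics as a list; recent Keras versions reject a bare string
cnn.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])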

cnn.fit(X_train, y_train, epochs=10, shuffle=True);
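
The cell above does not show which layers go into `cnn`. As one possible starting point, the pure-Python sketch below traces tensor shapes and parameter counts through a hypothetical stack: `Conv2D(32, 3x3)`, `MaxPooling2D(2x2)`, `Conv2D(64, 3x3)`, `MaxPooling2D(2x2)`, `Flatten`, `Dense(64)`, `Dense(10)`. This architecture is an assumption for illustration, not the one used in the original notebook, and the helpers `conv2d_valid` and `max_pool` are made up; they assume stride 1 and `'valid'` padding, which are the Keras defaults.

```python
def conv2d_valid(shape, filters, kernel=3):
    """Output shape and parameter count of a stride-1 'valid' convolution."""
    h, w, c = shape
    out = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters   # weights + biases
    return out, params

def max_pool(shape, pool=2):
    """Output shape of non-overlapping max pooling (no parameters)."""
    h, w, c = shape
    return (h // pool, w // pool, c)

shape = (32, 32, 3)                            # one CIFAR10 image
shape, p1 = conv2d_valid(shape, filters=32)    # (30, 30, 32)
shape = max_pool(shape)                        # (15, 15, 32)
shape, p2 = conv2d_valid(shape, filters=64)    # (13, 13, 64)
shape = max_pool(shape)                        # (6, 6, 64)

flat = shape[0] * shape[1] * shape[2]          # 2304 values after Flatten
p3 = flat * 64 + 64                            # Dense(64)
p4 = 64 * 10 + 10                              # Dense(10)
total_params = p1 + p2 + p3 + p4

print(shape, flat, total_params)   # (6, 6, 64) 2304 167562
```

Because convolution filters are shared across all image positions, the convolutional layers contribute only a small share of the parameters in this sketch.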

In [None]:
loss, accuracy = cnn.evaluate(X_test, y_test)
# the loss is not a percentage, so print it as a plain number
print(f'Loss: {loss:.4f}, Accuracy: {accuracy:.2%}')

Further fine-tuning could likely improve the model accuracy. For now, let's proceed with this model.

* Reflections:
    * [How to avoid overfitting in Deep Learning Neural Networks?](https://machinelearningmastery.com/introduction-to-regularization-to-reduce-overfitting-and-improve-generalization-error/)
    * [Something that bothers me about deep neural nets.](https://www.johndcook.com/blog/2017/10/09/something-that-bothers-me-about-deep-neural-nets/)

In [None]:
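# predicted probabilities for each class, shape (n_samples, 10)
probs = cnn.predict(X_test)

In [None]:
# grab the predictions (predicted labels) from the model:
# the class with the highest probability, mapped to its name
preds = [class_labels[i] for i in np.argmax(probs, axis=1)]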
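
The step from probabilities to predicted labels can be sketched on a toy example: take the index of the largest probability in each row and look up its class name. The probability values below are made up, and the names `toy_probs` and `toy_preds` are used only in this example.

```python
import numpy as np

class_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                'dog', 'frog', 'horse', 'ship', 'truck']

# Two made-up probability rows, one per image, each summing to 1
toy_probs = np.array([
    [0.05, 0.05, 0.05, 0.60, 0.05, 0.05, 0.05, 0.05, 0.03, 0.02],  # peak at index 3
    [0.02, 0.03, 0.05, 0.05, 0.05, 0.05, 0.05, 0.05, 0.60, 0.05],  # peak at index 8
])

# argmax along axis 1 gives the most likely class index for each row
toy_preds = [class_labels[i] for i in np.argmax(toy_probs, axis=1)]
print(toy_preds)   # ['cat', 'ship']
```

Storing the predictions as class names rather than indices lets them be compared directly with `class_labels[y_test_raw[i]]` in the plotting code that follows.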

### Visualize the predictions

In [None]:
_, axes = plt.subplots(10, 10, figsize=(12, 12))

for i, ax in enumerate(axes.flatten()):
    ax.imshow(X_test_raw[i])
    ax.set_xticklabels([])
    ax.set_yticklabels([])
    pred = preds[i]
    act = class_labels[y_test_raw[i]]
    if pred == act:
        ax.text(0.05, 0.05, preds[i], color='white',
                weight='semibold', transform=ax.transAxes)
    else:
        ax.text(0.05, 0.05, preds[i], color='tomato',
                weight='semibold', transform=ax.transAxes)
plt.show();

## Exercises:

**Exercise 1:** Try different hyper-parameters to improve the model accuracy. 

**Exercise 2:** Capture the "history" of model fitting (i.e., the output of the `model.fit()` function) and plot (1) model accuracy, (2) validation accuracy, (3) model loss, and (4) validation loss, using `matplotlib`. (You can use `epoch` for the x-axis and put `accuracy` (or `loss`) on the y-axis.)