# Digit recognizer using CNN, TensorFlow v2 and Keras

# Intro
The task looks simple: classify images containing the digits 0-9. Because the dataset is fairly clean, getting accuracy above 95% is not hard, but crossing 99.5% is where things get complicated. What I learned here is that there is no single "best" pattern; I was more or less experimenting with the composition of layers.

I started with a simple CNN and then kept adding and changing layers. At the end I also added image augmentation to make the model generalize better: it slightly modifies the training images (rotation, shift, zoom, ...) so the model works better on unseen data. This notebook is as short as possible ;)

# Load libraries and data

In [None]:
# first load libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, Flatten, Dropout, BatchNormalization
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split

In [None]:
# load data
df_train = pd.read_csv('/kaggle/input/digit-recognizer/train.csv')
df_test = pd.read_csv('/kaggle/input/digit-recognizer/test.csv')

# Extract our features and target

In [None]:
# split our data into features & target
X_train = df_train.drop('label', axis=1).values
y_train = df_train['label'].values.reshape(-1,1)

X_test = df_test.values

# Rescale features
Deep networks are sensitive to extreme input values, so scaling the features before training is a must. Here the pixel values are divided by 255 so they fall into the range 0 to 1.

In [None]:
# rescale variables
X_train = X_train.astype('float32')/255.0
X_test = X_test.astype('float32')/255.0

# Data check
Let's look at a few images from the dataset. The images are grayscale (the color channel dimension equals one). The raw pixel values are in the range 0 to 255; after the rescaling above they are in the range 0 to 1.

In [None]:
# check first few images
plt.figure(figsize=(15,15))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.imshow(X_train[i].reshape(28,28), cmap='gray')
    plt.title('Number:' + str(y_train[i][0]))
    plt.axis('off')

# Reshape data
TensorFlow expects image data in the format (batch size, image height, image width, number of channels). In our case that is (-1, 28, 28, 1), where -1 lets NumPy infer the number of images.

In [None]:
# reshape features for tensorflow
X_train = X_train.reshape(-1,28,28,1)
X_test = X_test.reshape(-1,28,28,1)

# one hot encode for target variable
y_train = to_categorical(y_train)
target_count = y_train.shape[1]
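
To make the two steps above concrete, here is a small NumPy-only sketch on toy data. The arrays `flat` and `labels` are made up for illustration, and indexing into `np.eye` is used as a stand-in that, for integer labels, should give the same kind of one-hot matrix as `to_categorical`.

```python
import numpy as np

# toy stand-in for the real data: 3 flattened 28x28 images and their labels
flat = np.zeros((3, 784), dtype='float32')
labels = np.array([[2], [0], [9]])  # same (n, 1) shape as y_train above

# reshape to (samples, height, width, channels)
images = flat.reshape(-1, 28, 28, 1)

# one-hot encode: row i of the identity matrix is the encoding of class i
one_hot = np.eye(10, dtype='float32')[labels.ravel()]

print(images.shape)   # (3, 28, 28, 1)
print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # 1.0 at index 2, zeros elsewhere
```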

# Image augmentation generator
To make our model robust and work well on unseen images, we also include a data augmentation step. It slightly modifies our images during training, each time in a different random way. Our parameters are rotation of up to 10 degrees, zoom of up to 10%, width shift of up to 10% and height shift of up to 10%. Flipping does not make much sense for digits in my opinion; it could for other kinds of images, e.g. brain scans.

## Augmentation performance
Based on my testing, augmentation did not noticeably improve accuracy on the validation set (it stayed at about 99.5%), but it increased accuracy on unseen data from 99.4% to 99.6%, which is really great.

In [None]:
# image augmentation 
datagen = ImageDataGenerator(
    featurewise_center=False,
    samplewise_center=False,
    featurewise_std_normalization=False,
    samplewise_std_normalization=False,
    zca_whitening=False,
    rotation_range=10,
    zoom_range = 0.1,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=False,
    vertical_flip=False)

# fit generator on our train features
# (only required for featurewise statistics or ZCA whitening, which are disabled here)
datagen.fit(X_train)
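
To give a feel for what a shift of 10% means on a 28x28 image, below is a rough NumPy-only sketch. It is not how `ImageDataGenerator` works internally (Keras applies an affine transform and by default fills the gaps with the nearest pixel values); the helper `random_shift` is hypothetical and simply moves the image by whole pixels and pads with zeros.

```python
import numpy as np

def random_shift(image, max_fraction=0.1, rng=None):
    """Shift a 2D image by up to max_fraction of its size, padding with zeros."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    max_dy, max_dx = int(h * max_fraction), int(w * max_fraction)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))

    shifted = np.zeros_like(image)
    # source and destination windows that stay inside the image
    src_y = slice(max(0, -dy), min(h, h - dy))
    dst_y = slice(max(0, dy), min(h, h + dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted, (dy, dx)

# made-up example image: a crude vertical bar in the middle
rng = np.random.default_rng(0)
digit = np.zeros((28, 28), dtype='float32')
digit[10:18, 12:16] = 1.0

moved, (dy, dx) = random_shift(digit, rng=rng)
print('shift (rows, cols):', dy, dx)
print('pixels preserved:', moved.sum() == digit.sum())
```

For a 28 pixel wide image, 10% works out to a shift of at most 2 whole pixels in this simplified version, so a digit drawn near the center stays fully inside the frame.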

# Split data into training & validation set
This step is not strictly needed because TensorFlow supports a validation split, but we are feeding the training data through the image generator, and we also want a separate validation set so we can inspect a few incorrectly classified images and see how good our model is.

In [None]:
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.1, random_state=42)

# Modelling
Model structure:
* 3 convolutional layers with 64 filters, 3x3 kernels and ReLU activation, each followed by batch normalization
* max pooling with pool size 2x2, followed by dropout (to reduce overfitting) and batch normalization
* 3 convolutional layers with 128 filters, 3x3 kernels and ReLU activation, each followed by batch normalization
* max pooling with pool size 2x2, followed by dropout (to reduce overfitting) and batch normalization
* flatten layer to turn the feature maps into a 1-dimensional vector for the dense layers
* 2 dense layers (the 2nd is the output layer), with dropout and batch normalization between them
* RMSprop as optimizer, with learning rate 0.001 and rho 0.99
* ReduceLROnPlateau callback to reduce the learning rate when the validation loss has stopped improving
* EarlyStopping callback to stop training when the training loss has not improved for 5 epochs
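
With `padding='valid'`, each 3x3 convolution trims one pixel from every border, so it is worth checking that the feature maps do not shrink to nothing before the dense layers. The sketch below traces the spatial size through the two blocks of the model defined in the next cell (64 filters in the first block, 128 in the second). The helper names are hypothetical; only the arithmetic matters.

```python
def conv_valid(size, kernel=3):
    # 'valid' padding, stride 1: output = input - kernel + 1
    return size - kernel + 1

def max_pool(size, pool=2):
    # pool size 2 with stride 2: output = floor(input / pool)
    return size // pool

size = 28
trace = [size]
for block in range(2):
    for layer in range(3):
        size = conv_valid(size)
        trace.append(size)
    size = max_pool(size)
    trace.append(size)

# the last convolutional block has 128 filters
flattened = size * size * 128

print(trace)      # [28, 26, 24, 22, 11, 9, 7, 5, 2]
print(flattened)  # 512
```

So the flatten layer should hand a vector of 512 values to the first dense layer.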

In [None]:
model = Sequential()

model.add(Conv2D(filters=64, kernel_size=(3,3), padding='valid', activation='relu', input_shape=(28, 28, 1)))
model.add(BatchNormalization())

model.add(Conv2D(filters=64, kernel_size=(3,3), padding='valid', activation='relu'))
model.add(BatchNormalization())

model.add(Conv2D(filters=64, kernel_size=(3,3), padding='valid', activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Dropout(0.25))
model.add(BatchNormalization())

model.add(Conv2D(filters=128, kernel_size=(3,3), padding='valid', activation='relu'))
model.add(BatchNormalization())

model.add(Conv2D(filters=128, kernel_size=(3,3), padding='valid', activation='relu'))
model.add(BatchNormalization())

model.add(Conv2D(filters=128, kernel_size=(3,3), padding='valid', activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2)))
model.add(Dropout(0.25))
model.add(BatchNormalization())

model.add(Flatten())

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(BatchNormalization())

model.add(Dense(target_count, activation='softmax'))


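# RMSprop optimizer with the settings described above
optimizer = RMSprop(learning_rate=0.001, rho=0.99)
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])

# multiply the learning rate by 0.3 when val_loss has not improved for 2 epochs
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.3, verbose=1, patience=2, min_lr=0.00000001)

# stop training when the training loss has not improved for 5 epochs
callback = EarlyStopping(monitor='loss', patience=5)
# augmented batches for training, untouched images for validation
history = model.fit(datagen.flow(X_train, y_train, batch_size=64), epochs=50, validation_data=(X_val, y_val), verbose=1, callbacks=[reduce_lr, callback])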
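
The effect of `ReduceLROnPlateau` with `factor=0.3` and `patience=2` can be pictured with a small pure-Python simulation. This is a simplified sketch, not the Keras implementation: the function `simulate_reduce_lr` and the loss values are made up, and the `cooldown` and `min_delta` options are ignored.

```python
def simulate_reduce_lr(val_losses, lr=0.001, factor=0.3, patience=2, min_lr=1e-8):
    """Return the learning rate in effect after each epoch."""
    best = float('inf')
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            # improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            # no improvement: count the epoch, reduce once patience is used up
            wait += 1
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history

# made-up validation losses with two plateaus
losses = [0.10, 0.08, 0.09, 0.085, 0.07, 0.075, 0.072]
lrs = simulate_reduce_lr(losses)
for epoch, (loss, lr) in enumerate(zip(losses, lrs), start=1):
    print('epoch {}: val_loss={:.3f} lr={:.6f}'.format(epoch, loss, lr))
```

In this toy run the learning rate drops from 0.001 to about 0.0003 after the first plateau and to about 0.00009 after the second one.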

# Model evaluation
We could build a confusion matrix, but I don't think it is needed. Instead, let's look at the digits that were classified incorrectly and, for those, at the probabilities the model assigned to the other digits. This will help us judge whether our model is average, good, or excellent.

In [None]:
# prepare data for evaluation
y_val_m = y_val.argmax(axis=1)
y_val_hat_prob = model.predict(X_val)
y_val_hat = y_val_hat_prob.argmax(axis=1)
X_val_inc = X_val[y_val_m != y_val_hat, :, :, :]
y_val_inc = y_val_m[y_val_m != y_val_hat]
y_val_hat_inc = y_val_hat[y_val_m != y_val_hat]
y_val_hat_prob_inc = y_val_hat_prob[y_val_m != y_val_hat]
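
For readers who do want the confusion matrix mentioned above, here is a minimal NumPy-only sketch. The function `confusion_matrix_np` is a hypothetical helper and the label arrays are toy data, not results from this model.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, n_classes=10):
    """Rows are actual classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # add 1 at every (actual, predicted) pair; handles repeated pairs correctly
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# toy labels: two mistakes (a 2 predicted as 7, a 9 predicted as 4)
y_true = np.array([0, 1, 2, 2, 9, 9])
y_pred = np.array([0, 1, 2, 7, 9, 4])

cm = confusion_matrix_np(y_true, y_pred)
print(cm[2, 7], cm[9, 4], np.trace(cm))  # 1 1 4
```

With the arrays from the cell above, the call would be `confusion_matrix_np(y_val_m, y_val_hat)`.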

## Show incorrectly classified images
What is the expectation? Simply to let a human look at the images and tell whether it was possible to classify them correctly :) You will find that some of them are almost impossible to assign to their actual value. Would you do better than the model?

In [None]:
plt.figure(figsize=(15,15))
# show at most 16 images, in case fewer were misclassified
for i in range(min(16, len(X_val_inc))):
    plt.subplot(4,4,i+1)
    plt.imshow(X_val_inc[i, :, :, :].reshape(28,28), cmap='gray')
    plt.axis('off')
    plt.title('Actual: {}; Predicted: {}'.format(y_val_inc[i], y_val_hat_inc[i]))

## Probabilities of predicted vs actual values
Let's check whether the model also gave a high probability to the actual value in some cases, or whether it was confident only in the predicted value.

In [None]:
# print at most 10 rows, in case fewer were misclassified
for i in range(min(10, len(y_val_inc))):
    act = y_val_inc[i]
    pred = y_val_hat_inc[i]
    print('Actual: {}; Confidence (act/pred): \t{} - {:.0f}%  \t{} - {:.0f}%'.format(act, act, y_val_hat_prob_inc[i][act]*100, pred, y_val_hat_prob_inc[i][pred]*100))
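
Another way to look at the same question is to take the two most probable classes per image and the gap between them. The sketch below uses made-up softmax outputs in `probs`; with the real data one could pass `y_val_hat_prob_inc` instead. A small gap suggests the model was torn between two digits, a large gap that it was confidently wrong.

```python
import numpy as np

# made-up softmax outputs for two samples (each row sums to 1)
probs = np.array([
    [0.00, 0.00, 0.00, 0.00, 0.55, 0.00, 0.00, 0.00, 0.00, 0.45],
    [0.02, 0.90, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01],
])

# classes sorted by descending probability, keep the best two
order = np.argsort(probs, axis=1)[:, ::-1]
top2 = order[:, :2]

# gap between the most probable and the second most probable class
rows = np.arange(len(probs))
margin = probs[rows, top2[:, 0]] - probs[rows, top2[:, 1]]

print(top2.tolist())  # [[4, 9], [1, 0]]
for (first, second), gap in zip(top2, margin):
    print('top: {}, runner-up: {}, gap: {:.0f}%'.format(first, second, gap * 100))
```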

# Submission
Let's submit our predictions and hope for a result of at least 99.6%!

In [None]:
# predict our test data
y_test_hat = model.predict(X_test).argmax(axis=1)

df_submission = pd.read_csv('/kaggle/input/digit-recognizer/sample_submission.csv')
df_submission['Label'] = y_test_hat.astype('int32')
df_submission.to_csv('Submission.csv', index=False)
print('Submission saved!')