<a class="anchor" id="0"></a>
# Digit Recognizer

## Building neural networks in Keras to recognize digits from the MNIST Dataset

In [None]:
from IPython.display import Image
from IPython.core.display import HTML 
Image(url= "https://eu-images.contentstack.com/v3/assets/blt6b0f74e5591baa03/blt790f1b7ac4e04301/6543ff50fcf447040a6b8dc7/News_Image_(47).png", width=600, height=400)

This notebook is created to submit a prediction for the Digit Recognizer competition on Kaggle.

The objective of this competition is stated below:

*Your goal is to correctly identify digits from a dataset of tens of thousands of handwritten images. We’ve curated a set of tutorial-style kernels which cover everything from regression to neural networks. We encourage you to experiment with different algorithms to learn first-hand what works well and how techniques compare.*

The dataset for this competition is the classic MNIST dataset, described below:

*MNIST ("Modified National Institute of Standards and Technology") is the de facto “hello world” dataset of computer vision. Since its release in 1999, this classic dataset of handwritten images has served as the basis for benchmarking classification algorithms. As new machine learning techniques emerge, MNIST remains a reliable resource for researchers and learners alike.*

This makes MNIST a natural starting point for exploring deep neural networks with Keras. The dataset can also be used to evaluate new classification models as the field of machine learning continues to grow.

Therefore, in this notebook I will build and train a neural network to accurately recognize digits from handwritten images. I hope that this serves as proof that I have a rudimentary understanding of deep neural networks.


**Please consider upvoting if this is helpful to you.**

This notebook builds on the marvelous kernel created by Aditya Soni: [MNIST with Keras for Beginners](https://www.kaggle.com/code/adityaecdrid/mnist-with-keras-for-beginners-99457)

Thank you for helping me understand image recognition via Deep Neural Networks in Keras.

## Import Necessary Modules


In [None]:
# Import useful python libraries
import numpy as np # Numerical computing
import pandas as pd # Data transforming 
import matplotlib.pyplot as plt # Plotting
from collections import Counter
from sklearn.metrics import confusion_matrix
import itertools
import seaborn as sns
from subprocess import check_output

# Print the list of files/directories in the "../input" directory
print(check_output(["ls", "../input"]).decode("utf8"))
%matplotlib inline

# Load the Dataset from CSV

In [None]:
# Load the training dataset from input files
train = pd.read_csv("../input/train.csv")
print(train.shape)
train.head() # Show first five rows of dataset

In [None]:
# Use Counter to count the occurrences of each unique label in the 'label' column of the 'train' DataFrame
z_train = Counter(train['label'])
z_train

In [None]:
# Create a plot counting the number of unique labels in 'label' column
sns.set_palette('viridis')
sns.countplot(x=train['label'])  # pass the column as `x`; recent seaborn versions no longer accept it positionally

label_counts = train['label'].value_counts()

plt.show()

In [None]:
# Load the testing dataset from input files
test= pd.read_csv("../input/test.csv")
print(test.shape)
test.head()

In [None]:
x_train = (train.iloc[:,1:].values).astype('float32') # all pixel values
y_train = train.iloc[:,0].values.astype('int32') # labels only, i.e. the target digits
x_test = test.values.astype('float32')

In [None]:
# Display matplotlib plots inline
%matplotlib inline

# Generate a grid of images to preview the first 20 samples
plt.figure(figsize=(12,10))
x, y = 10, 4
for i in range(20):  
    plt.subplot(y, x, i+1)
    plt.imshow(x_train[i].reshape((28,28)),interpolation='nearest')
plt.show()

# Normalising The Data 

For image data like the MNIST dataset, normalization usually means scaling the pixel values to the range 0 to 1. Pixel values in 8-bit grayscale images range from 0 to 255, with 0 representing black and 255 representing white, so dividing each pixel value by 255 maps them onto that range.

In [None]:
# Divide the training and testing sets by 255 to normalise 
x_train = x_train/255.0
x_test = x_test/255.0
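
As a quick sanity check of the scaling step, here is a minimal sketch on a handful of made-up pixel values (the array `pixels` is hypothetical and not taken from the dataset). It only illustrates that dividing by 255 maps the 8-bit range onto 0 to 1 while keeping the `float32` type.

```python
import numpy as np

# Hypothetical 8-bit grayscale pixel values (not taken from the dataset)
pixels = np.array([0, 64, 128, 255], dtype="float32")

# Divide by the maximum possible intensity to map [0, 255] onto [0, 1]
scaled = pixels / 255.0

print(scaled.min(), scaled.max(), scaled.dtype)
```

The same check can be run on `x_train` and `x_test` after normalising to confirm that no value falls outside the expected range.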

In [None]:
y_train

## Printing the shape of the Datasets

In [None]:
print('x_train shape is:', x_train.shape)
print('x_train # of samples:', x_train.shape[0])
print('x_test shape is:', x_test.shape)
print('x_test # of samples:', x_test.shape[0])



## Reshape the Datasets To Match Keras' Expectations

In [None]:
X_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
X_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
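
To make the reshape concrete, the sketch below uses two made-up flattened images whose values are simply their flat index (an assumption for illustration only). It shows the resulting `(samples, height, width, channels)` shape and that, with NumPy's default row-major order, pixel `(row, col)` comes from flat index `row * 28 + col`.

```python
import numpy as np

# Two hypothetical flattened "images" with 784 values each, filled with their flat index
flat = np.arange(2 * 784, dtype="float32").reshape(2, 784)

# Same call pattern as the notebook: (samples, height, width, channels)
images = flat.reshape(flat.shape[0], 28, 28, 1)

row, col = 3, 5
# Row-major layout: pixel (row, col) comes from flat index row * 28 + col
same_pixel = images[0, row, col, 0] == flat[0, row * 28 + col]

print(images.shape, same_pixel)
```

The trailing dimension of 1 is the single grayscale channel that the convolutional layers expect.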

## Import Required Libraries from Keras

In [None]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.layers import BatchNormalization  # importable from keras.layers in both older and newer Keras versions
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau
from sklearn.model_selection import train_test_split

# Set parameters
batch_size = 64
num_classes = 10
epochs = 20
input_shape = (28, 28, 1)

In [None]:
# Convert integer class labels to one-hot encoded vectors
y_train = keras.utils.to_categorical(y_train, num_classes)
X_train, X_val, Y_train, Y_val = train_test_split(X_train, y_train, test_size = 0.1, random_state=42)
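
For readers new to one-hot encoding, the following is a minimal NumPy sketch of the idea. The helper `one_hot` and the sample labels are hypothetical and written only for illustration; the notebook itself relies on `keras.utils.to_categorical`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype="int64")
    encoded = np.zeros((labels.shape[0], num_classes), dtype="float32")
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded

labels = np.array([3, 0, 9])  # hypothetical digit labels
encoded = one_hot(labels, num_classes=10)

# argmax along the class axis recovers the original integer labels
decoded = np.argmax(encoded, axis=1)
print(encoded.shape, decoded)
```

The `argmax` step is the same operation used later in the notebook to turn one-hot vectors and predicted probabilities back into digit labels.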

# Building a Neural Network Model

Next I will build a convolutional neural network (CNN) model for image classification.

Here's a breakdown of what each part of the model is doing:

1. Model Architecture:
    * The Sequential model is initialized, which represents a linear stack of layers.
    * Convolutional layers (Conv2D) are added to extract features from input images. Each convolutional layer uses rectified linear unit (ReLU) activation and the He normal initializer.
    * Max pooling layers (MaxPool2D) are added to downsample the feature maps.
    * Dropout layers (Dropout) are added for regularization to prevent overfitting by randomly setting a fraction of input units to zero during training.
    * Flatten layer (Flatten) is added to flatten the 2D feature maps into a 1D vector.
    * Dense layers (Dense) are added for classification, with ReLU activation in the hidden layers and softmax activation in the output layer.
    * Batch normalization (BatchNormalization) is applied to stabilize and accelerate the learning process.
2. Model Compilation:
    * The model is compiled with categorical crossentropy loss, RMSprop optimizer, and accuracy metrics.
3. Learning Rate Reduction:
    * A callback ReduceLROnPlateau is defined to reduce the learning rate when a metric has stopped improving.
4. Data Augmentation:
    * An ImageDataGenerator is initialized for real-time data augmentation. It applies random rotation, zooming, and horizontal and vertical shifting to increase the diversity of training samples. Horizontal and vertical flipping are switched off in this configuration.
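
To see how the layer stack shrinks a 28x28 input, here is a small arithmetic sketch. It assumes the configuration used in this notebook: 3x3 kernels with stride 1, the default 'valid' padding for the first two convolutions, 'same' padding for the remaining ones, and 2x2 pooling. The helper functions are hypothetical and are not part of Keras.

```python
def conv_output_size(size, kernel, padding):
    """Spatial output size of a stride-1 convolution."""
    if padding == "same":
        return size
    return size - kernel + 1  # "valid" padding

def pool_output_size(size, pool):
    """Spatial output size of non-overlapping max pooling (stride equals pool size)."""
    return size // pool

size = 28
size = conv_output_size(size, 3, "valid")  # first Conv2D(32)
size = conv_output_size(size, 3, "valid")  # second Conv2D(32)
size = pool_output_size(size, 2)           # first MaxPool2D
size = conv_output_size(size, 3, "same")   # first Conv2D(64)
size = conv_output_size(size, 3, "same")   # second Conv2D(64)
size = pool_output_size(size, 2)           # second MaxPool2D
size = conv_output_size(size, 3, "same")   # Conv2D(128)

# Flatten turns the final size x size x 128 feature maps into one vector
flattened_units = size * size * 128
print(size, flattened_units)
```

These numbers can be compared against the output shapes reported by `model.summary()`.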

In [None]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',kernel_initializer='he_normal',input_shape=input_shape))
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',kernel_initializer='he_normal'))
model.add(MaxPool2D((2, 2)))
model.add(Dropout(0.20))
model.add(Conv2D(64, (3, 3), activation='relu',padding='same',kernel_initializer='he_normal'))
model.add(Conv2D(64, (3, 3), activation='relu',padding='same',kernel_initializer='he_normal'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3, 3), activation='relu',padding='same',kernel_initializer='he_normal'))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.25))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.RMSprop(),
              metrics=['accuracy'])

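# Note: 'val_acc' is the metric name logged by older Keras releases; newer ones log it as 'val_accuracy'
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc', 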
                                            patience=3, 
                                            verbose=1, 
                                            factor=0.5, 
                                            min_lr=0.0001)

datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=15, # randomly rotate images in the range (degrees, 0 to 180)
        zoom_range = 0.1, # Randomly zoom image 
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=False,  # do not flip images horizontally
        vertical_flip=False)  # do not flip images vertically
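
The effect of the learning rate callback can be illustrated with a simplified simulation. This is only a sketch of the halving-with-patience idea, not the Keras implementation: it ignores the callback's `min_delta` and `cooldown` options, the function name is hypothetical, the validation scores are made up, and the starting rate of 0.001 is assumed to be the RMSprop default.

```python
def simulate_reduce_on_plateau(val_scores, lr=0.001, factor=0.5,
                               patience=3, min_lr=0.0001):
    """Return the learning rate in effect after each epoch (simplified)."""
    best = float("-inf")
    wait = 0
    schedule = []
    for score in val_scores:
        if score > best:
            # The monitored metric improved: remember it and reset the counter
            best = score
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: shrink the rate, bounded by min_lr
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule

# Hypothetical validation accuracies: improvement, then a plateau
scores = [0.97, 0.98, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99]
schedule = simulate_reduce_on_plateau(scores)
print(schedule)
```

With `factor=0.5` and `min_lr=0.0001`, a rate that starts at 0.001 can be halved only a few times before it reaches the floor.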

In [None]:
model.summary()

In [None]:
datagen.fit(X_train)
h = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                        epochs=epochs,
                        validation_data=(X_val, Y_val),
                        verbose=1,
                        steps_per_epoch=X_train.shape[0] // batch_size,
                        callbacks=[learning_rate_reduction])
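
The `steps_per_epoch` value follows from simple arithmetic. The sketch below assumes that `train.csv` has 42,000 rows (an assumption; check the output of `train.shape` above) and reuses the 10% validation split and batch size of 64 from this notebook.

```python
import math

n_total = 42000       # assumed row count of train.csv; verify with train.shape
val_fraction = 0.1    # test_size passed to train_test_split
batch_size = 64

# train_test_split rounds the held-out share up to a whole number of samples
n_val = math.ceil(n_total * val_fraction)
n_train = n_total - n_val

# Integer division drops the final partial batch from the step count
steps_per_epoch = n_train // batch_size
leftover = n_train - steps_per_epoch * batch_size
print(n_train, n_val, steps_per_epoch, leftover)
```

Under these assumptions each epoch draws slightly fewer augmented samples than one full pass over the training split.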

## Basic Simple Plot And Evaluation

In [None]:
final_loss, final_acc = model.evaluate(X_val, Y_val, verbose=0)
print("Final loss: {0:.6f}, final accuracy: {1:.6f}".format(final_loss, final_acc))

The evaluation above shows that the trained model is very accurate on the validation dataset.

**Our model predicted the correct digit for 99.4% of the images in the validation dataset.**

## Looking at the Loss

Let's take a closer look at the mistakes our model made. We can use a confusion matrix to quickly identify incorrectly predicted values.

In [None]:
# Look at the confusion matrix
# Note: this code is taken straight from the scikit-learn website; it is a nice way of viewing a confusion matrix.
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    # Normalize before plotting so that the colors and the printed values agree
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    # Show counts as integers and normalized values with two decimals
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

# Predict the values from the validation dataset
Y_pred = model.predict(X_val)
# Convert predicted class probabilities to class labels
Y_pred_classes = np.argmax(Y_pred, axis = 1) 
# Convert one-hot encoded validation labels to class labels
Y_true = np.argmax(Y_val, axis = 1) 
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes) 
# plot the confusion matrix
plot_confusion_matrix(confusion_mtx, classes = range(10))

We see from the confusion matrix that although our model was very accurate, it did make some errors. These are easily identified in the chart above by looking at the light blue shaded cells away from the main diagonal.

For example, the top row shows one instance where the true label was 0 and our model predicted 5, and another instance where the true label was 0 and our model predicted 8.
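
Reading errors off the chart by eye gets tedious for ten classes. As a rough sketch, the off-diagonal cells of a confusion matrix can also be ranked programmatically. The helper `top_confusions` and the 3-class toy matrix below are hypothetical; the same function could be applied to `confusion_mtx` from the cell above.

```python
import numpy as np

def top_confusions(cm, k=3):
    """Return the k largest off-diagonal cells as (true, predicted, count)."""
    errors = np.array(cm, dtype="int64")  # copy, so the input is left untouched
    np.fill_diagonal(errors, 0)           # ignore correct predictions
    order = np.argsort(errors, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(order, errors.shape)
    return [(int(r), int(c), int(errors[r, c])) for r, c in zip(rows, cols)]

# Hypothetical 3-class confusion matrix (rows: true label, columns: predicted label)
toy_cm = [[50, 2, 0],
          [1, 45, 4],
          [0, 3, 47]]
pairs = top_confusions(toy_cm, k=2)
print(pairs)
```

Each tuple reads as "true label, predicted label, number of validation images", which makes the most frequent mix-ups easy to spot.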

## Training and Validation Performance Metrics

We want to take a closer look into how the model's performance changes over the training epochs. This will help us assess if the model is underfitting or overfitting, and whether adjustments to the model or training process are necessary.

An explanation of the evaluation code is provided below:

1. Print History Keys:
    * Print the keys of the history object (h), which contains the training metrics collected during model training. In older Keras versions these keys include 'acc' (training accuracy), 'val_acc' (validation accuracy), 'loss' (training loss), and 'val_loss' (validation loss); newer versions name the accuracy keys 'accuracy' and 'val_accuracy'.
2. Extract Metrics:
    * Extract the training accuracy values from the history object.
    * Extract the validation accuracy values from the history object.
    * Extract the training loss values from the history object.
    * Extract the validation loss values from the history object.
3. Plot Training and Validation Accuracy:
    * Plot the training accuracy values against the number of epochs.
    * Plot the validation accuracy values against the number of epochs.
4. Plot Training and Validation Loss:
    * Plot the training loss values against the number of epochs.
    * Plot the validation loss values against the number of epochs.

In [None]:
print(h.history.keys())
# Older Keras versions log 'acc'/'val_acc'; newer ones log 'accuracy'/'val_accuracy'
acc_key = 'acc' if 'acc' in h.history else 'accuracy'
accuracy = h.history[acc_key]
val_accuracy = h.history['val_' + acc_key]
loss = h.history['loss']
val_loss = h.history['val_loss']
epoch_range = range(len(accuracy))  # separate name so the `epochs` setting is not overwritten
plt.plot(epoch_range, accuracy, 'bo', label='Training accuracy')
plt.plot(epoch_range, val_accuracy, 'b', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
plt.show()
plt.figure()
plt.plot(epoch_range, loss, 'bo', label='Training loss')
plt.plot(epoch_range, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.show()
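
Besides reading the curves, the same history values can be summarized numerically. The sketch below is one possible way to do that: it finds the epoch with the lowest validation loss and the gap between validation and training loss at that epoch. The helper `summarize_history` and the toy values are hypothetical; the dictionary is only shaped like `h.history`.

```python
def summarize_history(history):
    """Return the epoch with the lowest validation loss and the loss gap there."""
    val_loss = history["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    # A large positive gap (validation loss above training loss) hints at overfitting
    gap = val_loss[best_epoch] - history["loss"][best_epoch]
    return best_epoch, gap

# Hypothetical values shaped like `h.history`
toy_history = {"loss": [0.30, 0.10, 0.05, 0.03],
               "val_loss": [0.20, 0.08, 0.06, 0.07]}
best_epoch, gap = summarize_history(toy_history)
print(best_epoch, round(gap, 4))
```

In the toy data the validation loss starts rising after its best epoch while the training loss keeps falling, which is the pattern usually associated with overfitting.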

## Predict Values

Next we predict on the validation dataset with the trained neural network model. After this cell runs, Y_pred_classes will contain the predicted class label for each sample in the validation dataset, and Y_true_classes will contain the true class labels. These can be used to evaluate the model on the validation dataset, for example by calculating accuracy, precision, recall, and F1-score.

In [None]:
# Predict the values from the validation dataset
Y_pred = model.predict(X_val)
# Convert predicted probabilities and one-hot encoded labels to class labels
Y_pred_classes = np.argmax(Y_pred, axis = 1)
Y_true_classes = np.argmax(Y_val, axis = 1)

In [None]:
Y_pred_classes[:5], Y_true_classes[:5]

In [None]:
from sklearn.metrics import classification_report
target_names = ["Class {}".format(i) for i in range(num_classes)]
print(classification_report(Y_true_classes, Y_pred_classes, target_names=target_names))

We can see from the above observations that it is theoretically possible to improve upon this model. However, I am quite content with a validation accuracy above 99% and will happily move forward with this model.
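
For reference, the per-class precision, recall, and F1-score shown in a classification report can be derived from a confusion matrix. The following is a minimal sketch on a made-up 2-class matrix; the helper `per_class_metrics` is hypothetical and assumes every class is predicted at least once, so that no division by zero occurs.

```python
import numpy as np

def per_class_metrics(cm):
    """Compute precision, recall and F1 per class (rows: true, columns: predicted)."""
    cm = np.asarray(cm, dtype="float64")
    true_pos = np.diag(cm)
    precision = true_pos / cm.sum(axis=0)  # column sums: everything predicted as the class
    recall = true_pos / cm.sum(axis=1)     # row sums: everything that truly is the class
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical 2-class confusion matrix
toy_cm = [[8, 2],
          [1, 9]]
precision, recall, f1 = per_class_metrics(toy_cm)
print(precision, recall, f1)
```

Comparing these values per digit shows which classes the model confuses most, something a single accuracy figure hides.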

# Final Predictions

Finally we perform predictions on the test dataset using the trained Neural Network Model and save the predictions to a CSV file. 

In [None]:
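# `predict_classes` is not available in recent Keras versions; take the argmax of the predicted probabilities instead
predicted_classes = np.argmax(model.predict(X_test), axis=1)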
submissions=pd.DataFrame({"ImageId": list(range(1,len(predicted_classes)+1)),
                         "Label": predicted_classes})
submissions.to_csv("submission.csv", index=False, header=True)

In [None]:
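model.save('my_model_1.h5')  # save architecture, weights and optimizer state to an HDF5 file
json_string = model.to_json()  # serialize the architecture only (no weights) as a JSON string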

## Thank you!