# MNIST Classification using TensorFlow

Training a deep learning model on handwritten digits using TensorFlow and Keras to accurately predict the labels of the test set.

Kaggle link to the challenge - https://www.kaggle.com/c/digit-recognizer

## 0: Importing libraries and the dataset

In [1]:
# Importing the libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Sequential

import warnings
warnings.filterwarnings("ignore")

In [1]:
# Reading the data
train = pd.read_csv("../input/digit-recognizer/train.csv")
test = pd.read_csv("../input/digit-recognizer/test.csv")

In [1]:
# Separating features and label
y_train = train["label"]
X_train = train.drop(["label"], axis=1)

## 1: Exploratory Data Analysis

Performing EDA on the training dataset

In [1]:
# Size of the training data
X_train.shape

In [1]:
X_train.head(5)

In [1]:
# Visualize the number of digits under each class
plt.figure(figsize=(9, 5))
sns.countplot(x=y_train)  # pass the labels by keyword; recent seaborn versions do not accept them positionally
plt.title("Count plot");

In [1]:
y_train.value_counts()

The class labels appear to be roughly evenly distributed.
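
The balance can also be expressed as numbers rather than read off the plot. The sketch below is a minimal illustration on synthetic labels (the `labels` series is a made-up stand-in for `y_train`, not the real data): it computes the share of each class and the ratio between the most and least frequent class.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for y_train (assumption: the real labels come from train.csv)
rng = np.random.default_rng(0)
labels = pd.Series(rng.integers(0, 10, size=5000), name="label")

# Share of each digit class, sorted by digit
class_share = labels.value_counts(normalize=True).sort_index()

# Ratio between the most and least frequent class; 1.0 means perfectly balanced
imbalance_ratio = class_share.max() / class_share.min()

print(class_share.round(3))
print(f"max/min class ratio: {imbalance_ratio:.2f}")
```

A ratio close to 1 supports treating the classes as balanced, in which case plain accuracy is a reasonable metric.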

In [1]:
# Displaying some training samples
img_matrix = X_train.iloc[100].to_numpy()
img_matrix = img_matrix.reshape(28, 28)

plt.imshow(img_matrix, cmap=plt.cm.binary)
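
The reshape above works because each row stores the image in row-major order. A small sketch on a hypothetical array (`flat` is made up so that value `k` stands in for pixel `k`) shows where a given pixel ends up:

```python
import numpy as np

# Hypothetical flattened image: 784 values, where value k stands in for pixel k
flat = np.arange(784)

# Row-major reshape, the same operation as reshape(28, 28) above
img = flat.reshape(28, 28)

# Pixel k lands at row k // 28, column k % 28
k = 100
row, col = divmod(k, 28)
assert img[row, col] == flat[k]
print(row, col)  # prints 3 16
```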

## 3: Feature Engineering

Feature engineering is an essential step of data preprocessing; it helps the model converge much faster.
We will be performing 3 main steps here:
1. Normalization - We will perform greyscale normalization to reduce the effect of illumination differences; it also helps the model train faster.
2. Reshape - We will reshape each row of 784 pixel values to 28x28. An extra channel dimension, i.e. 28x28x1 (greyscale data), can be added for Keras layers that need it; the `Flatten` and `Dense` layers used below work with 28x28 input.
3. Label Encoding - encoding labels to one-hot vectors. This is only needed with the `categorical_crossentropy` loss; the `sparse_categorical_crossentropy` loss used below takes the integer labels directly.
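
The three steps listed above can be sketched in plain NumPy. This is an illustration on synthetic data, not the notebook's exact code: the `pixels` and `labels` arrays are made up, dividing by 255 is one common way to normalize (an assumption here, since the notebook calls `tf.keras.utils.normalize`), and the one-hot step uses `np.eye` as a stand-in for a library helper.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins (assumption): 4 flattened images with values 0..255, plus labels
pixels = rng.integers(0, 256, size=(4, 784)).astype("float32")
labels = np.array([3, 0, 9, 3])

# 1. Normalization: scale pixel values into [0, 1]
scaled = pixels / 255.0

# 2. Reshape: 784 values -> 28x28 image with one greyscale channel
images = scaled.reshape(-1, 28, 28, 1)

# 3. Label encoding: row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(10, dtype="float32")[labels]

print(images.shape, one_hot.shape)  # (4, 28, 28, 1) (4, 10)
```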

In [1]:
# Normalization
X_train = tf.keras.utils.normalize(X_train, axis=1)
test = tf.keras.utils.normalize(test, axis=1)
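
With its default `order=2`, `tf.keras.utils.normalize` performs L2 normalization along the given axis, so each image row is divided by its own Euclidean norm rather than by 255. The following NumPy sketch is meant as a rough equivalent for illustration; `l2_normalize_rows` and the tiny `batch` array are hypothetical names and data.

```python
import numpy as np

def l2_normalize_rows(x):
    """Divide each row by its Euclidean norm; all-zero rows are left unchanged."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for blank images
    return x / norms

# Hypothetical 2-image batch with 4 pixels each, for illustration only
batch = np.array([[0.0, 3.0, 4.0, 0.0],
                  [0.0, 0.0, 0.0, 0.0]])
normalized = l2_normalize_rows(batch)
print(normalized)
```

After this step every non-blank image has unit length, which removes differences in overall brightness between images.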

In [1]:
# Reshape
X_train = X_train.values.reshape(-1, 28, 28)
test = test.values.reshape(-1, 28, 28)

## 4: Train Test Split

Splitting `X_train` into training data (90%) and validation data (10%) using `sklearn.model_selection.train_test_split`

In [1]:
X_train, X_eval, y_train, y_eval = train_test_split(X_train, y_train, test_size=0.1, random_state=42)

In [1]:
X_train.shape, X_eval.shape, y_train.shape, y_eval.shape
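
One optional variation is a stratified split, which keeps the class proportions identical in both parts. The sketch below uses synthetic arrays (`X` and `y` are made-up stand-ins) and the `stratify` parameter of `train_test_split`; the notebook itself does not stratify, so treat this as an alternative rather than the method used here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
# Synthetic stand-ins (assumption): 1000 images of 28x28 and balanced labels 0..9
X = rng.random((1000, 28, 28))
y = np.repeat(np.arange(10), 100)

# stratify=y keeps the per-class proportions the same in both parts
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.1, random_state=42, stratify=y
)

print(X_tr.shape, X_val.shape)  # (900, 28, 28) (100, 28, 28)
print(np.bincount(y_val))       # number of validation samples per digit
```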

## 5: Creating, Compiling and Training the model

In [1]:
# 1. Build the model
model = Sequential()
model.add(Flatten())  # 28x28 image -> 784-value vector
model.add(Dense(128, activation=tf.nn.relu))
model.add(Dense(128, activation=tf.nn.relu))
# Softmax turns the 10 outputs into class probabilities, as the loss below expects
model.add(Dense(10, activation=tf.nn.softmax))

# 2. Compile the model
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# 3. Train the model
history = model.fit(X_train, y_train,
                    validation_data=(X_eval, y_eval),
                    epochs=100)
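
The `sparse_categorical_crossentropy` loss takes integer labels and, by default, expects each output row to be a probability distribution over the 10 classes, which is what a softmax output provides. The NumPy sketch below illustrates both pieces; the function names and the `logits` values are hypothetical and only mirror the idea, not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true integer class."""
    true_class_probs = probs[np.arange(len(labels)), labels]
    return -np.log(true_class_probs).mean()

# Hypothetical raw scores for 2 samples over 10 digit classes
logits = np.zeros((2, 10))
logits[0, 7] = 5.0  # first sample strongly favours digit 7
labels = np.array([7, 2])

probs = softmax(logits)
loss = sparse_categorical_crossentropy(probs, labels)
print(probs.sum(axis=1), loss)
```

Each row of `probs` sums to 1, and the loss shrinks as more probability is placed on the true class.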

## 6: Visualize Train and Validation Results

In [1]:
history_df = pd.DataFrame(history.history)
history_df.loc[:, ["loss", "accuracy"]].plot()
history_df.loc[:, ["val_loss", "val_accuracy"]].plot()

In [1]:
# Summary of the scores
history_df.describe()
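
Beyond summary statistics, the history table can show which epoch generalized best. A minimal sketch follows, using a made-up `fake_history` dictionary shaped like `history.history` (the numbers are invented for illustration and are not results from this notebook):

```python
import pandas as pd

# Hypothetical training history (made-up numbers), shaped like history.history
fake_history = {
    "loss": [0.40, 0.20, 0.10, 0.05],
    "accuracy": [0.88, 0.94, 0.97, 0.99],
    "val_loss": [0.30, 0.18, 0.16, 0.19],
    "val_accuracy": [0.91, 0.95, 0.96, 0.955],
}
hist = pd.DataFrame(fake_history)

# Index of the epoch with the lowest validation loss (0-based)
best_epoch = hist["val_loss"].idxmin()

# A growing gap between validation and training loss suggests overfitting
gap = hist["val_loss"] - hist["loss"]

print(best_epoch, hist.loc[best_epoch, "val_accuracy"])
print(gap)
```

With 100 training epochs, a check like this can indicate whether the later epochs still improve the validation scores.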

Since the validation accuracy is about 97.9%, the same model will be used to submit the results.

## 7: Submit Results

In [1]:
results = model.predict(test)
results = np.argmax(results, axis=1)

# ImageId runs from 1 to the number of test images (28,000)
submission = pd.DataFrame({"ImageId": np.arange(1, len(results) + 1), "Label": results})

submission.to_csv("mnist_submission.csv", index=False)
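
Before uploading, it can help to run a few basic checks on the submission table. The sketch below builds a small frame from random scores (`scores` is a made-up stand-in for `model.predict(test)`, and `submission_sketch` is a hypothetical name chosen so it does not overwrite `submission`):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Stand-in for model.predict(test): 5 rows of scores over 10 classes (assumption)
scores = rng.random((5, 10))

predicted = np.argmax(scores, axis=1)  # most likely digit per row
submission_sketch = pd.DataFrame({
    "ImageId": np.arange(1, len(predicted) + 1),  # ids start at 1
    "Label": predicted,
})

# Basic checks before writing the CSV
assert list(submission_sketch.columns) == ["ImageId", "Label"]
assert submission_sketch["Label"].between(0, 9).all()
assert submission_sketch["ImageId"].is_unique
print(submission_sketch)
```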

## 8: Random Validation

A sample from the test set is displayed to check it against the prediction made by the model.

In [1]:
plt.imshow(test[10].reshape(28, 28))

In [1]:
submission.iloc[10]
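
The cells above inspect a fixed index. To spot-check several images chosen at random, indices can be drawn with a NumPy generator. This is a small sketch; the seed and sample size are arbitrary choices, and `sample_idx` and `image_ids` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(seed=7)  # seed is arbitrary, fixed only for repeatability
n_test = 28000                       # 28,000 test images in this challenge

# Draw 5 distinct row indices to inspect
sample_idx = rng.choice(n_test, size=5, replace=False)

# ImageId in the submission is 1-based, so row i corresponds to ImageId i + 1
image_ids = sample_idx + 1
print(sample_idx, image_ids)
```

Each drawn index `i` can then be passed to `plt.imshow(test[i])` and compared with `submission.iloc[i]`.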