Handwritten Digit Recognition is a classic problem in Machine Learning and Computer Vision. It involves recognizing the digits 0 through 9 in images or scanned documents. This project tackles the task with a neural network.

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

The dataset contains 42,000 rows, where each row represents one image.
The first column (label) indicates the digit (0-9), and the remaining 784 columns hold the pixel values of the image.
The columns are separated into X (pixel values) and y (labels).
X has 42,000 samples with 784 features (28x28 pixels), and y has 42,000 labels.

In [4]:
train_data = pd.read_csv("data/Train.csv")
print("Shape of train_data:", train_data.shape)

X = train_data.iloc[:, 1:] # Get pixel values
y = train_data.iloc[:, 0] # Get labels

print("Shape of X after separating features:", X.shape)

Shape of train_data: (42000, 785)
Shape of X after separating features: (42000, 784)


We can see that X has one fewer column than train_data, because the label column has been separated out. We will now preprocess the data.
We will do the following:
- Ensure X is in the correct format, i.e. a Pandas DataFrame.
- Convert all pixel values to numeric format and replace any missing values with 0.
- Normalize the pixel values to [0, 1] by dividing them by 255.0. Keeping the inputs on a small, consistent scale generally helps training converge.
- Reshape the data to (samples, 28, 28, 1), which adds a channel dimension and matches the layout commonly used for image inputs in Keras.

In [5]:
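# Make sure X is a DataFrame so the column-wise operations below work
if not isinstance(X, pd.DataFrame):
    X = pd.DataFrame(X)

# Coerce non-numeric entries to NaN, then replace missing values with 0
X = X.apply(pd.to_numeric, errors='coerce')
X = X.fillna(0)
# Scale pixel intensities from [0, 255] to [0, 1]
X = X.values / 255.0
# Turn each 784-pixel row into a 28x28 image with a single channel
X = X.reshape(-1, 28, 28, 1)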
print("Shape of X after reshaping:", X.shape)

Shape of X after reshaping: (42000, 28, 28, 1)
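
To make the normalize-and-reshape step easier to picture, here is a minimal sketch on a tiny synthetic batch. The array `flat` and its random contents are assumptions made up for illustration; they are not taken from Train.csv.

```python
import numpy as np

# Hypothetical toy batch: 2 flattened "images" of 28 * 28 = 784 pixels each,
# with integer intensities in the usual 0-255 range.
rng = np.random.default_rng(0)
flat = rng.integers(0, 256, size=(2, 784))

# Scale to [0, 1], then restore the 2-D layout and add a channel axis.
scaled = flat / 255.0
images = scaled.reshape(-1, 28, 28, 1)

print(images.shape)  # (2, 28, 28, 1)

# All values now lie in [0, 1].
in_range = bool(images.min() >= 0.0 and images.max() <= 1.0)
print(in_range)  # True

# reshape fills rows first: image row 1 is pixels 28..55 of the flat vector.
rows_match = bool(np.array_equal(images[0, 1, :, 0], scaled[0, 28:56]))
print(rows_match)  # True
```

The last check shows why a plain reshape is enough here, assuming the CSV stores each image row by row: consecutive groups of 28 values become consecutive image rows.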


For multi-class classification, neural networks are usually trained with labels in the one-hot encoding format. It is a way to convert categorical data into numerical form so that models can process it correctly: each label becomes a vector with one entry per class, where the entry for the label's own class is 1 and all the others are 0 (e.g. if the categorical variable is Color = {Red, Blue, Green}, then Red becomes [1, 0, 0], Blue becomes [0, 1, 0], and Green becomes [0, 0, 1]). Here there are 10 classes (i.e. the digits 0 through 9), so each label becomes a vector of length 10.

In [6]:
y = to_categorical(y, num_classes=10)
print("Shape of y after one-hot encoding:", y.shape)

Shape of y after one-hot encoding: (42000, 10)
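
As a rough illustration of what one-hot encoding does to a few labels, the sketch below builds the vectors by hand with NumPy. It is not the Keras implementation, and the `labels` array is a made-up example.

```python
import numpy as np

# Hypothetical digit labels for three images.
labels = np.array([3, 0, 9])

# Row i of the 10x10 identity matrix is the one-hot vector for class i,
# so indexing with the labels picks one row per sample.
one_hot = np.eye(10, dtype="float32")[labels]

print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]

# argmax recovers the original integer label from each one-hot row.
recovered = one_hot.argmax(axis=1)
print(recovered)      # [3 0 9]
```

The `argmax` step is also how a predicted probability vector is usually turned back into a digit.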


Now, we will split the data! 80% of the data will be used for training, and the remaining 20% for validation.

In [7]:
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
print("X_train shape:", X_train.shape)

X_train shape: (33600, 28, 28, 1)
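
The sizes follow directly from the 80/20 ratio: 20% of 42,000 is 8,400 validation samples, which leaves 33,600 for training. The sketch below reproduces that arithmetic with a simple shuffled index split. The helper `shuffled_split` is hypothetical and written only for illustration; it is not how scikit-learn implements `train_test_split`, and it will not select the same rows.

```python
import numpy as np

def shuffled_split(n_samples, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx) for a shuffled hold-out split."""
    n_val = int(round(val_fraction * n_samples))
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)  # random ordering of all row indices
    return order[n_val:], order[:n_val]

train_idx, val_idx = shuffled_split(42000)
print(len(train_idx), len(val_idx))  # 33600 8400

# No sample appears in both sets, and together they cover every row.
overlap = np.intersect1d(train_idx, val_idx)
print(overlap.size)  # 0
```

Fixing the seed, like `random_state=42` in the cell above, makes the split repeatable across runs.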
