# MNIST Digit Recognizer - Simple Neural Network Classification

**Authors: Clement, Calvin, Tilova**

---

Welcome to the second notebook by **Tequila Chicas**! We will classify images of handwritten numbers by their corresponding digits. This project follows the guidelines and uses the dataset provided by the Kaggle competition [here](https://www.kaggle.com/competitions/digit-recognizer/overview).

## Introduction  

In this notebook we will fit a simple neural network to the MNIST dataset to see how well it can predict the digits.

<a id = 'toc'></a>
    
## Table of Contents
---
1. [Simple Neural Network](#simple)

**Importing Libraries**

In [16]:
import numpy as np
import pandas as pd

# data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Train_Test_Split
from sklearn.model_selection import train_test_split

# Scaling
from sklearn.preprocessing import StandardScaler

# PyTorch
import torch
import torch.nn as nn

# suppress warning messages
import warnings
warnings.filterwarnings('ignore')

<a id = 'simple'></a>
### 1. Simple Neural Network
---
Load the train and test set CSV files.

In [4]:
df_train = pd.read_csv('../data/train.csv')
df_test = pd.read_csv('../data/test.csv')
df_train.shape, df_test.shape

((42000, 785), (28000, 784))

We need to extract our independent (X) and dependent (y) variables from the dataset as NumPy arrays. The first column holds the digit label, and the remaining 784 columns hold the pixel values.

In [10]:
X = df_train.iloc[:, 1:].to_numpy()
y = df_train.iloc[:, 0].to_numpy()

# sanity check
print(X.shape, y.shape)

(42000, 784) (42000,)


We will use **train_test_split()** to split our dataset into training and validation sets.
- The validation set is 25% of the data.
- `stratify=y` keeps the class distribution the same in both the training and validation sets.

In [11]:
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, stratify=y)
X_train.shape, y_train.shape

((31500, 784), (31500,))
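To see what stratification does, the sketch below compares class proportions across the two splits. It uses synthetic, deliberately imbalanced labels (`y_demo`, `X_demo` and `random_state=0` are assumptions for illustration, not part of the notebook's data).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced labels standing in for the digit column (assumption)
rng = np.random.default_rng(0)
y_demo = rng.choice(10, size=10_000, p=[0.19] + [0.09] * 9)
X_demo = rng.normal(size=(10_000, 4))

X_tr, X_va, y_tr, y_va = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)

# Fraction of each class in the training and validation splits
p_tr = np.bincount(y_tr, minlength=10) / len(y_tr)
p_va = np.bincount(y_va, minlength=10) / len(y_va)

# With stratify, the per-class fractions should differ only by rounding
max_gap = np.abs(p_tr - p_va).max()
print(max_gap)
```

Without `stratify`, the gap would depend on the random draw and could be noticeably larger for rare classes.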

We can start by implementing a simple `linear` network.
- Since the network is linear and trained with gradient-based methods, scaling the data first generally gives better results.

In [18]:
# instantiate standard scaler
ss = StandardScaler()

# fit and transform training
X_train = ss.fit_transform(X_train)

# ONLY transform X_val, reusing the statistics learned from X_train
X_val = ss.transform(X_val)
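The fit-on-train, transform-on-validation pattern can be sketched on a toy array. The data here is made up (an assumption for illustration): two random "pixel" columns plus one constant column, similar to an image border pixel that is always 0.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Toy stand-ins for the pixel matrices (assumption, not the notebook's data)
train_demo = rng.integers(0, 256, size=(100, 3)).astype(np.float64)
val_demo = rng.integers(0, 256, size=(20, 3)).astype(np.float64)
train_demo[:, 2] = 0.0  # constant column
val_demo[:, 2] = 0.0

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_demo)  # learns mean/std from train
val_scaled = scaler.transform(val_demo)          # reuses the train mean/std

print(train_scaled.mean(axis=0))  # approximately 0 per column
print(train_scaled.std(axis=0))   # 1 for varying columns, 0 for the constant one
print(val_scaled.mean(axis=0))    # close to 0, but not exactly
```

The scaler leaves the zero-variance column at 0 rather than dividing by zero, so constant pixels do not produce NaN values.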

Now we need to convert the NumPy arrays into torch tensors.
- Using float32 for the features to cut down memory usage (it is also the default dtype of `nn.Linear` weights).
- Using torch.long for the classification labels.

In [19]:
# Independent Variables
X_train = torch.tensor(X_train, dtype=torch.float32)
X_val = torch.tensor(X_val, dtype=torch.float32)

# Dependent Variable
y_train = torch.tensor(y_train, dtype=torch.long)
y_val = torch.tensor(y_val, dtype=torch.long)

# Sanity Check
print(X_train.shape, y_train.shape, X_val.shape, y_val.shape)

torch.Size([31500, 784]) torch.Size([31500]) torch.Size([10500, 784]) torch.Size([10500])
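As a rough check of the dtype choices, the sketch below compares the memory footprint of float64 and float32 tensors. The small arrays (`features64`, `labels`) are placeholders, not the notebook's data.

```python
import numpy as np
import torch
import torch.nn as nn

# Placeholder arrays (assumption): 4 rows of 784 float64 features and 4 labels
features64 = np.zeros((4, 784), dtype=np.float64)
labels = np.array([3, 1, 4, 1])

t64 = torch.tensor(features64)                       # keeps float64
t32 = torch.tensor(features64, dtype=torch.float32)  # cast to float32
t_labels = torch.tensor(labels, dtype=torch.long)    # integer class labels

bytes64 = t64.element_size() * t64.nelement()
bytes32 = t32.element_size() * t32.nelement()
print(bytes64, bytes32)  # float32 uses half the bytes of float64

# nn.Linear parameters are float32 by default, so the inputs need to match
layer_dtype = nn.Linear(784, 10).weight.dtype
print(layer_dtype, t_labels.dtype)
```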


In [21]:
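# Simple neural net layers: 784 input pixels -> 100 hidden units -> 10 digit classes
# (no activation function between the two linear layers)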
simple_neural_net = nn.Sequential(
    nn.Linear(784, 100),
    nn.Linear(100, 10)
    )
simple_neural_net

Sequential(
  (0): Linear(in_features=784, out_features=100, bias=True)
  (1): Linear(in_features=100, out_features=10, bias=True)
)
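A quick sketch of what this architecture contains: the number of trainable parameters, and a check that two linear layers stacked without an activation are equivalent to a single linear map. The seed, the random input `x`, and the names `W` and `b` are assumptions for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(784, 100), nn.Linear(100, 10))

# (784*100 + 100) + (100*10 + 10) trainable parameters
n_params = sum(p.numel() for p in net.parameters())
print(n_params)

with torch.no_grad():
    W1, b1 = net[0].weight, net[0].bias
    W2, b2 = net[1].weight, net[1].bias

    # Collapse both layers into one: y = W x + b
    W = W2 @ W1        # shape (10, 784)
    b = W2 @ b1 + b2   # shape (10,)

    x = torch.randn(5, 784)
    same = torch.allclose(net(x), x @ W.T + b, atol=1e-4)

print(same)
```

Because of this equivalence, the model as defined can only learn linear decision boundaries; a non-linearity between the layers would be needed for the hidden layer to add expressive power.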

In [23]:
# take the first training image, keeping the batch dimension: shape (1, 784)
single_row = X_train[[0], :]

In [24]:
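# forward pass through the untrained network gives one raw score (logit) per digit
output_logit_values = simple_neural_net(single_row)
# softmax over the class dimension turns the logits into probabilities
output_values = nn.functional.softmax(output_logit_values, dim=1)
print(output_values)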

tensor([[0.0877, 0.0890, 0.0941, 0.0933, 0.0747, 0.1531, 0.0928, 0.0652, 0.1357,
         0.1144]], grad_fn=<SoftmaxBackward0>)
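The softmax output can be read as one probability per digit. Since the network is untrained, the values sit near 0.1 each. A minimal sketch of turning such an output into a predicted digit, using a freshly initialized network and a random input row (both assumptions, so the numbers will differ from the output above):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(784, 100), nn.Linear(100, 10))
row = torch.randn(1, 784)  # random stand-in for one scaled image

with torch.no_grad():  # no gradients needed for inference
    logits = net(row)
    probs = torch.softmax(logits, dim=1)

# Probabilities across the 10 classes sum to 1
print(probs.sum().item())

# The predicted digit is the index of the largest probability
predicted_digit = probs.argmax(dim=1).item()
print(predicted_digit)
```

Softmax preserves the ordering of the logits, so taking `argmax` of the logits directly gives the same prediction.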
