# MNIST Digit Recognizer - Simple Neural Network Classification

**Authors: Clement, Calvin, Tilova**

---

Welcome to the second notebook by **Tequila Chicas**! We will be classifying images of handwritten numbers by their corresponding digits. This project follows the guidelines and uses the dataset provided by the Kaggle competition [here](https://www.kaggle.com/competitions/digit-recognizer/overview).

## Introduction  

In this notebook we will fit a simple neural network to the MNIST dataset to see how well it can predict the digits.

<a id = 'toc'></a>
    
## Table of Contents
---
1. [Simple Neural Network](#simple)

**Importing Libraries**

In [1]:
import numpy as np
import pandas as pd

# data visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Train_Test_Split
from sklearn.model_selection import train_test_split

# Scaling
from sklearn.preprocessing import StandardScaler

# Metrics
from sklearn.metrics import accuracy_score

# PyTorch
import torch
import torch.nn as nn

# Progress bar from tqdm
from tqdm.notebook import tnrange

# ignores the filter warnings
import warnings
warnings.filterwarnings('ignore')

<a id = 'simple'></a>
### 1. Simple Neural Network
---
Loading the train and test set CSV files.

In [2]:
df_train = pd.read_csv('../data/train.csv')
df_test = pd.read_csv('../data/test.csv')
df_train.shape, df_test.shape

((42000, 785), (28000, 784))

We need to set our independent (X) and dependent (y) variables as `numpy arrays` from the dataset.

In [3]:
X = df_train.iloc[:, 1:].to_numpy()
y = df_train.iloc[:, 0].to_numpy()

# sanity check
print(X.shape, y.shape)

(42000, 784) (42000,)


We will perform a **train_test_split()** to split our dataset into train and validation sets.
- The validation set is 25% of the data.
- `stratify=y` keeps the class distribution the same in both the training and validation sets.
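To see what stratification does, here is a minimal sketch on made-up, deliberately imbalanced labels (`X_demo`, `y_demo` and the fixed `random_state` are assumptions for illustration, not part of our pipeline). It compares the class proportions in the two splits.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 70% class 0, 20% class 1, 10% class 2
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1, 2], [700, 200, 100])
X_demo = rng.normal(size=(len(y_demo), 4))

# Same call pattern as in the notebook, with a seed so the sketch is repeatable
X_tr, X_va, y_tr, y_va = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)

# Class proportions in each split
train_props = np.bincount(y_tr) / len(y_tr)
val_props = np.bincount(y_va) / len(y_va)
print("train:", train_props)
print("val:  ", val_props)
```

With `stratify` set, the two rows of proportions should come out essentially identical.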

In [4]:
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, stratify=y)
X_train.shape, y_train.shape

((31500, 784), (31500,))

We can start by implementing a simple network built from `linear` layers.
- Networks like this generally train better with gradient descent when the input features are standardized, so we scale the data first.
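The scaling pattern we rely on is "fit on train, transform on validation". Below is a minimal sketch on made-up pixel-like data (`train_demo` and `val_demo` are hypothetical arrays). One column is held constant to mimic a border pixel that is always blank, which `StandardScaler` leaves at zero rather than dividing by zero.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical pixel-like data in [0, 255]; the last column is always 0
train_demo = rng.integers(0, 256, size=(100, 3)).astype(np.float64)
train_demo[:, 2] = 0.0
val_demo = rng.integers(0, 256, size=(20, 3)).astype(np.float64)
val_demo[:, 2] = 0.0

scaler = StandardScaler()
train_scaled = scaler.fit_transform(train_demo)  # learn mean/std from training rows only
val_scaled = scaler.transform(val_demo)          # reuse the training statistics

print("train mean:", train_scaled.mean(axis=0).round(3))
print("train std: ", train_scaled.std(axis=0).round(3))
print("val mean:  ", val_scaled.mean(axis=0).round(3))
```

The validation means will be close to, but not exactly, zero, because the validation rows are scaled with statistics they did not contribute to.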

In [5]:
# instantiate standard scaler
ss = StandardScaler()

# fit and transform training
X_train = ss.fit_transform(X_train)

# ONLY transform X_val (reuse the statistics learned from the training set)
X_val = ss.transform(X_val)

Now we need to convert the NumPy `arrays` into torch `tensors`.
- Using float32 to cut down memory usage
- Using torch.long for classification labels.
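A small sketch of why these dtypes matter, using made-up data (`features_np` and `labels_np` are hypothetical). It assumes PyTorch's default floating-point dtype has not been changed, so `nn.Linear` weights are float32, and it relies on `nn.CrossEntropyLoss` taking class-index targets as `torch.long`.

```python
import numpy as np
import torch
import torch.nn as nn

features_np = np.random.default_rng(0).normal(size=(4, 784))  # NumPy floats are float64
labels_np = np.array([3, 1, 4, 1])                            # integer class labels

features = torch.tensor(features_np, dtype=torch.float32)
labels = torch.tensor(labels_np, dtype=torch.long)

# float32 uses 4 bytes per value, half of float64's 8
print(features_np.itemsize, features.element_size())

layer = nn.Linear(784, 10)                    # weights are float32 by default
logits = layer(features)                      # input dtype matches the weights
loss = nn.CrossEntropyLoss()(logits, labels)  # class-index targets are torch.long
print(features.dtype, labels.dtype, logits.shape)
```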

In [6]:
# Independent Variables
X_train = torch.tensor(X_train, dtype=torch.float32)
X_val = torch.tensor(X_val, dtype=torch.float32)

# Dependent Variable
y_train = torch.tensor(y_train, dtype=torch.long)
y_val = torch.tensor(y_val, dtype=torch.long)

# Sanity Check
print(X_train.shape, y_train.shape, X_val.shape, y_val.shape)

torch.Size([31500, 784]) torch.Size([31500]) torch.Size([10500, 784]) torch.Size([10500])


In [9]:
# Simple neural net layers
simple_neural_net = nn.Sequential(
    nn.Linear(784, 200),
    nn.ReLU(),
    nn.Linear(200, 50),
    nn.ReLU(),
    nn.Linear(50, 10)
    )
simple_neural_net

Sequential(
  (0): Linear(in_features=784, out_features=200, bias=True)
  (1): ReLU()
  (2): Linear(in_features=200, out_features=50, bias=True)
  (3): ReLU()
  (4): Linear(in_features=50, out_features=10, bias=True)
)

In [10]:
# Creating a single row
single_row = X_train[[0], :]
single_target = y_train[[0]]


# Instantiate Optimizer
optimizer = torch.optim.SGD(simple_neural_net.parameters(), lr=0.01)

#### Forward pass ####
output_values = simple_neural_net(single_row)

# Cross Entropy Loss
cross_entropy_loss = nn.CrossEntropyLoss()
loss = cross_entropy_loss(output_values, single_target)

#### Backward pass ####
loss.backward()

# Update Weights
optimizer.step()

# New Outputs
new_output = simple_neural_net(single_row)
new_loss = cross_entropy_loss(new_output, single_target)

# Comparing old and new loss
print(f"Old loss: {loss}\nNew loss: {new_loss}")

Old loss: 2.2717764377593994
New loss: 2.192401647567749


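A quick reminder of what `__init__` does: it runs once, when the model is instantiated. It calls the `nn.Module` constructor via `super()` and then creates the layers as attributes, which registers their weights so that `.parameters()` can hand them to the optimizer. `forward` then defines how an input flows through those layers.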
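As a rough illustration of the role `__init__` plays in an `nn.Module`, here is a tiny sketch with a hypothetical `TinyNet` class (not the model we train). Assigning a layer to `self` inside `__init__` is what makes its weights show up in `named_parameters()`.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical class, for illustration only
    def __init__(self):
        super().__init__()            # set up nn.Module's internal bookkeeping
        self.layer = nn.Linear(4, 2)  # assigning a module attribute registers its parameters

    def forward(self, x):
        return self.layer(x)

tiny = TinyNet()

# The parameters registered in __init__ are what an optimizer would update
param_names = [name for name, _ in tiny.named_parameters()]
n_params = sum(p.numel() for p in tiny.parameters())
print(param_names, n_params)
```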

In [None]:
class SimpleNN(nn.Module):
    """Basic multi-layer architecture."""

    def __init__(self):
        """Define the main components of the network"""
        super(SimpleNN, self).__init__()
        
        self.layer_1 = nn.Linear(784, 100) # transition from input into hidden layer
        self.activation_1 = nn.ReLU()   # Activation function
        self.layer_2 = nn.Linear(100, 10)  # transition from hidden layer into output

    def forward(self, x):
        """Perform forward pass."""

        # pass through the layers
        hidden_1 = self.activation_1(self.layer_1(x))
        output = self.layer_2(hidden_1)

        # return output
        return output

    def predict(self, x):
        '''
        The class-based interface lets you add your
        own functionality, such as the familiar
        .predict method we all know and love.
        '''

        # Compute raw class scores (logits)
        predictions = self.forward(x)

        # Find the highest-scoring class. We don't need to convert the
        # logits to probabilities for hard predictions, because softmax
        # preserves the ordering of the scores
        hard_class_predictions = torch.argmax(predictions, dim=1)

        return hard_class_predictions
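The claim in `predict` that hard predictions don't need probabilities can be checked with a short sketch on random scores (the `logits` tensor here is made up). Softmax is monotonic within each row, so the position of the largest value does not change.

```python
import torch

torch.manual_seed(0)
logits = torch.randn(5, 10)            # hypothetical raw outputs for 5 samples
probs = torch.softmax(logits, dim=1)   # convert each row to probabilities

pred_from_logits = torch.argmax(logits, dim=1)
pred_from_probs = torch.argmax(probs, dim=1)

# Both routes pick the same class for every row
same = torch.equal(pred_from_logits, pred_from_probs)
print(same)
```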

In [None]:
# Instantiate the model, the loss criterion, and the optimizer
NN_model = SimpleNN()

cross_entropy_loss = nn.CrossEntropyLoss() # this includes the softmax
optimizer = torch.optim.SGD(NN_model.parameters(), lr=.01, momentum=0.9)
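The comment that the loss "includes the softmax" can be illustrated with a minimal sketch on random data (`logits` and `targets` are made up). Applying `log_softmax` followed by the negative log-likelihood loss gives the same value as `nn.CrossEntropyLoss` on the raw scores, which is why our network's last layer has no activation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(6, 10)            # hypothetical raw network outputs
targets = torch.randint(0, 10, (6,))   # hypothetical class labels

# Built-in loss applied directly to the raw scores
ce = nn.CrossEntropyLoss()(logits, targets)

# The same quantity computed in two explicit steps
manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(ce.item(), manual.item())
```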

In [None]:
### COMMON PYTORCH RECIPE FOR TRAINING A NETWORK ###


# Now run for 100 epochs
for epoch in tnrange(100, desc="Total epochs: "):

    # Clear gradients (pytorch accumulates gradients by default)
    optimizer.zero_grad()

    # Calculate outputs
    output_values = NN_model(X_train)

    # Calculate loss
    loss = cross_entropy_loss(output_values, y_train)

    # Backpropagation & weight adjustment
    loss.backward()
    optimizer.step()

print("Optimization ended successfully")
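The reason the loop clears gradients at the start of every epoch can be seen in a small sketch with a hypothetical two-element weight `w`. Calling `backward()` twice without clearing adds the new gradient to the old one. The in-place `w.grad.zero_()` below plays the same role here that `optimizer.zero_grad()` plays in the training loop.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

# First backward pass: d/dw sum(w^2) = 2w
(w ** 2).sum().backward()
first_grad = w.grad.clone()

# Second backward pass without clearing: the gradient is added on top
(w ** 2).sum().backward()
accumulated_grad = w.grad.clone()

# Clear the stored gradient, then recompute a fresh one
w.grad.zero_()
(w ** 2).sum().backward()
fresh_grad = w.grad.clone()

print(first_grad, accumulated_grad, fresh_grad)
```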

In [None]:
# Make predictions (one of 10 digit classes per row)
train_predictions = NN_model.predict(X_train)

# Calculate the score on the training set
accuracy = accuracy_score(y_train, train_predictions)
print(f"Accuracy score on train set: {round(accuracy * 100, 2)}%")

In [None]:
# Make predictions (one of 10 digit classes per row)
val_predictions = NN_model.predict(X_val)

# Calculate the score on the validation set
accuracy = accuracy_score(y_val, val_predictions)
print(f"Accuracy score on validation set: {round(accuracy * 100, 2)}%")