<a href="https://colab.research.google.com/github/MrRox1337/CST3170/blob/main/Coursework%202/coursework2Notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Multi Layer Perceptron Model for Handwritten Number Identification

By: Aman Mishra
MISIS: M00983641
Professor: Dr. Maha Saadeh

This notebook demonstrates the development of a neural network model to recognize handwritten digits. It uses a training dataset to teach the network to identify digits (0-9) based on their features and applies the trained model to predict digits in a separate test dataset. The implementation employs Multi-Layer Perceptrons (MLP), focusing on forward propagation, activation functions, and iterative weight adjustments through training.

## Setup
In this section, we configure the environment for model development. This involves loading the datasets, defining necessary mathematical and neural network functions, and initializing parameters for training the Multi-Layer Perceptron model.

### Import datasets from Google Drive

Before proceeding, the notebook connects to the Drive to access the training and testing datasets.

In [52]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


### Load training and testing datasets from Google Drive

Once the notebook connects, read the datasets for training and testing

In [65]:
# Load the training data from CSV
training_data_path = '/content/drive/MyDrive/cw2DataSet1.csv'  # Ensure this is the correct path
training_data = pd.read_csv(training_data_path)

# Load the testing data from CSV
testing_data_path = '/content/drive/MyDrive/cw2DataSet2.csv'  # Ensure this is the correct path
testing_data = pd.read_csv(training_data_path)

### Import modules for matrix manipulation

This section imports essential libraries for numerical and data manipulation:
*  NumPy: For efficient mathematical operations like matrix multiplication and random number generation.
*  Pandas: For loading and preprocessing data from the training and testing datasets.



In [54]:
import numpy as np
import pandas as pd

### Hidden layers and weights
Here, we define the hidden layer parameters for the neural network:
*   Hidden neurons: Each of the 64 neurons represents a pixel in the 8x8 input image.
*   Weights: Randomly initialize weights connecting input to hidden layers with a 64x64 matrix and hidden to output layers with a 10x64 matrix.

In [55]:
# Set h (number of hidden neurons)
h = 64  # Number of hidden neurons (can be modified)

# Global arrays for outputs
outputHidden = np.zeros(h)  # Hidden layer outputs
outputneuron = np.zeros(10)  # Output layer outputs

# Initialize random weights for the input-hidden and hidden-output connections
WH = np.random.uniform(-1, 1, (h, 64))  # Weights between input and hidden layer
WO = np.random.uniform(-1, 1, (10, h))  # Weights between hidden and output layer

### Sigmoid function
The sigmoid function is a mathematical activation function that introduces non-linearity into the model. It outputs values between 0 and 1, making it ideal for binary classification and probabilistic interpretations.

In [56]:
# Sigmoid activation function
def sigmoid(weighted_sum):
    return 1 / (1 + np.exp(-weighted_sum))

### Step function

The step activation function is a threshold-based function that outputs 0 or 1 (a fixed value) depending on whether the input surpasses a specified threshold.

In [57]:
def step(weighted_sum):
  return 1 if weighted_sum >= 0 else 0

### Feedfoward function

This section defines the feedforward function, a core component of the neural network:
* It calculates the outputs of the hidden and output layers using matrix multiplication and activation functions.
* The sigmoid function is used for hidden layer activation, while a step function determines output layer values.




In [58]:
# Feedforward method to calculate outputs
def feedforward(dataSample):
    global outputHidden, outputneuron

    # Compute the output of the hidden neurons
    for i in range(h):
        weighted_sum_hidden = np.dot(dataSample, WH[i])  # Weighted sum for hidden neuron i
        outputHidden[i] = sigmoid(weighted_sum_hidden)  # Apply sigmoid

    # Compute the output of the output neurons
    for i in range(10):
        weighted_sum_output = np.dot(outputHidden, WO[i])  # Weighted sum for output neuron i
        outputneuron[i] = step(weighted_sum_output)  # Apply step

### Validation function
The validation function compares the predicted output with the expected target output (map). It identifies mismatches, which indicate errors that will guide weight adjustments during training or assist with calculating success during testing.

In [59]:
# Test error method to compare outputneuron with Map
def testError(Map):
    for i in range(10):
        if Map[i] != outputneuron[i]:
            return True
    return False

### Model training function
This function performs supervised learning:
* It calculates errors for output and hidden neurons.
*   Updates weights between layers based on the error gradients, using backpropagation.
*   The learning rate controls the speed of weight updates to optimize the model.

In [60]:
# Training method to adjust weights based on the errors
def training(Map, dataSample, learningRate=0.0125):
    global WH, WO

    # Compute error for the output neurons
    errorOutput = np.zeros(10)  # Error for the output neurons
    for i in range(10):
        errorOutput[i] = Map[i] - outputneuron[i]

    # Compute error for the hidden neurons
    errorHidden = np.zeros(h)  # Error for the hidden neurons
    for j in range(h):
        errorTemp = 0
        for i in range(10):
            errorTemp += errorOutput[i] * WO[i][j]
        errorHidden[j] = outputHidden[j] * (1 - outputHidden[j]) * errorTemp

    # Adjust weights between hidden and output neurons (WO)
    for i in range(10):  # Loop over output neurons
        for j in range(h):  # Loop over hidden neurons
            WO[i][j] += learningRate * outputHidden[j] * errorOutput[i]

    # Adjust weights between input and hidden neurons (WH)
    for i in range(h):  # Loop over hidden neurons
        for j in range(64):  # Loop over input data sample
            WH[i][j] += learningRate * dataSample[j] * errorHidden[i]

---
## Training Attempt 1

### Load training dataset and set hyperparameters
In this section, the training dataset is loaded, and hyperparameters are defined:
* Learning rate: Controls the rate of weight updates; set to 2.5%
* Total cycles: Number of iterations for training the model; set to 300.

In [None]:
# Set learning rate and number of cycles
learningRate = 0.025
total_cycles = 300  # Total number of training cycles
display_interval = 50  # Display accuracy after every 50 cycles

### Build prediction model
This code runs the training loop:
* Processes each data sample in the training dataset.
* Uses the feedforward function to predict outputs.
* Compares predictions to expected outputs, updating weights to reduce errors.
* Tracks and displays model accuracy after specified intervals.

In [None]:
# Iterate through the cycles
for cycle in range(total_cycles):
    success = 0
    total_rows = len(training_data)

    # Process each data sample
    for _, row in training_data.iterrows():
        dataSample = row[:-1].values  # First 64 columns as dataSample
        targetOutput = int(row.iloc[-1])  # Last column as targetOutput using iloc

        # Initialize the Map array to zeros
        Map = np.zeros(10)
        Map[targetOutput] = 1  # Set the target output index to 1

        # Feedforward
        feedforward(dataSample)

        # Test error and update weights if needed
        if testError(Map):
            training(Map, dataSample, learningRate)
        else:
            success += 1

    # Calculate accuracy at the end of each cycle
    accuracy = success / total_rows

    # Display the accuracy after every 50 cycles
    if ((cycle + 1) % display_interval == 0) or (cycle <= 20):
        print(f"Cycle {cycle + 1}, Accuracy: {accuracy:.4f}")

Cycle 1, Accuracy: 0.4806
Cycle 2, Accuracy: 0.6878
Cycle 3, Accuracy: 0.7369
Cycle 4, Accuracy: 0.7536
Cycle 5, Accuracy: 0.7779
Cycle 6, Accuracy: 0.7807
Cycle 7, Accuracy: 0.7818
Cycle 8, Accuracy: 0.7786
Cycle 9, Accuracy: 0.8138
Cycle 10, Accuracy: 0.8099
Cycle 11, Accuracy: 0.8334
Cycle 12, Accuracy: 0.8758
Cycle 13, Accuracy: 0.8387
Cycle 14, Accuracy: 0.8904
Cycle 15, Accuracy: 0.8622
Cycle 16, Accuracy: 0.8750
Cycle 17, Accuracy: 0.8893
Cycle 18, Accuracy: 0.8387
Cycle 19, Accuracy: 0.8558
Cycle 20, Accuracy: 0.8587
Cycle 21, Accuracy: 0.8982
Cycle 50, Accuracy: 0.9163


KeyboardInterrupt: 

## Testing Attempt 1
This section evaluates the model's performance on unseen data using the testing dataset:
* Predicts outputs for each sample using the trained model.
* Calculates and reports overall accuracy, providing insight into the model's generalization capability.

### Test generated model

In [None]:
success = 0
total_rows = len(testing_data)

# Process each data sample
for _, row in testing_data.iterrows():
    dataSample = row[:-1].values  # First 64 columns as dataSample
    targetOutput = int(row.iloc[-1])  # Last column as targetOutput using iloc

    # Initialize the Map array to zeros
    Map = np.zeros(10)
    Map[targetOutput] = 1  # Set the target output index to 1

    # Feedforward
    feedforward(dataSample)

    # Test error and update weights if needed
    if not testError(Map):
        success += 1

# Calculate accuracy at the end of testing
accuracy = success / total_rows
print(f"Testing Accuracy: {accuracy:.4f}")

Testing Accuracy: 0.9555


---
## Training Attempt 2
> Note: Rerun Setup before attempting a new training session to reset the hidden and output layer weights.

### Set hyperparameters
In this section, the following hyperparameters are refined:
* Learning rate: Decreased to 0.15%
* Total cycles: Decreased to 50

In [None]:
# Set learning rate and number of cycles
learningRate = 0.015
total_cycles = 50  # Total number of training cycles
display_interval = 50  # Display accuracy after every 50 cycles

### Build prediction model

In [None]:
# Iterate through the cycles
for cycle in range(total_cycles):
    success = 0
    total_rows = len(training_data)

    # Process each data sample
    for _, row in training_data.iterrows():
        dataSample = row[:-1].values  # First 64 columns as dataSample
        targetOutput = int(row.iloc[-1])  # Last column as targetOutput using iloc

        # Initialize the Map array to zeros
        Map = np.zeros(10)
        Map[targetOutput] = 1  # Set the target output index to 1

        # Feedforward
        feedforward(dataSample)

        # Test error and update weights if needed
        if testError(Map):
            training(Map, dataSample, learningRate)
        else:
            success += 1

    # Calculate accuracy at the end of each cycle
    accuracy = success / total_rows

    # Display the accuracy after every 50 cycles
    if ((cycle + 1) % display_interval == 0) or (cycle <= 20):
        print(f"Cycle {cycle + 1}, Accuracy: {accuracy:.4f}")

Cycle 1, Accuracy: 0.3489
Cycle 2, Accuracy: 0.5799
Cycle 3, Accuracy: 0.6960
Cycle 4, Accuracy: 0.7326
Cycle 5, Accuracy: 0.7675
Cycle 6, Accuracy: 0.7732
Cycle 7, Accuracy: 0.7996
Cycle 8, Accuracy: 0.8078
Cycle 9, Accuracy: 0.8095
Cycle 10, Accuracy: 0.8359
Cycle 11, Accuracy: 0.8526
Cycle 12, Accuracy: 0.8594
Cycle 13, Accuracy: 0.8604
Cycle 14, Accuracy: 0.8669
Cycle 15, Accuracy: 0.8711
Cycle 16, Accuracy: 0.8800
Cycle 17, Accuracy: 0.8804
Cycle 18, Accuracy: 0.8936
Cycle 19, Accuracy: 0.8964
Cycle 20, Accuracy: 0.8982
Cycle 21, Accuracy: 0.9067
Cycle 50, Accuracy: 0.9505


## Testing Attempt 2

### Test generated model

In [None]:
success = 0
total_rows = len(testing_data)

# Process each data sample
for _, row in testing_data.iterrows():
    dataSample = row[:-1].values  # First 64 columns as dataSample
    targetOutput = int(row.iloc[-1])  # Last column as targetOutput using iloc

    # Initialize the Map array to zeros
    Map = np.zeros(10)
    Map[targetOutput] = 1  # Set the target output index to 1

    # Feedforward
    feedforward(dataSample)

    # Test error and update weights if needed
    if not testError(Map):
        success += 1

# Calculate accuracy at the end of testing
accuracy = success / total_rows
print(f"Testing Accuracy: {accuracy:.4f}")

Testing Accuracy: 0.9402


---
## Training Attempt 3
> Note: Rerun Setup before attempting a new training session to reset the hidden and output layer weights.

### Set hyperparameters
In this section, the following hyperparameters are refined:
* Learning rate: Decreased to 0.95%
* Total cycles: Increased to 550

In [62]:
# Set learning rate and number of cycles
learningRate = 0.0095
total_cycles = 550  # Total number of training cycles
display_interval = 50  # Display accuracy after every 50 cycles

### Build prediction model

In [63]:
# Iterate through the cycles
for cycle in range(total_cycles):
    success = 0
    total_rows = len(training_data)

    # Process each data sample
    for _, row in training_data.iterrows():
        dataSample = row[:-1].values  # First 64 columns as dataSample
        targetOutput = int(row.iloc[-1])  # Last column as targetOutput using iloc

        # Initialize the Map array to zeros
        Map = np.zeros(10)
        Map[targetOutput] = 1  # Set the target output index to 1

        # Feedforward
        feedforward(dataSample)

        # Test error and update weights if needed
        if testError(Map):
            training(Map, dataSample, learningRate)
        else:
            success += 1

    # Calculate accuracy at the end of each cycle
    accuracy = success / total_rows

    # Display the accuracy after every 50 cycles
    if ((cycle + 1) % display_interval == 0) or (cycle <= 20):
        print(f"Cycle {cycle + 1}, Accuracy: {accuracy:.4f}")

Cycle 1, Accuracy: 0.3937
Cycle 2, Accuracy: 0.6476
Cycle 3, Accuracy: 0.7159
Cycle 4, Accuracy: 0.7779
Cycle 5, Accuracy: 0.8092
Cycle 6, Accuracy: 0.8234
Cycle 7, Accuracy: 0.8423
Cycle 8, Accuracy: 0.8601
Cycle 9, Accuracy: 0.8637
Cycle 10, Accuracy: 0.8704
Cycle 11, Accuracy: 0.8750
Cycle 12, Accuracy: 0.8822
Cycle 13, Accuracy: 0.8968
Cycle 14, Accuracy: 0.9039
Cycle 15, Accuracy: 0.9039
Cycle 16, Accuracy: 0.9089
Cycle 17, Accuracy: 0.9160
Cycle 18, Accuracy: 0.9142
Cycle 19, Accuracy: 0.9156
Cycle 20, Accuracy: 0.9270
Cycle 21, Accuracy: 0.9288
Cycle 50, Accuracy: 0.9566
Cycle 100, Accuracy: 0.9744
Cycle 150, Accuracy: 0.9847
Cycle 200, Accuracy: 0.9961
Cycle 250, Accuracy: 0.9957
Cycle 300, Accuracy: 0.9975
Cycle 350, Accuracy: 0.9982
Cycle 400, Accuracy: 1.0000
Cycle 450, Accuracy: 1.0000
Cycle 500, Accuracy: 1.0000
Cycle 550, Accuracy: 1.0000


## Testing Attempt 3

### Test generated model

In [66]:
success = 0
total_rows = len(testing_data)

# Process each data sample
for _, row in testing_data.iterrows():
    dataSample = row[:-1].values  # First 64 columns as dataSample
    targetOutput = int(row.iloc[-1])  # Last column as targetOutput using iloc

    # Initialize the Map array to zeros
    Map = np.zeros(10)
    Map[targetOutput] = 1  # Set the target output index to 1

    # Feedforward
    feedforward(dataSample)

    # Test error and update weights if needed
    if not testError(Map):
        success += 1

# Calculate accuracy at the end of testing
accuracy = success / total_rows
print(f"Testing Accuracy: {accuracy:.4f}")

Testing Accuracy: 1.0000
