### Making imports of the required libraries which would help us implement the MLP Network from scratch in an elegant manner



In [2]:
import csv #for importing CSV
import random #for generating random number
import math #basic math module
import numpy as np #for a faster matrix, array mathematical operation
from tqdm.auto import tqdm #for visualization of progress using a bar graph

## Dataset Class

This class represents a dataset. It would be constructed by providing:
* **data_path**: This path points to the csv containing the training_data
* **labels_path**: This path points to the csv containing the labels_data

This class has multiple methods to make operating with the Dataset, it is representing at the moment, easier:
```python
dataset = Dataset("/foo/data.csv", "/foo/labels.csv")
```
* `dataset.load()`: This method takes the data_path and labels_path attributes of `dataset` object, scans through their corresponding CSVs and loads the entire data in the `dataset._data` and `dataset._labels` attributes.

* `dataset.get_data_dimensions()`: This method returns the dimensions of the training_data loaded to this dataset.

* `dataset.get_labels_dimensions()`: This method returns the dimensions of the training_labels loaded to this dataset.

* `dataset.load_data()` and `dataset.load_labels()`: These methods are called by `dataset.load()` under the hood to individually load the training_data and training_labels respectively in the `dataset` object.


* `dataset.normalize_data()`: This method goes through the loaded training_data `dataset._data` and normalizes every data point against the highest value of the data points present in the training_data.

* `data.k_fold_split()`: This method goes through the loaded training_data and training_labels and generates `k` splits of dataset which, when fed to any neural network during its training and validation phase, can lead to a more fair outcome and analysis of its accuracies.

In [1]:
# Class to create objects with different data set
class Dataset:
  def __init__(self, data_path, labels_path=""):
    self.data_path = data_path
    self.labels_path = labels_path

    self._data_loaded, self._labels_loaded = False, False

  # Method to load the data and label point
  def load(self, force=False):
    self.load_data(force)
    self.load_labels(force)

  # Method to get data matrix dimensions
  def get_data_dimensions(self):
    return len(self._data), len(self._data[0])

  # Method to get label matrix dimensions
  def get_labels_dimensions(self):
    return len(self._labels), len(self._labels[0])

  # Method to return the parsed data
  def get_parsed_data(self):
    return self._data

  # Method to load the data
  def load_data(self, force=False):
    if (not force) and self._data_loaded:
      return

    data = []
    with open(self.data_path, 'r') as file:
      reader = csv.reader(file)
      print("Loading the data...")
      for row in tqdm(reader):
          data.append([float(val) for val in row])
    self._data_loaded = True
    self._data = data

  # Method to load the label points
  def load_labels(self, force=False):
    if (not force) and self._labels_loaded:
      return

    labels = []
    with open(self.labels_path, 'r') as file:
        reader = csv.reader(file)
        print("Loading the labels...")
        for row in tqdm(reader):
            label = [int(round(float(val))) for val in row]
            labels.append(label)
    self._labels_loaded = True
    self._labels = np.array(labels)

  # Method to normalize data to bring input features to a similar scale
  def normalize_data(self, force_load=False):
    self.load(force=force_load)

    max_val = max([max(row) for row in self._data])
    normalized_data = [[val / max_val for val in row] for row in self._data]
    self.data = normalized_data

  # Method to split the data into k folds for k cross validation
  def k_fold_split(self, k, force_load=False):
    self.load(force=force_load)

    combined_data = list(zip(self._data, self._labels))
    random.shuffle(combined_data)
    data, labels = zip(*combined_data)
    fold_size = len(data) // k
    folds = []
    for i in range(k):
        start = i * fold_size
        end = start + fold_size
        val_data = data[start:end]
        val_labels = labels[start:end]
        train_data = data[:start] + data[end:]
        train_labels = labels[:start] + labels[end:]
        folds.append((train_data, train_labels, val_data, val_labels))
    return folds

## NeuralNetwork Class

This class represents a NeuralNetwork. It would be constructed by providing:
* **input_layer_size**: This parameter is used to set the number of input features
* **hidden_layer_size**: This parameter is used to set the number of hidden layer perceptrons.
* **output_layer_size**: This parameter is used to set the number of output classes
* **activation_func**: This parameter is used to set the type of input activation function

This class has multiple methods to make operating with the NeuralNetwork, it is representing at the moment, easier:
```python
neural_network = NeuralNetwork(input_layer_size, hidden_size, output_layer_size, activation_func=activation_func)
```
* `neural_network.initialize_weights()`: This method is used to initialize weights `W1,W2` and bias `b1,b2` with random values between -1 to 1.

* `neural_network._forward_propagate()`: This method performs forward propagation operations using a given input.

* `neural_network.backward_propagate()`: This method performs backward propagation operations and updates the weights `W1,W2` and bias `b1,b2` based on the `output_error, hidden_error` and `learning_rate`.

* `neural_network.train()` : This method is used to train k-1 fold data and validate with 1 fold for k times.


* `neural_network._set_training_accurary()` and `neural_network.get_training_accurary()`: This method set and get last training accuracy.

* `neural_network.predict()`: This method is used to predict output of one fold data.

* `neural_network._calculate_accuracy()`: This method is used to calculate accuracy of the predicted labels.

* `neural_network.sigmoid()`, `neural_network.relu()`, `neural_network.tanh()` and `neural_network._softmax()`: This method helps in using different activation function based on `activation_func` parameter for the perceptrons in input and output layers.

In [3]:
class NetworkUntrainedError(Exception):
    def __init__(self, msg):
        self.msg = msg

# Class to create NeuralNetwork objects
class NeuralNetwork:
  def __init__(self, input_layer_size, hidden_layer_size, output_layer_size, activation_func):
    self.input_size = input_layer_size
    self.hidden_size = hidden_layer_size
    self.output_size = output_layer_size
    self.activation_func = activation_func

    self._trained_before, self._last_training_accuracy = False, 0.0

    self.initialize_weights()

  # Method to initialize random weights and bias between -1 to 1 range
  def initialize_weights(self):
    self.W1 = np.random.uniform(-1, 1, (self.input_size, self.hidden_size))
    self.b1 = np.random.uniform(-1, 1, self.hidden_size)
    self.W2 = np.random.uniform(-1, 1, (self.hidden_size, self.output_size))
    self.b2 = np.random.uniform(-1, 1, self.output_size)

  # Method to perform forward propagation operations
  def _forward_propagate(self, input_X):
    hidden_layer_input = np.dot(input_X, self.W1) + self.b1

    hidden_layer_output = self.activation_func(hidden_layer_input)

    output_layer_input = np.dot(hidden_layer_output, self.W2) + self.b2
    output_layer_output = NeuralNetwork._softmax(output_layer_input)

    return hidden_layer_output, output_layer_output

  # Method to perform backward propagation operations
  def _backward_propagate(self, input_X, target_Y, hidden_layer_output, output_layer_output, learning_rate):
    output_error = output_layer_output - target_Y
    hidden_error = np.dot(output_error, self.W2.T) * self.activation_func(hidden_layer_output)

    dW2 = np.dot(hidden_layer_output.T, output_error)
    db2 = np.sum(output_error, axis=0)

    dW1 = np.dot(input_X.T, hidden_error)
    db1 = np.sum(hidden_error, axis=0)

    # weight adjustment
    self.W2 -= learning_rate * dW2
    self.b2 -= learning_rate * db2
    self.W1 -= learning_rate * dW1
    self.b1 -= learning_rate * db1

  # Method to train k-1 fold data and validate with 1 fold for k times
  def train(self, dataset, learning_rate, num_epochs, k):
    if self._trained_before:
      print("Retraining the neural network with the provided configuration...")

    self._trained_before = True

    folds = dataset.k_fold_split(k)

    accuracies = []
    print(f"\nTraining through the {len(folds)} folds with {num_epochs} epochs each...")
    for i, fold in tqdm(enumerate(folds), total=len(folds)):
        self.initialize_weights()
        train_data_fold, train_labels_fold, val_data_fold, val_labels_fold = fold

        for epoch in range(num_epochs):
            for input_X, target_Y in zip(train_data_fold, train_labels_fold):
                input_X = np.array([input_X])
                target_Y = np.array([target_Y])

                # forward propagation
                hidden_layer_output, output_layer_output = self._forward_propagate(input_X)

                # backwards propagation
                self._backward_propagate(input_X, target_Y, hidden_layer_output, output_layer_output, learning_rate)

        val_predictions = self.predict(val_data_fold)
        val_accuracy = NeuralNetwork._calculate_accuracy(val_predictions, val_labels_fold)
        print(f"Fold-{i+1}'s validation set's accuracy = {val_accuracy}")
        accuracies.append(val_accuracy)

    avg_training_accuracy = sum(accuracies) / k
    self._set_training_accurary(avg_training_accuracy)

  # Method to set last training accuracy
  def _set_training_accurary(self, accuracy):
    self._last_training_accuracy = accuracy

  # Method to get last training accuracy
  def get_last_training_accuracy(self):
    if not self._trained_before:
      raise(NetworkUntrainedError("training accuracy not found because the network wasn't ever trained"))
    return self._last_training_accuracy

  # Method to predict output of one fold data
  def predict(self, test_data):
    if not self._trained_before:
      raise(NetworkUntrainedError("Attempt to predict found to happen before getting the neural network trained"))

    predictions = []
    for X in test_data:
        X = np.array([X])
        _, output_layer_output = self._forward_propagate(X)
        predicted_label = np.argmax(output_layer_output)
        one_hot_label = [1 if i == predicted_label else 0 for i in range(4)]
        predictions.append(np.array(one_hot_label))
    return predictions

  # Method to calculate accuracy of predicted output
  @staticmethod
  def _calculate_accuracy(predictions, labels):
    num_correct = 0
    num_total = len(labels)
    for pred, true_label in zip(predictions, labels):
        if np.array_equal(pred, true_label):
            num_correct += 1
    accuracy = num_correct / num_total
    return accuracy

  # Static Method for sigmoid activation function for input layer
  @staticmethod
  def sigmoid(x):
    return 1 / (1 + np.exp(-x))

  # Static Method for relu activation function for input layer
  @staticmethod
  def relu(x):
    return np.maximum(0, x)

  # Static Method for tanh activation function for input layer
  @staticmethod
  def tanh(x):
    return np.tanh(x)

  # Static Method for softmax activation function for output layer(for multi-class classification)
  @staticmethod
  def _softmax(x):
    exps = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)

## Executing the above setup against the input data provided in the assignment

In [None]:
# Reading Data set
dataset = Dataset(data_path='/content/train_data.csv', labels_path='/content/train_labels.csv')
dataset.normalize_data()

_, input_layer_size = dataset.get_data_dimensions()
_, output_layer_size = dataset.get_labels_dimensions()

# Set hyperparameters
hidden_size = 64
learning_rate = 0.01
num_epochs = 1
k = 5

def find_accuracy_with_activation(activation_func):
  neural_network = NeuralNetwork(input_layer_size, hidden_size, output_layer_size, activation_func=activation_func)
  neural_network.train(dataset, learning_rate, num_epochs, k)

  accuracy = neural_network.get_last_training_accuracy()
  return neural_network, accuracy

# Finding accuracy with different activation function
print("---------- With RELU Activation ----------")
relu_net, relu_accuracy = find_accuracy_with_activation(NeuralNetwork.relu)
print("\nAverage Validation Accuracy:", relu_accuracy)
trained_neural_network = relu_net
trained_net_accuracy = relu_accuracy

print("\n\n---------- With SIGMOID Activation ----------")
sigmoid_net, sigmoid_accuracy = find_accuracy_with_activation(NeuralNetwork.sigmoid)
print("\nAverage Validation Accuracy:", sigmoid_accuracy)
if sigmoid_accuracy > trained_net_accuracy:
  trained_neural_network = sigmoid_net
  trained_net_accuracy = sigmoid_accuracy

print("\n\n---------- With TANH Activation ----------")
tanh_net, tanh_accuracy = find_accuracy_with_activation(NeuralNetwork.tanh)
print("\nAverage Validation Accuracy:", tanh_accuracy)
if tanh_accuracy > trained_net_accuracy:
  trained_neural_network = tanh_net
  trained_net_accuracy = tanh_accuracy

print(f"\n\nUltimately, trained model chosen with {trained_net_accuracy*100}% accuracy!")

Loading the data...


0it [00:00, ?it/s]

Loading the labels...


0it [00:00, ?it/s]

---------- With RELU Activation ----------

Training through the 5 folds with 1 epochs each...


  0%|          | 0/5 [00:00<?, ?it/s]

Fold-1's validation set's accuracy = 0.9666666666666667
Fold-2's validation set's accuracy = 0.9626262626262626
Fold-3's validation set's accuracy = 0.9682828282828283
Fold-4's validation set's accuracy = 0.9723232323232324
Fold-5's validation set's accuracy = 0.9662626262626263

Average Validation Accuracy: 0.9672323232323233


---------- With SIGMOID Activation ----------

Training through the 5 folds with 1 epochs each...


  0%|          | 0/5 [00:00<?, ?it/s]

Fold-1's validation set's accuracy = 0.9656565656565657
Fold-2's validation set's accuracy = 0.9646464646464646
Fold-3's validation set's accuracy = 0.9670707070707071
Fold-4's validation set's accuracy = 0.9705050505050505
Fold-5's validation set's accuracy = 0.965050505050505

Average Validation Accuracy: 0.9665858585858584


---------- With TANH Activation ----------

Training through the 5 folds with 1 epochs each...


  0%|          | 0/5 [00:00<?, ?it/s]

Fold-1's validation set's accuracy = 0.9454545454545454
Fold-2's validation set's accuracy = 0.9531313131313132
Fold-3's validation set's accuracy = 0.9402020202020202
Fold-4's validation set's accuracy = 0.9482828282828283
Fold-5's validation set's accuracy = 0.9501010101010101

Average Validation Accuracy: 0.9474343434343435


Ultimately, trained model chosen with 96.72323232323234% accuracy!


## Steps to run a test_data against the above trained model

* Run all the above cells to train the model.
* The trained model would be stored in the variable called `neural_network`
* Say, the path to your test_data is `/path/content/test_data.csv`
Run the following code to get predictions against your test_data

```python
testdata = Dataset(data_path="/path/content/test_data.csv")
testdata.load_data()

predictions = trained_neural_network.predict(testdata)
print(predictions)
```