# Neural Network from Scratch

Our objective is to build a neural network that classifies the handwritten digits in the MNIST dataset.
* The [MNIST](http://yann.lecun.com/exdb/mnist/) dataset contains 28x28 grayscale images of handwritten digits. We will import it using Keras.

<center>
<img src="https://user-images.githubusercontent.com/81357954/166119893-4ca347b8-b1a4-40b8-9e0a-2e92b5f164ae.png">
</center>

This neural network will consist of an input layer with 784 nodes (one per image pixel), a hidden layer with 256 nodes, and an output layer with 10 nodes. The structure of the network is outlined below, where $X$ represents the input, $A^{[0]}$ denotes the input layer, $Z^{[1]}$ is the unactivated layer 1, $A^{[1]}$ is the activated layer 1, and so forth. The weights and biases are represented by $W$ and $b$ respectively:

<div align="center">

$A^{[0]}=X$

$Z^{[1]}=W^{[1]}A^{[0]}+b^{[1]}$

$A^{[1]}=\text{ReLU}(Z^{[1]})$

$Z^{[2]}=W^{[2]}A^{[1]}+b^{[2]}$

$A^{[2]}=\text{softmax}(Z^{[2]})$

$Loss=\text{cross-entropy-loss}(A^{[2]})$
</div>
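
For reference, the equations above rely on the following standard definitions. This is a sketch under stated assumptions: $Y$ is the one-hot label matrix, the $m$ training samples are stored as columns, and the loss is averaged over samples (the symbols $Y$ and $m$ and the averaging are assumptions, not part of the outline above).

<div align="center">

$\text{ReLU}(z)=\max(0,z)$

$\text{softmax}(z)_k=\dfrac{e^{z_k}}{\sum_{j}e^{z_j}}$

$Loss=-\dfrac{1}{m}\sum_{i=1}^{m}\sum_{k=0}^{9}Y_{k,i}\log A^{[2]}_{k,i}$

</div>

Under the same assumptions, the gradients used in backpropagation take the usual form, where $\odot$ is the element-wise product and $\mathbb{1}[Z^{[1]}>0]$ is the derivative of ReLU:

<div align="center">

$dZ^{[2]}=A^{[2]}-Y$

$dW^{[2]}=\dfrac{1}{m}dZ^{[2]}A^{[1]T}$

$db^{[2]}=\dfrac{1}{m}\sum_{i=1}^{m}dZ^{[2]}_{:,i}$

$dZ^{[1]}=W^{[2]T}dZ^{[2]}\odot\mathbb{1}[Z^{[1]}>0]$

$dW^{[1]}=\dfrac{1}{m}dZ^{[1]}A^{[0]T}$

$db^{[1]}=\dfrac{1}{m}\sum_{i=1}^{m}dZ^{[1]}_{:,i}$

</div>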

## Getting Started

### Importing Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

# Keras ships a loader for the MNIST dataset
mnist = tf.keras.datasets.mnist

### Implementation of the activation function (ReLU) and the softmax function

* Use ReLU for the hidden layer and softmax for the output layer, which classifies into 10 neurons, each corresponding to a digit from $0$ to $9$
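
One possible way to write the two functions is sketched below. It is not the required solution: the names `relu_sketch` and `softmax_sketch` are hypothetical, and the code assumes each row of `Z` holds one sample. Subtracting the row maximum before exponentiating is a common trick to avoid overflow and does not change the result.

```python
import numpy as np

def relu_sketch(Z):
    # Element-wise max(0, z): negative entries become 0.
    return np.maximum(0, Z)

def softmax_sketch(Z):
    # Assumes Z has shape (num_samples, num_classes), one row per sample.
    # Shifting by the row maximum keeps exp() from overflowing.
    shifted = Z - np.max(Z, axis=1, keepdims=True)
    exp_Z = np.exp(shifted)
    return exp_Z / np.sum(exp_Z, axis=1, keepdims=True)

Z = np.array([[1.0, -2.0, 0.5],
              [1000.0, 1000.0, 1000.0]])
activated = relu_sketch(Z)
probabilities = softmax_sketch(Z)
print(activated)
print(probabilities.sum(axis=1))  # each row of probabilities sums to 1
```

The second row of `Z` would overflow a naive `np.exp(Z)`, which is why the shift is included.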


In [None]:
def ReLU(Z):
    pass

def softmax(Z):
    pass

### Define the NN class

In [None]:
class NN:
    def __init__(self, input_size, hidden_size, output_size, learning_rate=0.01):
        # initialized basic stats of NN
        self.input_size=input_size
        self.hidden_size=hidden_size
        self.output_size=output_size
        self.learning_rate=learning_rate

        #initialized weights and biases
        self.W1=np.random.randn(hidden_size, input_size)*0.01
        self.B1= np.zeros((1,hidden_size))
        self.W2=np.random.randn(output_size, hidden_size)*0.01
        self.B2=np.zeros((1,output_size))

        #initialized activations and gradients
        self.A0=None
        self.Z1=None
        self.A1=None
        self.Z2=None
        self.A2=None
        self.dZ2=None
        self.dW2=None
        self.dB2=None
        self.dZ1=None
        self.dW1=None
        self.dB1=None

    # do the forward pass and evaluate the values of A0, Z1, A1, Z2, A2
    def forward_propagation(self, X):
        pass
        
    
    # convert the input y into a one-hot encoded array.
    '''
    One-hot encoding example:
    given an array with values [2, 5, 6] where the maximum possible value is 8, the one-hot encoded array is:
    [[0,0,1,0,0,0,0,0,0], [0,0,0,0,0,1,0,0,0], [0,0,0,0,0,0,1,0,0]]
    Note that indices 2, 5 and 6 hold the value 1 and all other entries are 0.
    '''
    def one_hot(self, y):
        pass

    # calculate the derivative of the cost function with respect to W2, B2, W1, B1 in dW2, dB2, dW1, dB1 respectively
    def backward_propagation(self, X, y):
        pass
    
    # update the parameters W1, W2, B1, B2
    def update_params(self):
        pass
    
    # get the predictions for the dataset
    def get_predictions(self,X):
        pass

    # get the accuracy of the model on the dataset
    # (after the gradient descent has been run separately)
    def get_accuracy(self, X, y):
        pass
    
    # run gradient descent on the model to get the values of the parameters
    def gradient_descent(self, X, y, iters=1000):
        pass
    
    # evaluate cost using cross-entropy-loss formula.
    def cross_entropy_loss(self, X, y):
        pass

    # show predictions for a few randomly chosen samples to get an intuitive feel
    def show_predictions(self, X, y, num_samples=10):
        random_indices = np.random.randint(0, X.shape[0], size=num_samples)

        for index in random_indices:
            sample_image = X[index, :].reshape((28, 28))
            plt.imshow(sample_image, cmap='gray')
            # pass a 2-D slice so the sample keeps the (1, 784) batch shape used during training
            plt.title(f"Actual: {y[index]}, Predicted: {self.get_predictions(X[index:index + 1])}")
            plt.show()
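
As a standalone illustration of the one-hot encoding and cross-entropy steps described in the class, here is a minimal sketch. The function names `one_hot_sketch` and `cross_entropy_sketch`, the `eps` guard against `log(0)`, and the averaging over samples are assumptions made for this example, not part of the class template.

```python
import numpy as np

def one_hot_sketch(y, num_classes=10):
    # Row i is all zeros except for a 1 in column y[i].
    encoded = np.zeros((y.shape[0], num_classes))
    encoded[np.arange(y.shape[0]), y] = 1
    return encoded

def cross_entropy_sketch(A2, y, eps=1e-12):
    # A2: predicted probabilities, shape (num_samples, num_classes).
    # eps avoids taking log(0) when a probability is exactly zero.
    Y = one_hot_sketch(y, A2.shape[1])
    return -np.mean(np.sum(Y * np.log(A2 + eps), axis=1))

# The example from the class comment: labels [2, 5, 6] with 9 possible values.
y = np.array([2, 5, 6])
Y = one_hot_sketch(y, num_classes=9)
print(Y)

# A uniform prediction over 9 classes gives a loss of about log(9).
uniform = np.full((3, 9), 1 / 9)
loss = cross_entropy_sketch(uniform, y)
print(loss, np.log(9))
```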

### Prepare the data

In [None]:
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

# Compute the normalization statistics on the training set only
miu = np.mean(X_train, axis=(0, 1), keepdims=True)
stds = np.std(X_train, axis=(0, 1), keepdims=True)

# Reuse the training statistics for the test set so both splits share one scale
X_normal_train = (X_train - miu) / (stds + 1e-7)
X_normal_test = (X_test - miu) / (stds + 1e-7)

# Flatten each 28x28 image into a 784-element row
X_normal_train = X_normal_train.reshape((X_train.shape[0], -1))
X_normal_test = X_normal_test.reshape((X_test.shape[0], -1))
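
The normalize-then-flatten step can be checked on synthetic data without downloading MNIST. The sketch below makes two assumptions: it uses a single global mean and standard deviation (a simplification of the per-axis statistics in the cell above), and it follows the common convention of applying the training-set statistics to the test set. The arrays `fake_train` and `fake_test` are hypothetical stand-ins for the real images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for MNIST: 28x28 images with pixel values in [0, 255].
fake_train = rng.integers(0, 256, size=(100, 28, 28)).astype(np.float64)
fake_test = rng.integers(0, 256, size=(20, 28, 28)).astype(np.float64)

# Statistics come from the training split only.
mean = fake_train.mean()
std = fake_train.std()

# Standardize both splits with the same statistics, then flatten each image.
train_flat = ((fake_train - mean) / (std + 1e-7)).reshape(fake_train.shape[0], -1)
test_flat = ((fake_test - mean) / (std + 1e-7)).reshape(fake_test.shape[0], -1)

print(train_flat.shape, test_flat.shape)
print(train_flat.mean(), train_flat.std())
```

After this step the training data should have a mean close to 0 and a standard deviation close to 1, and each image becomes one 784-element row.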

### Train the model
* Train the model on `X_normal_train` and `Y_train`
* Print the accuracy on `X_normal_test` and `Y_test`
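
The overall training sequence (forward pass, loss, backward pass, parameter update) can be sketched on a small synthetic problem before running it on MNIST. This is a rough illustration, not the reference solution: it does not use the `NN` class, the toy dataset and layer sizes are made up, and it assumes samples are stored as rows, so the layer equations appear in transposed form (`X @ W1.T + B1`).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 200 samples, 20 features, 3 classes.
m, n_in, n_hidden, n_out = 200, 20, 16, 3
X = rng.normal(size=(m, n_in))
y = rng.integers(0, n_out, size=m)
X[np.arange(m), y] += 3.0  # boost feature k for class k so the task is learnable

# Same initialization scheme and shapes as in the NN class template.
W1 = rng.normal(size=(n_hidden, n_in)) * 0.01
B1 = np.zeros((1, n_hidden))
W2 = rng.normal(size=(n_out, n_hidden)) * 0.01
B2 = np.zeros((1, n_out))

Y = np.eye(n_out)[y]  # one-hot labels, shape (m, n_out)
learning_rate = 0.1
losses = []

for _ in range(300):
    # Forward pass
    Z1 = X @ W1.T + B1
    A1 = np.maximum(0, Z1)
    Z2 = A1 @ W2.T + B2
    exp_Z = np.exp(Z2 - Z2.max(axis=1, keepdims=True))
    A2 = exp_Z / exp_Z.sum(axis=1, keepdims=True)
    losses.append(-np.mean(np.sum(Y * np.log(A2 + 1e-12), axis=1)))

    # Backward pass: softmax with cross-entropy gives dZ2 = A2 - Y
    dZ2 = (A2 - Y) / m
    dW2 = dZ2.T @ A1
    dB2 = dZ2.sum(axis=0, keepdims=True)
    dZ1 = (dZ2 @ W2) * (Z1 > 0)  # (Z1 > 0) is the derivative of ReLU
    dW1 = dZ1.T @ X
    dB1 = dZ1.sum(axis=0, keepdims=True)

    # Gradient descent update
    W1 -= learning_rate * dW1
    B1 -= learning_rate * dB1
    W2 -= learning_rate * dW2
    B2 -= learning_rate * dB2

accuracy = np.mean(np.argmax(A2, axis=1) == y)
print(f"loss {losses[0]:.3f} -> {losses[-1]:.3f}, accuracy {accuracy:.2f}")
```

Each gradient has the same shape as the parameter it updates, which is a quick sanity check worth repeating in your own `backward_propagation`.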

In [1]:
# Initialise the model as an instance of the NN class and train it


In [4]:
# Use the show_predictions function to print the actual output and see your model work :)
