##  Lab 3(a) - Implementation of Multi layer perceptron from scratch
##  Weightage - 2.4%

Maximum Points in the Lab: 90

---
Important points to remember :


1.  Observations for the experiments should be explained.
2. All the code should be submitted in the form of a single Jupyter notebook itself.
3. Points for each sub-section are mentioned in the appropriate question.
4. Make sure to begin early as a few experiments may consume more time to run.
5. You can use Google colab to run in jupyter notebook (https://colab.research.google.com/) How to load data in Google Colab ?(https://towardsdatascience.com/3-ways-to-load-csv-files-into-colab-7c14fcbdcb92)
6. The lab must be submitted on Google classroom. The code as well as the accompanying observations should be made part of the python notebook.
7. **Code Readability** is very important. Hence use self explanatory variable names and add comments to describe your approach wherever necessary.
8. You are expected to submit your **detailed inferences** and not just an error free code.
9. The lab is due on **March 20th 11.59pm**.
10. The lab should be completed **individually**. Students are expected to follow the **honor code** of the class.

For any doubts regarding lab please mail to 2018csm1011@iitrpr.ac.in


Below is the multi layer perceptron architechture used for implementation. Notations of the neural network are mentioned below :

![MLP implementation diagram](pictures/mlp.png)


## Implementation details :

1) In this lab you will be using MLP for classifying MNIST digits. a. Let us consider a MLP network with H hidden units, O outputs, and
inputs of size D.

2) According to above diagram dimensions of network will be :
```
 W - H * D+1   -- Weights from Input layer to Hidden layer
 V - K * H+1   -- Weights from Hidden layer to Output layer
 X - N * D+1   -- Weights of Input data
 Y - N * K     -- Weights of Output label data
 Z - H+1 * 1   -- Hidden layer Weights
 O - K * 1     -- Output layer weights
```
3) Please note that +1 in above notations is to indicate bias term.

4) tanh is used as the activation function.

5) During forward pass will be :

$\mathbf{z}=\tanh (\mathbf{W} \mathbf{x})$

and $O_{i}=\frac{\exp v_{i}^{T} z}{\sum_{k=1}^{K} \exp v_{k}^{T} z}$

Overall loss function will be :

Total loss = $-\sum_{n=1}^{N} \sum_{i=1}^{K} y_{n i} \log O_{n i}$




6) The dataset is included in zip file ("data.txt" and "label.txt"). Number of hidden layer units to be 500, learning rate is set to be 0.01. 

7) Inorder to update the weights during back propogation we will modified version of stochastic gradient descent, where instead of updating weights after each data point, the updates are made once with batch of input data, Let batch size = 25. Number of epochs = 100.

8) Divide the data into train, validation, and test splits using a preset ratio. Please define the ratio you are using.

9) Please plot the below : 

    i)  Training error, Validation error Vs epochs [average over 5 runs.]
  
    ii) Mean Training error Vs epochs [average over 5 runs.]
    
    iii)  Mean Validation error Vs epochs [average over 5 runs.]
  
    iv) Variance Training error Vs epochs [average over 5 runs.]
    
     v)  Variance validation error Vs epochs [average over 5 runs.]


# Note : 

1) All weight update equations during back propogation should be done using $\textbf{Matrix operations}$ only (not for loops).

 For example : 
 
![MLP implementation diagram](pictures/equation.png)





# Maximum points : 90 points.

1) Derive the weight update equation for W in form of Matrix operations similar to V matrix operations defined above.(write in the jupyter notebook itself using Latex or image) - 10 pts

![W update Eqn](pictures/W_update.jpeg)

2) Splitting train, validation and test using preset ratio - 5 pts

3) Random weights assign to W,V values -  5 pts.

4) During Forward pass :
   
       updating Z values - 5 pts.

       updating O values - 5 pts.

       Applying softmax values - 5pts.

       Calculating error values - 5pts.

5) During backward pass : ( only matrix operations allowed)
   
       Gradient between hidden to output - 10 pts.

       Gradient betweem input to hidden - 15 pts.
   
6) Using batch size = 25 to update weights - 5 pts.

7) averaging over 5 runs - 5 pts.

8) Please plot the below : - 10pts. 

    i)  Training error, Validation error Vs epochs [average over 5 runs.]
  
    ii) Mean Training error Vs epochs [average over 5 runs.]
    
    iii)  Mean Validation error Vs epochs [average over 5 runs.]
  
    iv) Variance Training error Vs epochs [average over 5 runs.]
    
     v)  Variance validation error Vs epochs [average over 5 runs.]   
     
9) Discuss the observations from above plots. - 5pts.

In [11]:
import matplotlib.pyplot as plt
import numpy as np
import random

####################################################################################
# Derivative of weights as defined in Lab3.a	####################################
####################################################################################
####################################################################################

# Dimension of all the Matrices Used
# W - H * D+1   -- Weights from Input layer to Hidden layer
# V - K * H+1   -- Weights from Hidden layer to Output layer
# X - N * D+1   -- Weights of Input data
# Y - N * K     -- Weights of Output label data
# Z - H+1 * 1   -- Hidden layer Weights
# O - K * 1     -- Output layer weights


# Function to read data and store it in form of 2D array
def read_data(file_name) :
    data = np.loadtxt(file_name, delimiter=',')
    return data


# Implementation of Forward pass function
def forward_pass(data_points,W,V,Y) :
    # use compute_Z_values, compute_O_values, compute_softmax, calculate_error to compute error, O_softmax
    # During forward pass compute Z values, O_values, O_softmax, error
    # Insert code here
    Z = compute_Z_values(W, data_points)
    O = compute_O_values(V, Z)
    O_softmax = compute_softmax(O)
    error = calculate_error(O_softmax, Y)

    #return the error, O_softmax
    return error, O_softmax


# Implementation of Cross Entropy Error Function takes as input O_softmax, Y
def calculate_error(predictions, targets, epsilon=1e-10):
    # Caculate cross entropy error between output of softmax (predictions) , actual values (targets)
    #Insert code here
    predictions = np.clip(predictions, epsilon, 1 - epsilon)
    cross_entropy_error = - np.sum(targets * np.log(predictions)) / predictions.shape[0]

    #returns cross entropy error
    return cross_entropy_error


# Implementation of Softmax Error Function takes as input O
def compute_softmax(output_matrix) :
    # return output_matrix after apply softmax function ( hint: use np.exp function )
    y_length = 10
    #Insert code here
    for i in range(len(output_matrix)):
        output_matrix[i] = np.exp(output_matrix[i] - np.max(output_matrix[i]))
        output_matrix[i] /= np.sum(output_matrix[i])

    # returns output_matrix
    return output_matrix


# Implementation of Backward pass using Backpropagation Algorithm to calculate V_new, W_new, bias_v
def backward_pass(O_softmax,Y,V,Z,W,X,bias_z):
    # use gradient_hidden_to_output, gradient_input_to_hidden functions to compute V_new, bias_v, W_new 
    #Insert code here
    V_new, bias_v = gradient_hidden_to_output(O_softmax, Y, Z, bias_z)
    W_new = gradient_input_to_hidden(O_softmax, Y, V, Z, X)

    #returns V_new, W_new, bias_v
    return  W_new, V_new, bias_v


#Implementation of Graident back propogation from Hidden to Input Layer
def gradient_hidden_to_output(O_softmax,Y,Z,bias_z) :
    # function to update V values using backpropogation using matrix operations.
    #Insert code here
    Z_ = initilaise_weights(Z.T).T
    final_result_matrix = np.dot((O_softmax - Y).T, Z_.T)
    bias_v = final_result_matrix.T[final_result_matrix.shape[1]-1]
    final_result_matrix = final_result_matrix[..., :-1]
    bias_v = bias_v.reshape((bias_v.shape[0], 1))

    return final_result_matrix, bias_v

#Implementation of Graident back propogation from Input to Hidden Layer
def gradient_input_to_hidden(O_softmax,Y,V,Z,X) :
    # function to update W values using backpropogation using only matrix operations.
    #Insert code here
    z_ = np.ones(Z.shape) - Z*Z
    z_ = np.mean(z_, axis=1)
    V = V[:, :V.shape[1]-1]
    result_matrix = np.dot(np.dot(X.T,O_softmax - Y), V*z_)
    
    # returns updated W values
    return result_matrix.T



# Function to calculate Z values during forward pass
def compute_Z_values(weights,data_points) :
    # function to update Z during forward pass using matrix operations.
    #Insert code here
    z_values = np.dot(weights, data_points.T)
    z_values = np.tanh(z_values)

    #return calculated z_values
    return z_values


# Function to Calculate output matrix during forward pass
def compute_O_values(weights,z_values) :
    # function to update O during forward pass using matrix operations.
    #Insert code here
    o_values = np.dot(weights, initilaise_weights(z_values.T).T)

    #return calculated o_values
    return o_values.T



# Function to Intialise weights with bias term
def initilaise_weights(data) :
    # function to append bias term.
    # insert code here
    rows, cols = data.shape
    biasColumn = np.ones(shape = (rows, 1))
    final_data = np.concatenate((data, biasColumn), axis=1)

    return final_data



# To intiliase random weights to Matrices such as W, V
def random_weights(number_of_rows,number_of_columns) :
    # Function to assign random weights to W, V
    #Insert code here 
    new_data = np.random.randn(number_of_rows, number_of_columns)

    # return random weights with number_of_rows * number_of_columns
    return new_data


# To divide the data into test train data
def train_test_split(X,Y,fraction) :
    # Function to divide train, validation and test data based on fraction. let fraction = 0.8 then train = 0.75, validation= 0.05 and test = 0.2 
    #Insert code here 
    
    shuffledX, shuffledY = shuffle(X, Y, 107)
    testFraction = 1 - fraction
    
    test_data_x, validation_data_x, data_train_x = np.split(X, [int(testFraction * len(X)), 
                                                        int((testFraction + testFraction*testFraction) * len(X))])
    
    
    test_data_y, validation_data_y, data_train_y = np.split(Y, [int(testFraction * len(Y)), 
                                                        int((testFraction + testFraction*testFraction) * len(Y))])
    
    # return data_train_x,data_train_y,validation_data_x,validation_data_y,test_data_x,test_data_y
    return data_train_x,data_train_y,validation_data_x,validation_data_y,test_data_x,test_data_y


# Shuffle in same order for X,Y
def shuffle(a, b, seed): 
#     temp = list(zip(a, b))
    
#     random.seed(seed)
#     random.shuffle(temp)
    
#     a, b = zip(*temp)
    
    indices = np.arange(a.shape[0])
    
    random.seed(seed)
    np.random.shuffle(indices)
    
    a = a[indices]
    b = b[indices]


    # return shuffled values a,b in same order
    return a,b


if __name__ == "__main__" :
    data = read_data("data.txt")
    Y = read_data("label.txt")
    X = initilaise_weights(data)
    W = random_weights(500,401)
    V = random_weights(10,501)
    Z = compute_Z_values(W,X[:25,:])
    O = compute_O_values(V,Z)
    bias_z = np.empty(shape=(25, 1))
    bias_z.fill(1.0)
    i=0
    learning_rate = 0.01
    train_test_fraction  = 0.8
    train_validation_split = 0.2
    train_data_x,train_data_y,validation_data_x,validation_data_y,test_data_x,test_data_y =train_test_split(X,Y,train_test_fraction)
    number_of_epocs=100
    train_error_epoch = []*(5*number_of_epocs)
    #X = train_data_x
    #Y = train_data_y
    #Dividing the data into training data and test data into 0.8 ratio same ration for train and validation data,values below correspond to 0.8 ratio
    train_data_len = 3200
    validation_data_len = 800
    test_data_len =1000
    validation_error_epoch = [] * (5*number_of_epocs)
    # Running for 5 trails using 100 Epocs and Batch size = 25
    batch_size = 25
    print("Started calculating Training error in 5 Trails for each epoch with Batch size = 25 ")
    print("Started calculating Validation error in 5 Trails for each epoch with Batch size = 25 ")
    # Different trails are performed for 5 times.
    # 5 different trails
    trails = 5
    for k in range(trails) :
        W = random_weights(500, 401)
        V = random_weights(10, 501)
        error_train=0
        error_validation = 0
        # Randomising the data
        seed = random.randint(10000,10000000)
        X,Y = shuffle(X,Y,seed)
        print("Training Error for 100th epoch for Trail Number : "+str(k+1))
        for j in range(number_of_epocs) :
            i=0
            error_train = 0.0
            error_validation = 0.0
            count = 0
            #X,Y = shuffle(X,Y,12345)
            while i < (train_data_len)  :
                i1=i
                # Batch size is 25
                i= i+25
                error,O_softmax=forward_pass(X[i1:i, :], W, V, Y[i1:i, :])
                W_new,V_new,bias_v=backward_pass(O_softmax,Y[i1:i,:],V,Z,W,X[i1:i,:],bias_z)
                #print(W_new.shape)
                W = W - (learning_rate/25)*W_new
                #print(W)
                V_new = np.append(learning_rate*bias_v,V_new,axis=1)
                V = V - (learning_rate/25)*V_new
                error_train+= error
                count+=1
                #print(error1)
                #print(V.shape)
            #print(error)
            #print(j)
            error, O_softmax = forward_pass(X[0:3200, :], W, V, Y[0 : 3200, :])
            error1, O_softmax = forward_pass(X[3200:4000, :], W, V, Y[3200:4000, :])
            print
            error_validation = error1
            count = train_data_len/batch_size
            count1= validation_data_len/batch_size
            error_train = error_train/count
            print("Training error after  epoch : "+str(j+1)+" here every batch size = 25")
            print(error_train)
            print("Validation error after  epoch : "+str(j+1)+" here batch size = 25")
            print(error_validation)
            train_error_epoch.append(error_train)
            validation_error_epoch.append(error_validation)
    print("\n")
    print("Final Training Errors after 5 trails and 100 Epocs : ")
    print(train_error_epoch)
    print("\n")
    print("Final Validation Errors after 5 trails and 100 Epocs : ")
    print(validation_error_epoch)
    mean_training = []
    variance_training = []
    mean_validation = []
    variance_validation = []
    train_error_epoch = np.reshape(train_error_epoch,(trails,number_of_epocs))
    validation_error_epoch = np.reshape(validation_error_epoch,(trails,number_of_epocs))
    mean_training = np.mean(train_error_epoch, axis=0)
    mean_validation = np.mean(validation_error_epoch,axis=0)
    variance_training = np.var(train_error_epoch,axis=0)
    variance_validation = np.var(validation_error_epoch,axis=0)
    mean_training = np.reshape(mean_training,(number_of_epocs,))
    mean_validation = np.reshape(mean_validation, (number_of_epocs))
    variance_training = np.reshape(variance_training,(number_of_epocs,))
    variance_validation = np.reshape(variance_validation,(number_of_epocs,))
    epochs = []
    for i in range(1,number_of_epocs+1) :
        epochs.append(i)
    print("\n")
    print("Plots are started to Generate In Figures Folder : ")
    plt.plot(epochs,mean_training, color='red', label='Training')
    plt.xlabel("Epoch values")
    #plt.title("Plot for Training Error Vs Epochs")
    location = "./figures/lab3.a_TrainingError" + ".png"
    #plt.savefig(location)
    #plt.close()
    plt.plot(epochs, mean_validation,color='blue', label='Validation')
    plt.ylabel("Training,Validation Error values")
    plt.title("Plot for Training,Validation Error Vs Epochs")
    plt.legend(loc='best')
    location = "./figures/lab3.a_TrainingAndValidationError" + ".png"
    plt.savefig(location)
    plt.close()
    #plt.ylim(0.0145,0.01465)
    plt.plot(epochs, mean_training, color='red', label='Training')
    plt.xlabel("Epoch values")
    plt.ylabel("Mean Training Error values")
    plt.title("Plot for Mean Training Error Vs Epochs")
    location = "./figures/lab3.a_MeanTrainingError" + ".png"
    plt.legend(loc='best')
    plt.savefig(location)
    plt.close()
    #plt.ylim(0.0133, 0.134)
    plt.plot(epochs, mean_validation, color='blue', label='Validation')
    plt.xlabel("Epoch values")
    plt.ylabel(" Mean Validation Error values")
    plt.title("Plot for Mean Validation Error Vs Epochs")
    location = "./figures/lab3.a_MeanValidationError" + ".png"
    plt.legend(loc='best')
    plt.savefig(location)
    plt.close()
    #plt.ylim(0.000240, 0.000242)
    plt.plot(epochs, variance_training, color='red', label='Training')
    plt.xlabel("Epoch values")
    plt.ylabel(" Variance Training Error values")
    plt.title("Plot for Variance Training Error Vs Epochs")
    location = "./figures/lab3.a_VarianceTrainingError" + ".png"
    plt.legend(loc='best')
    plt.savefig(location)
    plt.close()
    #plt.ylim(0.000210, 0.000211)
    plt.plot(epochs, variance_validation, color='blue', label='Validation')
    plt.xlabel("Epoch values")
    plt.ylabel("Variance Validation Error values")
    plt.title("Plot for Variance Validation Error Vs Epochs")
    location = "./figures/lab3.a_VarianceValidationError" + ".png"
    plt.legend(loc='best')
    plt.savefig(location)
    plt.close()
    print("\n")
    print("Plots are Generated Successfully In Figures folder")




Started calculating Training error in 5 Trails for each epoch with Batch size = 25 
Started calculating Validation error in 5 Trails for each epoch with Batch size = 25 
Training Error for 100th epoch for Trail Number : 1
Training error after  epoch : 1 here every batch size = 25
15.035148477290713
Validation error after  epoch : 1 here batch size = 25
13.13189156488082
Training error after  epoch : 2 here every batch size = 25
12.509590973247297
Validation error after  epoch : 2 here batch size = 25
11.172910178083995
Training error after  epoch : 3 here every batch size = 25
10.977932048299397
Validation error after  epoch : 3 here batch size = 25
9.794883399075434
Training error after  epoch : 4 here every batch size = 25
9.719523076430104
Validation error after  epoch : 4 here batch size = 25
8.733528150885483
Training error after  epoch : 5 here every batch size = 25
8.665632161595468
Validation error after  epoch : 5 here batch size = 25
7.906000090238052
Training error after  ep

Training error after  epoch : 54 here every batch size = 25
2.682567987946335
Validation error after  epoch : 54 here batch size = 25
3.324513223668027
Training error after  epoch : 55 here every batch size = 25
2.664470604361364
Validation error after  epoch : 55 here batch size = 25
3.311850322329742
Training error after  epoch : 56 here every batch size = 25
2.646963533600637
Validation error after  epoch : 56 here batch size = 25
3.2998923057479828
Training error after  epoch : 57 here every batch size = 25
2.6299901683520637
Validation error after  epoch : 57 here batch size = 25
3.2885859310442758
Training error after  epoch : 58 here every batch size = 25
2.6135289257746916
Validation error after  epoch : 58 here batch size = 25
3.2778741334647474
Training error after  epoch : 59 here every batch size = 25
2.597560293711018
Validation error after  epoch : 59 here batch size = 25
3.267567387139553
Training error after  epoch : 60 here every batch size = 25
2.582049326082986
Valid

Training error after  epoch : 8 here every batch size = 25
7.3871690333412126
Validation error after  epoch : 8 here batch size = 25
6.769128919644472
Training error after  epoch : 9 here every batch size = 25
6.82188433867916
Validation error after  epoch : 9 here batch size = 25
6.280949789387007
Training error after  epoch : 10 here every batch size = 25
6.348734752031685
Validation error after  epoch : 10 here batch size = 25
5.882377391644492
Training error after  epoch : 11 here every batch size = 25
5.957771882581695
Validation error after  epoch : 11 here batch size = 25
5.554890731057574
Training error after  epoch : 12 here every batch size = 25
5.626322626928118
Validation error after  epoch : 12 here batch size = 25
5.28213889009394
Training error after  epoch : 13 here every batch size = 25
5.342097300724044
Validation error after  epoch : 13 here batch size = 25
5.045703084847554
Training error after  epoch : 14 here every batch size = 25
5.097115162942952
Validation erro

Training error after  epoch : 62 here every batch size = 25
2.3217322091526347
Validation error after  epoch : 62 here batch size = 25
2.64607592505629
Training error after  epoch : 63 here every batch size = 25
2.306338766397316
Validation error after  epoch : 63 here batch size = 25
2.635330881084816
Training error after  epoch : 64 here every batch size = 25
2.291418991383169
Validation error after  epoch : 64 here batch size = 25
2.625098760862377
Training error after  epoch : 65 here every batch size = 25
2.276970879038117
Validation error after  epoch : 65 here batch size = 25
2.6153705550059168
Training error after  epoch : 66 here every batch size = 25
2.263042199636339
Validation error after  epoch : 66 here batch size = 25
2.6061339520346634
Training error after  epoch : 67 here every batch size = 25
2.24961081854324
Validation error after  epoch : 67 here batch size = 25
2.5973812949066972
Training error after  epoch : 68 here every batch size = 25
2.2366189188194356
Validat

Training error after  epoch : 16 here every batch size = 25
4.405335489837554
Validation error after  epoch : 16 here batch size = 25
4.594857725093582
Training error after  epoch : 17 here every batch size = 25
4.231297384431185
Validation error after  epoch : 17 here batch size = 25
4.428522713942394
Training error after  epoch : 18 here every batch size = 25
4.078954967336354
Validation error after  epoch : 18 here batch size = 25
4.280739553711578
Training error after  epoch : 19 here every batch size = 25
3.9440887965719247
Validation error after  epoch : 19 here batch size = 25
4.149082465458799
Training error after  epoch : 20 here every batch size = 25
3.8241412869082447
Validation error after  epoch : 20 here batch size = 25
4.0306671533374825
Training error after  epoch : 21 here every batch size = 25
3.7172207878562844
Validation error after  epoch : 21 here batch size = 25
3.9236015782249636
Training error after  epoch : 22 here every batch size = 25
3.6208161108965
Validat

Training error after  epoch : 70 here every batch size = 25
2.2474836421320417
Validation error after  epoch : 70 here batch size = 25
2.594564292632481
Training error after  epoch : 71 here every batch size = 25
2.2385812982707662
Validation error after  epoch : 71 here batch size = 25
2.588213660571224
Training error after  epoch : 72 here every batch size = 25
2.229863828517962
Validation error after  epoch : 72 here batch size = 25
2.5821045379447765
Training error after  epoch : 73 here every batch size = 25
2.221356292313554
Validation error after  epoch : 73 here batch size = 25
2.5762418156930766
Training error after  epoch : 74 here every batch size = 25
2.213022390176892
Validation error after  epoch : 74 here batch size = 25
2.5706052012436555
Training error after  epoch : 75 here every batch size = 25
2.2048559098723834
Validation error after  epoch : 75 here batch size = 25
2.5651763221003487
Training error after  epoch : 76 here every batch size = 25
2.1968975744630055
Va

Training error after  epoch : 24 here every batch size = 25
3.677109324482201
Validation error after  epoch : 24 here batch size = 25
3.891314448433759
Training error after  epoch : 25 here every batch size = 25
3.6081821973873915
Validation error after  epoch : 25 here batch size = 25
3.836213668201128
Training error after  epoch : 26 here every batch size = 25
3.5437184513719338
Validation error after  epoch : 26 here batch size = 25
3.7861248158136918
Training error after  epoch : 27 here every batch size = 25
3.482585059314766
Validation error after  epoch : 27 here batch size = 25
3.7404540921987484
Training error after  epoch : 28 here every batch size = 25
3.424896095397181
Validation error after  epoch : 28 here batch size = 25
3.6985372210128493
Training error after  epoch : 29 here every batch size = 25
3.3703380876143116
Validation error after  epoch : 29 here batch size = 25
3.6600189524250077
Training error after  epoch : 30 here every batch size = 25
3.3191266448464627
Va

Training error after  epoch : 78 here every batch size = 25
2.2482699348153434
Validation error after  epoch : 78 here batch size = 25
2.916156686718927
Training error after  epoch : 79 here every batch size = 25
2.2369428126705495
Validation error after  epoch : 79 here batch size = 25
2.9087350193809107
Training error after  epoch : 80 here every batch size = 25
2.2258357334133683
Validation error after  epoch : 80 here batch size = 25
2.9015110445090095
Training error after  epoch : 81 here every batch size = 25
2.214952567663554
Validation error after  epoch : 81 here batch size = 25
2.894512437472705
Training error after  epoch : 82 here every batch size = 25
2.2042668988826115
Validation error after  epoch : 82 here batch size = 25
2.8875611987560705
Training error after  epoch : 83 here every batch size = 25
2.1937331953649513
Validation error after  epoch : 83 here batch size = 25
2.880673338092639
Training error after  epoch : 84 here every batch size = 25
2.183331222193684
Va

Training error after  epoch : 32 here every batch size = 25
3.3537434053949546
Validation error after  epoch : 32 here batch size = 25
3.5813035088950484
Training error after  epoch : 33 here every batch size = 25
3.300825030474059
Validation error after  epoch : 33 here batch size = 25
3.539744958299615
Training error after  epoch : 34 here every batch size = 25
3.2504292311937557
Validation error after  epoch : 34 here batch size = 25
3.500987789791534
Training error after  epoch : 35 here every batch size = 25
3.2027278485203015
Validation error after  epoch : 35 here batch size = 25
3.464932048613255
Training error after  epoch : 36 here every batch size = 25
3.1574650596471456
Validation error after  epoch : 36 here batch size = 25
3.430971816455708
Training error after  epoch : 37 here every batch size = 25
3.1142084192017614
Validation error after  epoch : 37 here batch size = 25
3.3993149077376428
Training error after  epoch : 38 here every batch size = 25
3.072910802163206
Val

Training error after  epoch : 86 here every batch size = 25
2.2373855710781685
Validation error after  epoch : 86 here batch size = 25
2.8352009978558432
Training error after  epoch : 87 here every batch size = 25
2.229769126517311
Validation error after  epoch : 87 here batch size = 25
2.830410705602529
Training error after  epoch : 88 here every batch size = 25
2.222287829663174
Validation error after  epoch : 88 here batch size = 25
2.8257639052505543
Training error after  epoch : 89 here every batch size = 25
2.2149420674748113
Validation error after  epoch : 89 here batch size = 25
2.821264433760084
Training error after  epoch : 90 here every batch size = 25
2.2076617947860298
Validation error after  epoch : 90 here batch size = 25
2.8169154480833116
Training error after  epoch : 91 here every batch size = 25
2.2004923705413857
Validation error after  epoch : 91 here batch size = 25
2.812719380074964
Training error after  epoch : 92 here every batch size = 25
2.193474436586178
Val



Plots are Generated Successfully In Figures folder


In [14]:
err, o_softmax = forward_pass(X[4000: ], W, V, Y[4000:])
correct_predictions = 0
for o, y in zip(o_softmax, Y[4000:]):
    if np.argmax(o) == np.argmax(y):
        correct_predictions += 1
print(correct_predictions)

779


In [15]:
print(f'Testing Accuracy = {cnt / 1000 * 100}%')

Testing Accuracy = 77.9%


**Observations:** Along with each epoch, training error and validation error were reducing. Which suggests the MLP worked fine. Also the network achieves the accuracy of 78%