# MNIST
Our objective is to build a neural network that classifies the handwritten digits in the MNIST dataset. The network has an input layer of 784 nodes, one per image pixel, followed by two layers of 10 nodes each: a hidden layer with ReLU activation and an output layer with softmax activation. Its structure is outlined below, where $X$ is the input, $A^{[0]}$ denotes the input layer, $Z^{[1]}$ is the unactivated layer 1, $A^{[1]}$ is the activated layer 1, and so forth. The weights and biases are represented by $W$ and $b$ respectively:


<div align="center">

$A^{[0]}=X$

$Z^{[1]}=W^{[1]}A^{[0]}+b^{[1]}$

$A^{[1]}=\text{ReLU}(Z^{[1]})$

$Z^{[2]}=W^{[2]}A^{[1]}+b^{[2]}$

$A^{[2]}=\text{softmax}(Z^{[2]})$
</div>
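
To make the shapes in these equations concrete, the following sketch pushes a random batch through both layers. It assumes that each column of $X$ is one flattened image, so $X$ has shape (784, m). The batch size `m = 5`, the random inputs, and the small weight scale are illustrative assumptions, not MNIST data or trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 5                                   # hypothetical batch size
X = rng.random((784, m))                # stand-in for m flattened 28x28 images

# Parameters with the layer sizes described above (values are random)
W1, b1 = rng.standard_normal((10, 784)) * 0.01, np.zeros((10, 1))
W2, b2 = rng.standard_normal((10, 10)) * 0.01, np.zeros((10, 1))

A0 = X
Z1 = W1 @ A0 + b1                       # (10, m)
A1 = np.maximum(0, Z1)                  # ReLU
Z2 = W2 @ A1 + b2                       # (10, m)
E = np.exp(Z2 - Z2.max(axis=0, keepdims=True))
A2 = E / E.sum(axis=0, keepdims=True)   # softmax over the 10 classes

print(Z1.shape, A2.shape)
print(A2.sum(axis=0))                   # each column is a probability distribution
```

Storing the biases as (10, 1) column vectors lets NumPy broadcast them across all m columns.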




You are free to create any function inside or outside the class and to modify parameters as needed.

In [1]:
#importing libraries
import pandas as pd
import numpy as np

from keras.datasets import mnist
import matplotlib.pyplot as plt

### Required functions

In [2]:
# activation and loss functions
def ReLU(x):
    return np.maximum(0,x)


def derivative_ReLU(x):
    return np.where(x > 0, 1, 0)

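def softmax(x):
    # subtract the per-column maximum for numerical stability (one sample per column)
    exp_x = np.exp(x - np.max(x, axis=0, keepdims=True))
    return exp_x / exp_x.sum(axis=0, keepdims=True)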
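
Subtracting the maximum before exponentiating is what keeps `softmax` from overflowing. The sketch below contrasts a naive version with a shifted one on deliberately large logits. The names `naive_softmax` and `stable_softmax` are illustrative, and the shift is taken per column on the assumption that each column holds one sample.

```python
import numpy as np

def naive_softmax(z):
    # Exponentiates the raw logits; overflows for large values
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

def stable_softmax(z):
    # Shifts each column so that its largest logit is 0 before exponentiating
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

# Two samples (columns) whose logits differ only by a constant shift
z = np.array([[1000.0, 1.0],
              [1001.0, 2.0],
              [1002.0, 3.0]])

with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(z)
stable = stable_softmax(z)

print(np.isnan(naive[:, 0]).all())   # the large column overflows to inf/inf
print(stable.sum(axis=0))            # both columns still sum to 1
```

Because softmax is invariant to adding a constant to every logit in a column, the shift changes nothing mathematically; it only keeps the intermediate values finite.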


In [3]:
#complete the class of neural network

class NN:
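  def __init__():
      # called directly on the class (NN.__init__()) to create the initial parameters
      W1 = np.random.randn(10, 784)
      # biases are column vectors so they broadcast across the samples in each column
      b1 = np.zeros((10, 1))
      W2 = np.random.randn(10, 10)
      b2 = np.zeros((10, 1))
      return W1, b1, W2, b2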

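  @staticmethod
  def forward_propagation(x,w1,w2,b1,b2):
      # x holds one flattened image per column: shape (784, m)
      z1 = np.dot(w1,x)+b1
      a1 = ReLU(z1)
      z2 = np.dot(w2,a1)+b2
      a2 = softmax(z2)   # class probabilities, shape (10, m)
      return a2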

  @staticmethod
  def one_hot(y_train):
      # one row per label: all zeros except a 1 at the index of the digit; shape (m, 10)
      return np.eye(10)[y_train]

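  @staticmethod
  def loss(out, Y):
      # mean squared error between the network output and the one-hot targets;
      # out and Y must both have shape (10, m), with one sample per column
      s = np.square(out - Y)
      s = np.sum(s) / Y.shape[1]
      return s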

  @staticmethod
  def backward_propagation(x,y,w1,b1,w2,b2):
      # x: (784, m) inputs, y: (10, m) one-hot targets, one sample per column
      m = x.shape[1]
      # forward pass
      z1 = np.dot(w1,x)+b1
      a1 = ReLU(z1)
      z2 = np.dot(w2,a1)+b2
      a2 = softmax(z2)
      # backward pass, averaged over the m samples
      dz2 = a2-y
      dw2 = np.dot(dz2,a1.T)/m
      db2 = np.sum(dz2, axis=1, keepdims=True)/m
      dz1 = np.dot(w2.T,dz2)*derivative_ReLU(z1)
      dw1 = np.dot(dz1, x.T)/m
      db1 = np.sum(dz1, axis=1, keepdims=True)/m
      return dw1,dw2,db1,db2

  @staticmethod
  def update_params(w1,w2,b1,b2,dw1,dw2,db1,db2,lr=0.001):
      # one gradient-descent step with learning rate lr
      w1 = w1 - lr*dw1
      b1 = b1 - lr*db1
      w2 = w2 - lr*dw2
      b2 = b2 - lr*db2
      return w1,w2,b1,b2


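  @staticmethod
  def get_predictions(w1,w2,b1,b2,xtest):
      # predicted digit = index of the largest softmax probability in each column
      a2 = NN.forward_propagation(xtest,w1,w2,b1,b2)
      return np.argmax(a2, axis=0)

  @staticmethod
  def get_accuracy(y_pred,y_actual):
      # fraction of predictions that match the true labels
      return np.sum(y_pred==y_actual)/len(y_pred)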


  @staticmethod
  def gradient_descent(x,y,w1,w2,b1,b2):
      # full-batch gradient descent for 1000 iterations;
      # backward_propagation runs its own forward pass, so no separate call is needed
      for i in range(1000):
        dw1,dw2,db1,db2 = NN.backward_propagation(x,y,w1,b1,w2,b2)
        w1,w2,b1,b2 = NN.update_params(w1,w2,b1,b2,dw1,dw2,db1,db2)

      return w1,w2,b1,b2

  def make_predictions(self):
      pass

  def show_prediction(self): #show the prediction and actual output for an image in mnist dataset
      pass
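
The expression `dz2 = a2 - y` used in `backward_propagation` is the gradient of the cross-entropy loss with respect to $Z^{[2]}$ when the output layer is a softmax. The sketch below checks this numerically with central finite differences on a tiny example. The helper names `softmax_cols` and `cross_entropy`, the 3-class logits, and the step size `eps` are assumptions made for illustration and are not part of the class above.

```python
import numpy as np

def softmax_cols(z):
    # Column-wise softmax with the usual max shift for stability
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def cross_entropy(z, y):
    # Cross-entropy summed over the batch, as a function of the logits z
    a = softmax_cols(z)
    return -np.sum(y * np.log(a))

rng = np.random.default_rng(1)
z = rng.standard_normal((3, 4))            # 3 classes, 4 samples
y = np.eye(3)[:, rng.integers(0, 3, 4)]    # one-hot targets, one per column

analytic = softmax_cols(z) - y             # the claimed gradient

# Central finite differences, one logit at a time
numeric = np.zeros_like(z)
eps = 1e-6
for i in range(z.shape[0]):
    for j in range(z.shape[1]):
        zp, zm = z.copy(), z.copy()
        zp[i, j] += eps
        zm[i, j] -= eps
        numeric[i, j] = (cross_entropy(zp, y) - cross_entropy(zm, y)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # should be close to zero
```

The same finite-difference idea can be applied to the weights and biases to check a full backward pass, at the cost of one loss evaluation pair per parameter.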


## main

In [4]:
(X_train, Y_train), (X_test, Y_test) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


### Preprocessing the data
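
Each 28x28 image has to be flattened into a 784-element column before it can be multiplied by $W^{[1]}$. One thing to watch for is that `ndarray.resize` modifies an array in place and returns `None`, whereas `ndarray.reshape` returns the reshaped array. A minimal sketch on a small random stand-in for the image array (the array `imgs` and its size are assumptions for illustration):

```python
import numpy as np

imgs = np.random.default_rng(0).random((6, 28, 28))  # stand-in for 6 images

# reshape returns a new array: flatten each image, then put samples in columns
flat = imgs.reshape(imgs.shape[0], -1).T
print(flat.shape)                       # (784, 6)

# resize changes the array in place, so its return value is None
result = imgs.copy().resize(784, 6)
print(result)                           # None
```

Assigning the return value of `resize` to a variable therefore stores `None`, which only surfaces as an error later, when the variable is used in arithmetic.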


In [5]:
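X_train, X_test = X_train / 255.0, X_test / 255.0
# one-hot targets with one column per sample: shape (10, m)
Y_train_onehot = NN.one_hot(Y_train).T

w1,b1,w2,b2=NN.__init__()
# flatten each 28x28 image into a column: shape (784, m);
# reshape returns the new array, whereas resize works in place and returns None
a0 = X_train.reshape(X_train.shape[0], -1).T
X_test = X_test.reshape(X_test.shape[0], -1).T

### Model Training

In [6]:
w1,w2,b1,b2=NN.gradient_descent(a0,Y_train_onehot,w1,w2,b1,b2)
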

### Viewing Results


In [None]:
Y_pred = NN.get_predictions(w1,w2,b1,b2,X_test)
print(NN.get_accuracy(Y_pred,Y_test))
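
Accuracy compares predicted class labels with the true labels, so the softmax output first has to be reduced to one label per column with `argmax`. A small sketch with made-up probabilities (`probs` and `labels` are illustrative values, not output of the model above):

```python
import numpy as np

# Softmax-style output for 3 samples (columns) over 3 classes (rows)
probs = np.array([[0.7, 0.1, 0.2],
                  [0.2, 0.8, 0.3],
                  [0.1, 0.1, 0.5]])
labels = np.array([0, 1, 1])          # true class of each sample

pred = np.argmax(probs, axis=0)       # most probable class per column
accuracy = np.mean(pred == labels)    # fraction of correct predictions
print(pred, accuracy)
```

Comparing the raw probability matrix with the label vector directly would not measure accuracy, because the two arrays have different shapes and meanings.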
