## Neural Network with Backpropagation
---
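This exercise implements a feedforward neural network with backpropagation from scratch, trained on 20x20 handwritten-digit images from the MNIST dataset. The input layer has 401 neurons, one per pixel of the 20x20 image plus a bias (400 pixels + 1 bias). The hidden layer has 26 neurons (25 + 1 bias); note that the code below computes all 26 as ordinary sigmoid units rather than fixing one of them to 1. The output layer has 10 neurons, one for each of the digits 0-9.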

#### Importing dependencies

In [1]:
import numpy as np
from scipy.io import loadmat

#### Sigmoidal activation function

In [2]:
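def sigmoid(x, deriv=False):
    # With deriv=True, x must already be a sigmoid output a = sigmoid(z);
    # the derivative with respect to z is then a * (1 - a).
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))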
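
The `deriv=True` branch is easy to misuse, because it expects the activation and not the pre-activation. The following is a minimal sketch that checks the formula against a central finite difference; the names `z`, `a`, `analytic` and `numeric` and the sample values are illustrative and not part of the original notebook.

```python
import numpy as np

def sigmoid(x, deriv=False):
    # With deriv=True, x must already be a sigmoid output a = sigmoid(z).
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

z = np.array([-2.0, 0.0, 3.0])   # pre-activations (illustrative values)
a = sigmoid(z)                   # activations

analytic = sigmoid(a, deriv=True)                        # a * (1 - a)
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)    # central difference

print(np.allclose(analytic, numeric, atol=1e-8))
```

Passing `z` instead of `a` to the `deriv=True` branch would compute `z * (1 - z)`, which is not the sigmoid derivative.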

#### Method to evaluate accuracy

In [3]:
def accuracy(A2, y):
    # Percentage of rows where the highest-scoring output unit
    # matches the position of the 1 in the one-hot label row.
    # The sample count is taken from the data instead of a hard-coded 5000.
    predicted = np.argmax(A2, axis=1)
    actual = np.argmax(y, axis=1)
    return np.mean(predicted == actual) * 100
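
The accuracy function only compares column indices. To turn an output row back into a digit, the column index has to be mapped to a label. The sketch below assumes that column `k` scores label `k + 1` and that label 10 stands for the digit 0, which matches the one-hot encoding used later in this notebook; `predict_digits` and `scores` are hypothetical names introduced for illustration.

```python
import numpy as np

def predict_digits(A):
    """Map each row of output activations to a digit 0-9.

    Assumes column k holds the score for label k + 1 and that
    label 10 stands for the digit 0.
    """
    labels = np.argmax(A, axis=1) + 1   # labels in 1..10
    return labels % 10                  # label 10 -> digit 0

scores = np.array([
    [0.9, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],  # highest score in column 0
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.8],  # highest score in column 9
])
digits = predict_digits(scores)
```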

#### Cross Entropy cost function

In [4]:
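def cost(y, A):
    # Summed log-likelihood of the labels, i.e. the negative of the usual
    # cross-entropy cost: values are negative and move toward 0 as the fit improves.
    log_likelihood = y*np.log(A) + (1-y)*np.log(1-A)
    return np.sum(log_likelihood)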
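
For comparison, the cross-entropy cost is more commonly written with a leading minus sign and averaged over the examples, so that it is non-negative and independent of the data set size. The following is a sketch of that variant, not the function used for the numbers printed in this notebook; `cross_entropy`, the `eps` clipping parameter and the demo arrays are assumptions added for illustration.

```python
import numpy as np

def cross_entropy(y, A, eps=1e-12):
    """Mean binary cross-entropy over the m rows of y (non-negative)."""
    A = np.clip(A, eps, 1 - eps)   # keep log() away from 0
    m = y.shape[0]
    return -np.sum(y * np.log(A) + (1 - y) * np.log(1 - A)) / m

y_demo = np.array([[1.0, 0.0], [0.0, 1.0]])   # one-hot targets
A_demo = np.array([[0.9, 0.2], [0.1, 0.7]])   # predicted probabilities
loss = cross_entropy(y_demo, A_demo)
```

The clipping step guards against `log(0)` when an output saturates at exactly 0 or 1.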

#### Initializing data

In [5]:
data = loadmat('ex4data1.mat')

X = np.array(data['X'])
y = np.array(data['y'])
y = y.reshape(y.shape[0],1)

X = np.insert(X, 0, np.ones(X.shape[0]), axis=1)

#### Random Initialization of weight matrices to break symmetry

In [6]:
np.random.seed(1)
epsilon_init = 0.12
# Uniform values in [-epsilon_init, epsilon_init)
w0 = np.random.random((401,26))*2*epsilon_init - epsilon_init
w1 = np.random.random((26,10))*2*epsilon_init - epsilon_init
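
The notebook does not say how 0.12 was chosen. One commonly used heuristic sets the range from the layer sizes as `sqrt(6) / sqrt(l_in + l_out)`, which gives roughly 0.119 for 400 inputs and 25 hidden units. The sketch below illustrates that heuristic; `init_epsilon`, `rand_weights` and the use of `np.random.default_rng` are assumptions, not the notebook's own code.

```python
import numpy as np

def init_epsilon(l_in, l_out):
    # Heuristic range for uniform initialization, based on layer sizes.
    return np.sqrt(6) / np.sqrt(l_in + l_out)

def rand_weights(rows, cols, eps, rng):
    # Uniform weights in [-eps, eps); distinct values break the symmetry
    # between hidden units so they do not all learn the same feature.
    return rng.uniform(-eps, eps, size=(rows, cols))

rng = np.random.default_rng(1)
eps = init_epsilon(400, 25)
w0_demo = rand_weights(401, 26, eps, rng)
```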

#### Hyperparameters

In [7]:
alpha = 1
iters = 1500

#### One-hot encoding the labels for One-vs-Rest Classification

In [8]:
# One row per example, one column per class; label k (1-10) sets column k-1
temp = np.zeros([y.shape[0],10])
for i in range(y.shape[0]):
    temp[i][y[i]-1] = 1

y = temp
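
The same encoding can be written without a Python loop by indexing rows and columns together. This is a minimal sketch on a made-up label vector; `labels` and `one_hot` are illustrative names, and the labels are assumed to be integers from 1 to 10 in a column of shape `(m, 1)`, as in the cell above.

```python
import numpy as np

labels = np.array([[10], [1], [3]])          # illustrative labels in 1..10, shape (m, 1)
m = labels.shape[0]

one_hot = np.zeros((m, 10))
one_hot[np.arange(m), labels.ravel() - 1] = 1   # row i gets a 1 in column label_i - 1
```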

#### Gradient Descent

In [9]:
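for i in range(iters):
    # Forward propagation
    z0 = np.dot(X, w0)
    A0 = sigmoid(z0)      # hidden-layer activations

    z1 = np.dot(A0, w1)
    A1 = sigmoid(z1)      # output-layer activations

    # Output error; with sigmoid outputs and cross-entropy this is also the output delta
    A1_error = A1 - y

    # Backward propagation
    w1 = w1 - ((alpha/len(X)) * np.dot(A0.T, A1_error))

    # Hidden-layer error and delta. Note that w1 has already been updated at
    # this point; textbook backpropagation uses the pre-update w1 here.
    A0_error = np.dot(A1_error, w1.T)
    A0_delta = A0_error * sigmoid(A0, deriv=True)

    w0 = w0 - ((alpha/len(X)) * np.dot(X.T, A0_delta))

    if i % 300 == 0:
        print("Epoch " + str(i) + ": " + str(cost(y, A1)))

# A1 comes from the last forward pass, i.e. before the final weight update
print("Training Accuracy:" + str(accuracy(A1, y)))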

Epoch 0: -35861.91575529514
Epoch 300: -4354.6332446959295
Epoch 600: -2650.3834673949477
Epoch 900: -2129.9577118279844
Epoch 1200: -1823.4065436394221
Training Accuracy:96.26
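
The loop above updates `w1` before it propagates the error back to the hidden layer. In the textbook form of backpropagation, both gradients are computed from the weights used in the forward pass and the updates are applied afterwards. The following is a sketch of one such step on a tiny synthetic problem; `train_step`, `loss` and the `*_demo` arrays are hypothetical and were not used to produce the numbers printed above.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_step(X, y, w0, w1, alpha):
    """One batch gradient-descent step for a two-layer sigmoid network
    with cross-entropy loss. Returns updated copies of w0 and w1."""
    m = X.shape[0]
    # Forward pass
    A0 = sigmoid(X @ w0)
    A1 = sigmoid(A0 @ w1)
    # Backward pass: both deltas use the weights from the forward pass
    delta1 = A1 - y                             # output-layer delta
    delta0 = (delta1 @ w1.T) * A0 * (1 - A0)    # hidden-layer delta
    grad_w1 = A0.T @ delta1 / m
    grad_w0 = X.T @ delta0 / m
    # Update only after both gradients are known
    return w0 - alpha * grad_w0, w1 - alpha * grad_w1

# Tiny synthetic problem (made-up data, not the digit set)
rng = np.random.default_rng(0)
X_demo = np.hstack([np.ones((8, 1)), rng.random((8, 3))])   # bias column + 3 features
y_demo = np.eye(2)[rng.integers(0, 2, size=8)]              # one-hot targets, 2 classes
w0_demo = rng.uniform(-0.12, 0.12, size=(4, 5))
w1_demo = rng.uniform(-0.12, 0.12, size=(5, 2))

def loss(w0, w1):
    # Mean cross-entropy over the demo examples
    A1 = sigmoid(sigmoid(X_demo @ w0) @ w1)
    return -np.mean(np.sum(y_demo * np.log(A1) + (1 - y_demo) * np.log(1 - A1), axis=1))

before = loss(w0_demo, w1_demo)
for _ in range(200):
    w0_demo, w1_demo = train_step(X_demo, y_demo, w0_demo, w1_demo, alpha=0.5)
after = loss(w0_demo, w1_demo)
```

With a small learning rate the two orderings usually behave similarly, but computing both gradients first keeps each step an exact gradient step on the current weights.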
