# Neural Networks
This exercise implements a neural network that recognizes handwritten digits. A neural network can represent complex models that form non-linear hypotheses. The parameters of an already trained network are provided, so the goal here is only to implement the feedforward propagation algorithm and use it for prediction.

## Model representation
For this problem set, the network has 3 layers: an input layer, a hidden layer, and an output layer. The inputs are the pixel values of 20 x 20 digit images, which gives 400 input units plus 1 extra bias unit. The hidden layer has 25 units (plus 1 bias unit), and the output layer has 10 units, one classifier for each output class.

The provided network parameters ($\Theta^{(1)}$, $\Theta^{(2)}$) are already trained.
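
The shapes of the two weight matrices follow from the layer sizes: $\Theta^{(l)}$ maps layer $l$ (plus its bias unit) to layer $l+1$, so it has $s_{l+1}$ rows and $s_l + 1$ columns. A minimal sketch of that bookkeeping is shown here; the names `layer_sizes` and `theta_shapes` are hypothetical helpers for illustration and are not part of the exercise files.

```python
# Layer sizes taken from the text: 400 inputs, 25 hidden units, 10 outputs.
layer_sizes = [400, 25, 10]

def theta_shapes(sizes):
    """Return the expected (rows, cols) of each weight matrix.

    Theta^(l) maps layer l (plus one bias unit) to layer l+1,
    so its shape is (s_{l+1}, s_l + 1).
    """
    return [(n_out, n_in + 1) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

print(theta_shapes(layer_sizes))  # [(25, 401), (10, 26)]
```

Comparing these expected shapes with the shapes of the loaded matrices is a quick sanity check before running the forward pass.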

In [14]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.io

In [15]:
mat = scipy.io.loadmat('datasets/ex3data1.mat')
X, y = mat['X'], mat['y']
print('X shape: {}'.format(X.shape))
print('y shape: {}'.format(y.shape))

X shape: (5000, 400)
y shape: (5000, 1)


In [16]:
mat = scipy.io.loadmat('datasets/ex3weights.mat')
Theta1, Theta2 = mat['Theta1'], mat['Theta2']

print('Theta1 shape: {}'.format(Theta1.shape))
print('Theta2 shape: {}'.format(Theta2.shape))

Theta1 shape: (25, 401)
Theta2 shape: (10, 26)


### Feedforward Propagation

$a^{(1)} = x$  (add $a_0^{(1)}$)

$z^{(2)} = \Theta^{(1)}a^{(1)}$<br>
$a^{(2)} = g(z^{(2)})$  (add $a_0^{(2)}$)

$z^{(3)} = \Theta^{(2)}a^{(2)}$<br>
$a^{(3)} = g(z^{(3)}) = h_\Theta(x)$
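
The equations above describe one example $x$ at a time. As a sketch under stated assumptions, they can be written literally for a single input vector. The function `feedforward_single` and the randomly generated `Theta1_demo` and `Theta2_demo` are hypothetical stand-ins used only to make the example self-contained; they are not the trained weights of this exercise.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def feedforward_single(x, Theta1, Theta2):
    """Forward pass for one example x of shape (400,)."""
    a1 = np.concatenate(([1.0], x))            # add a_0^(1)  -> (401,)
    z2 = Theta1 @ a1                           # (25,401) @ (401,) -> (25,)
    a2 = np.concatenate(([1.0], sigmoid(z2)))  # add a_0^(2)  -> (26,)
    z3 = Theta2 @ a2                           # (10,26) @ (26,) -> (10,)
    return sigmoid(z3)                         # a^(3) = h(x)

# Assumption: random demo weights, NOT the trained parameters.
rng = np.random.default_rng(0)
x_demo = rng.random(400)
Theta1_demo = rng.normal(scale=0.1, size=(25, 401))
Theta2_demo = rng.normal(scale=0.1, size=(10, 26))

h = feedforward_single(x_demo, Theta1_demo, Theta2_demo)
print(h.shape)  # (10,)
```

The vectorized version used in this notebook computes the same quantities for all examples at once by stacking the examples as rows and multiplying by the transposed weight matrices.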

In [17]:
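def sigmoid(z):
    """Logistic function g(z), applied element-wise."""
    return 1/(1 + np.exp(-z))

m, n = X.shape # m examples, n input features

# Input layer: prepend a column of ones as the bias unit a_0
a1 = np.insert(X, 0, 1, 1) # (5000,401)

# Hidden layer
z2 = a1 @ Theta1.T # (5000,401)x(401,25) -> (5000,25)
a2 = sigmoid(z2)

# Prepend the hidden-layer bias unit a_0
a2 = np.insert(a2, 0, 1, 1) # (5000,26)

# Output layer: one sigmoid activation per class
z3 = a2 @ Theta2.T # (5000,26)x(26,10) -> (5000,10)
a3 = sigmoid(z3)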

print('h(x) shape: {}'.format(a3.shape))

h(x) shape: (5000, 10)


### Predict

In [30]:
# Columns 0..9 of a3 correspond to labels 1..10, hence the +1
y_prob, y_pred = np.max(a3, axis=1), np.argmax(a3, axis=1)+1

print('Predicted, Probability')
np.vstack([y_pred, y_prob]).T

Predicted, Probability


array([[10.        ,  0.99573401],
       [10.        ,  0.99569693],
       [10.        ,  0.9280084 ],
       ...,
       [ 9.        ,  0.64982695],
       [ 9.        ,  0.9714105 ],
       [ 9.        ,  0.69628899]])

In [27]:
print('Accuracy: {}'.format(np.mean(y_pred == y.flatten())))

Accuracy: 0.9752
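
The prediction step can be wrapped in small helpers. The sketch below assumes that the labels are 1-based with label 10 standing for the digit 0; the text above does not state this, although the predicted label 10 in the output is consistent with it. The names `predict_labels`, `labels_to_digits`, and `probs_demo` are hypothetical and used only for illustration.

```python
import numpy as np

def predict_labels(probs):
    """Pick the strongest output unit per row; columns 0..9 map to labels 1..10."""
    return np.argmax(probs, axis=1) + 1

def labels_to_digits(labels):
    """Assumed convention: label 10 stands for digit 0; labels 1..9 are unchanged."""
    return labels % 10

# Two made-up rows of output activations (not results from the trained network).
probs_demo = np.array([
    [0.1, 0.8, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.9],
])

labels = predict_labels(probs_demo)
print(labels.tolist())                    # [2, 10]
print(labels_to_digits(labels).tolist())  # [2, 0]
```

Note that each output unit is an independent sigmoid, so the activations in a row need not sum to 1; the maximum is a confidence score rather than a normalized probability.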
