# Neural Networks Coding Challenge

Objectives:
  * Write a simple three-layer network
  * Compute forward propagation for a new sample in a three-layer network
  * Compute backward propagation in the same network
  * Use MLPClassifier to train and test on a dataset

### Background

Except for the MLPClassifier objective, every task in this coding challenge uses the following neural net:

![Simple Neural Net](https://www.lucidchart.com/publicSegments/view/a5b0773e-7165-450d-99fc-7089891e099a/image.png)

### 1. Write a simple three-layer network

Create variables to store weights and biases for the above network. Initialize each with $0.5$.

In [0]:
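import numpy as np

# One row per layer: [weight on the bias unit, weight on the incoming value].
w = np.array([[0.5, 0.5] for i in range(3)])
# Values of the three bias units.
b = [0.5, 0.5, 0.5]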

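### 2. Compute forward propagation for a new sample in a three-layer network

Write a function `feed_forward` that takes a new sample $x$ and calculates $\hat{y}$ via forward propagation. The function returns the values of every layer as a list; $\hat{y}$ is the single entry of the last array.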

In [0]:
x1 = 4
x2 = 0.5
x3 = 0.125
y1 = 0
y2 = 1
y3 = 1

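def g(x):
    # Softplus activation, a smooth approximation of ReLU.
    # ReLU alternative: return x * (x > 0)
    return np.log(1 + np.exp(x))

def feed_forward(x):
    # l[0] and l[1] hold [bias unit value for the next layer, activation of this layer];
    # l[2] holds only the network output y_hat.
    l = []
    l.append(np.array([b[1], g(np.dot([b[0], x], w[0]))]))
    l.append(np.array([b[2], g(np.dot(l[0], w[1]))]))
    l.append(np.array([g(np.dot(l[1], w[2]))]))
    return l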
    
print(feed_forward(x1))

[array([0.5       , 2.35020656]), array([0.5      , 1.6406046]), array([1.36513736])]
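The three arrays printed above can be reproduced by hand, which helps confirm how the forward pass chains the layers together. The following is a minimal sketch, assuming the softplus activation and the all-$0.5$ weights and bias units used in the cells above; the names `softplus`, `weight`, `bias_unit`, `h1` and `h2` are illustrative and not part of the original notebook.

```python
import math

def softplus(z):
    # Same activation as g above: log(1 + e^z).
    return math.log1p(math.exp(z))

weight = 0.5      # every weight in the network (assumed, as initialized above)
bias_unit = 0.5   # value of every bias unit (assumed, as initialized above)
x = 4             # the sample x1

# Each layer computes softplus(weight * bias_unit + weight * previous_value).
h1 = softplus(weight * bias_unit + weight * x)
h2 = softplus(weight * bias_unit + weight * h1)
y_hat = softplus(weight * bias_unit + weight * h2)

print(round(h1, 4), round(h2, 4), round(y_hat, 4))
# prints: 2.3502 1.6406 1.3651
```

The rounded values match the second entry of each array returned by `feed_forward(x1)`.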


In [0]:
# TEST
y_hat1 = feed_forward(x1)
y_hat2 = feed_forward(x2)
y_hat3 = feed_forward(x3)

print(y_hat1[2][0])
print(y_hat2[2][0])
print(y_hat3[2][0])


1.365137361104654
1.1808110866847727
1.167785333902784


### 3. Compute backward propagation for the same network

The backprop algorithm is derived from the goal of minimizing the error (or loss) function $\epsilon = (y - \hat{y})^2$.

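Substituting $\hat{y} = \sigma(h_2+b_2)$ gives

$\epsilon = (y - \sigma(h_2+b_2))^2$

Via the chain rule, the derivative of the above with respect to $h_2$ is

$\frac{\partial \epsilon}{\partial h_2} = \frac{\partial \epsilon}{\partial \hat{y}} \frac{\partial \hat{y}}{\partial h_2} = -2(y-\hat{y})\sigma'(h_2+b_2)$

Let $\alpha = 0.1$. Update the weights for $h_2$ and $h_1$ via back propagation by stepping against the gradient, so that $h_2 \leftarrow h_2 - \alpha \frac{\partial \epsilon}{\partial h_2}$ and $h_1 \leftarrow h_1 - \alpha \frac{\partial \epsilon}{\partial h_1}$, where $\frac{\partial \epsilon}{\partial h_1}$ is obtained by propagating $\frac{\partial \epsilon}{\partial h_2}$ back one layer.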

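Also, let $\sigma(x) = \mathrm{ReLU}(x)$. As such, $\sigma'(x) = 0$ when $x \le 0$ and $\sigma'(x) = 1$ when $x > 0$. The code cells in this notebook use the softplus $\log(1 + e^x)$, a smooth approximation of ReLU, whose derivative is $\frac{e^x}{1 + e^x}$; the ReLU versions are left as comments.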

See Case 1 of [Brian Dolhansky's post](http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4) for a more detailed explanation of the values used in back propagation.
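A hand-derived gradient of $\epsilon$ can be sanity-checked numerically before it is used in an update. The sketch below is one way to do that, assuming a single ReLU unit with pre-activation `z` and the squared error defined above; `relu`, `relu_prime`, `loss`, `analytic_grad` and `numeric_grad` are hypothetical helper names introduced only for this illustration.

```python
def relu(z):
    return z if z > 0 else 0.0

def relu_prime(z):
    # Derivative of ReLU: 0 for z <= 0, 1 for z > 0.
    return 1.0 if z > 0 else 0.0

def loss(z, y):
    # Squared error with y_hat = relu(z).
    return (y - relu(z)) ** 2

def analytic_grad(z, y):
    # Chain rule: d(loss)/dz = -2 * (y - y_hat) * relu'(z).
    return -2.0 * (y - relu(z)) * relu_prime(z)

def numeric_grad(z, y, step=1e-6):
    # Central finite difference approximation of d(loss)/dz.
    return (loss(z + step, y) - loss(z - step, y)) / (2 * step)

# Compare both gradients on either side of the ReLU kink at z = 0.
checks = [(z, analytic_grad(z, 1.0), numeric_grad(z, 1.0)) for z in (1.5, -0.7)]
for z, exact, approx in checks:
    print(z, exact, approx)
```

The two gradients should agree closely for any `z` that is not right at the kink, where the finite difference straddles both branches.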


In [0]:
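def dg(x):
    # Derivative of softplus, i.e. the logistic sigmoid.
    # ReLU alternative: return [0 if i <= 0 else 1 for i in x]
    return np.exp(x) / (1 + np.exp(x))

def feed_forward_and_back_propagate(x, y):
    a = 0.1  # learning rate (alpha)
    l = feed_forward(x)
    e = [[] for i in range(3)]
    # Output error term (y_hat - y); the leading 0 is the slot for the bias unit.
    # Indexing l[-1][0] keeps the array rectangular, which recent NumPy requires.
    e[-1] = np.array([0, l[-1][0] - y])
    # Propagate the error term back through each layer's weights.
    e[-2] = np.array(np.dot(w[-1].T, e[-1])) * dg(l[-2])
    e[-3] = np.array(np.dot(w[-2].T, e[-2])) * dg(l[-3])

    # Each np.dot below is between two 1-D arrays, so it yields a scalar
    # that is added to both weights of the layer.
    w[0] = w[0] + a*np.dot(np.array([b[0], x]).T, e[0])
    w[1] = w[1] + a*np.dot(l[0].T, e[1])
    w[2] = w[2] + a*np.dot(l[1].T, e[2])

    # Predict again with the updated weights.
    l = feed_forward(x)
    return l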

In [0]:
y_hat4 = feed_forward_and_back_propagate(x1,y1)
y_hat5 = feed_forward_and_back_propagate(x2,y2)
y_hat6 = feed_forward_and_back_propagate(x3,y3)

print(y_hat4[2][0])
print(y_hat5[2][0])
print(y_hat6[2][0])

2.2792284476084936
1.7317315192019487
1.8548838972966304
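In the recorded run above, the prediction for `x1` moves from about $1.365$ to about $2.279$ even though `y1` is $0$. This is what happens when the update adds a term proportional to $\hat{y} - y$ rather than subtracting it. For comparison, here is a minimal sketch of a single gradient descent step on the same chain of three units. It assumes the softplus activation, bias units fixed at $0.5$, and one gradient per individual weight; `forward`, `descent_step` and `sigmoid` are illustrative names, not part of the original notebook.

```python
import numpy as np

def softplus(z):
    return np.log(1 + np.exp(z))

def sigmoid(z):
    # Derivative of softplus.
    return 1 / (1 + np.exp(-z))

def forward(x, w, bias_unit=0.5):
    """Return each layer's input vector, each pre-activation, and y_hat."""
    inputs, zs = [], []
    value = x
    for layer_w in w:
        layer_in = np.array([bias_unit, value])
        z = float(np.dot(layer_in, layer_w))
        inputs.append(layer_in)
        zs.append(z)
        value = softplus(z)
    return inputs, zs, value

def descent_step(x, y, w, alpha=0.1):
    """One gradient descent step on eps = (y - y_hat)^2."""
    inputs, zs, y_hat = forward(x, w)
    grads = np.zeros_like(w)
    upstream = -2.0 * (y - y_hat)            # d(eps)/d(y_hat)
    for k in reversed(range(len(w))):
        delta = upstream * sigmoid(zs[k])    # d(eps)/d(z_k)
        grads[k] = delta * inputs[k]         # d(eps)/d(w_k)
        upstream = delta * w[k][1]           # d(eps)/d(previous activation)
    return w - alpha * grads                 # step against the gradient

w0 = np.full((3, 2), 0.5)
before = forward(4, w0)[2]
w1 = descent_step(4, 0, w0)
after = forward(4, w1)[2]
print(before, after)
```

With the target $y = 0$, `after` comes out smaller than `before`, so the prediction moves toward the target instead of away from it.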


### 4. Use MLPClassifier to train on a dataset

`X` is now a small dataset. Create an MLPClassifier from sklearn and train it on the `X` dataset, with `y` as the targets.

In [0]:
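import numpy as np
from sklearn.neural_network import MLPClassifier

# Stack the three samples and their targets into column vectors.
# np.vstack replaces np.row_stack, which is deprecated in NumPy 2.0.
X = np.vstack([x1, x2, x3])
y = np.vstack([y1, y2, y3])

# alpha is the L2 regularization strength; (1, 1) means two hidden layers of one unit each.
clf = MLPClassifier(solver='lbfgs', alpha=0.1, hidden_layer_sizes=(1, 1), random_state=8)
# fit expects a 1-D target array, hence ravel().
clf.fit(X, y.ravel());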
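The objectives also mention testing the classifier, which the cell above does not do. A minimal sketch of that step follows, assuming the same three samples and hyperparameters and using the standard scikit-learn `predict` and `score` methods. Since there is no held-out data here, the score is training accuracy only; `predictions` and `accuracy` are illustrative names.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# The same three samples and targets as above, written out explicitly.
X = np.array([[4.0], [0.5], [0.125]])
y = np.array([0, 1, 1])

clf = MLPClassifier(solver='lbfgs', alpha=0.1, hidden_layer_sizes=(1, 1), random_state=8)
clf.fit(X, y)

# Predicted class label for each sample, and the fraction predicted correctly.
predictions = clf.predict(X)
accuracy = clf.score(X, y)
print(predictions, accuracy)
```

With only three samples and a single unit per hidden layer, the result depends heavily on `random_state`, so it is best read as a smoke test rather than a measure of model quality.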