# Neural Networks Coding Challenge

Objectives:
  * Write a simple three-layer network
  * Compute forward propagation for a new sample in the three-layer network
  * Compute backward propagation in the same network
  * Use MLPClassifier to train and test a dataset

### Background

Apart from the MLPClassifier objective, you will work with the following neural network throughout this coding challenge:

![Simple Neural Net](https://www.lucidchart.com/publicSegments/view/a5b0773e-7165-450d-99fc-7089891e099a/image.png)

### 1. Write a simple three-layer network

Create variables to store weights and biases for the above network. Initialize each with $0.5$.

In [0]:
import numpy as np
syn = [0.5,0.5,0.5]
b = [0.5,0.5,0.5]

### 2. Compute forward propagation for a new sample in the three-layer network

Write a function `feed_forward` that takes a new sample $x$ and calculates $\hat{y}$ via forward propagation.

In [0]:
x1 = 4
x2 = 0.5
x3 = 0.125
y1 = 0
y2 = 1
y3 = 1

# Sigmoid activation. With deriv=True, x must already be a sigmoid output,
# because sigma'(z) = sigma(z) * (1 - sigma(z)).
def nonlin(x, deriv=False):
  if deriv:
    return x*(1-x)
  return 1/(1+np.exp(-x))
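
One detail worth making explicit: with `deriv=True`, `nonlin` expects the sigmoid *output* rather than the raw input, since $\sigma'(z) = \sigma(z)(1-\sigma(z))$. The following is a minimal check of that convention against a central finite difference. The helper name `numeric_derivative` and the test point `z = 0.7` are illustrative choices, not part of the challenge.

```python
import numpy as np

def nonlin(x, deriv=False):
    # Same convention as the challenge code: with deriv=True,
    # x must already be sigmoid(z), not z itself.
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

def numeric_derivative(f, z, eps=1e-6):
    # Central finite-difference approximation of f'(z) (hypothetical helper).
    return (f(z + eps) - f(z - eps)) / (2 * eps)

z = 0.7                                   # arbitrary test point
analytic = nonlin(nonlin(z), deriv=True)  # pass the activation, not z
numeric = numeric_derivative(nonlin, z)
print(abs(analytic - numeric) < 1e-8)
```

Passing the raw input instead, as in `nonlin(z, deriv=True)`, would compute `z * (1 - z)`, which is not the sigmoid slope.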

In [104]:
# CODE


def feed_forward(X,y,syn,b):
  # y is unused here; it is kept so the signature matches the training function below
  # forward propagation: each layer applies the sigmoid to (input * weight + bias)
  h1 = nonlin(np.dot(X,syn[0])+b[0])
  h2 = nonlin(np.dot(h1,syn[1])+b[1])
  y_hat = nonlin(np.dot(h2,syn[2])+b[2])
  print(y_hat)
  return y_hat

# TEST
y_hat1 = feed_forward(x1,y1,syn,b)
y_hat2 = feed_forward(x2,y2,syn,b)
y_hat3 = feed_forward(x3,y3,syn,b)

0.7030299333006731
0.7003970647199883
0.6999291610175293
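
The first printed value can be reproduced step by step, which is a useful sanity check on `feed_forward`. The sketch below assumes the scalar weights and biases of 0.5 from step 1 and the input $x_1 = 4$; the names `z1`, `z2`, and `z3` are introduced here only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w = b = 0.5          # every weight and bias is initialized to 0.5 in step 1
x = 4.0              # the first sample, x1

z1 = x * w + b       # pre-activation of the first layer: 2.5
h1 = sigmoid(z1)
z2 = h1 * w + b
h2 = sigmoid(z2)
z3 = h2 * w + b
y_hat = sigmoid(z3)

print(round(y_hat, 6))  # 0.70303, matching the first output above
```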


### 3. Compute backward propagation for the same network

The backprop algorithm is derived from the goal of minimizing the error (or loss) function $\epsilon = (y - \hat{y})^2$.

$\epsilon = (y - \sigma(h_2+b_2))^2$

Via the chain rule, with $\hat{y} = \sigma(h_2+b_2)$, the derivative of the above with respect to $h_2$ is

$\frac{\partial \epsilon}{\partial h_2} = -2(y-\hat{y})\sigma'(h_2+b_2)$

Let $\alpha = 0.1$. Update the weights for $h_2$ and $h_1$ via back propagation by stepping against the gradient, so that $h_2 = h_2 - \alpha \frac{\partial \epsilon}{\partial h_2}$ and $h_1 = h_1 - \alpha \frac{\partial \epsilon}{\partial h_1}$.

Also, let $\sigma(x) = ReLU(x)$. As such, $\sigma'(x) = 0$ when $x \le 0$ and $\sigma'(x) = 1$ when $x \gt 0$.
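
The piecewise definition above translates directly into NumPy. This is a minimal sketch; the function names `relu` and `relu_deriv` are assumptions for illustration, and the code cells in this challenge use a sigmoid (`nonlin`) rather than ReLU.

```python
import numpy as np

def relu(x):
    # max(0, x), applied elementwise
    return np.maximum(0, x)

def relu_deriv(x):
    # 0 where x <= 0, 1 where x > 0, as defined above
    return (np.asarray(x) > 0).astype(float)

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))        # [0. 0. 3.]
print(relu_deriv(z))  # [0. 0. 1.]
```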

See Case 1 of the post by [Brian Dolhansky](http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4) for a more detailed explanation of the values used in back propagation.
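
A chain-rule gradient is easy to get wrong by a sign or a missing derivative, so it helps to compare it against a finite-difference estimate. The sketch below does this for the output-layer weight only, assuming the scalar network from step 1 with sigmoid activations (as in `nonlin`) and the squared error $\epsilon = (y - \hat{y})^2$. The names `forward`, `loss`, `w3`, and `h2_act` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2, w3, b=0.5):
    # Scalar three-layer forward pass; returns the last hidden activation and y_hat.
    h1 = sigmoid(x * w1 + b)
    h2 = sigmoid(h1 * w2 + b)
    return h2, sigmoid(h2 * w3 + b)

def loss(x, y, w1, w2, w3):
    _, y_hat = forward(x, w1, w2, w3)
    return (y - y_hat) ** 2

x, y = 4.0, 0.0
w1 = w2 = w3 = 0.5

# Analytic gradient: d(eps)/d(w3) = -2 (y - y_hat) * sigma'(z3) * h2,
# where sigma'(z3) = y_hat * (1 - y_hat) for the sigmoid.
h2_act, y_hat = forward(x, w1, w2, w3)
analytic = -2 * (y - y_hat) * y_hat * (1 - y_hat) * h2_act

# Central finite difference on w3
eps = 1e-6
numeric = (loss(x, y, w1, w2, w3 + eps) - loss(x, y, w1, w2, w3 - eps)) / (2 * eps)

print(abs(analytic - numeric) < 1e-7)
```

For this sample the gradient is positive, so gradient descent lowers `w3`, pushing the prediction toward the target of 0.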


In [145]:
def feed_forward_and_back_propagate(X,y,syn,b):
  # Trains on a single sample (X, y). syn is updated in place, so later calls
  # start from the weights left by earlier ones. The biases b are not updated.
  alpha = 0.1
  for _ in range(10000):
      # forward propagation
      l0 = nonlin(np.dot(X,syn[0])+b[0])
      l1 = nonlin(np.dot(l0,syn[1])+b[1])
      l2 = nonlin(np.dot(l1,syn[2])+b[2])

      # output error and delta (error scaled by the sigmoid slope at l2)
      l2_error = y-l2
      l2_delta = l2_error * nonlin(l2, deriv=True)

      # propagate the error back through each weight
      l1_error = np.dot(l2_delta,syn[2])
      l1_delta = l1_error * nonlin(l1, deriv=True)
      l0_error = np.dot(l1_delta,syn[1])
      l0_delta = l0_error * nonlin(l0, deriv=True)

      # weight updates: learning rate * layer input * delta
      syn[2] += alpha*np.dot(l1,l2_delta)
      syn[1] += alpha*np.dot(l0,l1_delta)
      syn[0] += alpha*np.dot(X,l0_delta)
  print("Output After Training:")
  print(l2)
  return l2

y_hat4 = feed_forward_and_back_propagate(x1,y1,syn,b)
y_hat5 = feed_forward_and_back_propagate(x2,y2,syn,b)
y_hat6 = feed_forward_and_back_propagate(x3,y3,syn,b)

Output After Training:
0.02447878663904724
Output After Training:
0.9756804877685157
Output After Training:
0.9831760324205627


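Back propagation has improved our results significantly: before training, every sample produced an output of about 0.70, whereas after training the outputs are about 0.02, 0.98, and 0.98 for the targets 0, 1, and 1.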
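
Each call above trains on a single sample for 10,000 iterations, reuses the weights left by the previous call, and never updates the biases. As a hedged alternative, the sketch below trains on all three samples in every iteration and also updates the biases. It assumes the same scalar sigmoid network and 0.5 initialization; the names `predict` and `train_batch` and the iteration count are illustrative, and no particular final accuracy is claimed.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(X, w, b):
    # Forward pass for a vector of scalar samples; returns all three activations.
    h1 = sigmoid(X * w[0] + b[0])
    h2 = sigmoid(h1 * w[1] + b[1])
    return h1, h2, sigmoid(h2 * w[2] + b[2])

def train_batch(X, y, alpha=0.1, n_iter=5000):
    w = np.full(3, 0.5)   # fresh weights, so repeated calls do not interfere
    b = np.full(3, 0.5)
    for _ in range(n_iter):
        h1, h2, y_hat = predict(X, w, b)
        # Deltas follow the challenge code: error * sigmoid'(activation)
        d3 = (y - y_hat) * y_hat * (1 - y_hat)
        d2 = d3 * w[2] * h2 * (1 - h2)
        d1 = d2 * w[1] * h1 * (1 - h1)
        # Sum the per-sample contributions, then update weights and biases
        w += alpha * np.array([np.sum(X * d1), np.sum(h1 * d2), np.sum(h2 * d3)])
        b += alpha * np.array([np.sum(d1), np.sum(d2), np.sum(d3)])
    return w, b

X = np.array([4.0, 0.5, 0.125])
y = np.array([0.0, 1.0, 1.0])

init = np.full(3, 0.5)
loss_before = np.sum((y - predict(X, init, init)[2]) ** 2)
w, b = train_batch(X, y)
loss_after = np.sum((y - predict(X, w, b)[2]) ** 2)
print(loss_before, loss_after)
```

Comparing the summed squared error before and after training gives a single number for the whole dataset, rather than one output per freshly trained sample.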

### 4. Use MLPClassifier to train on a dataset

`X` is now a small dataset built from the three samples above. Create an `MLPClassifier` from sklearn and train it on `X`, with `Y` as the targets.

In [40]:
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

# One feature per sample: X has shape (3, 1); targets are a 1-D array
X = np.vstack([x1,x2,x3])
Y = np.array([y1,y2,y3])

# MLP Classifier
# Note: batch_size, learning_rate, learning_rate_init and shuffle only apply to
# the stochastic solvers ('sgd'/'adam'); they have no effect with solver='lbfgs'.
model = MLPClassifier(
                    hidden_layer_sizes=(15, 2),
                    activation='tanh',
                    solver='lbfgs',
                    alpha=1e-5,
                    batch_size=1, 
                    learning_rate='adaptive',
                    learning_rate_init=1,
                    max_iter=200,
                    shuffle=True,
                    random_state=42,
                    verbose=10,
                    tol=1e-4 )
model.fit(X,Y)
predicted_y = model.predict(X)
print(predicted_y)
# AUC is computed on the training samples, so it measures fit, not generalization
try:
    print("Mean AUC Score - MLPClassifier: ",roc_auc_score(Y, predicted_y))
except ValueError:
    # raised when Y contains only one class
    pass

    


[0 1 1]
Mean AUC Score - MLPClassifier:  1.0
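
`roc_auc_score` is normally computed from scores or probabilities rather than hard class labels, and here it is evaluated on the same three samples used for training, so the 1.0 above says little about generalization. A minimal sketch using `predict_proba`, assuming the same toy data and hidden-layer sizes, with the solver-irrelevant options left out:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

X = np.array([[4.0], [0.5], [0.125]])  # shape (n_samples, n_features)
y = np.array([0, 1, 1])                # 1-D targets

model = MLPClassifier(hidden_layer_sizes=(15, 2), activation='tanh',
                      solver='lbfgs', alpha=1e-5, max_iter=200, random_state=42)
model.fit(X, y)

# Column 1 holds the estimated probability of the positive class (label 1)
proba = model.predict_proba(X)
auc = roc_auc_score(y, proba[:, 1])
print(proba.shape, auc)
```

With only three samples this remains a smoke test of the API; a held-out split would be needed for a meaningful score.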


