# Neural Network

### Forward propagation
$x_i$ = inputs (input layer)<br>
$w_i$ = weights<br>
$b$ = bias<br>
$z$ = weighted sum of the inputs plus the bias (pre-activation of a hidden unit)<br>
$f(z)$ = activation<br>

$z = \sum\limits_{i=1}^{n}x_iw_i + b$<br>
Given sigmoid as the activation function: $f(z) = \frac{1}{1+e^{-z}}$
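As a minimal sketch of the two formulas above, the following computes $z$ and $f(z)$ for a single neuron. The input, weight, and bias values are hypothetical and chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical example values, not taken from any dataset
x = np.array([0.5, -1.0, 2.0])   # inputs x_i
w = np.array([0.1, 0.4, -0.2])   # weights w_i
b = 0.3                          # bias

z = np.dot(x, w) + b             # z = sum_i x_i * w_i + b
a = sigmoid(z)                   # f(z), always strictly between 0 and 1
print(z, a)
```

Because $z$ is negative here (about -0.45), the sigmoid output falls below 0.5.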

**Remark**<br>
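The bias $b$ can be treated as one more weight whose input is fixed at 1.<br>
Which activation function to choose for the output layer depends on the kind of prediction: a number or a category.<br>
For a numerical output (regression), the MSE loss is typically used, together with a linear (identity) or ReLU output activation.<br>
For a categorical output (classification), the cross-entropy loss is typically used, together with a non-linear output activation such as sigmoid (binary) or softmax (multi-class); tanh is another non-linear activation, more common in hidden layers.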



### Back propagation
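Backpropagation computes the gradient of the cost with respect to every weight and bias. Gradient descent then uses these gradients to incrementally tweak the network’s parameters until the lowest possible cost value is obtained.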

### Partial derivative for $w_i$: $\frac{\partial C}{\partial w_i} = \frac{\partial C}{\partial \hat{y}} * \frac{\partial \hat{y}}{\partial z} * \frac{\partial z}{\partial w_i}$

Given MSE as the cost and $\sigma$ = sigmoid function (a different activation function has a different derivative in step 2):

1. $\frac{\partial C}{\partial \hat{y}} = \frac{\partial}{\partial \hat{y}}\frac{1}{n}\sum\limits_{i=1}^{n}(y_i-\hat{y_i})^2 = -\frac{2}{n}\sum\limits_{i=1}^{n}(y_i-\hat{y_i})$

2. $\frac{\partial \hat{y}}{\partial z} = \frac{\partial}{\partial z}\sigma(z) = \sigma(z) * (1-\sigma(z))$

3. $\frac{\partial z}{\partial w_i} = \frac{\partial}{\partial w_i}\left(\sum\limits_{j=1}^{n}x_jw_j+b\right) = x_i$

### $\frac{\partial C}{\partial w_i} = -\frac{2}{n} * \sum\limits_{i=1}^{n}(y_i - \hat{y_i}) * \sigma(z) * (1-\sigma(z)) * x_i$

### Partial derivative for $b$
Since $\frac{\partial z}{\partial b} = 1$, the factor $x_i$ drops out:
### $\frac{\partial C}{\partial b} = -\frac{2}{n} * \sum\limits_{i=1}^{n}(y_i-\hat{y_i}) * \sigma(z) * (1-\sigma(z))$
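The chain rule above can be checked numerically. The sketch below is an illustration under stated assumptions: a single sigmoid neuron, MSE cost, and small random data. The helper names (`cost`, `gradients`) are hypothetical. It builds the gradient from the three chain-rule factors, using the convention that differentiating $(y_i-\hat{y_i})^2$ with respect to $\hat{y_i}$ gives a leading minus sign, and compares the result with central finite differences.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost(w, b, X, y):
    # MSE between labels and sigmoid outputs
    y_hat = sigmoid(X @ w + b)
    return np.mean((y - y_hat) ** 2)

def gradients(w, b, X, y):
    m = X.shape[0]
    y_hat = sigmoid(X @ w + b)
    dC_dyhat = -2 / m * (y - y_hat)   # derivative of MSE w.r.t. each prediction
    dyhat_dz = y_hat * (1 - y_hat)    # sigmoid derivative
    delta = dC_dyhat * dyhat_dz       # one value per example, shape (m,)
    grad_w = X.T @ delta              # dz/dw_i = x_i, summed over examples
    grad_b = np.sum(delta)            # dz/db = 1
    return grad_w, grad_b

# Small random problem (hypothetical data)
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = rng.integers(0, 2, size=5).astype(float)
w = rng.normal(size=3)
b = 0.1

grad_w, grad_b = gradients(w, b, X, y)

# Central finite differences as an independent reference
eps = 1e-6
num_grad_w = np.zeros_like(w)
for i in range(len(w)):
    step = np.zeros_like(w)
    step[i] = eps
    num_grad_w[i] = (cost(w + step, b, X, y) - cost(w - step, b, X, y)) / (2 * eps)
num_grad_b = (cost(w, b + eps, X, y) - cost(w, b - eps, X, y)) / (2 * eps)

print(np.allclose(grad_w, num_grad_w, atol=1e-7))
print(np.isclose(grad_b, num_grad_b, atol=1e-7))
```

If the analytic and numerical gradients disagree, the usual culprits are a dropped sign or a missing activation derivative.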

# Cost Function
##### MSE = $\frac{1}{n}\sum\limits_{i=1}^{n}(y_i - \hat{y_i})^2$

##### Cross entropy = $-\frac{1}{n}\sum\limits_{i=1}^{n} [y_i * \log(\hat{y_i}) + (1-y_i) * \log(1-\hat{y_i})]$
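A minimal NumPy sketch of both cost functions follows. The cross entropy is written with the conventional leading minus sign so that the cost is non-negative. The function names, the example values, and the clipping constant `eps` are assumptions added for illustration.

```python
import numpy as np

def mse(y, y_hat):
    # mean of squared differences
    return np.mean((y - y_hat) ** 2)

def cross_entropy(y, y_hat, eps=1e-12):
    # clipping avoids log(0); it is an added safeguard, not part of the formula
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Hypothetical labels and predicted probabilities
y = np.array([1.0, 0.0, 1.0, 0.0])
y_hat = np.array([0.9, 0.2, 0.6, 0.4])

print(mse(y, y_hat))            # approximately 0.0925
print(cross_entropy(y, y_hat))  # approximately 0.3375
```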

### Learning algorithm
1. Start with values (often random) for the network parameters ($w_{ij}$ weights and $b_j$ biases).
2. Take a set of input examples and pass them through the network to obtain predictions.
3. Compare the predictions with the expected labels and calculate the loss.
4. Perform backpropagation to propagate this loss to every parameter of the network.
5. Use the propagated information to update the parameters with gradient descent, so that the total loss is reduced and a better model is obtained.
6. Repeat the previous steps until the model is considered good enough.
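The six steps can be sketched on the smallest possible network: one sigmoid neuron trained with cross-entropy loss. This is an illustration only. The toy data, the learning rate, and the fixed number of epochs are assumptions, not values from this notebook.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy data (hypothetical): the label is 1 when the two features sum to a positive number
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# 1. start with random parameters
w = rng.normal(scale=0.1, size=2)
b = 0.0
learning_rate = 0.5   # assumed value
losses = []

for epoch in range(200):
    # 2. forward pass: predictions for all examples
    y_hat = sigmoid(X @ w + b)
    # 3. loss: binary cross-entropy (clipping avoids log(0))
    p = np.clip(y_hat, 1e-12, 1 - 1e-12)
    losses.append(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))
    # 4. backpropagation: gradients of the loss w.r.t. w and b
    grad_w = X.T @ (y_hat - y) / len(y)
    grad_b = np.mean(y_hat - y)
    # 5. gradient descent update
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    # 6. the loop repeats until the epoch budget is used up

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
print(losses[0], losses[-1], accuracy)
```

In practice step 6 is often replaced by a stopping rule, such as halting when the loss on held-out data stops improving.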

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import make_moons

np.set_printoptions(suppress=True)  # print floats without scientific notation

### Activation function

In [4]:
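def linear(z):
    # identity: returns the input unchanged
    return z

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def softmax(z):
    # subtract the max for numerical stability (the result is unchanged);
    # normalize along the last axis so each row of a batch sums to 1
    ez = np.exp(z - np.max(z, axis=-1, keepdims=True))
    sm = ez / np.sum(ez, axis=-1, keepdims=True)
    return sm

def relu(z):
    # element-wise maximum, works for scalars and arrays
    return np.maximum(0, z)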

### Cost Function: cross entropy

In [None]:
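def Cross_Entropy(theta, X, y):
    m = y.shape[0]
    theta = theta[:, np.newaxis]  # column vector, so a flat theta passed by a minimizer works
    y = y.reshape(-1, 1)          # column vector, avoids (m, m) broadcasting in h - y
    h = sigmoid(X.dot(theta))
    # binary cross-entropy averaged over the m examples
    J = (1/m) * (-y.T.dot(np.log(h)) - (1-y).T.dot(np.log(1-h)))

    # gradient of J with respect to theta, shape (1, n_features)
    grad = (1/m) * (h - y).T.dot(X)

    return J, grad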

### Dense layer and forward pass

In [5]:
def Dense(a_in, W, b):
    # affine transform of the incoming activations, followed by a sigmoid
    z = a_in @ W + b
    a_out = sigmoid(z)
    return a_out

In [6]:
def Sequential(X, W1, b1, W2, b2, W3, b3):
    a1 = Dense(X, W1, b1)
    a2 = Dense(a1, W2, b2)
    a3 = Dense(a2, W3, b3)
    return a3
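The three chained layers only work if the weight shapes line up: each `W` needs as many rows as the previous layer has outputs. The sketch below illustrates this with a loop over `(W, b)` pairs instead of fixed arguments. The names `dense` and `forward`, the layer sizes, and the random inputs are hypothetical choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def dense(a_in, W, b):
    # one fully connected layer with sigmoid activation
    return sigmoid(a_in @ W + b)

def forward(X, layers):
    # layers is a list of (W, b) pairs applied in order
    a = X
    for W, b in layers:
        a = dense(a, W, b)
    return a

rng = np.random.default_rng(0)
sizes = [4, 5, 3, 1]   # assumed: 4 inputs, hidden layers of 5 and 3 units, 1 output
layers = [(rng.normal(size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

X = rng.normal(size=(6, 4))   # 6 examples with 4 features each
out = forward(X, layers)
print(out.shape)              # (6, 1): one sigmoid output per example
```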

### Prediction

In [7]:
def Predict(X, W1, b1, W2, b2, W3, b3):
    output = Sequential(X, W1, b1, W2, b2, W3, b3)
    # threshold the sigmoid output at 0.5 to obtain class labels 0 or 1
    y_hat = (output > 0.5).astype(output.dtype)
    return y_hat
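Thresholded predictions are usually compared with the true labels to obtain an accuracy. One pitfall is that the network output has shape `(m, 1)` while labels are often stored with shape `(m,)`, and comparing the two directly broadcasts to an `(m, m)` array. The sketch below flattens both first. The helper name `accuracy` and the example values are hypothetical.

```python
import numpy as np

def accuracy(y_true, y_pred):
    # flatten so (m, 1) predictions compare element-wise with (m,) labels
    return np.mean(np.ravel(y_true) == np.ravel(y_pred))

probs = np.array([[0.9], [0.3], [0.6], [0.1]])   # hypothetical network outputs
y_hat = (probs > 0.5).astype(float)               # thresholded predictions, shape (4, 1)
y_true = np.array([1, 0, 0, 0])                   # labels, shape (4,)

print(accuracy(y_true, y_hat))   # 0.75: three of four predictions match
```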

### Parameter

In [18]:
df = pd.read_csv('housepricedata.csv')
dataset = df.values
X = dataset[:, 1:-1]  # features; column 0 ('Unnamed: 0') is only a row index
y = dataset[:, -1]    # label: AboveMedianPrice
df.head()

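| Unnamed: 0 | LotArea | OverallQual | OverallCond | TotalBsmtSF | FullBath | HalfBath | BedroomAbvGr | TotRmsAbvGrd | Fireplaces | GarageArea | AboveMedianPrice |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8450 | 7 | 5 | 856 | 2 | 1 | 3 | 8 | 0 | 548 | 1 |
| 1 | 9600 | 6 | 8 | 1262 | 2 | 0 | 3 | 6 | 1 | 460 | 1 |
| 2 | 11250 | 7 | 5 | 920 | 2 | 1 | 3 | 6 | 1 | 608 | 1 |
| 3 | 9550 | 7 | 5 | 756 | 1 | 0 | 3 | 7 | 1 | 642 | 0 |
| 4 | 14260 | 8 | 5 | 1145 | 2 | 1 | 4 | 9 | 1 | 836 | 1 |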


In [30]:
y.shape[0]

1460
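Before the three-layer network can be applied to this data, it needs parameters whose shapes match the number of feature columns. The sketch below is one possible initialization, not the notebook's own. The hidden-layer sizes, the scale of the random weights, the seed, and the stand-in batch are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

n_features = 10     # assumed: the ten feature columns from LotArea to GarageArea
n_hidden1 = 32      # assumed hidden-layer size
n_hidden2 = 32      # assumed hidden-layer size

# small random weights, zero biases
W1 = rng.normal(scale=0.1, size=(n_features, n_hidden1))
b1 = np.zeros(n_hidden1)
W2 = rng.normal(scale=0.1, size=(n_hidden1, n_hidden2))
b2 = np.zeros(n_hidden2)
W3 = rng.normal(scale=0.1, size=(n_hidden2, 1))
b3 = np.zeros(1)

# shape check with a stand-in batch of 5 examples (random values, not the real data)
X_demo = rng.normal(size=(5, n_features))
a1 = 1 / (1 + np.exp(-(X_demo @ W1 + b1)))
a2 = 1 / (1 + np.exp(-(a1 @ W2 + b2)))
a3 = 1 / (1 + np.exp(-(a2 @ W3 + b3)))
print(a3.shape)     # (5, 1)
```

Features on very different scales (for example LotArea in the thousands next to Fireplaces at 0 or 1) tend to saturate the sigmoid, so scaling the inputs before training is a common precaution.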