<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Perceptron on XOR Gates](#Q2)
3. [Multilayer Perceptron](#Q3)
4. [Keras MMP](#Q4)

<a id="Q1"></a>
## 1. Define the following terms:

- **Neuron:** Computes the weighted sum of it's inputs, also contains an activation function.
- **Input Layer:** The first layer. Must resemble the shape of your features
- **Hidden Layer:** Layers between input and output
- **Output Layer:** Layer that outputs the networks prediction
- **Activation:** A function that you use on the weighted sum of the neruon. Determines whether or not that neuron "fires"
- **Backpropagation:** The process of updating weights based on the error of your predictions. 


## 2. Perceptron on XOR Gates <a id="Q2"></a>

The XOr, or “exclusive or”, problem is a classic problem in ANN research. It is the problem of using a neural network to predict the outputs of XOr logic gates given two binary inputs. An XOr function should return a true value if the two inputs are not equal and a false value if they are equal. Create a perceptron class that can model the behavior of an AND gate. You can use the following table as your training data:

|x1	|x2 | y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 1 | 0 |
| 1 | 0 | 1 |


In [129]:
import numpy as np
np.random.seed(10)

In [216]:
class Perceptron:
    def __init__(self, x, y):
        self.input = x
        self.output = y
        self.weights = 2 * np.random.random((2,1)) -1
    def sigmoid(self, x):
        return 1/(1+np.exp(-x))
    
    def sigmoid_derivative(self, x):
        sx = self.sigmoid(x)
        return sx * (1-sx)
    
    def train(self, epochs):
        print(f'Weights before Training: {self.weights}')
        for num in range(epochs):
            weighted_sum = np.dot(self.input, self.weights)
            activated_output = sigmoid(weighted_sum) 
            error = activated_output - self.output
            adjustments = error * sigmoid_derivative(activated_output)
            self.weights += np.dot(self.input.T, adjustments)
        print(f'Weights after Training: {self.weights}')
        print(f"Output after Training: {activated_output}")
        return self.weights
    
    def predict(self, X):
        return np.where(self.sigmoid(np.dot(X, self.weights)) > 0.5, 1, 0)  

In [217]:
x_data= np.array([
    [0, 0],
    [0, 1],
    [1, 1],
    [1, 0]])

y_data= [[0], [1], [0], [1]]

In [218]:
NN = Perceptron(x_data, y_data)

In [219]:
training = NN.train(100000)

Weights before Training: [[-0.41440405]
 [-0.87149788]]
Weights after Training: [[-24999.27984675]
 [-25000.07498582]]
Output after Training: [[0.5]
 [0. ]
 [0. ]
 [0. ]]


In [182]:
NN.predict([0,1])

array([1])

## Lecture method 

In [183]:
def sigmoid(x):
    return 1/(1+np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1-sx)

In [184]:
inputs= np.array([
    [0, 0],
    [0, 1],
    [1, 1],
    [1, 0]])

correct_outputs= [[0], [1], [0], [1]]

weights = 2 * np.random.random((2,1)) - 1
print(weights)
for i in range(10000):
    
    weighted_sum = np.dot(inputs, weights)
    
    activated_output = sigmoid(weighted_sum)
    
    error = activated_output - correct_outputs
    
    adjustments = error * sigmoid_derivative(activated_output)
    
    weights += np.dot(inputs.T, adjustments)

print(f"Weights: {weights}")
print(f"Output after training: {activated_output}")

[[0.30079436]
 [0.20207791]]
Weights: [[1964.76854983]
 [1964.50940705]]
Output after training: [[0.5]
 [1. ]
 [1. ]
 [1. ]]


## 3. Multilayer Perceptron <a id="Q3"></a>

Implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights.
Your network must have one hidden layer.
You do not have to update weights via gradient descent. You can use something like the derivative of the sigmoid function to update weights.
Train your model on the Heart Disease dataset from UCI:



In [185]:
import pandas as pd

In [186]:
heart = pd.read_csv("heart.csv")

In [187]:
heart.head()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1


In [267]:
target = "target"
features = [column for column in heart.columns if column != "target"]

In [268]:
y_data = heart[target].values
x_data = heart[features].values

In [306]:
class Neural_Network(object):
    def __init__(self):
        self.inputSize = 13
        self.outputSize = 1
        self.hiddenSize = 6
        self.weights1 = np.random.rand(self.inputSize, self.hiddenSize) 
        self.weights2 = np.random.rand(self.hiddenSize, self.outputSize)
    
    def sigmoid(self, x):
        return 1/(1+np.exp(-x))
    
    def sigmoid_derivative(self, x):
        sx = self.sigmoid(x)
        return sx * (1-sx)
    
    def forward(self, x):
        self.weighted_sum1 = np.dot(x, self.weights1)
        self.activation_output1 = self.sigmoid(self.weighted_sum1) 
        self.weighted_sum2 = np.dot(self.activation_output1, self.weights2) 
        self.activation_output2 = self.sigmoid(self.weighted_sum2) 
        return self.activation_output2
    
    def backward(self, data, target):
        self.error1 = target - self.activation_output2
        self.adjustments1 = self.error1 *self.sigmoid_derivative(self.activation_output2)
        self.error2 = self.adjustments1.dot(self.weights2.T)
        self.adjustments2 = self.error2 *self.sigmoid_derivative(self.activation_output1)
        self.weights1 += data.T.dot(self.adjustments2) 
        self.weights2 += self.z2.T.dot(self.adjustments1) 
    
    def train(self, data, target, epochs):
        for epoch in range(epochs):
            output = self.forward(data)
            self.backward(data, target)
    
    def predict(self, data):
        print(str(self.forward(data)))


In [307]:
NN = Neural_Network()

In [308]:
hist = NN.train(x_data, y_data, 10)

ValueError: shapes (303,303) and (1,6) not aligned: 303 (dim 1) != 1 (dim 0)

## 4. Keras MMP <a id="Q4"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.
Use the Heart Disease Dataset (binary classification)
Use an appropriate loss function for a binary classification task
Use an appropriate activation function on the final layer of your network.
Train your model using verbose output for ease of grading.
Use GridSearchCV to hyperparameter tune your model. (for at least two hyperparameters)
When hyperparameter tuning, show you work by adding code cells for each new experiment.
Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
You must hyperparameter tune at least 5 parameters in order to get a 3 on this section.

In [333]:
import tensorflow as tf
from tensorflow import keras
from keras.utils import normalize
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import GridSearchCV, train_test_split
from keras.wrappers.scikit_learn import KerasClassifier

In [311]:
normal_x = normalize(x_data)

In [314]:
X_train, X_test, y_train, y_test = train_test_split(normal_x, y_data, train_size= 0.8, test_size= 0.2)

In [326]:
model = Sequential([
    Dense(64, activation= "sigmoid", input_shape= (13,)),
    Dense(32, activation= "relu"),
    Dense(16, activation= "sigmoid"),
    Dense(8, activation= "relu"),
    Dense(1, activation= "sigmoid")
])

In [331]:
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

In [332]:
hist = model.fit(X_train, y_train,
          batch_size=32, epochs=100,
          validation_data=(X_test, y_test))

Train on 242 samples, validate on 61 samples
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Ep

In [343]:
def create_model(optimizer):
    model = Sequential([
    Dense(64, activation= "sigmoid", input_shape= (13,)),
    Dense(32, activation= "relu"),
    Dense(16, activation= "sigmoid"),
    Dense(8, activation= "relu"),
    Dense(1, activation= "sigmoid")
    ])
    model.compile(loss='mean_squared_error', optimizer=optimizer, metrics=['accuracy'])
    return model

In [337]:
model = KerasClassifier(build_fn=create_model, verbose=1)

In [None]:
param_grid = {'batch_size': [10, 20, 40, 60, 80, 100], 'epochs': [100, 200, 300, 400, 500]}
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X_train, y_train)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

IOPub message rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_msg_rate_limit`.

Current values:
NotebookApp.iopub_msg_rate_limit=1000.0 (msgs/sec)
NotebookApp.rate_limit_window=3.0 (secs)



Epoch 337/500
Epoch 338/500
Epoch 339/500
Epoch 340/500
Epoch 341/500
Epoch 342/500
Epoch 343/500
Epoch 344/500
Epoch 345/500
Epoch 346/500
Epoch 347/500
Epoch 348/500
Epoch 349/500
Epoch 350/500
Epoch 351/500
Epoch 352/500
Epoch 353/500
Epoch 354/500
Epoch 355/500
Epoch 356/500
Epoch 357/500
Epoch 358/500
Epoch 359/500
Epoch 360/500
Epoch 361/500
Epoch 362/500
Epoch 363/500
Epoch 364/500
Epoch 365/500
Epoch 366/500
Epoch 367/500
Epoch 368/500
Epoch 369/500
Epoch 370/500
Epoch 371/500
Epoch 372/500
Epoch 373/500
Epoch 374/500
Epoch 375/500
Epoch 376/500
Epoch 377/500
Epoch 378/500
Epoch 379/500
Epoch 380/500
Epoch 381/500
Epoch 382/500
Epoch 383/500
Epoch 384/500
Epoch 385/500
Epoch 386/500
Epoch 387/500
Epoch 388/500
Epoch 389/500
Epoch 390/500
Epoch 391/500
Epoch 392/500
Epoch 393/500
Epoch 394/500
Epoch 395/500
Epoch 396/500
Epoch 397/500
Epoch 398/500
Epoch 399/500
Epoch 400/500
Epoch 401/500
Epoch 402/500
Epoch 403/500
Epoch 404/500
Epoch 405/500
Epoch 406/500
Epoch 407/500
Epoch 

IOPub message rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_msg_rate_limit`.

Current values:
NotebookApp.iopub_msg_rate_limit=1000.0 (msgs/sec)
NotebookApp.rate_limit_window=3.0 (secs)



Epoch 214/300
Epoch 215/300
Epoch 216/300
Epoch 217/300
Epoch 218/300
Epoch 219/300
Epoch 220/300
Epoch 221/300
Epoch 222/300
Epoch 223/300
Epoch 224/300
Epoch 225/300
Epoch 226/300
Epoch 227/300
Epoch 228/300
Epoch 229/300
Epoch 230/300
Epoch 231/300
Epoch 232/300
Epoch 233/300
Epoch 234/300
Epoch 235/300
Epoch 236/300
Epoch 237/300
Epoch 238/300
Epoch 239/300
Epoch 240/300
Epoch 241/300
Epoch 242/300
Epoch 243/300
Epoch 244/300
Epoch 245/300
Epoch 246/300
Epoch 247/300
Epoch 248/300
Epoch 249/300
Epoch 250/300
Epoch 251/300
Epoch 252/300
Epoch 253/300
Epoch 254/300
Epoch 255/300
Epoch 256/300
Epoch 257/300
Epoch 258/300
Epoch 259/300
Epoch 260/300
Epoch 261/300
Epoch 262/300
Epoch 263/300
Epoch 264/300
Epoch 265/300
Epoch 266/300
Epoch 267/300
Epoch 268/300
Epoch 269/300
Epoch 270/300
Epoch 271/300
Epoch 272/300
Epoch 273/300
Epoch 274/300
Epoch 275/300
Epoch 276/300
Epoch 277/300
Epoch 278/300
Epoch 279/300
Epoch 280/300
Epoch 281/300
Epoch 282/300
Epoch 283/300
Epoch 284/300
Epoch 

In [339]:
grid.best_params_

{'batch_size': 10, 'epochs': 500}

In [345]:
optimizers = ["Adagrad", "adam", "sgd"]

In [346]:
for optimizer in optimizers:
    print(f"Optimizer: {optimizer}")
    model = create_model(optimizer)
    model.fit(X_train, y_train, epochs= 500, batch_size= 10, validation_data= (X_test, y_test))

Optimizer: Adagrad
Train on 242 samples, validate on 61 samples
Epoch 1/500
Epoch 2/500
Epoch 3/500
Epoch 4/500
Epoch 5/500
Epoch 6/500
Epoch 7/500
Epoch 8/500
Epoch 9/500
Epoch 10/500
Epoch 11/500
Epoch 12/500
Epoch 13/500
Epoch 14/500
Epoch 15/500
Epoch 16/500
Epoch 17/500
Epoch 18/500
Epoch 19/500
Epoch 20/500
Epoch 21/500
Epoch 22/500
Epoch 23/500
Epoch 24/500
Epoch 25/500
Epoch 26/500
Epoch 27/500
Epoch 28/500
Epoch 29/500
Epoch 30/500
Epoch 31/500
Epoch 32/500
Epoch 33/500
Epoch 34/500
Epoch 35/500
Epoch 36/500
Epoch 37/500
Epoch 38/500
Epoch 39/500
Epoch 40/500
Epoch 41/500
Epoch 42/500
Epoch 43/500
Epoch 44/500
Epoch 45/500
Epoch 46/500
Epoch 47/500
Epoch 48/500
Epoch 49/500
Epoch 50/500
Epoch 51/500
Epoch 52/500
Epoch 53/500
Epoch 54/500
Epoch 55/500
Epoch 56/500
Epoch 57/500
Epoch 58/500
Epoch 59/500
Epoch 60/500
Epoch 61/500
Epoch 62/500
Epoch 63/500
Epoch 64/500
Epoch 65/500
Epoch 66/500
Epoch 67/500
Epoch 68/500
Epoch 69/500
Epoch 70/500
Epoch 71/500
Epoch 72/500
Epoch 73/

## For this specific combination of batch size, epochs, and topology, optimizer ranks are the following #1 Adam #2 Sgd #3 Adagrad