<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Perceptron on XOR Gates](#Q2)
3. [Multilayer Perceptron](#Q3)
4. [Keras MMP](#Q4)

<a id="Q1"></a>
## 1. Define the following terms:

- **Neuron:** Neurons are a cell that (simple def) take in an input signal and return an ouput when activated. In the context of ANNs, these are nodes within different layers that roughly do the same thing. They take in an input and return an output as a result. These activities vary depending on which layer the artifical neuron is in.
- **Input Layer:** The input layer is the layer of an ANN that present the network with a given pattern. The number of neurons in the input layer equals the number of input variables in the data being processed. 
- **Hidden Layer:** Hidden layer(s) are what sit between the input and output layers. These layers contain the neurons(s) that take in a set of weighted inputs and return an output through an activation function.
- **Output Layer:** The output layer is the layer that 'answer' output. The number of neurons in the output layer equals the number of outputs associated with each input.
- **Activation:** This is the part of an artifical neuron that defines the output depending on it's input. These allow ANNs to model linear and non-linear behavior. 
- **Backpropagation:** Backpropagation is an algorithm for how weights in a neural network are updated in reverse order at the end of each training epoch.


## 2. Perceptron on XOR Gates <a id="Q2"></a>

The XOr, or “exclusive or”, problem is a classic problem in ANN research. It is the problem of using a neural network to predict the outputs of XOr logic gates given two binary inputs. An XOr function should return a true value if the two inputs are not equal and a false value if they are equal. Create a perceptron class that can model the behavior of an AND gate. You can use the following table as your training data:

|x1	|x2 | y |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 1 | 0 |
| 1 | 0 | 1 |


In [1]:
import numpy as np

In [33]:
class Perceptron(object):
    # initialize with the input size, [default] learning rate, and [default] epochs
    def __init__(self, input_size, learning_rate=1, epochs=100):
        self.WEIGHT = np.zeros(input_size+1)
        self.epochs = epochs
        self.learning_rate = learning_rate
    
    #implement activation function that returns 1 if the input is greater than or equal to 0, otherwise return 0
    def activation_function(self, x):
        return 1 if x >= 0 else 0
    
    #implement predict function that runs an input through the perceptron and returns its output
    #add bias into the input vector
    def predict(self, x):
        z = self.WEIGHT.T.dot(x)
        a = self.activation_function(z)
        return a 
    
    #function for running the perceptron 'algo'
    #update the weights for the given epochs
    #iterate through the training set 
    #set bias into the input during the update to the weight
    #create prediction, compute error, and update 
    def fit(self, X, d):
        for _ in range(self.epochs):
            for i in range(d.shape[0]):
                x = np.insert(X[i], 0, 1)
                y = self.predict(x)
                e = d[i] - y
                self.WEIGHT = self.WEIGHT + self.learning_rate * e * x 

In [51]:
X = np.array([
    [0, 0],
    [0, 1],
    [1, 1],
    [1, 0]
])
y = np.array([0, 1, 0, 1])
# y = np.array([0, 0, 0, 1])

model = Perceptron(input_size=2)
model.fit(X, y)
print(model.WEIGHT)

[0. 0. 0.]


## 3. Multilayer Perceptron <a id="Q3"></a>

Implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights.
Your network must have one hidden layer.
You do not have to update weights via gradient descent. You can use something like the derivative of the sigmoid function to update weights.
Train your model on the Heart Disease dataset from UCI:



In [54]:
import matplotlib.pyplot as plt
import pandas as pd

In [55]:
df = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')

In [59]:
df.head()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1


In [120]:
import random 

def sigmoid(self, s):
    return 1 / (1+np.exp(-s))

def sigmoid_derivative(x): 
    sx = sigmoid(x)
    return sx * (1-sx)
    
class NeuralNetwork:
    def __init__(self, x,y):
        self.input = x
        self.weights1= np.random.rand(self.input.shape[1],4) # considering we have 4 nodes in the hidden layer
        self.weights2 = np.random.rand(4,1)
        self.y = y
        self.output = np. zeros(y.shape)
        
    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.layer2 = sigmoid(np.dot(self.layer1, self.weights2))
        return self.layer2
        
    def backprop(self):
        d_weights2 = np.dot(self.layer1.T, 2*(self.y -self.output)*sigmoid_derivative(self.output))
        d_weights1 = np.dot(self.input.T, np.dot(2*(self.y -self.output)*sigmoid_derivative(self.output), self.weights2.T)*sigmoid_derivative(self.layer1))
    
        self.weights1 += d_weights1
        self.weights2 += d_weights2

    def train(self, X, y):
        self.output = self.feedforward()
        self.backprop()

In [121]:
from sklearn.model_selection import train_test_split

In [122]:
features = df.loc[:,df.columns != 'target']
target = df.iloc[:,-1]
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.20, random_state=42)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((242, 13), (61, 13), (242,), (61,))

In [123]:
nn = NeuralNetwork(X_train, y_train)

In [119]:
for i in range(1500): 
    if i % 100 ==0: 
        print ("for iteration # " + str(i) + "\n")
        print ("Input : \n" + str(X_train))
        print ("Actual Output: \n" + str(y_train))
        print ("Predicted Output: \n" + str(nn.feedforward()))
        print ("Loss: \n" + str(np.mean(np.square(y_train - nn.feed_forward())))) # mean sum squared loss
        print ("\n")
  
    nn.train(X_train, y_train)

for iteration # 0

Input : 
     age  sex  cp  trestbps  chol  fbs  restecg  thalach  exang  oldpeak  \
132   42    1   1       120   295    0        1      162      0      0.0   
202   58    1   0       150   270    0        0      111      1      0.8   
196   46    1   2       150   231    0        1      147      0      3.6   
75    55    0   1       135   250    0        0      161      0      1.4   
176   60    1   0       117   230    1        1      160      1      1.4   
59    57    0   0       128   303    0        0      159      0      0.0   
93    54    0   1       132   288    1        0      159      1      0.0   
6     56    0   1       140   294    0        0      153      0      1.3   
177   64    1   2       140   335    0        1      158      0      0.0   
30    41    0   1       105   198    0        1      168      0      0.0   
22    42    1   0       140   226    0        1      178      0      0.0   
258   62    0   0       150   244    0        1      154    

AttributeError: 'NeuralNetwork' object has no attribute 'feedforward'

## 4. Keras MMP <a id="Q4"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.
Use the Heart Disease Dataset (binary classification)
Use an appropriate loss function for a binary classification task
Use an appropriate activation function on the final layer of your network.
Train your model using verbose output for ease of grading.
Use GridSearchCV to hyperparameter tune your model. (for at least two hyperparameters)
When hyperparameter tuning, show you work by adding code cells for each new experiment.
Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
You must hyperparameter tune at least 5 parameters in order to get a 3 on this section.

In [69]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler

In [60]:
df_keras = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')

In [62]:
print(df_keras.shape)
df_keras.head()

(303, 14)


Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,3,145,233,1,0,150,0,2.3,0,0,1,1
1,37,1,2,130,250,0,1,187,0,3.5,0,0,2,1
2,41,0,1,130,204,0,0,172,0,1.4,2,0,2,1
3,56,1,1,120,236,0,1,178,0,0.8,2,0,2,1
4,57,0,0,120,354,0,1,163,1,0.6,2,0,2,1


In [64]:
features = df_keras.loc[:,df_keras.columns != 'target']
target = df_keras.iloc[:,-1]

In [67]:
X_train, X_test, y_train, y_test = train_test_split(features, target, test_size=0.20, random_state=42)

In [68]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((242, 13), (61, 13), (242,), (61,))

In [70]:
model = Sequential()

In [71]:
model.add(Dense(64, input_shape=[len(X_train.keys())], activation='relu'))

W0816 12:10:15.043053 4563686848 deprecation.py:506] From /Users/dwightchurchill/anaconda3/envs/U4-S2-NNF/lib/python3.7/site-packages/tensorflow/python/ops/init_ops.py:1251: calling VarianceScaling.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor


In [72]:
model.add(Dense(64, activation='relu'))

In [73]:
model.add(Dense(1))

In [74]:
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

In [75]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 64)                896       
_________________________________________________________________
dense_1 (Dense)              (None, 64)                4160      
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 65        
Total params: 5,121
Trainable params: 5,121
Non-trainable params: 0
_________________________________________________________________


In [76]:
model.fit(X_train, y_train, epochs=1000, verbose=1)

Epoch 1/1000
Epoch 2/1000
Epoch 3/1000
Epoch 4/1000
Epoch 5/1000
Epoch 6/1000
Epoch 7/1000
Epoch 8/1000
Epoch 9/1000
Epoch 10/1000
Epoch 11/1000
Epoch 12/1000
Epoch 13/1000
Epoch 14/1000
Epoch 15/1000
Epoch 16/1000
Epoch 17/1000
Epoch 18/1000
Epoch 19/1000
Epoch 20/1000
Epoch 21/1000
Epoch 22/1000
Epoch 23/1000
Epoch 24/1000
Epoch 25/1000
Epoch 26/1000
Epoch 27/1000
Epoch 28/1000
Epoch 29/1000
Epoch 30/1000
Epoch 31/1000
Epoch 32/1000
Epoch 33/1000
Epoch 34/1000
Epoch 35/1000
Epoch 36/1000
Epoch 37/1000
Epoch 38/1000
Epoch 39/1000
Epoch 40/1000
Epoch 41/1000
Epoch 42/1000
Epoch 43/1000
Epoch 44/1000
Epoch 45/1000
Epoch 46/1000
Epoch 47/1000
Epoch 48/1000
Epoch 49/1000
Epoch 50/1000
Epoch 51/1000
Epoch 52/1000
Epoch 53/1000
Epoch 54/1000
Epoch 55/1000
Epoch 56/1000
Epoch 57/1000
Epoch 58/1000
Epoch 59/1000
Epoch 60/1000
Epoch 61/1000
Epoch 62/1000
Epoch 63/1000
Epoch 64/1000
Epoch 65/1000
Epoch 66/1000
Epoch 67/1000
Epoch 68/1000
Epoch 69/1000
Epoch 70/1000
Epoch 71/1000
Epoch 72/1000
E

<tensorflow.python.keras.callbacks.History at 0x1a4822f550>

In [79]:
scores = model.evaluate(X_train,y_train)
print(f"{model.metrics_names[1]}: {scores[1]*100}")

acc: 90.08264541625977


In [80]:
scores = model.evaluate(X_test,y_test)
print(f"{model.metrics_names[1]}: {scores[1]*100}")

acc: 78.68852615356445


### Hyperparameter Tuning

In [82]:
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier

#### Batch Size and Epochs

In [84]:
# create function for model creation for KerasClassifier
def create_model():
    model = Sequential()
    model.add(Dense(64, input_shape=[len(X_train.keys())], activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])
    return model

In [85]:
model = KerasClassifier(build_fn=create_model, verbose=1)

In [None]:
#define grid search parameters
param_grid = {'batch_size': [10, 20, 60, 100],
             'epochs': [1000]}

#create grid search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X_train, y_train)



Epoch 1/1000
Epoch 2/1000
Epoch 3/1000
Epoch 4/1000
Epoch 5/1000
Epoch 6/1000
Epoch 7/1000
Epoch 8/1000
Epoch 9/1000
Epoch 10/1000
Epoch 11/1000
Epoch 12/1000
Epoch 13/1000
Epoch 14/1000
Epoch 15/1000
Epoch 16/1000
Epoch 17/1000
Epoch 18/1000
Epoch 19/1000
Epoch 20/1000
Epoch 21/1000
Epoch 22/1000
Epoch 23/1000
Epoch 24/1000
Epoch 25/1000
Epoch 26/1000
Epoch 27/1000
Epoch 28/1000
Epoch 29/1000
Epoch 30/1000
Epoch 31/1000
Epoch 32/1000
Epoch 33/1000
Epoch 34/1000
Epoch 35/1000
Epoch 36/1000
Epoch 37/1000
Epoch 38/1000
Epoch 39/1000
Epoch 40/1000
Epoch 41/1000
Epoch 42/1000
Epoch 43/1000
Epoch 44/1000
Epoch 45/1000
Epoch 46/1000
Epoch 47/1000
Epoch 48/1000
Epoch 49/1000
Epoch 50/1000
Epoch 51/1000
Epoch 52/1000
Epoch 53/1000
Epoch 54/1000
Epoch 55/1000
Epoch 56/1000
Epoch 57/1000
Epoch 58/1000
Epoch 59/1000
Epoch 60/1000
Epoch 61/1000
Epoch 62/1000
Epoch 63/1000
Epoch 64/1000
Epoch 65/1000
Epoch 66/1000
Epoch 67/1000
Epoch 68/1000
Epoch 69/1000
Epoch 70/1000
Epoch 71/1000
Epoch 72/1000
E

In [118]:
# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")

NameError: name 'grid_result' is not defined