# Neural Networks Sprint Challenge

## 1) Define the following terms:

- Neuron - a node in a network
- Input Layer - first layer, receives input from the dataset
- Hidden Layer - layers after the input layer, before the output layer. Deep learning is using two or more of these layers
- Output Layer - the last layer of a neural network
- Activation - function which decides how much signal to pass onto the next layer. Common functions: sigmoid, tanh, step, relu
- Backpropagation - used to calculate a gradient needed for the weights used in the network.

 YOUR ANSWER HERE

## 2) Create a perceptron class that can model the behavior of an AND gate. You can use the following table as your training data:

| x1 | x2 | x3 | y |
|----|----|----|---|
| 1  | 1  | 1  | 1 |
| 1  | 0  | 1  | 0 |
| 0  | 1  | 1  | 0 |
| 0  | 0  | 1  | 0 |

In [10]:
##### Your Code Here #####
import numpy as np

inputs = np.array([[1,1,1],
                   [1,0,1],
                   [0,1,1],
                   [0,0,1]])

correct_outputs = [[1],
                   [0],
                   [0],
                   [0]]

In [24]:
class Perceptron(object):
    def __init__(self, rate = 0.01, niter = 10):
        self.rate = rate
        self.niter = niter

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """

        # weights
        self.weight = np.zeros(1 + X.shape[1])

        # Number of misclassifications
        self.errors = []  # Number of misclassifications

        for i in range(self.niter):
            err = 0
            for xi, target in zip(X, y):
                delta_w = self.rate * (target - self.predict(xi))
                self.weight[1:] += delta_w * xi
                self.weight[0] += delta_w
                err += int(delta_w != 0.0)
            self.errors.append(err)
            return self

    def net_input(self, X):
        """Calculate net input"""
        return np.dot(X, self.weight[1:]) + self.weight[0]

    def predict(self, X):
        """Return class label after unit step"""
        return np.where(self.net_input(X) >= 0.0, 1, 0)
    
pn = Perceptron(0.1, 10)
pn.fit(inputs, correct_outputs)
pn.predict(inputs)

array([0, 0, 0, 0])

## 3) Implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights. 
- Your network must have one hidden layer. 
- You do not have to update weights via gradient descent. You can use something like the derivative of the sigmoid function to update weights.
- Train your model on the Heart Disease dataset from UCI:

[Github Dataset](https://github.com/ryanleeallred/datasets/blob/master/heart.csv)

[Raw File on Github](https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv)


In [42]:
##### Your Code Here #####
import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')
df.head()

X = df.drop('target', axis=1)
y = df.target

X = np.array(X)

y=np.array(y)
y=np.reshape(y, (303,1))

X.shape, y.shape

((303, 13), (303, 1))

In [43]:
class Neural_Network(object):
    def __init__(self):
        self.inputs = 13
        self.hiddenNodes = 12
        self.outputNodes = 1
        
        # Initialize weights
        self.L1_weights = np.random.randn(self.inputs, self.hiddenNodes)
        self.L2_weights = np.random.randn(self.hiddenNodes, self.outputNodes)
        
    def feed_forward(self, X):
        # Weighted sum between inputs and hidden layer
        self.hidden_sum = np.dot(X, self.L1_weights)
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        # Weighted sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.L2_weights)
        # final actiavtion of output
        self.activated_output = self.sigmoid(self.output_sum)
        return self.activated_output
    
    def sigmoid(self, s):
        return 1/(1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        return s * (1-s)
    
    def backward(self, X, y, o):
        # backward propagate through the network
        self.o_error = y - o # error in output
        self.o_delta = self.o_error*self.sigmoidPrime(o) # applying derivative of sigmoid to error 
        
        self.z2_error = self.o_delta.dot(self.L2_weights.T) # z2 error: how much our hidden layer weights contributed to output error
        self.z2_delta = self.z2_error*self.sigmoidPrime(self.activated_hidden) # applying derivative of sigmoid to z2 error
        
        self.L1_weights += X.T.dot(self.z2_delta)
        self.L2_weights += self.activated_hidden.T.dot(self.o_delta) # adjusting second set (hidden --> output) weights
    
    def train(self, X, y):
        o = self.feed_forward(X)
        self.backward(X, y, o)

In [47]:
NN = Neural_Network()

for _ in range(1000):
    NN.train(X, y)
NN.feed_forward(X)

array([[3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],
       [3.82687048e-35],


## 4) Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy. 

- Use the Heart Disease Dataset (binary classification)
- Use an appropriate loss function for a binary classification task
- Use an appropriate activation function on the final layer of your network. 
- Train your model using verbose output for ease of grading.
- Use GridSearchCV to hyperparameter tune your model. (for at least two hyperparameters)
- When hyperparameter tuning, show you work by adding code cells for each new experiment. 
- Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
- You must hyperparameter tune at least 5 parameters in order to get a 3 on this section.

In [48]:
##### Your Code Here #####
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
import numpy as np

batch_size = 64
num_classes = 1
epochs = 50

model = Sequential()
model.add(Dense(16, activation = 'relu', input_shape=(13,)))
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(num_classes, activation='sigmoid'))
model.compile(loss='mse', optimizer = 'adam', metrics=['accuracy'])
model.summary()

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.


Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 16)                224       
_________________________________________________________________
dense_2 (Dense)              (None, 16)                272       
_________________________________________________________________
dropout_1 (Dropout)          (None, 16)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 17        
Total params: 513
Trainable params: 513
Non-trainable params: 0
_________________________________________________________________


In [None]:
history = model.fit(X, y, epochs=epochs, validation_split=.1)
scores = model.evaluate(X,y)

Instructions for updating:
Use tf.cast instead.
Train on 272 samples, validate on 31 samples
Epoch 1/50
