# Neural Networks Sprint Challenge

## 1) Define the following terms:

- Neuron
- Input Layer
- Hidden Layer
- Output Layer
- Activation
- Backpropagation

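Neuron = An individual node in a neural network. It computes a weighted sum of its inputs, applies an activation function, and passes the result on.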

Input Layer = The layer that takes in the columns of data used to make predictions. Each column (feature) is a node in the input layer.

Hidden Layer = A layer that sits between the input layer and the output layer. A network has exactly one input layer and one output layer, but it can have any number of hidden layers.

Output Layer = The output layer receives information from the last hidden layer and produces the network's prediction.

Activation = The activation function controls how much signal gets passed from one layer of a neural network to the next. The activation I use here is the sigmoid.
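As a minimal sketch of the sigmoid activation mentioned above: the function names and the sample inputs below are illustrative choices, not part of the challenge. The derivative is written in terms of the sigmoid's output, which is the form the `sigmoidPrime` method in section 3 expects.

```python
import numpy as np

def sigmoid(z):
    """Squash any real-valued input into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime_from_output(a):
    """Derivative of the sigmoid, expressed via its output a = sigmoid(z)."""
    return a * (1.0 - a)

# Illustrative inputs: strongly negative, zero, strongly positive.
z = np.array([-4.0, 0.0, 4.0])
a = sigmoid(z)                       # activations, all strictly between 0 and 1
slopes = sigmoid_prime_from_output(a)  # largest at z = 0, small in the tails
```

The slope shrinks toward zero as the activation approaches 0 or 1, which is why a saturated sigmoid passes back very little error signal.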

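Backpropagation = An algorithm that works out how much each weight contributed to the error by propagating the error backward through the network with the chain rule. The resulting gradients are then used to update the weights, most commonly with gradient descent. A gradient-based update of this kind is what I use in section 3.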
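To make the split between backpropagation (computing the gradient) and gradient descent (using it) concrete, here is a small sketch for a single sigmoid neuron with a squared-error loss. The input, target, starting weights, and learning rate are all made-up values chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, x, y):
    """Squared error of a single sigmoid neuron with weights w."""
    return 0.5 * (sigmoid(np.dot(w, x)) - y) ** 2

def backprop_gradient(w, x, y):
    """Chain rule: dL/dw = (a - y) * a * (1 - a) * x, with a = sigmoid(w . x)."""
    a = sigmoid(np.dot(w, x))
    return (a - y) * a * (1.0 - a) * x

x = np.array([1.0, 0.0, 1.0])    # illustrative input
y = 1.0                          # illustrative target
w = np.array([0.1, -0.2, 0.3])   # illustrative starting weights
learning_rate = 0.5              # assumed step size

grad = backprop_gradient(w, x, y)    # backpropagation: compute the gradient
w_new = w - learning_rate * grad     # gradient descent: step against it
```

After the step, `loss(w_new, x, y)` is lower than `loss(w, x, y)` for this example, which is the behaviour a training loop repeats many times.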

## 2) Create a perceptron class that can model the behavior of an AND gate. You can use the following table as your training data:

| x1 | x2 | x3 | y |
|----|----|----|---|
| 1  | 1  | 1  | 1 |
| 1  | 0  | 1  | 0 |
| 0  | 1  | 1  | 0 |
| 0  | 0  | 1  | 0 |

In [0]:
import tensorflow

In [2]:
from keras.models import Sequential
from keras.layers import Dense

Using TensorFlow backend.


In [0]:
import numpy as np

In [0]:
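class Perceptron(object):
    """Single-layer perceptron trained with the perceptron learning rule."""

    def __init__(self, rate=0.01, niter=10):
        self.rate = rate    # learning rate
        self.niter = niter  # number of passes over the training data

    def fit(self, X, y):
        # weight[0] is the bias; weight[1:] holds one weight per input column
        self.weight = np.random.uniform(-1, 1, X.shape[1] + 1)

        # number of rows that triggered an update in each pass
        self.errors = []

        for i in range(self.niter):
            err = 0
            for xi, target in zip(X, y):
                # the update is zero whenever the prediction is already correct
                delta_w = self.rate * (target - self.predict(xi))
                self.weight[1:] += delta_w * xi
                self.weight[0] += delta_w
                err += int(delta_w != 0.0)
            self.errors.append(err)
        return self

    def net_input(self, X):
        # weighted sum of the inputs plus the bias
        return np.dot(X, self.weight[1:]) + self.weight[0]

    def predict(self, X):
        # step activation: output 1 when the net input reaches 0.5
        return np.where(self.net_input(X) >= 0.5, 1, 0)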

In [0]:
X = np.array([[1,1,1],
             [1,0,1],
             [0,1,1],
             [0,0,1]])

In [0]:
y = np.array([1, 0, 0, 0])

In [0]:
model = Perceptron(rate = 1, niter=20)

In [8]:
model.fit(X, y)

<__main__.Perceptron at 0x7f679f8e7c88>

In [9]:
model.predict(X)

array([1, 0, 0, 0])
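Because the class above starts from random weights, a single run does not show how the learning rule converges. The sketch below is a deterministic variant under stated assumptions: weights start at zero, the step threshold is 0 rather than 0.5, and training stops at the first pass with no updates. The function name `train_perceptron` and the `max_epochs` cap are hypothetical, not part of the notebook.

```python
import numpy as np

# Training data copied from the AND-gate table (x3 is always 1).
X = np.array([[1, 1, 1],
              [1, 0, 1],
              [0, 1, 1],
              [0, 0, 1]])
y = np.array([1, 0, 0, 0])

def train_perceptron(X, y, rate=1.0, max_epochs=100):
    """Perceptron learning rule with zero-initialized weights and early stopping."""
    w = np.zeros(X.shape[1])
    b = 0.0
    errors_per_epoch = []
    for _ in range(max_epochs):
        errors = 0
        for xi, target in zip(X, y):
            prediction = 1 if np.dot(xi, w) + b >= 0.0 else 0
            update = rate * (target - prediction)  # zero when already correct
            w += update * xi
            b += update
            errors += int(update != 0.0)
        errors_per_epoch.append(errors)
        if errors == 0:  # a full pass with no updates means convergence
            break
    return w, b, errors_per_epoch

w, b, errors_per_epoch = train_perceptron(X, y)
predictions = np.where(X @ w + b >= 0.0, 1, 0)
```

Since the AND gate is linearly separable, the rule is guaranteed to reach a pass with zero updates; `errors_per_epoch` records how many rows were corrected in each pass along the way.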

## 3) Implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights. 
- Your network must have one hidden layer. 
- You do not have to update weights via gradient descent. You can use something like the derivative of the sigmoid function to update weights.
- Train your model on the Heart Disease dataset from UCI:

[Github Dataset](https://github.com/ryanleeallred/datasets/blob/master/heart.csv)

[Raw File on Github](https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv)


In [0]:
import pandas as pd

In [11]:
data = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')
data.head(20)

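|   | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|-----|-----|----|----------|------|-----|---------|---------|-------|---------|-------|----|------|--------|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |
| 5 | 57 | 1 | 0 | 140 | 192 | 0 | 1 | 148 | 0 | 0.4 | 1 | 0 | 1 | 1 |
| 6 | 56 | 0 | 1 | 140 | 294 | 0 | 0 | 153 | 0 | 1.3 | 1 | 0 | 2 | 1 |
| 7 | 44 | 1 | 1 | 120 | 263 | 0 | 1 | 173 | 0 | 0.0 | 2 | 0 | 3 | 1 |
| 8 | 52 | 1 | 2 | 172 | 199 | 1 | 1 | 162 | 0 | 0.5 | 2 | 0 | 3 | 1 |
| 9 | 57 | 1 | 2 | 150 | 168 | 0 | 1 | 174 | 0 | 1.6 | 2 | 0 | 2 | 1 |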


In [0]:
data2 = data.drop('target', axis=1)

In [0]:
from sklearn.preprocessing import StandardScaler

In [0]:
scaler = StandardScaler()

In [15]:
scaler.fit(data2)


StandardScaler(copy=True, with_mean=True, with_std=True)

In [16]:
scaled_df = scaler.fit_transform(data.drop('target', axis=1))



In [0]:
scaled_df = pd.DataFrame(scaled_df, columns=['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach',
       'exang', 'oldpeak', 'slope', 'ca', 'thal'])

In [18]:
scaled_df.head(15)

|   | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal |
|---|-----|-----|----|----------|------|-----|---------|---------|-------|---------|-------|----|------|
| 0 | 0.952197 | 0.681005 | 1.973123 | 0.763956 | -0.256334 | 2.394438 | -1.005832 | 0.015443 | -0.696631 | 1.087338 | -2.274579 | -0.714429 | -2.148873 |
| 1 | -1.915313 | 0.681005 | 1.002577 | -0.092738 | 0.072199 | -0.417635 | 0.898962 | 1.633471 | -0.696631 | 2.122573 | -2.274579 | -0.714429 | -0.512922 |
| 2 | -1.474158 | -1.468418 | 0.032031 | -0.092738 | -0.816773 | -0.417635 | -1.005832 | 0.977514 | -0.696631 | 0.310912 | 0.976352 | -0.714429 | -0.512922 |
| 3 | 0.180175 | 0.681005 | 0.032031 | -0.663867 | -0.198357 | -0.417635 | 0.898962 | 1.239897 | -0.696631 | -0.206705 | 0.976352 | -0.714429 | -0.512922 |
| 4 | 0.290464 | -1.468418 | -0.938515 | -0.663867 | 2.08205 | -0.417635 | 0.898962 | 0.583939 | 1.435481 | -0.379244 | 0.976352 | -0.714429 | -0.512922 |
| 5 | 0.290464 | 0.681005 | -0.938515 | 0.478391 | -1.048678 | -0.417635 | 0.898962 | -0.072018 | -0.696631 | -0.551783 | -0.649113 | -0.714429 | -2.148873 |
| 6 | 0.180175 | -1.468418 | 0.032031 | 0.478391 | 0.922521 | -0.417635 | -1.005832 | 0.146634 | -0.696631 | 0.224643 | -0.649113 | -0.714429 | -0.512922 |
| 7 | -1.143291 | 0.681005 | 0.032031 | -0.663867 | 0.323431 | -0.417635 | 0.898962 | 1.021244 | -0.696631 | -0.896862 | 0.976352 | -0.714429 | 1.123029 |
| 8 | -0.26098 | 0.681005 | 1.002577 | 2.306004 | -0.9134 | 2.394438 | 0.898962 | 0.540209 | -0.696631 | -0.465514 | 0.976352 | -0.714429 | 1.123029 |
| 9 | 0.290464 | 0.681005 | 1.002577 | 1.04952 | -1.51249 | -0.417635 | 0.898962 | 1.064975 | -0.696631 | 0.483451 | 0.976352 | -0.714429 | -0.512922 |


In [0]:
X = np.array(scaled_df)

In [20]:
X

array([[ 0.9521966 ,  0.68100522,  1.97312292, ..., -2.27457861,
        -0.71442887, -2.14887271],
       [-1.91531289,  0.68100522,  1.00257707, ..., -2.27457861,
        -0.71442887, -0.51292188],
       [-1.47415758, -1.46841752,  0.03203122, ...,  0.97635214,
        -0.71442887, -0.51292188],
       ...,
       [ 1.50364073,  0.68100522, -0.93851463, ..., -0.64911323,
         1.24459328,  1.12302895],
       [ 0.29046364,  0.68100522, -0.93851463, ..., -0.64911323,
         0.26508221,  1.12302895],
       [ 0.29046364, -1.46841752,  0.03203122, ..., -0.64911323,
         0.26508221, -0.51292188]])

In [0]:
from sklearn.model_selection import train_test_split

In [0]:
y = np.array([data['target']]).T

In [23]:
y.shape

(303, 1)

In [24]:
X.shape, y.shape

((303, 13), (303, 1))

In [0]:
b=np.random.randn(303, 1)

In [0]:
from scipy import optimize

In [0]:
class Neural_Network(object):
    def __init__(self):
        # Layer sizes: 13 input features, one hidden layer, one output node
        self.inputs = 13
        self.hiddenNodes = 303
        self.outputNodes = 1

        # Randomly initialized weights (this network has no bias terms)
        self.L1_weights = np.random.randn(self.inputs, self.hiddenNodes)
        self.L2_weights = np.random.randn(self.hiddenNodes, self.outputNodes)

    def feed_forward(self, X):
        # Input layer -> hidden layer
        self.hidden_sum = np.dot(X, self.L1_weights)
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        # Hidden layer -> output layer
        self.output_sum = np.dot(self.activated_hidden, self.L2_weights)
        self.activated_output = self.sigmoid(self.output_sum)
        return self.activated_output

    def sigmoid(self, s):
        return 1/(1+np.exp(-s))

    def sigmoidPrime(self, s):
        # Derivative of the sigmoid; s must already be a sigmoid output
        return s * (1 - s)

    def backward(self, X, y, o):
        # Error and delta at the output layer
        self.o_error = y - o
        self.o_delta = self.o_error*self.sigmoidPrime(o)

        # Error propagated back to the hidden layer
        self.z2_error = self.o_delta.dot(self.L2_weights.T)
        self.z2_delta = self.z2_error*self.sigmoidPrime(self.activated_hidden)

        # Weight updates, summed over all rows and applied without a learning rate
        self.L1_weights += X.T.dot(self.z2_delta)
        self.L2_weights += self.activated_hidden.T.dot(self.o_delta)

    def train(self, X, y):
        # One full-batch forward pass followed by one backward pass
        o = self.feed_forward(X)
        self.backward(X, y, o)
        

In [0]:
nn = Neural_Network()

In [0]:
nn.train(X,y)

In [31]:
for i in range (100):
    print('Epoch', i+1, ': ')
    print('loss: ', str(np.mean(np.square(y - nn.feed_forward(X)))))
    print('\n')
    nn.train(X=X, y=y)

Epoch 1 : 
loss:  0.45544554455445546


Epoch 2 : 
loss:  0.45544554455445546


Epoch 3 : 
loss:  0.45544554455445546


Epoch 4 : 
loss:  0.45544554455445546


Epoch 5 : 
loss:  0.45544554455445546


Epoch 6 : 
loss:  0.45544554455445546


Epoch 7 : 
loss:  0.45544554455445546


Epoch 8 : 
loss:  0.45544554455445546


Epoch 9 : 
loss:  0.45544554455445546


Epoch 10 : 
loss:  0.45544554455445546


Epoch 11 : 
loss:  0.45544554455445546


Epoch 12 : 
loss:  0.45544554455445546


Epoch 13 : 
loss:  0.45544554455445546


Epoch 14 : 
loss:  0.45544554455445546


Epoch 15 : 
loss:  0.45544554455445546


Epoch 16 : 
loss:  0.45544554455445546


Epoch 17 : 
loss:  0.45544554455445546


Epoch 18 : 
loss:  0.45544554455445546


Epoch 19 : 
loss:  0.45544554455445546


Epoch 20 : 
loss:  0.45544554455445546


Epoch 21 : 
loss:  0.45544554455445546


Epoch 22 : 
loss:  0.45544554455445546


Epoch 23 : 
loss:  0.45544554455445546


Epoch 24 : 
loss:  0.45544554455445546


Epoch 25 : 
loss:  0.4554
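The constant loss above equals 138/303, which is the mean squared error you would get if every output sat at exactly 0 or 1 and 138 of the 303 rows were wrong. One plausible explanation is that the unscaled, summed weight updates push the sigmoids into saturation on the first step, after which the gradients are effectively zero. The sketch below shows one possible remedy, not the notebook's method: small initial weights, a learning rate, and updates averaged over the batch. The class name `SmallMLP`, the layer sizes, the learning rate, and the synthetic data are all assumptions made so the example runs without downloading the dataset.

```python
import numpy as np

class SmallMLP:
    """One-hidden-layer sigmoid network trained with scaled full-batch updates."""

    def __init__(self, n_inputs, n_hidden, n_outputs, learning_rate=0.5, seed=0):
        rng = np.random.RandomState(seed)
        # Small initial weights keep the sigmoids away from saturation.
        self.W1 = rng.normal(scale=0.5, size=(n_inputs, n_hidden))
        self.W2 = rng.normal(scale=0.5, size=(n_hidden, n_outputs))
        self.learning_rate = learning_rate

    @staticmethod
    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def feed_forward(self, X):
        self.hidden = self.sigmoid(X @ self.W1)
        self.output = self.sigmoid(self.hidden @ self.W2)
        return self.output

    def train_step(self, X, y):
        """Run one forward and backward pass; return the loss before the update."""
        out = self.feed_forward(X)
        out_delta = (y - out) * out * (1.0 - out)
        hidden_delta = (out_delta @ self.W2.T) * self.hidden * (1.0 - self.hidden)
        n = X.shape[0]
        # Average over the batch and scale by the learning rate.
        self.W2 += self.learning_rate * (self.hidden.T @ out_delta) / n
        self.W1 += self.learning_rate * (X.T @ hidden_delta) / n
        return float(np.mean((y - out) ** 2))

# Synthetic stand-in data (not the heart dataset): 200 rows, 4 features.
rng = np.random.RandomState(42)
X_demo = rng.randn(200, 4)
y_demo = (X_demo[:, :1] + X_demo[:, 1:2] > 0).astype(float)  # shape (200, 1)

net = SmallMLP(n_inputs=4, n_hidden=8, n_outputs=1)
losses = [net.train_step(X_demo, y_demo) for _ in range(1000)]
```

Comparing `losses[0]` with `losses[-1]` shows whether training is making progress. The same idea could be tried on the heart data by passing the scaled `X` and `y` with `n_inputs=13`, though the right learning rate and hidden size there would need experimentation.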

## 4) Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy. 

- Use the Heart Disease Dataset (binary classification)
- Use an appropriate loss function for a binary classification task
- Use an appropriate activation function on the final layer of your network. 
- Train your model using verbose output for ease of grading.
- Use GridSearchCV to hyperparameter tune your model. (for at least two hyperparameters)
- When hyperparameter tuning, show your work by adding code cells for each new experiment. 
- Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
- You must hyperparameter tune at least 5 parameters in order to get a 3 on this section.

In [0]:
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier

In [35]:
def keras_model():
    model = Sequential()
    model.add(Dense(6, input_dim=13, activation='sigmoid'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
  
model = KerasClassifier(build_fn=keras_model)
kfold = StratifiedKFold(n_splits=4, shuffle=True, random_state=42)

batch_size = [10,30,50]
epochs = [50,100,150]
param_grid = dict(batch_size=batch_size, epochs=epochs)

grid_search = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=3, cv=kfold)

results = grid_search.fit(X, y)




Epoch 1/150
Epoch 2/150
Epoch 3/150
Epoch 4/150
Epoch 5/150
Epoch 6/150
Epoch 7/150
Epoch 8/150
Epoch 9/150
Epoch 10/150
Epoch 11/150
Epoch 12/150
Epoch 13/150
Epoch 14/150
Epoch 15/150
Epoch 16/150
Epoch 17/150
Epoch 18/150
Epoch 19/150
Epoch 20/150
Epoch 21/150
Epoch 22/150
Epoch 23/150
Epoch 24/150
Epoch 25/150
Epoch 26/150
Epoch 27/150
Epoch 28/150
Epoch 29/150
Epoch 30/150
Epoch 31/150
Epoch 32/150
Epoch 33/150
Epoch 34/150
Epoch 35/150
Epoch 36/150
Epoch 37/150
Epoch 38/150
Epoch 39/150
Epoch 40/150
Epoch 41/150
Epoch 42/150
Epoch 43/150
Epoch 44/150
Epoch 45/150
Epoch 46/150
Epoch 47/150
Epoch 48/150
Epoch 49/150
Epoch 50/150
Epoch 51/150
Epoch 52/150
Epoch 53/150
Epoch 54/150
Epoch 55/150
Epoch 56/150
Epoch 57/150
Epoch 58/150
Epoch 59/150
Epoch 60/150
Epoch 61/150
Epoch 62/150
Epoch 63/150
Epoch 64/150
Epoch 65/150
Epoch 66/150
Epoch 67/150
Epoch 68/150
Epoch 69/150
Epoch 70/150
Epoch 71/150
Epoch 72/150
Epoch 73/150
Epoch 74/150
Epoch 75/150
Epoch 76/150
Epoch 77/150
Epoch 78

In [37]:
print(f"Best Score: {results.best_score_:2f} using {results.best_params_}")

Best Score: 0.858086 using {'batch_size': 30, 'epochs': 150}
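The prompt also asks for the accuracy of every hyperparameter combination, not only the best one. A fitted `GridSearchCV` exposes these in its `cv_results_` dictionary under the keys `params`, `mean_test_score`, and `std_test_score`. The helper below is a hypothetical sketch that formats such a dictionary; the scores in `example_results` are made-up placeholders, not results from this notebook.

```python
def summarize_cv_results(cv_results):
    """Return one line per parameter combination, best mean score first."""
    rows = zip(cv_results["params"],
               cv_results["mean_test_score"],
               cv_results["std_test_score"])
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [f"{mean:.4f} (+/- {std:.4f}) with {params}"
            for params, mean, std in ranked]

# Placeholder dictionary shaped like GridSearchCV.cv_results_ (values are invented).
example_results = {
    "params": [{"batch_size": 10, "epochs": 50},
               {"batch_size": 30, "epochs": 150}],
    "mean_test_score": [0.80, 0.90],
    "std_test_score": [0.02, 0.01],
}

for line in summarize_cv_results(example_results):
    print(line)
```

In the notebook itself, the call would be `summarize_cv_results(results.cv_results_)`, which lists all nine `batch_size` and `epochs` combinations from the grid.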
