<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Perceptron on XOR Gates](#Q2)
3. [Multilayer Perceptron](#Q3)
4. [Keras MLP](#Q4)

<a id="Q1"></a>
## 1. Define the following terms:

- **Neuron:** Neurons are the basic units that make up the input, hidden, and output layers of a neural network. A neuron takes a set of inputs, multiplies each by a weight, sums the results, and passes the sum through an activation function.
- **Input Layer:** The input layer is what brings the initial data into the neural network for further processing. It is the very beginning of the neural network pipeline.
- **Hidden Layer:** The hidden layer (or layers) is used to transform the inputs into values that are useful for the output layer.
- **Output Layer:** The output layer is the last layer in a neural network and produces the desired output for the model.
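- **Activation:** The activation is the value a neuron outputs after its weighted sum is passed through an activation function (for example, sigmoid). The activation function determines whether, and how strongly, the neuron fires for a given input.
- **Backpropagation:** Backpropagation is the method by which a neural network adjusts its weights after a forward pass. The error at the output is propagated backward through the layers, and each weight is updated according to its contribution to that error.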
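
To make the neuron and activation definitions above concrete, here is a minimal sketch of a single sigmoid neuron computing its output. The input and weight values are hypothetical and chosen only for illustration; the last input is a constant 1 that plays the role of a bias, as in the training table used later.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values for illustration only.
inputs = np.array([1.0, 0.0, 1.0])    # two features plus a constant 1 (bias input)
weights = np.array([0.5, -0.3, 0.3])  # one weight per input

weighted_sum = np.dot(inputs, weights)  # 0.5*1 + (-0.3)*0 + 0.3*1 = 0.8
activation = sigmoid(weighted_sum)      # the neuron's output, between 0 and 1

print(weighted_sum, activation)
```

A layer is many such neurons evaluated on the same inputs, which is why the later cells use matrix products instead of a single dot product.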


## 2. Perceptron on XOR Gates <a id="Q2"></a>

Create a perceptron class that can model the behavior of an AND gate. You can use the following table as your training data:

| x1 | x2 | x3 | y |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 0 |
| 0 | 0 | 1 | 0 |

In [1]:
import numpy as np
import pandas as pd

np.random.seed(812)

inputs = np.array([
    [1,1,1],
    [1,0,1],
    [0,1,1],
    [0,0,1]
])

ground_truth = [[1], [0], [0], [0]]

In [2]:
def sigmoid(x):
    return 1 / (1+np.exp(-x))

def sigmoid_derivate(x):
    sx = sigmoid(x)
    return sx * (1-sx)
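
As a sanity check on the derivative formula, the sketch below compares `sigmoid_derivate` against a central finite difference. The sample points and step size `h` are arbitrary choices made for this illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivate(x):
    # Analytic derivative of the sigmoid, evaluated at the pre-activation x
    sx = sigmoid(x)
    return sx * (1 - sx)

xs = np.linspace(-4.0, 4.0, 9)  # arbitrary sample points
h = 1e-5                        # arbitrary small step

# Central difference approximation of d(sigmoid)/dx
numeric = (sigmoid(xs + h) - sigmoid(xs - h)) / (2 * h)
analytic = sigmoid_derivate(xs)

max_gap = np.max(np.abs(numeric - analytic))
print(max_gap)
```

The two agree to within floating-point noise, and the slope peaks at 0.25 when `x` is 0. Note that the function expects the weighted sum as its argument, not an already-activated value.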

In [3]:
# Random initial weights in [-1, 1), one per input column
weights = 2 * np.random.random((3, 1)) - 1
weights

array([[ 0.0099616 ],
       [ 0.21185521],
       [-0.08502562]])

In [4]:
for iteration in range(1000):
    
    # Weighted sum of inputs/weights
    weighted_sum = np.dot(inputs, weights)
    #print(weighted_sum)
    
    # Activate!
    activated_output = sigmoid(weighted_sum)
    #print(activated_output)
    
    # Calculate the error
    error = ground_truth - activated_output
    
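    # Adjustments: scale the error by the slope of the sigmoid.
    # Note: sigmoid_derivate applies sigmoid to its argument, so this evaluates
    # the slope at the activated output rather than at weighted_sum. The sign of
    # the update is unchanged, and the printed results below come from this form.
    adjustments = error * sigmoid_derivate(activated_output)
    
    # Update the weights
    weights += np.dot(inputs.T, adjustments)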
    #print(weights)
    
print("Weights after training")
print(weights)

print("Output after training")
print(activated_output)

Weights after training
[[  7.20886354]
 [  7.20886371]
 [-11.10146556]]
Output after training
[[9.64948649e-01]
 [2.00042945e-02]
 [2.00042978e-02]
 [1.51353405e-05]]
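
The prompt asks for a perceptron class, while the cells above train with a bare loop. One possible way to wrap the same idea in a class is sketched below. The class name, method names, and seed handling are assumptions made for this sketch. It also differs from the loop above in one respect: the slope is written as `output * (1 - output)`, which is the sigmoid derivative at the weighted sum, so the trained weights will not match the printed ones.

```python
import numpy as np

class Perceptron:
    """Single sigmoid neuron (hypothetical wrapper around the training loop)."""

    def __init__(self, n_inputs, seed=812):
        rng = np.random.default_rng(seed)
        # Random initial weights in [-1, 1), one per input column
        self.weights = 2 * rng.random((n_inputs, 1)) - 1

    @staticmethod
    def _sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def predict(self, X):
        # Weighted sum of inputs followed by the sigmoid activation
        return self._sigmoid(np.dot(X, self.weights))

    def fit(self, X, y, epochs=1000):
        for _ in range(epochs):
            output = self.predict(X)
            error = y - output
            # Sigmoid slope at the weighted sum, expressed through the output
            adjustments = error * output * (1 - output)
            self.weights += np.dot(X.T, adjustments)
        return self

# AND-gate training data from the table above (third column is the bias input)
inputs = np.array([[1, 1, 1], [1, 0, 1], [0, 1, 1], [0, 0, 1]])
ground_truth = np.array([[1], [0], [0], [0]])

model = Perceptron(n_inputs=3).fit(inputs, ground_truth)
predictions = model.predict(inputs)
print((predictions >= 0.5).astype(int).ravel())
```

Thresholding the outputs at 0.5 turns the sigmoid values into the gate's 0/1 answers.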


## 3. Multilayer Perceptron <a id="Q3"></a>

Implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights.

- Your network must have one hidden layer.
- You do not have to update weights via gradient descent. You can use something like the derivative of the sigmoid function to update weights.
- Train your model on the Heart Disease dataset from UCI:



### Import data preprocessors

In [5]:
from sklearn.preprocessing import StandardScaler
num_transform = StandardScaler()

import category_encoders as ce
cat_transform = ce.one_hot.OneHotEncoder(verbose=0, cols=None, drop_invariant=False, return_df=True, handle_missing='value', handle_unknown='value', use_cat_names=False)

### Create class NeuralNetwork

In [6]:
class NeuralNetwork:
    def __init__(self):
        # Set up Architecture of Neural Network
        self.inputs = 30
        self.hiddenNodes = 5
        self.outputNodes = 1
        
        # Initial Weights 
        # 30x5 matrix for input to hidden layer
        self.weights1 = np.random.randn(self.inputs, self.hiddenNodes)
        # 5x1 matrix for hidden to output layer
        self.weights2 = np.random.randn(self.hiddenNodes, self.outputNodes)
        
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        # s must already be a sigmoid output; s * (1 - s) is then the sigmoid slope
        return s * (1 - s)
        
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        """
        
        # Weighted sum of inputs into the hidden layer
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2)
        
        #Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output
    
    def backward(self, X, y, o):
        """
        Backward propagate through the network
        """
        
        self.o_error = y - o #error in output
        self.o_delta = self.o_error * self.sigmoidPrime(o) # apply derivative of sigmoid to error
        
        self.z2_error = self.o_delta.dot(self.weights2.T) # z2 error: how much our hidden layer weights were off
        self.z2_delta = self.z2_error*self.sigmoidPrime(self.activated_hidden)
        
        self.weights1 += X.T.dot(self.z2_delta) #Adjust first set (input => hidden) weights
        self.weights2 += self.activated_hidden.T.dot(self.o_delta) #adjust second set (hidden => output) weights
        
    def train(self, X, y):
        o = self.feed_forward(X)
        self.backward(X, y, o)
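
To see how the array shapes line up in `feed_forward` and `backward`, the sketch below traces one pass with random data. The sample count of 4 and the random values are hypothetical. The layer sizes (30 inputs, 5 hidden nodes, 1 output) mirror the class above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_inputs, n_hidden, n_outputs = 4, 30, 5, 1  # n_samples is hypothetical

X = rng.normal(size=(n_samples, n_inputs))
y = rng.integers(0, 2, size=(n_samples, n_outputs))
weights1 = rng.normal(size=(n_inputs, n_hidden))
weights2 = rng.normal(size=(n_hidden, n_outputs))

def sigmoid(s):
    return 1 / (1 + np.exp(-s))

# Forward pass
activated_hidden = sigmoid(X @ weights1)                 # (4, 5)
activated_output = sigmoid(activated_hidden @ weights2)  # (4, 1)

# Backward pass: each delta has the same shape as the activations of its layer
o_delta = (y - activated_output) * activated_output * (1 - activated_output)   # (4, 1)
z2_delta = (o_delta @ weights2.T) * activated_hidden * (1 - activated_hidden)  # (4, 5)

# Each update has the same shape as the weight matrix it adjusts
update1 = X.T @ z2_delta               # (30, 5)
update2 = activated_hidden.T @ o_delta  # (5, 1)

print(update1.shape, update2.shape)
```

Because each update matches the shape of its weight matrix, the in-place `+=` in `backward` is well defined for any number of samples.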

### Import Data

In [7]:
df = pd.read_csv('heart.csv')

In [8]:
df.head()

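| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |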


In [9]:
df.nunique()

age          41
sex           2
cp            4
trestbps     49
chol        152
fbs           2
restecg       3
thalach      91
exang         2
oldpeak      40
slope         3
ca            5
thal          4
target        2
dtype: int64

### Transform numerical and categorical columns

In [28]:
# Separating numerical and categorical variables
df_cat = df[['sex', 'cp', 'fbs', 'restecg', 'exang', 'slope', 'ca', 'thal']]
df_num = df[['age', 'trestbps', 'chol', 'thalach', 'oldpeak']]
y = df[['target']]

In [11]:
# Transforming categorical variables into strings
df_cat = df_cat.astype(str)

In [12]:
# One hot encode categoricals
df_cat = cat_transform.fit_transform(df_cat)

# Standard scale numericals
df_num = num_transform.fit_transform(df_num)



In [13]:
# Verify shape of dataframes for merge
df_cat.shape, df_num.shape

((303, 25), (303, 5))
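
The 25 one-hot columns can be cross-checked against the `df.nunique()` output shown earlier: each categorical column contributes one indicator column per unique value. The short sketch below redoes that arithmetic with the counts copied from that output.

```python
# Unique-value counts copied from the df.nunique() output for the categorical columns
categorical_levels = {
    "sex": 2, "cp": 4, "fbs": 2, "restecg": 3,
    "exang": 2, "slope": 3, "ca": 5, "thal": 4,
}
numeric_columns = ["age", "trestbps", "chol", "thalach", "oldpeak"]

# One indicator column per category level
one_hot_columns = sum(categorical_levels.values())
total_features = one_hot_columns + len(numeric_columns)

print(one_hot_columns, total_features)  # 25 30
```

The total of 30 matches `self.inputs = 30` in the `NeuralNetwork` class.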

In [14]:
# Transform df_num into dataframe for merge
df_num = pd.DataFrame(df_num)

In [15]:
# Merge Dataframes
X = df_cat.merge(df_num, how='outer', left_index=True, right_index=True)

In [16]:
# Check shape of X
X.shape

(303, 30)

### Neural Net Implementation

In [17]:
# Instantiate neural network class
nn= NeuralNetwork()

In [37]:
# Loss consistently decreases with each iteration!

for i in range(1000):
    # Report on the first epoch and then on every 250th epoch
    if i == 0 or (i + 1) % 250 == 0:
        print('+' + '---' * 3 + f'EPOCH {i+1}' + '---'*3 + '+')
        #print('Input: \n', X)
        print('Actual Output: \n', y)
        print('Predicted Output: \n', str(nn.feed_forward(X)))
        print("Loss: \n", str(np.mean(np.square(y - nn.feed_forward(X)))))
    nn.train(X,y)

+---------EPOCH 1---------+
Actual Output: 
      target
0         1
1         1
2         1
3         1
4         1
5         1
6         1
7         1
8         1
9         1
10        1
11        1
12        1
13        1
14        1
15        1
16        1
17        1
18        1
19        1
20        1
21        1
22        1
23        1
24        1
25        1
26        1
27        1
28        1
29        1
..      ...
273       0
274       0
275       0
276       0
277       0
278       0
279       0
280       0
281       0
282       0
283       0
284       0
285       0
286       0
287       0
288       0
289       0
290       0
291       0
292       0
293       0
294       0
295       0
296       0
297       0
298       0
299       0
300       0
301       0
302       0

[303 rows x 1 columns]
Predicted Output: 
 [[0.99995674]
 [1.        ]
 [0.99999999]
 [0.99999999]
 [0.99999998]
 [0.99993956]
 [0.99999998]
 [0.99999998]
 [0.99999998]
 [1.        ]
 [0.98723591]
 [1.        ]
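
The loop reports mean squared error on the raw sigmoid outputs. To express the same predictions as class labels and an accuracy figure, one option is to threshold at 0.5, as in the sketch below. The labels and probabilities are hypothetical values and are not taken from the run above.

```python
import numpy as np

# Hypothetical labels and sigmoid outputs for illustration only
y_true = np.array([[1], [1], [0], [0]])
y_prob = np.array([[0.93], [0.41], [0.12], [0.58]])

# Same loss expression as in the training loop
mse = np.mean(np.square(y_true - y_prob))

# Threshold the probabilities to get 0/1 predictions, then compare with the labels
y_pred = (y_prob >= 0.5).astype(int)
accuracy = np.mean(y_pred == y_true)

print(mse, accuracy)
```

A low loss and a high accuracy usually move together, but they measure different things: the loss penalizes how far each probability is from its label, while accuracy only counts which side of the threshold it falls on.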

## 4. Keras MLP <a id="Q4"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.

- Use the Heart Disease Dataset (binary classification).
- Use an appropriate loss function for a binary classification task.
- Use an appropriate activation function on the final layer of your network.
- Train your model using verbose output for ease of grading.
- Use GridSearchCV to hyperparameter tune your model (for at least two hyperparameters).
- When hyperparameter tuning, show your work by adding code cells for each new experiment.
- Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
- You must hyperparameter tune at least 5 parameters in order to get a 3 on this section.
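
To get a feel for how many models a grid search trains, the sketch below enumerates every combination in a small grid using only the standard library. The grid values and the fold count are assumptions for illustration. GridSearchCV performs this enumeration internally.

```python
from itertools import product

# Hypothetical grid for illustration
param_grid = {"batch_size": [20, 60, 100], "epochs": [20, 40, 60]}

# Cartesian product of all parameter values, one dict per combination
names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

cv_folds = 5  # assumed number of cross-validation folds
total_fits = len(combinations) * cv_folds

for combo in combinations:
    print(combo)
print(f"{len(combinations)} combinations x {cv_folds} folds = {total_fits} fits")
```

Each added hyperparameter multiplies the number of combinations, so tuning five parameters at once can become expensive quickly.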

### Import Keras modules

In [31]:
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Reshape, Conv2D, AveragePooling2D, Flatten
from keras.layers import MaxPooling2D
from keras.wrappers.scikit_learn import KerasClassifier

from sklearn.model_selection import GridSearchCV

### Model Function

In [32]:
# Function to create model, required for KerasClassifier
def create_model():
    # create model
    model = Sequential()
    model.add(keras.layers.Dense(12, input_dim=30, activation='relu'))
    model.add(keras.layers.Dense(12, activation='relu'))
    model.add(keras.layers.Dense(12, activation='relu'))
    model.add(keras.layers.Dense(12, activation='relu'))
    model.add(keras.layers.Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# create model
model = KerasClassifier(build_fn=create_model, verbose=1)
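
The model is compiled with `binary_crossentropy`, which suits a sigmoid output on a 0/1 target. As a rough illustration of what that loss measures, here is a NumPy sketch of the standard formula. The helper name, the clipping constant `eps`, and the sample values are assumptions made for this example. It is not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps is an assumed clipping constant to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

# Hypothetical labels and predicted probabilities
y_true = np.array([1.0, 0.0, 1.0, 0.0])
mostly_right = np.array([0.9, 0.1, 0.8, 0.2])
mostly_wrong = np.array([0.1, 0.9, 0.2, 0.8])

loss_right = binary_crossentropy(y_true, mostly_right)
loss_wrong = binary_crossentropy(y_true, mostly_wrong)
print(loss_right, loss_wrong)
```

Confident wrong predictions are penalized much more heavily than confident right ones are rewarded, which is what pushes the sigmoid output toward the correct class during training.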


### GridSearch

In [36]:
# Define the grid search parameters
param_grid = {'batch_size': [20, 60, 100],
              'epochs': [20, 40, 60]}

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20