<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Chocolate Gummy Bears](#Q2)
    - Perceptron
    - Multilayer Perceptron
4. [Keras MMP](#Q3)

<a id="Q1"></a>
## 1. Define the following terms:

- **Neuron:**
- **Input Layer:**
- **Hidden Layer:**
- **Output Layer:**
- **Activation:**
- **Backpropagation:**


## 2. Chocolate Gummy Bears <a id="Q2"></a>

Right now, you're probably thinking, "yuck, who the hell would eat that?". Great question. Your candy company wants to know too. And you thought I was kidding about the [Chocolate Gummy Bears](https://nuts.com/chocolatessweets/gummies/gummy-bears/milk-gummy-bears.html?utm_source=google&utm_medium=cpc&adpos=1o1&gclid=Cj0KCQjwrfvsBRD7ARIsAKuDvMOZrysDku3jGuWaDqf9TrV3x5JLXt1eqnVhN0KM6fMcbA1nod3h8AwaAvWwEALw_wcB). 

Let's assume that a candy company has gone out and collected information on the types of Halloween candy kids ate. Our candy company wants to predict the eating behavior of witches, warlocks, and ghosts -- aka costumed kids. They shared a sample dataset with us. Each row represents a piece of candy that a costumed child was presented with during "trick" or "treat". We know if the candy was `chocolate` (or not chocolate) or `gummy` (or not gummy). Your goal is to predict if the costumed kid `ate` the piece of candy. 

If both chocolate and gummy equal one, you've got a chocolate gummy bear on your hands!?!?!
![Chocolate Gummy Bear](https://ed910ae2d60f0d25bcb8-80550f96b5feb12604f4f720bfefb46d.ssl.cf1.rackcdn.com/3fb630c04435b7b5-2leZuM7_-zoom.jpg)

In [18]:
import pandas as pd
candy = pd.read_csv('chocolate_gummy_bears.csv')

In [19]:
print(candy.shape)
candy.head()

(10000, 3)


Unnamed: 0,chocolate,gummy,ate
0,0,1,1
1,1,0,1
2,0,1,1
3,0,0,0
4,1,1,0


### Perceptron

To make predictions on the `candy` dataframe. Build and train a Perceptron using numpy. Your target column is `ate` and your features: `chocolate` and `gummy`. Do not do any feature engineering. :P

Once you've trained your model, report your accuracy. You will not be able to achieve more than ~50% with the simple perceptron. Explain why you could not achieve a higher accuracy with the *simple perceptron* architecture, because it's possible to achieve ~95% accuracy on this dataset. Provide your answer in markdown (and *optional* data anlysis code) after your perceptron implementation. 

In [20]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

In [21]:
# Start your candy perceptron here

X = candy[['chocolate', 'gummy']].values
y = candy['ate'].values

In [22]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42,
                                                 )

X_train.shape, X_test.shape, y_train.shape, y_test.shape

((8000, 2), (2000, 2), (8000,), (2000,))

In [23]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1-sx)

In [24]:
weights = 2*np.random.random((2, 1)) - 1

In [None]:
for iteration in range(10000):
    
    #weighted sum of inputs / weights
    weighted_sum = np.dot(X_train, weights)
    
    #activate
    activated_output = sigmoid(weighted_sum)
    
    #cac error
    error = y_train - activated_output
    
    adjustments = error * sigmoid_derivative(activated_output)
    
    #update the weights
    
    weights += np.dot(X_train.T, adjustments)

In [25]:
activated_output[activated_output < .5] = 0
activated_output[activated_output >= .5] = 1

In [26]:
print("accuracy:", accuracy_score(activated_output, y_train))

accuracy: 0.5025


### Multilayer Perceptron <a id="Q3"></a>

Using the sample candy dataset, implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights. Your Multilayer Perceptron should be implemented in Numpy. 
Your network must have one hidden layer.

Once you've trained your model, report your accuracy. Explain why your MLP's performance is considerably better than your simple perceptron's on the candy dataset. 

In [27]:
class NeuralNetwork: 
    def __init__(self):
        # Set up Architecture 
        self.inputs = 2
        self.hiddenNodes = 8
        self.outputNodes = 1
        
        #Initial weights
        self.weights1 = np.random.randn(self.inputs, self.hiddenNodes)
        self.weights2 = np.random.randn(self.hiddenNodes, self.outputNodes)
    
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        return np.exp(-s)/((1+np.exp(-s))**2)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        """
        
        #Weighted sume of inputs and hidden layer
        self.h2 = np.dot(X, self.weights1)
        
        #Acivations of weighted sum
        self.a2 = self.sigmoid(self.h2)
        
        # Weight sum between hidden and output
        self.h3 = np.dot(self.a2, self.weights2)
        
        #Final activation of output
        self.a3 = self.sigmoid(self.h3)
        
        return self.a3
    
    def backward(self, X, y, o, learning_rate=0.1):
        """
        Backward propagate through the network
        """
        self.o_error = y - o #error in output
        self.o_delta = learning_rate*(self.o_error * self.sigmoidPrime(o)) # apply derivative of sigmoid to error
        
        self.z2_error = self.o_delta.dot(self.weights2.T) # z2 error: how much our hidden layer weights were off
        self.z2_delta = self.z2_error*self.sigmoidPrime(self.a2)
        
        self.weights1 += X.T.dot(self.z2_delta) #Adjust first set (input => hidden) weights
        self.weights2 += self.a2.T.dot(self.o_delta) #adjust second set (hidden => output) weights
        
    def train(self, X, y, learning_rate=0.01):
        for _ in range(10000):
            o = self.feed_forward(X)
            self.backward(X, y, o, learning_rate=learning_rate)
        self.loss = np.mean(np.square(y-self.feed_forward(X)))
        print("Loss: " + str(self.loss))
        
    def predict(self, X, y):
        preds = self.feed_forward(X)
        
        preds[preds < .5] = 0
        preds[preds >= .5] = 1
        
        print("Accuracy:", accuracy_score(y, preds))

In [28]:
y_train = y_train.reshape(-1,1)
y_test = y_test.reshape(-1,1)

In [29]:
nn = NeuralNetwork()

nn.train(X_train, y_train)

nn.predict(X_test, y_test)
#I ran this before I restarted my kernal. I have over .92 before and I can't replicate since.\
#Can't figure out why

Loss: 0.20077882325769553
Accuracy: 0.729


P.S. Don't try candy gummy bears. They're disgusting. 

## 3. Keras MMP <a id="Q3"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.
Use the Heart Disease Dataset (binary classification)
Use an appropriate loss function for a binary classification task
Use an appropriate activation function on the final layer of your network.
Train your model using verbose output for ease of grading.
Use GridSearchCV or RandomSearchCV to hyperparameter tune your model. (for at least two hyperparameters)
When hyperparameter tuning, show you work by adding code cells for each new experiment.
Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
You must hyperparameter tune at least 3 parameters in order to get a 3 on this section.

In [3]:
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')
df = df.sample(frac=1)
print(df.shape)
df.head()

(303, 14)


Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
176,60,1,0,117,230,1,1,160,1,1.4,2,2,3,0
130,54,0,2,160,201,0,1,163,0,0.0,2,1,2,1
255,45,1,0,142,309,0,0,147,1,0.0,1,3,3,0
50,51,0,2,130,256,0,0,149,0,0.5,2,0,2,1
232,55,1,0,160,289,0,0,145,1,0.8,1,1,3,0


In [4]:
import tensorflow
from tensorflow import keras 
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Input
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import SGD
from sklearn.preprocessing import OrdinalEncoder
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
import matplotlib.pyplot as plt

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-c559cab32e20>", line 3, in <module>
    from tensorflow.keras.models import Sequential, Model
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 959, in _find_and_load_unlocked
  File "C:\Users\archi\Anaconda3\lib\site-packages\tensorflow\__init__.py", line 50, in __getattr__
    module = self._load()
  File "C:\Users\archi\Anaconda3\lib\site-packages\tensorflow\__init__.py", line 44, in _load
    module = _importlib.import_module(self.__name__)
  File "C:\Users\archi\Anaconda3\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "C:\Users\archi\Anaconda3\lib\site-packages\tensorflow_core\__init__.py", line 42, in <module>
    from . 

TypeError: expected bytes, Descriptor found

In [35]:
X = df.drop(columns='target').to_numpy()
y = df['target'].to_numpy()

In [36]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.2, random_state=42)

X_train.shape, X_test.shape, y_train.shape, y_test.shape

((242, 13), (61, 13), (242,), (61,))

In [37]:
# Train a baseline model
# Parameters
inputs = X_train.shape[1]
epochs = 50
batch_size = 10

# Create Model
model = Sequential()
model.add(Dense(5, activation='relu', input_shape=(inputs,)))
model.add(Dense(5, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

# Compile Model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Fit Model
model.fit(X_train, y_train,
          validation_data=(X_test, y_test),
          epochs=epochs,
          batch_size=batch_size,
          verbose=True)

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-37-5f55d9cb7ab1>", line 8, in <module>
    model = Sequential()
NameError: name 'Sequential' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2040, in showtraceback
    stb = value._render_traceback_()
AttributeError: 'NameError' object has no attribute '_render_traceback_'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPython\core\ultratb.py", line 1101, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPytho

NameError: name 'Sequential' is not defined

In [38]:

# Hyperparameter Tuning with GridSearchCV
# Set random seed for reproducibility
seed = 42
np.random.seed(seed)

# Function to create model
def create_model():
    model = Sequential()
    model.add(Dense(10, input_dim=13, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# Create model
model = KerasClassifier(build_fn=create_model, verbose=True)

# Define grid search parameters
params = {'batch_size': [10, 20, 30],
          'epochs': [40, 50, 60]}

# Create gridsearch
grid = GridSearchCV(estimator=model, param_grid=params, cv=5, scoring='accuracy', n_jobs=1)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.



Traceback (most recent call last):
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-38-b202ae57bb41>", line 17, in <module>
    model = KerasClassifier(build_fn=create_model, verbose=True)
NameError: name 'KerasClassifier' is not defined

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2040, in showtraceback
    stb = value._render_traceback_()
AttributeError: 'NameError' object has no attribute '_render_traceback_'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\archi\Anaconda3\lib\site-packages\IPython\core\ultratb.py", line 1101, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "C:\

NameError: name 'KerasClassifier' is not defined

In [None]:
# baseline Keras model

wandb.config.epochs = 20
wandb.config.batch_size = 101

model.fit(X, y, 
          validation_split=0.33, 
          epochs=wandb.config.epochs, 
          batch_size=wandb.config.batch_size,
          verbose=True,
          callbacks=[WandbCallback()]
         )

In [None]:
%%time
# define the grid search parameters
# start with batch size
wandb.config.epochs = 20

param_grid = {'batch_size': [3, 101, 303]}

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=1)
grid_result = grid.fit(X, y,
                       epochs=wandb.config.epochs,
                       verbose=False,
                       callbacks=[WandbCallback()]
                      )

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Mean: {mean}, Stdev: {stdev} with: {param}")

In [None]:
%%time
# define the grid search parameters
# second, try epoch size
# wandb.config.batch_size = 3

param_grid = {'batch_size': [3],
              'epochs': [20, 50, 100]}

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=1)
grid_result = grid.fit(X, y,
                       verbose=False,
                       callbacks=[WandbCallback()]
                      )

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Mean: {mean}, Stdev: {stdev} with: {param}")

In [None]:
%%time
# define the grid search parameters
# third, optimizers
# wandb.config.epochs = 100
# wandb.config.batch_size = 101

param_grid = {'batch_size': [3],
              'epochs': [100],
              'optimizer': ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam']}

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=1)
grid_result = grid.fit(X, y,
                       verbose=False,
                       callbacks=[WandbCallback()]
                      )

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Mean: {mean}, Stdev: {stdev} with: {param}")

In [None]:
%%time
# define the grid search parameters
# fourth, number of nodes in first hidden layer

param_grid = {'batch_size': [3],
              'epochs': [100],
              'optimizer': ['Adamax'],
              'hidden_1_nodes' : [7, 13, 26]}

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, cv=3, n_jobs=1)
grid_result = grid.fit(X, y,
                       verbose=False,
                       callbacks=[WandbCallback()]
                      )

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Mean: {mean}, Stdev: {stdev} with: {param}")