<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Chocolate Gummy Bears](#Q2)
    - Perceptron
    - Multilayer Perceptron
3. [Keras MLP](#Q3)

<a id="Q1"></a>
## 1. Define the following terms:

- **Neuron:**
 - A unit of computation in a neural network, also called a node or unit.
- **Input Layer:**
 - The first layer of a neural network. It consists of one or more input nodes, which bring information from the outside world into the network.
- **Hidden Layer:**
 - A layer between the input and output layers. Its nodes perform computations and transfer information from the input nodes to the output nodes.
- **Output Layer:**
 - The final layer. Its output nodes are responsible for transferring information from the network to the outside world.
- **Activation:**
 - The activation function defines a node's output given its inputs after they have been weighted and summed (a bias may also be added). Common activation functions are sigmoid and ReLU.
- **Backpropagation:**
 - Backpropagation is used to update the weights and other parameters of a neural network. First, inputs are processed in a forward pass to produce an output. That output is compared against the expected target to compute an error. The error is then propagated backward through the network to determine how much each weight and bias contributed to it. The parameters are updated, and the steps repeat until the network converges to a solution.
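
The neuron and activation definitions above can be sketched in a few lines of NumPy. This is a minimal illustration only; the input, weight, and bias values are hypothetical and were picked just to make the arithmetic easy to follow.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    return sigmoid(np.dot(inputs, weights) + bias)

# Hypothetical values chosen only for illustration.
inputs = np.array([1.0, 0.0])
weights = np.array([0.5, -0.5])
bias = 0.0

print(neuron_output(inputs, weights, bias))  # sigmoid(0.5), roughly 0.62
```

A layer is many such neurons evaluated at once, which is why the implementations later in this notebook use matrix products instead of a loop over nodes.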


## 2. Chocolate Gummy Bears <a id="Q2"></a>

Right now, you're probably thinking, "yuck, who the hell would eat that?". Great question. Your candy company wants to know too. And you thought I was kidding about the [Chocolate Gummy Bears](https://nuts.com/chocolatessweets/gummies/gummy-bears/milk-gummy-bears.html?utm_source=google&utm_medium=cpc&adpos=1o1&gclid=Cj0KCQjwrfvsBRD7ARIsAKuDvMOZrysDku3jGuWaDqf9TrV3x5JLXt1eqnVhN0KM6fMcbA1nod3h8AwaAvWwEALw_wcB). 

Let's assume that a candy company has gone out and collected information on the types of Halloween candy kids ate. Our candy company wants to predict the eating behavior of witches, warlocks, and ghosts -- aka costumed kids. They shared a sample dataset with us. Each row represents a piece of candy that a costumed child was presented with during "trick" or "treat". We know if the candy was `chocolate` (or not chocolate) or `gummy` (or not gummy). Your goal is to predict if the costumed kid `ate` the piece of candy. 

If both chocolate and gummy equal one, you've got a chocolate gummy bear on your hands!?!?!
![Chocolate Gummy Bear](https://ed910ae2d60f0d25bcb8-80550f96b5feb12604f4f720bfefb46d.ssl.cf1.rackcdn.com/3fb630c04435b7b5-2leZuM7_-zoom.jpg)

In [1]:
import pandas as pd
candy = pd.read_csv('chocolate_gummy_bears.csv')

In [2]:
print(candy.shape)
candy.head()

(10000, 3)


| | chocolate | gummy | ate |
|---|---|---|---|
| 0 | 0 | 1 | 1 |
| 1 | 1 | 0 | 1 |
| 2 | 0 | 1 | 1 |
| 3 | 0 | 0 | 0 |
| 4 | 1 | 1 | 0 |


### Perceptron

To make predictions on the `candy` dataframe, build and train a Perceptron using NumPy. Your target column is `ate` and your features are `chocolate` and `gummy`. Do not do any feature engineering. :P

Once you've trained your model, report your accuracy. You will not be able to achieve more than ~50% with the simple perceptron. Explain why you could not achieve a higher accuracy with the *simple perceptron* architecture, even though it's possible to achieve ~95% accuracy on this dataset. Provide your answer in markdown (and *optional* data analysis code) after your perceptron implementation.

In [4]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler

X = candy[['chocolate', 'gummy']].values
y = candy['ate'].values

In [None]:
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_derivative(x):
    sx = sigmoid(x)
    return sx * (1-sx)

In [5]:
class Perceptron:
 

    def __init__(self, niter=10):
        self.niter = niter
        np.random.seed(42)
        self.weights = 2 * np.random.random((2, 1)) - 1
  

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    
    def sigmoid_derivative(self, x):
        sx = self.sigmoid(x)
        return sx * (1-sx)

    
    def train(self, X, y):
        for i in range(self.niter):
            # Weighted sum of inputs / weights
            weighted_sum = np.dot(X, self.weights)
            # Activate
            activated_output = self.sigmoid(weighted_sum)
            # Calculate error
            error = y - activated_output
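            # Scale the error by the sigmoid's slope at the pre-activation value
            # (sigmoid_derivative applies the sigmoid itself, so pass the raw sum)
            adjustments = error * self.sigmoid_derivative(weighted_sum)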
            # Update the Weights
            self.weights += np.dot(X.T, adjustments)
            
    
    def predict(self, X):
        # Weighted sum of inputs / weights
        weighted_sum = np.dot(X, self.weights)
        predictions = np.round(self.sigmoid(weighted_sum))
        return predictions

In [6]:
cbears = Perceptron()
# `y` is 1-D; reshape it to a column so it lines up with the (n, 1) model output
cbears.train(X, y.reshape(-1, 1))
cbears_score = accuracy_score(y, cbears.predict(X).ravel())
print(f'Accuracy for a single layer perceptron: {cbears_score}')
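
One way to see why a single-layer perceptron struggles here: the five sample rows shown earlier are consistent with an XOR-like rule, where the candy is eaten when exactly one of `chocolate` and `gummy` is 1. Assuming that pattern holds across the dataset (an assumption based on the sample, not something verified here), no single straight line can separate the two classes. The sketch below brute-forces a hypothetical grid of weights and biases over the four possible feature combinations and reports the best accuracy any linear boundary reaches.

```python
import itertools
import numpy as np

# The four possible (chocolate, gummy) combinations and an XOR-style label.
# Assumption: `ate` is 1 when exactly one feature is 1, as the sample rows suggest.
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])

def best_linear_accuracy(X, y, grid):
    """Try every (w1, w2, b) on the grid and return the best accuracy found."""
    best = 0.0
    for w1, w2, b in itertools.product(grid, repeat=3):
        preds = (X @ np.array([w1, w2]) + b > 0).astype(int)
        best = max(best, float(np.mean(preds == y)))
    return best

grid = np.linspace(-2, 2, 21)  # hypothetical search range, for illustration only
print(best_linear_accuracy(X_xor, y_xor, grid))  # 0.75
```

No setting on the grid gets all four combinations right, because XOR is not linearly separable. A perceptron trained by gradient updates, especially one without a bias term, can land well below that ceiling.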

### Multilayer Perceptron

Using the sample candy dataset, implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights. Your Multilayer Perceptron should be implemented in Numpy. 
Your network must have one hidden layer.

Once you've trained your model, report your accuracy. Explain why your MLP's performance is considerably better than your simple perceptron's on the candy dataset. 
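
As a rough intuition for why a hidden layer helps, the sketch below wires up a tiny 2-2-1 network by hand so that it computes XOR on the four possible feature combinations. The weights and biases are hand-picked for illustration, not learned, and the step activation and bias terms are assumptions of this sketch; the `NeuralNetwork` class in this notebook uses sigmoid activations and no biases.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where z > 0, else 0."""
    return (z > 0).astype(int)

# Hand-picked (not learned) weights, chosen only to illustrate the idea.
W_hidden = np.array([[1, 1],
                     [1, 1]])          # each column feeds one hidden unit
b_hidden = np.array([-0.5, -1.5])      # unit 0 acts like OR, unit 1 like AND
W_out = np.array([1, -1])              # output fires for "OR but not AND"
b_out = -0.5

def xor_network(X):
    """Forward pass through the hand-wired hidden and output layers."""
    hidden = step(X @ W_hidden + b_hidden)
    return step(hidden @ W_out + b_out)

X_all = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
print(xor_network(X_all))  # [0 1 1 0]
```

Each hidden unit draws its own linear boundary, and the output unit combines them, which is something a single-layer perceptron cannot do.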

In [None]:
class NeuralNetwork:
    def __init__(self, inputs=2, hiddenNodes=4, outputNodes=1):
        # Set up architecture of neural network
        self.inputs = inputs
        self.hiddenNodes = hiddenNodes
        self.outputNodes = outputNodes
        
        # Initial weights
        self.weights1 = np.random.rand(self.inputs, self.hiddenNodes)       #2x4
        self.weights2 = np.random.rand(self.hiddenNodes, self.outputNodes)  #4x1 
        
    def sigmoid(self, s):
        return 1 / (1 + np.exp(-s))
    
    def sigmoidPrime(self, s):
        return s * (1 - s)
    
    def feed_forward(self, X):
        # Weighted sum of inputs
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2)
        
        # Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output
    
    def backward(self, X, y, o):
        # Error in output
        self.o_error = y - o
        
        # Apply derivative of sigmoid to error
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        
        # z2 error
        self.z2_error = self.o_delta.dot(self.weights2.T)
        # How much error can be explained by the input => hidden
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden)
        
        # Adjustment to first set of weights (input => hidden)
        self.weights1 += X.T.dot(self.z2_delta)
        # Adjustment to second set of weights (hidden => output)
        self.weights2 += self.activated_hidden.T.dot(self.o_delta)
        
    def train(self, X, y):
        o = self.feed_forward(X)
        self.backward(X, y, o)

In [None]:
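# The training loop below uses the full `X` and `y` (no train/test split was made),
# so reshape `y` into a column vector to match the network's (n, 1) output.
y = y.reshape(-1, 1)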

In [None]:
# Train the MLP
nn = NeuralNetwork()

for i in range(10000):
    if (i+1 in [1, 2, 3, 4, 5]) or ((i+1) % 1000 == 0):
        print('+' + '---' * 3 + f'EPOCH {i+1}' + '---' * 3 + '+')
        print('Input: \n', X)
        print('Actual Output: \n', y)
        print('Predicted Output: \n', str(nn.feed_forward(X)))
        print('Loss: \n', str(np.mean(np.square(y-nn.feed_forward(X)))))
    nn.train(X, y)
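
The prompt asks for an accuracy figure, while the loop above only prints the loss. A minimal sketch of turning sigmoid outputs into an accuracy score follows. The helper name `accuracy_from_probabilities` and the toy arrays are hypothetical; in this notebook the inputs would be `y` and `nn.feed_forward(X)`.

```python
import numpy as np

def accuracy_from_probabilities(y_true, y_prob, threshold=0.5):
    """Fraction of rows where the thresholded probability matches the label."""
    y_true = np.asarray(y_true).reshape(-1)
    y_pred = (np.asarray(y_prob).reshape(-1) >= threshold).astype(int)
    return float(np.mean(y_pred == y_true))

# Toy values for illustration only.
toy_labels = np.array([[1], [0], [1], [0]])
toy_probs = np.array([[0.9], [0.2], [0.4], [0.1]])
print(accuracy_from_probabilities(toy_labels, toy_probs))  # 0.75
```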

P.S. Don't try candy gummy bears. They're disgusting. 

## 3. Keras MLP <a id="Q3"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.

- Use the Heart Disease Dataset (binary classification).
- Use an appropriate loss function for a binary classification task.
- Use an appropriate activation function on the final layer of your network.
- Train your model using verbose output for ease of grading.
- Use GridSearchCV or RandomizedSearchCV to hyperparameter tune your model (for at least two hyperparameters).
- When hyperparameter tuning, show your work by adding code cells for each new experiment.
- Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
- You must hyperparameter tune at least 3 parameters in order to get a 3 on this section.
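
For a binary target, a common pairing is a sigmoid activation on the final layer with binary cross-entropy as the loss (in Keras, `loss='binary_crossentropy'`). The NumPy sketch below shows what that loss computes; the function name and the `eps` clipping value are choices made for this illustration, not taken from Keras.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of 0/1 labels under predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log(0) never occurs.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)))

print(binary_cross_entropy([1, 0], [0.5, 0.5]))  # ln(2), about 0.693
```

Confident correct predictions drive the loss toward 0, while confident wrong ones are penalized heavily, which is why it suits probability outputs better than plain squared error.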

In [None]:
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')
df = df.sample(frac=1)
print(df.shape)
df.head()

In [None]:
from sklearn.model_selection import train_test_split, GridSearchCV
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
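# Note: this wrapper module has been removed from recent TensorFlow releases;
# the separate `scikeras` package offers a replacement wrapper.
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier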
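
A grid search fits one model per combination of the listed hyperparameter values (and once per cross-validation fold). The standard-library sketch below shows how a grid expands into combinations, so it is clear how many results need reporting. The parameter names and values are hypothetical placeholders, not tuned settings.

```python
import itertools

# Hypothetical grid; the names and values are placeholders for illustration.
param_grid = {
    "batch_size": [16, 32],
    "epochs": [20, 50],
    "units": [8, 16],
}

def expand_grid(grid):
    """Return one dict per combination of the listed values."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in itertools.product(*value_lists)]

combos = expand_grid(param_grid)
print(len(combos))  # 8
print(combos[0])    # {'batch_size': 16, 'epochs': 20, 'units': 8}
```

Three hyperparameters with two values each already mean eight candidate models, so small grids keep training time manageable.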