<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Chocolate Gummy Bears](#Q2)
    - Perceptron
    - Multilayer Perceptron
3. [Keras MLP](#Q3)

<a id="Q1"></a>
## 1. Define the following terms:

- **Neuron:** In biology, a cell in the brain connected to other cells by synapses that transmit electrical impulses. In an artificial neural network, a neuron receives input values, combines them linearly using its weights and bias, passes the result through an activation function (usually non-linear), and sends the output either to the next layer or out as a final result.
- **Input Layer:** The first layer of a neural network. It traditionally has one neuron for each feature of the dataset that is fed to the network.
- **Hidden Layer:** A layer of neurons between the input and output layers of a neural network. Hidden layers need not correspond to any feature that is obviously recognizable to the outside world.
- **Output Layer:** The last layer of a neural network. It computes the final result, with one neuron for each output variable. With a sigmoid output activation, each output is a number between 0 and 1.
- **Activation:** The activation function of a neuron computes its output from the weighted sum of its inputs. Its shape is chosen to suit either the next layer or the final output. tanh produces outputs between -1 and 1 and ReLU produces outputs from 0 upward, which makes both common in hidden layers, while sigmoid produces outputs between 0 and 1 and is typical for a binary final output.
- **Backpropagation:** An algorithm used to train _feedforward_ neural networks. The network's output is compared with the true output through a loss function, and the resulting error is propagated backward to update the network parameters (weights and biases). The gradient of the loss function determines the direction in which each parameter needs to change to reduce the loss, with step sizes proportional to the gradient, so parameters with more influence on the loss take larger steps. Repeating this over many iterations lowers the error step by step.
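
As a rough illustration of the **Neuron**, **Activation**, and **Backpropagation** definitions above, here is a minimal sketch of one sigmoid neuron and a single gradient step on a squared error. The inputs, weights, bias, target, and learning rate are hypothetical numbers picked for the example, not values from the sprint data.

```python
import numpy as np

def sigmoid(z):
    """Squash a real number (or array) into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical values, chosen only for illustration
x = np.array([1.0, 0.0])    # inputs, e.g. chocolate=1, gummy=0
w = np.array([0.5, -0.5])   # one weight per input
b = 0.1                     # bias
target = 1.0                # desired output
rate = 0.5                  # learning rate

# Forward pass: linear combination, then non-linear activation
z = np.dot(x, w) + b
a = sigmoid(z)

# One gradient-descent step on the squared error 0.5 * (target - a) ** 2
delta = (target - a) * a * (1 - a)   # error scaled by the sigmoid slope
w_new = w + rate * delta * x
b_new = b + rate * delta

a_new = sigmoid(np.dot(x, w_new) + b_new)
print(f"before: {a:.4f}  after one step: {a_new:.4f}")
```

The second weight does not move because its input is 0; only the weight on the active input and the bias are adjusted.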

## 2. Chocolate Gummy Bears <a id="Q2"></a>

Right now, you're probably thinking, "yuck, who the hell would eat that?". Great question. Your candy company wants to know too. And you thought I was kidding about the [Chocolate Gummy Bears](https://nuts.com/chocolatessweets/gummies/gummy-bears/milk-gummy-bears.html?utm_source=google&utm_medium=cpc&adpos=1o1&gclid=Cj0KCQjwrfvsBRD7ARIsAKuDvMOZrysDku3jGuWaDqf9TrV3x5JLXt1eqnVhN0KM6fMcbA1nod3h8AwaAvWwEALw_wcB). 

Let's assume that a candy company has gone out and collected information on the types of Halloween candy kids ate. Our candy company wants to predict the eating behavior of witches, warlocks, and ghosts -- aka costumed kids. They shared a sample dataset with us. Each row represents a piece of candy that a costumed child was presented with during "trick" or "treat". We know if the candy was `chocolate` (or not chocolate) or `gummy` (or not gummy). Your goal is to predict if the costumed kid `ate` the piece of candy. 

If both chocolate and gummy equal one, you've got a chocolate gummy bear on your hands!?!?!
![Chocolate Gummy Bear](https://ed910ae2d60f0d25bcb8-80550f96b5feb12604f4f720bfefb46d.ssl.cf1.rackcdn.com/3fb630c04435b7b5-2leZuM7_-zoom.jpg)

In [1]:
import pandas as pd
candy = pd.read_csv('chocolate_gummy_bears.csv')
candy.shape

(10000, 3)

In [2]:
candy.head()

Unnamed: 0,chocolate,gummy,ate
0,0,1,1
1,1,0,1
2,0,1,1
3,0,0,0
4,1,1,0


In [3]:
# Get a feel for the data before predicting
# The data closely resembles an XOR gate

print('Chocolate=0,Gummy=0')
print(candy[(candy['chocolate']==0) & (candy['gummy']==0)]['ate'].value_counts())

print('\nChocolate=0,Gummy=1')
print(candy[(candy['chocolate']==0) & (candy['gummy']==1)]['ate'].value_counts())

print('\nChocolate=1,Gummy=0')
print(candy[(candy['chocolate']==1) & (candy['gummy']==0)]['ate'].value_counts())

print('\nChocolate=1,Gummy=1')
print(candy[(candy['chocolate']==1) & (candy['gummy']==1)]['ate'].value_counts())

Chocolate=0,Gummy=0
0    2377
1     141
Name: ate, dtype: int64

Chocolate=0,Gummy=1
1    2360
0     131
Name: ate, dtype: int64

Chocolate=1,Gummy=0
1    2359
0     130
Name: ate, dtype: int64

Chocolate=1,Gummy=1
0    2362
1     140
Name: ate, dtype: int64
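
The four `value_counts()` results above can be condensed into one eat rate per feature combination. The sketch below hard-codes the counts printed above rather than reading the CSV, so it runs on its own; `ate_rate` and `looks_like_xor` are names introduced here for illustration.

```python
# (chocolate, gummy) -> (rows with ate == 0, rows with ate == 1),
# copied from the value_counts() output above
counts = {
    (0, 0): (2377, 141),
    (0, 1): (131, 2360),
    (1, 0): (130, 2359),
    (1, 1): (2362, 140),
}

ate_rate = {}
for (chocolate, gummy), (not_eaten, eaten) in counts.items():
    ate_rate[(chocolate, gummy)] = eaten / (not_eaten + eaten)
    print(f"chocolate={chocolate}, gummy={gummy}: "
          f"ate rate = {ate_rate[(chocolate, gummy)]:.3f}")

# XOR is 1 exactly when one input is 1 and the other is 0
looks_like_xor = all(
    (rate > 0.5) == bool(chocolate ^ gummy)
    for (chocolate, gummy), rate in ate_rate.items()
)
print("Majority label follows XOR:", looks_like_xor)
```

Roughly 5% of the rows in each combination go against the XOR pattern, which is why even a model that learns XOR exactly tops out near 95% accuracy.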


### Perceptron

Build and train a Perceptron using numpy to make predictions on the `candy` dataframe. Your target column is `ate` and your features are `chocolate` and `gummy`. Do not do any feature engineering. :P

Once you've trained your model, report your accuracy. You will not be able to achieve more than ~50% with the simple perceptron. Explain why you could not achieve a higher accuracy with the *simple perceptron* architecture, because it's possible to achieve ~95% accuracy on this dataset. Provide your answer in markdown (and *optional* data analysis code) after your perceptron implementation. 

In [4]:
import numpy as np
from sklearn.metrics import accuracy_score

In [5]:
np.random.seed(42)

In [6]:
# Start your candy perceptron here

X = candy[['chocolate', 'gummy']].values
y = candy['ate'].values

In [7]:
X

array([[0, 1],
       [1, 0],
       [0, 1],
       ...,
       [0, 1],
       [0, 1],
       [1, 0]])

In [8]:
y

array([1, 1, 1, ..., 1, 1, 1])

In [9]:
# Code largely borrowed from lecture U4S2M1
# Since the directions say not to do any feature engineering, first try without a bias term

class Perceptron(object):
    def __init__(self, rate=0.01, iters = 10):
        self.rate = rate 
        self.iters = iters 
        
    def train(self, X, y):
        '''Fits the training data
        X =  Training Vectors 
        X.shape : [#samples , #features ]
        y = Target values 
        y.shape = [# samples]
        '''
                
        # Initialize weights to random values in (-1, 1)
        self.weights = 2 * np.random.random((X.shape[1])) - 1
        # Print initial weights
        print('Initial weights: ',self.weights)
                            
        #loop till iteration limit is met 
        for i in range (self.iters):
            #count=0
            for xi,target in zip(X,y):
                
                #diagnostics
                #if ((i==0) & (count<10)):
                #    print(xi,target)
                #    count = count + 1
                
                # Weighted sum of inputs / weights
                weighted_sum = np.dot(xi, self.weights)

                # Activate!
                activated_output = self.sigmoid(weighted_sum)

                # Calc error
                error = target - activated_output
                
                # Get adjustments for weights, scale by self.rate
                adjustment = self.rate * (error * self.sigmoid_derivative(activated_output))
                
                # Update the Weights
                self.weights += np.dot(xi.T, adjustment)

            # Print final weights
            if i==self.iters-1:
                print('Final weights: ',self.weights)
            
        return self
    
    def sigmoid(self,x):
        return (1 / (1 + np.exp(-x)))
    
    def sigmoid_derivative(self,x):
        sx = self.sigmoid(x)
        return sx * (1 - sx)

    def predict(self, X):
        """
        (Must run train function beforehand)
        Returns class label for weights defined by class
        
        If activated output is over 0.5, then classify as 1 
        otherwise, classify as 0
        """
                
        weighted_sum = np.dot(X, self.weights)
        activated_output = self.sigmoid(weighted_sum)
        round_result = np.where(activated_output >= 0.5, 1, 0)
        
        return round_result
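
One detail in the class above is worth a quick check: `sigmoid_derivative` is called on `activated_output`, which has already been through `sigmoid`, so the sigmoid is applied twice. The sketch below compares that with the usual form `a * (1 - a)` against a numerical slope. It is an illustrative check on assumed sample points, not a change to the class.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-4, 4, 9)   # assumed sample points for the weighted sum
a = sigmoid(z)              # activated output

# Numerical slope of sigmoid at z (central difference)
h = 1e-5
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

from_activation = a * (1 - a)                    # slope written in terms of the output a
double_applied = sigmoid(a) * (1 - sigmoid(a))   # what sigmoid_derivative(activated_output) computes

print("max gap, a * (1 - a):      ", np.max(np.abs(numeric - from_activation)))
print("max gap, sigmoid applied twice:", np.max(np.abs(numeric - double_applied)))
```

The doubly applied version stays positive (roughly between 0.2 and 0.25 here), so it scales the step size without flipping the sign of the update. That may be why training still moves in a sensible direction.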

In [10]:
pn = Perceptron(0.002,10);
pn.train(X,y);

Initial weights:  [-0.25091976  0.90142861]
Final weights:  [-0.0181422   0.02735231]


In [11]:
# Without a bias, accuracy on the candy dataframe stays below 0.5
# An XOR gate cannot be written as a linear combination of its two inputs.
# Put another way, an XOR gate cannot be built from a single layer of OR, AND, NOR, or NAND gates,
#  which are the kinds of gates a single thresholded weighted sum can represent.
# Also, there's no bias, so chocolate=0, gummy=0 always gives sigmoid(0) = 0.5 and is predicted as 1

predictions = pn.predict(X)

print('')
print('Accuracy: ',accuracy_score(y, predictions))


Accuracy:  0.2771
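
To back up the claim that a single weighted sum cannot represent XOR, the sketch below brute-forces a grid of weights and biases on the four-row XOR truth table and records the best number of rows any setting classifies correctly. The grid range and step are assumptions made for the example. Thresholding the weighted sum at 0 is equivalent to thresholding the sigmoid at 0.5.

```python
import numpy as np

corners = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
xor = np.array([0, 1, 1, 0])

# Assumed search grid: w1, w2 and bias each from -2 to 2 in steps of 0.1
grid = np.linspace(-2, 2, 41)
w1, w2, bias = np.meshgrid(grid, grid, grid, indexing="ij")
params = np.stack([w1.ravel(), w2.ravel(), bias.ravel()])   # (3, 68921)

# Append a column of ones so the bias acts as a third weight
corners_wbias = np.hstack([corners, np.ones((4, 1))])        # (4, 3)
scores = corners_wbias @ params                              # (4, 68921)
predictions = (scores >= 0).astype(int)

correct = (predictions == xor[:, None]).sum(axis=0)          # rows right per setting
best = int(correct.max())
print("Best number of XOR rows classified correctly:", best, "of 4")
```

On this grid no setting gets more than 3 of the 4 rows right, which is what non-separability of XOR predicts for any choice of weights.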


In [12]:
# NOW CHECK RESULT INCLUDING BIAS

In [13]:
# Code largely borrowed from lecture U4S2M1
# Creating class, this time with a bias

class Perceptron(object):
    def __init__(self, rate=0.01, iters = 10):
        self.rate = rate 
        self.iters = iters 
        
    def train(self, X, y):
        '''Fits the training data
        X =  Training Vectors 
        X.shape : [#samples , #features ]
        y = Target values 
        y.shape = [# samples]
        '''
                
        # Initialize weights to random values in (-1, 1)
        self.weights = 2 * np.random.random((1+X.shape[1])) - 1
        # Print initial weights
        print('Initial weights: ',self.weights)
                            
        #loop till iteration limit is met 
        for i in range (self.iters):
            #count=0
            for xi,target in zip(X,y):
                
                #diagnostics
                #if ((i==0) & (count<10)):
                #    print(xi,target)
                #    count = count + 1
                
                # Weighted sum of inputs / weights with bias
                weighted_sum = np.dot(xi, self.weights[:-1]) + self.weights[-1] 

                # Activate!
                activated_output = self.sigmoid(weighted_sum)

                # Calc error
                error = target - activated_output
                
                # Get adjustments for weights, scale by self.rate
                adjustment = self.rate * (error * self.sigmoid_derivative(activated_output))
                
                # Update the Weights
                self.weights[:-1] += adjustment * xi
                self.weights[-1] += adjustment    #bias

            # Print final weights
            if i==self.iters-1:
                print('Final weights: ',self.weights)
            
        return self
    
    def sigmoid(self,x):
        return (1 / (1 + np.exp(-x)))
    
    def sigmoid_derivative(self,x):
        sx = self.sigmoid(x)
        return sx * (1 - sx)

    def predict(self, X):
        """
        (Must run train function beforehand)
        Returns class label for weights defined by class
        
        If activated output is over 0.5, then classify as 1 
        otherwise, classify as 0
        """
                
        weighted_sum = np.dot(X, self.weights[:-1]) + self.weights[-1]
        activated_output = self.sigmoid(weighted_sum)
        round_result = np.where(activated_output >= 0.5, 1, 0)
        
        return round_result

In [14]:
pn = Perceptron(0.002,50);
pn.train(X,y);

Initial weights:  [ 0.46398788  0.19731697 -0.68796272]
Final weights:  [ 0.00966221  0.00901784 -0.00276011]


In [15]:
# With a bias, accuracy on the candy dataframe reaches ~0.72

predictions = pn.predict(X)

print('')
print('Accuracy: ',accuracy_score(y, predictions))


Accuracy:  0.7236
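
The 0.7236 above can be read as a ceiling rather than an accident. A single linear threshold can produce every labeling of the four feature combinations except XOR and its complement, so the sketch below scores each allowed labeling against the counts printed earlier. The counts are copied from the `value_counts()` output; `best_linear` and `best_any` are names introduced here for illustration.

```python
from itertools import product

# (chocolate, gummy) -> (rows with ate == 0, rows with ate == 1)
counts = {
    (0, 0): (2377, 141),
    (0, 1): (131, 2360),
    (1, 0): (130, 2359),
    (1, 1): (2362, 140),
}
corners = list(counts)
total = sum(n0 + n1 for n0, n1 in counts.values())

def accuracy(labels):
    """Accuracy when each feature combination is always given its label."""
    return sum(counts[c][label] for c, label in zip(corners, labels)) / total

# The two labelings a single linear threshold cannot produce: XOR and XNOR
not_separable = {(0, 1, 1, 0), (1, 0, 0, 1)}

all_labelings = list(product((0, 1), repeat=4))
best_linear = max(accuracy(l) for l in all_labelings if l not in not_separable)
best_any = max(accuracy(l) for l in all_labelings)

print(f"Best linearly separable labeling: {best_linear:.4f}")   # 0.7236
print(f"Best labeling overall (XOR):      {best_any:.4f}")      # 0.9458
```

The best separable labeling gets three combinations right and gives up the chocolate=1, gummy=1 rows, which matches the accuracy reported above. The gap to 0.9458 is what a hidden layer can recover.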


### Multilayer Perceptron <a id="Q3"></a>

Using the sample candy dataset, implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights. Your Multilayer Perceptron should be implemented in Numpy. 
Your network must have one hidden layer.

Once you've trained your model, report your accuracy. Explain why your MLP's performance is considerably better than your simple perceptron's on the candy dataset. 

### But first, demonstrate an XOR gate with NAND, OR, and AND gates

An XOR gate can be built from two layers of gates: a NAND gate and an OR gate in the first layer, feeding an AND gate in the second. Our MLP _should_ converge to something close to this, which explains why it should be able to reach a high accuracy.

![XOR gate](https://upload.wikimedia.org/wikipedia/commons/e/ed/3_gate_XOR.svg)

All gate weights were trained with the steps from the Module1-Assignment. (I removed the training code for this demonstration.)
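
Before bringing in trained weights, the gate construction itself can be checked with plain Boolean logic. This is a small sketch with helper names (`nand_gate`, `or_gate`, `and_gate`) introduced only for this example.

```python
def nand_gate(a, b):
    return int(not (a and b))

def or_gate(a, b):
    return int(a or b)

def and_gate(a, b):
    return int(a and b)

xor_table = []
for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:     # same row order as the cells below
    hidden = (nand_gate(x1, x2), or_gate(x1, x2))   # first layer: NAND and OR
    out = and_gate(*hidden)                         # second layer: AND
    xor_table.append(out)
    print(x1, x2, "->", hidden, "->", out)
```

The final column comes out as 0, 1, 1, 0, which is the XOR truth table.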

In [16]:
def sigmoid(x):
        return (1 / (1 + np.exp(-x)))

In [17]:
# NAND gate weights

weights_nand=[[-11.84042561],[-11.84042561],[ 17.80950276]] #computed externally

data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [1,1,1,0]
       }

df = pd.DataFrame.from_dict(data).astype('int')
df['ones']=np.ones(4) # bias input
df['ones']=df['ones'].astype('int')
inputs = df[['x1','x2','ones']]
correct_outputs = df[['y']]

weighted_sum = np.dot(inputs, weights_nand)
sigmoid(weighted_sum) # Activated output matches desired NAND gate

array([[0.99999998],
       [0.99744992],
       [0.99744992],
       [0.00281114]])

In [18]:
# OR gate weights

weights_or=[[13.28361249],[13.28361323],[-6.30303619]] # computed externally

data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [0,1,1,1]
       }

df = pd.DataFrame.from_dict(data).astype('int')
df['ones']=np.ones(4) # bias input
df['ones']=df['ones'].astype('int')
inputs = df[['x1','x2','ones']]
correct_outputs = df[['y']]

weighted_sum = np.dot(inputs, weights_or)
sigmoid(weighted_sum) # Activated output matches desired OR gate

array([[0.00182739],
       [0.9990711 ],
       [0.9990711 ],
       [1.        ]])

In [19]:
# AND gate weights

weights_and = [[ 12.03108603], [ 12.03108603], [-18.33494183]] # computed externally

data = { 'x1': [0,1,0,1],
         'x2': [0,0,1,1],
         'y':  [0,0,0,1]
       }

df = pd.DataFrame.from_dict(data).astype('int')
df['ones']=np.ones(4) # bias input
df['ones']=df['ones'].astype('int')
inputs = df[['x1','x2','ones']]
correct_outputs = df[['y']]

weighted_sum = np.dot(inputs, weights_and)
sigmoid(weighted_sum) # Activated output matches desired AND gate

array([[1.08952182e-08],
       [1.82589798e-03],
       [1.82589798e-03],
       [9.96754484e-01]])

In [20]:
# Forward propagate through the gates using the weights from above

inputs = df[['x1','x2']]

# Set weights
weights1=np.hstack((weights_nand,weights_or))
weights2=np.array((weights_and))

# 1st layer propagation
b = np.ones((4,1))
inputs_wbias = np.hstack((inputs,b))
hidden_sum = np.dot(inputs_wbias, weights1)
activated_hidden = sigmoid(hidden_sum)

# 2nd layer propagation
b = np.ones((4,1))
activated_hidden_wbias = np.hstack((activated_hidden,b))
output_sum = np.dot(activated_hidden_wbias, weights2)
activated_output = sigmoid(output_sum)

# Result
activated_output  # Result matches desired XOR gate

array([[0.00186641],
       [0.99661623],
       [0.99661623],
       [0.00188859]])

In [21]:
# Scratch variant of the forward pass above (marked for deletion)
# Here the column of ones is appended to weights1 instead of to the hidden activations

b = np.ones((4,1))
inputs = df[['x1','x2']]
inputs = np.hstack((inputs,b))

# Set weights
b = np.ones((3,1))
weights1=np.hstack((weights_nand,weights_or,b))
weights2=np.array((weights_and))

# 1st layer propagation
hidden_sum = np.dot(inputs, weights1)
activated_hidden = sigmoid(hidden_sum)

# 2nd layer propagation
output_sum = np.dot(activated_hidden, weights2)
activated_output = sigmoid(output_sum)

# Result
activated_output  # Does not match XOR: the third hidden unit is sigmoid(x1 + x2 + 1), not a constant 1

#print(inputs.shape,weights1.shape)
#print(activated_hidden.shape,weights2.shape)

array([[0.20573223],
       [0.99961848],
       [0.99961848],
       [0.00449411]])

### Now, let's see the results by training our own multilayer perceptron

In [22]:
import pandas as pd
import numpy as np
from sklearn.metrics import accuracy_score
candy = pd.read_csv('chocolate_gummy_bears.csv')
X = candy[['chocolate', 'gummy']].values
y = candy[['ate']].values
np.random.seed(42)

In [23]:
# Modified class structure from U4S2M2 notes

class NeuralNetwork:
    def __init__(self):
        self.rate = .01 
        
        # Set up Architecture of Neural Network
        self.inputs = 2 + 1       # +1 for bias
        self.hiddenNodes = 2  
        self.outputNodes = 1

        # Initial Weights
        # 3x2 Matrix Array for the First Layer (2 inputs + bias => 2 hidden nodes)
        self.weights1 = 2 * np.random.rand(self.inputs, self.hiddenNodes) - 1
       
        # 3x1 Matrix Array for Hidden to Output (2 hidden nodes + bias => 1 output)
        self.weights2 = 2 * np.random.rand(self.hiddenNodes+1, self.outputNodes) - 1
        
        #print(self.weights1.shape)
        #print(self.weights2.shape)
        
        #Testing forward propagation with above weights
        #weights_nand=[[-11.84042561],[-11.84042561],[ 17.80950276]] 
        #weights_or=[[13.28361249],[13.28361323],[-6.30303619]]
        #weights_and = [[ 12.03108603], [ 12.03108603], [-18.33494183]]
        #self.weights1=np.hstack((weights_nand,weights_or))
        #self.weights2=np.array((weights_and))
        
            
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        sx = self.sigmoid(s)
        return sx * (1 - sx)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        aka "predict"
        """
        
        # Add bias column to input df
        b = np.ones((X.shape[0],1))
        X = np.hstack((X,b))

        # Weighted sum of inputs => hidden layer
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Add bias column to activated_hidden
        b = np.ones((X.shape[0],1))
        self.activated_hidden_wbias = np.hstack((self.activated_hidden,b))
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden_wbias, self.weights2)
        
        # Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output

    def backward(self, X,y,o):
        """
        Backward propagate through the network
        """        
        
        # Add bias column to input df
        b = np.ones((X.shape[0],1))
        X = np.hstack((X,b))
        
        # Error in Output
        self.o_error = y - o
        
        # Apply Derivative of Sigmoid to error
        # How far off are we in relation to the Sigmoid f(x) of the output
        # ^- aka hidden => output
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        
        # z2 error
        self.z2_error = self.o_delta.dot(self.weights2.T)
        
        # How much of that "far off" can be explained by the input => hidden
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden_wbias)
        
        print(X.shape)
        print(self.z2_delta.shape)
        print(self.weights1.shape)
        
        
        print()
        print("BEFORE weights2:")
        print(self.weights2)
        
        # Apply adjustment to first set of weights (input => hidden)
        #self.weights1 += self.rate * X.T.dot(self.z2_delta)
        # Apply adjustment to second set of weights (hidden => output)
        self.weights2 += self.rate * self.activated_hidden_wbias.T.dot(self.o_delta)
        
        print()
        print("AFTER weights2:")
        print(self.weights2)
        
        print()
        print('o_delta')
        print(self.o_delta)
        
        print()
        print('activated hidden w bias')
        print(self.activated_hidden_wbias)
        
    
    def train(self, X, y):
        o = self.feed_forward(X)
        self.backward(X,y,o)
    
    def predict(self, X, o):
        """
        (Must run train function beforehand)
        Returns class label for weights defined by class
        
        If activated output is over 0.5, then classify as 1 
        otherwise, classify as 0
        """
        
        round_result = np.where(o >= 0.5, 1, 0)
        
        return round_result

In [24]:
nn = NeuralNetwork()
o = nn.feed_forward(X)
nn.backward(X,y,o)

#accuracy_score(y, nn.predict(X,o))


(10000, 3)
(10000, 3)
(3, 2)

BEFORE weights2:
[[-0.88383278]
 [ 0.73235229]
 [ 0.20223002]]

AFTER weights2:
[[-1.29940178]
 [ 0.06280733]
 [-1.08747634]]

o_delta
[[0.11173925]
 [0.09438967]
 [0.11173925]
 ...
 [0.11173925]
 [0.11173925]
 [0.09438967]]

activated hidden w bias
[[0.4442392  0.37973009 1.        ]
 [0.28112613 0.55315282 1.        ]
 [0.4442392  0.37973009 1.        ]
 ...
 [0.4442392  0.37973009 1.        ]
 [0.4442392  0.37973009 1.        ]
 [0.28112613 0.55315282 1.        ]]


In [25]:
# Testing

In [26]:
# Modified class structure from U4S2M2 notes

class NeuralNetwork:
    def __init__(self):
        self.rate = .01 
        
        # Set up Architecture of Neural Network
        self.inputs = 2       # bias is kept separately in b1
        self.hiddenNodes = 2  
        self.outputNodes = 1

        # Initial Weights
        # 2x2 Matrix Array for the First Layer
        self.weights1 = 2 * np.random.rand(self.inputs, self.hiddenNodes) - 1
        self.b1 = np.ones((1, self.hiddenNodes))
        
        #b = np.ones((3,1))
        #self.weights1=np.hstack((self.weights1,b))
       
        # 2x1 Matrix Array for Hidden to Output
        self.weights2 = 2 * np.random.rand(self.hiddenNodes, self.outputNodes) - 1
        self.b2 = np.ones((1, self.outputNodes))
        
        #print(self.weights1.shape)
        #print(self.weights2.shape)
        
        #Testing forward propagation with above weights
        #weights_nand=[[-11.84042561],[-11.84042561],[ 17.80950276]] 
        #weights_or=[[13.28361249],[13.28361323],[-6.30303619]]
        #weights_and = [[ 12.03108603], [ 12.03108603], [-18.33494183]]
        #self.weights1=np.hstack((weights_nand,weights_or))
        #self.weights2=np.array((weights_and))
        
            
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        sx = self.sigmoid(s)
        return sx * (1 - sx)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        aka "predict"
        """
        
        # Add bias column to input df
        #b = np.ones((X.shape[0],1))
        #X = np.hstack((X,b))

        
        

        
        # Weighted sum of inputs => hidden layer
        self.hidden_sum = np.dot(X, self.weights1) + self.b1
    
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Add bias column to activated_hidden
        #b = np.ones((X.shape[0],1))
        #self.activated_hidden_wbias = np.hstack((self.activated_hidden,b))
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2) + self.b2
        
        # Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output

    def backward(self, X,y,o):
        """
        Backward propagate through the network
        """        
        
        # Add bias column to input df
        #b = np.ones((X.shape[0],1))
        #X = np.hstack((X,b))
        
        # Error in Output
        self.o_error = y - o
        
        # Apply Derivative of Sigmoid to error
        # How far off are we in relation to the Sigmoid f(x) of the output
        # ^- aka hidden => output
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        
        # z2 error
        self.z2_error = self.o_delta.dot(self.weights2.T)
        
        # How much of that "far off" can be explained by the input => hidden
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden)
        
#         print(X.shape)
#         print(self.z2_delta.shape)
#         print(self.weights1.shape)
        
        
#         print()
#         print("BEFORE weights2:")
#         print(self.weights2)
        
        # NOTE: o_error is (y - o), so subtracting these terms pushes each output
        # away from its target; this is consistent with the loss rising toward 0.5 in the run below
        # Apply adjustment to first set of weights (input => hidden)
        self.weights1 -= self.rate * X.T.dot(self.z2_delta)
        self.b1       -= self.rate * np.sum(self.z2_delta, axis=0)
        # Apply adjustment to second set of weights (hidden => output)
        self.weights2 -= self.rate * self.activated_hidden.T.dot(self.o_delta)
        self.b2       -= self.rate * np.sum(self.o_delta, axis=0)
        
#         print()
#         print("AFTER weights2:")
#         print(self.weights2)
        
#         print()
#         print('o_delta')
#         print(self.o_delta)
        
#         print()
#         print('activated hidden')
#         print(self.activated_hidden)
        
    
    def train(self, X, y):
        o = self.feed_forward(X)
        self.backward(X,y,o)
    
    def predict(self, X, o):
        """
        (Must run train function beforehand)
        Returns class label for weights defined by class
        
        If activated output is over 0.5, then classify as 1 
        otherwise, classify as 0
        """
        
        round_result = np.where(o >= 0.5, 1, 0)
        
        return round_result

In [27]:
nn = NeuralNetwork()
o = nn.feed_forward(X)
#nn.backward(X,y,o)

#accuracy_score(y, nn.predict(X,o))

In [28]:
nn = NeuralNetwork()

# Number of Epochs / Iterations
for i in range(100):
    if (i+1 in [1,2,3,4,5]) or ((i+1) % 100 ==0):
        print('+' + '---' * 3 + f'EPOCH {i+1}' + '---'*3 + '+')
        print('Input: \n', X)
        print('Actual Output: \n', y)
        print('Predicted Output: \n', str(nn.feed_forward(X)))
        print("Loss: \n", str(np.mean(np.square(y - nn.feed_forward(X)))))
    nn.train(X,y)

+---------EPOCH 1---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]
 [1]
 [1]]
Predicted Output: 
 [[0.70016959]
 [0.71058645]
 [0.70016959]
 ...
 [0.70016959]
 [0.70016959]
 [0.71058645]]
Loss: 
 0.2920771111398822
+---------EPOCH 2---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]
 [1]
 [1]]
Predicted Output: 
 [[0.99993438]
 [0.99988071]
 [0.99993438]
 ...
 [0.99993438]
 [0.99993438]
 [0.99988071]]
Loss: 
 0.49990383954662737
+---------EPOCH 3---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]
 [1]
 [1]]
Predicted Output: 
 [[1.]
 [1.]
 [1.]
 ...
 [1.]
 [1.]
 [1.]]
Loss: 
 0.49999999999999734
+---------EPOCH 4---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]
 [1]
 [1]]
Predicted Output: 
 [[1.]
 [1.]
 [1.]
 ...
 [1.]
 [1.]
 [1.]]
Loss: 
 0.5
+--------

In [29]:
predictions = nn.predict(X,o)

print('')
print('Accuracy: ',accuracy_score(y, predictions))


Accuracy:  0.2771
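
In the training run above the loss climbs from about 0.29 to 0.5 instead of falling. One likely contributor is the sign of the update: the error is defined as `y - o`, yet the weights are changed with `-=`. The sketch below isolates that effect on a single sigmoid unit learning an OR gate. The toy data, the `run` helper, the step count, and the learning rate are all assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [1]], dtype=float)   # OR gate: learnable by one unit

def run(sign, steps=200, rate=0.5):
    """Train one sigmoid unit; sign=+1 adds the delta terms, sign=-1 subtracts them."""
    w = np.zeros((2, 1))
    b = np.zeros((1, 1))
    for _ in range(steps):
        o = sigmoid(X @ w + b)
        delta = (y - o) * o * (1 - o)             # error is (y - o)
        w += sign * rate * X.T @ delta
        b += sign * rate * delta.sum(axis=0, keepdims=True)
    return float(np.mean((y - sigmoid(X @ w + b)) ** 2))

start_loss = 0.25                                  # all outputs are 0.5 at zero weights
loss_plus = run(+1)
loss_minus = run(-1)
print(f"start: {start_loss:.3f}  with +=: {loss_plus:.3f}  with -=: {loss_minus:.3f}")
```

With the error written as `y - o`, adding the delta terms lowers the loss and subtracting them raises it. The opposite pairing (`o - y` with `-=`) would work equally well.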


In [32]:
# Largely copied, then slightly modified from U4S2M2 notes

class NeuralNetwork:
    def __init__(self):
        self.rate = .01 
        
        # Set up Architecture of Neural Network
        self.inputs = 2
        self.hiddenNodes = 2
        self.outputNodes = 1

        # Initial Weights
        # 2x2 Matrix Array for the First Layer
        self.weights1 = 2 * np.random.rand(self.inputs, self.hiddenNodes) - 1
       
        # 2x1 Matrix Array for Hidden to Output
        self.weights2 = 2 * np.random.rand(self.hiddenNodes, self.outputNodes) - 1
        
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        sx = self.sigmoid(s)
        return sx * (1 - sx)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        aka "predict"
        """
        
        #print("BEFORE")
        #print(self.weights1)
        #print(self.weights2)
        
        # Weighted sum of inputs => hidden layer
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2)
        
        # Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output
        
    def backward(self, X,y,o):
        """
        Backward propagate through the network
        """        
        # Error in Output
        self.o_error = y - o
        
        #PRINT LOSS FUNCTION
        #print("SUM: ", sum(self.o_error))
        
        # Apply Derivative of Sigmoid to error
        # How far off are we in relation to the Sigmoid f(x) of the output
        # ^- aka hidden => output
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        
        # z2 error
        self.z2_error = self.o_delta.dot(self.weights2.T)
        
        # How much of that "far off" can be explained by the input => hidden
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden)
        
        # Apply adjustment to first set of weights (input => hidden)
        self.weights1 += self.rate * X.T.dot(self.z2_delta)
        # Apply adjustment to second set of weights (hidden => output)
        self.weights2 += self.rate * self.activated_hidden.T.dot(self.o_delta)
        
        #print("BACKPROPAGATED UPDATE WEIGHTS")
        #print(self.weights1)
        #print(self.weights2)
        

    def train(self, X, y):
        o = self.feed_forward(X)
        
        #print("NEW O")
        #print(o)
        
        self.backward(X,y,o)
        
    def predict(self, X, o):
        """
        (Must run train function beforehand)
        Returns class label for weights defined by class
        
        If activated output is over 0.5, then classify as 1 
        otherwise, classify as 0
        """
        
        round_result = np.where(o >= 0.5, 1, 0)
        
        return round_result

In [33]:
nn = NeuralNetwork();
nn.train(X,y);

In [34]:
o = nn.feed_forward(X);
o

array([[0.71475047],
       [0.75161591],
       [0.71475047],
       ...,
       [0.71475047],
       [0.71475047],
       [0.75161591]])

In [35]:
predictions = nn.predict(X,o)

print('')
print('Accuracy: ',accuracy_score(y, predictions))


Accuracy:  0.5


In [36]:
# TRY THIS TIME WITH A BIAS

# I KNOW MY MLP NEEDS A BIAS BECAUSE THE XOR GATE DIAGRAM BELOW
# (WHICH IS WHAT OUR MLP SHOULD CONVERGE TO)
# INCLUDES A NAND GATE WHICH I KNOW FROM MODULE1-ASSIGNMENT NEEDS
# A BIAS TO FUNCTION PROPERLY
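
To make the bias argument concrete, here is a small sketch of a single NAND neuron with and without a bias. It is an illustration only: the weights are hand-picked rather than learned, a step activation is assumed instead of the sigmoid used in the class above, and the names `step` and `perceptron` are introduced here.

```python
def step(z):
    """Heaviside step activation: 1 when z is positive, else 0."""
    return 1 if z > 0 else 0


def perceptron(x1, x2, w1, w2, bias):
    """Single neuron: weighted sum of the two inputs plus a bias, then a step."""
    return step(w1 * x1 + w2 * x2 + bias)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Hand-picked (not learned) weights: NAND fires unless both inputs are 1.
nand_with_bias = [perceptron(a, b, -2, -2, 3) for a, b in inputs]

# Same weights with the bias removed: the input (0, 0) always gives a
# weighted sum of 0, so the neuron cannot fire there whatever the weights are.
nand_without_bias = [perceptron(a, b, -2, -2, 0) for a, b in inputs]

print(nand_with_bias)     # [1, 1, 1, 0]
print(nand_without_bias)  # [0, 0, 0, 0]
```

NAND must output 1 for the input (0, 0), and with no bias the weighted sum at that point is 0 for any choice of weights, which is why the bias term is required.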

In [37]:
# Largely copied, then slightly modified from U4S2M2 notes

# DIFFICULTY GETTING DIMENSIONS TO LINE UP TO CONFIGURE BIAS, NOT ENOUGH TIME

class NeuralNetwork:
    def __init__(self, rate=0.01, iters = 10):
        self.rate = rate
        self.iters = iters
        
        # Set up Architecture of Neural Network
        self.inputs = 2 + 1         # +1 for BIAS
        self.hiddenNodes = 2 + 1    # +1 for BIAS
        self.outputNodes = 1

        # Initial Weights
        # 3x3 Matrix Array for the First Layer (last row is meant to hold the bias)
        self.weights1 = 2 * np.random.rand(self.inputs, self.hiddenNodes) - 1
        
        # 3x1 Matrix Array for Hidden to Output (last row is meant to hold the bias)
        self.weights2 = 2 * np.random.rand(self.hiddenNodes, self.outputNodes) - 1
        
        #print(self.weights1)
        #print(self.weights1[:-1])
        #print(self.weights1[-1])
        #print()
        
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        sx = self.sigmoid(s)
        return sx * (1 - sx)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        aka "predict"
        """
        
        #print("BEFORE")
        #print(self.weights1)
        #print(self.weights2)
        
        #for xi,target in zip(X,y):
        #        
        #    # Weighted sum of inputs / weights with bias
        #    weighted_sum = np.dot(xi, self.weights1[:-1]) + self.weights1[-1] 

        #    # Activate!
        #    activated_output = self.sigmoid(weighted_sum)

        #    # Calc error
        #    error = target - activated_output
                
        #    # Get adjustments for weights, scale by self.rate
        #    adjustment = self.rate * (error * self.sigmoid_derivative(activated_output))
                
        #        # Update the Weights
        #    self.weights[:-1] += adjustment * xi
        #    self.weights[-1] += adjustment    #bias
        
        
        
        
        
        
        # Weighted sum of inputs => hidden layer
        self.hidden_sum = np.dot(X, self.weights1[:-1]) + self.weights1[-1]
                
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        print(self.activated_hidden.shape)
        
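        # Weighted sum between hidden and output
        # NOTE: activated_hidden has 3 columns (2 + 1 for BIAS) while weights2[:-1]
        # has only 2 rows, which is what raises the ValueError shown below
        self.output_sum = np.dot(self.activated_hidden, self.weights2[:-1]) + self.weights2[-1]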
        
        # Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output
        
    def backward(self, X,y,o):
        """
        Backward propagate through the network
        """        
        # Error in Output
        self.o_error = y - o
        
        print("SUM: ", sum(self.o_error))
        
        # Apply Derivative of Sigmoid to error
        # How far off are we in relation to the Sigmoid f(x) of the output
        # ^- aka hidden => output
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        
        # z2 error
        self.z2_error = self.o_delta.dot(self.weights2.T)
        
        # How much of that "far off" can be explained by the input => hidden
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden)
        
        # Apply adjustment to first set of weights (input => hidden)
        self.weights1 += self.rate * X.T.dot(self.z2_delta)
        # Apply adjustment to second set of weights (hidden => output)
        self.weights2 += self.rate * self.activated_hidden.T.dot(self.o_delta)
        
        print("BACKPROPAGATED UPDATE WEIGHTS")
        print(self.weights1)
        print(self.weights2)
        

    def train(self, X, y):
        o = self.feed_forward(X)
        
        print("NEW O")
        print(o)
        
        self.backward(X,y,o)
        
    def predict(self, X, o):
        """
        (Must run train function beforehand)
        Returns class label for weights defined by class
        
        If activated output is at least 0.5, then classify as 1 
        otherwise, classify as 0
        """
        
        round_result = np.where(o >= 0.5, 1, 0)
        
        return round_result

In [38]:
nn = NeuralNetwork();
nn.train(X,y);

(10000, 3)


ValueError: shapes (10000,3) and (2,1) not aligned: 3 (dim 1) != 2 (dim 0)
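
The error comes from counting the bias as an extra hidden node: the hidden activations end up with 3 columns, while `weights2[:-1]` keeps only 2 rows. One way to make the dimensions line up is to store the biases in their own vectors instead of as an extra row of each weight matrix. The sketch below takes that route. It is not the implementation from the course notes: `BiasedMLP`, `b1`, `b2`, `X_demo` and `y_demo` are names introduced here, and the four-row XOR truth table stands in for the notebook's `X` and `y`.

```python
import numpy as np


class BiasedMLP:
    """2-2-1 sigmoid network with biases stored separately from the weights."""

    def __init__(self, rate=0.1, seed=42):
        rng = np.random.default_rng(seed)
        self.rate = rate
        self.weights1 = 2 * rng.random((2, 2)) - 1  # input => hidden, shape (2, 2)
        self.b1 = np.zeros((1, 2))                  # one bias per hidden node
        self.weights2 = 2 * rng.random((2, 1)) - 1  # hidden => output, shape (2, 1)
        self.b2 = np.zeros((1, 1))                  # one bias for the output node

    @staticmethod
    def sigmoid(s):
        return 1 / (1 + np.exp(-s))

    def feed_forward(self, X):
        # The (1, k) bias rows broadcast across all N rows of X.
        self.activated_hidden = self.sigmoid(X @ self.weights1 + self.b1)
        self.activated_output = self.sigmoid(
            self.activated_hidden @ self.weights2 + self.b2
        )
        return self.activated_output

    def train(self, X, y):
        o = self.feed_forward(X)
        h = self.activated_hidden
        # Sigmoid derivative written in terms of the activation: a * (1 - a).
        o_delta = (y - o) * o * (1 - o)
        z2_delta = (o_delta @ self.weights2.T) * h * (1 - h)
        # A bias update is the delta summed over the rows of the batch.
        self.weights1 += self.rate * (X.T @ z2_delta)
        self.b1 += self.rate * z2_delta.sum(axis=0, keepdims=True)
        self.weights2 += self.rate * (h.T @ o_delta)
        self.b2 += self.rate * o_delta.sum(axis=0, keepdims=True)


# Stand-in data (assumption): the XOR truth table, not the notebook's X and y.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_demo = np.array([[0], [1], [1], [0]], dtype=float)

mlp = BiasedMLP()
loss_before = np.mean(np.square(y_demo - mlp.feed_forward(X_demo)))
for _ in range(5000):
    mlp.train(X_demo, y_demo)
loss_after = np.mean(np.square(y_demo - mlp.feed_forward(X_demo)))
print(loss_before, loss_after)
```

The sketch only demonstrates that the shapes agree and that the loss goes down. A 2-2-1 network is not guaranteed to reach a perfect XOR fit from every random start, so the learning rate, epoch count and seed would need tuning on the real data.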

In [None]:
# Loop neural network training

# Train my 'net
#nn = NeuralNetwork()

## Number of Epochs / Iterations
#for i in range(10000):
#    if (i+1 in [1,2,3,4,5]) or ((i+1) % 1000 ==0):
#        print('+' + '---' * 3 + f'EPOCH {i+1}' + '---'*3 + '+')
#        print('Input: \n', X)
#        print('Actual Output: \n', y)
#        print('Predicted Output: \n', str(nn.feed_forward(X)))
#        print("Loss: \n", str(np.mean(np.square(y - nn.feed_forward(X)))))
#    nn.train(X,y)

P.S. Don't try candy gummy bears. They're disgusting. 

## 3. Keras MLP <a id="Q3"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.

- Use the Heart Disease Dataset (binary classification).
- Use an appropriate loss function for a binary classification task.
- Use an appropriate activation function on the final layer of your network.
- Train your model using verbose output for ease of grading.
- Use GridSearchCV or RandomizedSearchCV to hyperparameter tune your model (for at least two hyperparameters).
- When hyperparameter tuning, show your work by adding code cells for each new experiment.
- Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
- You must hyperparameter tune at least 3 parameters in order to get a 3 on this section.

In [48]:
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')
df = df.sample(frac=1)
print(df.shape)
df.head()

(303, 14)


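| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 179 | 57 | 1 | 0 | 150 | 276 | 0 | 0 | 112 | 1 | 0.6 | 1 | 1 | 1 | 0 |
| 228 | 59 | 1 | 3 | 170 | 288 | 0 | 0 | 159 | 0 | 0.2 | 1 | 0 | 3 | 0 |
| 111 | 57 | 1 | 2 | 150 | 126 | 1 | 1 | 173 | 0 | 0.2 | 2 | 1 | 3 | 1 |
| 246 | 56 | 0 | 0 | 134 | 409 | 0 | 0 | 150 | 1 | 1.9 | 1 | 2 | 3 | 0 |
| 60 | 71 | 0 | 2 | 110 | 265 | 1 | 0 | 130 | 0 | 0.0 | 2 | 1 | 2 | 1 |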


In [49]:
scaler = StandardScaler()

X = scaler.fit_transform(df.drop(columns='target'))
y = df.target.values

print(X.shape,y.shape)

(303, 13) (303,)




In [50]:
# Import Libraries
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import GridSearchCV
from keras.wrappers.scikit_learn import KerasClassifier

In [51]:
np.random.seed(42)

# Create function for model
def create_model():
    model = Sequential()
    model.add(Dense(13, input_dim=13, activation='relu'))
    model.add(Dense(13, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# create model
model = KerasClassifier(build_fn=create_model, batch_size=10, 
                        epochs=20,verbose=1)

# Create 5-fold cross validation 
kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Empty parameter grid: only the baseline configuration defined above is fitted
param_grid = dict()

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, 
                    n_jobs=1, cv=kfold)

# Fit
grid_result1 = grid.fit(X, y)

[Verbose output: Epoch 1/20 to Epoch 20/20, repeated 6 times (5 folds plus the final refit)]


In [53]:
grid_result = grid_result1

print(f"Best: {grid_result.best_score_*100:.3f}% using {grid_result.best_params_}")

Best: 82.178% using {}


In [54]:
# TUNE BATCH SIZE

In [55]:
batch_size = [10, 20, 50, 100]
param_grid = dict(batch_size=batch_size)

grid = GridSearchCV(estimator=model, param_grid=param_grid, 
                    n_jobs=1, cv=kfold)

grid_result2 = grid.fit(X, y)

[Verbose output: Epoch 1/20 to Epoch 20/20, repeated 21 times (4 batch sizes x 5 folds, plus the final refit)]


In [57]:
grid_result = grid_result2

means = [x*100 for x in grid_result.cv_results_['mean_test_score']]
params = grid_result.cv_results_['params']

for mean, param in zip(means,params):
    print(f'Accuracy: {mean:.2f}%  Params: {param}')
print()
print(f"Best: {grid_result.best_score_*100:.2f}% using {grid_result.best_params_}")
    

Accuracy: 81.19%  Params: {'batch_size': 10}
Accuracy: 83.83%  Params: {'batch_size': 20}
Accuracy: 78.88%  Params: {'batch_size': 50}
Accuracy: 76.24%  Params: {'batch_size': 100}

Best: 83.83% using {'batch_size': 20}
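
`GridSearchCV` fits one model per parameter combination per fold, then refits the best combination on the full data when `refit=True` (its default). The sketch below counts how many training runs a grid implies, which helps when deciding how many hyperparameters to tune at once. `count_fits` and `epochs_options` are introduced here for illustration; the second grid is hypothetical and was not run in the cells above.

```python
from itertools import product

# Values taken from the cells above.
batch_sizes = [10, 20, 50, 100]
n_splits = 5

# Hypothetical extra hyperparameter (assumption, not tuned in the cells above).
epochs_options = [20, 50]


def count_fits(grid, n_splits, refit=True):
    """Number of model fits an exhaustive grid search performs."""
    combos = list(product(*grid.values()))
    return len(combos) * n_splits + (1 if refit else 0)


print(count_fits({"batch_size": batch_sizes}, n_splits))                            # 21
print(count_fits({"batch_size": batch_sizes, "epochs": epochs_options}, n_splits))  # 41
```

Each added hyperparameter multiplies the number of combinations, so the cost of tuning three parameters together grows quickly with the number of values tried for each.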
