<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Perceptron on XOR Gates](#Q2)
3. [Multilayer Perceptron](#Q3)
4. [Keras MLP](#Q4)

<a id="Q1"></a>
## 1. Define the following terms:

- **Neuron:**
 - Neurons are the individual units that make up a neural network. A neuron takes in one or more inputs, computes their weighted sum, applies an activation function, and passes the result on as its output.
- **Input Layer:**
 - The input layer is the beginning of the flow of a neural network. This is the data that is going to be processed by the subsequent layers. The dimensionality of the data is what determines how many nodes make up the input layer.
- **Hidden Layer:**
 - This is the layer (or layers) between the input and output layers. Each hidden node takes the outputs of the previous layer, applies its weights and an activation function, and passes the result on to the next layer.
- **Output Layer:**
 - The output layer produces the network's final predictions. The number of nodes depends on the task: a single node is enough for regression or binary classification, while multiclass classification typically uses one node per class.
- **Activation:**
 - The activation function transforms a neuron's weighted sum into its output, which determines how strongly the neuron is activated. Its purpose is to introduce non-linearity into the model, making it capable of learning and performing more complex tasks.
- **Backpropagation:**
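 - Short for "backward propagation of errors", it is the algorithm that works out how much each weight contributed to the output error by passing that error backwards from the output layer toward the input layer. The weights are then updated from those contributions, typically after each batch or training epoch.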
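
The neuron and activation definitions above can be sketched as a few lines of NumPy. This is a minimal illustration, not part of the challenge solution; the `neuron` helper, the input values, and the weights are all assumptions chosen so the arithmetic is easy to follow.

```python
import numpy as np

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-x))

def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs followed by a sigmoid activation."""
    weighted_sum = np.dot(inputs, weights) + bias
    return sigmoid(weighted_sum)

# Hypothetical inputs and weights, chosen only for illustration
inputs = np.array([1.0, 0.0, 1.0])
weights = np.array([0.5, -0.25, -0.5])

# Weighted sum is 0.5 + 0.0 - 0.5 = 0, and sigmoid(0) = 0.5
output = neuron(inputs, weights)
print(output)
```

A layer is then just many of these neurons sharing the same inputs, which is why the later cells express it as a single matrix product.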


## 2. Perceptron on XOR Gates <a id="Q2"></a>

Create a perceptron class that can model the behavior of an AND gate. You can use the following table as your training data:

| x1 | x2 | x3 | y |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 1 | 0 | 1 | 0 |
| 0 | 1 | 1 | 0 |
| 0 | 0 | 1 | 0 |

In [1]:
# training data

import numpy as np

X = np.array([[1,1,1],[1,0,1],[0,1,1],[0,0,1]])
y = [[1], [0], [0], [0]]

In [12]:
def Perceptron(X, y):
    
    # sigmoid activation function and derivative
    def sigmoid(x):
        return 1 / (1+np.exp(-x))

    def sigmoid_derivative(x):
        sx = sigmoid(x)
        return sx * (1-sx)
    
    # initialize random weights for inputs
    weights = 2 * np.random.random((3,1)) - 1
    
    
    for iteration in range(1000):
    
        # Weighted sum of inputs/weights
        weighted_sum = np.dot(X, weights)

        # Activate
        activated_outputs = sigmoid(weighted_sum)

        # Calculate error
        error = y - activated_outputs

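        # Adjustments: scale the error by the sigmoid gradient at the weighted sum
        # (sigmoid_derivative applies sigmoid itself, so it takes the pre-activation value)
        adjustments = error * sigmoid_derivative(weighted_sum)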

        weights += np.dot(X.T, adjustments)
    
    return activated_outputs

In [13]:
Perceptron(X, y)

array([[9.64968597e-01],
       [1.99927336e-02],
       [1.99927384e-02],
       [1.51085807e-05]])
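
The outputs above are probabilities rather than gate values. A small sketch of turning them into 0/1 predictions follows; the 0.5 cutoff is an assumption (a common default), and the array is copied from the printed output rather than recomputed.

```python
import numpy as np

# Outputs copied from the cell above
activated_outputs = np.array([[9.64968597e-01],
                              [1.99927336e-02],
                              [1.99927384e-02],
                              [1.51085807e-05]])

# Assumed cutoff: anything at or above 0.5 counts as 1, everything else as 0
predictions = (activated_outputs >= 0.5).astype(int)

# AND-gate targets from the training table
expected = np.array([[1], [0], [0], [0]])

print(predictions.ravel())
print("matches AND gate:", np.array_equal(predictions, expected))
```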

## 3. Multilayer Perceptron <a id="Q3"></a>

Implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights.

- Your network must have one hidden layer.
- You do not have to update weights via gradient descent. You can use something like the derivative of the sigmoid function to update weights.
- Train your model on the Heart Disease dataset from UCI:



In [140]:
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv")
df.head()

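|   | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |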


In [141]:
y = df["target"].values
y

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,

In [142]:
X = df.drop(columns="target").values
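# Note: X_train is only created by the train_test_split cell in section 4;
# that cell (In [96]) had already been run when this output was produced.
X.shape, X_train[0:5]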

((303, 13),
 array([[ 42. ,   1. ,   1. , 120. , 295. ,   0. ,   1. , 162. ,   0. ,
           0. ,   2. ,   0. ,   2. ],
        [ 58. ,   1. ,   0. , 150. , 270. ,   0. ,   0. , 111. ,   1. ,
           0.8,   2. ,   0. ,   3. ],
        [ 46. ,   1. ,   2. , 150. , 231. ,   0. ,   1. , 147. ,   0. ,
           3.6,   1. ,   0. ,   2. ],
        [ 55. ,   0. ,   1. , 135. , 250. ,   0. ,   0. , 161. ,   0. ,
           1.4,   1. ,   0. ,   2. ],
        [ 60. ,   1. ,   0. , 117. , 230. ,   1. ,   1. , 160. ,   1. ,
           1.4,   2. ,   2. ,   3. ]]))

### Initializing Weights

In [143]:
class NeuralNetwork:
    def __init__(self):
        # Set up Architecture of Neural Network
        self.input = 13
        self.hiddenNodes = 4
        self.outputNodes = 1
        
        # Initial Weights
        # 13x4 Matrix Array for the First Layer
        self.weights1 = np.random.randn(self.input,self.hiddenNodes)
        # 4x1 Matrix Array for Hidden to Output
        self.weights2 = np.random.randn(self.hiddenNodes, self.outputNodes)

In [144]:
nn = NeuralNetwork()

print("Layer 1 weights: \n", nn.weights1)
print("Layer 2 weights: \n", nn.weights2)

Layer 1 weights: 
 [[ 2.3758805   1.18441405 -0.05387254 -0.67356603]
 [ 1.62792997 -0.67591533  0.45998083 -1.17393319]
 [ 0.30628062 -1.54748342  3.01721378 -0.30271282]
 [ 1.66016865  1.36724663  0.43285827  0.538492  ]
 [ 0.49288233 -1.42645691  0.34115162 -0.82200752]
 [ 0.13139128 -0.16650826  1.22268406 -0.11403783]
 [-0.07664202  0.71338342  0.35313681  2.19228096]
 [-0.51973544  0.58123778 -1.01607494  1.43821703]
 [ 0.02750955  0.9418513  -0.31743425  1.21160615]
 [ 0.88181566  0.07275299  0.2393588  -1.0916967 ]
 [ 0.22119457  1.07942777 -0.94186049  0.61124015]
 [-0.56280752  1.09342246 -0.66939216 -0.31169132]
 [ 2.05330163 -0.67682584  0.62477996 -0.62920175]]
Layer 2 weights: 
 [[ 0.36576632]
 [ 0.60244312]
 [-0.82601484]
 [ 0.85470337]]


### Take in inputs, get weighted sum, activate, pass activated values to next layer

In [145]:
class NeuralNetwork:
    def __init__(self):
        # Set up Architecture of Neural Network
        self.input = 13
        self.hiddenNodes = 4
        self.outputNodes = 1
        
        # Initial Weights
        # 13x4 Matrix Array for the First Layer
        self.weights1 = np.random.randn(self.input,self.hiddenNodes)
        # 4x1 Matrix Array for Hidden to Output
        self.weights2 = np.random.randn(self.hiddenNodes, self.outputNodes)
    
    def sigmoid(self, x):
        return 1 / (1+np.exp(-x))
    
    def feed_forward(self,X):
        """
        Calculate the NN inference using feed forward.
        """
        
        # Weighted sum of inputs & hidden
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Weighted sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2)
        
        # Final Activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output

In [146]:
# generate an output

nn = NeuralNetwork()

print(X[0])
output = nn.feed_forward(X[0])
print("output", output)

[ 63.    1.    3.  145.  233.    1.    0.  150.    0.    2.3   0.    0.
   1. ]
output [0.59300905]


In [147]:
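# calculate error
# Note: y has shape (303,) and output_all has shape (303, 1), so this subtraction
# broadcasts to a (303, 303) matrix instead of one error per row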

output_all = nn.feed_forward(X)
error_all = y - output_all
print(error_all)

[[ 0.40699095  0.40699095  0.40699095 ... -0.59300905 -0.59300905
  -0.59300905]
 [ 0.40699095  0.40699095  0.40699095 ... -0.59300905 -0.59300905
  -0.59300905]
 [ 0.40699095  0.40699095  0.40699095 ... -0.59300905 -0.59300905
  -0.59300905]
 ...
 [ 0.40699095  0.40699095  0.40699095 ... -0.59300905 -0.59300905
  -0.59300905]
 [ 0.40699095  0.40699095  0.40699095 ... -0.59300905 -0.59300905
  -0.59300905]
 [ 0.40699095  0.40699095  0.40699095 ... -0.59300905 -0.59300905
  -0.59300905]]
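
The repeated rows in the error matrix above come from NumPy broadcasting a 1-D label array against a column of predictions. The sketch below reproduces the effect on five made-up rows; `y_small` and `output_small` are stand-ins for the real arrays, not values from the dataset.

```python
import numpy as np

# Small stand-ins for the real arrays (5 rows instead of 303)
y_small = np.array([1, 1, 0, 0, 1])      # shape (5,), like df["target"].values
output_small = np.full((5, 1), 0.59)     # shape (5, 1), like nn.feed_forward(X)

# 1-D minus column vector: every label is paired with every prediction
broadcast_error = y_small - output_small

# Reshaping the labels into a column gives one error per row
column_error = y_small.reshape(-1, 1) - output_small

print(broadcast_error.shape)
print(column_error.shape)
```

Reshaping the labels once, right after they are read from the DataFrame, is one way to keep every later subtraction row-aligned.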


### Backpropagation

In [148]:
attributes = ['weights1', 'hidden_sum', 'activated_hidden', 'weights2', 'output']

[print(i + '\n', getattr(nn,i), '\n'+'---'*3) for i in dir(nn) if i in attributes]

activated_hidden
 [[4.24336788e-163 1.00000000e+000 6.12765094e-183 7.14563451e-119]
 [1.61724238e-200 1.00000000e+000 6.61007857e-187 4.34514067e-125]
 [1.03126976e-165 1.00000000e+000 6.44952731e-169 6.04107237e-112]
 ...
 [6.83924955e-138 1.00000000e+000 2.77337680e-168 8.15232223e-113]
 [2.10651241e-094 1.00000000e+000 1.32335728e-135 1.11563555e-088]
 [8.69685605e-180 1.00000000e+000 2.13508962e-178 6.60178574e-123]] 
---------
hidden_sum
 [[-373.87601289  229.59161923 -419.56026055 -272.04112445]
 [-460.03629613  224.10439315 -428.69481685 -286.35407849]
 [-379.89574952  161.58409948 -387.27287387 -256.09094887]
 ...
 [-315.83406482  174.6027078  -385.81422999 -258.09381269]
 [-215.69796505   99.56119524 -310.56881566 -202.51806394]
 [-412.30235515  200.32883059 -409.10163793 -281.33062626]] 
---------
weights1
 [[ 0.48825704 -0.53446807  0.20669076 -0.43003544]
 [ 0.17776949  0.29467547 -1.2642992   1.42644844]
 [ 0.05640493 -0.60400253  1.15070842  1.94960793]
 [ 0.40000741  0.

[None, None, None, None]
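
The `hidden_sum` values above are in the hundreds, which pushes the sigmoid to exactly 0 or 1 (see `activated_hidden`) and leaves almost no gradient to learn from. One common remedy is to standardize each feature before feeding it to the network. The sketch below is an illustration on synthetic data, not the heart dataset; the feature scales are assumptions picked to resemble age, blood pressure, and cholesterol.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw features on scales similar to age, blood pressure and cholesterol
X_raw = rng.normal(loc=[55, 130, 245], scale=[9, 17, 50], size=(100, 3))
weights = rng.standard_normal((3, 4))

# Standardize each column to zero mean and unit variance
X_scaled = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

# Compare the size of the weighted sums the hidden layer would see
raw_sums = X_raw @ weights
scaled_sums = X_scaled @ weights

print("largest |weighted sum| (raw):   ", np.abs(raw_sums).max())
print("largest |weighted sum| (scaled):", np.abs(scaled_sums).max())
```

With real data the mean and standard deviation would normally be computed on the training split only and then reused for the test split.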

### Putting it all together

In [149]:
class NeuralNetwork: 
    def __init__(self):
        # Set up architecture
        self.inputs = 13
        self.hiddenNodes = 4
        self.outputNodes = 1
        
        #Initial weights
        self.weights1 = np.random.randn(self.inputs, self.hiddenNodes) #13x4
        self.weights2 = np.random.rand(self.hiddenNodes, self.outputNodes) #4x1
    
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        # Derivative of the sigmoid, where s is an already-activated value sigmoid(x)
        return s * (1 - s)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        """
        
        # Weighted sum of inputs and hidden layer
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2)
        
        #Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output
    
    def backward(self, X, y, o):
        """
        Backward propagate through the network
        """
        self.o_error = y - o #error in output
        self.o_delta = self.o_error * self.sigmoidPrime(o) # apply derivative of sigmoid to error
        
        self.z2_error = self.o_delta.dot(self.weights2.T) # z2 error: how much our hidden layer weights were off
        self.z2_delta = self.z2_error*self.sigmoidPrime(self.activated_hidden)
        
        self.weights1 += X.T.dot(self.z2_delta) #Adjust first set (input => hidden) weights
        self.weights2 += self.activated_hidden.T.dot(self.o_delta) #adjust second set (hidden => output) weights
        
    def train(self, X, y):
        o = self.feed_forward(X)
        self.backward(X, y, o)

In [150]:
nn = NeuralNetwork()

for i in range(1000):
    if (i+1 in [1,2,3,4,5]) or ((i+1) % 50 ==0):
        print('+' + '---' * 3 + f'EPOCH {i+1}' + '---'*3 + '+')
        print('Input: \n', X)
        print('Actual Output: \n', y)
        print('Predicted Output: \n', str(nn.feed_forward(X)))
        print("Loss: \n", str(np.mean(np.square(y - nn.feed_forward(X)))))
    nn.train(X,y)

+---------EPOCH 1---------+
Input: 
 [[63.  1.  3. ...  0.  0.  1.]
 [37.  1.  2. ...  0.  0.  2.]
 [41.  0.  1. ...  2.  0.  2.]
 ...
 [68.  1.  0. ...  1.  2.  3.]
 [57.  1.  0. ...  1.  1.  3.]
 [57.  0.  1. ...  1.  1.  2.]]
Actual Output: 
 [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0]
Predicted Output: 
 [[0.85789223]
 [0.71313947]
 [0.71432769]
 [0.71355587]
 [0.71313947]
 [0.85789223]
 [0.84622257]
 [0.71313947]
 [0.85

ValueError: shapes (303,303) and (1,4) not aligned: 303 (dim 1) != 1 (dim 0)
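
The `ValueError` is consistent with `y` being 1-D: `y - o` then has shape (303, 303), so `o_delta` does too, and it cannot be multiplied by the (1, 4) matrix `weights2.T`. Below is a minimal sketch of the same update rule with the targets reshaped into a column. It runs on the AND-gate data from section 2 as a stand-in, so it says nothing about accuracy on the heart data; the seed and iteration count are arbitrary assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(42)

# Toy data (the AND gate from section 2), standing in for the heart data
X_toy = np.array([[1, 1, 1], [1, 0, 1], [0, 1, 1], [0, 0, 1]], dtype=float)
y_toy = np.array([1, 0, 0, 0])      # 1-D, like df["target"].values
y_col = y_toy.reshape(-1, 1)        # column vector, shape (4, 1)

weights1 = rng.standard_normal((3, 4))   # input -> hidden
weights2 = rng.standard_normal((4, 1))   # hidden -> output

losses = []
for _ in range(2000):
    # Forward pass
    hidden = sigmoid(X_toy @ weights1)   # shape (4, 4)
    output = sigmoid(hidden @ weights2)  # shape (4, 1)
    losses.append(np.mean((y_col - output) ** 2))

    # Backward pass, following the update rule in the class above
    o_delta = (y_col - output) * output * (1 - output)          # shape (4, 1)
    z2_delta = (o_delta @ weights2.T) * hidden * (1 - hidden)   # shape (4, 4)
    weights1 += X_toy.T @ z2_delta
    weights2 += hidden.T @ o_delta

print("first loss:", losses[0])
print("last loss: ", losses[-1])
```

For the notebook itself, the equivalent change would be to pass a column-shaped target (for example `y.reshape(-1, 1)`) into `train` and into the loss calculation.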

## 4. Keras MLP <a id="Q4"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.

- Use the Heart Disease Dataset (binary classification).
- Use an appropriate loss function for a binary classification task.
- Use an appropriate activation function on the final layer of your network.
- Train your model using verbose output for ease of grading.
- Use GridSearchCV to hyperparameter tune your model (for at least two hyperparameters).
- When hyperparameter tuning, show your work by adding code cells for each new experiment.
- Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
- You must hyperparameter tune at least 5 parameters in order to get a 3 on this section.
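
For the loss-function requirement, binary cross-entropy is the usual choice when the final layer outputs a probability. The NumPy sketch below shows what that loss computes; it is an illustration of the formula, not Keras's implementation, and the `eps` clipping value and the example probabilities are assumptions.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from zero."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([1, 0, 1, 0])

# Hypothetical predicted probabilities for three kinds of model
confident_right = np.array([0.9, 0.1, 0.9, 0.1])
unsure = np.array([0.5, 0.5, 0.5, 0.5])
confident_wrong = np.array([0.1, 0.9, 0.1, 0.9])

loss_right = binary_cross_entropy(y_true, confident_right)
loss_unsure = binary_cross_entropy(y_true, unsure)
loss_wrong = binary_cross_entropy(y_true, confident_wrong)

print("confident and right:", loss_right)
print("unsure:             ", loss_unsure)
print("confident and wrong:", loss_wrong)
```

The loss grows quickly for confident wrong answers, and the formula assumes the predictions lie between 0 and 1, which is what a sigmoid output layer guarantees.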

In [96]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((242, 13), (61, 13), (242,), (61,))

In [100]:
import keras
from keras.models import Sequential
from keras.layers import Dense


#baseline model

model = Sequential()
model.add(Dense(13, input_dim=13, activation='sigmoid'))
model.add(Dense(16, activation='relu'))
model.add(Dense(1, activation="sigmoid"))

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

model.fit(X_train, y_train, validation_split=.2, epochs=20, batch_size=10)

Train on 193 samples, validate on 49 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x1a36a649e8>
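
A useful reference point for any reported accuracy is the majority-class baseline: the score from always predicting the most common label. The sketch below computes it for the full dataset, using the 303 rows from `X.shape` and the 165 ones visible in the printed `target` array; rebuilding the labels from those two counts is an assumption made to keep the example self-contained, and the class balance in `X_train` may differ slightly.

```python
import numpy as np

n_total = 303      # number of rows, from X.shape
n_positive = 165   # number of 1s in the printed target array

# Rebuild a label vector with the same class balance
y_full = np.concatenate([
    np.ones(n_positive, dtype=int),
    np.zeros(n_total - n_positive, dtype=int),
])

# Always predict the most frequent class and measure how often that is right
majority_class = np.bincount(y_full).argmax()
baseline_accuracy = np.mean(y_full == majority_class)

print("majority class:", majority_class)
print("baseline accuracy:", round(baseline_accuracy, 4))
```

A model whose cross-validated accuracy sits close to this figure is doing little better than guessing the majority class.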

In [101]:
#experimenting with batch size

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV

def create_model():
    model = Sequential()
    model.add(Dense(13, input_dim=13, activation='sigmoid'))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    
    return model

model = KerasClassifier(build_fn=create_model, verbose=1)

param_grid = {'batch_size': [10, 20, 40, 60, 80, 100],
              'epochs': [20]}

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")



Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Best: 0.6487603305785123 using {'batch_size': 10, 'epochs': 20}
Means: 0.6487603305785123, Stdev: 0.021696459526140754 with: {'batch_size': 10, 'epochs': 20}
Means: 0.5826446271139728, Stdev: 0.02466819052260848 with: {'batch_size': 20, 'epochs': 20}
Means: 0.6198347048325972, Stdev: 0.05081760568855904 with: {'batch_size': 40, 'epochs': 20}
Means: 0.5991735614774641, Stdev: 0.048557977897279295 with: {'batch_size': 60, 'epochs': 20}
Means: 0.5454545257505307, Stdev: 0.0820765301367928 with: {'batch_size': 80, 'epochs': 20}
Means: 0.541322306414281, Stdev: 0.012532856467212732 with: {'batch_size': 100, 'epochs': 20}
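
The per-combination report above is easier to compare once it is sorted. The sketch below ranks the batch sizes by mean cross-validated accuracy; the numbers are copied from the printed output, and the `results` dictionary is just a hand-built stand-in for `grid_result.cv_results_`.

```python
# Mean and standard deviation of the cross-validated accuracy,
# copied from the output above and keyed by batch_size
results = {
    10: (0.6487603305785123, 0.021696459526140754),
    20: (0.5826446271139728, 0.02466819052260848),
    40: (0.6198347048325972, 0.05081760568855904),
    60: (0.5991735614774641, 0.048557977897279295),
    80: (0.5454545257505307, 0.0820765301367928),
    100: (0.541322306414281, 0.012532856467212732),
}

# Sort by mean accuracy, best first
ranked = sorted(results.items(), key=lambda item: item[1][0], reverse=True)

print("batch_size   mean    std")
for batch_size, (mean, std) in ranked:
    print(f"{batch_size:>10}  {mean:.3f}  {std:.3f}")

best_batch_size = ranked[0][0]
```

The gap between the top two entries (batch sizes 10 and 40) is smaller than the standard deviation reported for batch size 40, so the ranking between them may not be stable across reruns.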


In [103]:
# experimenting with activation function in input layer

def create_model(activation='relu'):
    model = Sequential()
    model.add(Dense(13, input_dim=13, activation=activation))
    model.add(Dense(16, activation='relu'))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    
    return model

model = KerasClassifier(build_fn=create_model, batch_size=10, verbose=1)

param_grid = {'activation': ['softmax', 'softplus', 'softsign', 'relu', 'tanh', 'sigmoid', 'hard_sigmoid', 'linear'],
              'epochs': [20]}

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")



Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Best: 0.6280991772482217 using {'activation': 'tanh', 'epochs': 20}
Means: 0.586776864553286, Stdev: 0.03873119424644328 with: {'activation': 'softmax', 'epochs': 20}
Means: 0.5247933958187576, Stdev: 0.1116836501370129 with: {'activation': 'softplus', 'epochs': 20}
Means: 0.5909090955887945, Stdev: 0.028402075067477186 with: {'activation': 'softsign', 'epochs': 20}
Means: 0.4669421580581626, Stdev: 0.04352147348453235 with: {'activation': 'relu', 'epochs': 20}
Means: 0.6280991772482217, Stdev: 0.018701163279581885 with: {'activation': 'tanh', 'epochs': 20}
Means: 0.5909090955887945, Stdev: 0.05514656856499684 with: {'activation': 'sigmoid', 'epochs': 20}
Means: 0.5826446324094268, Stdev: 0.04760121818485593 with: {'activation': 'hard_sigmoid', 'epochs': 20}


In [104]:
# experimenting with activation function in hidden layer

def create_model(activation='relu'):
    model = Sequential()
    model.add(Dense(13, input_dim=13, activation='tanh'))
    model.add(Dense(16, activation=activation))
    model.add(Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    
    return model

model = KerasClassifier(build_fn=create_model, batch_size=10, verbose=1)

param_grid = {'activation': ['softmax', 'softplus', 'softsign', 'relu', 'tanh', 'sigmoid', 'hard_sigmoid', 'linear'],
              'epochs': [20]}

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")



Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Best: 0.6322314118550829 using {'activation': 'hard_sigmoid', 'epochs': 20}
Means: 0.6074380202234284, Stdev: 0.05988013397732138 with: {'activation': 'softmax', 'epochs': 20}
Means: 0.5743801676291079, Stdev: 0.025256282653527046 with: {'activation': 'softplus', 'epochs': 20}
Means: 0.5785124012507683, Stdev: 0.028284565025742243 with: {'activation': 'softsign', 'epochs': 20}
Means: 0.6074380215165044, Stdev: 0.06005028443039823 with: {'activation': 'relu', 'epochs': 20}
Means: 0.5702479370861999, Stdev: 0.08205122800602034 with: {'activation': 'tanh', 'epochs': 20}
Means: 0.5743801682448584, Stdev: 0.013675997974484275 with: {'activation': 'sigmoid', 'epochs': 20}
Means: 0.6322314118550829, Stdev: 0.08356065733537404 with: {'activation': 'hard_sigmoid', 'ep

In [105]:
# experimenting with activation function in output layer

def create_model(activation='relu'):
    model = Sequential()
    model.add(Dense(13, input_dim=13, activation='tanh'))
    model.add(Dense(16, activation='hard_sigmoid'))
    model.add(Dense(1, activation=activation))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    
    return model

model = KerasClassifier(build_fn=create_model, batch_size=10, verbose=1)

param_grid = {'activation': ['softmax', 'softplus', 'softsign', 'relu', 'tanh', 'sigmoid', 'hard_sigmoid', 'linear'],
              'epochs': [20]}

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")



Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Best: 0.6487603318100134 using {'activation': 'softplus', 'epochs': 20}
Means: 0.5495867828938586, Stdev: 0.022980246135376938 with: {'activation': 'softmax', 'epochs': 20}
Means: 0.6487603318100134, Stdev: 0.017444179108051558 with: {'activation': 'softplus', 'epochs': 20}
Means: 0.5165289365801929, Stdev: 0.07599110104020614 with: {'activation': 'softsign', 'epochs': 20}
Means: 0.5785124011891932, Stdev: 0.08635681509041344 with: {'activation': 'relu', 'epochs': 20}
Means: 0.33057851873892397, Stdev: 0.16596135796371653 with: {'activation': 'tanh', 'epochs': 20}
Means: 0.5578512454574759, Stdev: 0.034214475468150686 with: {'activation': 'sigmoid', 'epochs': 20}
Means: 0.5578512427481738, Stdev: 0.03421447173952422 with: {'activation': 'hard_sigmoid', 'epoch

In [106]:
# experimenting with number of epochs

def create_model():
    model = Sequential()
    model.add(Dense(13, input_dim=13, activation='tanh'))
    model.add(Dense(16, activation='hard_sigmoid'))
    model.add(Dense(1, activation="softplus"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    
    return model

model = KerasClassifier(build_fn=create_model, batch_size=10, verbose=1)

param_grid = {'epochs': [20, 50, 80, 140, 300]}

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")



Epoch 1/80
Epoch 2/80
Epoch 3/80
Epoch 4/80
Epoch 5/80
Epoch 6/80
Epoch 7/80
Epoch 8/80
Epoch 9/80
Epoch 10/80
Epoch 11/80
Epoch 12/80
Epoch 13/80
Epoch 14/80
Epoch 15/80
Epoch 16/80
Epoch 17/80
Epoch 18/80
Epoch 19/80
Epoch 20/80
Epoch 21/80
Epoch 22/80
Epoch 23/80
Epoch 24/80
Epoch 25/80
Epoch 26/80
Epoch 27/80
Epoch 28/80
Epoch 29/80
Epoch 30/80
Epoch 31/80
Epoch 32/80
Epoch 33/80
Epoch 34/80
Epoch 35/80
Epoch 36/80
Epoch 37/80
Epoch 38/80
Epoch 39/80
Epoch 40/80
Epoch 41/80
Epoch 42/80
Epoch 43/80
Epoch 44/80
Epoch 45/80
Epoch 46/80
Epoch 47/80
Epoch 48/80
Epoch 49/80
Epoch 50/80
Epoch 51/80
Epoch 52/80
Epoch 53/80
Epoch 54/80
Epoch 55/80
Epoch 56/80
Epoch 57/80
Epoch 58/80
Epoch 59/80
Epoch 60/80
Epoch 61/80
Epoch 62/80
Epoch 63/80
Epoch 64/80
Epoch 65/80
Epoch 66/80
Epoch 67/80
Epoch 68/80
Epoch 69/80
Epoch 70/80
Epoch 71/80
Epoch 72/80
Epoch 73/80
Epoch 74/80
Epoch 75/80
Epoch 76/80
Epoch 77/80
Epoch 78/80
Epoch 79/80
Epoch 80/80
Best: 0.6404958729408989 using {'epochs': 80}
Mea

In [108]:
# experimenting with optimizer

def create_model(optimizer='adam'):
    model = Sequential()
    model.add(Dense(13, input_dim=13, activation='tanh'))
    model.add(Dense(16, activation='hard_sigmoid'))
    model.add(Dense(1, activation="softplus"))
    model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])
    
    return model

model = KerasClassifier(build_fn=create_model, batch_size=10, epochs=80, verbose=1)

param_grid = {'optimizer': ['SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adam', 'Adamax', 'Nadam']}

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}")



Epoch 1/80
Epoch 2/80
Epoch 3/80
Epoch 4/80
Epoch 5/80
Epoch 6/80
Epoch 7/80
Epoch 8/80
Epoch 9/80
Epoch 10/80
Epoch 11/80
Epoch 12/80
Epoch 13/80
Epoch 14/80
Epoch 15/80
Epoch 16/80
Epoch 17/80
Epoch 18/80
Epoch 19/80
Epoch 20/80
Epoch 21/80
Epoch 22/80
Epoch 23/80
Epoch 24/80
Epoch 25/80
Epoch 26/80
Epoch 27/80
Epoch 28/80
Epoch 29/80
Epoch 30/80
Epoch 31/80
Epoch 32/80
Epoch 33/80
Epoch 34/80
Epoch 35/80
Epoch 36/80
Epoch 37/80
Epoch 38/80
Epoch 39/80
Epoch 40/80
Epoch 41/80
Epoch 42/80
Epoch 43/80
Epoch 44/80
Epoch 45/80
Epoch 46/80
Epoch 47/80
Epoch 48/80
Epoch 49/80
Epoch 50/80
Epoch 51/80
Epoch 52/80
Epoch 53/80
Epoch 54/80
Epoch 55/80
Epoch 56/80
Epoch 57/80
Epoch 58/80
Epoch 59/80
Epoch 60/80
Epoch 61/80
Epoch 62/80
Epoch 63/80
Epoch 64/80
Epoch 65/80
Epoch 66/80
Epoch 67/80
Epoch 68/80
Epoch 69/80
Epoch 70/80
Epoch 71/80
Epoch 72/80
Epoch 73/80
Epoch 74/80
Epoch 75/80
Epoch 76/80
Epoch 77/80
Epoch 78/80
Epoch 79/80
Epoch 80/80
Best: 0.6322314071753794 using {'optimizer': 'Nad