<img align="left" src="https://lever-client-logos.s3.amazonaws.com/864372b1-534c-480e-acd5-9711f850815c-1524247202159.png" width=200>
<br></br>
<br></br>

## *Data Science Unit 4 Sprint 2*

# Sprint Challenge - Neural Network Foundations

Table of Problems

1. [Defining Neural Networks](#Q1)
2. [Chocolate Gummy Bears](#Q2)
    - Perceptron
    - Multilayer Perceptron
3. [Keras MLP](#Q3)

<a id="Q1"></a>
## 1. Define the following terms:

- ## **Neuron:** 
A neuron in a neural network is modeled loosely on a biological neuron, which has dendrites, an axon, and axon terminals. The simplest example is the perceptron:
it receives inputs (like the dendrites), computes a weighted sum of those inputs (like the axon), and passes that sum through an activation function that decides
how strongly the neuron fires (like the axon terminals).
- ## **Input Layer:**
The input layer receives the outside information (the features) that is fed into the neural network.
- ## **Hidden Layer:**
The hidden layer has no direct connection to the outside. It performs computations on the values it receives and passes the results on toward the output layer. It can be made up of several hidden nodes.
- ## **Output Layer:**
The output layer performs the final calculations and passes the result to the outside, generally using an activation chosen to suit the desired form of the final result.
- ## **Activation:**
The activation is computed by a function, generally referred to as the activation function, applied to a neuron's weighted sum. Its role is to make the output non-linear
so that the network can model non-linear problems, which most real problems are. The sigmoid function, for example, squashes its input to a value between 0 and 1.
- ## **Backpropagation:**
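Backpropagation is the process by which a network updates its weights based on the error between the desired output and the predicted output. The error is propagated backward from the output layer through each earlier layer, so every weight is adjusted in proportion to its contribution to the error.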
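
The definitions above can be tied together with a small sketch of a single sigmoid neuron and one gradient-descent update. The starting weights, bias, inputs, target, and learning rate are hypothetical values chosen only for illustration, and squared error is assumed as the loss.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(weighted_sum)

def gradient_step(weights, bias, inputs, target, learning_rate=0.5):
    """One gradient-descent update on squared error for a single sigmoid neuron."""
    output = neuron_output(weights, bias, inputs)
    # Chain rule: d(error)/d(weighted_sum) = (output - target) * sigmoid'(weighted_sum)
    delta = (output - target) * output * (1.0 - output)
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias

# Hypothetical starting values, chosen only for illustration.
weights, bias = [0.2, -0.4], 0.1
inputs, target = [1.0, 0.0], 1.0

before = neuron_output(weights, bias, inputs)
weights, bias = gradient_step(weights, bias, inputs, target)
after = neuron_output(weights, bias, inputs)
print(f"output before update: {before:.4f}, after: {after:.4f}")
```

After the update the neuron's output moves toward the target, which is the behavior backpropagation repeats for every weight in a larger network.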


## 2. Chocolate Gummy Bears <a id="Q2"></a>

Right now, you're probably thinking, "yuck, who the hell would eat that?". Great question. Your candy company wants to know too. And you thought I was kidding about the [Chocolate Gummy Bears](https://nuts.com/chocolatessweets/gummies/gummy-bears/milk-gummy-bears.html?utm_source=google&utm_medium=cpc&adpos=1o1&gclid=Cj0KCQjwrfvsBRD7ARIsAKuDvMOZrysDku3jGuWaDqf9TrV3x5JLXt1eqnVhN0KM6fMcbA1nod3h8AwaAvWwEALw_wcB). 

Let's assume that a candy company has gone out and collected information on the types of Halloween candy kids ate. Our candy company wants to predict the eating behavior of witches, warlocks, and ghosts -- aka costumed kids. They shared a sample dataset with us. Each row represents a piece of candy that a costumed child was presented with during "trick" or "treat". We know if the candy was `chocolate` (or not chocolate) or `gummy` (or not gummy). Your goal is to predict if the costumed kid `ate` the piece of candy. 

If both chocolate and gummy equal one, you've got a chocolate gummy bear on your hands!?!?!
![Chocolate Gummy Bear](https://ed910ae2d60f0d25bcb8-80550f96b5feb12604f4f720bfefb46d.ssl.cf1.rackcdn.com/3fb630c04435b7b5-2leZuM7_-zoom.jpg)

In [6]:
# Backprop on the candy dataset
from random import seed
from random import randrange
from random import random
from csv import reader
from math import exp

# Load a CSV file
def load_csv(filename):
	dataset = list()
	with open(filename, 'r') as file:
		csv_reader = reader(file)
		for row in csv_reader:
			if not row:
				continue
			dataset.append(row)
	return dataset

# Convert string column to float
def str_column_to_float(dataset, column):
	for row in dataset:
		row[column] = float(row[column].strip())

# Convert string column to integer
def str_column_to_int(dataset, column):
	class_values = [row[column] for row in dataset]
	unique = set(class_values)
	lookup = dict()
	for i, value in enumerate(unique):
		lookup[value] = i
	for row in dataset:
		row[column] = lookup[row[column]]
	return lookup

# Find the min and max values for each column
def dataset_minmax(dataset):
	minmax = list()
	stats = [[min(column), max(column)] for column in zip(*dataset)]
	return stats

# Rescale dataset columns to the range 0-1
def normalize_dataset(dataset, minmax):
	for row in dataset:
		for i in range(len(row)-1):
			row[i] = (row[i] - minmax[i][0]) / (minmax[i][1] - minmax[i][0])

# Split a dataset into k folds
def cross_validation_split(dataset, n_folds):
	dataset_split = list()
	dataset_copy = list(dataset)
	fold_size = int(len(dataset) / n_folds)
	for i in range(n_folds):
		fold = list()
		while len(fold) < fold_size:
			index = randrange(len(dataset_copy))
			fold.append(dataset_copy.pop(index))
		dataset_split.append(fold)
	return dataset_split

# Calculate accuracy percentage
def accuracy_metric(actual, predicted):
	correct = 0
	for i in range(len(actual)):
		if actual[i] == predicted[i]:
			correct += 1
	return correct / float(len(actual)) * 100.0

# Evaluate an algorithm using a cross validation split
def evaluate_algorithm(dataset, algorithm, n_folds, *args):
	folds = cross_validation_split(dataset, n_folds)
	scores = list()
	for fold in folds:
		train_set = list(folds)
		train_set.remove(fold)
		train_set = sum(train_set, [])
		test_set = list()
		for row in fold:
			row_copy = list(row)
			test_set.append(row_copy)
			row_copy[-1] = None
		predicted = algorithm(train_set, test_set, *args)
		actual = [row[-1] for row in fold]
		accuracy = accuracy_metric(actual, predicted)
		scores.append(accuracy)
	return scores

# Calculate neuron activation for an input
def activate(weights, inputs):
	activation = weights[-1]
	for i in range(len(weights)-1):
		activation += weights[i] * inputs[i]
	return activation

# Transfer neuron activation
def transfer(activation):
	return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network, row):
	inputs = row
	for layer in network:
		new_inputs = []
		for neuron in layer:
			activation = activate(neuron['weights'], inputs)
			neuron['output'] = transfer(activation)
			new_inputs.append(neuron['output'])
		inputs = new_inputs
	return inputs

# Calculate the derivative of a neuron output
def transfer_derivative(output):
	return output * (1.0 - output)

# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
	for i in reversed(range(len(network))):
		layer = network[i]
		errors = list()
		if i != len(network)-1:
			for j in range(len(layer)):
				error = 0.0
				for neuron in network[i + 1]:
					error += (neuron['weights'][j] * neuron['delta'])
				errors.append(error)
		else:
			for j in range(len(layer)):
				neuron = layer[j]
				errors.append(expected[j] - neuron['output'])
		for j in range(len(layer)):
			neuron = layer[j]
			neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

# Update network weights with error
def update_weights(network, row, l_rate):
	for i in range(len(network)):
		inputs = row[:-1]
		if i != 0:
			inputs = [neuron['output'] for neuron in network[i - 1]]
		for neuron in network[i]:
			for j in range(len(inputs)):
				neuron['weights'][j] += l_rate * neuron['delta'] * inputs[j]
			neuron['weights'][-1] += l_rate * neuron['delta']

# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
	for epoch in range(n_epoch):
		for row in train:
			outputs = forward_propagate(network, row)
			expected = [0 for i in range(n_outputs)]
			expected[row[-1]] = 1
			backward_propagate_error(network, expected)
			update_weights(network, row, l_rate)

# Initialize a network
def initialize_network(n_inputs, n_hidden, n_outputs):
	network = list()
	hidden_layer = [{'weights':[random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
	network.append(hidden_layer)
	output_layer = [{'weights':[random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
	network.append(output_layer)
	return network

# Make a prediction with a network
def predict(network, row):
	outputs = forward_propagate(network, row)
	return outputs.index(max(outputs))

# Backpropagation Algorithm With Stochastic Gradient Descent
def back_propagation(train, test, l_rate, n_epoch, n_hidden):
	n_inputs = len(train[0]) - 1
	n_outputs = len(set([row[-1] for row in train]))
	network = initialize_network(n_inputs, n_hidden, n_outputs)
	train_network(network, train, l_rate, n_epoch, n_outputs)
	predictions = list()
	for row in test:
		prediction = predict(network, row)
		predictions.append(prediction)
	return(predictions)

# Test backprop on the chocolate gummy bears dataset
seed(1)
# load and prepare data
filename = 'chocolate_gummy_bears.csv'
dataset = load_csv(filename)
for i in range(len(dataset[0])-1):
	str_column_to_float(dataset, i)
# convert class column to integers
str_column_to_int(dataset, len(dataset[0])-1)
# normalize input variables
minmax = dataset_minmax(dataset)
normalize_dataset(dataset, minmax)
# evaluate algorithm
n_folds = 5
l_rate = 0.3
n_epoch = 500
n_hidden = 5
scores = evaluate_algorithm(dataset, back_propagation, n_folds, l_rate, n_epoch, n_hidden)
print('Scores: %s' % scores)
print('Mean Accuracy: %.3f%%' % (sum(scores)/float(len(scores))))

KeyError: 0

In [190]:
import pandas as pd
import numpy as np
candy = pd.read_csv('chocolate_gummy_bears.csv')

seed = 7
np.random.seed(seed)

In [2]:
candy.head()

|   | chocolate | gummy | ate |
|---|---|---|---|
| 0 | 0 | 1 | 1 |
| 1 | 1 | 0 | 1 |
| 2 | 0 | 1 | 1 |
| 3 | 0 | 0 | 0 |
| 4 | 1 | 1 | 0 |


In [3]:
candy.shape

(10000, 3)

### Perceptron

To make predictions on the `candy` dataframe, build and train a Perceptron using NumPy. Your target column is `ate` and your features are `chocolate` and `gummy`. Do not do any feature engineering. :P

Once you've trained your model, report your accuracy. You will not be able to achieve more than ~50% with the simple perceptron. Explain why you could not achieve a higher accuracy with the *simple perceptron* architecture, because it's possible to achieve ~95% accuracy on this dataset. Provide your answer in markdown (and *optional* data analysis code) after your perceptron implementation. 

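### You are not able to achieve above ~50% because this data is not linearly separable. In the sample rows above, `ate` is 1 when exactly one of `chocolate` or `gummy` is 1 and 0 otherwise, which is an XOR-like pattern, and a simple perceptron can only draw a single linear decision boundary. It needs additional architecture (at least one hidden layer) to achieve a higher accuracy.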
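
As a rough check on the linear-separability argument, the sketch below brute-forces a grid of linear classifiers over the four possible feature combinations. It assumes a noise-free XOR target and allows a bias term, which the perceptron in this notebook does not have; the grid range is an arbitrary choice.

```python
import itertools
import numpy as np

# The four possible feature combinations and an XOR-style target
# (assumed noise-free here; the real dataset is noisier).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

grid = np.linspace(-2, 2, 21)  # candidate values for w1, w2, and the bias
best_accuracy = 0.0
for w1, w2, b in itertools.product(grid, repeat=3):
    # A linear classifier: predict 1 when the weighted sum plus bias is positive
    predictions = (X @ np.array([w1, w2]) + b > 0).astype(int)
    best_accuracy = max(best_accuracy, float(np.mean(predictions == y)))

print(best_accuracy)
```

No combination on the grid classifies all four points correctly: a single line can get at most three of the four XOR points right, however its weights are chosen.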

In [57]:
# Start your candy perceptron here

X = candy[['chocolate', 'gummy']].values
y = candy['ate'].values
y = y.reshape(-1,1)

In [89]:
import numpy as np

class Perceptron(object):
    
    def __init__(self, niter = 10, weights = None, activated_output = None):
        self.niter = niter
        self.weights = weights
        self.activated_output = activated_output
    
    def __sigmoid(self, x):
        return 1/ (1 + np.exp(-x))
    
    def __sigmoid_derivative(self, x):
        sx = self.__sigmoid(x)
        return sx * (1-sx)

    def fit(self, X, y):
        """Fit training data
        X : Training vectors, X.shape : [#samples, #features]
        y : Target values, y.shape : [#samples]
        """
        weights_size = X.shape[1]
        #weights_size_len = X.shape[1]
        correct_outputs = np.array(y).reshape(-1,1) 
        inputs = np.array(X)
        
        # Randomly Initialize Weights
        weights = 2 * np.random.random((weights_size, 1)) -1

        for i in range(self.niter):
            # Weighted sum of inputs / weights
            weighted_sum = np.dot(inputs, weights)

            # Activate!
            activated_output = self.__sigmoid(weighted_sum)

            # Calculate error
            error = correct_outputs - activated_output

            # Scale the error by the sigmoid slope at the weighted sum
            adjustments = error * self.__sigmoid_derivative(weighted_sum)

            # Update the Weights
            weights += np.dot(inputs.T, adjustments)
        
        self.weights = weights
        self.activated_output = activated_output


    def predict(self, X):
        """Return class labels by thresholding the sigmoid output at 0.5"""
        # Use the X that was passed in rather than the outputs cached during fit
        activated = self.__sigmoid(np.dot(np.array(X), self.weights))
        return (activated >= .5).astype(int).ravel().tolist()
    
    def score(self, X, y):
        """percentage of accurate results"""
        size_X = X.shape[0]
        predictions = self.predict(X)
        results_list = []
        for i in range(size_X):
            number = True if predictions[i] == y[i] else False
            results_list.append(number)
        return np.count_nonzero(results_list)/len(results_list)



In [90]:
simple_perceptron = Perceptron(10000)

simple_perceptron.fit(X,y)
simple_perceptron.score(X,y)

0.5

### Multilayer Perceptron

Using the sample candy dataset, implement a Neural Network Multilayer Perceptron class that uses backpropagation to update the network's weights. Your Multilayer Perceptron should be implemented in Numpy. 
Your network must have one hidden layer.

Once you've trained your model, report your accuracy. Explain why your MLP's performance is considerably better than your simple perceptron's on the candy dataset. 
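
One way to see why a hidden layer helps is a tiny network with hand-picked weights that reproduces XOR exactly. This is only an illustrative sketch: the weights and biases are assumptions chosen by hand, not learned with backpropagation, and the target is assumed to be noise-free.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# Hand-picked (not learned) weights, assumed here purely for illustration.
# Hidden unit 0 acts like OR, hidden unit 1 acts like AND.
hidden_weights = np.array([[20.0, 20.0],
                           [20.0, 20.0]])   # shape: (inputs, hidden units)
hidden_bias = np.array([-10.0, -30.0])
# The output fires when OR is on and AND is off, which is XOR.
output_weights = np.array([20.0, -20.0])
output_bias = -10.0

hidden = sigmoid(X @ hidden_weights + hidden_bias)
output = sigmoid(hidden @ output_weights + output_bias)
predictions = (output >= 0.5).astype(int)

print(predictions)
```

The hidden layer re-represents the inputs as "at least one is on" and "both are on", and in that new space a single linear output neuron is enough to separate the classes.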

In [130]:
import numpy as np

# I want activations that correspond to negative weights to be lower
# and activations that correspond to positive weights to be higher

class NeuralNetwork(object):
    def __init__(self, niter = 1):
        # Set up Architecture of Neural Network
        self.niter = niter
        self.inputs = 2
        self.hiddenNodes = 1
        self.outputNodes = 1

        # Initial Weights
        # (inputs x hiddenNodes) matrix for the first layer: 2x1 here
        self.weights1 = np.random.rand(self.inputs, self.hiddenNodes)
       
        # (hiddenNodes x outputNodes) matrix for hidden to output: 1x1 here
        self.weights2 = np.random.rand(self.hiddenNodes, self.outputNodes)
        
    def sigmoid(self, s):
        return 1 / (1+np.exp(-s))
    
    def sigmoidPrime(self, s):
        return s * (1 - s)
    
    def feed_forward(self, X):
        """
        Calculate the NN inference using feed forward.
        aka "predict"
        """
        
        # Weighted sum of inputs => hidden layer
        self.hidden_sum = np.dot(X, self.weights1)
        
        # Activations of weighted sum
        self.activated_hidden = self.sigmoid(self.hidden_sum)
        
        # Weight sum between hidden and output
        self.output_sum = np.dot(self.activated_hidden, self.weights2)
        
        # Final activation of output
        self.activated_output = self.sigmoid(self.output_sum)
        
        return self.activated_output
        
    def backward(self, X,y,o):
        """
        Backward propagate through the network
        """
        
        # Error in Output
        self.o_error = y - o
        
        # Apply Derivative of Sigmoid to error
        # How far off are we in relation to the Sigmoid f(x) of the output
        # ^- aka hidden => output
        self.o_delta = self.o_error * self.sigmoidPrime(o)
        
        # z2 error
        self.z2_error = self.o_delta.dot(self.weights2.T)
        
        # How much of that "far off" can be explained by the input => hidden
        self.z2_delta = self.z2_error * self.sigmoidPrime(self.activated_hidden)
        
        # descent portion
        # Adjustment to first set of weights (input => hidden)
        self.weights1 += X.T.dot(self.z2_delta)
        # Adjustment to second set of weights (hidden => output)
        self.weights2 += self.activated_hidden.T.dot(self.o_delta)
        

    def train(self, X, y):
        for i in range(self.niter):
            if (i+1 in [1,2,3,4,5]) or ((i+1) % 1000 ==0):
                print('+' + '---' * 3 + f'EPOCH {i+1}' + '---'*3 + '+')
                print('Input: \n', X)
                print('Actual Output: \n', y)
                print('Predicted Output: \n', str(self.feed_forward(X)))
                print("Loss: \n", str(np.mean(np.square(y - self.feed_forward(X)))))
                print("Accuracy: \n", str(self.score(X, y)))
            o = self.feed_forward(X)
            self.backward(X,y,o)
    
    def predict(self, X):
        """Return class labels by thresholding the network output at 0.5"""
        # Run a forward pass on the X that was passed in instead of reusing a cached output
        activated_output = self.feed_forward(X)
        return (activated_output >= .5).astype(int).ravel().tolist()
    
    def score(self, X, y):
        """percentage of accurate results"""
        size_X = X.shape[0]
        predictions = self.predict(X)
        results_list = []
        for i in range(size_X):
            number = True if predictions[i] == y[i] else False
            results_list.append(number)
        return np.count_nonzero(results_list)/len(results_list)
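
A common sanity check for a hand-written `backward` step is to compare its gradients against finite differences. The sketch below does that for a bias-free network with one hidden layer, assuming half the sum of squared errors as the loss; the three hidden nodes, the random seed, and the helper names are assumptions made for illustration. The `+=` updates in the class above correspond to subtracting these gradients with a learning rate of 1.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(w1, w2, X, y):
    """Half the sum of squared errors of a bias-free, one-hidden-layer network."""
    output = sigmoid(sigmoid(X @ w1) @ w2)
    return 0.5 * np.sum((y - output) ** 2)

def backprop_gradients(w1, w2, X, y):
    """Analytic gradients of `loss` with respect to w1 and w2."""
    hidden = sigmoid(X @ w1)
    output = sigmoid(hidden @ w2)
    output_delta = (output - y) * output * (1 - output)
    hidden_delta = (output_delta @ w2.T) * hidden * (1 - hidden)
    return X.T @ hidden_delta, hidden.T @ output_delta

def numerical_gradient(f, w, eps=1e-6):
    """Central finite differences, perturbing one weight at a time in place."""
    grad = np.zeros_like(w)
    for index in np.ndindex(*w.shape):
        original = w[index]
        w[index] = original + eps
        plus = f()
        w[index] = original - eps
        minus = f()
        w[index] = original
        grad[index] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])
w1 = rng.normal(size=(2, 3))   # input => hidden (3 hidden nodes, an assumption)
w2 = rng.normal(size=(3, 1))   # hidden => output

grad_w1, grad_w2 = backprop_gradients(w1, w2, X, y)
num_w1 = numerical_gradient(lambda: loss(w1, w2, X, y), w1)
num_w2 = numerical_gradient(lambda: loss(w1, w2, X, y), w2)
max_difference = max(np.abs(grad_w1 - num_w1).max(), np.abs(grad_w2 - num_w2).max())
print(max_difference)
```

If the analytic and numerical gradients disagree by more than a tiny amount, the backward pass most likely has a sign or shape error.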

In [131]:
multilayer_perceptron = NeuralNetwork(1000)

multilayer_perceptron.train(X,y)
#multilayer_perceptron.score(X,y)

+---------EPOCH 1---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]
 [1]
 [1]]
Predicted Output: 
 [[0.65330221]
 [0.65277843]
 [0.65330221]
 ...
 [0.65330221]
 [0.65330221]
 [0.65277843]]
Loss: 
 0.2714579953965892
Accuracy: 
 0.5
+---------EPOCH 2---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]
 [1]
 [1]]
Predicted Output: 
 [[0.49999773]
 [0.49999718]
 [0.49999773]
 ...
 [0.49999773]
 [0.49999773]
 [0.49999718]]
Loss: 
 0.201151133525034
Accuracy: 
 0.5
+---------EPOCH 3---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]
 [1]
 [1]]
Predicted Output: 
 [[0.49999774]
 [0.49999719]
 [0.49999774]
 ...
 [0.49999774]
 [0.49999774]
 [0.49999719]]
Loss: 
 0.20115113027984857
Accuracy: 
 0.5
+---------EPOCH 4---------+
Input: 
 [[0 1]
 [1 0]
 [0 1]
 ...
 [0 1]
 [0 1]
 [1 0]]
Actual Output: 
 [[1]
 [1]
 [1]
 ...
 [1]


P.S. Don't try candy gummy bears. They're disgusting. 

## 3. Keras MLP <a id="Q3"></a>

Implement a Multilayer Perceptron architecture of your choosing using the Keras library. Train your model and report its baseline accuracy. Then hyperparameter tune at least two parameters and report your model's accuracy.

- Use the Heart Disease Dataset (binary classification).
- Use an appropriate loss function for a binary classification task.
- Use an appropriate activation function on the final layer of your network.
- Train your model using verbose output for ease of grading.
- Use GridSearchCV or RandomizedSearchCV to hyperparameter tune your model (for at least two hyperparameters).
- When hyperparameter tuning, show your work by adding code cells for each new experiment.
- Report the accuracy for each combination of hyperparameters as you test them so that we can easily see which resulted in the highest accuracy.
- You must hyperparameter tune at least 3 parameters in order to get a 3 on this section.
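
To get a feel for how much work a grid search implies, the sketch below expands a parameter grid into every combination that would be tried. The grid values and the fold count are hypothetical, and this only mimics the enumeration step conceptually rather than calling scikit-learn.

```python
from itertools import product

# Hypothetical grid for illustration; any hyperparameter names would work.
param_grid = {
    "batch_size": [20, 50, 100],
    "epochs": [20, 50, 100],
}

names = list(param_grid)
combinations = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_folds = 5  # assumed number of cross-validation folds
print(f"{len(combinations)} combinations x {n_folds} folds = {len(combinations) * n_folds} model fits")
for combo in combinations:
    print(combo)
```

Because the number of fits grows multiplicatively with each added hyperparameter, tuning a few parameters at a time keeps each experiment manageable.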

In [2]:
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.read_csv('https://raw.githubusercontent.com/ryanleeallred/datasets/master/heart.csv')
df = df.sample(frac=1)
print(df.shape)
df.head()

(303, 14)


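|     | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 115 | 37 | 0 | 2 | 120 | 215 | 0 | 1 | 170 | 0 | 0.0 | 2 | 0 | 2 | 1 |
| 98 | 43 | 1 | 2 | 130 | 315 | 0 | 1 | 162 | 0 | 1.9 | 2 | 1 | 2 | 1 |
| 175 | 40 | 1 | 0 | 110 | 167 | 0 | 0 | 114 | 1 | 2.0 | 1 | 0 | 3 | 0 |
| 22 | 42 | 1 | 0 | 140 | 226 | 0 | 1 | 178 | 0 | 0.0 | 2 | 0 | 2 | 1 |
| 299 | 45 | 1 | 3 | 110 | 264 | 0 | 1 | 132 | 0 | 1.2 | 1 | 0 | 3 | 0 |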


In [3]:
scaler = StandardScaler()
X = df.drop(columns="target").values
y = df.target.values.reshape(-1,1)
X = scaler.fit_transform(X)
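
For intuition about what the scaling step does, here is a minimal NumPy sketch of standardization: subtract each column's mean and divide by its standard deviation. The small matrix is a hypothetical example with two columns on different scales, not the full dataset.

```python
import numpy as np

# Hypothetical feature matrix: two columns on very different scales.
features = np.array([[37.0, 215.0],
                     [43.0, 315.0],
                     [40.0, 167.0],
                     [42.0, 226.0]])

column_means = features.mean(axis=0)
column_stds = features.std(axis=0)   # population std (ddof=0), as StandardScaler uses
standardized = (features - column_means) / column_stds

print(standardized.mean(axis=0))
print(standardized.std(axis=0))
```

After standardization every column has mean 0 and standard deviation 1, so no single feature dominates the weighted sums purely because of its units.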

In [4]:
X[0]

array([-1.91531289, -1.46841752,  1.00257707, -0.66386682, -0.60419239,
       -0.41763453,  0.89896224,  0.89005288, -0.69663055, -0.89686172,
        0.97635214, -0.71442887, -0.51292188])

In [5]:
y[0]

array([1])

In [6]:
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras.optimizers import Adam

In [7]:
# Initialize an experiment

inputs = X.shape[1]

def create_model():
    # create model
    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(inputs,)))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

# create model
model = KerasClassifier(build_fn=create_model, verbose=0)

# define the grid search parameters
batch_size = [20, 50, 100]
epochs = [20, 50, 100]
param_grid = dict(batch_size=batch_size, epochs=epochs)

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}") 

Best: 0.8250825007756551 using {'batch_size': 100, 'epochs': 50}
Means: 0.8184818426767985, Stdev: 0.032671590713967885 with: {'batch_size': 20, 'epochs': 20}
Means: 0.7986798683802286, Stdev: 0.024697401133156067 with: {'batch_size': 20, 'epochs': 50}
Means: 0.7920792102813721, Stdev: 0.03523787151819811 with: {'batch_size': 20, 'epochs': 100}
Means: 0.7986798683802286, Stdev: 0.030605992514000077 with: {'batch_size': 50, 'epochs': 20}
Means: 0.8217821717262268, Stdev: 0.014002110305986236 with: {'batch_size': 50, 'epochs': 50}
Means: 0.7821782231330872, Stdev: 0.028004220611972473 with: {'batch_size': 50, 'epochs': 100}
Means: 0.7953795393308004, Stdev: 0.0336568844486241 with: {'batch_size': 100, 'epochs': 20}
Means: 0.8250825007756551, Stdev: 0.032671590713967885 with: {'batch_size': 100, 'epochs': 50}
Means: 0.7953795393308004, Stdev: 0.032671590713967885 with: {'batch_size': 100, 'epochs': 100}


In [9]:
# Initialize an experiment

inputs = X.shape[1]

def create_model(optimizer='adam'):
    # create model
    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(inputs,)))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model


# create model
model = KerasClassifier(build_fn=create_model, verbose=0, epochs=50, batch_size=100)

# define the grid search parameters
optimizer = ['Adam', 'Adamax', 'Nadam']
param_grid = dict(optimizer=optimizer)

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}") 



Best: 0.8118811845779419 using {'optimizer': 'Adam'}
Means: 0.8118811845779419, Stdev: 0.029147716944745876 with: {'optimizer': 'Adam'}
Means: 0.7755775650342306, Stdev: 0.024697401133156067 with: {'optimizer': 'Adamax'}
Means: 0.8085808555285136, Stdev: 0.04148449288410131 with: {'optimizer': 'Nadam'}


In [14]:
# Initialize an experiment

inputs = X.shape[1]

def create_model(learn_rate=0.01):
    # create model
    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(inputs,)))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    adam = Adam(learning_rate=learn_rate)
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    return model


# create model
model = KerasClassifier(build_fn=create_model, verbose=0, epochs=50, batch_size=100)

# define the grid search parameters
learn_rate = [0.001, 0.01, 0.1, 0.2, 0.3]
param_grid = dict(learn_rate=learn_rate)

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}") 



Best: 0.8118811845779419 using {'learn_rate': 0.001}
Means: 0.8118811845779419, Stdev: 0.04501048723594382 with: {'learn_rate': 0.001}
Means: 0.8118811845779419, Stdev: 0.0370461016997341 with: {'learn_rate': 0.01}
Means: 0.7920792102813721, Stdev: 0.029147716944745876 with: {'learn_rate': 0.1}
Means: 0.528052806854248, Stdev: 0.16613483093853232 with: {'learn_rate': 0.2}
Means: 0.4422442317008972, Stdev: 0.05197366159998095 with: {'learn_rate': 0.3}


In [15]:
# Initialize an experiment

inputs = X.shape[1]

def create_model(init_mode='uniform'):
    # create model
    model = Sequential()
    model.add(Dense(32, activation='relu', input_shape=(inputs,), kernel_initializer=init_mode))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    adam = Adam(learning_rate=0.001)
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    return model


# create model
model = KerasClassifier(build_fn=create_model, verbose=0, epochs=50, batch_size=100)

# define the grid search parameters
init_mode = ['uniform', 'lecun_uniform', 'normal', 'zero', 'glorot_normal', 'glorot_uniform', 'he_normal', 'he_uniform']
param_grid = dict(init_mode=init_mode)

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}") 

Best: 0.8184818426767985 using {'init_mode': 'normal'}
Means: 0.801980197429657, Stdev: 0.021388576788767738 with: {'init_mode': 'uniform'}
Means: 0.8052805264790853, Stdev: 0.012348700566578033 with: {'init_mode': 'lecun_uniform'}
Means: 0.8184818426767985, Stdev: 0.009334740203990824 with: {'init_mode': 'normal'}
Means: 0.5445544521013895, Stdev: 0.08401268993381643 with: {'init_mode': 'zero'}
Means: 0.7854785521825155, Stdev: 0.028390503971451875 with: {'init_mode': 'glorot_normal'}
Means: 0.8118811845779419, Stdev: 0.008084122154383987 with: {'init_mode': 'glorot_uniform'}
Means: 0.7821782231330872, Stdev: 0.04501048723594382 with: {'init_mode': 'he_normal'}
Means: 0.7788778940836588, Stdev: 0.024697401133156067 with: {'init_mode': 'he_uniform'}


In [16]:
# Initialize an experiment

inputs = X.shape[1]

def create_model(activation='relu'):
    # create model
    model = Sequential()
    model.add(Dense(32, activation=activation, input_shape=(inputs,), kernel_initializer='normal'))
    model.add(Dense(32, activation=activation))
    model.add(Dense(32, activation=activation))
    model.add(Dense(1, activation='sigmoid'))
    adam = Adam(learning_rate=0.001)
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    return model


# create model
model = KerasClassifier(build_fn=create_model, verbose=0, epochs=50, batch_size=100)

# define the grid search parameters
activation = ['softmax', 'softplus', 'softsign', 'relu', 'tanh', 'sigmoid', 'hard_sigmoid', 'linear']
param_grid = dict(activation=activation)

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}") 



Best: 0.8250825007756551 using {'activation': 'softplus'}
Means: 0.5445544521013895, Stdev: 0.08401268993381643 with: {'activation': 'softmax'}
Means: 0.8250825007756551, Stdev: 0.024697401133156067 with: {'activation': 'softplus'}
Means: 0.8151815136273702, Stdev: 0.032671590713967885 with: {'activation': 'softsign'}
Means: 0.8052805264790853, Stdev: 0.03645332582644608 with: {'activation': 'relu'}
Means: 0.8118811845779419, Stdev: 0.008084122154383987 with: {'activation': 'tanh'}
Means: 0.6930692990620931, Stdev: 0.18907175676458618 with: {'activation': 'sigmoid'}
Means: 0.504950483640035, Stdev: 0.09834761866489408 with: {'activation': 'hard_sigmoid'}
Means: 0.8118811845779419, Stdev: 0.0370461016997341 with: {'activation': 'linear'}


In [30]:
# Initialize an experiment

inputs = X.shape[1]

def create_model(activation='softplus', neurons=1):
    # create model
    model = Sequential()
    model.add(Dense(neurons, activation=activation, input_shape=(inputs,), kernel_initializer='normal'))
    model.add(Dense(neurons, activation=activation))
    model.add(Dense(neurons, activation=activation))
    model.add(Dense(1, activation='sigmoid'))
    adam = Adam(learning_rate=0.001)
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
    return model


# create model
model = KerasClassifier(build_fn=create_model, verbose=0, epochs=50, batch_size=100)

# define the grid search parameters
neurons = [1, 5, 10, 15, 20, 25, 32, 64]
param_grid = dict(neurons=neurons)

# Create Grid Search
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=5)
grid_result = grid.fit(X, y)

# Report Results
print(f"Best: {grid_result.best_score_} using {grid_result.best_params_}")
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print(f"Means: {mean}, Stdev: {stdev} with: {param}") 



Best: 0.8250825078573951 using {'neurons': 20}
Means: 0.40924091917453426, Stdev: 0.031135968791725418 with: {'neurons': 1}
Means: 0.768976905164939, Stdev: 0.0770979564404335 with: {'neurons': 5}
Means: 0.8184818556599884, Stdev: 0.03498577679344864 with: {'neurons': 10}
Means: 0.8085808728394335, Stdev: 0.04403645157024128 with: {'neurons': 15}
Means: 0.8250825078573951, Stdev: 0.023153456038043113 with: {'neurons': 20}
Means: 0.8184818430702285, Stdev: 0.03522977244455077 with: {'neurons': 25}
Means: 0.8085808519876436, Stdev: 0.04991059209266743 with: {'neurons': 32}
Means: 0.8118811914629669, Stdev: 0.050709170330398054 with: {'neurons': 64}
