**Dataset Context:**

This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

**Content:**

The dataset consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.

Features:
- Pregnancies: Number of times pregnant
- Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
- BloodPressure: Diastolic blood pressure (mm Hg)
- SkinThickness: Triceps skin fold thickness (mm)
- Insulin: 2-Hour serum insulin (mu U/ml)
- BMI: Body mass index (weight in kg/(height in m)^2)
- DiabetesPedigreeFunction: Diabetes pedigree function
- Age: Age (years)

Label:

- Outcome: Class variable (0 or 1); 268 of the 768 instances are 1, the others are 0
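
Because the two classes are not balanced, it helps to know what accuracy a trivial model would reach. The sketch below uses only the class counts stated above (the variable names are illustrative and not part of the notebook) and computes the accuracy of always predicting the majority class. Accuracies reported for a trained model can be read against this figure.

```python
# Class counts taken from the Outcome description above.
n_total = 768
n_positive = 268
n_negative = n_total - n_positive

# Accuracy of a trivial classifier that always predicts the majority class (0).
baseline_accuracy = max(n_positive, n_negative) / n_total

print(f"negative cases: {n_negative}")
print(f"majority-class baseline accuracy: {baseline_accuracy:.6f}")
```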

**References:**

- https://www.kaggle.com/uciml/pima-indians-diabetes-database

In [1]:
import os
os.chdir(os.path.join(os.getcwd(), os.pardir))

In [77]:
import numpy as np
import pandas as pd

# To plot pretty figures
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)

# tensorflow and related layers
import tensorflow as tf
from tensorflow import keras

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout

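# Note: this wrapper module is not available in newer TensorFlow releases;
# the SciKeras package offers a similar KerasClassifier.
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier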
from tensorflow.keras.constraints import MaxNorm

from sklearn.model_selection import GridSearchCV

# progress status
from tqdm.notebook import tqdm

# to make this notebook's output stable across runs
np.random.seed(42)
tf.random.set_seed(42)

In [3]:
url = "https://raw.githubusercontent.com/tzamalisp/dev_tasks/master/datasets/pima-indians-diabetes.csv"
data = pd.read_csv(url)

In [4]:
data.head()

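|   | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |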


## Function to create the model, required for KerasClassifier

In [5]:
def create_model():
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

## Split into features (X) and labels (y) variables

In [6]:
data.columns

Index(['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
       'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome'],
      dtype='object')

In [7]:
X = data[["Pregnancies", "Glucose", "BloodPressure", "SkinThickness", "Insulin", 
          "BMI", "DiabetesPedigreeFunction", "Age"]]
y = data["Outcome"]

## Tune Batch Size and Number of Epochs

### Create the model with the KerasClassifier

In [8]:
clf_params = {
    "validation_split": 0.2,
    "shuffle": True,
    "use_multiprocessing": True,
    "verbose": 0
}

In [9]:
model = KerasClassifier(build_fn=create_model)

In [10]:
model.set_params(**clf_params)

<tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier at 0x7f8ec31f7550>

In [11]:
model.get_params()

{'validation_split': 0.2,
 'shuffle': True,
 'use_multiprocessing': True,
 'verbose': 0,
 'build_fn': <function __main__.create_model()>}

### Define the Grid Search parameters

In [12]:
param_grid = {
    "batch_size": [10, 20, 40, 60, 80, 100],
    "epochs": [10, 50, 100]
}

### Initiate and train the Grid Search classifier

In [13]:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
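
To get a feel for the cost of this search, the following sketch enumerates the grid by hand with `itertools.product` and counts the model fits. It is a stand-alone illustration that does not call scikit-learn; the names `candidates` and `cv_folds` are hypothetical helpers, not notebook variables.

```python
from itertools import product

# Same grid as in the notebook cell above.
param_grid = {
    "batch_size": [10, 20, 40, 60, 80, 100],
    "epochs": [10, 50, 100],
}
cv_folds = 3  # matches cv=3 passed to GridSearchCV

# Every combination of the listed values is one candidate configuration.
names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

n_fits = len(candidates) * cv_folds
print(f"{len(candidates)} candidates x {cv_folds} folds = {n_fits} model fits")
print("first candidate:", candidates[0])
```

With the default `refit=True`, `GridSearchCV` additionally trains the best configuration once more on the full data, so the grid grows quickly as more values or parameters are added.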

In [14]:
classifier = grid.fit(X, y)

### Summarize the results

In [15]:
print("Best accuracy: %.2f" % classifier.best_score_)
print("Best parameters for training: %s" % classifier.best_params_)

Best accuracy: 0.70
Best parameters for training: {'batch_size': 10, 'epochs': 100}


In [16]:
means = classifier.cv_results_['mean_test_score']
stds = classifier.cv_results_['std_test_score']
params = classifier.cv_results_['params']

In [17]:
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

0.557292 (0.028940) with: {'batch_size': 10, 'epochs': 10}
0.644531 (0.019918) with: {'batch_size': 10, 'epochs': 50}
0.695312 (0.014616) with: {'batch_size': 10, 'epochs': 100}
0.569010 (0.012890) with: {'batch_size': 20, 'epochs': 10}
0.673177 (0.051560) with: {'batch_size': 20, 'epochs': 50}
0.683594 (0.019401) with: {'batch_size': 20, 'epochs': 100}
0.558594 (0.044993) with: {'batch_size': 40, 'epochs': 10}
0.645833 (0.017566) with: {'batch_size': 40, 'epochs': 50}
0.651042 (0.042473) with: {'batch_size': 40, 'epochs': 100}
0.540365 (0.089778) with: {'batch_size': 60, 'epochs': 10}
0.657552 (0.023510) with: {'batch_size': 60, 'epochs': 50}
0.679688 (0.037603) with: {'batch_size': 60, 'epochs': 100}
0.519531 (0.103399) with: {'batch_size': 80, 'epochs': 10}
0.595052 (0.033502) with: {'batch_size': 80, 'epochs': 50}
0.652344 (0.036782) with: {'batch_size': 80, 'epochs': 100}
0.548177 (0.051263) with: {'batch_size': 100, 'epochs': 10}
0.621094 (0.014616) with: {'batch_size': 100, 'epo

## Tune the Training Optimization Algorithm

Next, tune the optimization algorithm used to train the network, trying each candidate with its default parameters. In practice, one usually picks an optimizer up front and then concentrates on tuning its parameters for the problem at hand, as shown in the next section.

Add an `optimizer` argument to `create_model` so that Grid Search can vary it:

In [18]:
def create_model(optimizer="adam"):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

Set `epochs` and `batch_size` to the best values found by the grid search above:

In [19]:
clf_params = {
    "epochs": 100,
    "batch_size": 10,
    "validation_split": 0.2,
    "shuffle": True,
    "use_multiprocessing": True,
    "verbose": 0
}

In [20]:
model = KerasClassifier(build_fn=create_model)

In [21]:
model.set_params(**clf_params)

<tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier at 0x7f8f58314e90>

In [22]:
model.get_params()

{'epochs': 100,
 'batch_size': 10,
 'validation_split': 0.2,
 'shuffle': True,
 'use_multiprocessing': True,
 'verbose': 0,
 'build_fn': <function __main__.create_model(optimizer='adam')>}

In [24]:
param_grid = {
    "optimizer": ["SGD", "RMSprop", "Adagrad", "Adadelta", "Adam", "Adamax", "Nadam"]
}

In [25]:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)

In [26]:
classifier = grid.fit(X, y)

In [27]:
print("Best accuracy: %.2f" % classifier.best_score_)
print("Best parameters for training: %s" % classifier.best_params_)


means = classifier.cv_results_['mean_test_score']
stds = classifier.cv_results_['std_test_score']
params = classifier.cv_results_['params']

print()
print("Means, STDs, and Parameters for each grid combination:")
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best accuracy: 0.70
Best parameters for training: {'optimizer': 'Adam'}

Means, STDs, and Parameters for each grid combination:
0.665365 (0.038976) with: {'optimizer': 'SGD'}
0.695312 (0.014616) with: {'optimizer': 'RMSprop'}
0.550781 (0.082741) with: {'optimizer': 'Adagrad'}
0.626302 (0.021710) with: {'optimizer': 'Adadelta'}
0.703125 (0.038670) with: {'optimizer': 'Adam'}
0.682292 (0.027498) with: {'optimizer': 'Adamax'}
0.667969 (0.030425) with: {'optimizer': 'Nadam'}


## Tune the Learning Rate and Momentum

**Optimizing the SGD learning rate and momentum parameters.**

The learning rate controls how much the weights are updated at the end of each batch, and the momentum controls how much the previous update influences the current one.
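
The interplay of the two parameters can be seen in a few lines of plain Python. This is a minimal sketch of the classical momentum update (velocity = momentum * velocity - learning_rate * gradient) on a toy objective, not the Keras implementation; the function names and the objective f(w) = w² are assumptions made for illustration.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.01, momentum=0.0):
    """One update: the velocity mixes the previous update with the new gradient."""
    velocity = momentum * velocity - learning_rate * grad
    return w + velocity, velocity


def minimize(momentum, steps=20, learning_rate=0.1):
    """Run a few updates on the toy objective f(w) = w**2, starting at w = 1.0."""
    w, velocity = 1.0, 0.0
    for _ in range(steps):
        grad = 2.0 * w  # derivative of w**2
        w, velocity = sgd_momentum_step(w, velocity, grad, learning_rate, momentum)
    return w


for m in (0.0, 0.5, 0.9):
    print(f"momentum={m}: w after 20 steps = {minimize(m):+.6f}")
```

With `momentum=0.0` every step depends only on the current gradient; larger values carry more of the previous step forward, which changes how quickly (and how smoothly) `w` approaches the minimum at 0.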

It is a good idea to include the number of epochs in an optimization like this, because the amount of learning per batch (learning rate), the number of updates per epoch (determined by the batch size), and the number of epochs all depend on one another.
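
A rough way to see this dependency is to count the weight updates a single fit performs. The sketch below is illustrative only: `total_updates` is a hypothetical helper, and the training-set size assumes 768 rows, 3-fold cross-validation, and a 20% validation split, which leaves about 409 rows per fit.

```python
import math


def total_updates(n_train, batch_size, epochs):
    """Number of weight updates: batches per epoch times number of epochs."""
    return math.ceil(n_train / batch_size) * epochs


# Assumed training-set size per fit: 768 rows, 2/3 used for training in
# 3-fold CV, then 80% of those kept after validation_split=0.2.
n_train = int(768 * 2 / 3 * 0.8)  # about 409 rows

for batch_size in (10, 100):
    for epochs in (10, 100):
        n = total_updates(n_train, batch_size, epochs)
        print(f"batch_size={batch_size:>3}, epochs={epochs:>3}: {n} updates")
```

A small batch size with many epochs yields far more updates than a large batch size with few epochs, so a learning rate that suits one setting may be too large or too small for another.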

Add the `learning_rate` and `momentum` arguments, which are passed on to the SGD optimizer, so that Grid Search can vary them:

In [28]:
def create_model(learning_rate=0.01, momentum=0):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Set the SGD optimizer and Compile the model
    optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=momentum)
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

Reuse the best `epochs` and `batch_size` from the first grid search:

In [29]:
clf_params = {
    "epochs": 100,
    "batch_size": 10,
    "validation_split": 0.2,
    "shuffle": True,
    "use_multiprocessing": True,
    "verbose": 0
}

In [30]:
model = KerasClassifier(build_fn=create_model)

In [31]:
model.set_params(**clf_params)

<tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier at 0x7f8ec3217c90>

In [32]:
model.get_params()

{'epochs': 100,
 'batch_size': 10,
 'validation_split': 0.2,
 'shuffle': True,
 'use_multiprocessing': True,
 'verbose': 0,
 'build_fn': <function __main__.create_model(learning_rate=0.01, momentum=0)>}

Try a suite of small standard learning rates, and momentum values from 0.0 to 0.8 in steps of 0.2, as well as 0.9 (a popular value in practice).

In [34]:
param_grid = {
    "learning_rate": [0.001, 0.01, 0.1, 0.2, 0.3],
    "momentum": [0.0, 0.2, 0.4, 0.6, 0.8, 0.9]
}

In [35]:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)

In [36]:
classifier = grid.fit(X, y)

In [37]:
print("Best accuracy: %.2f" % classifier.best_score_)
print("Best parameters for training: %s" % classifier.best_params_)


means = classifier.cv_results_['mean_test_score']
stds = classifier.cv_results_['std_test_score']
params = classifier.cv_results_['params']

print()
print("Means, STDs, and Parameters for each grid combination:")
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best accuracy: 0.69
Best parameters for training: {'learning_rate': 0.001, 'momentum': 0.0}

Means, STDs, and Parameters for each grid combination:
0.692708 (0.045814) with: {'learning_rate': 0.001, 'momentum': 0.0}
0.683594 (0.029232) with: {'learning_rate': 0.001, 'momentum': 0.2}
0.690104 (0.012890) with: {'learning_rate': 0.001, 'momentum': 0.4}
0.675781 (0.003189) with: {'learning_rate': 0.001, 'momentum': 0.6}
0.684896 (0.022628) with: {'learning_rate': 0.001, 'momentum': 0.8}
0.657552 (0.031466) with: {'learning_rate': 0.001, 'momentum': 0.9}
0.653646 (0.009207) with: {'learning_rate': 0.01, 'momentum': 0.0}
0.653646 (0.028940) with: {'learning_rate': 0.01, 'momentum': 0.2}
0.662760 (0.009744) with: {'learning_rate': 0.01, 'momentum': 0.4}
0.649740 (0.026557) with: {'learning_rate': 0.01, 'momentum': 0.6}
0.651042 (0.024774) with: {'learning_rate': 0.01, 'momentum': 0.8}
0.651042 (0.024774) with: {'learning_rate': 0.01, 'momentum': 0.9}
0.651042 (0.024774) with: {'learning_rate'

## Tune the Network Weight Initialization

**Use small random values.**

Here, the same weight initialization method is used on every layer. The hidden layer uses the rectifier (ReLU) activation, and the output layer uses the sigmoid because the predictions are binary.

Ideally, it may be better to use a different weight initialization scheme for each layer, chosen according to the activation function used on that layer.
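
Several of the uniform initializers differ only in the range they sample from, and that range depends on the layer shape. The sketch below computes the sampling limits from the formulas given in the Keras initializer documentation (Glorot: sqrt(6 / (fan_in + fan_out)), He: sqrt(6 / fan_in), LeCun: sqrt(3 / fan_in)); the helper function names are made up for this illustration.

```python
import math


def glorot_uniform_limit(fan_in, fan_out):
    """Glorot/Xavier uniform samples from [-limit, limit]."""
    return math.sqrt(6.0 / (fan_in + fan_out))


def he_uniform_limit(fan_in):
    """He uniform samples from [-limit, limit]; often paired with ReLU."""
    return math.sqrt(6.0 / fan_in)


def lecun_uniform_limit(fan_in):
    """LeCun uniform samples from [-limit, limit]."""
    return math.sqrt(3.0 / fan_in)


# Hidden layer of this notebook: 8 inputs feeding 12 units.
fan_in, fan_out = 8, 12
print(f"glorot_uniform limit: {glorot_uniform_limit(fan_in, fan_out):.4f}")
print(f"he_uniform limit:     {he_uniform_limit(fan_in):.4f}")
print(f"lecun_uniform limit:  {lecun_uniform_limit(fan_in):.4f}")
```

For comparison, the plain `'uniform'` initializer is expected to sample from a fixed, much narrower range (Keras documents ±0.05 as the default for `RandomUniform`), independent of the layer shape.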

Check the initializers from the **Keras API**:

- https://keras.io/api/layers/initializers/

Add the `initilization_mode` argument and pass it as each layer's `kernel_initializer`, so that Grid Search can vary it:

In [38]:
def create_model(initilization_mode='uniform'):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer=initilization_mode, activation='relu'))
    model.add(Dense(1, kernel_initializer=initilization_mode, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

Reuse the best `epochs` and `batch_size` from the first grid search:

In [39]:
clf_params = {
    "epochs": 100,
    "batch_size": 10,
    "validation_split": 0.2,
    "shuffle": True,
    "use_multiprocessing": True,
    "verbose": 0
}

In [40]:
model = KerasClassifier(build_fn=create_model)

In [41]:
model.set_params(**clf_params)

<tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier at 0x7f90595b49d0>

In [42]:
model.get_params()

{'epochs': 100,
 'batch_size': 10,
 'validation_split': 0.2,
 'shuffle': True,
 'use_multiprocessing': True,
 'verbose': 0,
 'build_fn': <function __main__.create_model(initilization_mode='uniform')>}

Set up the various initializers:

In [43]:
param_grid = {
    "initilization_mode": ['uniform', 'lecun_uniform', 'normal', 'zero', 'glorot_normal', 
                           'glorot_uniform', 'he_normal', 'he_uniform']
}

In [44]:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)

In [45]:
classifier = grid.fit(X, y)

In [46]:
print("Best accuracy: %.2f" % classifier.best_score_)
print("Best parameters for training: %s" % classifier.best_params_)


means = classifier.cv_results_['mean_test_score']
stds = classifier.cv_results_['std_test_score']
params = classifier.cv_results_['params']

print()
print("Means, STDs, and Parameters for each grid combination:")
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best accuracy: 0.72
Best parameters for training: {'initilization_mode': 'uniform'}

Means, STDs, and Parameters for each grid combination:
0.716146 (0.017566) with: {'initilization_mode': 'uniform'}
0.678385 (0.046256) with: {'initilization_mode': 'lecun_uniform'}
0.710938 (0.006379) with: {'initilization_mode': 'normal'}
0.651042 (0.024774) with: {'initilization_mode': 'zero'}
0.687500 (0.031894) with: {'initilization_mode': 'glorot_normal'}
0.675781 (0.006379) with: {'initilization_mode': 'glorot_uniform'}
0.697917 (0.015733) with: {'initilization_mode': 'he_normal'}
0.639323 (0.062933) with: {'initilization_mode': 'he_uniform'}


## Tune the Neuron Activation Function

The rectifier (ReLU) activation function, used as the default here, is generally the most popular choice. Earlier, the sigmoid and tanh functions were the norm, and they may still be more suitable for some problems.

Evaluate the suite of activation functions available in Keras. They are used only in the hidden layer, because the output layer requires a sigmoid activation for this binary classification problem.
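
To make the candidates more concrete, the following sketch writes a few of them as plain Python functions using their standard mathematical definitions. It is only an illustration of their shapes and output ranges, not the Keras code path.

```python
import math


def relu(x):
    return max(0.0, x)


def softplus(x):
    # Smooth approximation of ReLU: log(1 + e^x)
    return math.log1p(math.exp(x))


def softsign(x):
    return x / (1.0 + abs(x))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


activations = {
    "relu": relu,
    "softplus": softplus,
    "softsign": softsign,
    "tanh": math.tanh,
    "sigmoid": sigmoid,
}

# Evaluate each function at a negative, zero, and positive input.
for name, fn in activations.items():
    values = ", ".join(f"{fn(x):+.4f}" for x in (-2.0, 0.0, 2.0))
    print(f"{name:>8}: {values}")
```

ReLU and softplus are unbounded above, while softsign and tanh stay within (-1, 1) and sigmoid within (0, 1).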

Generally, it is a good idea to rescale the data to the range of each transfer function; this is not done here.
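
For readers who want to try such a preparation step, here is a minimal min-max scaling sketch in plain Python. The helper `min_max_scale` is hypothetical, and the input is just the five `Glucose` values shown in the `data.head()` output; in a real experiment the scaling parameters would normally be derived from the training data only.

```python
def min_max_scale(values, low=0.0, high=1.0):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # All values identical: map everything to the lower bound.
        return [low for _ in values]
    return [low + (v - v_min) / span * (high - low) for v in values]


glucose = [148, 85, 183, 89, 137]  # first five Glucose values from data.head()

unit_range = min_max_scale(glucose)              # e.g. for a sigmoid-like range
symmetric = min_max_scale(glucose, -1.0, 1.0)    # e.g. for a tanh-like range

print([round(v, 3) for v in unit_range])
print([round(v, 3) for v in symmetric])
```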

Activation functions in Keras:

- https://keras.io/api/layers/activations/

Add an `activation` argument for the hidden layer so that Grid Search can vary it:

In [47]:
def create_model(activation='relu'):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation=activation))
    model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

Reuse the best `epochs` and `batch_size` from the first grid search:

In [48]:
clf_params = {
    "epochs": 100,
    "batch_size": 10,
    "validation_split": 0.2,
    "shuffle": True,
    "use_multiprocessing": True,
    "verbose": 0
}

In [49]:
model = KerasClassifier(build_fn=create_model)

In [50]:
model.set_params(**clf_params)

<tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier at 0x7f90595ba510>

In [51]:
model.get_params()

{'epochs': 100,
 'batch_size': 10,
 'validation_split': 0.2,
 'shuffle': True,
 'use_multiprocessing': True,
 'verbose': 0,
 'build_fn': <function __main__.create_model(activation='relu')>}

Set up the various activation functions:

In [52]:
param_grid = {
    "activation": ['softmax', 'softplus', 'softsign', 'relu', 'tanh', 'sigmoid', 'hard_sigmoid', 'linear']
}

In [53]:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)

In [54]:
classifier = grid.fit(X, y)

In [55]:
print("Best accuracy: %.2f" % classifier.best_score_)
print("Best parameters for training: %s" % classifier.best_params_)


means = classifier.cv_results_['mean_test_score']
stds = classifier.cv_results_['std_test_score']
params = classifier.cv_results_['params']

print()
print("Means, STDs, and Parameters for each grid combination:")
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best accuracy: 0.74
Best parameters for training: {'activation': 'softplus'}

Means, STDs, and Parameters for each grid combination:
0.680990 (0.021236) with: {'activation': 'softmax'}
0.736979 (0.027126) with: {'activation': 'softplus'}
0.694010 (0.010253) with: {'activation': 'softsign'}
0.713542 (0.009744) with: {'activation': 'relu'}
0.694010 (0.019225) with: {'activation': 'tanh'}
0.707031 (0.013902) with: {'activation': 'sigmoid'}
0.677083 (0.031304) with: {'activation': 'hard_sigmoid'}
0.714844 (0.033146) with: {'activation': 'linear'}


## Tune the Dropout Regularization

Tune dropout in an effort to limit overfitting and improve the model’s ability to generalize. *To get good results, dropout is best combined with a weight constraint such as the max norm constraint.* This involves fitting both the dropout percentage and the weight constraint. 

Try dropout rates between 0.0 and 0.9 (1.0 does not make sense) and max-norm weight constraint values between 1 and 5.
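
The two mechanisms can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the Keras implementation: `apply_max_norm` rescales a single weight vector (Keras applies the constraint to the incoming weights of each unit), and `apply_dropout` uses the common "inverted dropout" formulation, where surviving activations are scaled up during training.

```python
import math
import random


def apply_max_norm(weights, max_value):
    """Rescale a weight vector so its L2 norm does not exceed max_value."""
    norm = math.sqrt(sum(w * w for w in weights))
    if norm <= max_value or norm == 0.0:
        return list(weights)
    return [w * max_value / norm for w in weights]


def apply_dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


weights = [3.0, 4.0]                   # L2 norm is 5
print(apply_max_norm(weights, 2.0))    # rescaled so that the norm becomes 2
print(apply_max_norm(weights, 10.0))   # already within the limit, unchanged

rng = random.Random(42)
dropped = apply_dropout([1.0] * 10, rate=0.5, rng=rng)
print(dropped)                         # each entry is either 0.0 or 2.0
```

A smaller `max_value` restricts the weights more strongly, and a higher `rate` silences more units per update, which is why the two values are searched together.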

Add the `dropout_rate` and `weight_constraint` arguments in order to set the relevant grid parameters on Grid Search:

In [78]:
def create_model(dropout_rate=0.0, weight_constraint=0):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='linear', 
                    kernel_constraint=MaxNorm(weight_constraint)))
    model.add(Dropout(dropout_rate))
    model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

Reuse the best `epochs` and `batch_size` from the first grid search:

In [79]:
clf_params = {
    "epochs": 100,
    "batch_size": 10,
    "validation_split": 0.2,
    "shuffle": True,
    "use_multiprocessing": True,
    "verbose": 0
}

In [80]:
model = KerasClassifier(build_fn=create_model)

In [81]:
model.set_params(**clf_params)

<tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier at 0x7f8f497aa950>

In [82]:
model.get_params()

{'epochs': 100,
 'batch_size': 10,
 'validation_split': 0.2,
 'shuffle': True,
 'use_multiprocessing': True,
 'verbose': 0,
 'build_fn': <function __main__.create_model(dropout_rate=0.0, weight_constraint=0)>}

Set up the weight constraint values and dropout rates:

In [83]:
param_grid = {
    "weight_constraint": [1, 2, 3, 4, 5],
    "dropout_rate": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
}

In [84]:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)

In [85]:
classifier = grid.fit(X, y)

In [86]:
print("Best accuracy: %.2f" % classifier.best_score_)
print("Best parameters for training: %s" % classifier.best_params_)


means = classifier.cv_results_['mean_test_score']
stds = classifier.cv_results_['std_test_score']
params = classifier.cv_results_['params']

print()
print("Means, STDs, and Parameters for each grid combination:")
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best accuracy: 0.73
Best parameters for training: {'dropout_rate': 0.0, 'weight_constraint': 1}

Means, STDs, and Parameters for each grid combination:
0.731771 (0.013279) with: {'dropout_rate': 0.0, 'weight_constraint': 1}
0.727865 (0.018688) with: {'dropout_rate': 0.0, 'weight_constraint': 2}
0.695312 (0.027251) with: {'dropout_rate': 0.0, 'weight_constraint': 3}
0.717448 (0.008027) with: {'dropout_rate': 0.0, 'weight_constraint': 4}
0.716146 (0.032578) with: {'dropout_rate': 0.0, 'weight_constraint': 5}
0.699219 (0.008438) with: {'dropout_rate': 0.1, 'weight_constraint': 1}
0.712240 (0.013279) with: {'dropout_rate': 0.1, 'weight_constraint': 2}
0.720052 (0.017566) with: {'dropout_rate': 0.1, 'weight_constraint': 3}
0.710938 (0.008438) with: {'dropout_rate': 0.1, 'weight_constraint': 4}
0.699219 (0.017758) with: {'dropout_rate': 0.1, 'weight_constraint': 5}
0.696615 (0.012075) with: {'dropout_rate': 0.2, 'weight_constraint': 1}
0.713542 (0.010253) with: {'dropout_rate': 0.2, 'weight_

## Tune the Number of Neurons in the Hidden Layer

**Tuning the number of neurons in a single hidden layer.**

The number of neurons in a layer is an important parameter to tune. Generally, it controls the representational capacity of the network, at least at that point in the topology. In theory, a sufficiently large single-hidden-layer network can approximate any other neural network.

A larger network requires more training, so at least the batch size and the number of epochs should ideally be optimized together with the number of neurons.
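
One simple proxy for capacity is the number of trainable parameters. The sketch below counts them for a network with 8 inputs, one dense hidden layer, and a single output unit, which is the topology used throughout this notebook; `count_parameters` is a hypothetical helper, and a dropout layer would add no parameters.

```python
def count_parameters(neurons, n_inputs=8, n_outputs=1):
    """Weights plus biases of a dense hidden layer and a dense output layer."""
    hidden = n_inputs * neurons + neurons        # hidden weights + hidden biases
    output = neurons * n_outputs + n_outputs     # output weights + output biases
    return hidden + output


for neurons in (1, 5, 10, 15, 20, 25, 30):
    print(f"neurons={neurons:>2}: {count_parameters(neurons)} trainable parameters")
```

With these layer sizes the count grows linearly, by 10 parameters per additional hidden neuron.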

Add the `neurons` argument in order to set the relevant grid parameters on Grid Search:

In [87]:
def create_model(neurons=1):
    # create model
    model = Sequential()
    model.add(Dense(neurons, input_dim=8, kernel_initializer='uniform', activation='linear', 
                    kernel_constraint=MaxNorm(4)))
    model.add(Dropout(0.2))
    model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

Reuse the best `epochs` and `batch_size` from the first grid search:

In [88]:
clf_params = {
    "epochs": 100,
    "batch_size": 10,
    "validation_split": 0.2,
    "shuffle": True,
    "use_multiprocessing": True,
    "verbose": 0
}

In [89]:
model = KerasClassifier(build_fn=create_model)

In [90]:
model.set_params(**clf_params)

<tensorflow.python.keras.wrappers.scikit_learn.KerasClassifier at 0x7f8f583096d0>

In [91]:
model.get_params()

{'epochs': 100,
 'batch_size': 10,
 'validation_split': 0.2,
 'shuffle': True,
 'use_multiprocessing': True,
 'verbose': 0,
 'build_fn': <function __main__.create_model(neurons=1)>}

Try 1 neuron, then values from 5 to 30 in steps of 5:

In [92]:
param_grid = {
    "neurons": [1, 5, 10, 15, 20, 25, 30]
}

In [93]:
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)

In [94]:
classifier = grid.fit(X, y)

In [95]:
print("Best accuracy: %.2f" % classifier.best_score_)
print("Best parameters for training: %s" % classifier.best_params_)


means = classifier.cv_results_['mean_test_score']
stds = classifier.cv_results_['std_test_score']
params = classifier.cv_results_['params']

print()
print("Means, STDs, and Parameters for each grid combination:")
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best accuracy: 0.72
Best parameters for training: {'neurons': 15}

Means, STDs, and Parameters for each grid combination:
0.695312 (0.019401) with: {'neurons': 1}
0.720052 (0.033197) with: {'neurons': 5}
0.699219 (0.011049) with: {'neurons': 10}
0.722656 (0.027805) with: {'neurons': 15}
0.690104 (0.014382) with: {'neurons': 20}
0.714844 (0.006379) with: {'neurons': 25}
0.713542 (0.004872) with: {'neurons': 30}
