# Grid Search Example

In this example we will perform a grid search using scikit-learn and Keras on the [breast cancer](https://github.com/autonomio/datasets/blob/master/autonomio-datasets/breast_cancer.csv) classification task. You can run this example on a CPU; it takes about 5 minutes.

This example was taken from [Talos](https://github.com/autonomio/talos/blob/master/examples/Hyperparameter%20Optimization%20on%20Keras%20with%20Breast%20Cancer%20Data.ipynb), another interesting library for performing grid and random search with Keras.

## Initial Setup

Import the packages we need for the computation.

In [1]:
import os
import pandas as pd
import wrangle as wr

from keras.wrappers.scikit_learn import KerasClassifier

# Dataset mount point
MP = '/floyd/input/bcds'

Using TensorFlow backend.


### Load the dataset

Load, clean, and preprocess the dataset.

In [2]:
def breast_cancer():
    '''Load and clean the breast cancer dataset; return features x and binary labels y.'''
    df = pd.read_csv(os.path.join(MP, 'breast_cancer.csv'))

    # Minimal cleanup: drop the unnamed column and the record id
    df.drop("Unnamed: 32", axis=1, inplace=True)
    df.drop("id", axis=1, inplace=True)

    # Separate the features (x) from the target (y)
    y = df.diagnosis.values
    x = df.drop('diagnosis', axis=1).values

    # Convert the string labels to binary: 1 for 'M', 0 otherwise
    y = (y == 'M').astype(int)

    return x, y

In [3]:
# Load the dataset
x, y = breast_cancer()

# Normalize every feature to mean 0, std 1
x = wr.mean_zero(pd.DataFrame(x)).values

input_dim = x.shape[1] # number of columns
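
The normalization step above rescales every feature column to mean 0 and standard deviation 1. As a rough illustration of that idea, here is a minimal standard-library sketch; `standardize_columns` is a hypothetical helper, not part of `wrangle`, and the use of the population standard deviation is an assumption, since the exact formula inside `wr.mean_zero` is not shown in this example.

```python
from statistics import mean, pstdev

def standardize_columns(rows):
    """Rescale each column of a list-of-rows table to mean 0 and std 1.

    Hypothetical helper for illustration; assumes the population
    standard deviation. Constant columns are mapped to 0.0.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0
         for value, m, s in zip(row, means, stds)]
        for row in rows
    ]

# Toy data: two features on very different scales
toy = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize_columns(toy)
for row in scaled:
    print(row)
```

After scaling, both toy columns hold the same values, which is the point of the step: no feature dominates the others simply because of its units.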

### Model definition

Define the model and the variables to search.

In [4]:
from keras.models import Sequential
from keras.layers import Dense, Dropout

# Function to create model, required for KerasClassifier
def create_model(first_neuron=9,
                 activation='relu',
                 kernel_initializer='uniform',
                 dropout_rate=0.0,
                 optimizer='Adam'):

    # Create model
    model = Sequential()
    # Hidden layer: first_neuron units fed by the input features
    model.add(Dense(first_neuron,
                    input_dim=input_dim,
                    kernel_initializer=kernel_initializer,
                    activation=activation))
    # Dropout (has no effect when dropout_rate is 0.0)
    model.add(Dropout(dropout_rate))
    # Output layer: a single sigmoid unit for binary classification
    model.add(Dense(1, kernel_initializer=kernel_initializer, activation='sigmoid'))
    # Compile model
    model.compile(loss='binary_crossentropy',
                  optimizer=optimizer,
                  metrics=['accuracy'])
    return model
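
The model above ends in a single sigmoid unit and is trained with the `binary_crossentropy` loss. To make that pairing concrete, here is a small standard-library sketch of both formulas. The function names are hypothetical, and the clipping constant `eps` is an assumption added for numerical safety; this is not Keras's actual implementation.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples.

    eps is an assumed clipping constant that keeps log() finite.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident, correct predictions give a lower loss than uncertain ones
labels = [1, 0, 1]
confident = [sigmoid(4.0), sigmoid(-4.0), sigmoid(4.0)]
uncertain = [sigmoid(0.0), sigmoid(0.0), sigmoid(0.0)]
print(binary_crossentropy(labels, confident))
print(binary_crossentropy(labels, uncertain))
```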

In [5]:
# Wrap the model-building function so scikit-learn can use it as an estimator
model = KerasClassifier(build_fn=create_model)

### Range of Values - The Grid

Define the values to try for each parameter.

In [6]:
# Define the range of values

# Model Design Components
first_neurons = [8, 9]
activation = ['relu', 'elu'] # You can also try 'tanh', 'sigmoid', 'hard_sigmoid', 'linear'
kernel_initializer = ['uniform', 'normal'] # You can also try 'lecun_uniform', 'zero', 'glorot_normal', 'glorot_uniform', 'he_normal', 'he_uniform'
optimizer = ['Adam', 'Nadam'] # You can also try 'SGD', 'RMSprop', 'Adagrad', 'Adadelta', 'Adamax'

# Hyperparameters
epochs = [10] # You can also try 20, 30, 40, etc...
batch_size = [1024] # You can also try 2, 4, 8, 16, 32, 64, 128 etc...
dropout_rate = [0.0] # No dropout, but you can also try 0.1, 0.2 etc...

In [7]:
# Prepare the Grid
param_grid = dict(epochs=epochs, 
                  batch_size=batch_size, 
                  optimizer=optimizer,
                  dropout_rate=dropout_rate,
                  activation=activation,
                  kernel_initializer=kernel_initializer,
                  first_neuron=first_neurons)
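
A grid search tries every combination of the listed values, so the number of candidates is the product of the list lengths. The sketch below enumerates the same grid with `itertools.product`, using only the values defined in this example. Sorting the keys alphabetically is an assumption made here so that the enumeration order lines up with the order of the parameter names in the search log.

```python
from itertools import product

# Same values as the grid defined in this example
param_grid = {
    "epochs": [10],
    "batch_size": [1024],
    "optimizer": ["Adam", "Nadam"],
    "dropout_rate": [0.0],
    "activation": ["relu", "elu"],
    "kernel_initializer": ["uniform", "normal"],
    "first_neuron": [8, 9],
}

# Enumerate every combination, one dict per candidate
keys = sorted(param_grid)
candidates = [
    dict(zip(keys, values))
    for values in product(*(param_grid[key] for key in keys))
]

n_folds = 3  # matches cv=3 in the search
print("candidates:", len(candidates))
print("total fits:", len(candidates) * n_folds)
print("first candidate:", candidates[0])
```

Four parameters have two values each and the rest have one, giving 2 × 2 × 2 × 2 = 16 candidates, or 48 model fits with 3 folds. Adding a third value to any one list would raise that to 24 candidates, which is why grids grow expensive quickly.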

### Grid Search

Run the search: 3 folds for cross-validation (`cv=3`) on a single process (`n_jobs=1`).

In [8]:
# Perform the Search!
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3, verbose=2)
grid_result = grid.fit(x, y) 

Fitting 3 folds for each of 16 candidates, totalling 48 fits
[CV] activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=Adam 
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
[CV]  activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=Adam, total=   0.7s
[CV] activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=Adam 


[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.7s remaining:    0.0s


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
[CV]  activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=Adam, total=   0.7s
[CV] activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=Adam 
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
[CV]  activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=Adam, total=   0.8s
[CV] activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=Nadam 
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
[CV]  activation=relu, batch_size=1024, dropout_rate=0.0, epochs=10, first_neuron=8, kernel_initializer=uniform, optimizer=N

[Parallel(n_jobs=1)]: Done  48 out of  48 | elapsed:  1.2min finished


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
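
With `cv=3`, each candidate is trained three times, each time holding out a different third of the data for scoring. The following standard-library sketch shows one simple way to produce such splits. `kfold_indices` is a hypothetical helper, and the contiguous, unshuffled folds are an assumption: scikit-learn may pick a stratified splitter instead, depending on the estimator.

```python
def kfold_indices(n_samples, n_splits=3):
    """Yield (train, test) index lists for contiguous, unshuffled folds.

    Hypothetical helper for illustration. When n_samples is not
    divisible by n_splits, the first folds get one extra sample.
    """
    fold_sizes = [n_samples // n_splits] * n_splits
    for i in range(n_samples % n_splits):
        fold_sizes[i] += 1

    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# Toy run with 10 samples
folds = list(kfold_indices(10, n_splits=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Every sample lands in exactly one test fold, so the mean test score reported for a candidate is an average over predictions on data that candidate did not train on.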


### Results

Let's see which configuration gives us the best performance.

In [9]:
# Show results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.938489 using {'activation': 'elu', 'batch_size': 1024, 'dropout_rate': 0.0, 'epochs': 10, 'first_neuron': 9, 'kernel_initializer': 'uniform', 'optimizer': 'Adam'}
0.859402 (0.094322) with: {'activation': 'relu', 'batch_size': 1024, 'dropout_rate': 0.0, 'epochs': 10, 'first_neuron': 8, 'kernel_initializer': 'uniform', 'optimizer': 'Adam'}
0.896309 (0.043754) with: {'activation': 'relu', 'batch_size': 1024, 'dropout_rate': 0.0, 'epochs': 10, 'first_neuron': 8, 'kernel_initializer': 'uniform', 'optimizer': 'Nadam'}
0.866432 (0.058552) with: {'activation': 'relu', 'batch_size': 1024, 'dropout_rate': 0.0, 'epochs': 10, 'first_neuron': 8, 'kernel_initializer': 'normal', 'optimizer': 'Adam'}
0.926186 (0.034017) with: {'activation': 'relu', 'batch_size': 1024, 'dropout_rate': 0.0, 'epochs': 10, 'first_neuron': 8, 'kernel_initializer': 'normal', 'optimizer': 'Nadam'}
0.887522 (0.055293) with: {'activation': 'relu', 'batch_size': 1024, 'dropout_rate': 0.0, 'epochs': 10, 'first_neuron': 9
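
The listing above follows the grid order, which makes it hard to compare candidates at a glance. A short sketch of ranking by mean test score follows. It uses only the four complete rows shown above (all with `activation='relu'` and `first_neuron=8`), with the parameter dicts abbreviated to the two keys that differ, so the top entry here is the best of this subset and not the overall best reported by `best_score_`.

```python
# (mean_test_score, std_test_score, abbreviated params) copied from the output above
results = [
    (0.859402, 0.094322, {"kernel_initializer": "uniform", "optimizer": "Adam"}),
    (0.896309, 0.043754, {"kernel_initializer": "uniform", "optimizer": "Nadam"}),
    (0.866432, 0.058552, {"kernel_initializer": "normal", "optimizer": "Adam"}),
    (0.926186, 0.034017, {"kernel_initializer": "normal", "optimizer": "Nadam"}),
]

# Highest mean score first
ranked = sorted(results, key=lambda row: row[0], reverse=True)

for rank, (mean_score, std_score, params) in enumerate(ranked, start=1):
    print("%d. %f (%f) with: %r" % (rank, mean_score, std_score, params))
```

The standard deviation is worth reading alongside the mean: with only three folds, a gap between two candidates that is smaller than their spread may not be meaningful.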

**That's all, folks. Don't forget to shut down your workspace once you're done 🙂**