# **Keras Hyperparameter Tuning**


## Part of the class Advanced Methods in Data Analysis II


Keras provides a wrapper class, KerasClassifier, that allows us to use our deep learning models with scikit-learn. This is especially useful when you want to tune hyperparameters using scikit-learn's RandomizedSearchCV or GridSearchCV.

To use it, we first define a function that takes the hyperparameters we wish to tune as arguments. Inside the function, we define the network's structure as usual and compile it. The function is then passed to KerasClassifier's build_fn parameter. Note that, as with all other estimators in scikit-learn, build_fn should provide default values for its arguments so that we can create the estimator without passing in a value for every parameter.

In [1]:
import numpy as np
import pandas as pd

from keras.models import Sequential
from keras.datasets import fashion_mnist
from tensorflow.keras.utils import to_categorical
from keras import models
from keras import layers

from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import RandomizedSearchCV
from sklearn.model_selection import cross_val_score
from sklearn.metrics import accuracy_score

Preparing data and labels

In [2]:
(X_train_image, y_train_labels), (X_test_image, y_test_labels) = fashion_mnist.load_data()

X_train_image = X_train_image.reshape((60000, 28 * 28))
X_train_image = X_train_image.astype('float32') / 255
X_test_image = X_test_image.reshape((10000, 28 * 28))
X_test_image = X_test_image.astype('float32') / 255

y_train_labels = to_categorical(y_train_labels)
y_test_labels = to_categorical(y_test_labels)

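np.argmax returns the indices of the maximum values along an axis. Applied to the one-hot encoded labels along axis 1, it recovers the integer class label of each sample, which is the form passed to the search later on.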

In [3]:
rounded_labels=np.argmax(y_train_labels, axis=1)
rounded_labels[1]

0
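To make the round trip between integer labels and one-hot rows concrete, here is a minimal sketch on a few made-up labels. It uses np.eye as a stand-in for to_categorical, which is an assumption made only to keep the example free of Keras.

```python
import numpy as np

# Made-up integer class labels for illustration (not read from the dataset).
labels = np.array([9, 0, 0, 3])
num_classes = 10

# One-hot encode: row i has a 1.0 in column labels[i] and 0.0 elsewhere.
one_hot = np.eye(num_classes, dtype="float32")[labels]

# argmax along axis=1 returns the column index of the largest value in each
# row, which recovers the original integer label.
recovered = np.argmax(one_hot, axis=1)

print(one_hot.shape)  # (4, 10)
print(recovered)      # [9 0 0 3]
```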

Next we define the model to optimize. The function below builds and compiles the network, taking the hyperparameters we want to tune as arguments with default values.

In [4]:
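def build_keras_base(hidden_layers=(512, 10), optimizer='sgd'):
    # hidden_layers lists the layer widths; the last entry is the softmax output layer.
    model_opt = models.Sequential()
    # The first layer also declares the flattened 28 x 28 input shape.
    model_opt.add(layers.Dense(hidden_layers[0], activation='relu', input_shape=(28 * 28,)))
    # Any widths between the first and the last become extra hidden layers.
    for units in hidden_layers[1:-1]:
        model_opt.add(layers.Dense(units, activation='relu'))
    model_opt.add(layers.Dense(hidden_layers[-1], activation='softmax'))
    model_opt.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model_opt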
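To get a feel for how the layer widths change the size of the network, the sketch below counts the trainable parameters of a stack of fully connected layers. It assumes that a list such as [512, 10] gives the width of each Dense layer, with the last entry being the 10-class output. count_dense_params is a hypothetical helper written for this illustration, not part of Keras.

```python
def count_dense_params(input_dim, layer_widths):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    previous = input_dim
    for width in layer_widths:
        total += previous * width + width  # weight matrix plus bias vector
        previous = width
    return total


# Flattened 28 x 28 images give 784 inputs.
for widths in ([90, 10], [128, 10], [512, 10]):
    print(widths, count_dense_params(28 * 28, widths))
# [90, 10] 71560
# [128, 10] 101770
# [512, 10] 407050
```

The widest candidate has roughly six times as many parameters as the narrowest one, so it is also the most expensive to train in each fold.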

To use these wrappers you must define a function that creates and returns your Keras sequential model, then pass this function to the build_fn argument when constructing the KerasClassifier class.

In [5]:
model_keras = KerasClassifier(build_fn=build_keras_base, verbose=0)

When constructing RandomizedSearchCV, you must provide a dictionary of hyperparameters to evaluate in the param_distributions argument. This dictionary maps each parameter name to a list of values to try.

In [6]:
hidden_layers_opt = [[90, 10], [128, 10], [512, 10]]
batch_size_opt = [128, 256]
epochs_opt = [5, 10]
optimizer = ['sgd', 'rmsprop']

parameters_opt = {'hidden_layers': hidden_layers_opt,
                  'optimizer': optimizer,
                  'batch_size': batch_size_opt,
                  'epochs': epochs_opt}
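The dictionary mixes arguments of the build function (hidden_layers, optimizer) with training arguments (batch_size, epochs). The wrapper is understood to route each entry by matching its name against the arguments of the function being called. The sketch below mimics that idea with inspect; filter_params and the two stand-in functions are hypothetical and are not the library's actual code.

```python
import inspect


def filter_params(fn, params):
    """Keep only the entries of params that fn accepts as named arguments."""
    accepted = inspect.signature(fn).parameters
    return {name: value for name, value in params.items() if name in accepted}


# Stand-ins that only share argument names with a build function and a
# simplified fit method; neither of them trains anything.
def build_fn(hidden_layers=(512, 10), optimizer="sgd"):
    return None


def fit(x, y, batch_size=32, epochs=1, verbose=0):
    return None


candidate = {"hidden_layers": [128, 10], "optimizer": "rmsprop",
             "batch_size": 256, "epochs": 5}

build_kwargs = filter_params(build_fn, candidate)
fit_kwargs = filter_params(fit, candidate)

print(build_kwargs)  # {'hidden_layers': [128, 10], 'optimizer': 'rmsprop'}
print(fit_kwargs)    # {'batch_size': 256, 'epochs': 5}
```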

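Random search evaluates a fixed number of combinations (n_iter) sampled from the search space instead of trying all of them. It is great for discovery and for finding hyperparameter combinations that you would not have guessed intuitively, although covering a large space well can take many iterations and therefore more time to execute.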

In [7]:
grid_search =  RandomizedSearchCV(model_keras,
                                  param_distributions = parameters_opt, 
                                  scoring = 'accuracy', 
                                  n_iter = 3, 
                                  cv = 10)
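To illustrate what sampling n_iter = 3 candidates means, the following sketch enumerates every combination of the search space with itertools and draws three of them without replacement. It is a plain-Python approximation of the idea, not scikit-learn's own sampler, and the fixed seed is an assumption added so the draw is repeatable.

```python
import itertools
import random

param_space = {
    "hidden_layers": [[90, 10], [128, 10], [512, 10]],
    "optimizer": ["sgd", "rmsprop"],
    "batch_size": [128, 256],
    "epochs": [5, 10],
}

# Build one dictionary per combination of values.
names = list(param_space)
all_combinations = [dict(zip(names, values))
                    for values in itertools.product(*param_space.values())]

rng = random.Random(0)  # fixed seed (an assumption) for a repeatable draw
sampled = rng.sample(all_combinations, k=3)

print(len(all_combinations))  # 24
for candidate in sampled:
    print(candidate)
```

With cv = 10, three sampled candidates amount to 30 model fits, plus one final refit of the best candidate on the full training set when refit is left at its default.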


Once the search completes, you can access the outcome through the object returned by grid_search.fit(). The best_score_ attribute holds the best mean cross-validated score observed during the search, and best_params_ describes the combination of parameters that achieved it.

In [8]:
grid_result = grid_search.fit(X_train_image, rounded_labels, verbose=0)

In [9]:
best_parameters = grid_search.best_params_
best_accuracy = grid_search.best_score_
print(best_parameters)
print(best_accuracy)

{'optimizer': 'rmsprop', 'hidden_layers': [512, 10], 'epochs': 10, 'batch_size': 256}
0.8858333333333335
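The reported score is an average over the validation folds rather than a single measurement. The sketch below shows that selection rule on made-up numbers: the candidate names and fold accuracies are hypothetical, and only three folds are used for brevity.

```python
from statistics import mean

# Hypothetical per-fold accuracies for two candidates (made-up numbers).
fold_scores = {
    "candidate_a": [0.80, 0.84, 0.82],
    "candidate_b": [0.88, 0.86, 0.90],
}

# Average the folds for each candidate, then keep the highest mean.
mean_scores = {name: mean(scores) for name, scores in fold_scores.items()}
best_name = max(mean_scores, key=mean_scores.get)
best_score = mean_scores[best_name]

print(best_name, round(best_score, 2))  # candidate_b 0.88
```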


-------------------------------------------------------------------------------------------------------------------

Grid search defines the search space as a grid of hyperparameter values and evaluates every position in the grid.
Grid search is great for spot-checking combinations that are known to perform well generally.

In [10]:
grid_search = GridSearchCV(estimator=model_keras, 
                           param_grid =parameters_opt, 
                           scoring = 'accuracy', 
                           cv = 10)
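Because every position in the grid is evaluated, it helps to estimate the cost before calling fit. The sketch below works out the number of candidates, the number of model fits, and the fold sizes for the grid and the 60,000 training images used here. It is plain arithmetic and assumes equally sized folds.

```python
import math

param_grid = {
    "hidden_layers": [[90, 10], [128, 10], [512, 10]],
    "optimizer": ["sgd", "rmsprop"],
    "batch_size": [128, 256],
    "epochs": [5, 10],
}
cv = 10
n_samples = 60000

# One candidate per combination of values: 3 * 2 * 2 * 2.
n_candidates = math.prod(len(values) for values in param_grid.values())
# Each candidate is trained once per fold.
n_fits = n_candidates * cv
# Each fold holds out one tenth of the data for validation.
val_size = n_samples // cv
train_size = n_samples - val_size

print(n_candidates, n_fits, train_size, val_size)  # 24 240 54000 6000
```

That is eight times as many fits as the random search with n_iter = 3, which is why this cell takes considerably longer to run.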

In [None]:
grid_result = grid_search.fit(X_train_image, rounded_labels)

In [13]:
best_parameters = grid_search.best_params_
best_accuracy = grid_search.best_score_
print(best_parameters)
print(best_accuracy)

{'batch_size': 128, 'epochs': 10, 'hidden_layers': [90, 10], 'optimizer': 'rmsprop'}
0.8921333333333334
