# Keras Hyperparameter Tuning with scikit-learn GridSearchCV

Use [SciKeras](https://www.adriangb.com/scikeras/stable/index.html) to tune the hyperparameters of Keras models with scikit-learn.

* [SciKeras](https://www.adriangb.com/scikeras/stable/index.html)

> The goal of scikeras is to make it possible to use Keras/TensorFlow with sklearn. This is achieved by providing a wrapper around Keras that has an Scikit-Learn interface. SciKeras is the successor to tf.keras.wrappers.scikit_learn, and offers many improvements over the TensorFlow version of the wrappers. See Migration for a more details.

* [How to Grid Search Hyperparameters for Deep Learning Models in Python with Keras](https://machinelearningmastery.com/grid-search-hyperparameters-deep-learning-models-python-keras/)

> Keras models can be used in scikit-learn by wrapping them with the KerasClassifier or KerasRegressor class from the module SciKeras. 

* [How to pass callbacks to scikit_learn wrappers (e.g. KerasClassifier)](https://github.com/keras-team/keras/issues/4278)

> ```
> model = KerasClassifier(build_fn=create_model, epochs=100, verbose=1, validation_split=.2)
> grid_search = GridSearchCV(model, para_grid, n_jobs=-1, cv=5, refit='accuracy')
> grid_search.fit(X, y, callbacks=EarlyStopping(monitor='val_loss', patience=3))
> ```

## KerasClassifier

`KerasClassifier` accepts the parameters it needs to compile a Keras model, such as `optimizer`, `loss`, and `metrics`. See the API reference for the full list.

* [scikeras.wrappers.KerasClassifier](https://www.adriangb.com/scikeras/stable/generated/scikeras.wrappers.KerasClassifier.html#scikeras-wrappers-kerasclassifier)


In [1]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import (
    Model,
    Sequential
)
from tensorflow.keras.layers import (
    Layer,
    Dense, 
    Dropout, 
    Flatten, 
    Normalization,
    BatchNormalization,
    Activation,
    Conv2D, 
    MaxPooling2D,
)

import sklearn
from sklearn.model_selection import (
    GridSearchCV
)
from scikeras.wrappers import (
    KerasClassifier, 
)

In [2]:
print('The scikit-learn version is {}.'.format(sklearn.__version__))

The scikit-learn version is 1.1.3.


# CIFAR-10

In [3]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
input_shape = x_train[0].shape
number_of_classes = 10

In [4]:
max_value = float(np.max(x_train))
x_train_normed, x_test_normed = x_train/max_value, x_test/max_value
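Dividing a `uint8` image array by a Python float produces `float64` values. A small sketch of the scaling step on a synthetic stand-in array (assumed shape and random values, not the real CIFAR-10 data) shows the resulting range and what an optional cast to `float32` saves:

```python
import numpy as np

# Synthetic stand-in for a batch of CIFAR-10 images (uint8, values 0-255).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
images[0, 0, 0, 0] = 255  # make sure the maximum value is present

max_value = float(np.max(images))
scaled = images / max_value            # uint8 / float -> float64 in [0, 1]
scaled32 = scaled.astype(np.float32)   # optional: half the memory of float64

print(scaled.dtype, scaled32.dtype, scaled.min(), scaled.max())
print(scaled.nbytes, scaled32.nbytes)
```

Whether the cast is worth doing depends on memory pressure; Keras layers typically compute in `float32` either way.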

# Model

In [5]:
learning_rate: float = 1e-2

In [6]:
def create_model():
    model = Sequential([
        Conv2D(
            name="conv",
            filters=32,
            kernel_size=(3, 3),
            strides=(1, 1),
            padding="same",
            activation='relu',
            input_shape=input_shape
        ),
        MaxPooling2D(
            name="maxpool",
            pool_size=(2, 2)
        ),
        Flatten(),
        Dense(
            name="full",
            units=100,
            activation="relu"
        ),
        Dense(
            name="label",
            units=number_of_classes,
            activation="softmax"
        )
    ])
    model.compile(
        loss=tf.keras.losses.sparse_categorical_crossentropy,
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        metrics=['accuracy']
    )
    return model
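To get a feel for the size of this network, its parameter count can be worked out by hand. This is a rough sketch that assumes the CIFAR-10 input shape of 32 x 32 x 3 and the layer settings from `create_model` above:

```python
# Assumed input: CIFAR-10 images, 32 x 32 pixels, 3 channels.
height, width, channels = 32, 32, 3
filters = 32

# Conv2D: one 3x3 kernel per (input channel, filter) pair, plus one bias per filter.
conv_params = 3 * 3 * channels * filters + filters

# padding="same" with stride 1 keeps 32x32; 2x2 max pooling halves each side.
pooled_h, pooled_w = height // 2, width // 2
flat_units = pooled_h * pooled_w * filters

# Dense layers: weights plus biases.
dense_params = flat_units * 100 + 100
output_params = 100 * 10 + 10

total_params = conv_params + dense_params + output_params
print(conv_params, flat_units, dense_params, output_params, total_params)
```

Almost all of the weights sit in the first dense layer, because it is fully connected to the flattened feature map.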

In [7]:
model = KerasClassifier(model=create_model, verbose=2)

# Grid Search

In [8]:
# Search over the optimizer learning rate and the batch size.
# epochs is not part of the grid, so every fit uses the wrapper's epochs setting.
param_grid = {
    'optimizer__learning_rate': [1e-3, 1e-2, 1e-1],
    'batch_size': [32, 64]
}
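A grid search fits one model per parameter combination and cross-validation fold, so its cost grows quickly. The sketch below enumerates the combinations with `itertools.product`. It assumes the 3-fold setting (`cv=3`) used in the `GridSearchCV` call that follows and the default final refit on the full training set:

```python
from itertools import product

param_grid = {
    'optimizer__learning_rate': [1e-3, 1e-2, 1e-1],
    'batch_size': [32, 64],
}
cv_folds = 3  # assumed: the cv value passed to GridSearchCV

# Enumerate every combination, iterating parameter names in sorted order.
names = sorted(param_grid)
combinations = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

cv_fits = len(combinations) * cv_folds
total_fits = cv_fits + 1  # plus one refit of the best combination on all data

for combo in combinations:
    print(combo)
print(cv_fits, total_fits)
```

Adding a third parameter with two values would double these counts, which is worth keeping in mind before widening the grid.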

In [9]:
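# Note: this fits on the raw x_train (uint8 pixels, 0-255); the scaled
# x_train_normed from In [4] is not used here.
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1, cv=3)
grid_result = grid.fit(x_train, y_train)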

2023-02-17 22:46:12.036434: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz


1042/1042 - 25s - loss: 7.8157 - accuracy: 0.0980 - 25s/epoch - 24ms/step
1042/1042 - 25s - loss: 6.0225 - accuracy: 0.1017 - 25s/epoch - 24ms/step
1042/1042 - 25s - loss: 7.0546 - accuracy: 0.1013 - 25s/epoch - 24ms/step
1042/1042 - 25s - loss: 8.9751 - accuracy: 0.0978 - 25s/epoch - 24ms/step
1042/1042 - 25s - loss: 5.3383 - accuracy: 0.0993 - 25s/epoch - 24ms/step
1042/1042 - 25s - loss: 8.9572 - accuracy: 0.0972 - 25s/epoch - 24ms/step
1042/1042 - 26s - loss: 6.1767 - accuracy: 0.0967 - 26s/epoch - 25ms/step
1042/1042 - 26s - loss: 7.8066 - accuracy: 0.0986 - 26s/epoch - 25ms/step
521/521 - 5s - 5s/epoch - 10ms/step
521/521 - 6s - 6s/epoch - 11ms/step
521/521 - 6s - 6s/epoch - 11ms/step
521/521 - 5s - 5s/epoch - 10ms/step
521/521 - 6s - 6s/epoch - 11ms/step
521/521 - 6s - 6s/epoch - 11ms/step
521/521 - 5s - 5s/epoch - 11ms/step
521/521 - 5s - 5s/epoch - 10ms/step
521/521 - 24s - loss: 17.3070 - accuracy: 0.0989 - 24s/epoch - 45ms/step
521/521 - 24s - loss: 7.6591 - accuracy: 0.0974

2023-02-17 22:47:21.250478: W tensorflow/core/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz


782/782 - 6s - loss: 8.2860 - accuracy: 0.0984 - 6s/epoch - 8ms/step
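The step counts in the log (1042, 521, 782) follow from the dataset size, the fold split, and the batch size. The sketch below reproduces them. It assumes the 50,000 CIFAR-10 training images are split into three nearly equal folds and that the final 782-step run is the refit on the full training set:

```python
import math

n_train = 50_000  # CIFAR-10 training images
cv_folds = 3

# Fold sizes when 50,000 samples are split into 3 nearly equal parts.
base, extra = divmod(n_train, cv_folds)
fold_sizes = [base + 1 if i < extra else base for i in range(cv_folds)]


def steps(n_samples, batch_size):
    """Number of batches needed to cover n_samples once."""
    return math.ceil(n_samples / batch_size)


for held_out in fold_sizes:
    fit_size = n_train - held_out
    print(
        held_out,
        fit_size,
        steps(fit_size, 32),   # training steps with batch_size=32
        steps(fit_size, 64),   # training steps with batch_size=64
        steps(held_out, 32),   # prediction steps on the held-out fold
    )

refit_steps = steps(n_train, 64)  # refit on all data with batch_size=64
print(refit_steps)
```

Under these assumptions, the `521/521` lines without a loss value are most plausibly predictions on a held-out fold, and the `521/521` lines with a loss are training runs at batch size 64.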


# Result

In [10]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Best: 0.100460 using {'batch_size': 64, 'optimizer__learning_rate': 0.1}
0.100180 (0.000110) with: {'batch_size': 32, 'optimizer__learning_rate': 0.001}
0.100300 (0.000198) with: {'batch_size': 32, 'optimizer__learning_rate': 0.01}
0.100220 (0.000160) with: {'batch_size': 32, 'optimizer__learning_rate': 0.1}
0.100440 (0.000215) with: {'batch_size': 64, 'optimizer__learning_rate': 0.001}
0.100440 (0.000340) with: {'batch_size': 64, 'optimizer__learning_rate': 0.01}
0.100460 (0.000708) with: {'batch_size': 64, 'optimizer__learning_rate': 0.1}
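Before trusting the "best" configuration, it helps to compare the scores against chance level. The sketch below copies the printed means and standard deviations from the output above into a list and compares them with the 1/10 accuracy expected from random guessing on ten balanced classes:

```python
# (mean_test_score, std_test_score, params), copied from the output above.
results = [
    (0.100180, 0.000110, {"batch_size": 32, "optimizer__learning_rate": 0.001}),
    (0.100300, 0.000198, {"batch_size": 32, "optimizer__learning_rate": 0.01}),
    (0.100220, 0.000160, {"batch_size": 32, "optimizer__learning_rate": 0.1}),
    (0.100440, 0.000215, {"batch_size": 64, "optimizer__learning_rate": 0.001}),
    (0.100440, 0.000340, {"batch_size": 64, "optimizer__learning_rate": 0.01}),
    (0.100460, 0.000708, {"batch_size": 64, "optimizer__learning_rate": 0.1}),
]

chance = 1 / 10  # random guessing on 10 balanced classes

best_mean, best_std, best_params = max(results, key=lambda r: r[0])
means = [mean for mean, _, _ in results]
spread = max(means) - min(means)
largest_std = max(std for _, std, _ in results)

print(best_params, best_mean)
print(f"spread of means: {spread:.6f}, largest std: {largest_std:.6f}")
print(f"largest gap to chance: {max(abs(m - chance) for m in means):.6f}")
```

Every configuration sits within a fraction of a percentage point of chance, and the gap between the best and worst mean is smaller than the fold-to-fold standard deviation of the top entry. This suggests the models did not learn in this run, so the ranking probably says little about which hyperparameters are better. Likely causes are the unscaled inputs passed to `fit` and the single training epoch.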
