# Keras Scikit-Learn Wrappers in Hyperparameter Fine-Tuning
There are many techniques for exploring a hyperparameter search space, and it is advisable to use a Python library rather than writing the search by hand. Some libraries that can be used for this are listed below.



1.   Hyperas (https://github.com/maxpumperla/hyperas) - A convenience wrapper around Hyperopt for optimizing the hyperparameters of Keras models.
2.   Scikit-Optimize (https://scikit-optimize.github.io) - A sequential model-based optimization library. Its BayesSearchCV class performs Bayesian optimization and has an interface similar to GridSearchCV.
3.   Sklearn-Deap (https://github.com/rsteca/sklearn-deap) - An evolutionary algorithms library with a GridSearchCV-like interface.


Alternatively, Scikit-Learn itself provides the GridSearchCV and RandomizedSearchCV classes, which can be used for the same process. To use them, we have to convert the Keras model into a Scikit-Learn object. The Keras wrappers allow the developer to wrap a Keras model in an object that mimics a regular Scikit-Learn regressor.
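
To make "mimics a regular Scikit-Learn regressor" concrete, here is a minimal sketch of the interface that search classes rely on: `get_params`, `set_params`, `fit` and `predict`. It is not the Keras wrapper itself; `BuildFnRegressor`, `MeanModel` and `build_mean_model` are hypothetical names used only for illustration, and the "model" is a trivial mean predictor so that the example runs without TensorFlow.

```python
class BuildFnRegressor:
    """Toy stand-in for a wrapper: stores a build function plus hyperparameters."""

    def __init__(self, build_fn, **params):
        self.build_fn = build_fn
        self.params = dict(params)
        self.model_ = None

    def get_params(self, deep=True):
        # Search classes read the current hyperparameters through this method.
        return {"build_fn": self.build_fn, **self.params}

    def set_params(self, **params):
        # Search classes write candidate hyperparameters through this method.
        params = dict(params)
        self.build_fn = params.pop("build_fn", self.build_fn)
        self.params.update(params)
        return self

    def fit(self, X, y):
        # A fresh model is built from the current hyperparameters on every fit.
        self.model_ = self.build_fn(**self.params)
        self.model_.fit(X, y)
        return self

    def predict(self, X):
        return self.model_.predict(X)


class MeanModel:
    """Hypothetical model: predicts the training mean multiplied by `scale`."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.mean_ = 0.0

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)

    def predict(self, X):
        return [self.scale * self.mean_ for _ in X]


def build_mean_model(scale=1.0):
    # Plays the role of the model-building function passed to the wrapper.
    return MeanModel(scale=scale)


reg = BuildFnRegressor(build_mean_model, scale=1.0)
reg.set_params(scale=2.0)                      # what a search does per candidate
reg.fit([[0], [1], [2]], [1.0, 2.0, 3.0])
predictions = reg.predict([[5], [6]])
```

The key design point is that the wrapper keeps the build function and the hyperparameters separate, so a search can change the hyperparameters and rebuild the model from scratch for each candidate.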

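The keras.wrappers.scikit_learn.KerasRegressor class takes as its argument a function that builds and returns a compiled Keras model. Hence, the first step is to create this function. Note that newer TensorFlow releases have removed this module in favor of the separate SciKeras package, so the code below assumes a TensorFlow version that still ships it.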

In [67]:
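from tensorflow import keras  # repeated here so that this cell runs on its own

def model_fun(hidden=3, neurons=30, learning_rate=0.003, input_shape=(8,), activation="relu"):
  model = keras.models.Sequential()
  model.add(keras.layers.InputLayer(input_shape=input_shape))
  # Stack `hidden` dense layers of `neurons` units each
  for _ in range(hidden):
    model.add(keras.layers.Dense(neurons, activation=activation))
  # Single output unit for the regression target
  model.add(keras.layers.Dense(1))
  # `learning_rate` replaces the deprecated `lr` argument
  optimizer = keras.optimizers.SGD(learning_rate=learning_rate)
  model.compile(loss="mse", optimizer=optimizer)
  return model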

In [68]:
# Imports and setup
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

import tensorflow as tf
from tensorflow import keras

from scipy.stats import uniform
from sklearn.model_selection import RandomizedSearchCV

In [69]:
# Data preprocessing
housing = fetch_california_housing()
X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)

# Pretend these are new instances
X_new = X_test[:3]
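
The preprocessing cell above fits the scaler on the training split only and then reuses the same statistics for the validation and test splits. As a minimal sketch of what that means for a single feature, the example below standardizes made-up numbers with the training mean and the population standard deviation (the convention StandardScaler uses); `train`, `valid` and `standardize` are hypothetical names for illustration.

```python
from statistics import mean, pstdev

# Made-up values for one feature
train = [2.0, 4.0, 6.0, 8.0]
valid = [5.0, 10.0]

# Statistics are estimated on the training split only
mu = mean(train)
sigma = pstdev(train)  # population standard deviation (divides by n)


def standardize(values):
    # The same training statistics are applied to every split
    return [(v - mu) / sigma for v in values]


train_scaled = standardize(train)
valid_scaled = standardize(valid)
```

After this step the training values have zero mean and unit variance, while the validation values generally do not, which is expected: they are expressed in units of the training distribution.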

The KerasRegressor object is created by passing it the model_fun() function. It then behaves like a classic Scikit-Learn object (i.e. a regressor), and we can train it, evaluate it and make predictions with it.

In [70]:
keras_reg = keras.wrappers.scikit_learn.KerasRegressor(model_fun)
keras_reg.fit(X_train, y_train, epochs=50, validation_data=(X_valid, y_valid), callbacks=[keras.callbacks.EarlyStopping(patience=10)])
# score() follows the Scikit-Learn "higher is better" convention,
# so the value returned here is the negated MSE
neg_mse = keras_reg.score(X_test, y_test)
y_pred = keras_reg.predict(X_new)

Epoch 1/50
...
Epoch 50/50


In [71]:
print(y_pred)

[1.853442  1.8426552 0.8084258]
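
A note on the value returned by score() in the cell above: Scikit-Learn expects scores where higher is better, so, as far as we know, the legacy KerasRegressor wrapper returns the negated loss rather than the loss itself. The sketch below illustrates this sign convention on made-up numbers; `mse_loss`, `y_true_demo` and `y_pred_demo` are hypothetical names and are not part of the notebook.

```python
def mse_loss(y_true, y_pred):
    # Mean squared error: lower is better
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up targets and predictions
y_true_demo = [1.0, 2.0, 3.0]
y_pred_demo = [1.5, 2.0, 2.0]

loss = mse_loss(y_true_demo, y_pred_demo)  # what model.evaluate() would report
score = -loss                              # "higher is better" convention
recovered_mse = -score                     # flip the sign to get the MSE back
```

This is why the best score reported by a search over such a regressor is typically a negative number.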


During hyperparameter fine-tuning, we want to train and evaluate different combinations of hidden layers, neurons, activation functions, learning rates, etc. We can use the standard Scikit-Learn search classes for this step in order to select the best model for our case study. The RandomizedSearchCV constructor accepts, among others, the following arguments:

*   The Scikit-Learn object (the estimator)
*   The parameters (i.e. a dictionary that maps each hyperparameter to our proposed values)
*   The number of iterations (n_iter, the number of sampled combinations)
*   The number of jobs (n_jobs, the number of fits to run in parallel)

The process may take a few hours depending on the complexity of the task, that is, on the combination of model and dataset.
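
To see where the running time comes from, here is a minimal pure-Python sketch of a random search: each iteration draws one value per hyperparameter and evaluates the candidate on every cross-validation fold. It is not Scikit-Learn's implementation; `search_space`, `toy_cv_score` and `random_search_sketch` are hypothetical, and the fold scores are made up instead of coming from a trained network. Unlike RandomizedSearchCV with list-valued parameters, this sketch may draw the same combination twice.

```python
import random

# Hypothetical search space, shaped like the dictionary passed to RandomizedSearchCV
search_space = {
    "hidden": [1, 2, 3],
    "neurons": list(range(20, 50)),
    "learning_rate": [3e-4, 3e-3, 3e-2],
}


def toy_cv_score(params, n_folds=3):
    """Stand-in for cross-validation: mean of n_folds made-up fold scores.

    A real search would train on n_folds - 1 folds and score on the held-out one.
    """
    fold_scores = []
    for fold in range(n_folds):
        # Made-up score (higher is better) that peaks at learning_rate = 3e-3
        penalty = abs(params["learning_rate"] - 3e-3) * 100 + 0.01 * fold
        fold_scores.append(-penalty)
    return sum(fold_scores) / n_folds


def random_search_sketch(space, n_iter, n_folds=3, seed=0):
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    n_fits = 0
    for _ in range(n_iter):
        # Draw one value per hyperparameter, independently and uniformly
        candidate = {name: rng.choice(values) for name, values in space.items()}
        score = toy_cv_score(candidate, n_folds)
        n_fits += n_folds  # one model fit per fold
        if score > best_score:
            best_params, best_score = candidate, score
    return best_params, best_score, n_fits


best_params, best_score, n_fits = random_search_sketch(search_space, n_iter=10, n_folds=3)
```

With 10 iterations and 3 folds, the search trains 30 models, and Scikit-Learn by default refits the best candidate once more on the whole training set. Each of those fits can run for up to the full number of epochs, which is why the total time grows quickly.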

In [72]:
parameters = {
  "hidden": [1, 2, 3, 4, 5, 6],
  "neurons": list(range(20, 50)),
  "learning_rate": [3e-6, 3e-5, 3e-4, 3e-3, 3e-2],
  "activation": ["relu", "tanh"],
}
random_search = RandomizedSearchCV(keras_reg, parameters, n_iter=10, cv=3)
random_search.fit(X_train, y_train, epochs=100, validation_data=(X_valid, y_valid), callbacks=[keras.callbacks.EarlyStopping(patience=10)])

print("Best parameters: {}".format(random_search.best_params_))
# best_score_ is the mean cross-validated score of the best candidate
print("Best Score: {}".format(random_search.best_score_))

Streaming output truncated to the last 5000 lines.
Epoch 55/100
...
Epoch 100/100
Epoch 1/100
...

Saving this model is simple enough. We can now save it as an HDF5 (.h5) file, load it back and evaluate it on our test set.

In [73]:
# Underlying Keras model of the best estimator found by the search
model = random_search.best_estimator_.model
# Save the model in HDF5 format
model.save("best-model.h5")
# Load the model back from disk
loaded_model = keras.models.load_model("best-model.h5")
# Evaluate on the test set (returns the MSE loss)
mse = loaded_model.evaluate(X_test, y_test)
# Predict on the instances we treat as new data
y_pred = model.predict(X_new)
print(y_pred)

[[1.6239903 ]
 [1.5181246 ]
 [0.72772276]]
