# Hyperparameter Tuning

In this notebook I want to show you how to use scikit-learn's grid search capability and give you an example that you can copy and paste into your own project as a starting point.


We will not go into detail on some of the steps, since they have already been covered several times in previous notebooks.

## 1. Load the knowledge base
First of all, you need to load the knowledge base, i.e. the training data contained in one of the files generated in the previous notebook. Set `m`, `N` and `num_of_matches` to select the right file.

To do this:

In [None]:
import pandas

# These parameters must be set to load the correct training set

m = 1
N = 10
num_of_matches = 10

path = 'output/train_set_m{}/num_of_matches_{}.txt'.format(m, num_of_matches)
# Pass the separator by keyword: recent pandas versions reject positional arguments after the path
dataset = pandas.read_csv(path, sep=',', header=None)
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values



print("Dataset: " + path + '\n')
print(dataset)
print("\nx:")
print(X)
print("\ny:")
print(y)

#### Use the Label Encoding and One Hot Encoding!
A label encoder converts categorical (text) labels into integers, which predictive models can work with. One-hot encoding then takes the label-encoded column and splits it into one column per class: each row has a 1 in the column of its class and 0s everywhere else.

With a `softmax` output layer, this ensures that each example has a target probability of 1.0 for its actual class and 0.0 for all other classes. The encoding can be produced with the Keras `to_categorical()` function.
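To make the two encoding steps concrete, here is a minimal NumPy-only sketch of what they compute. The label values are hypothetical and chosen only for illustration; the notebook itself uses `LabelEncoder` and `to_categorical()`.

```python
import numpy as np

# Hypothetical class labels, used only for illustration
labels = np.array(["draw", "win", "loss", "win", "draw"])

# Label encoding: map each distinct label to an integer index.
# np.unique returns the sorted classes and, with return_inverse=True,
# the position of every original label within that sorted array.
classes, encoded = np.unique(labels, return_inverse=True)

# One-hot encoding: row i of the identity matrix is the vector for class i.
one_hot = np.eye(len(classes))[encoded]

print(classes)
print(encoded)
print(one_hot)
```

Each row of `one_hot` contains exactly one 1, in the column of that example's class.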

In [None]:
from sklearn.preprocessing import LabelEncoder
from keras.utils import to_categorical

encoder = LabelEncoder()
encoder.fit(y)
encoded_y = encoder.transform(y)
y_tc = to_categorical(encoded_y, 4)
print(y)
print("is converted into")
print(encoded_y)
print("\n one hot encoding")
print(y_tc)

## 2. Split your data!
All you have to do is divide your data into a **training set** and a **test set**, because later we want to evaluate the classifier's performance on examples it has not seen.

To do this, invoke these simple commands:

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y_tc, test_size=0.3, random_state=4)

# print the shapes of the new X objects
print("\nTraining set dimensions (X_train):")
print(X_train.shape)
print("\nTest set dimensions (X_test):")
print(X_test.shape)

# print the shapes of the new y objects
print("\nTraining set dimensions (y_train):")
print(y_train.shape)
print("\nTest set dimensions (y_test):")
print(y_test.shape)

#### Scaling data
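Many machine learning algorithms require features to be on the same scale. In addition, optimization algorithms such as gradient descent tend to work best when each feature is centered at zero with a standard deviation of one. `StandardScaler` achieves this by subtracting the mean of each feature and dividing by its standard deviation. Note that this changes the scale of the data but does not make it normally distributed. The scaler is fitted on the training set only, and the same transformation is then applied to the test set.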
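As a rough illustration of what standardization computes, the sketch below applies the same formula by hand with NumPy. The arrays are hypothetical toy data, not the notebook's dataset.

```python
import numpy as np

# Hypothetical toy data: 4 training rows and 2 test rows, 2 features each
X_train_demo = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0], [4.0, 400.0]])
X_test_demo = np.array([[2.5, 250.0], [5.0, 500.0]])

# The statistics are estimated on the training rows only
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population standard deviation (ddof=0)

# The same mean and std are reused for the test rows
train_scaled = (X_train_demo - mean) / std
test_scaled = (X_test_demo - mean) / std

print(train_scaled.mean(axis=0))  # approximately zero for every feature
print(train_scaled.std(axis=0))   # one for every feature
print(test_scaled)
```

Reusing the training statistics for the test rows keeps information about the test set out of the preprocessing step.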

In [None]:
from sklearn.preprocessing import StandardScaler


# Fit the scaler on the training set only
scaler = StandardScaler().fit(X_train)
# Scale the training set
X_train = scaler.transform(X_train)
# Scale the test set
X_test = scaler.transform(X_test)

## 3. Use Keras Models in scikit-learn

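Keras models can be used in scikit-learn by wrapping them with the `KerasClassifier` or `KerasRegressor` class. In recent Keras releases these wrappers have been removed in favor of the separate SciKeras package, so the import below assumes an older Keras version.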

To use these wrappers you must define a function that creates and returns your Keras sequential model, then pass this function to the `build_fn` argument when constructing the KerasClassifier class.

The hyperparameters we want to search must appear as formal parameters of the `create_model` function.

For example:

In [None]:
from keras.models import Sequential
from keras.layers import Activation, Dense, Dropout
from keras.wrappers.scikit_learn import KerasClassifier


def create_model(layers, activation, dropout_rate):
    model = Sequential()
    for i, nodes in enumerate(layers):
        if i == 0:
            # The first layer must be told the number of input features
            model.add(Dense(nodes, input_dim=X_train.shape[1]))
        else:
            model.add(Dense(nodes))
        model.add(Activation(activation))
        model.add(Dropout(dropout_rate))
    model.add(Dense(4))  # Output layer: one unit per class, followed by softmax
    model.add(Activation('softmax'))

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, verbose=0)


## 4. Create a dictionary of hyperparameters

As you may have guessed, the hyperparameters we want to search over are the hidden layer sizes, the activation function and the dropout rate, which are all arguments of `create_model`. The grid below also includes `batch_size` and `epochs`, which the wrapper passes on to the model's `fit` method.

You can search over any hyperparameter; the only rule is that all of them must be listed in a single dictionary.


In [None]:
layers = [[45, 30, 15]]
activations = ['sigmoid', 'relu', 'elu']
dropout_rate = [0.0, 0.1]
param_grid = dict(
    layers=layers,
    activation=activations,
    batch_size=[60, 128, 256],
    epochs=[200],
    dropout_rate=dropout_rate,
)

## 5. The GridSearchCV class

Grid search is a model hyperparameter optimization technique.

In scikit-learn this technique is provided in the `GridSearchCV` class.

When constructing this class you must provide a dictionary of hyperparameters to evaluate in the `param_grid` argument. This is a map from each model parameter name to a list of values to try.

By default, the score that is optimized is the one returned by the estimator's `score` method, which for `KerasClassifier` is accuracy. Other scores can be specified in the `scoring` argument of the `GridSearchCV` constructor.

By default, the grid search will only use one thread. By setting the `n_jobs` argument in the `GridSearchCV` constructor to -1, the process will use all cores on your machine. Depending on your Keras backend, this may interfere with the main neural network training process.

The `GridSearchCV` process will then construct and evaluate one model for each combination of parameters. Cross validation is used to evaluate each individual model. The default is 5-fold cross validation (3-fold in scikit-learn versions before 0.22), although this can be overridden by specifying the `cv` argument to the `GridSearchCV` constructor.
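To get a feel for how much work the search involves, the following sketch enumerates the combinations for the grid from section 4 using only the standard library. The name `param_grid_demo` and the value of `cv_folds` are assumptions made for this illustration; use the `cv` value you actually pass to `GridSearchCV`.

```python
from itertools import product

# Same values as the param_grid defined in section 4
param_grid_demo = {
    "layers": [[45, 30, 15]],
    "activation": ["sigmoid", "relu", "elu"],
    "batch_size": [60, 128, 256],
    "epochs": [200],
    "dropout_rate": [0.0, 0.1],
}

# Cartesian product of all value lists: one dict per candidate model
names = list(param_grid_demo)
combinations = [
    dict(zip(names, values)) for values in product(*param_grid_demo.values())
]

cv_folds = 5  # assumption: replace with the cv value used in your search
total_fits = len(combinations) * cv_folds

print(len(combinations))  # 18 candidate combinations
print(total_fits)         # 90 model fits during cross validation
print(combinations[0])
```

With 200 epochs per fit, the number of fits is worth checking before launching the search, since every extra value in the grid multiplies the total.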

In [None]:
from sklearn.model_selection import GridSearchCV

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=-1)
grid_result = grid.fit(X_train, y_train)
# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

Once completed, you can access the outcome of the grid search in the result object returned from `grid.fit()`. The `best_score_` member provides access to the best score observed during the optimization procedure, and `best_params_` describes the combination of parameters that achieved the best results.
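Beyond the single best result, the fitted search object also exposes the scores of every combination in `cv_results_`. The sketch below shows one way to inspect them, and how the `scoring` and `cv` arguments are passed. It deliberately uses a small scikit-learn classifier on synthetic data as a stand-in for the Keras model so that it runs quickly; `X_demo`, `y_demo` and `search` are names introduced for this example only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data, not the notebook's training set
X_demo, y_demo = make_classification(n_samples=200, n_features=8, random_state=0)

search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},
    scoring="accuracy",  # metric to optimize
    cv=3,                # number of cross validation folds
)
search.fit(X_demo, y_demo)

# One entry per parameter combination, in the same order as results["params"]
results = search.cv_results_
for params, mean, std in zip(
    results["params"], results["mean_test_score"], results["std_test_score"]
):
    print("%f (+/- %f) with %r" % (mean, std, params))

print("Best: %f using %s" % (search.best_score_, search.best_params_))
```

The same loop should work on the `grid_result` object from the cell above, since it is also a fitted `GridSearchCV` instance.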