# Hyperparameter optimization

We will now go back to the example from the lab on the multilayer perceptron and try to optimize it with the Keras Tuner library: https://keras.io/guides/keras_tuner/getting_started/

First, we load the data from the lab.

In [1]:
import pandas as pd
import numpy as np

data_read = pd.read_csv('data_cancer.csv')

# Copy data in variable data (copy)
data = data_read.copy()

# Select last column as label (pop)
data_labels = data.pop('Classification')

# Convert to a NumPy array and subtract 1 so that the classes are 0 and 1
data_labels = np.array(data_labels) - 1

# Convert into numpy array 
data_features = np.array(data)

# Shuffle index
ind = np.arange(0, np.shape(data_features)[0])
np.random.shuffle(ind)

# Apply the shuffled order to both features and labels
data_features = data_features[ind]
data_labels = data_labels[ind]

# Split into train (60%), validation (20%) and test (20%) sets
# n_train: index where the training set ends
# n_val: index where the validation set ends
n_train = int(np.shape(data_features)[0] * 0.6)
n_val = int(np.shape(data_features)[0] * 0.8)

x_train, x_val, x_test = data_features[0:n_train], data_features[n_train:n_val], data_features[n_val:]
y_train, y_val, y_test = data_labels[0:n_train], data_labels[n_train:n_val], data_labels[n_val:]
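
The split above depends on an unseeded shuffle, so the three sets change on every run. As a minimal sketch, assuming a reproducible split is wanted, the same 60/20/20 logic can be wrapped in a helper that takes a seed. The name `split_indices`, its parameters and the toy arrays are illustrative assumptions, not part of the lab.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.6, val_frac=0.2, seed=0):
    """Return shuffled index arrays for the train, validation and test sets."""
    rng = np.random.default_rng(seed)      # seeded generator: same split on every run
    ind = rng.permutation(n_samples)       # shuffled copy of 0..n_samples-1
    n_train = int(n_samples * train_frac)  # end of the training slice
    n_val = n_train + int(n_samples * val_frac)  # end of the validation slice
    return ind[:n_train], ind[n_train:n_val], ind[n_val:]

# Toy arrays standing in for data_features / data_labels (hypothetical values)
features = np.arange(20).reshape(10, 2)
labels = np.arange(10) % 2

train_idx, val_idx, test_idx = split_indices(len(features))
x_train_demo, y_train_demo = features[train_idx], labels[train_idx]
print(len(train_idx), len(val_idx), len(test_idx))
```

Indexing features and labels with the same index array keeps each sample paired with its label, as the shuffle in the cell above does.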

In [2]:
from keras.models import Sequential
from keras.layers import Dense, Normalization



We can now define a function that takes the hyperparameters `hp` as input. We can then try to optimize them.

In [3]:
def build_model(hp):
    layer_norm = Normalization()
    layer_norm.adapt(x_train)

    model = Sequential()
    # add layers
    model.add(layer_norm)
    model.add(Dense(hp.Int("dense1", min_value=1, max_value=200), activation='relu', input_dim=np.shape(x_train)[1]))
    model.add(Dense(30, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))

    # Compilation
    model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy', 'AUC'])
    
    return model

We can then build a model with the `build_model` function to check it, and use the same function to create an instance of our tuner.

In [4]:
import keras_tuner

model = build_model(keras_tuner.HyperParameters())
model.summary()

tuner = keras_tuner.BayesianOptimization(
    hypermodel=build_model,
    objective="val_accuracy", 
    max_trials=50,
    executions_per_trial=2,
    overwrite=True,
)

Using TensorFlow backend
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 normalization (Normalizatio  (None, 9)                19        
 n)                                                              
                                                                 
 dense (Dense)               (None, 1)                 10        
                                                                 
 dense_1 (Dense)             (None, 30)                60        
                                                                 
 dense_2 (Dense)             (None, 1)                 31        
                                                                 
Total params: 120
Trainable params: 101
Non-trainable params: 19
_________________________________________________________________


Finally, we can run the search on the training data, with the validation set used to score each trial. At this step we must also choose the number of epochs used for each trial. The best models can be retrieved with `get_best_models`.

In [5]:
tuner.search(x_train, y_train, epochs=10, validation_data=(x_val, y_val))

models = tuner.get_best_models(num_models=2)
best_model = models[0]

best_model.summary()

Trial 50 Complete [00h 00m 03s]
val_accuracy: 0.679999977350235

Best val_accuracy So Far: 0.7299999892711639
Total elapsed time: 00h 02m 26s
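
To make the roles of `max_trials` and `executions_per_trial` more concrete, here is a minimal sketch of a search loop. It is not Keras Tuner's implementation: plain random sampling replaces Bayesian optimization, and the hypothetical `noisy_score` function stands in for "train a model and return its validation accuracy". Averaging the repeated executions is an assumption of this sketch, made to reduce the effect of run-to-run noise.

```python
import random
import statistics

def noisy_score(units, rng):
    """Hypothetical stand-in for training a model and returning val_accuracy."""
    # The score peaks near 64 units; the Gaussian noise mimics training variability.
    return 0.75 - 0.00001 * (units - 64) ** 2 + rng.gauss(0, 0.01)

def random_search(max_trials, executions_per_trial, seed=0):
    """Try max_trials candidates, scoring each one executions_per_trial times."""
    rng = random.Random(seed)
    best_units, best_score = None, float("-inf")
    for _ in range(max_trials):
        units = rng.randint(1, 200)  # one candidate value per trial
        runs = [noisy_score(units, rng) for _ in range(executions_per_trial)]
        score = statistics.mean(runs)  # average over the repeated executions
        if score > best_score:
            best_units, best_score = units, score
    return best_units, best_score

best_units, best_score = random_search(max_trials=50, executions_per_trial=2)
print(best_units, round(best_score, 3))
```

With a noisy objective, raising `executions_per_trial` makes each trial's score more reliable, at the cost of proportionally more training runs.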



Once the best model has been retrieved, it can be trained as we saw in the previous lab.

In [None]:
from utils import draw_history

# Learning step
history = best_model.fit(x_train, y_train,
          epochs=50,            # nb of epochs
          validation_split=0.2) # % of data used for the validation


draw_history(history)

score = best_model.evaluate(x_test, y_test)
print("The cost function on the test set is %.3f. The rate of correct prediction is %.2f"%(score[0], score[1]))


output_predict = best_model.predict(x_test)
for sample_predict, sample_true in zip(output_predict[0:5], y_test[0:5]):
    print(sample_predict, sample_true)


When building a model with hyperparameters, four types of parameters can be used:

- An integer: `hp.Int(name, min_value, max_value)`

- A choice from a list: `hp.Choice(name, values)`

- A boolean: `hp.Boolean(name)`

- A floating-point number: `hp.Float(name, min_value, max_value)`

With this information, try to optimize your network:

- What is the optimal number of layers?
- Is adding Dropout useful?
- What are the best activation functions for our layers?

In [None]:
from keras import layers

def build_better_model(hp):
    layer_norm = Normalization()
    layer_norm.adapt(x_train)

    model = Sequential()
    # add layers
    model.add(layer_norm)
    for i in range(hp.Int("num_layers", 1, 10)):
        model.add(Dense(hp.Int(f"num_units_{i}", min_value=1, max_value=200), activation=hp.Choice(f"activation_{i}", ["relu", "tanh"])))
        if hp.Boolean(f"dropout_{i}"):
            model.add(layers.Dropout(rate=0.25))

    model.add(Dense(1, activation='sigmoid'))

    # Compilation
    model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy', 'AUC'])

    # Check the model
    model.summary()
    
    return model

In [None]:
import keras_tuner

model = build_better_model(keras_tuner.HyperParameters())
model.summary()
tuner = keras_tuner.BayesianOptimization(
    hypermodel=build_better_model,
    objective="val_accuracy", 
    max_trials=50,
    executions_per_trial=2,
    overwrite=True,
)

In [None]:
tuner.search(x_train, y_train, epochs=10, validation_data=(x_val, y_val))

models = tuner.get_best_models(num_models=2)
best_model = models[0]
best_model.summary()

In [None]:
from utils import draw_history

# Learning step
history = best_model.fit(x_train, y_train,
          epochs=50,            # nb of epochs
          validation_split=0.2) # % of data used for the validation


draw_history(history)

score = best_model.evaluate(x_test, y_test)
print("The cost function on the test set is %.3f. The rate of correct prediction is %.2f"%(score[0], score[1]))


output_predict = best_model.predict(x_test)
for sample_predict, sample_true in zip(output_predict[0:5], y_test[0:5]):
    print(sample_predict, sample_true)