### Keras Tuner on the MNIST problem

Keras Tuner is a fairly recent library that greatly simplifies tuning the hyperparameters of a neural network. The full documentation is available at this link:

https://keras-team.github.io/keras-tuner/

In [1]:
import keras_tuner as kt

Imports and data loading:

In [2]:
import tensorflow as tf
from tensorflow import keras

import numpy as np
import matplotlib.pyplot as plt
from time import time
import shutil

In [3]:
(img_train, label_train), (img_test, label_test) = keras.datasets.mnist.load_data()

Normalization of the pixel values to the range [0, 1]:

In [4]:
img_train = img_train.astype('float32') / 255.0
img_test = img_test.astype('float32') / 255.0

The first step is to define a hypermodel: a function that builds a Keras model whose configuration depends on the hyperparameters we want to explore. The hyperparameters are sampled through the ``hp`` argument of the function.

In this example we tune only two hyperparameters: the L2 regularization constant of the hidden layer (``lambda``) and the learning rate of the optimizer (``lr``):

In [5]:
def model_builder(hp):
  # Hyperparameters to tune: L2 regularization strength and learning rate.
  hp_lambda = hp.Choice('lambda', values = [1.0, 0.1, 0.01, 0.001, 0.0001])
  hp_lr = hp.Choice('lr', values = [1.0, 0.1, 0.01, 0.001, 0.0001])

  # One hidden layer of 50 ReLU units with L2-regularized weights.
  model = keras.Sequential()
  model.add(keras.layers.Input(shape=(28, 28)))
  model.add(keras.layers.Flatten())
  model.add(keras.layers.Dense(units = 50, activation = 'relu', kernel_regularizer=keras.regularizers.l2(hp_lambda)))
  model.add(keras.layers.Dense(10, activation="softmax"))

  # The metric is named 'acc', so the tuner objective is 'val_acc'.
  model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_lr),
                loss='sparse_categorical_crossentropy',
                metrics=['acc'])

  return model

We delete the log directory so that the search starts from scratch:

In [6]:
!rm -rf my_dir/intro_to_kt/

The next step is to create a ``tuner`` that carries out the hyperparameter search. Several types are available:

- RandomSearch
- Hyperband
- BayesianOptimization
- Sklearn

The simplest option is a random search with ``RandomSearch``. When creating the ``tuner`` we have to specify:

- The hypermodel.
- The quantity to optimize (the objective).
- The total number of trials.
- The number of executions per trial.

In [7]:
tuner = kt.RandomSearch(model_builder,
                        objective='val_acc',
                        max_trials=10,
                        executions_per_trial=3,
                        directory='my_dir',
                        project_name='intro_to_kt')
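
To make the roles of ``max_trials`` and ``executions_per_trial`` concrete, here is a rough, library-free sketch of the bookkeeping behind a random search over the same 5 x 5 grid. It is not Keras Tuner's implementation: the helper name ``sample_trials``, the fixed seed, and the choice to sample without replacement are assumptions made for illustration.

```python
import itertools
import random

# Candidate values copied from model_builder above.
LAMBDAS = [1.0, 0.1, 0.01, 0.001, 0.0001]
LRS = [1.0, 0.1, 0.01, 0.001, 0.0001]


def sample_trials(max_trials, seed=0):
    """Pick `max_trials` distinct (lambda, lr) pairs uniformly at random.

    Hypothetical helper: sampling without replacement is an assumption.
    """
    grid = list(itertools.product(LAMBDAS, LRS))
    rng = random.Random(seed)
    return rng.sample(grid, k=min(max_trials, len(grid)))


max_trials = 10
executions_per_trial = 3

grid_size = len(LAMBDAS) * len(LRS)
trials = sample_trials(max_trials)
total_fits = len(trials) * executions_per_trial

print(f"combinations in the grid: {grid_size}")
print(f"combinations actually tried: {len(trials)}")
print(f"models trained in total: {total_fits}")
```

With these settings only 10 of the 25 combinations are visited, and each one is trained 3 times, so 30 models are fitted. Repeating each trial reduces the influence of the random weight initialization on the score reported for that trial.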

A summary of the search space:

In [8]:
tuner.search_space_summary()

Search space summary
Default search space size: 2
lambda (Choice)
{'default': 1.0, 'conditions': [], 'values': [1.0, 0.1, 0.01, 0.001, 0.0001], 'ordered': True}
lr (Choice)
{'default': 1.0, 'conditions': [], 'values': [1.0, 0.1, 0.01, 0.001, 0.0001], 'ordered': True}


And we launch the search:

In [9]:
tuner.search(img_train, label_train,
             epochs=1,
             validation_data=(img_test, label_test))

Trial 10 Complete [00h 00m 21s]
val_acc: 0.8803999821345011

Best val_acc So Far: 0.9199000000953674
Total elapsed time: 00h 03m 55s


Accessing the best model. Keep in mind that it is already trained (here only for the single epoch used during the search), so it is usually better to retrain it on all the data.

In [10]:
best_model = tuner.get_best_models()[0]
best_model.evaluate(img_test, label_test)

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - acc: 0.9089 - loss: 0.4729


[0.43340086936950684, 0.920799970626831]

Summary of the results:

In [11]:
tuner.results_summary()

Results summary
Results in my_dir/intro_to_kt
Showing 10 best trials
Objective(name="val_acc", direction="max")

Trial 04 summary
Hyperparameters:
lambda: 0.01
lr: 0.001
Score: 0.9199000000953674

Trial 02 summary
Hyperparameters:
lambda: 0.001
lr: 0.0001
Score: 0.9015666643778483

Trial 00 summary
Hyperparameters:
lambda: 0.0001
lr: 0.0001
Score: 0.9011333187421163

Trial 03 summary
Hyperparameters:
lambda: 0.01
lr: 0.0001
Score: 0.8956000010172526

Trial 09 summary
Hyperparameters:
lambda: 0.01
lr: 0.01
Score: 0.8803999821345011

Trial 08 summary
Hyperparameters:
lambda: 1.0
lr: 0.001
Score: 0.8059333364168803

Trial 05 summary
Hyperparameters:
lambda: 1.0
lr: 0.01
Score: 0.6856666604677836

Trial 07 summary
Hyperparameters:
lambda: 0.0001
lr: 0.1
Score: 0.5072000126043955

Trial 01 summary
Hyperparameters:
lambda: 0.1
lr: 1.0
Score: 0.10339999943971634

Trial 06 summary
Hyperparameters:
lambda: 0.001
lr: 1.0
Score: 0.10053333640098572


We retrieve the hyperparameters of the best model and retrain it:

In [12]:
best_hps = tuner.get_best_hyperparameters()[0]
model = tuner.hypermodel.build(best_hps)
model.fit(img_train, label_train, epochs = 10, validation_data = (img_test, label_test))

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 8s 3ms/step - acc: 0.8468 - loss: 0.8775 - val_acc: 0.9239 - val_loss: 0.4169
Epoch 2/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - acc: 0.9173 - loss: 0.4286 - val_acc: 0.9341 - val_loss: 0.3642
Epoch 3/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - acc: 0.9299 - loss: 0.3698 - val_acc: 0.9432 - val_loss: 0.3211
Epoch 4/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 9s 2ms/step - acc: 0.9391 - loss: 0.3331 - val_acc: 0.9433 - val_loss: 0.3155
Epoch 5/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - acc: 0.9431 - loss: 0.3170 - val_acc: 0.9427 - val_loss: 0.2979
Epoch 6/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - acc: 0.9466 - loss: 0.2966 - val_acc: 0.9494 - val_loss: 0.2816
Epoch 7/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s
[output truncated]

<keras.src.callbacks.history.History at 0x7fa695ca00d0>

#### Another example:

Tuning the learning rate, the regularization parameter, and the number of neurons in the hidden layer with a Hyperband tuner.

https://arxiv.org/pdf/1603.06560.pdf

Hypermodel:

In [14]:
def model_builder_2(hp):
  # Hyperparameters to tune: L2 strength, hidden-layer width, and learning rate.
  hp_lambda = hp.Choice('lambda', values = [0.001, 0.0001])
  hp_units = hp.Int('units', min_value = 32, max_value = 128, step = 32)
  hp_learning_rate = hp.Choice('learning_rate', values = [1.0, 0.1, 0.01, 0.001])

  # Same architecture as before, but the hidden-layer size is now tunable.
  model = keras.Sequential()
  model.add(keras.layers.Input(shape=(28, 28)))
  model.add(keras.layers.Flatten())
  model.add(keras.layers.Dense(units = hp_units, activation = 'relu', kernel_regularizer=keras.regularizers.l2(hp_lambda)))
  model.add(keras.layers.Dense(10, activation="softmax"))

  model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                loss='sparse_categorical_crossentropy',
                metrics=['acc'])

  return model

We delete the log folder:

In [15]:
!rm -rf my_dir/intro_hyperband/

We create the tuner:

In [16]:
tuner = kt.Hyperband(model_builder_2,
                     objective = 'val_acc',
                     max_epochs = 10,
                     factor = 3,
                     directory = 'my_dir',
                     project_name = 'intro_hyperband')
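
To get a feel for what ``max_epochs = 10`` and ``factor = 3`` imply, the sketch below computes a successive-halving schedule: how many models each bracket starts with and up to which epoch they are trained in each round. The formulas are an assumption about how the tuner sizes its brackets, not something stated in this notebook, and the function name ``hyperband_schedule`` is hypothetical.

```python
import math
from fractions import Fraction


def hyperband_schedule(max_epochs, factor):
    """Return a list of (bracket, round, n_models, epochs) tuples.

    Assumed formulas (not taken from the notebook): each bracket starts with
    many models trained briefly and keeps roughly 1/factor of them per round.
    """
    # Number of brackets: how many times max_epochs can be divided by factor
    # before dropping below one epoch.
    num_brackets = 0
    epochs = Fraction(max_epochs)
    while epochs >= 1:
        epochs /= factor
        num_brackets += 1

    # Number of models in the single round of the least aggressive bracket.
    bracket0_size = math.ceil(1 + math.log(max_epochs, factor))

    schedule = []
    for bracket in reversed(range(num_brackets)):
        end_size = Fraction(bracket0_size, bracket + 1)
        for round_ in range(bracket + 1):
            shrink = factor ** (bracket - round_)
            n_models = math.ceil(end_size * shrink)
            n_epochs = math.ceil(Fraction(max_epochs, shrink))
            schedule.append((bracket, round_, n_models, n_epochs))
    return schedule


schedule = hyperband_schedule(max_epochs=10, factor=3)
for bracket, round_, n_models, n_epochs in schedule:
    print(f"bracket {bracket}, round {round_}: {n_models} models up to epoch {n_epochs}")
print("total trials:", sum(n for _, _, n, _ in schedule))
```

Under these assumptions the schedule contains 30 trials in total, which agrees with the ``Trial 30 Complete`` line reported by the search in this notebook. The Hyperband paper linked above defines its brackets somewhat differently, so these numbers should be read as a description of this particular configuration, not as a general rule.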

Summary of the search space:

In [17]:
tuner.search_space_summary()

Search space summary
Default search space size: 3
lambda (Choice)
{'default': 0.001, 'conditions': [], 'values': [0.001, 0.0001], 'ordered': True}
units (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 128, 'step': 32, 'sampling': 'linear'}
learning_rate (Choice)
{'default': 1.0, 'conditions': [], 'values': [1.0, 0.1, 0.01, 0.001], 'ordered': True}
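
Note that ``Default search space size: 3`` counts hyperparameters, not combinations. A quick sketch of the number of distinct configurations follows, assuming that ``hp.Int`` includes ``max_value`` when it falls exactly on a step.

```python
import itertools

lambdas = [0.001, 0.0001]
# hp.Int('units', min_value=32, max_value=128, step=32), upper bound assumed inclusive.
units = list(range(32, 128 + 1, 32))
learning_rates = [1.0, 0.1, 0.01, 0.001]

configs = list(itertools.product(lambdas, units, learning_rates))
print("units candidates:", units)
print("distinct configurations:", len(configs))
```

This gives 2 x 4 x 4 = 32 possible configurations for the tuner to choose from.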


Search:

In [18]:
tuner.search(img_train, label_train,
             epochs=10,
             validation_data=(img_test, label_test))

Trial 30 Complete [00h 01m 06s]
val_acc: 0.5182999968528748

Best val_acc So Far: 0.9787999987602234
Total elapsed time: 00h 16m 00s


Best hyperparameters:

In [19]:
best_hps = tuner.get_best_hyperparameters()[0]
best_hps.values

{'lambda': 0.0001,
 'units': 96,
 'learning_rate': 0.001,
 'tuner/epochs': 10,
 'tuner/initial_epoch': 0,
 'tuner/bracket': 0,
 'tuner/round': 0}

Retraining the model:

In [20]:
model = tuner.hypermodel.build(best_hps)
model.fit(img_train, label_train, epochs = 10, validation_data = (img_test, label_test))

Epoch 1/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - acc: 0.8648 - loss: 0.4818 - val_acc: 0.9473 - val_loss: 0.1980
Epoch 2/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - acc: 0.9603 - loss: 0.1611 - val_acc: 0.9665 - val_loss: 0.1345
Epoch 3/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - acc: 0.9723 - loss: 0.1229 - val_acc: 0.9740 - val_loss: 0.1163
Epoch 4/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - acc: 0.9773 - loss: 0.1069 - val_acc: 0.9677 - val_loss: 0.1338
Epoch 5/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s 3ms/step - acc: 0.9812 - loss: 0.0950 - val_acc: 0.9729 - val_loss: 0.1187
Epoch 6/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - acc: 0.9832 - loss: 0.0876 - val_acc: 0.9770 - val_loss: 0.1059
Epoch 7/10
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 6s
[output truncated]

<keras.src.callbacks.history.History at 0x7fa6933c3c50>