Hyperparameter tuning with Keras Tuner --- 21:03 min
===

* Last modified: May 10, 2021 | [YouTube](https://youtu.be/5IL3etLzVGk)

Adapted from:

* https://www.tensorflow.org/tutorials/keras/keras_tuner

When building neural network models, the goal is to find the model with the best predictive accuracy. However, the performance of a neural network depends on its architecture (number of layers, number of neurons per layer, activation functions) and on the training process (optimization algorithm, number of epochs, learning rate, etc.). Keras Tuner is a library that implements several search algorithms for finding the best combination of these architecture and training hyperparameters. This lesson covers the fundamentals of Keras Tuner and how to apply it to obtain the best model.

In [1]:
import keras_tuner as kt
import tensorflow as tf
from tensorflow import keras

print(tf.__version__)

2.5.0


In [2]:
#
#  Download the Fashion MNIST data
#
(
    (train_images, train_labels),
    (test_images, test_labels),
) = keras.datasets.fashion_mnist.load_data()

## Model specification

In [3]:
def model_builder(hp):
    
    #
    # Try different numbers of neurons in the hidden
    # layer
    #
    hp_units = hp.Int(
        "units",
        min_value=32,
        max_value=512,
        step=32,
    )
        
    model = keras.Sequential(
        [
            #
            # This layer removes one dimension from the input array.
            # The input is a batch of 28x28 matrices; Flatten turns it
            # into a batch of vectors of length 28 * 28 = 784.
            #
            keras.layers.Flatten(input_shape=(28, 28)),
            
            #
            # Input preprocessing: [0, 255] --> [0, 1]
            #
            keras.layers.experimental.preprocessing.Rescaling(scale=1.0 / 255),
            
            #
            # Hidden layer. The number of neurons is set dynamically
            # by the tuner
            #
            keras.layers.Dense(
                units=hp_units,
                activation="relu",
            ),
            
            #
            # Output layer (one logit per class)
            #
            keras.layers.Dense(10)
        ]
    )

    #
    # Try different values for the learning rate
    #
    hp_learning_rate = hp.Choice(
        "learning_rate",
        values=[0.01, 0.001, 0.0001],
    )

    model.compile(
        #
        # Optimizer with different values
        # for the learning rate
        #
        optimizer=keras.optimizers.Adam(
            learning_rate=hp_learning_rate,
        ),
        #
        # Loss function
        #
        loss=keras.losses.SparseCategoricalCrossentropy(
            from_logits=True,
        ),
        #
        # Metric to monitor
        #
        metrics=["accuracy"],
    )

    return model
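
To get a feel for what the `units` hyperparameter controls, the sketch below counts the trainable parameters of the network defined in `model_builder` for a few values in the search range. The helper name `count_parameters` is hypothetical; it assumes both `Dense` layers use biases (the Keras default) and that `Flatten` and `Rescaling` contribute no trainable weights.

```python
def count_parameters(units, input_size=28 * 28, num_classes=10):
    """Trainable parameters of Flatten -> Dense(units) -> Dense(num_classes)."""
    hidden = input_size * units + units          # weights + biases of the hidden layer
    output = units * num_classes + num_classes   # weights + biases of the output layer
    return hidden + output


for units in (32, 352, 512):
    print(units, count_parameters(units))
```

The model size grows roughly linearly with `units`, so the largest candidate has about 16 times as many parameters as the smallest.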

## Monitoring the model with Early Stopping

In [4]:
callbacks = [
    tf.keras.callbacks.EarlyStopping(
        #
        # Metric to monitor
        #
        monitor="val_loss",
        
        #
        # Number of epochs without improvement before
        # stopping training
        #
        patience=5,
    )
]
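
The patience rule configured above can be illustrated without Keras. This is a simplified sketch, not the library's implementation: the function name `early_stopping_epoch` and the loss values are made up for illustration, and options such as `min_delta` or `restore_best_weights` are ignored.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)  # patience never exhausted


# Illustrative validation losses (not taken from a real run)
losses = [0.45, 0.37, 0.35, 0.352, 0.36, 0.355, 0.351, 0.353, 0.30]
print(early_stopping_epoch(losses, patience=5))  # 8
```

In this toy sequence the best loss so far is reached at epoch 3, five epochs pass without beating it, and training stops at epoch 8 even though epoch 9 would have been better. A larger `patience` trades extra training time for a lower risk of stopping too early.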

## Search using Hyperband

In [5]:
!rm -rf /tmp/hyperband_kt

hyperband_tuner = kt.Hyperband(
    hypermodel=model_builder,      # builds the model
    objective="val_accuracy",      # criterion for selecting the hyperparameters
    max_epochs=10,                 # maximum number of epochs per model
    factor=3,                      # reduction factor for the number of models
    directory="/tmp/hyperband_kt", # working directory
    project_name="hyperband_kt",   # project name
    overwrite=True,                # overwrite the directory if it exists
)

#
# Summary of the search space
#
hyperband_tuner.search_space_summary()

Search space summary
Default search space size: 2
units (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}
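
Note that "Default search space size: 2" counts hyperparameters, not combinations. As a quick check, the sketch below enumerates every (units, learning rate) pair the tuners can choose from, assuming `max_value` is inclusive as in the summary above. The variable names are illustrative.

```python
from itertools import product

units_values = list(range(32, 512 + 1, 32))   # mirrors hp.Int("units", 32..512, step=32)
learning_rates = [0.01, 0.001, 0.0001]        # mirrors hp.Choice("learning_rate", ...)

search_space = list(product(units_values, learning_rates))

print(len(units_values))   # 16
print(len(search_space))   # 48
```

With only 48 combinations an exhaustive grid search would be feasible here; the tuners become more useful as hyperparameters are added and the grid grows multiplicatively.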


In [6]:
hyperband_tuner.search(
    train_images,
    train_labels,
    epochs=50,
    validation_split=0.2,
    callbacks=callbacks,
    verbose=1,
)

hyperband_tuner.results_summary()

Trial 30 Complete [00h 00m 55s]
val_accuracy: 0.8912500143051147

Best val_accuracy So Far: 0.8912500143051147
Total elapsed time: 00h 09m 20s
INFO:tensorflow:Oracle triggered exit
Results summary
Results in /tmp/hyperband_kt/hyperband_kt
Showing 10 best trials
Objective(name='val_accuracy', direction='max')
Trial summary
Hyperparameters:
units: 352
learning_rate: 0.001
tuner/epochs: 10
tuner/initial_epoch: 0
tuner/bracket: 0
tuner/round: 0
Score: 0.8912500143051147
Trial summary
Hyperparameters:
units: 224
learning_rate: 0.001
tuner/epochs: 10
tuner/initial_epoch: 0
tuner/bracket: 0
tuner/round: 0
Score: 0.8896666765213013
Trial summary
Hyperparameters:
units: 192
learning_rate: 0.001
tuner/epochs: 10
tuner/initial_epoch: 0
tuner/bracket: 0
tuner/round: 0
Score: 0.8864166736602783
Trial summary
Hyperparameters:
units: 160
learning_rate: 0.001
tuner/epochs: 10
tuner/initial_epoch: 4
tuner/bracket: 1
tuner/round: 1
tuner/trial_id: 422bccbb6eb9a85556e5cac0b0191ae0
Score: 0.88574999570846
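
The `tuner/bracket`, `tuner/round` and `tuner/epochs` fields in the summary come from Hyperband's successive-halving schedule: many models are trained for a few epochs, and only the most promising ones are trained for longer. The sketch below computes one such schedule for `max_epochs=10` and `factor=3`. The function `hyperband_schedule` is hypothetical and its rounding rules are an assumption modeled on Keras Tuner's behavior, not a copy of the library code; under these assumptions it yields 30 trials, which agrees with the "Trial 30 Complete" line in the log.

```python
import math


def hyperband_schedule(max_epochs, factor, min_epochs=1):
    """Return (bracket, round, number of models, epochs per model) tuples."""
    # Number of brackets: divide max_epochs by factor until it drops below min_epochs.
    num_brackets = 0
    epochs = max_epochs
    while epochs >= min_epochs:
        epochs /= factor
        num_brackets += 1

    # Assumed number of models in the single round of bracket 0.
    bracket0_end_size = math.ceil(1 + math.log(max_epochs, factor))

    schedule = []
    for bracket in reversed(range(num_brackets)):
        end_size = bracket0_end_size / (bracket + 1)
        for round_ in range(bracket + 1):
            shrink = factor ** (bracket - round_)
            n_models = math.ceil(end_size * shrink)    # fewer models each round
            n_epochs = math.ceil(max_epochs / shrink)  # more epochs each round
            schedule.append((bracket, round_, n_models, n_epochs))
    return schedule


schedule = hyperband_schedule(max_epochs=10, factor=3)
for bracket, round_, n_models, n_epochs in schedule:
    print(f"bracket={bracket} round={round_} models={n_models} epochs={n_epochs}")

print("total trials:", sum(n_models for _, _, n_models, _ in schedule))
```

In this schedule, a trial in bracket 1, round 1 continues a model that was already trained for 4 epochs in round 0, which is consistent with the `tuner/initial_epoch: 4` entry in the fourth trial summary.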

## Search using random search

In [7]:
!rm -rf /tmp/randomsearch_kt

randomsearch_tuner = kt.RandomSearch(
    hypermodel=model_builder,         # builds the model
    objective="val_accuracy",         # criterion for selecting the hyperparameters
    max_trials=4,                     # maximum number of trials
    directory="/tmp/randomsearch_kt", # working directory
    project_name="randomsearch_kt",   # project name
    overwrite=True,                   # overwrite the directory if it exists
)

#
# Summary of the search space
#
randomsearch_tuner.search_space_summary()

Search space summary
Default search space size: 2
units (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}


In [8]:
randomsearch_tuner.search(
    train_images,
    train_labels,
    epochs=50,
    validation_split=0.2,
    callbacks=callbacks,
    verbose=1,
)

randomsearch_tuner.results_summary()

Trial 4 Complete [00h 01m 39s]
val_accuracy: 0.8943333625793457

Best val_accuracy So Far: 0.8955833315849304
Total elapsed time: 00h 10m 52s
INFO:tensorflow:Oracle triggered exit
Results summary
Results in /tmp/randomsearch_kt/randomsearch_kt
Showing 10 best trials
Objective(name='val_accuracy', direction='max')
Trial summary
Hyperparameters:
units: 352
learning_rate: 0.0001
Score: 0.8955833315849304
Trial summary
Hyperparameters:
units: 416
learning_rate: 0.0001
Score: 0.8945833444595337
Trial summary
Hyperparameters:
units: 256
learning_rate: 0.001
Score: 0.8943333625793457
Trial summary
Hyperparameters:
units: 256
learning_rate: 0.01
Score: 0.8689166903495789
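
Conceptually, random search draws independent configurations from the search space and keeps the best one. The following is a minimal pure-Python sketch of that idea, not Keras Tuner's implementation: `sample_hyperparameters`, `random_search` and `toy_objective` are hypothetical names, the objective is a made-up stand-in for validation accuracy, and repeated configurations are not filtered out.

```python
import random


def sample_hyperparameters(rng):
    """Draw one random configuration from the tutorial's search space."""
    return {
        "units": rng.choice(range(32, 512 + 1, 32)),
        "learning_rate": rng.choice([0.01, 0.001, 0.0001]),
    }


def random_search(objective, max_trials, seed=0):
    """Evaluate max_trials random configurations and return them best-first."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_trials):
        hps = sample_hyperparameters(rng)
        trials.append((objective(hps), hps))
    return sorted(trials, key=lambda trial: trial[0], reverse=True)


def toy_objective(hps):
    # Hypothetical stand-in for validation accuracy; NOT a real model score.
    return -abs(hps["units"] - 352) / 512 - abs(hps["learning_rate"] - 0.001)


best_score, best_hps = random_search(toy_objective, max_trials=4)[0]
print(best_hps)
```

With `max_trials=4`, only a small fraction of the 48 possible combinations is evaluated, so the result depends heavily on the random draws.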


## Search using Bayesian Optimization

In [9]:
!rm -rf /tmp/bayesianopt_kt

bayesianoptimization_tuner = kt.BayesianOptimization(
    hypermodel=model_builder,         # builds the model
    objective="val_accuracy",         # criterion for selecting the hyperparameters
    max_trials=4,                     # maximum number of trials
    seed=123456,                      # random number generator seed
    directory="/tmp/bayesianopt_kt",  # working directory
    project_name="bayesianopt_kt",    # project name
    overwrite=True,                   # overwrite the directory if it exists
)

#
# Summary of the search space
#
bayesianoptimization_tuner.search_space_summary()

Search space summary
Default search space size: 2
units (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}


In [10]:
bayesianoptimization_tuner.search(
    train_images,
    train_labels,
    epochs=50,
    validation_split=0.2,
    callbacks=callbacks,
    verbose=1,
)

bayesianoptimization_tuner.results_summary()

Trial 4 Complete [00h 05m 27s]
val_accuracy: 0.8993333578109741

Best val_accuracy So Far: 0.8993333578109741
Total elapsed time: 00h 11m 06s
INFO:tensorflow:Oracle triggered exit
Results summary
Results in /tmp/bayesianopt_kt/bayesianopt_kt
Showing 10 best trials
Objective(name='val_accuracy', direction='max')
Trial summary
Hyperparameters:
units: 512
learning_rate: 0.0001
Score: 0.8993333578109741
Trial summary
Hyperparameters:
units: 416
learning_rate: 0.001
Score: 0.8896666765213013
Trial summary
Hyperparameters:
units: 128
learning_rate: 0.0001
Score: 0.8810833096504211
Trial summary
Hyperparameters:
units: 512
learning_rate: 0.01
Score: 0.8696666955947876


## Retrieving the best hyperparameters

In [11]:
best_hyperband_hps = hyperband_tuner.get_best_hyperparameters(num_trials=1)[0]

#
# Optimal number of neurons in the hidden layer
# and learning rate
#
best_hyperband_hps.get("units"), best_hyperband_hps.get('learning_rate')

(352, 0.001)

In [12]:
best_randomsearch_hps = randomsearch_tuner.get_best_hyperparameters(num_trials=1)[0]

#
# Optimal number of neurons in the hidden layer
# and learning rate
#
best_randomsearch_hps.get("units"), best_randomsearch_hps.get('learning_rate')

(352, 0.0001)

In [13]:
best_bayesianopt_hps = bayesianoptimization_tuner.get_best_hyperparameters(num_trials=1)[0]

#
# Optimal number of neurons in the hidden layer
# and learning rate
#
best_bayesianopt_hps.get("units"), best_bayesianopt_hps.get('learning_rate')

(512, 0.0001)

## Obtaining the best model

In [14]:
#
# Build the model with the optimal hyperparameters
# and train it for 50 epochs.
#
# The goal is to find the optimal number of epochs for
# training the model
#
model = hyperband_tuner.hypermodel.build(best_hyperband_hps)

history = model.fit(
    train_images,
    train_labels,
    epochs=50,
    validation_split=0.2,
    verbose=2,
)

val_acc_per_epoch = history.history["val_accuracy"]
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1

#
# Epoch with the highest validation accuracy
#
best_epoch

Epoch 1/50
1500/1500 - 7s - loss: 0.5013 - accuracy: 0.8223 - val_loss: 0.4512 - val_accuracy: 0.8363
Epoch 2/50
1500/1500 - 6s - loss: 0.3725 - accuracy: 0.8647 - val_loss: 0.3723 - val_accuracy: 0.8633
Epoch 3/50
1500/1500 - 6s - loss: 0.3345 - accuracy: 0.8758 - val_loss: 0.3505 - val_accuracy: 0.8692
Epoch 4/50
1500/1500 - 6s - loss: 0.3068 - accuracy: 0.8862 - val_loss: 0.3516 - val_accuracy: 0.8733
Epoch 5/50
1500/1500 - 7s - loss: 0.2877 - accuracy: 0.8939 - val_loss: 0.3228 - val_accuracy: 0.8867
Epoch 6/50
1500/1500 - 7s - loss: 0.2733 - accuracy: 0.8983 - val_loss: 0.3235 - val_accuracy: 0.8851
Epoch 7/50
1500/1500 - 7s - loss: 0.2585 - accuracy: 0.9038 - val_loss: 0.3313 - val_accuracy: 0.8869
Epoch 8/50
1500/1500 - 7s - loss: 0.2444 - accuracy: 0.9088 - val_loss: 0.3223 - val_accuracy: 0.8842
Epoch 9/50
1500/1500 - 7s - loss: 0.2348 - accuracy: 0.9120 - val_loss: 0.3110 - val_accuracy: 0.8869
Epoch 10/50
1500/1500 - 7s - loss: 0.2263 - accuracy: 0.9141 - val_loss: 0.3177 - 

45
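
The `index(max(...)) + 1` idiom returns the first epoch that reaches the maximum validation accuracy, converted from a 0-based list index to a 1-based epoch number. A small sketch using only the first nine `val_accuracy` values visible in the log above (the full 50-epoch history is what produced 45):

```python
# val_accuracy for epochs 1-9, copied from the training log above
val_acc_per_epoch = [0.8363, 0.8633, 0.8692, 0.8733, 0.8867, 0.8851, 0.8869, 0.8842, 0.8869]

best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print(best_epoch)  # 7: epochs 7 and 9 tie at 0.8869, and index() returns the first
```

Because validation accuracy fluctuates from epoch to epoch, the selected epoch can change between runs.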

In [15]:
#
# Retrain the model using the optimal number of epochs
#
model = hyperband_tuner.hypermodel.build(best_hyperband_hps)

history = model.fit(
    train_images,
    train_labels,
    epochs=best_epoch,
    validation_split=0.2,
    verbose=2,
)

Epoch 1/45
1500/1500 - 6s - loss: 0.5012 - accuracy: 0.8226 - val_loss: 0.4090 - val_accuracy: 0.8553
Epoch 2/45
1500/1500 - 5s - loss: 0.3721 - accuracy: 0.8645 - val_loss: 0.3534 - val_accuracy: 0.8719
Epoch 3/45
1500/1500 - 5s - loss: 0.3324 - accuracy: 0.8790 - val_loss: 0.3757 - val_accuracy: 0.8610
Epoch 4/45
1500/1500 - 5s - loss: 0.3071 - accuracy: 0.8873 - val_loss: 0.3847 - val_accuracy: 0.8598
Epoch 5/45
1500/1500 - 6s - loss: 0.2904 - accuracy: 0.8929 - val_loss: 0.3215 - val_accuracy: 0.8844
Epoch 6/45
1500/1500 - 5s - loss: 0.2705 - accuracy: 0.8994 - val_loss: 0.3312 - val_accuracy: 0.8784
Epoch 7/45
1500/1500 - 5s - loss: 0.2562 - accuracy: 0.9048 - val_loss: 0.3255 - val_accuracy: 0.8844
Epoch 8/45
1500/1500 - 5s - loss: 0.2463 - accuracy: 0.9075 - val_loss: 0.3328 - val_accuracy: 0.8824
Epoch 9/45
1500/1500 - 5s - loss: 0.2356 - accuracy: 0.9107 - val_loss: 0.3169 - val_accuracy: 0.8860
Epoch 10/45
1500/1500 - 5s - loss: 0.2242 - accuracy: 0.9156 - val_loss: 0.3410 - 
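
The `1500/1500` counter in the logs is the number of batches per epoch. The arithmetic below reproduces it, assuming the default batch size of 32 (no `batch_size` is passed to `fit`) and the 60,000 training images of Fashion MNIST; the variable names are illustrative.

```python
import math

num_train_images = 60_000   # size of the Fashion MNIST training set
validation_split = 0.2      # fraction held out for validation in fit()
batch_size = 32             # Keras default when fit() gets no batch_size

num_validation = int(num_train_images * validation_split)
num_fit = num_train_images - num_validation
steps_per_epoch = math.ceil(num_fit / batch_size)

print(num_fit, num_validation, steps_per_epoch)  # 48000 12000 1500
```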