# Strojenie hiperparametrów

## Zadania

### Poszukiwanie ręczne

Pobierz zestaw danych Boston Housing:

In [105]:
import tensorflow as tf

(X_train, y_train), (X_test, y_test) = tf.keras.datasets.boston_housing.load_data()

Przygotuj funkcję budującą model według parametrów podanych jako argumenty:
- n_hidden – liczba warstw ukrytych,
- n_neurons – liczba neuronów na każdej z warstw ukrytych,
- optimizer – gradientowy algorytm optymalizacji, funkcja powinna rozumieć wartości: sgd, nesterov, momentum oraz adam,
- learning_rate – krok uczenia,
- momentum – współczynnik przyspieszenia dla algorytmów z pędem.

In [106]:
X_train

array([[1.23247e+00, 0.00000e+00, 8.14000e+00, ..., 2.10000e+01,
        3.96900e+02, 1.87200e+01],
       [2.17700e-02, 8.25000e+01, 2.03000e+00, ..., 1.47000e+01,
        3.95380e+02, 3.11000e+00],
       [4.89822e+00, 0.00000e+00, 1.81000e+01, ..., 2.02000e+01,
        3.75520e+02, 3.26000e+00],
       ...,
       [3.46600e-02, 3.50000e+01, 6.06000e+00, ..., 1.69000e+01,
        3.62250e+02, 7.83000e+00],
       [2.14918e+00, 0.00000e+00, 1.95800e+01, ..., 1.47000e+01,
        2.61950e+02, 1.57900e+01],
       [1.43900e-02, 6.00000e+01, 2.93000e+00, ..., 1.56000e+01,
        3.76700e+02, 4.38000e+00]])

In [107]:
def build_model(optimizer, n_hidden=1, learning_rate=10e-5, n_neurons=25, momentum=0):
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=X_train.shape[1:]))
    for layer in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons, activation="relu"))
    model.add(tf.keras.layers.Dense(1))
    if optimizer == "sgd":
        optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)
    elif optimizer == "nesterov":
        optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, nesterov=True)
    if optimizer == "momentum":
        optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=momentum)
    if optimizer == "adam":
        optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
    model.compile(loss="mse", optimizer=optimizer, metrics=['mae'])
    return model

Przy uczeniu wykorzystaj mechanizm early stopping o cierpliwości równej 10 i minimalnej poprawie
funkcji straty równej 1.00, uczenie maksymalnie przez 100 epok.

In [108]:
es = tf.keras.callbacks.EarlyStopping(patience=10,
                                      min_delta=1.00)

In [109]:
def get_run_logdir(name, value):
    import time
    import os
    root_logdir = os.path.join(os.curdir, "tb_logs")
    ts = int(time.time())

    run_id = str(ts) + "_" + name + "_" + str(value)
    return os.path.join(root_logdir, run_id)




Przed eksperymentami wyczyść sesję TensorFlow i ustal generatory liczb losowych:

In [110]:
import numpy as np

tf.keras.backend.clear_session()
np.random.seed(42)
tf.random.set_seed(42)

In [111]:
epochs = 100
validation_split = .1

krok uczenia(lr): 10−6, 10−5, 10−4

In [112]:
results_lr = []
for lr in (10e-6, 10e-5, 10e-4):
    run_logdir = get_run_logdir("lr", lr)
    tensorboard = tf.keras.callbacks.TensorBoard(run_logdir)
    model = build_model(learning_rate=lr, optimizer="sgd")
    model.fit(X_train, y_train, epochs=epochs, validation_split=validation_split, callbacks=[es, tensorboard])
    score = model.evaluate(X_test, y_test)
    results_lr.append((lr, score[0], score[1]))

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoc

In [113]:
results_lr

[(1e-05, 96.47820281982422, 8.2672700881958),
 (0.0001, 423.1081848144531, 18.43541145324707),
 (0.001, 658428.0, 811.384521484375)]

liczba warstw ukrytych (hl): od 0 do 3,

In [114]:
results_hl = []
for hl in (0, 1, 2, 3):
    run_logdir = get_run_logdir("hl", hl)
    tensorboard = tf.keras.callbacks.TensorBoard(run_logdir)
    model = build_model(n_hidden=hl, optimizer="sgd")
    model.summary()
    model.fit(X_train, y_train, epochs=epochs, validation_split=validation_split, callbacks=[es, tensorboard])
    score = model.evaluate(X_test, y_test)
    results_hl.append((hl, score[0], score[1]))


Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_6 (Dense)             (None, 1)                 14        
                                                                 
Total params: 14
Trainable params: 14
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_7 (Dense)             (None, 25)                350       
                                                                 
 dense_8 (Dense)             (None, 1)                 26        
                                                                 
Total params: 376
Trainable params: 376
Non-tra

In [115]:
results_hl

[(0, nan, nan),
 (1, 432.7367858886719, 18.69472885131836),
 (2, 83.7286148071289, 6.532285213470459),
 (3, nan, nan)]

liczba neuronów na warstwę (nn): 5, 25, 125

In [116]:
results_nn = []
for nn in (5, 25, 125):
    run_logdir = get_run_logdir("nn", nn)
    tensorboard = tf.keras.callbacks.TensorBoard(run_logdir)
    model = build_model(optimizer="sgd", n_neurons=nn)
    model.summary()
    model.fit(X_train, y_train, epochs=epochs, validation_split=validation_split, callbacks=[es, tensorboard])
    score = model.evaluate(X_test, y_test)
    results_nn.append((nn, score[0], score[1]))


Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_16 (Dense)            (None, 5)                 70        
                                                                 
 dense_17 (Dense)            (None, 1)                 6         
                                                                 
Total params: 76
Trainable params: 76
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36

In [117]:
results_nn

[(5, 415.259033203125, 18.221284866333008),
 (25, 828.7384643554688, 27.303747177124023),
 (125, 438.08441162109375, 18.83722686767578)]

algorytm optymalizacji (opt): wszystkie 4 algorytmy (pęd = 0.5)

In [118]:
results_alg = []
for alg in ("sgd", "nesterov", "momentum", "adam"):
    run_logdir = get_run_logdir("alg", alg)
    tensorboard = tf.keras.callbacks.TensorBoard(run_logdir)
    model = build_model(optimizer=alg, momentum=.5)
    model.summary()
    model.fit(X_train, y_train, epochs=epochs, validation_split=validation_split, callbacks=[es, tensorboard])
    score = model.evaluate(X_test, y_test)
    results_alg.append((alg, score[0], score[1]))


Model: "sequential_10"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_22 (Dense)            (None, 25)                350       
                                                                 
 dense_23 (Dense)            (None, 1)                 26        
                                                                 
Total params: 376
Trainable params: 376
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Model: "sequential_11"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_24 (Dense)            (None, 25)                350       
                                            

In [119]:
results_alg

[('sgd', 62.78519821166992, 5.925008773803711),
 ('nesterov', 340.63629150390625, 16.072277069091797),
 ('momentum', 310.46917724609375, 15.184233665466309),
 ('adam', 144.28643798828125, 9.175629615783691)]

 pęd (mom): 0.1, 0.5, 0.9 (dla algorytmu momentum).

In [120]:
results_mom = []
for mom in (.1, .5, .9):
    run_logdir = get_run_logdir("momentum", mom)
    tensorboard = tf.keras.callbacks.TensorBoard(run_logdir)
    model = build_model(optimizer="momentum", momentum=mom)
    model.summary()
    model.fit(X_train, y_train, epochs=epochs, validation_split=validation_split, callbacks=[es, tensorboard])
    score = model.evaluate(X_test, y_test)
    results_mom.append((mom, score[0], score[1]))


Model: "sequential_14"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_30 (Dense)            (None, 25)                350       
                                                                 
 dense_31 (Dense)            (None, 1)                 26        
                                                                 
Total params: 376
Trainable params: 376
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch

In [121]:
results_mom

[(0.1, 493.1888427734375, 20.24709701538086),
 (0.5, 254.59796142578125, 13.4472017288208),
 (0.9, 171.73822021484375, 11.43071174621582)]

### Automatyczne poszukiwanie przestrzeni argumentów

Przygotuj słownik zawierający przeszukiwane wartości parametrów

In [138]:
param_distribs = {
    "n_hidden": [0, 1, 2, 3],
    "n_neurons": [5, 25, 125],
    "learning_rate": [10e-6, 10e-5, 10e-4],
    "optimizer": ["sgd", "adam", "nesterov", "momentum"],
    "momentum": [.1, .5, .9]
}

Przygotuj callback early stopping i obuduj przygotowaną wcześniej funkcję build_model obiektem
KerasRegressor

In [140]:
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor

es = tf.keras.callbacks.EarlyStopping(patience=10, min_delta=1.0, verbose=1)
keras_reg = KerasRegressor(build_model, callbacks=[es])

  keras_reg = KerasRegressor(build_model, callbacks=[es])


In [147]:
from sklearn.model_selection import RandomizedSearchCV

rnd_search_cv = RandomizedSearchCV(keras_reg, param_distribs, n_iter=30, cv=3, verbose=2)
rnd_search_cv.fit(X_train, y_train, epochs=100, validation_split=0.1)

Fitting 3 folds for each of 30 candidates, totalling 90 fits
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 10: early stopping
[CV] END learning_rate=0.0001, momentum=0.5, n_hidden=3, n_neurons=25, optimizer=momentum; total time=   1.3s
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 10: early stopping
[CV] END learning_rate=0.0001, momentum=0.5, n_hidden=3, n_neurons=25, optimizer=momentum; total time=   1.1s
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 10: early stopping
[CV] END learning_rate=0.0001, momentum=0.5, n_hidden=3, n_neurons=25, optimizer=momentum; total time=   1.1s
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/10

 -8.01099080e+03 -2.56524972e+02 -5.72521642e+01 -5.51805585e+01
 -6.09553363e+01 -4.73256772e+04 -6.04231962e+01 -1.36446545e+06
 -6.71445612e+04 -6.76066775e+25 -2.46250505e+02 -9.42294617e+01
 -4.17691355e+02 -7.20970245e+02 -3.00464732e+02 -2.46198772e+02
 -8.62007243e+01 -7.61032842e+01             nan             nan
             nan -9.70506420e+13 -2.81771966e+04             nan
             nan -1.38826374e+03]


Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 44: early stopping


RandomizedSearchCV(cv=3,
                   estimator=<keras.wrappers.scikit_learn.KerasRegressor object at 0x00000245E213ABF0>,
                   n_iter=30,
                   param_distributions={'learning_rate': [1e-05, 0.0001, 0.001],
                                        'momentum': [0.1, 0.5, 0.9],
                                        'n_hidden': [0, 1, 2, 3],
                                        'n_neurons': [5, 25, 125],
                                        'optimizer': ['sgd', 'adam', 'nesterov',
                                                      'momentum']},
                   verbose=2)

Zapisz najlepsze znalezione parametry w postaci słownika do pliku rnd_search.pkl.

In [148]:
rnd_search_cv.best_params_

{'optimizer': 'adam',
 'n_neurons': 125,
 'n_hidden': 2,
 'momentum': 0.5,
 'learning_rate': 0.0001}

dict

In [149]:
import pickle

with open('lr.pkl', 'wb') as file:
    pickle.dump(results_lr, file)

with open('hl.pkl', 'wb') as file:
    pickle.dump(results_hl, file)

with open('nn.pkl', 'wb') as file:
    pickle.dump(results_nn, file)

with open('opt.pkl', 'wb') as file:
    pickle.dump(results_alg, file)

with open('mom.pkl', 'wb') as file:
    pickle.dump(results_mom, file)

with open('rnd_search.pkl', 'wb') as file:
    pickle.dump(rnd_search_cv.best_params_, file)

[(0.1, 493.1888427734375, 20.24709701538086),
 (0.5, 254.59796142578125, 13.4472017288208),
 (0.9, 171.73822021484375, 11.43071174621582)]