<a href="https://colab.research.google.com/github/SandeeeeeeeeepDey/data-science-11-weeks-progg/blob/main/keras_tuner.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Hyperparameter Tuning

In [1]:
%pip install -q -U keras-tuner

In [2]:
import tensorflow as tf
import keras_tuner as kt
from time import strftime
from pathlib import Path

In [3]:
all_data = tf.keras.datasets.fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


In [4]:
(X_full, y_full), (X_test, y_test) = all_data

In [5]:
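# Note: this keeps the first 5,000 images for training and the remaining 55,000 for validation.
# To hold out only the last 5,000 for validation instead, slice with [:-5000] and [-5000:].
X_train, X_valid = X_full[:5000], X_full[5000:]
y_train, y_valid = y_full[:5000], y_full[5000:]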

## Model_builder Function

In [6]:
def model_builder(hp):
  n_hidden = hp.Int("n_hidden", min_value = 0, max_value = 8, default = 8)
  n_neurons = hp.Int("n_neuron", min_value = 16, max_value = 256)
  learning_rate = hp.Float("learning_rate", min_value = 1e-4, max_value = 1e-2, sampling = "log")
  optimizer = hp.Choice("optimizer", values = ["sgd", "adam"])
  # Turn the chosen optimizer name into an optimizer instance with the sampled learning rate
  if optimizer == "sgd":
    optimizer = tf.keras.optimizers.SGD(learning_rate = learning_rate)
  else:
    optimizer = tf.keras.optimizers.Adam(learning_rate = learning_rate)

  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Flatten())

  for _ in range(n_hidden):
    model.add(tf.keras.layers.Dense(n_neurons, activation = "relu"))
  model.add(tf.keras.layers.Dense(10, activation = "softmax"))
  model.compile(loss = "sparse_categorical_crossentropy", optimizer = optimizer, metrics = ["accuracy"])
  return model

### Random Search Optimizer

Random search samples a random combination of hyperparameter values for each trial.
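As a rough illustration of the idea, here is a minimal pure-Python sketch of random search over the same four hyperparameters used in `model_builder`. The `toy_score` function is a made-up stand-in for training a model and reading its validation accuracy, and the log-uniform draw is an assumption about what `sampling = "log"` amounts to; Keras Tuner's own implementation may differ in detail.

```python
import math
import random


def sample_config(rng):
  """Draw one random hyperparameter combination (one trial)."""
  # Log-uniform draw: uniform in log10 space, so 1e-4..1e-3 is as likely as 1e-3..1e-2.
  log_lr = rng.uniform(math.log10(1e-4), math.log10(1e-2))
  return {
      "n_hidden": rng.randint(0, 8),      # randint bounds are inclusive
      "n_neuron": rng.randint(16, 256),
      "learning_rate": 10 ** log_lr,
      "optimizer": rng.choice(["sgd", "adam"]),
  }


def toy_score(config):
  """Hypothetical stand-in for 'train the model, return val_accuracy'."""
  # Made-up objective that peaks near learning_rate = 1e-3; not a real model.
  return 1.0 - abs(math.log10(config["learning_rate"]) + 3.0) / 2.0


def random_search(max_trials, seed=42):
  rng = random.Random(seed)
  trials = [sample_config(rng) for _ in range(max_trials)]
  # Keep the configuration with the highest score.
  return max(trials, key=toy_score)


best = random_search(max_trials=10)
print(best)
```

Each trial is drawn independently of the others, which is why random search needs no information from earlier trials to propose the next one.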

Defining

In [7]:
random_search_tuner = kt.RandomSearch(model_builder, objective = "val_accuracy", max_trials = 10, overwrite = True,
                                      directory = "my_fashion_mnist", project_name = "my_rand_search", seed = 42)

Create callbacks

In [None]:
checkpoint = tf.keras.callbacks.ModelCheckpoint("cp1", save_weights_only=False)
early_bird = tf.keras.callbacks.EarlyStopping(monitor = "val_loss", patience = 10)

Tuning

In [None]:
random_search_tuner.search(X_train, y_train, epochs = 30, validation_data = (X_valid, y_valid), callbacks = [checkpoint, early_bird])

Trial 10 Complete [00h 03m 10s]
val_accuracy: 0.8211636543273926

Best val_accuracy So Far: 0.8373636603355408
Total elapsed time: 00h 29m 30s

#### Using the models and configurations found by tuning

In [8]:
top3_models = random_search_tuner.get_best_models(num_models = 3)
best_model = top3_models[0]

In [9]:
top3_params = random_search_tuner.get_best_hyperparameters(num_trials = 3)
top3_params[0].values

{'n_hidden': 7,
 'n_neuron': 124,
 'learning_rate': 0.0005509513888645584,
 'optimizer': 'adam'}

Each tuner is guided by a so-called oracle: before each trial, the tuner asks
the oracle to tell it what the next trial should be.

Since the oracle keeps track of all the
trials, we can ask it for the best one.
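The division of labour between a tuner and its oracle can be sketched with a toy class. Everything here is hypothetical and for illustration only: `ToyOracle`, its method names, and `toy_score` are not part of Keras Tuner, and a real oracle tracks far more state.

```python
import random


class ToyOracle:
  """Hypothetical oracle: proposes trials and remembers how each one scored."""

  def __init__(self, seed=42):
    self.rng = random.Random(seed)
    self.trials = []  # (hyperparameters, score) pairs, in the order they finished

  def next_trial(self):
    # A random-search oracle just samples; other oracles would look at self.trials.
    return {"n_hidden": self.rng.randint(0, 8)}

  def report(self, hyperparameters, score):
    self.trials.append((hyperparameters, score))

  def best_trials(self, num_trials=1):
    return sorted(self.trials, key=lambda trial: trial[1], reverse=True)[:num_trials]


def toy_score(hyperparameters):
  """Made-up stand-in for training a model; pretends 3 hidden layers is ideal."""
  return -abs(hyperparameters["n_hidden"] - 3)


oracle = ToyOracle()
for _ in range(5):                    # the "tuner" loop
  hp = oracle.next_trial()            # ask the oracle what to try next
  oracle.report(hp, toy_score(hp))    # report the result back

print(oracle.best_trials(num_trials=1)[0])
```

Because the oracle stores every result, asking for the best trial is just a sort over its records.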

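#### Getting the best trial from the oracle

`get_best_trials()` is a method of the tuner's oracle, not of the tuner itself. Calling it directly on the tuner fails with `AttributeError: 'RandomSearch' object has no attribute 'get_best_trials'`, so go through `random_search_tuner.oracle`:

In [41]:
best_trial = random_search_tuner.oracle.get_best_trials(num_trials = 1)[0]
best_trial.summary()

The trial's metrics can also be read directly:

In [None]:
best_trial.metrics.get_last_value("val_accuracy")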

#### Continuation

In [10]:
best_model.fit(X_full, y_full, epochs = 200)
test_loss, test_accuracy = best_model.evaluate(X_test, y_test)

Epoch 1/200
Epoch 2/200
...
Epoch 77/200
Epoch 78

## Hyperparameter tuning with a HyperModel subclass

To tune data preprocessing hyperparameters, or model.fit() arguments such as the batch size, subclass kt.HyperModel and override its build() and fit() methods.

In [11]:
class MyHyperModule(kt.HyperModel):
  def build(self, hp):
    # Reuse the search space defined in model_builder
    return model_builder(hp)

  def fit(self, hp, model, X, y, **kwargs):
    # "normalize" is itself a tunable Boolean hyperparameter
    if hp.Boolean("normalize"):
      normer = tf.keras.layers.Normalization()
      X = normer(X)
    return model.fit(X, y, **kwargs)

### Hyperband Optimizer

**Step 1 (Initialization)**: Samples many different hyperparameter configurations.

**Step 2 (Exploration)**: Trains each configuration for a fraction of the maximum number of epochs and discards the worst performers.

**Step 3 (Successive Halving)**: Trains the survivors for more epochs and again keeps only the top 1/`factor` of them (the top third with `factor = 3`), repeating until a single configuration remains.

Hyperband is generally faster than random search, but it can discard configurations that start slowly and would only pay off in the long run (e.g. those that need a lower learning rate).
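The three steps above can be sketched in plain Python. This is a simplified single-bracket version of successive halving, not Keras Tuner's actual Hyperband implementation; `toy_evaluate` is a hypothetical stand-in for training a configuration for a given number of epochs, and each configuration is reduced to a single learning rate for brevity.

```python
import math
import random


def toy_evaluate(learning_rate, epochs):
  """Hypothetical score: low learning rates start slowly but keep improving."""
  ceiling = 0.9 - 0.1 * abs(math.log10(learning_rate) + 3.0)  # best near 1e-3
  speed = learning_rate * 300.0                                 # larger rates converge faster
  return ceiling * (1.0 - math.exp(-speed * epochs))


def successive_halving(configs, max_epochs=30, factor=3):
  """Keep the top 1/factor of configs each round, training survivors longer."""
  epochs = max(1, max_epochs // factor ** 2)  # start with a small epoch budget
  survivors = list(configs)
  history = []
  while True:
    ranked = sorted(survivors, key=lambda lr: toy_evaluate(lr, epochs), reverse=True)
    history.append((epochs, len(ranked)))
    if len(ranked) <= 1 or epochs >= max_epochs:
      return ranked[0], history
    survivors = ranked[:max(1, len(ranked) // factor)]  # discard the rest
    epochs = min(max_epochs, epochs * factor)             # give survivors more epochs


rng = random.Random(42)
candidates = [10 ** rng.uniform(-4, -2) for _ in range(9)]
best_lr, rounds = successive_halving(candidates)
print(rounds)   # (epochs, number of configs) per round → [(3, 9), (9, 3), (27, 1)]
print(best_lr)
```

The toy objective also shows the caveat: after 3 epochs a learning rate of 1e-2 scores about 0.80 while 1e-3 scores about 0.53, even though 1e-3 ends higher (about 0.90) after 30 epochs, so an early cut can eliminate the configuration that would have won.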

In [12]:
hyperband_tuner = kt.Hyperband(MyHyperModule(), objective = "val_accuracy", seed = 42, max_epochs = 30, factor = 3,
                               hyperband_iterations = 2, overwrite = True, directory = "fashion_mnist", project_name = "hyperband")

In [13]:
%pip install -q -U tensorboard_plugin_profile

#### TensorBoard Notes



```
def get_run_logdir(root_logdir="my_logs"):
  return Path(root_logdir) / strftime("run_%Y_%m_%d_%H_%M_%S")
  
run_logdir = get_run_logdir() # e.g., my_logs/run_2022_08_01_17_25_59
```


Keras provides a convenient TensorBoard() callback that creates the log
directory for you (along with its parent directories if needed) and writes
event files and summaries to it during training. It measures your model's
training and validation loss and metrics (in this notebook, the accuracy),
and it also profiles your neural network. It is straightforward to use:
```
tensorboard_cb = tf.keras.callbacks.TensorBoard(run_logdir,
                                                profile_batch=(100, 200))
```

Callback setups

In [15]:
root_logdir = Path(hyperband_tuner.project_dir) / "tensorboard"
tensorboard_cb = tf.keras.callbacks.TensorBoard(root_logdir)
early_stopping_cb = tf.keras.callbacks.EarlyStopping(patience = 2)

Tuning

In [None]:
hyperband_tuner.search(X_train, y_train, epochs = 30, validation_data = (X_valid, y_valid),
                       callbacks = [early_stopping_cb, tensorboard_cb])

Trial 180 Complete [00h 01m 00s]
val_accuracy: 0.787745475769043

Best val_accuracy So Far: 0.8316181898117065
Total elapsed time: 01h 21m 24s

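### Bayesian Optimization (Gaussian Process)

This tuner fits a probabilistic model called a Gaussian process to the results of past trials and uses it to choose which region of the hyperparameter space to try next.

> **alpha**

> represents the level of noise you expect in the performance measures across trials (it defaults to `1e-4`)

> **beta**

> specifies how much you want the algorithm to explore, instead of simply exploiting the known good regions of hyperparameter space (it defaults to 2.6)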
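To make the role of `beta` concrete, here is a small sketch of a confidence-bound acquisition rule, a common choice in Gaussian-process-based Bayesian optimization. It is an illustration under assumptions, not Keras Tuner's exact code: the `mean` and `std` values are made-up surrogate predictions for three hypothetical candidate configurations, and the objective is assumed to be maximized.

```python
def pick_next(candidates, beta):
  """Return the name of the candidate with the highest upper confidence bound."""
  # score = predicted mean + beta * predicted uncertainty
  return max(candidates, key=lambda name: candidates[name]["mean"] + beta * candidates[name]["std"])


# Hypothetical surrogate predictions of val_accuracy for three untried configurations.
candidates = {
    "near_best": {"mean": 0.83, "std": 0.01},   # well-explored region
    "unexplored": {"mean": 0.78, "std": 0.05},  # uncertain region
    "poor": {"mean": 0.60, "std": 0.02},
}

print(pick_next(candidates, beta=0.0))  # pure exploitation → near_best
print(pick_next(candidates, beta=2.6))  # more exploration  → unexplored
```

With `beta = 0` only the predicted mean matters, so the search stays in the known good region; a larger `beta` rewards uncertainty and pushes the search toward regions it knows little about.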

In [None]:
bayesian_opt_tuner = kt.BayesianOptimization(
    MyHyperModule(), objective = "val_accuracy", seed = 42,
    max_trials = 10, alpha = 1e-4, beta = 2.6, overwrite = True, directory= "my_fashion_mnist", project_name = "bayesian_opt"
)

bayesian_opt_tuner.search(X_train, y_train, epochs = 200, validation_data = (X_valid, y_valid), callbacks = [early_stopping_cb, tensorboard_cb])

## Extra Notes

Google has also used an evolutionary approach, not just to search for hyperparameters **but** also to explore all sorts of model architectures: it powers their AutoML service on Google Vertex AI.

The term AutoML refers to any system that takes care of a large part of the ML workflow.

Evolutionary algorithms have even been used successfully to train individual neural networks, replacing the ubiquitous gradient descent!