<a href="https://colab.research.google.com/github/j0hnn/j0hnn/blob/main/HyperParameterTuning%5BKeras%5D.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**What are not hyperparameters?**
The weights and biases that the network learns during training.
  
List of *hyperparameters* that can be adjusted before we train the network:
1.   **Data-level params**: data augmentation, stratification
2.   **Network architecture params**: number of layers, number of nodes in a layer, dropout, batch normalisation, network initialisation, etc.
3.   **Model training params**: optimiser, learning rate, momentum, learning rate schedulers, mini-batch or full-batch training, early stopping





For a simple network in Keras, we'll adjust the appropriate hyperparameters as we run into low performance (*possibly due to overfitting, underfitting, and the like*).

Let us start by adjusting the learning rate and the number of nodes in some layers.


*   LEARNING_RATE
*   DENSE_1
*   DENSE_2
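
To get a feel for why the learning rate is worth tuning, here is a minimal sketch of plain gradient descent on the toy function f(w) = w². It is not part of the notebook's model; the helper `gradient_descent` and the third rate (1.1) are illustrative assumptions.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimise f(w) = w**2 with plain gradient descent and return the final w."""
    w = start
    for _ in range(steps):
        grad = 2.0 * w                  # derivative of w**2
        w = w - learning_rate * grad    # one gradient descent update
    return w

# A very small rate barely moves, a moderate rate converges, a large rate diverges.
for lr in (0.001, 0.1, 1.1):
    print(f"learning_rate={lr}: final |w| = {abs(gradient_descent(lr)):.4f}")
```

The minimum is at w = 0, so a smaller final |w| means the run got closer to the optimum in the same number of steps.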


In [1]:
# imports
import tensorflow as tf
from tensorflow import keras
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import inspect

In [None]:
# create model
def create_model(learning_rate, dense_1, dense_2):
    assert learning_rate > 0 and dense_1 > 0 and dense_2 > 0, "set value higher than 0"

    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv2D(16, (3,3), activation="relu", input_shape=(32,32,3), padding="same"))
    model.add(tf.keras.layers.Conv2D(16, (3,3), activation="relu", padding="same"))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(int(dense_1), activation="relu", name="fc1"))
    model.add(tf.keras.layers.Dense(int(dense_2), activation="relu", name="fc2"))
    model.add(tf.keras.layers.Dense(10, name="output"))

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate)  # `lr` is deprecated
    model.compile(optimizer, loss=loss_fn, metrics=['accuracy'])

    return model
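
To see how `dense_1` and `dense_2` change the size of this network, the sketch below counts its trainable parameters by hand. The helper names are hypothetical, and the layer shapes are taken from `create_model` above (default stride of 1 is assumed, so "same" padding keeps the 32x32 size).

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolution layer."""
    return kernel_h * kernel_w * in_channels * filters + filters

def count_params(dense_1, dense_2):
    """Trainable parameters of the network built by create_model."""
    conv = conv2d_params(3, 3, 3, 16) + conv2d_params(3, 3, 16, 16)
    flat = 32 * 32 * 16  # Flatten output: 32x32 feature map with 16 channels
    fc = dense_params(flat, dense_1) + dense_params(dense_1, dense_2)
    out = dense_params(dense_2, 10)
    return conv + fc + out

print(count_params(32, 32))    # 528474
print(count_params(128, 256))
```

Almost all parameters sit in `fc1`, because it is connected to the 16384 flattened features, so `dense_1` affects model size far more than `dense_2` does.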

In [None]:
# train model
def train_model():

  # specify the hyperparameters
  LEARNING_RATE = 0.001  # example value; try others
  DENSE_1 = 32           # example value
  DENSE_2 = 32           # example value

  (train_x, train_y), (test_x, test_y) = tf.keras.datasets.cifar10.load_data()
  train_x, test_x = train_x / 255.0, test_x / 255.0

  model = create_model(learning_rate=LEARNING_RATE, dense_1=DENSE_1, dense_2=DENSE_2)

  checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    "model.h5", monitor='accuracy', save_best_only=True, save_freq=2)
  
  # Training
  model.fit(
      train_x, train_y, 
      validation_data=(test_x, test_y),
      verbose=0, batch_size=32, epochs=2, callbacks=[checkpoint_callback])
  return model

In [None]:
original_model = train_model()  # This trains the model and returns it.

# test the model
(train_x, train_y), (test_x, test_y) = tf.keras.datasets.cifar10.load_data()
test_x = test_x / 255.0

original_loss, original_accuracy = original_model.evaluate(test_x, test_y, verbose=0)
print("Loss is {:0.4f}".format(original_loss))
print("Accuracy is {:0.4f}".format(original_accuracy))



Loss is 1.9615
Accuracy is 0.2928
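
For context, it helps to compare these numbers with what a model that guesses uniformly at random would score. The sketch below assumes the ten CIFAR-10 classes are balanced in the test set.

```python
import math

num_classes = 10                      # CIFAR-10 has ten classes
chance_accuracy = 1.0 / num_classes   # expected accuracy of a uniform random guess
chance_loss = math.log(num_classes)   # cross-entropy of a uniform prediction

print("Chance accuracy: {:0.4f}".format(chance_accuracy))
print("Chance loss: {:0.4f}".format(chance_loss))
```

The untuned model is better than chance, but not by much, which leaves plenty of room for tuning.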


# **Ray[tune]**

---
Ray Tune is a hyperparameter tuning tool that works with the major deep learning frameworks. Tools like this let you experiment with a range of possible values for each hyperparameter instead of setting each one manually, which makes a broader search for good hyperparameters practical.

---


In [None]:
# Hyperparameter tuning with Ray. Skip this cell if Ray is already installed.

# install and import ray

!pip uninstall -y -q pyarrow
!pip install -q -U ray[tune]
!pip install -q ray[debug]

# After installation, go to Runtime in the menu bar (at the top) and restart the runtime, or press Ctrl + M followed by . (period)



In [None]:
import ray

ray.shutdown()  # Restart Ray defensively in case the ray connection is lost. 
ray.init(log_to_driver=False)
# We clean out the logs before running for a clean visualization later.
! rm -rf ~/ray_results/tune_iris

In [None]:
# just add this custom callback for using tune with keras.
# This callback reports the performance of the model after every epoch of the current trial to the tune master

from ray import tune

class TuneReporterCallback(keras.callbacks.Callback):
    """Tune Callback for Keras.
    
    The callback is invoked every epoch.
    """

    def __init__(self, logs=None):
        super().__init__()
        self.iteration = 0  # number of epochs reported so far

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}  # avoid a shared mutable default argument
        self.iteration += 1
        tune.report(keras_info=logs, mean_accuracy=logs.get("accuracy"), mean_loss=logs.get("loss"))

Modify the training function to use tuning

1. Pass config as argument to the train function


```
def tune_model(config):
```


2. Change the `create_model` call to use the options from config.


```
model = create_model(learning_rate=config['lr'], dense_1=config['dense_1'], dense_2=config['dense_2'])
```


In [None]:
def tune_model(config):
  (train_x, train_y), (test_x, test_y) = tf.keras.datasets.cifar10.load_data()
  train_x, test_x = train_x / 255.0, test_x / 255.0

  # Hyperparameters now come from the Tune config instead of hard-coded constants
  model = create_model(learning_rate=config['lr'], dense_1=config['dense_1'], dense_2=config['dense_2'])

  checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    "model.h5", monitor='accuracy', save_best_only=True, save_freq=2)

  # Train the model
  model.fit(
      train_x, train_y, 
      validation_data=(test_x, test_y),
      verbose=0, batch_size=32, epochs=2, callbacks=[checkpoint_callback, TuneReporterCallback()])

Add necessary parameter choices in the configuration dictionary

```
  # Choices of values for each hyperparameter can be specified as a Python dictionary.
  # tune.choice takes a single list of candidate values.
  hyperparameter_space =  {
      "lr": tune.choice([0.001, 0.1]), 
      "dense_1": tune.choice([2, 20, 64, 128]),
      "dense_2": tune.choice([2, 32, 64, 128, 256])
  } 
```
The number of samples (`num_samples`) is how many times Tune draws a configuration from the choices above. Each draw is run as one experiment trial, so it is effectively the number of trials you would like to run.
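
To make the relationship between the choices and the number of trials concrete, here is a minimal sketch in plain Python (no Ray involved). It uses the same candidate values as the search space above; `sample_configs` is a hypothetical helper that only imitates drawing one value per hyperparameter, not Tune's actual sampler.

```python
import itertools
import random

# Same candidate values as the search space above, as plain Python lists.
choices = {
    "lr": [0.001, 0.1],
    "dense_1": [2, 20, 64, 128],
    "dense_2": [2, 32, 64, 128, 256],
}

# An exhaustive grid would try every combination exactly once.
grid = list(itertools.product(*choices.values()))
print("Grid size:", len(grid))  # 2 * 4 * 5 = 40

def sample_configs(choices, num_samples, seed=0):
    """Draw num_samples independent configurations, one value per hyperparameter."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in choices.items()}
            for _ in range(num_samples)]

trials = sample_configs(choices, num_samples=20)
print("Trials:", len(trials))
print(trials[0])
```

Because each draw is independent, 20 samples cover at most 20 of the 40 combinations, and some combinations may be drawn more than once.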


In [None]:
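hyperparameter_space = {"lr": tune.choice([0.001, 0.1]), "dense_1": tune.choice([2, 20, 64, 128]),
                        "dense_2": tune.choice([2, 32, 64, 128, 256])}
num_samples = 20  # example value; increase it to explore more combinations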

In [None]:
# tune
analysis = tune.run(
    tune_model, 
    config=hyperparameter_space,
    resources_per_trial={'cpu':2, 'gpu':1},
    num_samples=num_samples
    )

In [None]:
# get best model for testing
logdir = analysis.get_best_logdir("keras_info/val_loss", mode="min")
# We saved the model as `model.h5` in the logdir of the trial.
from tensorflow.keras.models import load_model
tuned_model = load_model(logdir + "/model.h5")
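
The metric name `"keras_info/val_loss"` may look odd. The callback reports the Keras `logs` dictionary under the key `keras_info`, and nested results are addressed by joining keys with `/`. The sketch below only illustrates that naming. `flatten` is a hypothetical helper, not Ray's implementation, and the numbers are made-up placeholders.

```python
def flatten(result, prefix=""):
    """Flatten nested dicts, joining nested keys with '/'."""
    flat = {}
    for key, value in result.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "/"))
        else:
            flat[name] = value
    return flat

# Placeholder values, shaped like what TuneReporterCallback reports each epoch.
reported = {
    "keras_info": {"loss": 1.9, "accuracy": 0.3, "val_loss": 2.0, "val_accuracy": 0.29},
    "mean_accuracy": 0.3,
    "mean_loss": 1.9,
}
print(sorted(flatten(reported)))
```

The validation metrics appear in `logs` because `model.fit` is called with `validation_data`.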

In [None]:
# test tuned model
(train_x, train_y), (test_x, test_y) = tf.keras.datasets.cifar10.load_data()
test_x = test_x / 255.0

tuned_loss, tuned_accuracy = tuned_model.evaluate(test_x, test_y, verbose=0)
print("Loss is {:0.4f}".format(tuned_loss))
print("Accuracy is {:0.4f}".format(tuned_accuracy))


**Ref:**


1.   https://docs.ray.io/en/latest/tune/tutorials/overview.html
2.   https://www.youtube.com/watch?v=2QX6jjMt1Eg&t=494s



Exercise:



1.   For the three hyperparameters above, find which one has the most impact on model accuracy and give a possible reason.
2.   For the same model architecture, add at least one additional hyperparameter to the search space for tuning the model. Update the code accordingly and plot the performance impact of this hyperparameter.

