
https://www.tensorflow.org/tutorials/keras/keras_tuner#download_and_prepare_the_dataset


This notebook uses the Keras Tuner to perform hyperparameter tuning (hypertuning) for an image classification application.

In [1]:
pip install keras-tuner

Collecting keras-tuner
  Downloading keras_tuner-1.0.4-py3-none-any.whl (97 kB)
Collecting kt-legacy
  Downloading kt_legacy-1.0.4-py3-none-any.whl (9.6 kB)
Installing collected packages: kt-legacy, keras-tuner
Successfully installed keras-tuner-1.0.4 kt-legacy-1.0.4


In [2]:
import tensorflow as tf
import keras_tuner as kt

The Keras Tuner is a library that helps you pick the optimal set of hyperparameters for your TensorFlow program. The process of selecting the right set of hyperparameters for your machine learning (ML) application is called hyperparameter tuning or hypertuning.

**Hyperparameters are of two types:**

* Model hyperparameters which influence model selection such as the number and width of hidden layers
* Algorithm hyperparameters which influence the speed and quality of the learning algorithm such as the learning rate for Stochastic Gradient Descent (SGD) and the number of nearest neighbors for a k Nearest Neighbors (KNN) classifier

## Download and prepare the dataset

In [3]:
(train_images, train_labels),(test_images, test_labels) = tf.keras.datasets.fashion_mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz


In [4]:
print('Shape of training images',train_images.shape)
print('Shape of training labels',train_labels.shape)
print('Shape of testing images',test_images.shape)
print('Shape of testing labels',test_labels.shape)

Shape of training images (60000, 28, 28)
Shape of training labels (60000,)
Shape of testing images (10000, 28, 28)
Shape of testing labels (10000,)


## Normalize pixel values between 0 and 1

In [5]:
train_images_norm = train_images.astype('float32')/255
test_images_norm = test_images.astype('float32')/255

## Model building

When you build a model for hypertuning, you define the hyperparameter search space in addition to the model architecture. The model you set up for hypertuning is called a hypermodel.

You can define a hypermodel through two approaches:

* By using a model builder function
* By subclassing the HyperModel class of the Keras Tuner API

You can also use two pre-defined HyperModel classes, `HyperXception` and `HyperResNet`, for computer vision applications.

In [6]:
def model_builder(hp):
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Flatten(input_shape=(28,28)))

  # Tune the number of units in the first Dense layer.
  # hp.Int('units', ...) samples an integer from 32 to 512 in steps of 32.
  hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
  model.add(tf.keras.layers.Dense(units=hp_units, activation='relu'))
  model.add(tf.keras.layers.Dense(10))

  # Tune the learning rate for the Adam optimizer: 0.01, 0.001, or 0.0001.
  hp_lr = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
  model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=hp_lr),
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])

  # The builder function must return a compiled model.
  return model
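
To get a feel for the size of the search space declared in `model_builder`, the candidate values can be enumerated in plain Python. This is only an illustrative sketch and does not use the Keras Tuner API; the names `unit_choices`, `lr_choices`, and `search_space` are made up for this example.

```python
from itertools import product

# Values that hp.Int('units', min_value=32, max_value=512, step=32) can take
# (max_value is inclusive).
unit_choices = list(range(32, 512 + 1, 32))
# Values offered by hp.Choice('learning_rate', ...).
lr_choices = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair the tuner could propose.
search_space = list(product(unit_choices, lr_choices))
print(len(unit_choices), len(lr_choices), len(search_space))  # 16 3 48
```

A tuner samples configurations from this space; it does not necessarily train every combination.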

Instantiate the tuner to perform the hypertuning. The Keras Tuner has four tuners available - RandomSearch, Hyperband, BayesianOptimization, and Sklearn. 

To instantiate the Hyperband tuner, you must specify the hypermodel, the objective to optimize and the maximum number of epochs to train (max_epochs).

In [7]:
tuner = kt.Hyperband(model_builder, objective='val_accuracy', max_epochs=10, factor=3, directory='my_dir', project_name='intro_to_kt')

The Hyperband tuning algorithm uses adaptive resource allocation and early stopping to quickly converge on a high-performing model, using a sports-championship-style bracket. The algorithm trains a large number of models for a few epochs and carries forward only the top-performing models to the next round; the `factor` argument is the reduction factor applied between rounds. Hyperband determines the number of models to train in a bracket by computing $1 + \log_{\text{factor}}(\text{max\_epochs})$ and rounding it up to the nearest integer.
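
As a quick worked example, the expression above can be evaluated for the settings used in this notebook (`max_epochs=10`, `factor=3`). The helper name `models_per_bracket` is hypothetical, and the sketch only evaluates the quoted formula; it does not reproduce Keras Tuner's internal scheduling.

```python
import math


def models_per_bracket(max_epochs, factor):
    """Evaluate 1 + log_factor(max_epochs), rounded up to the nearest integer."""
    return math.ceil(1 + math.log(max_epochs, factor))


# log_3(10) is about 2.096, so 1 + 2.096 rounds up to 4.
print(models_per_bracket(max_epochs=10, factor=3))  # 4
```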

In [8]:
early_stopping_cb = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=1)
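
The callback above monitors `val_loss` with `patience=1`, so training stops at the first epoch whose validation loss fails to improve on the best value seen so far. The following is a simplified plain-Python sketch of that rule, not the Keras implementation; the function name `stopping_epoch` and the loss values are made up for illustration.

```python
def stopping_epoch(val_losses, patience=1):
    """Return the 1-based epoch at which training would stop."""
    best = float('inf')
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses)  # rule never triggered: all epochs run


# Hypothetical validation losses: epoch 4 is the first one that fails to improve.
print(stopping_epoch([0.50, 0.42, 0.40, 0.41, 0.39]))  # 4
```

A larger `patience` tolerates temporary plateaus at the cost of extra training epochs.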

Run the hyperparameter search. The arguments to the search method are the same as those of `tf.keras.Model.fit`, including the early-stopping callback defined above.

In [9]:
tuner.search(train_images_norm, train_labels, epochs=50, validation_split=0.2, callbacks=[early_stopping_cb])

Trial 30 Complete [00h 00m 21s]
val_accuracy: 0.8460833430290222

Best val_accuracy So Far: 0.8829166889190674
Total elapsed time: 00h 10m 47s
INFO:tensorflow:Oracle triggered exit


In [10]:
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]


print('Best Hyperparameters found')
print('No: units in first hidden layer',best_hps.get('units'))
print('Learning rate for the optimizer',best_hps.get('learning_rate'))

Best Hyperparameters found
No: units in first hidden layer 320
Learning rate for the optimizer 0.001


## Train the model
Find the optimal number of epochs to train the model with the hyperparameters obtained from the search.

In [12]:
model = tuner.hypermodel.build(best_hps)
history = model.fit(train_images_norm, train_labels, validation_split=0.2, epochs=50, callbacks=[early_stopping_cb])

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50


In [15]:
val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch))+1

In [16]:
print('best epoch',best_epoch)

best epoch 7
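
The epoch selection above is an argmax over the validation-accuracy history, shifted by one because epochs are numbered from 1. A small sketch with made-up accuracy values (not taken from the run above) shows how ties are handled:

```python
# Hypothetical validation accuracies, one per epoch.
val_acc_per_epoch = [0.84, 0.86, 0.87, 0.88, 0.88, 0.87]

# list.index returns the first position of the maximum, so ties resolve
# to the earliest epoch; +1 converts the 0-based index to an epoch number.
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print(best_epoch)  # 4
```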


Re-instantiate the hypermodel and train it with the optimal number of epochs.

In [17]:
hyper_model = tuner.hypermodel.build(best_hps)
history_hm = hyper_model.fit(train_images_norm, train_labels, validation_split=0.2, epochs=best_epoch)

Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7


Evaluate the hypermodel on the test data.

In [18]:
loss, accuracy = hyper_model.evaluate(test_images_norm, test_labels)
print('Loss', loss)
print('Accuracy',accuracy)

Loss 0.3523252010345459
Accuracy 0.8754000067710876
