# Keras Tuner MNIST 98% Intro demo

This is an early demo of Keras Tuner that showcases its end-user API.

If you would like to get involved or alpha test it, please get in touch with us: solarwinds-ai@google.com

This tuner is developed by:
elieb@, invernizzi@, fchollet@

In [4]:
# Install dependencies
!pip install terminaltables colorama art etaprogress pandas
!pip install -i "https://google:protectG00gl3@pypi-dot-protect-research.appspot.com/pypi" -U kerastuner

Collecting pandas
  Downloading https://files.pythonhosted.org/packages/d0/4e/9db3468e504ac9aeadb37eb32bcf0a74d063d24ad1471104bd8a7ba20c97/pandas-0.24.2-cp36-cp36m-win_amd64.whl (8.8MB)
Installing collected packages: pandas
Successfully installed pandas-0.24.2
Looking in indexes: https://google:****@pypi-dot-protect-research.appspot.com/pypi
Requirement already up-to-date: kerastuner in c:\users\elie\appdata\local\programs\python\python36\lib\site-packages (0.7.1547413205)


In [2]:
!pip list --extra-index-url "https://google:protectG00gl3@pypi-dot-protect-research.appspot.com/pypi" | grep kerastuner

In [15]:
# standard imports
from IPython.display import clear_output
import numpy as np
import pandas as pd 
import matplotlib.pyplot as plt
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, GlobalMaxPooling2D
from tensorflow.keras.optimizers import Adam

# Importing the MNIST dataset
X is reshaped to (28, 28, 1) and scaled to the [0, 1] range, and Y is one-hot encoded over the 10 digit classes.

In [16]:
def normalize(x, y):
  x = x.reshape(-1, 28, 28, 1).astype('float32') / 255.0
  y = to_categorical(y, 10)
  return x, y
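
To see what this preprocessing does to shapes and value ranges without downloading MNIST, here is a minimal NumPy-only sketch. The function name `normalize_np` and the synthetic data are illustrative assumptions, and indexing an identity matrix stands in for `to_categorical`.

```python
import numpy as np

def normalize_np(x, y, num_classes=10):
    """NumPy-only stand-in for the notebook's normalize() (hypothetical helper)."""
    # Add a trailing channel axis and scale uint8 pixels from [0, 255] to [0, 1].
    x = x.reshape(-1, 28, 28, 1).astype("float32") / 255.0
    # One-hot encode the labels by picking rows of an identity matrix.
    y = np.eye(num_classes, dtype="float32")[y]
    return x, y

# Synthetic stand-in data (not MNIST): 4 random "images" and 4 labels.
rng = np.random.default_rng(0)
fake_x = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_y = np.array([3, 0, 9, 3])

x, y = normalize_np(fake_x, fake_y)
print(x.shape, x.dtype)  # image batch with a channel axis, as float32
print(y.shape)           # one row of 10 class indicators per label
print(y[0])              # the row for label 3
```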

In [17]:
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, y_train = normalize(x_train, y_train)
x_test, y_test = normalize(x_test, y_test)

# Training a simple sequential model

In [18]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(20, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 24, 24, 64)        18496     
_________________________________________________________________
flatten (Flatten)            (None, 36864)             0         
_________________________________________________________________
dense (Dense)                (None, 20)                737300    
_________________________________________________________________
dropout (Dropout)            (None, 20)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 10)                210       
Total params: 756,326
Trainable params: 756,326
Non-trainable params: 0
_________________________________________________________________
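
The parameter counts in this summary can be checked by hand. The sketch below recomputes them from the layer definitions, using the standard formulas for convolution and dense layers with a bias term; the helper names are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter holds kernel_h * kernel_w * in_channels weights plus one bias.
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    # Each output unit holds one weight per input plus one bias.
    return (in_units + 1) * out_units

conv1 = conv2d_params(3, 3, 1, 32)     # 28x28x1 input -> 26x26x32 output
conv2 = conv2d_params(3, 3, 32, 64)    # 26x26x32 -> 24x24x64
flat_units = 24 * 24 * 64              # Flatten has no parameters
dense1 = dense_params(flat_units, 20)
dense2 = dense_params(20, 10)          # Dropout has no parameters
total = conv1 + conv2 + dense1 + dense2

print(conv1, conv2, flat_units, dense1, dense2, total)
# prints: 320 18496 36864 737300 210 756326
```

Almost all of the parameters sit in the first dense layer, because it is fed the full 36,864-unit flattened feature map.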


In [19]:
model.fit(x_train, y_train, epochs=3)
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Epoch 1/3
Epoch 2/3
Epoch 3/3
Test loss: 0.04414784894739278
Test accuracy: 0.9857


# Hypertuning model


Defining what to hypertune is as easy as writing a Keras/TF 2.0 model. The only difference is that the layer definitions use distributions of parameters instead of fixed values.


## Keras Tuner imports
Keras Tuner is a standard package that is imported like any other Python package. Two main types of imports are needed:
- **distributions**: these imports are used to instantiate the hyperparameters
- **tuners**: this import specifies which hypertuner algorithm you would like to use

In [20]:
# kerastuner imports
from kerastuner.distributions import Range, Choice, Boolean, Fixed, Linear, clear_hyper_parameters
from kerastuner.tuners import RandomSearch

## hypermodel creation

Creating a hypertunable model is as easy as taking the initial model, replacing some of its fixed parameters with distributions, and wrapping it in a function that can be passed to the tuner. Here is how to do it for the MNIST model defined above:

In [21]:
# this is the wrapping function that will be passed to the tuner
def model_fn():

    # define hyper_params
    L1_NUM_FILTERS = Range('l1_num_filters', 8, 64, 8, group='cnn')
    L2_NUM_FILTERS = Range('l2_num_filters', 8, 64, 8, group='cnn')
    NUM_DIMS = Range('num_dims', 8, 32, 8, group='dense')
    NUM_LAYERS = Range('num_layers', 1, 3, group='dense')
    DROPOUT_RATE = Linear('dropout_rate', 0.0, 0.5, 5, group='dense')

    # hypermodel: simply replace fixed parameters with hyper ones.
    model = Sequential()
    model.add(Conv2D(L1_NUM_FILTERS, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
    model.add(Conv2D(L2_NUM_FILTERS, kernel_size=(3, 3), activation='relu'))
    model.add(Flatten())
    for _ in range(NUM_LAYERS):
        model.add(Dense(NUM_DIMS, activation='relu'))
        model.add(Dropout(DROPOUT_RATE))
    model.add(Dense(10, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

### testing hypermodel
Let's make sure our hypermodel works as intended by calling the function. This is one of the nice features of Keras Tuner: once called, the hypermodel is a standard TF/Keras model, which means you can use your normal workflow to inspect it.

In [22]:
clear_hyper_parameters()  # this will be removed with upcoming new distribution system
test = model_fn()
test.summary()  # let's just check that the model summary looks reasonable

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_2 (Conv2D)            (None, 26, 26, 56)        560       
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 24, 24, 24)        12120     
_________________________________________________________________
flatten_1 (Flatten)          (None, 13824)             0         
_________________________________________________________________
dense_2 (Dense)              (None, 24)                331800    
_________________________________________________________________
dropout_1 (Dropout)          (None, 24)                0         
_________________________________________________________________
dense_3 (Dense)              (None, 24)                600       
_________________________________________________________________
dropout_2 (Dropout)          (None, 24)                0         
__________

## Instantiating tuner
FIXME: explain the function

In [23]:
# 10 models with 3 epochs each (epoch_budget=30 / max_epochs=3)
hm = RandomSearch(model_fn, epoch_budget=30, max_epochs=3, project='mnist_ht', architecture='conv')

[INFO] Model checkpoint enabled - metric:val_loss mode:min
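
Conceptually, a random search with an epoch budget samples one configuration per trial, scores it, and keeps the configuration with the lowest tracked metric (here `val_loss`, mode `min`). The sketch below illustrates that loop in plain Python. It is not the Keras Tuner implementation: the function `random_search`, the dictionary-based search space, and `toy_objective` (a made-up score that replaces real training) are all assumptions for illustration.

```python
import random

def random_search(space, objective, epoch_budget, max_epochs, seed=0):
    """Illustrative random search: lower objective values are better."""
    rng = random.Random(seed)
    # Each trial is assumed to consume max_epochs of the total budget.
    num_trials = epoch_budget // max_epochs
    best_config, best_score = None, float("inf")
    history = []
    for _ in range(num_trials):
        # Sample every hyperparameter independently and uniformly.
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        history.append((config, score))
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score, history

# Candidate values per hyperparameter (a subset of the notebook's names).
space = {
    "num_dims": [8, 16, 24],
    "num_layers": [1, 2],
    "dropout_rate": [0.0, 0.125, 0.25, 0.375, 0.5],
}

def toy_objective(config):
    # Hypothetical stand-in for val_loss; no model is trained here.
    return abs(config["num_dims"] - 16) / 16 + config["dropout_rate"]

best_config, best_score, history = random_search(
    space, toy_objective, epoch_budget=30, max_epochs=3
)
print(len(history))  # 30 // 3 = 10 trials
print(best_config, best_score)
```

With `epoch_budget=30` and `max_epochs=3`, this budget arithmetic gives 10 trials.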


## hypermodel summary

Like a TF/Keras model, a hypermodel provides a summary. This summary tells you how big your search space is.

In [25]:
hm.summary()

Group,Size
cnn,49
dense,30


Group,Param,Space size
cnn,l1_num_filters,7
cnn,l2_num_filters,7
dense,num_dims,3
dense,num_layers,2
dense,dropout_rate,5
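
The sizes in this summary can be reproduced by counting the values of each hyperparameter. The sketch below does so under two assumptions inferred from the summary and the sampled values, not from the library's documentation: `Range(start, stop, step)` excludes `stop` like Python's `range()`, and `Linear(start, stop, n)` yields `n` evenly spaced values including both endpoints. The helper names are illustrative.

```python
from math import prod

def range_values(start, stop, step=1):
    # Assumption: the end value is excluded, like Python's range().
    return list(range(start, stop, step))

def linear_values(start, stop, num_values):
    # Assumption: evenly spaced values with both endpoints included.
    step = (stop - start) / (num_values - 1)
    return [start + i * step for i in range(num_values)]

space = {
    "cnn": {
        "l1_num_filters": range_values(8, 64, 8),
        "l2_num_filters": range_values(8, 64, 8),
    },
    "dense": {
        "num_dims": range_values(8, 32, 8),
        "num_layers": range_values(1, 3),
        "dropout_rate": linear_values(0.0, 0.5, 5),
    },
}

# A group's size is the product of the sizes of its parameters.
group_sizes = {
    group: prod(len(values) for values in params.values())
    for group, params in space.items()
}
total_size = prod(group_sizes.values())

print(group_sizes)  # {'cnn': 49, 'dense': 30}
print(total_size)   # 1470
```

Under these assumptions the full search space holds 1,470 configurations, so a 10-model random search samples well under 1% of it.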


## Cloud service
FIXME explain cloud service

In [0]:
hm.enable_cloud(api_key="6a6b3c36962914f2")

AttributeError: ignored

## searching for the best model

In [26]:
hm.search(x_train, y_train, validation_data=(x_test, y_test))

Hyperparameter,Value
cnn:l1_num_filters,40.0
cnn:l2_num_filters,32.0
dense:num_dims,8.0
dense:num_layers,1.0
dense:dropout_rate,0.375


Metric,Best model,Last model
loss,9223372036854775807,0.6114
val_loss,9223372036854775807,0.1282
acc,-1,0.7524
val_acc,-1,0.9721

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,24
cnn:l2_num_filters,8
dense:num_dims,16
dense:num_layers,1
dense:dropout_rate,0


Metric,Best model,Last model
loss,0.6114,0.0489
val_loss,0.1282,0.0544
acc,0.7524,0.9847
val_acc,0.9721,0.9822

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,8
cnn:l2_num_filters,48
dense:num_dims,16
dense:num_layers,2
dense:dropout_rate,0


Metric,Best model,Last model
loss,0.0489,0.0434
val_loss,0.0544,0.0532
acc,0.9847,0.9863
val_acc,0.9822,0.9837

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,40.0
cnn:l2_num_filters,32.0
dense:num_dims,8.0
dense:num_layers,2.0
dense:dropout_rate,0.5


Metric,Best model,Last model
loss,0.0434,1.6775
val_loss,0.0532,0.9992
acc,0.9863,0.3669
val_acc,0.9837,0.7419

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,32
cnn:l2_num_filters,56
dense:num_dims,8
dense:num_layers,1
dense:dropout_rate,0


Metric,Best model,Last model
loss,0.0434,0.0603
val_loss,0.0532,0.0637
acc,0.9863,0.9825
val_acc,0.9837,0.9821

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,24
cnn:l2_num_filters,8
dense:num_dims,8
dense:num_layers,2
dense:dropout_rate,0


Metric,Best model,Last model
loss,0.0434,0.0987
val_loss,0.0532,0.0942
acc,0.9863,0.9701
val_acc,0.9837,0.973

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,56.0
cnn:l2_num_filters,32.0
dense:num_dims,8.0
dense:num_layers,2.0
dense:dropout_rate,0.5


Metric,Best model,Last model
loss,0.0434,1.7356
val_loss,0.0532,1.1459
acc,0.9863,0.3132
val_acc,0.9837,0.6236

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,40
cnn:l2_num_filters,48
dense:num_dims,8
dense:num_layers,2
dense:dropout_rate,0


Metric,Best model,Last model
loss,0.0434,0.0895
val_loss,0.0532,0.0873
acc,0.9863,0.9754
val_acc,0.9837,0.9782

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,40.0
cnn:l2_num_filters,56.0
dense:num_dims,24.0
dense:num_layers,1.0
dense:dropout_rate,0.5


Metric,Best model,Last model
loss,0.0434,0.2832
val_loss,0.0532,0.055
acc,0.9863,0.8915
val_acc,0.9837,0.9839

Error,count
collisions,0
invalid models,0
over size models,0


Hyperparameter,Value
cnn:l1_num_filters,56.0
cnn:l2_num_filters,48.0
dense:num_dims,24.0
dense:num_layers,1.0
dense:dropout_rate,0.375


KeyboardInterrupt: 

In [27]:
hm.display_result_summary()

metric,model 0,model 1,model 2,model 3,model 4,model 5,model 6,model 7,model 8
loss,0.0434,0.0489,0.0603,0.0895,0.0987,0.2832,0.6114,1.6775,1.7356
acc,0.9863,0.9847,0.9825,0.9754,0.9701,0.8915,0.7524,0.3669,0.3132
val_acc,0.9837,0.9822,0.9821,0.9782,0.973,0.9839,0.9721,0.7419,0.6236
val_loss,0.0532,0.0544,0.0637,0.0873,0.0942,0.055,0.1282,0.9992,1.1459


hyperparam,model 0,model 1,model 2,model 3,model 4,model 5,model 6,model 7,model 8
cnn,,,,,,,,,
|-l1_num_filters,8.0,24.0,32.0,40.0,24.0,40.0,40.0,40.0,56.0
|-l2_num_filters,48.0,8.0,56.0,48.0,8.0,56.0,32.0,32.0,32.0
dense,,,,,,,,,
|-dropout_rate,0.0,0.0,0.0,0.0,0.0,0.5,0.375,0.5,0.5
|-num_dims,16.0,16.0,8.0,8.0,8.0,24.0,8.0,8.0,8.0
|-num_layers,2.0,1.0,1.0,2.0,2.0,1.0,1.0,2.0,2.0
hidden_layers,,,,,,,,,
|-2nd hidden layer,,,,,,,,,
|-activation,,,,,,,,,
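
The result summary above can be explored further once the numbers are in plain Python lists. The sketch below transcribes the `val_loss`, `val_acc`, and `dropout_rate` rows from the table and compares two ways of picking a winner. The variable names are illustrative, and with only nine models trained for three epochs each, the dropout comparison is a rough indication rather than a conclusion.

```python
# Rows transcribed from the result summary, indexed by model number 0..8.
val_loss = [0.0532, 0.0544, 0.0637, 0.0873, 0.0942, 0.055, 0.1282, 0.9992, 1.1459]
val_acc = [0.9837, 0.9822, 0.9821, 0.9782, 0.973, 0.9839, 0.9721, 0.7419, 0.6236]
dropout_rate = [0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.375, 0.5, 0.5]

models = range(len(val_loss))

# The two criteria do not have to agree on the best model.
best_by_val_loss = min(models, key=lambda i: val_loss[i])
best_by_val_acc = max(models, key=lambda i: val_acc[i])

def mean(values):
    return sum(values) / len(values)

# Compare validation accuracy with and without dropout.
mean_no_dropout = mean([val_acc[i] for i in models if dropout_rate[i] == 0.0])
mean_with_dropout = mean([val_acc[i] for i in models if dropout_rate[i] > 0.0])

print("best by val_loss: model", best_by_val_loss)
print("best by val_acc:  model", best_by_val_acc)
print("mean val_acc, dropout == 0:", round(mean_no_dropout, 4))
print("mean val_acc, dropout > 0: ", round(mean_with_dropout, 4))
```

Model 0 has the lowest `val_loss`, while model 5 has a marginally higher `val_acc`, so the choice of tracked metric decides which model counts as best.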
