# Designing and tuning a Deep Learning framework with Grid Search

In this case we will simply use the MNIST dataset, since it can be loaded conveniently from Keras.

## Imports

In [1]:
import keras
from keras.models import Model
from keras.layers import Input, Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.datasets import mnist
from keras.models import Sequential, load_model
from keras.optimizers import RMSprop
from keras.layers import LeakyReLU

from keras import backend as K
import numpy as np

Using TensorFlow backend.


## Loading and preprocessing MNIST

In [2]:
batch_size = 128
num_classes = 10
epochs = 20

# load and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
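
To make the preprocessing steps concrete, here is a minimal NumPy sketch on a toy array instead of the real dataset. The names `images`, `labels`, `x` and `y` are made up for this illustration, and the identity-matrix trick only mimics the effect of `to_categorical`; it is not meant to show how Keras implements it.

```python
import numpy as np

num_classes = 10

# Toy stand-in for MNIST: 4 fake 28x28 "images" with integer pixel values 0..255.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
labels = np.array([3, 0, 9, 3])

# Flatten each image to a 784-vector and scale the pixels to [0, 1].
x = images.reshape(len(images), 784).astype('float32') / 255

# One-hot encode: row i of the identity matrix is the encoding of class i.
y = np.eye(num_classes, dtype='float32')[labels]

print(x.shape, y.shape)
print(y[0])
```

Each label becomes a row with a single 1 at the position of its class, which is the target format that `categorical_crossentropy` expects.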

## Designing a simple model and Grid Search

We will search for the best activation function for the first layer and the best optimizer. The grid has 3 activation functions × 7 optimizers = 21 combinations, which we cover with two nested loops.

Let's save the score for each hyperparameter combination, so that we can carry on with the best-performing one.
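
As a side note, the same grid can be enumerated without nesting. The sketch below assumes Python's standard `itertools.product`; the name `grid` is introduced only for this illustration and does not appear in the notebook.

```python
from itertools import product

activation_functions_layer_1 = ['sigmoid', 'tanh', 'relu']
optimizers = ['rmsprop', 'adagrad', 'adadelta', 'adam', 'Nadam', 'Adamax', 'SGD']

# Cartesian product: the activation varies slowest, like the outer loop of a nested for.
grid = list(product(activation_functions_layer_1, optimizers))

print(len(grid))  # 21
print(grid[0])    # ('sigmoid', 'rmsprop')
```

Iterating over `grid` visits the combinations in the same order as the nested loops, so the model counter and the printed progress messages would stay the same.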

In [3]:
import pandas as pd

# define parameter grid
activation_functions_layer_1 = ['sigmoid','tanh','relu']
optimizers = ['rmsprop','adagrad','adadelta','adam','Nadam','Adamax','SGD']
num_hyperparams = len(optimizers)*len(activation_functions_layer_1)
counter = 1
df = pd.DataFrame(columns=['optimizers','activation_functions_layer_1','score','file name'])

# optimize over parameter grid (grid search)
for activation_function_layer_1 in activation_functions_layer_1:
    for optimizer in optimizers:
        print('Model %s of %s. Hyperparams: %s, %s' % (counter, num_hyperparams, optimizer, activation_function_layer_1))
        counter = counter+1
        model = Sequential()
        model.add(Dense(512, activation = activation_function_layer_1, input_shape=(784,)))
        model.add(Dense(num_classes, activation='softmax'))

        model.compile(loss='categorical_crossentropy',
                      optimizer=optimizer,
                      metrics=['accuracy'])

        model.fit(x_train, y_train,
                  batch_size=batch_size,
                  epochs=epochs,
                  verbose=1,
                  validation_data=(x_test, y_test))
        
        score = model.evaluate(x_test, y_test, verbose=0)
        save_path = "ker_func_mnist_model_2.%s.%s.%s.h5" % (activation_function_layer_1,optimizer,score[1])
        model.save(save_path)
        
        # DataFrame.append was removed in pandas 2.0; add the row in the column order of df instead
        df.loc[len(df)] = [optimizer, activation_function_layer_1, score[1], save_path]

Model 1 of 21. Hyperparams: rmsprop, sigmoid


Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Model 2 of 21. Hyperparams: adagrad, sigmoid
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Model 3 of 21. Hyperparams: adadelta, sigmoid
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
Model 4 of 21. Hyperparams: adam, sigmoid
Train on 60000 samples, 

## Model evaluation
Let's have a look at all the models and see which hyperparameter configuration was the best one. The majority of hyperparameter combinations yield more than 95% accuracy on the test set, which also served as the validation data during training.

In [4]:
df = df.sort_values(by=['score'], ascending=False)
df

| | optimizers | activation_functions_layer_1 | score | file name |
|---|---|---|---|---|
| 16 | adadelta | relu | 0.9846 | ker_func_mnist_model_2.relu.adadelta.0.9846.h5 |
| 7 | rmsprop | tanh | 0.9829 | ker_func_mnist_model_2.tanh.rmsprop.0.9829.h5 |
| 19 | Adamax | relu | 0.9827 | ker_func_mnist_model_2.relu.Adamax.0.9827.h5 |
| 0 | rmsprop | sigmoid | 0.9826 | ker_func_mnist_model_2.sigmoid.rmsprop.0.9826.h5 |
| 9 | adadelta | tanh | 0.9824 | ker_func_mnist_model_2.tanh.adadelta.0.9824.h5 |
| 12 | Adamax | tanh | 0.9824 | ker_func_mnist_model_2.tanh.Adamax.0.9824.h5 |
| 15 | adagrad | relu | 0.9818 | ker_func_mnist_model_2.relu.adagrad.0.9818.h5 |
| 11 | Nadam | tanh | 0.9818 | ker_func_mnist_model_2.tanh.Nadam.0.9818.h5 |
| 10 | adam | tanh | 0.9816 | ker_func_mnist_model_2.tanh.adam.0.9816.h5 |
| 3 | adam | sigmoid | 0.9809 | ker_func_mnist_model_2.sigmoid.adam.0.9809.h5 |
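
Instead of reading the winner off the table, the best file name can also be picked programmatically. The following is a small sketch, assuming a frame with the same columns as `df`; to stay self-contained it rebuilds only three rows copied from the table above, and the names `best` and `best_file` are introduced here for illustration.

```python
import pandas as pd

# Three rows copied from the results table above.
df = pd.DataFrame({
    'optimizers': ['rmsprop', 'adadelta', 'Adamax'],
    'activation_functions_layer_1': ['tanh', 'relu', 'relu'],
    'score': [0.9829, 0.9846, 0.9827],
    'file name': [
        'ker_func_mnist_model_2.tanh.rmsprop.0.9829.h5',
        'ker_func_mnist_model_2.relu.adadelta.0.9846.h5',
        'ker_func_mnist_model_2.relu.Adamax.0.9827.h5',
    ],
})

# idxmax returns the index label of the highest score; loc fetches that row.
best = df.loc[df['score'].idxmax()]
best_file = best['file name']

print(best_file)  # ker_func_mnist_model_2.relu.adadelta.0.9846.h5
```

This works on the unsorted frame as well, so it does not depend on `sort_values` having been called first.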


## Save the model for deployment

Now it's time to create a tarball of the best-performing model. This tarball can be uploaded to cloud services such as IBM Watson ML and deployed from there.

In [5]:
!tar -zcvf my_best_model.tgz ker_func_mnist_model_2.relu.adadelta.0.9846.h5

a ker_func_mnist_model_2.relu.adadelta.0.9846.h5
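
If the `tar` command is not available in the notebook environment, the same archive could be built from Python. This is a sketch using the standard `tarfile` module; so that it runs anywhere, it writes a placeholder file into a temporary directory instead of using the real saved model, and the names `workdir`, `model_path` and `members` are made up for this example.

```python
import os
import tarfile
import tempfile

model_file = 'ker_func_mnist_model_2.relu.adadelta.0.9846.h5'

with tempfile.TemporaryDirectory() as workdir:
    # Placeholder bytes stand in for the real saved model so the sketch is runnable.
    model_path = os.path.join(workdir, model_file)
    with open(model_path, 'wb') as f:
        f.write(b'placeholder')

    archive_path = os.path.join(workdir, 'my_best_model.tgz')
    # 'w:gz' writes a gzip-compressed tar archive, like `tar -zcf`.
    with tarfile.open(archive_path, 'w:gz') as tar:
        # arcname stores the file without its directory prefix.
        tar.add(model_path, arcname=model_file)

    # Reopen the archive to check its contents.
    with tarfile.open(archive_path, 'r:gz') as tar:
        members = tar.getnames()

print(members)  # ['ker_func_mnist_model_2.relu.adadelta.0.9846.h5']
```

In the notebook itself, the temporary directory and the placeholder file would be dropped, and `tar.add` would point at the `.h5` file written by `model.save`.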
