# Explanation

Grid search is a model hyperparameter optimization technique.

When constructing this class you must provide a dictionary of hyperparameters to evaluate in the param_grid argument. This is a map of the model parameter name and an array of values to try.

By default, the grid search will only use one thread. By setting the n_jobs argument in the GridSearchCV constructor to -1, the process will use all cores on your machine. Depending on your Keras backend, this may interfere with the main neural network training process.

The GridSearchCV process will then construct and evaluate one model for each combination of parameters. Cross validation is used to evaluate each individual model and the default of 3-fold cross validation is used, although this can be overridden by specifying the cv argument to the GridSearchCV constructor.

In [8]:
# Seed value
# Apparently you may use different seed values at each stage
seed_value = 42

# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)

# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)

# 4. Set the `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
tf.random.set_seed(seed_value)

In [9]:
import numpy as np
import pandas as pd
import joblib
import tensorflow as tf
from sklearn.model_selection import GridSearchCV
from tensorflow.keras.wrappers.scikit_learn import KerasClassifier
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense, Activation, BatchNormalization
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.optimizers import SGD, Adamax
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import img_to_array, load_img, ImageDataGenerator

# Dataset preperation

Make a DataFrame

In [21]:
filenames = os.listdir("input/data/train")
categories = []
file_with_path = []
for filename in filenames:
    category = filename.split('.')[0]
    if category == 'dog':
        categories.append(1)
    else:
        categories.append(0)
    filename = 'input/data/train/'+filename
    file_with_path.append(filename)

df = pd.DataFrame({
    'filename': file_with_path,
    'category': categories
})

In [22]:
# See how DF looks like
df.head()
df.filename.describe()

count                              25000
unique                             25000
top       input/data/train/dog.12215.jpg
freq                                   1
Name: filename, dtype: object

## Function for creating the model to use by Grid Search

In [23]:
IMAGE_WIDTH = 128
IMAGE_HEIGHT = 128
IMAGE_SIZE = (IMAGE_WIDTH, IMAGE_HEIGHT)
IMAGE_CHANNELS = 3

# Data split into train and validation df

In [24]:
from tqdm import tqdm
img_pixel = []
for i, img_id in tqdm(enumerate(df['filename'].values)):
    img = load_img(img_id, target_size=(IMAGE_WIDTH, IMAGE_HEIGHT))
    img = img_to_array(img)
    img_pixel.append(img)

25000it [01:58, 210.68it/s]


In [None]:
# img_pixel=np.array([img_to_array(load_img(img, target_size=IMAGE_SIZE)) for img in df['filename'].values.tolist() ])

In [None]:
img_pixel.shape

Label encoding dog and cat for prediction

In [38]:
img_label=df.category
img_label=pd.get_dummies(df.category)
img_label.head()

Unnamed: 0,0,1
0,0,1
1,1,0
2,0,1
3,1,0
4,1,0


Final X, Y matrix for deep learning prediction

In [39]:
X=np.array(img_pixel[:25000]) # for testing purpose, put shape to only use 1000 images
y=img_label.values # Same here. Array's needs to be a match
print(X.shape)
print(y.shape)

(25000, 128, 128, 3)
(25000, 2)


## Train, Validation split

In [40]:
from sklearn.model_selection import train_test_split

In [41]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(X_train.shape)
print(y_train.shape)
print(X_test.shape)
print(y_test.shape)

(20000, 128, 128, 3)
(20000, 2)
(5000, 128, 128, 3)
(5000, 2)


# Creating callbacks (things to help our model)
We've got a model ready to go but before we train it we'll make some callbacks

Callbacks are helper functions a model can use during training to do things such as save a models progres, check a models progress or stop training early if a model stops improving.

## Tensorboard Callback

TensorBoard helps provide a visual way to monitor the progress of your model during and after training.

It can be used directly in a notebook to track the performance measures of a model such as loss and accuracy.

To set up a TensorBoard callback and view TensorBoard in a notbook, we need to do three things:

1. Load the TensorBoard notebook extension.

2. Create a TensorBoard callback which is able to save logs to a directory and pass it to our model's fit() function.

3. Visualize the our models trainigs logs using %tensorboard magic function (we'll do this later on)

In [10]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


In [30]:
import datetime
import os

# Create a function to build a TensorBoard callback
def create_tensorboard_callback():
    # Create a log directly for storing TensorBoard logs
    logdir = os.path.join('logs', datetime.datetime.now().strftime('%d%m%Y-%H%M%S'))

    return tf.keras.callbacks.TensorBoard(logdir)

## Save checkpoints during training

You can use a trained model without having to retrain it, or pick-up training where you left off in case the training process was interrupted. The tf.keras.callbacks.ModelCheckpoint callback allows you to continually save the model both during and at the end of training.

Saves every 5 epoch a checkpoint

In [31]:
def create_checkpoint(model):
    checkpoint_path = "training_checkpoint/cp-{epoch:04d}.ckpt"
    checkpoint_dir = os.path.dirname(checkpoint_path)

    # Create a callback that saves the model's weights
    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path, save_weights_only=True, verbose=1, save_freq=5)

    # Save the weights using the `checkpoint_path` format
    # model.save_weights(checkpoint_path.format(epoch=0))

# Defining Grid search

Seperating the grid seard per search cat.

First, optimizer

Second, batch and epoch

## Optimizer search

In [None]:
# define the grid search parameters
optimizer   = ['RMSprop', 'Adadelta', 'Adamax']

def create_model(optimizer='adam'):
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(2, activation='softmax'))

    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model


model = KerasClassifier(build_fn=create_model, epochs=50, batch_size=16, verbose=1)

param_grid = dict(optimizer=optimizer)

create_checkpoint(model)

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X, y)

In [None]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

## Learn rate and momentum search

In [None]:
# define the grid search parameters
learn_rate  = [0.1, 0.01, 0.001]
momentum    = [0.0, 0.4, 0.8]

def create_model(learn_rate=0.01, momentum=0):
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(2, activation='softmax'))

    optimizer = Adamax(learning_rate=learn_rate)
    
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, verbose=1)

param_grid = dict(learn_rate=learn_rate, momentum=momentum)
grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X, y)

In [None]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

## Batch size and Epoch GridSearch

In [42]:
class LossHistory(tf.keras.callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_epoch_end(self, batch, logs={}):
        with open('somefile.txt', 'a') as f:
            stats = []
            stats.append(str(batch))
            stats.append('Optimizer,' + self.model.optimizer.__class__.__name__)
            stats.append('Batch_size,' + str(self.params['batch_size']))
            stats.append('accuracy,'+str(logs.get('accuracy')))
            stats.append('val_loss,'+str(logs.get('val_loss')))
            f.write(','.join(stats)+'\n')

In [43]:
# define the grid search parameters
batches     = [8, 16, 32]
epochs      = [30, 50, 100]

def create_model(learn_rate=0.001):
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())
    model.add(Dense(512, activation='relu'))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    model.add(Dense(2, activation='softmax'))

    optimizer = Adamax(learning_rate=learn_rate)

    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

model = KerasClassifier(build_fn=create_model, verbose=1)

param_grid = dict(batch_size=batches, epochs=epochs)

# tensorboard = create_tensorboard_callback()
# create_checkpoint(model)
history = LossHistory()

grid = GridSearchCV(estimator=model, param_grid=param_grid, n_jobs=1, cv=3)
grid_result = grid.fit(X_train, y_train, callbacks=[history])

Epoch 1/30

KeyError: 'batch_size'

In [None]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

In [None]:
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))

# Checking the TensorBoard logs

Now our model has been trained, we can make its performance visual by checking the TensorBoard logs.

The TensorBoard magic function (%tensorboard) will access the logs directory we created earlier and viualize its contents.


Thanks to our early_stopping callback, the model stopped training after 26 or so epochs (in my case, yours might be slightly different). This is because the validation accuracy failed to improve for 3 epochs.

But the good new is, we can definitely see our model is learning something. The validation accuracy got to 65% in only a few minutes.

This means, if we were to scale up the number of images, hopefully we'd see the accuracy increase.

To see the logs visit : http://localhost:6006 in your browser

In [6]:
%tensorboard --logdir logs

## Load weigts from checkpoint

In [32]:
# def create_model(learn_rate=0.001):
#     model = Sequential()
#     model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, IMAGE_CHANNELS)))
#     model.add(BatchNormalization())
#     model.add(MaxPooling2D(pool_size=(2, 2)))

#     model.add(Conv2D(64, (3, 3), activation='relu'))
#     model.add(BatchNormalization())
#     model.add(MaxPooling2D(pool_size=(2, 2)))

#     model.add(Conv2D(128, (3, 3), activation='relu'))
#     model.add(BatchNormalization())
#     model.add(MaxPooling2D(pool_size=(2, 2)))

#     model.add(Flatten())
#     model.add(Dense(512, activation='relu'))
#     model.add(BatchNormalization())
#     model.add(Dropout(0.5))
#     model.add(Dense(2, activation='softmax'))

#     optimizer = Adamax(learning_rate=learn_rate)

#     # Compile model
#     model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
#     return model

In [34]:
# model = KerasClassifier(build_fn=create_model, verbose=1)
# model = create_model()

# #create_checkpoint(model)
# checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
# checkpoint_dir = os.path.dirname(checkpoint_path)

# latest = tf.train.latest_checkpoint(checkpoint_dir)
# print(latest)

# # Loads the weights
# model.load_weights(latest)

training_2/cp-0002.ckpt


AttributeError: 'KerasClassifier' object has no attribute 'load_weights'