<h1 style="color:blue;">CIFAR-10 example using Keras</h1>

<b>This notebook shows how to create, train and evaluate a small convolutional neural network (CNN) on the CIFAR-10 dataset.
<br>
The first step is the usual housekeeping: importing the required Python libraries and setting up directories. No machine-learning-specific code is needed here, just plain Python (I use Python 3).</b>

In [None]:
import os
import sys
import shutil
import numpy as np
from keras import datasets, utils, layers, models, optimizers, callbacks

K_CHKPT_FILE = 'float-model-{epoch:02d}-{val_acc:.2f}.hdf5'
K_MODEL_DIR = './k_model'
K_CHKPT_DIR = './k_chkpts'
TB_LOG_DIR = './tb_logs'
K_CHKPT_PATH = os.path.join(K_CHKPT_DIR, K_CHKPT_FILE)


# create fresh directories for the saved model, the TensorBoard data
# and the checkpoints: delete each one if it already exists, then recreate it
for directory in (K_MODEL_DIR, TB_LOG_DIR, K_CHKPT_DIR):
    if os.path.exists(directory):
        shutil.rmtree(directory)
    os.makedirs(directory)
    print("Directory", directory, "created")
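
<b>To see how the checkpoint filename template expands, here is a minimal sketch using plain Python string formatting. The epoch number and validation accuracy below are made-up values for illustration only; during training, Keras fills in these placeholders itself.</b>

```python
import os

# same template and directory as defined in the housekeeping cell
K_CHKPT_FILE = 'float-model-{epoch:02d}-{val_acc:.2f}.hdf5'
K_CHKPT_DIR = './k_chkpts'
K_CHKPT_PATH = os.path.join(K_CHKPT_DIR, K_CHKPT_FILE)

# hypothetical values, chosen only to illustrate the formatting:
# {epoch:02d} pads the epoch to two digits, {val_acc:.2f} keeps two decimals
example_path = K_CHKPT_PATH.format(epoch=7, val_acc=0.8149)
print(example_path)
```

<b>Because the epoch and accuracy are part of the name, each saved checkpoint gets its own file instead of overwriting the previous one.</b>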

<h2 style="color:blue;">Data wrangling</h2>

<b>Next, we download the dataset. Keras conveniently provides the CIFAR-10 dataset and functions for loading it.

The dataset is already split into training and test data - 50k images & labels for training, 10k images & labels for test.
</b>

In [None]:
(X_train, Y_train), (X_test, Y_test) = datasets.cifar10.load_data()

<b>The 'images' are actually NumPy arrays with datatype uint8; each image has a shape of (32, 32, 3), i.e. 32 x 32 pixels with 3 color channels. This means that each element of the arrays (i.e. each pixel channel) can take a value from 0 to 255. Let's scale them to the range 0 to 1.0. Note that dividing by 255.0 converts the array elements from integer to float.</b>

In [None]:
X_train = X_train / 255.0
X_test = X_test / 255.0
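
<b>The effect of the scaling can be checked on a small stand-in array. This is only a sketch: the random array below is an assumption that takes the place of the real images so that it runs without downloading the dataset.</b>

```python
import numpy as np

# stand-in for a tiny batch of CIFAR-10 images: 2 images, 32x32 pixels, 3 channels
# (random values, used only so the sketch is self-contained)
rng = np.random.RandomState(0)
fake_images = rng.randint(0, 256, size=(2, 32, 32, 3)).astype(np.uint8)

# dividing a uint8 array by a Python float yields a float array
scaled = fake_images / 255.0

print(fake_images.dtype, fake_images.min(), fake_images.max())
print(scaled.dtype, scaled.min(), scaled.max())
```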

<b>For convenience, we'll create a list of labels for the 10 categories of image in the CIFAR-10 dataset. We'll use it later when making predictions.</b>

In [None]:
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

<b>Next we will 'steal' the last 5k images and labels from the training set to create a dataset that the network never sees during training or evaluation. We will use this new dataset to test the network after training has completed.</b>

In [None]:
# create unseen dataset for predictions - 5k images
X_predict = X_train[45000:]
Y_predict = Y_train[45000:]

# reduce training dataset to 45k images
X_train = X_train[:45000]
Y_train = Y_train[:45000]

# one-hot encode the labels
Y_train = utils.to_categorical(Y_train)
Y_test = utils.to_categorical(Y_test)
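
<b>To illustrate what one-hot encoding does to the labels, here is a NumPy-only sketch. It is not the Keras implementation, just an equivalent idea applied to three made-up labels.</b>

```python
import numpy as np

# CIFAR-10 labels arrive as a column of integer class indices, shape (N, 1)
labels = np.array([[3], [0], [9]])   # made-up example labels
num_classes = 10

# row k of the identity matrix is the one-hot vector for class index k
one_hot = np.eye(num_classes)[labels.flatten()]

print(one_hot.shape)
print(one_hot[0])
```

<b>Note that Y_predict is left as integer labels, which is why the prediction code later compares against Y_predict[i][0] directly.</b>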


<b>Now for the training parameters. We'll use a batch size of 128, a learning rate of 0.0001 and a decay rate of 1e-6 for the Adam (Adaptive Moment Estimation) optimizer.

For a full training run the maximum number of epochs would be 250 (the commented-out value below), but training is unlikely to reach this limit because of the early-stopping callback which we will see later. As written, the cell sets EPOCHS to 1 for a quick trial run.

You are encouraged to modify these parameters to see what effect they have on the final accuracy.</b>

In [None]:
BATCHSIZE = 128
LEARN_RATE = 0.0001
DECAY_RATE = 1e-6

EPOCHS = 1
#EPOCHS = 250
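
<b>As a rough guide to what these numbers mean in practice, the sketch below works out how many weight updates happen per epoch and how the learning rate would shrink over time. The decay formula lr / (1 + decay * iterations) is an assumption based on the time-based decay of the classic Keras optimizers; check the behaviour of your installed version.</b>

```python
import math

BATCHSIZE = 128
LEARN_RATE = 0.0001
DECAY_RATE = 1e-6
TRAIN_IMAGES = 45000   # training set size after holding back 5k images

# number of weight updates (batches) in one epoch; the last batch is partial
steps_per_epoch = math.ceil(TRAIN_IMAGES / BATCHSIZE)

def decayed_lr(iterations):
    # assumed schedule: time-based decay applied once per weight update
    return LEARN_RATE / (1.0 + DECAY_RATE * iterations)

print(steps_per_epoch)
print(decayed_lr(0), decayed_lr(steps_per_epoch * 250))
```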

<h2 style="color:blue;">Define the CNN as a functional model</h2>

<b>
This next section creates our CNN. It is a Keras functional model, built up layer by layer.

Note that we only need to define the shape of the input to the first layer; the shapes of the other layers are calculated automatically.
</b>

In [None]:
# miniVGGNet as Keras functional model

inputs = layers.Input(shape=(32, 32, 3))
net = layers.Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu')(inputs)
net = layers.BatchNormalization(axis=-1)(net)
net = layers.Conv2D(32, kernel_size=(3, 3), padding='same', activation='relu')(net)
net = layers.BatchNormalization(axis=-1)(net)
net = layers.MaxPooling2D(pool_size=(2,2))(net)
net = layers.Dropout(0.25)(net)
net = layers.Conv2D(64, kernel_size=(3, 3), padding='same', activation='relu')(net)
net = layers.BatchNormalization(axis=-1)(net)
net = layers.Conv2D(64, kernel_size=(3, 3), padding='same', activation='relu')(net)
net = layers.BatchNormalization(axis=-1)(net)
net = layers.MaxPooling2D(pool_size=(2,2))(net)
net = layers.Dropout(0.25)(net)
net = layers.Flatten()(net)
net = layers.Dense(512, activation='relu')(net)
net = layers.BatchNormalization()(net)
net = layers.Dropout(0.5)(net)
prediction = layers.Dense(10, activation='softmax')(net)

model = models.Model(inputs=inputs, outputs=prediction)
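
<b>The automatic shape calculation can be reproduced by hand. The sketch below traces the tensor shape through the convolution and pooling layers of the model; the helper functions are hypothetical and written only for this illustration. Batch normalization and dropout are omitted because they leave the shape unchanged.</b>

```python
def conv_same(shape, filters):
    # 'same' padding with stride 1 keeps height and width unchanged
    height, width, _ = shape
    return (height, width, filters)

def max_pool_2x2(shape):
    # a 2x2 max-pool halves height and width
    height, width, channels = shape
    return (height // 2, width // 2, channels)

shape = (32, 32, 3)             # input image
shape = conv_same(shape, 32)    # first Conv2D
shape = conv_same(shape, 32)    # second Conv2D
shape = max_pool_2x2(shape)     # first MaxPooling2D
shape = conv_same(shape, 64)    # third Conv2D
shape = conv_same(shape, 64)    # fourth Conv2D
shape = max_pool_2x2(shape)     # second MaxPooling2D

# Flatten turns the final feature map into one long vector
flattened = shape[0] * shape[1] * shape[2]

# weights plus biases of the Dense(512) layer that follows Flatten
dense_params = flattened * 512 + 512

print(shape, flattened, dense_params)
```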

<b>Keras makes it easy to print out a summary of our network model...</b>

In [None]:
# summary() prints the table itself and returns None, so it is not wrapped in print()
model.summary()
print("Model Inputs: {ips}".format(ips=(model.inputs)))
print("Model Outputs: {ops}".format(ops=(model.outputs)))

<h2 style="color:blue;">Callbacks</h2>

<b>..and now for the callbacks. These will be used during training.
The first callback sets up the TensorBoard logging.
The second one stops training early if the validation loss (the quantity EarlyStopping monitors by default) fails to improve by at least min_delta (0.001 in this case) for 3 consecutive epochs.
The third callback defines where checkpoints are saved; with save_best_only=True, a checkpoint is written only when the monitored quantity improves.</b>

In [None]:
# create TensorBoard callback
tb_call = callbacks.TensorBoard(log_dir=TB_LOG_DIR,
                                histogram_freq=10,
                                batch_size=BATCHSIZE,
                                write_graph=True,
                                write_grads=False,
                                write_images=False)


# Early stop callback
earlystop_call = callbacks.EarlyStopping(min_delta=0.001, patience=3)

# checkpoint save callback
chk_call = callbacks.ModelCheckpoint(K_CHKPT_PATH, save_best_only=True)
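
<b>The early-stopping rule can be sketched in a few lines of plain Python. This is a simplified illustration of the min_delta and patience logic, not the Keras source, and the validation losses are made up.</b>

```python
def epochs_until_stop(val_losses, min_delta=0.001, patience=3):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # improvement of more than min_delta: remember it, reset the counter
            best = loss
            wait = 0
        else:
            # no sufficient improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return None

# made-up validation losses: improvement stalls after the third epoch
losses = [1.50, 1.20, 1.10, 1.0995, 1.0999, 1.1010, 1.05]
print(epochs_until_stop(losses))
```

<b>With these values the improvements in epochs 4 to 6 are all smaller than min_delta, so training stops at epoch 6 and the better loss in epoch 7 is never reached. Raising patience trades longer training for a lower risk of stopping too soon.</b>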

<h2 style="color:blue;">Training, evaluation, prediction</h2>

<b>The .compile method defines the learning process by setting the loss function to minimize (categorical cross-entropy), the type of optimizer (Adam in this case) and its parameters such as learning rate and decay rate, and the metric to report during training and evaluation (accuracy).</b>

In [None]:
model.compile(loss='categorical_crossentropy', 
              optimizer=optimizers.Adam(lr=LEARN_RATE, decay=DECAY_RATE),
              metrics=['accuracy']
              )
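
<b>For intuition about the loss, here is a minimal sketch of categorical cross-entropy for a single sample, written in plain Python. It uses 3 classes instead of 10 to keep it short, and the probabilities are made up.</b>

```python
import math

def categorical_crossentropy(one_hot_label, probabilities):
    # only the true class contributes, because all other label entries are 0
    return -sum(y * math.log(p)
                for y, p in zip(one_hot_label, probabilities) if y > 0)

label = [0, 0, 1]                 # true class is index 2
confident = [0.05, 0.05, 0.90]    # made-up softmax output, mostly right
unsure = [0.40, 0.30, 0.30]       # made-up softmax output, mostly wrong

print(categorical_crossentropy(label, confident))
print(categorical_crossentropy(label, unsure))
```

<b>The loss is small when the network assigns a high probability to the true class and grows as that probability drops.</b>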

<b>The .fit method trains the model for a certain number of epochs.
The validation data will be used to evaluate the model metrics at the end of each epoch.
The training and test labels were already converted from scalar values to one-hot encoded vectors with the Keras to_categorical() function, as the categorical cross-entropy loss requires.
Note that the callbacks we set up earlier are used here.</b>

In [None]:
model.fit(X_train,
          Y_train,
          batch_size=BATCHSIZE,
          shuffle=True,
          epochs=EPOCHS,
          validation_data=(X_test, Y_test),
          callbacks=[earlystop_call,tb_call,chk_call])

<b>The .evaluate method will use the supplied dataset to evaluate the trained model.</b>

In [None]:
scores = model.evaluate(X_test, 
                        Y_test,
                        batch_size=BATCHSIZE
                        )

print('Loss: %.3f' % scores[0])
print('Accuracy: %.3f' % scores[1])

<b>..and then the .predict method will use the trained model to make some predictions. This is best done with previously unseen data, so here we use the 5k images that were held back from the training set earlier.</b>

In [None]:
print('Make some predictions with the trained model..')
predictions = model.predict(X_predict, batch_size=BATCHSIZE)

# each prediction is an array of 10 values
# the max of the 10 values is the model's 
# highest "confidence" classification
# use numpy argmax function to get highest of the set of 10

correct_predictions = 0
wrong_predictions = 0

for i in range(len(predictions)):
    pred=np.argmax(predictions[i])
    actual=(Y_predict[i][0])

    if (pred == actual):
        correct_predictions += 1
    else:
        wrong_predictions += 1

print('Unseen dataset size:', len(predictions), ' Correct Predictions:', correct_predictions, ' Wrong Predictions:', wrong_predictions)
print('-------------------------------------------------------------')
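
<b>The same count can be computed without an explicit loop. The sketch below uses made-up softmax outputs for 4 images and 3 classes; with the real data, predictions and Y_predict would take the place of the two arrays.</b>

```python
import numpy as np

# made-up softmax outputs for 4 images and 3 classes (the real model has 10)
predictions = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.3, 0.4],
                        [0.6, 0.3, 0.1]])

# integer labels in the same (N, 1) layout as Y_predict
labels = np.array([[0], [1], [1], [0]])

# index of the highest probability in each row is the predicted class
predicted_classes = np.argmax(predictions, axis=1)

correct = int(np.sum(predicted_classes == labels.flatten()))
wrong = len(predictions) - correct

print(correct, wrong)
```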

<b>Finally, we save the trained model.</b>

In [None]:
print("Saving the Keras model in keras format..")

# save just the weights (no architecture) to an HDF5 format file
model.save_weights(os.path.join(K_MODEL_DIR,'k_model_weights.h5'))

# save just the architecture (no weights) to a JSON file
with open(os.path.join(K_MODEL_DIR,'k_model_architecture.json'), 'w') as f:
    f.write(model.to_json())

# save weights, model architecture & optimizer to an HDF5 format file
model.save(os.path.join(K_MODEL_DIR,'k_complete_model.h5'))


print('FINISHED!')
