We explored the label and sublabel structure of our data and decided to break the problem into several smaller objectives, each solvable by a single model. The first of these is the primary classification: assigning each image to one of three classes (glasses/sunglasses, trousers/jeans, shoes).

First, we need to think about what our approach will be:
* 1) Build a complex network architecture and try to optimise it (find the best hyperparameters)
* 2) Try out many different architectures without putting much effort into optimising them
The truth is, as always, somewhere in the middle.

Limitations:
* A passable model can be made in 2 weeks!
* A good model takes 3 months!
* A great model takes 1 year!

The plan for this project:
* 1) Concentrate on the primary classification
* 2) Consider the best approach for the Trousers and Jeans category
* 3) Apply the techniques of L2 regularisation, dropout and data augmentation

In [15]:
import io
import itertools

import numpy as np # for the datasets
import sklearn.metrics

import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

import matplotlib.pyplot as plt # for plotting the confusion matrix

In [16]:
# Loading the datasets
# The .npz format stores multiple NumPy arrays in a single file (here it contains 2 arrays: "labels" and "images")
data_train = np.load(r"Dataset/Primary categories - Train.npz")
data_val = np.load(r"Dataset/Primary categories - Validation.npz")
data_test = np.load(r"Dataset/Primary categories - Test.npz")

In [17]:
# Extracting the arrays from the imported data
images_train = data_train['images']
labels_train = data_train['labels']

images_val = data_val['images']
labels_val = data_val['labels']

images_test = data_test['images']
labels_test = data_test['labels']

In [18]:
# Scaling the pixel values of all images
images_train = images_train/255.0
images_val = images_val/255.0
images_test = images_test/255.0

When working with arrays:
* Scaling the data is easy
* TensorFlow can automatically shuffle and batch the dataset. This is done in the '.fit()' method.
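
The two points above can be illustrated on a small stand-in array. This is only a sketch: `dummy_images` and its shape are made up for the demo (the real arrays come from the `.npz` files), and the step count is computed by hand to mirror what Keras prints per epoch rather than by calling `.fit()`.

```python
import math
import numpy as np

# Hypothetical stand-in for the real image array: 130 tiny uint8 "images"
rng = np.random.default_rng(0)
dummy_images = rng.integers(0, 256, size=(130, 12, 9, 3), dtype=np.uint8)

# Dividing by 255.0 converts uint8 pixels to floats in the range [0, 1]
scaled = dummy_images / 255.0
print(scaled.dtype, scaled.min() >= 0.0, scaled.max() <= 1.0)

# With a batch size of 64, one epoch takes ceil(n_samples / 64) steps
batch_size = 64
steps_per_epoch = math.ceil(len(dummy_images) / batch_size)
print(steps_per_epoch)  # 130 samples -> 3 batches (64 + 64 + 2)
```

By the same arithmetic, the `203/203` shown in the training output further down, with a batch size of 64, corresponds to a training set of between 12,929 and 12,992 images. Note that the division produces `float64` arrays, which take eight times the memory of the original `uint8` data; casting to `float32` is a common way to reduce this.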

We are going to use the same network structure as the one we had in the MNIST example. The three classes have clearly different shapes (horizontal, vertical and transparent), so a simple architecture should be enough to separate them.

We will test 3 filter sizes and 4 filter counts, which gives 3x4=12 combinations.
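
The count of 12 comes from pairing every filter size with every number of filters. A minimal sketch of that grid, using the same values that the hyperparameter cell below passes to `hp.Discrete` (the variable names here are illustrative and not part of the notebook):

```python
import itertools

filter_sizes = [3, 5, 7]            # values used for HP_FILTER_SIZE
filter_numbers = [32, 64, 96, 128]  # values used for HP_FILTER_NUM

# Cartesian product of the two lists, in the same order as two nested loops
grid = list(itertools.product(filter_sizes, filter_numbers))

print(len(grid))
print(grid[0], grid[1], grid[-1])
```

The first two pairs, `(3, 32)` and `(3, 64)`, match the hyperparameters printed for run-1 and run-2 in the training output.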

In [19]:
# Defining constants
EPOCHS = 15 # upper limit, to prevent the training from becoming too long
BATCH_SIZE = 64 # not a hyperparameter we tune here (in general, the batch size mainly affects the speed of the training rather than the accuracy, although this is not true for every network, dataset and problem)

In [20]:
# Defining the hyperparameters we would tune, and their values to be tested
HP_FILTER_SIZE = hp.HParam('filter_size', hp.Discrete([3,5,7]))
HP_FILTER_NUM = hp.HParam('filters_number', hp.Discrete([32,64,96,128]))

METRIC_ACCURACY = 'accuracy'

# Logging setup info
with tf.summary.create_file_writer(r'Logs/Model 1/hparam_tuning/').as_default():
    hp.hparams_config(
        hparams=[HP_FILTER_SIZE, HP_FILTER_NUM],
        metrics=[hp.Metric(METRIC_ACCURACY, display_name='Accuracy')],
    )

Logs structure:

Every log is written to the Logs folder. Inside it there is one folder for each model we will test:
* Model 1: every such model folder contains two types of logs (hparam_tuning = hyperparameter tuning log, fit = training process log)
* Model 2: same
* ...
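
To make the layout concrete, the sketch below builds the directory names for one run. The helper `log_dirs` is hypothetical and only mirrors the string concatenation used in the cells that follow, including the `cm` subfolder used for the confusion matrix images.

```python
from pathlib import PurePosixPath

def log_dirs(model_name, session_num, root="Logs"):
    """Return the log directories for one run (illustrative helper, not used by the notebook)."""
    run_name = "run-{}".format(session_num)
    base = PurePosixPath(root) / model_name
    fit_dir = base / "fit" / run_name
    return {
        "hparam_tuning": str(base / "hparam_tuning" / run_name),  # hyperparameters and final accuracy
        "fit": str(fit_dir),                                      # per-epoch training logs
        "cm": str(fit_dir / "cm"),                                # confusion matrix images
    }

dirs = log_dirs("Model 1", 1)
for kind, path in dirs.items():
    print(kind, "->", path)
```

Pointing TensorBoard at `Logs/Model 1/hparam_tuning` or at `Logs/Model 1/fit` then shows all runs of that model side by side.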

In [24]:
# Wrapping our model and training in a function
def train_test_model(hparams, session_num):
    
    # Outlining the model/architecture of our CNN
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(hparams[HP_FILTER_NUM], hparams[HP_FILTER_SIZE], activation='relu', input_shape=(120,90,3)),
        tf.keras.layers.MaxPooling2D(pool_size=(2,2)),
        tf.keras.layers.Conv2D(hparams[HP_FILTER_NUM], 3, activation='relu'),
        tf.keras.layers.MaxPooling2D(pool_size=(2,2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(3)
    ])
    
    # Defining the loss function
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    # Compiling the model
    model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])

    # Defining the logging directory
    log_dir = "Logs/Model 1/fit/" + "run-{}".format(session_num)
    
    
    def plot_confusion_matrix(cm, class_names):
        """
        Returns a matplotlib figure containing the plotted confusion matrix.

        Args:
          cm (array, shape = [n, n]): a confusion matrix of integer classes
          class_names (array, shape = [n]): String names of the integer classes
        """
        figure = plt.figure(figsize=(12, 12))
        plt.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues)
        plt.title("Confusion matrix")
        plt.colorbar()
        tick_marks = np.arange(len(class_names))
        plt.xticks(tick_marks, class_names, rotation=45)
        plt.yticks(tick_marks, class_names)

        # Normalize the confusion matrix.
        cm = np.around(cm.astype('float') / cm.sum(axis=1)[:, np.newaxis], decimals=2)

        # Use white text if squares are dark; otherwise black.
        threshold = cm.max() / 2.
        for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
            color = "white" if cm[i, j] > threshold else "black"
            plt.text(j, i, cm[i, j], horizontalalignment="center", color=color)

        plt.tight_layout()
        plt.ylabel('True label')
        plt.xlabel('Predicted label')
        return figure
    
    
    
    def plot_to_image(figure):
        """Converts the matplotlib plot specified by 'figure' to a PNG image and
        returns it. The supplied figure is closed and inaccessible after this call."""
        # Save the plot to a PNG in memory.
        buf = io.BytesIO()
        figure.savefig(buf, format='png')
        # Closing the figure prevents it from being displayed directly inside
        # the notebook.
        plt.close(figure)
        buf.seek(0)
        # Convert PNG buffer to TF image
        image = tf.image.decode_png(buf.getvalue(), channels=4)
        # Add the batch dimension
        image = tf.expand_dims(image, 0)
        return image
    
    
    # Defining a file writer for Confusion Matrix logging purposes
    file_writer_cm = tf.summary.create_file_writer(log_dir + '/cm')     
    
    
    def log_confusion_matrix(epoch, logs):
        # Use the model to predict the classes of the validation images.
        val_pred_raw = model.predict(images_val)
        val_pred = np.argmax(val_pred_raw, axis=1)

        # Calculate the confusion matrix.
        cm = sklearn.metrics.confusion_matrix(labels_val, val_pred)
        # Plot the confusion matrix and convert the figure to a PNG image tensor.
        figure = plot_confusion_matrix(cm, class_names=['Glasses/Sunglasses', 'Trousers/Jeans', 'Shoes'])
        cm_image = plot_to_image(figure)

        # Log the confusion matrix as an image summary.
        with file_writer_cm.as_default():
            tf.summary.image("Confusion Matrix", cm_image, step=epoch)
    
    
    
    # Define the Tensorboard and Confusion Matrix callbacks.
    tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1, profile_batch=0)
    cm_callback = tf.keras.callbacks.LambdaCallback(on_epoch_end=log_confusion_matrix)

    
    # Defining early stopping to prevent overfitting
    early_stopping = tf.keras.callbacks.EarlyStopping(
        monitor = 'val_loss',
        mode = 'auto',
        min_delta = 0,
        patience = 2,
        verbose = 0, 
        restore_best_weights = True
    )
    
    # Training the model
    model.fit(
        images_train,
        labels_train,
        epochs = EPOCHS,
        batch_size = BATCH_SIZE, # .fit() shuffles (shuffle=True is the default) and batches the NumPy arrays automatically, which is why we didn't do this earlier
        callbacks = [tensorboard_callback, cm_callback, early_stopping],
        validation_data = (images_val,labels_val), # tuple of the numpy arrays
        verbose = 2 # verbosity 2 = one line per epoch, so as not to clutter the screen
    )
    
    
    # Evaluating the model's performance on the validation set
    _, accuracy = model.evaluate(images_val,labels_val) # important to make this evaluation on the validation set and not on the test set, as we are yet to finalise the model
    
    # The model is a local variable, so it is lost when this function returns (it is not part of the logs).
    # Without saving it, we could not test it on a different dataset or continue its training later.
    # Hence, we also export the model for future reference:
    model.save(r"saved_models/Model 1/Run-{}".format(session_num)) # takes a lot of disk space, so be careful
    
    return accuracy
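
Two details of the function above are easy to miss: the final `Dense(3)` layer outputs raw logits (hence `from_logits=True`), and the confusion matrix is row-normalised before its values are written onto the plot. The sketch below walks through both steps in plain NumPy. The logits and labels are made up for illustration and are not real model output, and the counting is done by hand instead of with `sklearn.metrics.confusion_matrix`.

```python
import numpy as np

# Made-up logits for 6 validation images and 3 classes (not real model output)
logits = np.array([
    [ 4.0, -1.0,  0.5],
    [ 3.2,  0.1, -2.0],
    [-0.5,  2.5,  0.0],
    [ 0.3,  0.2,  1.9],
    [-1.0,  0.4,  3.1],
    [ 1.5,  1.0, -0.2],
])
true_labels = np.array([0, 0, 1, 2, 2, 1])

# Softmax is monotonic, so argmax over logits equals argmax over probabilities
pred_labels = np.argmax(logits, axis=1)

# Count (true, predicted) pairs: rows are true classes, columns are predictions
n_classes = 3
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (true_labels, pred_labels), 1)

# Row-normalise so that each row sums to 1 (per-class recall on the diagonal)
cm_norm = np.around(cm / cm.sum(axis=1, keepdims=True), decimals=2)

print(cm)
print(cm_norm)
```

Here the last image (true class 1) is predicted as class 0, so the middle row of the normalised matrix reads 0.5, 0.5, 0. If a class had no samples in the validation set, its row sum would be zero and the division would need guarding.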

In [25]:
# Creating a function to log the results
def run(log_dir, hparams, session_num):
    
    with tf.summary.create_file_writer(log_dir).as_default():
        hp.hparams(hparams)  # record the values used in this trial
        accuracy = train_test_model(hparams, session_num)
        tf.summary.scalar(METRIC_ACCURACY, accuracy, step=1)

In [26]:
session_num = 1

for filter_size in HP_FILTER_SIZE.domain.values:
    for filter_num in HP_FILTER_NUM.domain.values:

        hparams = {
            HP_FILTER_SIZE: filter_size,
            HP_FILTER_NUM: filter_num
        }

        run_name = "run-%d" % session_num
        print('--- Starting trial: %s' % run_name)
        print({h.name: hparams[h] for h in hparams})
        run('Logs/Model 1/hparam_tuning/' + run_name, hparams, session_num)

        session_num += 1

--- Starting trial: run-1
{'filter_size': 3, 'filters_number': 32}
Epoch 1/15
203/203 - 120s - loss: 0.0946 - accuracy: 0.9716 - val_loss: 0.0164 - val_accuracy: 0.9957
Epoch 2/15
203/203 - 116s - loss: 0.0152 - accuracy: 0.9978 - val_loss: 0.0140 - val_accuracy: 0.9969
Epoch 3/15
203/203 - 108s - loss: 0.0102 - accuracy: 0.9985 - val_loss: 0.0024 - val_accuracy: 0.9994
Epoch 4/15
203/203 - 118s - loss: 0.0087 - accuracy: 0.9988 - val_loss: 0.0023 - val_accuracy: 0.9994
Epoch 5/15
203/203 - 102s - loss: 0.0042 - accuracy: 0.9996 - val_loss: 0.0126 - val_accuracy: 0.9951
Epoch 6/15
203/203 - 125s - loss: 0.0058 - accuracy: 0.9991 - val_loss: 0.0147 - val_accuracy: 0.9975
INFO:tensorflow:Assets written to: saved_models/Model 1/Run-1/assets
--- Starting trial: run-2
{'filter_size': 3, 'filters_number': 64}
Epoch 1/15
203/203 - 244s - loss: 0.0905 - accuracy: 0.9759 - val_loss: 0.0059 - val_accuracy: 0.9988
Epoch 2/15
203/203 - 222s - loss: 0.0127 - accuracy: 0.9986 - val_loss: 0.0022 - va
[... output truncated ...]
Epoch 2/15
203/203 - 684s - loss: 0.0859 - accuracy: 0.9798 - val_loss: 0.0158 - val_accuracy: 0.9969
Epoch 3/15
203/203 - 554s - loss: 0.0211 - accuracy: 0.9974 - val_loss: 0.0050 - val_accuracy: 0.9988
Epoch 4/15
203/203 - 1627s - loss: 0.0147 - accuracy: 0.9982 - val_loss: 0.0060 - val_accuracy: 0.9988
Epoch 5/15
203/203 - 632s - loss: 0.0117 - accuracy: 0.9988 - val_loss: 0.0042 - val_accuracy: 0.9988
Epoch 6/15
203/203 - 656s - loss: 0.0083 - accuracy: 0.9989 - val_loss: 0.0064 - val_accuracy: 0.9994
Epoch 7/15
203/203 - 681s - loss: 0.0082 - accuracy: 0.9988 - val_loss: 0.0035 - val_accuracy: 0.9988
Epoch 8/15
203/203 - 659s - loss: 0.0050 - accuracy: 0.9995 - val_loss: 0.0040 - val_accuracy: 0.9988
Epoch 9/15
203/203 - 700s - loss: 0.0075 - accuracy: 0.9980 - val_loss: 0.0044 - val_accuracy: 0.9988
INFO:tensorflow:Assets written to: saved_models/Model 1/Run-12/assets
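
The output shows that the runs stop well before the 15-epoch limit, which is the effect of the early stopping callback (`patience = 2`, `min_delta = 0`). The sketch below replays that rule on the `val_loss` values printed for run-1. It is a simplified re-implementation for illustration, not the Keras source, and `simulate_early_stopping` is a hypothetical helper.

```python
def simulate_early_stopping(val_losses, patience=2, min_delta=0.0):
    """Return (stopped_epoch, best_epoch), both 1-indexed.

    Simplified re-implementation for illustration, not the Keras source.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            # Improvement: remember this epoch and reset the patience counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            # No improvement: stop once `patience` such epochs have accumulated
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Training ran through every epoch without triggering the stop
    return len(val_losses), best_epoch

# val_loss values copied from the run-1 output above
run_1_val_losses = [0.0164, 0.0140, 0.0024, 0.0023, 0.0126, 0.0147]
result = simulate_early_stopping(run_1_val_losses)
print(result)  # (6, 4)
```

This agrees with the log: run-1 ends after epoch 6, and since `restore_best_weights = True`, the saved model should carry the weights from epoch 4, the epoch with the lowest validation loss.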


In [27]:
# Loading a model to evaluate on the test set
model = tf.keras.models.load_model(r"saved_models/Model 1/Run-1")

In [28]:
test_loss, test_accuracy = model.evaluate(images_test,labels_test)



In [29]:
# Printing the test results
print('Test loss: {0:.4f}. Test accuracy: {1:.2f}%'.format(test_loss, test_accuracy*100.))

Test loss: 0.0191. Test accuracy: 99.88%


The task is so easy that, whatever hyperparameters we choose, we get roughly 99.9% accuracy (see TensorBoard in the next cell). Indeed, the scatter plot matrix view shows no clear correlation between the accuracy and either the filter size or the number of filters. That is why we won't investigate the primary classification any further (no regularisation techniques and no alternative architectures).

In [30]:
%load_ext tensorboard
%tensorboard --logdir "Logs/Model 1/hparam_tuning"


In [31]:
%load_ext tensorboard
%tensorboard --logdir "Logs/Model 1/fit"



Recap:
* The primary classification task was to create a CNN that distinguishes between images of shoes, trousers and glasses.
* To achieve this, we set up a relatively simple model consisting of 2 convolutional layers, 2 max-pooling layers and the compulsory dense output layer.
* We decided to use this configuration because it keeps the training time low and allows us to test different hyperparameters.
* In terms of code, we imported and preprocessed the datasets, created the functions for hyperparameter tuning and confusion matrix plotting, and logged the training process using TensorBoard.
* We tried out the model with filter sizes of 3, 5 and 7, and with the number of filters set to 32, 64, 96 and 128.
* The results were impressive, with roughly 99.9% accuracy across all hyperparameter combinations.
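
As a rough aid for comparing the twelve configurations, the sketch below counts the trainable parameters of the architecture described above. It assumes the Keras defaults that the model code relies on ('valid' padding, stride 1, a bias per filter, pooling stride equal to the pool size); `count_parameters` is a hypothetical helper, not part of the notebook.

```python
def count_parameters(filter_size, filter_num, height=120, width=90, channels=3, n_classes=3):
    """Trainable parameters of the two-conv architecture, assuming Keras defaults."""
    # Conv 1: one (k x k x channels) kernel plus one bias per filter
    conv1 = (filter_size * filter_size * channels + 1) * filter_num
    h, w = height - filter_size + 1, width - filter_size + 1  # 'valid' padding
    h, w = h // 2, w // 2                                     # 2x2 max pooling

    # Conv 2: fixed 3x3 kernel over filter_num input channels
    conv2 = (3 * 3 * filter_num + 1) * filter_num
    h, w = h - 2, w - 2                                       # 'valid' padding
    h, w = h // 2, w // 2                                     # 2x2 max pooling

    # Dense output layer on the flattened feature map
    dense = (h * w * filter_num + 1) * n_classes
    return conv1 + conv2 + dense

for k in (3, 5, 7):
    for f in (32, 64, 96, 128):
        print(k, f, count_parameters(k, f))
```

For the run-1 configuration (filter size 3, 32 filters) this gives 66,595 parameters, most of them in the dense layer. Parameter count is only a proxy for cost: the training time per epoch also depends on the size of the feature maps each filter is applied to.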