# Comparing the performance with and without the bright-dark filter 

In this notebook, we compare the performance of the basic model (trained for just one epoch) on images which have only been resized against images which have been resized and had the bright-dark filter applied.

### Model trained on resized images

Below we load the necessary packages.

In [1]:
import numpy as np
import pandas as pd
import os
import cv2
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from PIL import Image
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from sklearn.metrics import accuracy_score
np.random.seed(42)

from matplotlib import style
style.use('fivethirtyeight')

If you would like to replicate this notebook, change the paths below to the locations where the unedited train and test npy files are stored on your machine. They can be downloaded from [here]().

In [15]:
training_data = np.load('32_original_train_data.npy')
training_label = np.load('32_original_train_labels.npy')


In [16]:
training_data.shape

(39209, 32, 32, 3)

In [6]:
NUM_CATEGORIES = 43

Next, the train-validation split is performed. Note that the pixel values are scaled to lie between 0 and 1.

In [17]:
X_train, X_val, y_train, y_val = train_test_split(training_data, training_label, test_size=0.3, random_state=42, shuffle=True)

X_train = X_train/255 
X_val = X_val/255

print("X_train.shape", X_train.shape)
print("X_valid.shape", X_val.shape)
print("y_train.shape", y_train.shape)
print("y_valid.shape", y_val.shape)

X_train.shape (27446, 32, 32, 3)
X_valid.shape (11763, 32, 32, 3)
y_train.shape (27446,)
y_valid.shape (11763,)
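As a rough illustration of the two steps above, the sketch below reproduces the split sizes and the rescaling on synthetic data. It uses a plain NumPy permutation rather than `train_test_split`, so the shuffled order will differ from scikit-learn's. The array `fake_images` and the helper `split_and_scale` are hypothetical names introduced only for this example.

```python
import math
import numpy as np

def split_and_scale(images, labels, val_fraction=0.3, seed=42):
    """Shuffle, hold out a validation fraction, and scale pixels to [0, 1]."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    n_val = math.ceil(val_fraction * len(images))  # validation size is rounded up
    val_idx, train_idx = order[:n_val], order[n_val:]
    X_train = images[train_idx] / 255.0
    X_val = images[val_idx] / 255.0
    return X_train, X_val, labels[train_idx], labels[val_idx]

# Small synthetic stand-in for the real data (uint8 pixels, 43 classes).
fake_images = np.random.default_rng(0).integers(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)
fake_labels = np.random.default_rng(1).integers(0, 43, size=100)

X_tr, X_va, y_tr, y_va = split_and_scale(fake_images, fake_labels)
print(X_tr.shape, X_va.shape, X_tr.min(), X_tr.max())

# The same rounding applied to the 39,209 training images used in this notebook.
n_val_full = math.ceil(0.3 * 39209)
print(39209 - n_val_full, n_val_full)
```

The last line should agree with the shapes printed by the cell above (27446 training and 11763 validation images).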


This next section of code one-hot encodes the labels.

In [18]:
y_train = keras.utils.to_categorical(y_train, NUM_CATEGORIES)
y_val = keras.utils.to_categorical(y_val, NUM_CATEGORIES)

print(y_train.shape)
print(y_val.shape)

(27446, 43)
(11763, 43)
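For readers unfamiliar with one-hot encoding, here is a minimal NumPy sketch of the idea: each integer label becomes a row of zeros with a single 1 in the column of its class. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = one_hot([16, 1, 38], num_classes=43)
print(example.shape)           # (3, 43)
print(example.argmax(axis=1))  # [16  1 38]
```

Taking the `argmax` of each row recovers the original integer label, which is the same trick used later to turn the model's softmax output into a predicted class.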


Here the structure of the model is specified, e.g., the number of layers and the number of filters or neurons per layer.

In [19]:
model = keras.models.Sequential([    
    keras.layers.Conv2D(filters=16, kernel_size=(3,3), activation='relu', input_shape=(32,32,3)),
    keras.layers.Conv2D(filters=32, kernel_size=(3,3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2)),
    keras.layers.BatchNormalization(axis=-1),
    
    keras.layers.Conv2D(filters=64, kernel_size=(3,3), activation='relu'),
    keras.layers.Conv2D(filters=128, kernel_size=(3,3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2)),
    keras.layers.BatchNormalization(axis=-1),
    
    keras.layers.Flatten(),
    keras.layers.Dense(512, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(rate=0.5),
    
    keras.layers.Dense(43, activation='softmax')
])
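To see how the 32x32 input shrinks as it passes through the network, the spatial sizes can be traced by hand. The sketch below assumes the Keras defaults for the layers above (stride 1 and `'valid'` padding for `Conv2D`, non-overlapping windows for `MaxPool2D`); the helpers `conv_out` and `pool_out` are hypothetical names used only here.

```python
def conv_out(size, kernel=3):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (remainder dropped)."""
    return size // pool

size = 32
trace = [size]
for layer in ["conv", "conv", "pool", "conv", "conv", "pool"]:
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

flattened = size * size * 128  # 128 filters in the last convolution
print(trace)      # [32, 30, 28, 14, 12, 10, 5]
print(flattened)  # 3200
```

Under these assumptions, the `Flatten` layer hands a vector of 3200 values to the 512-unit dense layer. Batch normalisation and dropout do not change the shape.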

The optimiser is defined here, as well as the number of epochs (which is specified as 1 for time-saving purposes).

In [20]:
lr = 0.001
epochs = 1

opt = tf.keras.optimizers.legacy.Adam(learning_rate=lr, decay=lr / (epochs * 0.5))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
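The `decay` argument shrinks the learning rate as training proceeds. Assuming the legacy Keras optimisers apply inverse-time decay, i.e. `lr / (1 + decay * step)` with `step` counting batches (this is our understanding of the legacy behaviour rather than something shown in this notebook), a back-of-the-envelope check of the values above looks like this. The helper `decayed_lr` is a hypothetical name.

```python
import math

lr = 0.001
epochs = 1
decay = lr / (epochs * 0.5)              # 0.002, as in the cell above
steps_per_epoch = math.ceil(27446 / 32)  # training images / batch size

def decayed_lr(step):
    """Inverse-time decay: ASSUMED form of the legacy `decay` argument."""
    return lr / (1.0 + decay * step)

print(steps_per_epoch)
print(decayed_lr(0), decayed_lr(steps_per_epoch))
```

With a single epoch, the learning rate under this assumption falls to a little over a third of its initial value by the final batch.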



Below, data augmentation is configured and the model is then fitted on the augmented batches.

In [21]:
aug = ImageDataGenerator(
    rotation_range=10,
    zoom_range=0.15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.15,
    horizontal_flip=False,
    vertical_flip=False,
    fill_mode="nearest")

history = model.fit(aug.flow(X_train, y_train, batch_size=32), epochs=epochs, validation_data=(X_val, y_val))
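To give a feel for what `width_shift_range=0.1` with `fill_mode="nearest"` does, here is a minimal NumPy sketch that shifts an image by a whole number of pixels and fills the exposed border by repeating the nearest edge pixel. It is an illustration only, not how `ImageDataGenerator` works internally, and the helper `shift_nearest` is a hypothetical name. On a 32-pixel-wide image, a range of 0.1 corresponds to shifts of up to roughly 3 pixels.

```python
import numpy as np

def shift_nearest(image, dy, dx):
    """Shift an (H, W, C) image by (dy, dx) pixels, repeating edge pixels to fill gaps."""
    h, w = image.shape[:2]
    pad = max(abs(dy), abs(dx))
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    # Taking a window offset by (-dy, -dx) moves the content by (+dy, +dx).
    return padded[pad - dy: pad - dy + h, pad - dx: pad - dx + w]

rng = np.random.default_rng(42)
image = rng.random((32, 32, 3))

# Draw a random horizontal shift of at most 10% of the width.
max_shift = int(0.1 * image.shape[1])  # 3 pixels for a 32-pixel-wide image
dx = int(rng.integers(-max_shift, max_shift + 1))
shifted = shift_nearest(image, dy=0, dx=dx)
print(dx, shifted.shape)
```

Flips are switched off in the generator above, presumably because mirroring a traffic sign can change its meaning.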



Here the test data is prepared and the model is used to make predictions.

In [22]:
test_data = np.load('32_original_test_data.npy')
test_labels = np.load('32_original_test_labels.npy')

In [23]:
test_labels

array([16,  1, 38, ...,  6,  7, 10], dtype=int64)

The predictions are compared to the ground truth.

In [24]:

X_test = test_data
X_test = X_test/255

pred = np.argmax(model.predict(X_test), axis=-1)

# Accuracy on the test data
print('Test Data accuracy: ',accuracy_score(test_labels, pred)*100)

Test Data accuracy:  93.04829770387965
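The accuracy figure above comes from two steps: taking the most probable class for each image, then counting how often it matches the ground truth. A minimal NumPy sketch on toy probabilities (the function `accuracy_percent` and the toy arrays are hypothetical, for illustration only):

```python
import numpy as np

def accuracy_percent(probabilities, true_labels):
    """Pick the most probable class per row and return the percentage of correct predictions."""
    predicted = np.argmax(probabilities, axis=-1)
    return float(np.mean(predicted == np.asarray(true_labels)) * 100)

probabilities = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
    [0.6, 0.3, 0.1],
])
true_labels = [0, 1, 2, 1]
print(accuracy_percent(probabilities, true_labels))  # 75.0
```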


In [25]:
from sklearn.metrics import confusion_matrix
cf = confusion_matrix(test_labels, pred)
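The confusion matrix `cf` is computed above but not displayed. One way to read such a matrix, sketched here with NumPy on toy labels, is to look for the largest off-diagonal entries, i.e. the class pairs that are confused most often. The helpers `confusion` and `top_confusions` are hypothetical names; rows are true classes and columns are predicted classes, following the scikit-learn convention.

```python
import numpy as np

def confusion(true_labels, pred_labels, num_classes):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (true_labels, pred_labels), 1)
    return cm

def top_confusions(cm, k=3):
    """Return the k largest off-diagonal entries as (true, predicted, count)."""
    off_diag = cm.copy()
    np.fill_diagonal(off_diag, 0)
    flat = np.argsort(off_diag, axis=None)[::-1][:k]
    rows, cols = np.unravel_index(flat, off_diag.shape)
    return [(int(r), int(c), int(off_diag[r, c])) for r, c in zip(rows, cols)]

y_true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 2, 0, 0])
cm = confusion(y_true, y_pred, num_classes=3)
print(cm)
print(top_confusions(cm, k=1))  # [(2, 0, 2)]: class 2 mistaken for class 0 twice
```

Applied to `cf`, the same idea would point to the sign classes behind the lower recall values in the classification report.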

In [26]:
from sklearn.metrics import classification_report

print(classification_report(test_labels, pred))

              precision    recall  f1-score   support

           0       0.90      0.73      0.81        60
           1       0.94      0.99      0.96       720
           2       0.99      0.97      0.98       750
           3       0.80      0.92      0.86       450
           4       0.95      0.98      0.96       660
           5       0.72      0.88      0.79       630
           6       0.99      0.95      0.97       150
           7       0.99      0.68      0.80       450
           8       0.97      0.83      0.89       450
           9       0.96      1.00      0.98       480
          10       1.00      0.97      0.99       660
          11       0.93      1.00      0.96       420
          12       0.93      0.98      0.96       690
          13       0.97      1.00      0.98       720
          14       1.00      0.99      1.00       270
          15       0.98      1.00      0.99       210
          16       1.00      0.93      0.97       150
          17       1.00    

### Model trained on resized and bright-dark filtered images

Below, the training data with the bright-dark filter already applied is loaded. The filter increases the brightness of dark images and darkens those which are too bright (perhaps due to camera flash).

In [2]:
bd_training_data = np.load('32_filter_training_data.npy')
bd_training_label = np.load('32_filter_training_labels.npy')
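The filter itself is not defined in this notebook, since the data is loaded pre-filtered. Purely as an illustration of the idea, the sketch below brightens images whose mean pixel value is low and darkens those whose mean is high, using gamma correction. The function `bright_dark_sketch`, the thresholds and the gamma values are all hypothetical choices and should not be read as the filter actually used to produce the npy files.

```python
import numpy as np

def bright_dark_sketch(image, low=0.35, high=0.65, gamma_brighten=0.6, gamma_darken=1.6):
    """Hypothetical brightness filter for an image with pixel values in [0, 1].

    Dark images (mean below `low`) are brightened with gamma < 1,
    bright images (mean above `high`) are darkened with gamma > 1,
    and everything in between is returned unchanged.
    """
    image = np.clip(image.astype(np.float64), 0.0, 1.0)
    mean = image.mean()
    if mean < low:
        return image ** gamma_brighten
    if mean > high:
        return image ** gamma_darken
    return image

dark = np.full((32, 32, 3), 0.1)
bright = np.full((32, 32, 3), 0.9)
normal = np.full((32, 32, 3), 0.5)

for name, img in [("dark", dark), ("bright", bright), ("normal", normal)]:
    print(name, round(float(img.mean()), 3), "->", round(float(bright_dark_sketch(img).mean()), 3))
```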

Note that the filtered data has already been normalised to lie between 0 and 1, so no further rescaling is applied below.

In [4]:
X_train, X_val, y_train, y_val = train_test_split(bd_training_data, bd_training_label, test_size=0.3, random_state=42, shuffle=True)

# No rescaling here: the filtered data is already normalised to [0, 1]

print("X_train.shape", X_train.shape)
print("X_valid.shape", X_val.shape)
print("y_train.shape", y_train.shape)
print("y_valid.shape", y_val.shape)

X_train.shape (27446, 32, 32, 3)
X_valid.shape (11763, 32, 32, 3)
y_train.shape (27446,)
y_valid.shape (11763,)


In [7]:
y_train = keras.utils.to_categorical(y_train, NUM_CATEGORIES)
y_val = keras.utils.to_categorical(y_val, NUM_CATEGORIES)

print(y_train.shape)
print(y_val.shape)

(27446, 43)
(11763, 43)


In [8]:
model = keras.models.Sequential([    
    keras.layers.Conv2D(filters=16, kernel_size=(3,3), activation='relu', input_shape=(32,32,3)),
    keras.layers.Conv2D(filters=32, kernel_size=(3,3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2)),
    keras.layers.BatchNormalization(axis=-1),
    
    keras.layers.Conv2D(filters=64, kernel_size=(3,3), activation='relu'),
    keras.layers.Conv2D(filters=128, kernel_size=(3,3), activation='relu'),
    keras.layers.MaxPool2D(pool_size=(2, 2)),
    keras.layers.BatchNormalization(axis=-1),
    
    keras.layers.Flatten(),
    keras.layers.Dense(512, activation='relu'),
    keras.layers.BatchNormalization(),
    keras.layers.Dropout(rate=0.5),
    
    keras.layers.Dense(43, activation='softmax')
])

In [9]:
lr = 0.001
epochs = 1

opt = tf.keras.optimizers.legacy.Adam(learning_rate=lr, decay=lr / (epochs * 0.5))
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])



In [10]:
aug = ImageDataGenerator(
    rotation_range=10,
    zoom_range=0.15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.15,
    horizontal_flip=False,
    vertical_flip=False,
    fill_mode="nearest")

history2 = model.fit(aug.flow(X_train, y_train, batch_size=32), epochs=epochs, validation_data=(X_val, y_val))



In [11]:
test_data = np.load('32_filter_test_data.npy')
test_labels = np.load('32_filter_test_label.npy')

In [12]:
test_data

array([[[[0.68627453, 0.54901963, 0.45490196],
         [0.67450982, 0.53725493, 0.45490196],
         [0.67843139, 0.5411765 , 0.46666667],
         ...,
         [0.57254905, 0.4627451 , 0.39215687],
         [0.58039218, 0.47450981, 0.3882353 ],
         [0.53333336, 0.43137255, 0.35294119]],

        [[0.69411767, 0.55686277, 0.45490196],
         [0.68627453, 0.54901963, 0.4509804 ],
         [0.68627453, 0.5529412 , 0.45882353],
         ...,
         [0.69803923, 0.56470591, 0.47450981],
         [0.68627453, 0.56078434, 0.47843137],
         [0.67843139, 0.55686277, 0.47843137]],

        [[0.68235296, 0.55686277, 0.45882353],
         [0.68627453, 0.5529412 , 0.45882353],
         [0.67843139, 0.54901963, 0.44705883],
         ...,
         [0.70588237, 0.56470591, 0.47058824],
         [0.7019608 , 0.56470591, 0.47843137],
         [0.69803923, 0.56078434, 0.47058824]],

        ...,

        [[0.65882355, 0.53725493, 0.45882353],
         [0.64705884, 0.52941179, 0.45490196]

In [13]:

pred = np.argmax(model.predict(test_data), axis=-1)

# Accuracy on the test data
print('Test Data accuracy: ',accuracy_score(test_labels, pred)*100)

Test Data accuracy:  94.37054631828978


In [14]:
from sklearn.metrics import confusion_matrix
cf = confusion_matrix(test_labels, pred)

## Conclusion

As we can see, the model trained on the image vectors which had the bright-dark filter applied performed better than the model trained on images without the filter (94.4% compared with 93.0% test set accuracy).

As a further test, a copy of this notebook was run on the HPC for 100 epochs. The model trained on the original image vectors scored an accuracy of 98.7% on the test set, whereas the model trained on the filtered image vectors scored 98.8%. Although the improvement is small, we choose to use the image vectors with the filter applied going forward in our model. The model is saved as basic_cnn_model_filter.h5.
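To put the size of such a gap in context, one rough yardstick is the binomial standard error of an accuracy measured on a finite test set. The sketch below is only a back-of-the-envelope calculation: the test set size `n_test` is an assumption (it is not stated in this notebook), and the helper `accuracy_standard_error` is a hypothetical name.

```python
import math

def accuracy_standard_error(accuracy, n_test):
    """Binomial standard error of an accuracy estimated from n_test samples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_test)

n_test = 12630  # ASSUMPTION: test set size, not stated in this notebook
for name, acc in [("original", 0.987), ("filtered", 0.988)]:
    se = accuracy_standard_error(acc, n_test)
    print(f"{name}: {acc * 100:.1f}% +/- {se * 100:.2f} percentage points")
```

Repeating the training with several random seeds would give a more direct picture of run-to-run variation than this approximation.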