# Dragon vs. Bear Classifier

The goal of this work is to build a model that classifies images as either a dragon or a bear. Alas, dragons are fickle creatures and rarely seen in the wild, so this preliminary classifier is built to recognize plush dragons and bears. This notebook shows the first proof of concept of the dragon vs. bear classifier. The image dataset includes some possibly copyrighted data and is therefore not distributed in the repo. As such, this notebook is intended to show the model development process but is not reproducible as written.

In [1]:
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(1338)
import os
import tensorflow as tf
tf.random.set_seed(123)
from tensorflow.keras import layers
import seaborn as sns

## Dataset Exploration

The dataset is 634 images of plush bears and dragons scraped from the web. The dataset is pretty well balanced, with 334 images of dragons and 300 images of bears. The images are loaded as TensorFlow datasets. Loading into TensorFlow datasets accomplishes a few things: it resizes (warps) all images to the same size, shuffles the dataset, separates it into training and validation sets, and defines the batch size. Because our training set is small, the batch size in this work is set to 1. This means the model's weights are updated after every single image, i.e. with stochastic rather than mini-batch gradient descent.
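
The class balance quoted above can be checked before any TensorFlow loading. The following is a minimal sketch, assuming the one-directory-per-class layout that the loader below expects. The helper name `count_images_per_class` and the extension list are assumptions for illustration and are not part of the original notebook.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match the files actually scraped.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

def count_images_per_class(data_dir):
    """Return {class_name: image_count} for each class subdirectory of data_dir."""
    counts = {}
    for class_dir in sorted(Path(data_dir).iterdir()):
        if class_dir.is_dir():
            # Count only regular files whose extension looks like an image.
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts

# Hypothetical usage, with the same default path as the loader below:
# counts = count_images_per_class('../data')
```

A strongly lopsided result would suggest tracking balanced accuracy from the start rather than plain accuracy.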

In [2]:
def load_preprocess_data(data_dir='../data', batch_size=1, img_height=160, img_width=160, display_sample=False):
    """
    Loads and preprocesses training and validation data.
    
    args:
    data_dir (str, default='../data'): directory containing the images,
    each class should have its own directory with the class name being the directory name
    batch_size (int, default=1): batch size, default is set to 1 and training performs stochastic grad descent
    img_height (int, default=160): image height after preprocessing
    img_width (int, default=160): image width after preprocessing
    display_sample (bool, default=False): display samples from data
    
    returns:
    train_ds, val_test_ds (tf.data.Dataset, tf.data.Dataset): training and validation_test datasets
    
    """
    
    try:
    
        #Use keras to load train ds and val ds
        train_ds = tf.keras.preprocessing.image_dataset_from_directory(
            data_dir,
            shuffle=True,
            labels='inferred',
            validation_split=0.2,
            subset="training",
            seed=123,
            image_size=(img_height, img_width),
            batch_size=batch_size)

        val_test_ds = tf.keras.preprocessing.image_dataset_from_directory(
            data_dir,
            shuffle=True,
            labels='inferred',
            validation_split=0.2,
            subset="validation",
            seed=123,
            image_size=(img_height, img_width),
            batch_size=batch_size)
        
        #Display samples from a batch
        if display_sample:
            plt.figure(figsize=(10, 10))
            for images, labels in train_ds.take(1):
                for i in range(min(9, len(labels))):
                    ax = plt.subplot(3, 3, i + 1)
                    plt.imshow(images[i].numpy().astype("uint8"))
                    plt.title(train_ds.class_names[labels[i]])
                    plt.axis("off")
                    
        return train_ds, val_test_ds
        
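    except Exception as err:
        #report the failure, then re-raise so the caller does not try to unpack None
        print("Could not load train and validation datasets:", err)
        raise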
        
train_ds, val_test_ds = load_preprocess_data()

Found 634 files belonging to 2 classes.
Using 508 files for training.
Found 634 files belonging to 2 classes.
Using 126 files for validation.


## Model

The classifier uses Google's pre-trained MobileNetV2 convolutional neural network for feature extraction. This architecture is lightweight, which lends itself to deployment in web or embedded environments. The pre-trained MobileNetV2 network was trained on the ImageNet dataset (1.4 million images comprising 1000 classes). We use transfer learning to take advantage of all the features already learned by the pre-trained architecture.

The complete flow of the new model is as follows:
1. Preprocess images: Rescales the pixel values from [0,255] to [-1,1].
2. Perform data augmentation (optional): Randomly performs horizontal flips and rotations of up to 10% of a full turn (about ±36°) to emulate a more realistic image capture.
3. Load MobileNetV2 neural network: This model is pre-trained with the ImageNet dataset and the last layer is not included.
4. Freeze model: The weights of the pre-trained base network are not updated during training.
5. Add new classification layers: Add a global average pooling layer, a dropout layer (rate 0.2), and a fully connected layer that outputs a single value representing the two classes. The output is a logit.
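
Step 1 of the list above is a simple affine rescaling. As a sketch in plain NumPy (the helper name `rescale_to_unit_range` is hypothetical; the notebook itself relies on `tf.keras.applications.mobilenet_v2.preprocess_input` for this step), dividing by 127.5 and subtracting 1 maps 0 to -1 and 255 to 1:

```python
import numpy as np

def rescale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    # 0 -> -1, 127.5 -> 0, 255 -> 1
    return np.asarray(pixels, dtype=np.float32) / 127.5 - 1.0

# Example: the two extremes and the midpoint of the input range.
rescaled = rescale_to_unit_range([0, 127.5, 255])
```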

This model uses the Adam optimizer to minimize a binary cross-entropy (log loss) cost function. The learning rate is small (0.0001) because stochastic gradient descent (rather than mini-batch) is more stable with small learning rates. The training data set is small, so training is not very time constrained. Therefore, I allowed the model to train for 20 epochs to give it time to converge.
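
Because the final layer emits a raw logit rather than a probability, the loss has to be computed from logits. The following NumPy sketch shows one common, numerically stable way to write binary cross-entropy on logits. It is an illustration under that assumption, not a copy of the Keras internals, and the function names are hypothetical. It also shows why thresholding the logit at 0 is the same as thresholding the probability at 0.5.

```python
import numpy as np

def sigmoid(logit):
    """Convert a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-logit))

def binary_cross_entropy_from_logits(labels, logits):
    """Mean log loss computed directly from logits.

    Uses the stable form max(x, 0) - x*z + log(1 + exp(-|x|)),
    which equals -[z*log(p) + (1-z)*log(1-p)] with p = sigmoid(x).
    """
    z = np.asarray(labels, dtype=np.float64)
    x = np.asarray(logits, dtype=np.float64)
    per_example = np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))
    return per_example.mean()

# A logit of 0 corresponds to a probability of 0.5, so the loss is log(2)
# whichever class is the true label.
loss_at_zero = binary_cross_entropy_from_logits([1, 0], [0.0, 0.0])
```

Since the sigmoid is monotonic and `sigmoid(0) == 0.5`, predicting class 1 when the logit is `>= 0` is the same decision rule as predicting class 1 when the probability is `>= 0.5`.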

The learning curve shows the model's accuracy on the training and validation sets after each epoch. The two curves converge nicely as training progresses. Final accuracy on the validation set is 97.62%. Variance is low (i.e. the model shows no sign of overfitting) since the training and validation accuracy are comparable.

In [None]:
def build_model(IMG_SHAPE, augment_data=True, base_learning_rate = 0.0001):
    """
    Builds a binary image classifier tensorflow model. This model uses the MobileNet V2
    convolutional nn for feature extraction. Transfer learning is performed to use the model for this
    use case. Leveraging transfer learning of the pretrained MobileNet V2 neural network allows us to build
    a deep learning classifier with a lot less data.
    
    args:
    IMG_SHAPE ((int,int,int)): shape of the input images, (height,width,channels)
    augment_data (bool, default=True): If true, data augmentation is performed
    base_learning_rate (float, default=0.0001): learning rate for grad descent during training
    
    returns:
    model (tf.keras.Model)
    
    
    """
    
    preprocess_input = tf.keras.applications.mobilenet_v2.preprocess_input
    
    if augment_data:
        data_augmentation = tf.keras.Sequential([
            tf.keras.layers.experimental.preprocessing.RandomFlip('horizontal'),
            tf.keras.layers.experimental.preprocessing.RandomRotation(0.1),
        ])
    
    #use the MobileNet V2 architecture for the base conv net
    base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')
    #freeze these layers
    base_model.trainable = False
    
    #Build the classification head
    global_average_layer = tf.keras.layers.GlobalAveragePooling2D()
    prediction_layer = tf.keras.layers.Dense(1)
    
    #build model using Keras functional API
    inputs = tf.keras.Input(shape=IMG_SHAPE)
    if augment_data:
        x = data_augmentation(inputs)
        x = preprocess_input(x)
    else:
        x = preprocess_input(inputs)
    x = base_model(x, training=False)
    x = global_average_layer(x)
    x = tf.keras.layers.Dropout(0.2)(x)
    outputs = prediction_layer(x)
    model = tf.keras.Model(inputs, outputs)
    
    #compile model
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
                  loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    
    return model

#build and compile model
model = build_model(train_ds.element_spec[0].shape[-3:])

#train model
history = model.fit(train_ds,
                    epochs=20,
                    validation_data=val_test_ds)   
    

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20

In [None]:
def plot_learning_curve(history):
    """
    Plots the learning curves for model training.
    
    args:
    history (tf.keras.callbacks.History): history callback returned from model training
    
    returns:
    None
    
    """
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']

    loss = history.history['loss']
    val_loss = history.history['val_loss']

    plt.figure(figsize=(8, 8))
    plt.subplot(2, 1, 1)
    plt.plot(acc, label='Training Accuracy')
    plt.plot(val_acc, label='Validation Accuracy')
    plt.legend(loc='lower right')
    plt.ylabel('Accuracy')
    plt.ylim([min(plt.ylim()),1])
    plt.title('Training and Validation Accuracy')

    plt.subplot(2, 1, 2)
    plt.plot(loss, label='Training Loss')
    plt.plot(val_loss, label='Validation Loss')
    plt.legend(loc='upper right')
    plt.ylabel('Cross Entropy')
    plt.ylim([0,1.0])
    plt.title('Training and Validation Loss')
    plt.xlabel('epoch')
    plt.show()
    
plot_learning_curve(history)

## Performance Analysis

We look at a couple different metrics for our final analysis on the validation dataset.

First and foremost, we examine the overall accuracy since this was the classifier's optimizing metric. The validation set here is pretty small (126 images), which limits the statistical significance of the final performance metrics. However, the classifier's overall accuracy was 96.8%, which I consider adequate.

A confusion matrix shows the breakdown of actual labels vs. predicted classes. The true positive (bear) rate was 98.1%, while the true negative (dragon) rate was lower at 95.9%. In other words, the classifier is more likely to recognize a bear when the image is a bear than to recognize a dragon when the image is a dragon. To give a balanced view of how well the classifier handles both classes, I calculated the balanced accuracy, which is the mean of the two rates, at 97.0%.
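
The relationship between these four numbers can be sketched with a small helper. The counts used below are an assumption: they are one set of integers consistent with the percentages quoted above (126 images, 4 errors), not values read from the notebook's output. The function name `summarize_binary_counts` is hypothetical.

```python
def summarize_binary_counts(correct_pos, total_pos, correct_neg, total_neg):
    """Compute accuracy-style metrics from per-class correct/total counts."""
    tpr = correct_pos / total_pos  # fraction of actual positives (bears) found
    tnr = correct_neg / total_neg  # fraction of actual negatives (dragons) found
    return {
        "accuracy": (correct_pos + correct_neg) / (total_pos + total_neg),
        "true_positive_rate": tpr,
        "true_negative_rate": tnr,
        "balanced_accuracy": (tpr + tnr) / 2,
    }

# Illustrative counts only (assumed, see the note above):
# 52 of 53 bears and 70 of 73 dragons classified correctly.
metrics = summarize_binary_counts(correct_pos=52, total_pos=53,
                                  correct_neg=70, total_neg=73)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Balanced accuracy weights both classes equally regardless of how many images each has, which is why it differs slightly from overall accuracy whenever the classes are not the same size.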

For extra visual validation, I show a sample of correctly and incorrectly labeled images below. Given the small validation set and even smaller number of incorrectly classified images (i.e. 4 images), it is not immediately obvious at this time why those images were not correctly classified by our model.

In [None]:
#cache the validation data so repeated passes (predictions, then labels) see the examples in the same order
val_test_ds_cahced = val_test_ds.cache()

#show a confusion matrix
def plot_conf_mat(tf_model, ds, class_names):
    
    #run inference on dataset
    predictions = tf_model.predict(ds).squeeze()
    #convert logits to classes
    logit_to_pred = lambda x: x>=0
    #convert to int
    predictions = logit_to_pred(predictions).astype(int)
    
    #create a labels array
    labels = ds.map(lambda img, label: int(label == 1))
    labels = np.fromiter(labels.as_numpy_iterator(), int)
    
    num_predict0_label0 = sum((predictions==0) & (labels==0))
    num_predict1_label1 = sum((predictions==1) & (labels==1))
    num_predict0_label1 = sum((predictions==0) & (labels==1))
    num_predict1_label0 = sum((predictions==1) & (labels==0))
    
    conf_matrix_vals = [[num_predict0_label0, num_predict0_label1],[num_predict1_label0, num_predict1_label1]]

    ax = sns.heatmap(conf_matrix_vals, cmap="YlGnBu", annot=True, xticklabels=class_names, yticklabels=class_names)
    ax.set_xlabel('Labels', fontsize='large')
    ax.set_ylabel('Predictions', fontsize='large')
    
    return conf_matrix_vals

conf_matrix_vals = plot_conf_mat(model, val_test_ds_cahced, val_test_ds.class_names)


#Look at some metrics
print('Overall accuracy: ', (conf_matrix_vals[0][0] + conf_matrix_vals[1][1])/(conf_matrix_vals[0][0] + conf_matrix_vals[1][1]+conf_matrix_vals[0][1] + conf_matrix_vals[1][0]))
print('True positive (bear) rate: ', (conf_matrix_vals[0][0])/(conf_matrix_vals[0][0] + conf_matrix_vals[1][0]))
print('True negative (dragon) rate: ', (conf_matrix_vals[1][1])/(conf_matrix_vals[1][1] + conf_matrix_vals[0][1]))
print('Balanced Accuracy: ', (conf_matrix_vals[0][0])/(conf_matrix_vals[0][0] + conf_matrix_vals[1][0])/2 + (conf_matrix_vals[1][1])/(conf_matrix_vals[1][1] + conf_matrix_vals[0][1])/2)


In [None]:
def disp_predicted_images(tf_model, ds, class_names, disp_mislabeled=True):
    """
    Displays images and their predictions, either all incorrect or all correct predictions.
    
    args:
    tf_model (tf.keras.Model): compiled and trained tf model
    ds (tf.data.Dataset): cached dataset to perform predictions on
    class_names (list): list of class names
    disp_mislabeled (bool, default=True): If true, displays incorrect predictions, else correct predictions.
    
    returns:
    None
    
    """
    
    #run inference on dataset
    predictions = tf_model.predict(ds).squeeze()
    #convert logits to classes
    logit_to_pred = lambda x: x>=0
    #convert to int
    predictions = logit_to_pred(predictions).astype(int)
    
    #create a labels array
    labels = ds.map(lambda img, label: int(label == 1))
    labels = np.fromiter(labels.as_numpy_iterator(), int)

    
    #compare labels and predictions to find correct/incorrect classifications
    if disp_mislabeled:
        labels_mask = predictions != labels
    else:
        labels_mask = predictions == labels
    labels_indices = np.where(labels_mask)[0]

    #materialize the dataset once so images can be looked up by index
    examples = list(ds.as_numpy_iterator())

    #if there are more than 9 matches, plot a random sample of 9 (without repeats)
    if len(labels_indices) > 9:
        labels_indices = np.random.choice(labels_indices, 9, replace=False)

    #plot up to 9 images in a 3x3 grid, titled with the predicted class
    plt.figure(figsize=(10, 10))
    for i, ds_i in enumerate(labels_indices):
        image, label = examples[ds_i]
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(image.squeeze().astype("uint8"))
        plt.title(class_names[predictions[ds_i]])
        plt.axis("off")
    
disp_predicted_images(model, val_test_ds_cahced, val_test_ds.class_names, False)        
    

In [None]:
disp_predicted_images(model, val_test_ds_cahced, val_test_ds.class_names, True)        


## Export

After model training and tuning were completed, I exported the model for deployment.

In [None]:
version = 1
export_path = os.path.join(os.getcwd(), 'dragon_bear_classifier_mobilenetv2', str(version))
print('export_path = {}\n'.format(export_path))

tf.keras.models.save_model(
    model,
    export_path,
    overwrite=True,
    include_optimizer=True,
    save_format=None,
    signatures=None,
    options=None
)