#### Notes

_This notebook demonstrates how to train a patch-based image classifier using Keras (for more tutorials, see Keras' [website](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html))._

"The CNN patch-based image classifier that was used to provide numerous additional sparse labels to each image as described in the workflow used the EfficientNet-B0 architecture. Instead of using the typical “ImageNet” weights, the classifier was initialized with the “Noisy-Student” weights, which were learned using a semi-supervised training scheme that outperformed the former (Xie et al., 2020). This encoder was followed by a max pooling operation, a dropout layer (80%), and finally a single fully connected layer with seven output nodes (one for each of the class categories). Patches were resized to 224 pixels × 224 pixels and fed to the model as training data after heavy augmentation techniques were applied using the ImgAug (Jung, 2019) library, and normalized to have pixel values between 0 and 1.

The task is considered a multi-categorical classification, therefore the network used a softmax activation function resulting in an output representing the probability distribution of each potential class category. The batch size was set to 32 as this was the largest amount possible given the network architecture, the size of the image patches, and the amount of memory that could be allocated by the GPU being used. The model was trained on 10,000 image patches that were randomly split into a training (90%) and validation (10%) set for 25 epochs; the final model was evaluated using the test set that consisted of 50 manually created ground-truth dense labels (see Table 2).

During training the error between the actual and predicted output was calculated using the categorical-cross entropy loss function. Parameters throughout the network were adjusted using the Adam optimizer with an initial learning rate of 10⁻⁴. During training the learning rate was reduced by a factor of 0.5 for every three epochs in which the validation loss failed to decrease, and the weights from the epoch with the lowest validation loss were archived."
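
The softmax output and the categorical cross-entropy loss described above can be illustrated with a small NumPy sketch. This is not the Keras implementation; the logits and labels are made-up values for two patches over the seven class categories, and the helper names `softmax` and `categorical_cross_entropy` are introduced here only for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1 along the last axis."""
    # Subtracting the row maximum does not change the result but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def categorical_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_prob))."""
    y_prob = np.clip(y_prob, eps, 1.0)  # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=-1)))

# Hypothetical logits for two patches over the seven class categories
logits = np.array([[2.0, 0.1, 0.3, 0.0, 1.0, 0.2, 0.1],
                   [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]])

# One-hot ground truth: patch 0 belongs to class 0, patch 1 to class 6
y_true = np.eye(7)[[0, 6]]

probs = softmax(logits)
loss = categorical_cross_entropy(y_true, probs)

print(probs.sum(axis=1))  # each row sums to 1
print(loss)
```

The second patch has identical logits for every class, so its probabilities are uniform (1/7 each) and its contribution to the loss is log(7); a confident correct prediction contributes a value closer to zero.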

![alt text](../Figures/getting_dense_labels.png)

In [None]:
import os
import glob

import numpy as np
import pandas as pd
from skimage import io
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split

In [None]:
import keras
from keras import backend as K
from keras.models import Sequential
from keras import optimizers, losses, metrics
from keras.layers import Dense, Activation, Dropout
from keras.applications.nasnet import NASNetMobile
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import cross_val_score

import efficientnet.keras as efn 
from imgaug import augmenters as iaa

In [None]:
# labels for each class category of interest used for Pierce et al., 2021

class_categories = {'Branching' : 0, 
                      'Fish' : 1, 
                      'Massive' : 2,
                      'Not Massive' : 3,
                      'Substrate' : 4,
                      'Target' : 5,
                      'Water' : 6}

In [None]:
# Collect all of the patches for all of the classes created using the `patch_extractor.exe` tool
data = glob.glob(os.path.join("Patches", "**", "*.bmp"), recursive = True)
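
The glob pattern assumes one sub-folder per class under `Patches`, so a patch's class can be recovered from the name of its parent folder. The sketch below shows one way to do that regardless of which path separator is used; the helper name `label_from_path` and the example file names are hypothetical.

```python
from pathlib import PureWindowsPath

def label_from_path(path):
    """Return the name of the folder that directly contains the patch file."""
    # PureWindowsPath treats both '\\' and '/' as separators, so it can parse
    # paths written in either style without touching the file system.
    return PureWindowsPath(path).parent.name

# Hypothetical patch paths, one per separator style
example_paths = [
    "Patches\\Branching\\patch_0001.bmp",
    "Patches/Fish/patch_0002.bmp",
]

labels = [label_from_path(p) for p in example_paths]
print(labels)  # ['Branching', 'Fish']
```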

In [None]:
# Now split the data into a training, validation and test set. 
# Feel free to vary the ratio of the test_size parameter.
training_files, test_files = train_test_split(data, test_size = .1)
training_files, validation_files  = train_test_split(training_files, test_size = .1)

# Patches were extracted and stored in a folder structure such that
# all patches of a class were grouped together. This assumes that
# the folders have the same name as the class categories.
training_labels = [os.path.basename(os.path.dirname(file)) for file in training_files]
validation_labels = [os.path.basename(os.path.dirname(file)) for file in validation_files]
test_labels = [os.path.basename(os.path.dirname(file)) for file in test_files]

# Creating a pandas dataframe for each set.
train = pd.DataFrame(data = list(zip(training_files, training_labels)), columns = ['images', 'labels'])
valid = pd.DataFrame(data = list(zip(validation_files, validation_labels)), columns = ['images', 'labels'])
test = pd.DataFrame(data = list(zip(test_files, test_labels)), columns = ['images', 'labels'])

len(train), len(valid), len(test)
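
Because the validation set is carved out of what remains after the test split, two successive 10% splits give roughly 81% training, 9% validation and 10% test data. The sketch below reproduces that arithmetic with the standard library only; `two_stage_split` is a hypothetical stand-in for the two `train_test_split` calls, and the 10,000 items are placeholder IDs rather than real patches.

```python
import random

def two_stage_split(items, test_size=0.1, valid_size=0.1, seed=0):
    """Hold out a test fraction first, then a validation fraction of the remainder."""
    items = list(items)
    random.Random(seed).shuffle(items)

    n_test = round(len(items) * test_size)
    test, rest = items[:n_test], items[n_test:]

    n_valid = round(len(rest) * valid_size)
    valid, train = rest[:n_valid], rest[n_valid:]
    return train, valid, test

# Placeholder IDs standing in for 10,000 patch files
train_ids, valid_ids, test_ids = two_stage_split(range(10000))
print(len(train_ids), len(valid_ids), len(test_ids))  # 8100 900 1000
```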

In [None]:
# Augmentation methods implemented using imgaug; training augmentations should be 
# more intense, whereas the validation and testing augmentations should be minimal to none.

# Setting the dropout rate for the model's classification head (a form of regularization)
dropout_rate = 0.80

augs_for_train = iaa.Sequential([   iaa.Resize(224, interpolation = 'linear'),
                          iaa.Fliplr(0.5),
                          iaa.Flipud(0.5),
                          iaa.Rot90([1, 2, 3, 4], True),
                          iaa.Sometimes(.3, iaa.Affine(scale = (.95, 1.05))),
                          iaa.Sometimes(.1, iaa.Invert(1.0)),
                          iaa.Sometimes(.5, iaa.SomeOf((0, 1), 
                                             [
                                                 iaa.MedianBlur(3),
                                                 iaa.ChannelShuffle(.7),
                                                 iaa.EdgeDetect(.5)
                                             ])),

                          iaa.Sometimes(.5, iaa.SomeOf((0, 1),
                                            [
                                                 iaa.Dropout(.2),
                                                 iaa.ImpulseNoise(.2),
                                                 iaa.SaltAndPepper(.2)
                                            ]))
                       ])


augs_for_valid = iaa.Sequential([iaa.Resize(224, interpolation = 'linear')])
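
To make the geometric part of the training pipeline concrete, the sketch below applies the equivalent NumPy operations to a tiny array. It only illustrates what flips and 90-degree rotations do to pixel positions; it does not use imgaug, and the rotation direction of `np.rot90` may differ from that of `iaa.Rot90`.

```python
import numpy as np

# A tiny 2x3 single-channel "image" so the effect of each transform is visible
image = np.array([[1, 2, 3],
                  [4, 5, 6]])

flipped_lr = np.fliplr(image)        # mirror left-right, comparable to iaa.Fliplr
flipped_ud = np.flipud(image)        # mirror top-bottom, comparable to iaa.Flipud
rotated_once = np.rot90(image, k=1)  # one quarter turn (counter-clockwise in NumPy)
rotated_four = np.rot90(image, k=4)  # four quarter turns restore the original

print(flipped_lr)
print(flipped_ud)
print(rotated_once)
print(np.array_equal(rotated_four, image))
```

Four quarter turns are a full rotation, so drawing `k` from `[1, 2, 3, 4]` leaves the patch unrotated whenever 4 is picked.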

In [None]:
# Data generators take the patch file paths currently stored in the dataframes; the generators
# create an augmentation pipeline so that patches can be read, augmented, and normalized on-the-fly 
# while training.

# Batch size is dependent on the amount of memory available on your machine
batch_size = 32

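# Defines the length of an epoch so that all images are used; rounding up keeps
# the step counts whole numbers and includes the final, partially filled batch
steps_per_epoch_train = int(np.ceil(len(train) / batch_size))
steps_per_epoch_valid = int(np.ceil(len(valid) / batch_size))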

# Learning rate 
lr = .0001

# Training images are augmented, and then normalized
train_augmentor = ImageDataGenerator(preprocessing_function = augs_for_train.augment_image,
                                     rescale = 1.0/255.0)
                                     
                                                                   
# Reading from dataframe
train_generator = train_augmentor.flow_from_dataframe(dataframe = train, directory = None,
                                                      x_col = 'images', y_col = 'labels', target_size = (224, 224), 
                                                      color_mode = "rgb",  class_mode = 'categorical', 
                                                      batch_size = batch_size, shuffle = True, seed = 42)
                                                     

# Only normalize images, no augmentation
validate_augmentor = ImageDataGenerator( preprocessing_function = augs_for_valid.augment_image,
                                         rescale = 1.0/255.0 )

# Reading from dataframe                             
validation_generator = validate_augmentor.flow_from_dataframe(dataframe = valid, directory = None, 
                                                              x_col = 'images', y_col = 'labels', target_size = (224, 224), 
                                                              color_mode = "rgb",  class_mode = 'categorical', 
                                                              batch_size = batch_size, shuffle = True, seed = 42)
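
An epoch that covers every patch needs a whole number of batches, with the last batch possibly smaller than `batch_size`. The short sketch below works through that arithmetic; the helper name `steps_per_epoch` and the set sizes (8,100 and 900 patches) are assumptions chosen for illustration.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)

# Hypothetical set sizes with the batch size of 32 used in this notebook
train_steps = steps_per_epoch(8100, 32)  # 8100 / 32 = 253.125
valid_steps = steps_per_epoch(900, 32)   # 900 / 32 = 28.125

print(train_steps, valid_steps)  # 254 29
```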

In [None]:
# Now we create the model!
# Here we load EfficientNet-B0 as found in the Qubvel repo (link below) and provide the Noisy-Student weights
# https://github.com/qubvel/efficientnet

model = Sequential([
        efn.EfficientNetB0(weights = 'noisy-student', include_top = False,  pooling = 'max'),
        Dropout(dropout_rate),
        Dense(len(list(class_categories))),
        Activation('softmax')
])

# Display the model architecture
model.summary()
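
The layers stacked on top of the encoder can be traced by hand with NumPy. The sketch below is an illustration rather than the Keras model: the feature map is random data shaped like the output of EfficientNet-B0 without its top for one 224 × 224 patch (7 × 7 × 1280), the dense weights are random, and dropout is omitted because it is inactive at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the encoder output of a single patch: a 7x7 grid of 1280-channel features
features = rng.normal(size=(1, 7, 7, 1280))

# pooling = 'max' keeps the largest activation of each channel across the grid
pooled = features.max(axis=(1, 2))  # shape (1, 1280)

# Fully connected layer with one output per class (random, untrained weights)
num_classes = 7
weights = rng.normal(scale=0.01, size=(1280, num_classes))
bias = np.zeros(num_classes)
logits = pooled @ weights + bias  # shape (1, 7)

# Softmax turns the logits into a probability distribution over the classes
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probabilities = exp / exp.sum(axis=1, keepdims=True)

# Parameters in the dense layer: 1280 * 7 weights plus 7 biases
head_params = weights.size + bias.size

print(pooled.shape, probabilities.shape, head_params)  # (1, 1280) (1, 7) 8967
```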

In [None]:
# Defining the Recall and Precision metric functions

def recall_m(y_true, y_pred):
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
        recall = true_positives / (possible_positives + K.epsilon())
        return recall

def precision_m(y_true, y_pred):
        true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
        predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
        precision = true_positives / (predicted_positives + K.epsilon())
        return precision
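
The two metric functions round the predicted probabilities, so a class only counts as "predicted" when its probability is above 0.5. The NumPy sketch below mirrors the same arithmetic on a made-up batch of three patches and three classes; the `_np` function names and the example values are introduced here for illustration.

```python
import numpy as np

def recall_np(y_true, y_pred, eps=1e-7):
    """Fraction of true labels whose predicted probability rounds to 1."""
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    possible_positives = np.sum(np.round(np.clip(y_true, 0, 1)))
    return true_positives / (possible_positives + eps)

def precision_np(y_true, y_pred, eps=1e-7):
    """Fraction of rounded-to-1 predictions that match the true label."""
    true_positives = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))
    predicted_positives = np.sum(np.round(np.clip(y_pred, 0, 1)))
    return true_positives / (predicted_positives + eps)

y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]], dtype=float)

y_pred = np.array([[0.8, 0.1, 0.1],   # above 0.5 and correct
                   [0.6, 0.3, 0.1],   # above 0.5 but wrong class
                   [0.4, 0.3, 0.3]])  # nothing above 0.5, so no class is counted

recall = recall_np(y_true, y_pred)        # 1 of 3 true labels recovered
precision = precision_np(y_true, y_pred)  # 1 of 2 confident predictions correct

print(round(recall, 3), round(precision, 3))  # 0.333 0.5
```

Note that these values are computed per batch and averaged by Keras during training, so they approximate rather than equal the precision and recall over the full dataset.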

In [None]:
# Defining training callbacks. ReduceLROnPlateau multiplies the learning rate by 0.65
# (a 35% reduction) when the validation loss has not improved for two epochs.
# ModelCheckpoint saves weights only for epochs that reach a new lowest validation loss.

# exist_ok = True lets this cell be re-run without raising an error
os.makedirs("weights", exist_ok = True) 

hollabackgirl = [
                 ReduceLROnPlateau(monitor = 'val_loss', factor = .65, patience = 2, verbose = 1),
                 ModelCheckpoint(filepath = os.path.join('weights', 'model-{epoch:03d}-{acc:03f}-{val_acc:03f}.h5'), 
                                 monitor='val_loss', save_weights_only = True, 
                                 save_best_only = True, verbose = 1),
                ]
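
How a reduce-on-plateau schedule plays out over a run can be simulated without Keras. The sketch below is a simplified model of the idea (it ignores the callback's `min_delta`, `cooldown` and `min_lr` options), and the validation losses are invented; `simulate_reduce_on_plateau` is a hypothetical helper, not part of any library.

```python
def simulate_reduce_on_plateau(val_losses, initial_lr=1e-4, factor=0.65, patience=2):
    """Return the learning rate in effect after each epoch (simplified logic)."""
    lr = initial_lr
    best = float("inf")
    wait = 0  # epochs since the validation loss last improved
    lrs = []
    for loss in val_losses:
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor  # shrink the learning rate and start counting again
                wait = 0
        lrs.append(lr)
    return lrs

# Invented validation losses: two plateaus of two epochs each
val_losses = [1.00, 0.80, 0.85, 0.82, 0.70, 0.75, 0.74]
schedule = simulate_reduce_on_plateau(val_losses)

for epoch, (loss, lr) in enumerate(zip(val_losses, schedule), start=1):
    print(f"epoch {epoch}: val_loss={loss:.2f} lr={lr:.3e}")
```

With these losses the rate drops from 1e-4 to 6.5e-5 after epoch 4 and to about 4.2e-5 after epoch 7, since each plateau lasts two epochs.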

In [None]:
# Sets the loss function, optimizer and metrics; these probably don't need to change,
# except maybe the learning rate 

model.compile(loss = 'categorical_crossentropy',
              optimizer = optimizers.Adam(lr = lr), 
              metrics=['acc', precision_m, recall_m])

In [None]:
# Train the model; the results of the training are logged in history

# Number of passes over the training set (25 epochs, as described in the notes above)
num_epochs = 25

history = model.fit_generator(train_generator, 
                              steps_per_epoch = steps_per_epoch_train, 
                              epochs = num_epochs, 
                              validation_data = validation_generator, 
                              validation_steps = steps_per_epoch_valid,
                              callbacks = hollabackgirl,
                              verbose = 1)  

In [None]:
# After training, loads the best weights
model.load_weights('weights\\path_to_best_weights.h5')

In [None]:
# Reads from dataframe for test set
test_generator = validate_augmentor.flow_from_dataframe(dataframe = test, 
                                                 x_col = 'images', y_col = 'labels', target_size = (224, 224), 
                                                 color_mode = "rgb",  class_mode = 'categorical', 
                                                 batch_size = batch_size, shuffle = False, seed = 42)
# Defines the length of an epoch; rounding up includes the final, partially filled batch
steps_per_epoch_test = int(np.ceil(len(test) / batch_size))

In [None]:
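# Predicts on the test set and provides the accuracy and a confusion matrix of the results.
# The raw predictions are stored for thresholding; shuffling needs to stay off for the
# test generator so that predictions line up with test_generator.classes.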
predictions = model.predict_generator(test_generator, steps = steps_per_epoch_test)
predict_classes = np.argmax(predictions, axis = 1)

test_y = test_generator.classes
print("# of images:", len(predict_classes))
print(accuracy_score(y_true = test_y, y_pred = predict_classes))
print(confusion_matrix(y_true = test_y, y_pred = predict_classes))
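
The relationship between the confusion matrix and the accuracy can be checked by hand on a small example. The sketch below builds the matrix with NumPy using the same convention as scikit-learn (rows are true classes, columns are predicted classes); the labels are invented and `confusion_matrix_np` is a hypothetical helper.

```python
import numpy as np

def confusion_matrix_np(y_true, y_pred, num_classes):
    """Count samples per (true class, predicted class) pair."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when an index pair repeats
    np.add.at(matrix, (y_true, y_pred), 1)
    return matrix

# Invented labels for six patches and three classes
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm = confusion_matrix_np(y_true, y_pred, num_classes=3)

# Correct predictions lie on the diagonal, so accuracy = trace / total
accuracy = np.trace(cm) / cm.sum()

print(cm)
print(accuracy)
```

Off-diagonal entries show which classes are confused with each other; here one class-0 patch was predicted as class 1 and one class-2 patch as class 0.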

In [None]:
# The threshold is applied to the margin between the two highest class probabilities;
# higher values represent more sure/confident predictions
# .1 unsure -> .5 pretty sure -> .9 very sure

# Look at creating a graph of the threshold values and the accuracy,
# useful for determining how sure the model is when making predictions

threshold_values = np.arange(0.0, 1.0, 0.05)
class_ACC = []

for threshold in threshold_values:
    sure_index = []

    for i in range(0, len(predictions)):
        if( (sorted(predictions[i])[-1]) - (sorted(predictions[i])[-2]) > threshold):
            sure_index.append(i)

    sure_test_y = np.take(test_y, sure_index, axis = 0)
    sure_pred_y = np.take(predict_classes, sure_index)

    class_ACC.append(accuracy_score(sure_test_y, sure_pred_y)) 

plt.figure(figsize=(10, 5))
plt.plot(threshold_values, class_ACC)
plt.xlabel('Threshold Values')
plt.xlim([0, 1])
plt.xticks(ticks = np.arange(0, 1.05, 0.1))
plt.ylabel('Classification Accuracy')
plt.title('Identifying the ideal threshold value')
plt.show()
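
The confidence measure used in the loop above is the gap between the highest and second-highest class probability. A vectorized sketch of the same idea is shown below on four made-up predictions; `accuracy_above_margin` is a hypothetical helper that also reports how many samples survive the threshold, since the accuracy is undefined when none do.

```python
import numpy as np

def accuracy_above_margin(probabilities, y_true, threshold):
    """Accuracy over samples whose top-two probability margin exceeds the threshold."""
    top_two = np.sort(probabilities, axis=1)[:, -2:]  # two largest values per row
    margin = top_two[:, 1] - top_two[:, 0]
    keep = margin > threshold
    if not keep.any():
        return float("nan"), 0  # no sample is confident enough
    predicted = probabilities.argmax(axis=1)
    return float((predicted[keep] == y_true[keep]).mean()), int(keep.sum())

probabilities = np.array([[0.90, 0.05, 0.05],   # margin 0.85, correct
                          [0.50, 0.45, 0.05],   # margin 0.05, wrong
                          [0.20, 0.70, 0.10],   # margin 0.50, correct
                          [0.40, 0.35, 0.25]])  # margin 0.05, correct
y_true = np.array([0, 1, 1, 0])

print(accuracy_above_margin(probabilities, y_true, 0.0))  # (0.75, 4)
print(accuracy_above_margin(probabilities, y_true, 0.3))  # (1.0, 2)
```

Raising the threshold trades coverage for accuracy: fewer patches are kept, but the ones that remain are more likely to be labeled correctly.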

In [None]:
model.save("Best_Model_and_Weights.h5")