[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jabascal/deep-learning-for-computer-vision-with-keras/blob/main/image_classification_using_transfer_learning_ensembles.ipynb)

# Image classification using transfer learning and ensemble methods

The objective of this notebook is to build a classifier that achieves the highest possible accuracy on the given dataset using transfer learning and fine-tuning. An ensemble method is then applied to further improve performance.

For *transfer learning*, we use the weights of an architecture pretrained on a large-scale dataset (ImageNet), excluding the top classification layer. Then, we freeze the remaining layers and add a classification layer adapted to the number of classes in our dataset. Finally, we train the new model, which has very few trainable weights (a few thousand) compared to the base model (several million), on our specific, smaller dataset.

For *fine tuning*, we unfreeze some or all of the top layers of the base model and then tune their weights with a lower learning rate to obtain higher accuracy.

For *ensemble methods*, several models (here, built on different pretrained base models) are trained and their predictions are then combined by averaging their outputs (soft voting).

## Import dependencies

In [None]:
mode_install = False
if mode_install:
    !pip install tensorflow \
        pillow \
        matplotlib \
        pandas \
        scikit-learn

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import os

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
print(tf.__version__)

import pathlib
import random
import time
import PIL
import json

To use a **GPU** in Colab, go to 'Runtime/Change runtime type'. To check whether a GPU is available and which resources it provides, run the following code:

In [None]:
# Device name (empty string if no GPU is found)
print(tf.test.gpu_device_name())

# List local devices, e.g. GPU (Tesla) with its memory limit (14GB)
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())


In [None]:
# Memory resources
!cat /proc/meminfo

## Download data

### Mount Google Drive and set paths

Start by mounting your Google Drive:

In [None]:
# Mount google drive to access files via colab
mode_colab = False
if mode_colab:
    from google.colab import drive
    drive.mount("/content/gdrive")

Specify the path of the notebook, something like /content/gdrive/MyDrive/deep-learning-for-computer-vision-with-keras/codetest/ (click on the link to open the contents in the left panel), and a path to save results.

In [None]:
name_save = "ensembles"

# Paths for results and data
if mode_colab:
    save_dir = "/content/gdrive/MyDrive/Colab_Notebooks/Results/ensembles"
    data_dir = "/content/gdrive/MyDrive/Colab_Notebooks/Data"
else:
    save_dir = "../../Results/ensembles"
    data_dir = "../../Data"

# Create the directories (including missing parent folders) if needed
for directory in (save_dir, data_dir):
    if not os.path.exists(directory):
        os.makedirs(directory)
        print(f"Directory: {directory} created.")

### Download data

Download the flower photos dataset automatically into the data directory (on your drive when running in Colab).

In [None]:
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, untar=True, cache_dir=data_dir)
print(f"Data downloaded to {os.path.abspath(data_dir)}")
!rm "{data_dir}/../flower_photos.tar.gz"

## Define parameters and general functions

In [None]:
# Batch size and image dimensions
batch_size = 32
img_height = 180
img_width = 180

learning_rate = 1e-4

## Data pipeline

### Create dataset

We use the TensorFlow data API (`tf.data`) to automate the data pipeline: chaining transformations (preprocessing and data augmentation) and shuffling the data.

Next, we create the datasets with 'image_dataset_from_directory', which builds labeled dataset objects from the class subfolders of the specified directory. The data are split into training, validation and test sets.

In [None]:
# Create data set 
# Split in training and validation
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size,
  shuffle=True)

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size,
  shuffle=True)

# Class names
class_names = train_ds.class_names
num_classes = len(class_names)
print('Classes names: ')
print(class_names)

# Split Validation and Test
val_batch_size = val_ds.cardinality().numpy()
test_ds = val_ds.take(int(0.5*val_batch_size))
val_ds = val_ds.skip(int(0.5*val_batch_size))
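
As a rough check of the validation/test split above, the following sketch computes how many batches end up in each set. The helper name `split_batches` and the example count of 734 validation images are assumptions for illustration; the actual count is printed by 'image_dataset_from_directory'.

```python
import math

def split_batches(n_samples, batch_size, test_fraction=0.5):
    """Return (test_batches, val_batches) for a take/skip split.

    Hypothetical helper mirroring `take(int(0.5 * n))` / `skip(int(0.5 * n))`.
    """
    n_batches = math.ceil(n_samples / batch_size)  # the last batch may be partial
    n_test = int(test_fraction * n_batches)        # batches kept by take()
    n_val = n_batches - n_test                     # batches left after skip()
    return n_test, n_val

# Assumed example: 734 validation images in batches of 32
n_test, n_val = split_batches(734, 32)
print(f"test batches: {n_test}, validation batches: {n_val}")
```

Since `take` keeps the first batches, a partial last batch, if any, stays in the validation part of the split.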


An efficient pipeline can be obtained using 'cache', which keeps the data in memory after the first epoch, and 'prefetch', which prepares the next batch while the model is being trained on the current batch on the GPU. The training data are shuffled at each epoch.

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

# shuffle after cache
train_ds = train_ds.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
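
To give an intuition for the `shuffle(1000)` call, here is a minimal pure-Python sketch of buffer-based shuffling. It is a simplified illustration of the idea, not the `tf.data` implementation; the function name `buffered_shuffle` and its `seed` argument are assumptions.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in pseudo-random order using a fixed-size buffer."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer first
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and replace it with the new item
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain the remaining elements in random order
    while buffer:
        idx = rng.randrange(len(buffer))
        buffer[idx], buffer[-1] = buffer[-1], buffer[idx]
        yield buffer.pop()

# A buffer of size 1 keeps the original order; a larger buffer mixes elements
print(list(buffered_shuffle(range(10), buffer_size=1)))
print(list(buffered_shuffle(range(10), buffer_size=10, seed=0)))
```

In this sketch, a buffer at least as large as the dataset gives a full shuffle, while a small buffer only mixes neighbouring elements. In the notebook, `train_ds` is already batched, so the shuffled elements are whole batches.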

### Data visualization 

In [None]:
# Display several images
fig = plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
  for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(images[i].numpy().astype("uint8"))
    plt.title(class_names[labels[i]])
    plt.axis("off")
# Save the figure once, after all images have been drawn
fig.savefig(os.path.join(save_dir, f"{name_save}_grid.png"), bbox_inches='tight', dpi=300)

Display an image per class

In [None]:
# Display an image per class
fig, axs = plt.subplots(1, num_classes, figsize=(15, 5))
for i, class_name in enumerate(class_names):
    # Get an image from the class
    image_path = os.path.join(data_dir, class_name, os.listdir(os.path.join(data_dir, class_name))[0])
    image = PIL.Image.open(image_path)
    
    # Display the image
    axs[i].imshow(image)
    axs[i].set_title(class_name)
    axs[i].axis('off')

plt.show()
fig.savefig(os.path.join(save_dir, f"{name_save}_classes.png"), bbox_inches='tight', dpi=300)


Images contain several objects and different backgrounds, which may make the classification task harder.

Number of images per class:

In [None]:
# Number of samples per class
data_dir = pathlib.Path(data_dir)

class_counts = []
for class_name in class_names:
    class_count = len(list(data_dir.glob(class_name+'/*.jpg')))
    class_counts.append(class_count)

fig = plt.figure(figsize=(4,4))
plt.barh(class_names, class_counts)
plt.title('Number per class')
plt.show()
fig.savefig(os.path.join(save_dir, f"{name_save}_classes_counts.png"), bbox_inches='tight', dpi=300)

### Data augmentation

Data augmentation is performed using random flip, rotation and zoom. Many more operations can be used, such as random contrast, brightness, hue, saturation, etc. See [tf.image](https://www.tensorflow.org/api_docs/python/tf/image/) for more details.

In [None]:
# Data augmentation
data_augmentation = keras.Sequential(
  [
    layers.RandomFlip("horizontal",
                      input_shape=(img_height, img_width, 3)),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
  ]
)

# Data augmentation example
fig = plt.figure(figsize=(7, 7))
for images, _ in train_ds.take(1):
  for i in range(9):
    augmented_images = data_augmentation(images)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_images[0].numpy().astype("uint8"))
    plt.axis("off")
fig.savefig(os.path.join(save_dir, f"{name_save}_augmentation.png"), bbox_inches='tight', dpi=300)

Callback for early stopping:

In [None]:
# Early stopping based on a given metric
earlystop_cb = keras.callbacks.EarlyStopping(
    patience=10, 
    monitor='val_accuracy',
    restore_best_weights=True)

callbacks = [earlystop_cb]
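
The patience rule used by the callback can be illustrated with a small pure-Python simulation. This is a simplified sketch under stated assumptions (a metric that should increase, no minimum improvement threshold), not the Keras callback itself; the function name `simulate_early_stopping` and the accuracy values are made up for illustration.

```python
def simulate_early_stopping(val_metric, patience):
    """Return (stop_epoch, best_epoch) for a metric that should increase.

    Hypothetical, simplified re-implementation of a patience rule.
    """
    best_value = float("-inf")
    best_epoch = 0
    wait = 0  # number of consecutive epochs without improvement
    for epoch, value in enumerate(val_metric):
        if value > best_value:
            best_value, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Training ended before the patience was exhausted
    return len(val_metric) - 1, best_epoch

# Made-up validation accuracies, one value per epoch
val_accuracy = [0.50, 0.60, 0.70, 0.69, 0.68, 0.70, 0.65]
stop_epoch, best_epoch = simulate_early_stopping(val_accuracy, patience=3)
print(f"stopped at epoch {stop_epoch}, best epoch was {best_epoch}")
```

With `restore_best_weights=True`, as set above, the model is reset to the weights of the best epoch rather than keeping those of the last one.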


Helper function to load the test set into memory for model evaluation:

In [None]:
# Load images and labels from a dataset into NumPy arrays.
def get_imgs_from_dataset(ds_test, ds_test_size):
    # Take `ds_test_size` batches from `ds_test` and concatenate them
    data_test = []
    label_test = []
    for img, label in ds_test.take(ds_test_size):
        data_test.append(img.numpy())
        label_test.append(label.numpy())
    data_test = np.concatenate(data_test, axis=0)
    label_test = np.concatenate(label_test, axis=0)
    return data_test, label_test

# Test data: use the held-out test split, which is not used for early stopping
data_test, label_test = get_imgs_from_dataset(test_ds, len(test_ds))

## Model, training and assessment: A transfer learning with fine tuning approach followed by ensemble voting

### Create the model

We apply transfer learning with a model that has been pretrained on a very large dataset (ImageNet). Several base models can be selected, for instance 'Xception', which provides a high top-5 accuracy (with about 20 M parameters), and 'MobileNetV2', which provides good accuracy for a relatively small model size.

We load the base model without its 'top' layer in order to tailor the model to the classes in our dataset. Then, we freeze its layers so that only the new classification layer is trained on the small dataset. Each base model also comes with its own preprocessing step.

Instead of creating and training the models, you can load previously trained models saved by this notebook.

In [None]:
def get_base_model(base_model_name, input_shape):
    # Download pretrained base model 
    if base_model_name == 'mobilenet_v2':
        # mobilenet_v2: small networks
        # Param M: 4.24, top-1 acc: 70.9, top-5 acc:	89.9
        # Imagenet ILSVRC-2012-CLS 
        # Timing: # GTX: 7s and 3s, FT: 
        base_model = tf.keras.applications.MobileNetV2(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.mobilenet_v2.preprocess_input
    elif base_model_name == 'mobilenet_v3':    
        # mobilenet_v3: small networks
        base_model = tf.keras.applications.MobileNetV3Small(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.mobilenet_v3.preprocess_input
    elif base_model_name == 'EfficientNetB0':
        # EfficientNetB0: 5M
        base_model = keras.applications.EfficientNetB0(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.efficientnet.preprocess_input    
    elif base_model_name == 'EfficientNetB4':
        # EfficientNetB4
        base_model = keras.applications.EfficientNetB4(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.efficientnet.preprocess_input
    elif base_model_name == 'EfficientNetB7':
        # EfficientNetB7
        base_model = keras.applications.EfficientNetB7(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.efficientnet.preprocess_input
    elif base_model_name == 'Xception':
        # Xception
        # 20 M parameters
        # no smaller than 71. E.g. (150, 150, 3) 
        # Timing: # GTX: 10s and 6s, FT: 15s and 9s
        base_model = keras.applications.Xception(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.xception.preprocess_input
    elif base_model_name == 'ResNet50':
        # ResNet50: 25 M
        base_model = keras.applications.ResNet50(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.resnet50.preprocess_input
    elif base_model_name == 'vgg19':
        # VGG19: 143 M 
        base_model = keras.applications.VGG19(
                input_shape=input_shape,                                                   
                include_top=False,                                                   
                weights='imagenet')
        preprocess_input = tf.keras.applications.vgg19.preprocess_input
    elif base_model_name == 'inception_v3':
        # InceptionV3: 23M
        base_model = keras.applications.InceptionV3(
                input_shape=input_shape,
                include_top=False,
                weights='imagenet')
        preprocess_input = tf.keras.applications.inception_v3.preprocess_input
    else:
        # Fail early instead of hitting an undefined variable below
        raise ValueError(f"Unknown base model name: {base_model_name}")

    # Freeze weights
    base_model.trainable = False
    return base_model, preprocess_input

In [None]:
def get_model(inputs_shape, preprocess_input, base_model, 
              num_classes=num_classes, dropout_rate=0.2):

    # Average pooling layer to pass from block 6x6x1280 to vector 1x1280
    global_average_layer = tf.keras.layers.GlobalAveragePooling2D()

    # Multiclass classification layer
    prediction_layer = tf.keras.layers.Dense(num_classes)

    # Create the model
    inputs = tf.keras.Input(shape=inputs_shape)
    x = data_augmentation(inputs)
    x = preprocess_input(x)        
    x = base_model(x, training=False)
    x = global_average_layer(x)
    x = layers.Dropout(dropout_rate)(x)
    outputs = prediction_layer(x)
    model = tf.keras.Model(inputs, outputs)

    model.summary()
    return model
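
To see why the new classification head is so small, the number of trainable weights of the dense layer can be computed by hand. The sketch below assumes 1280 pooled features (the value mentioned in the comment above, which applies to MobileNetV2) and 5 classes; both numbers are assumptions and depend on the base model and dataset. The helper name `dense_head_params` is hypothetical.

```python
def dense_head_params(n_features, n_classes):
    """Trainable weights of a dense layer: one weight per (feature, class) plus one bias per class."""
    return n_features * n_classes + n_classes

# Assumed values: 1280 pooled features and 5 classes
n_head = dense_head_params(1280, 5)
print(f"Trainable weights in the classification head: {n_head}")
```

The pooling and dropout layers add no trainable weights, so with a frozen base model this dense layer is the only part that is trained.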

In [None]:
def freeze_layers(base_model, fine_tune_at):    
    # Unfreeze the base model
    base_model.trainable = True

    # Number of layers in the base model
    print("Number of layers in the base model: ", len(base_model.layers))

    # Freeze all the layers before the `fine_tune_at` layer
    for layer in base_model.layers[:fine_tune_at]:
        layer.trainable =  False
    return base_model
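
The effect of `fine_tune_at` can be checked without loading any network. The sketch below applies the same slicing logic to toy layer objects; `ToyLayer`, `unfreeze_from` and the layer count of 154 are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class ToyLayer:
    """Stand-in for a Keras layer, keeping only the fields needed here."""
    name: str
    trainable: bool = True

def unfreeze_from(layer_list, fine_tune_at):
    """Make all layers trainable, then freeze those before `fine_tune_at`."""
    for layer in layer_list:
        layer.trainable = True
    for layer in layer_list[:fine_tune_at]:
        layer.trainable = False
    return layer_list

# Assumed example: a base model with 154 layers, fine-tuned from layer 70 onwards
toy_layers = [ToyLayer(f"layer_{i}") for i in range(154)]
unfreeze_from(toy_layers, fine_tune_at=70)
n_trainable = sum(layer.trainable for layer in toy_layers)
print(f"{n_trainable} of {len(toy_layers)} layers are trainable")
```

A fixed index such as 70 corresponds to a different relative depth in each base model, so it may be worth choosing `fine_tune_at` per architecture.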

In [None]:
def display_training_curves(acc, val_acc, loss, val_loss, name_save):
  epochs_range = range(len(acc))

  fig = plt.figure(figsize=(8, 8))
  plt.subplot(1, 2, 1)
  plt.plot(epochs_range, acc, label='Training Accuracy')
  plt.plot(epochs_range, val_acc, label='Validation Accuracy')
  plt.legend(loc='lower right')
  plt.title('Training and Validation Accuracy')
  #
  plt.subplot(1, 2, 2)
  plt.plot(epochs_range, loss, label='Training Loss')
  plt.plot(epochs_range, val_loss, label='Validation Loss')
  plt.legend(loc='upper right')
  plt.title('Training and Validation Loss')
  plt.show()
  fig.savefig(name_save, bbox_inches='tight', dpi=300)

In [None]:
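# Loss and metrics
# Stateless accuracy (tf.keras.metrics.Accuracy accumulates its state across calls,
# which would mix the results of the different models)
def acc_fn(y_pred, y_true):
    return float(np.mean(np.asarray(y_pred) == np.asarray(y_true)))

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
metrics = ['accuracy']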
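
Because the classification layer has no activation, the models output raw scores (logits), which is why the loss is created with `from_logits=True`. The following NumPy sketch shows what such a loss computes for integer labels; it is an illustrative re-implementation under that assumption, not the Keras code, and the function name `sparse_ce_from_logits` is hypothetical.

```python
import numpy as np

def sparse_ce_from_logits(logits, labels):
    """Mean cross-entropy between integer labels and softmax(logits)."""
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels)
    # Subtract the row maximum for numerical stability (log-sum-exp trick)
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each sample
    return -log_probs[np.arange(len(labels)), labels].mean()

# Two equal logits give a probability of 0.5, hence a loss of log(2)
print(sparse_ce_from_logits([[0.0, 0.0]], [0]))
```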

In [None]:
# Training settings and base models used for ensemble voting
IMG_SHAPE = (img_height, img_width) + (3,)
fine_tune_at = 70     # Fine-tune from this layer onwards
epochs = 100          # 100
fine_tune_epochs = 70 #70

base_model_name_list = ['mobilenet_v2', 'mobilenet_v3', 'EfficientNetB0']
test_acc_models = []
models = []
for base_model_name in base_model_name_list:
  # Define pretrained base model
  print('*'*50)
  print(f"Base model: {base_model_name}") 
  base_model, preprocess_input = get_base_model(base_model_name, IMG_SHAPE)

  # Build the model on top of the frozen base model
  model = get_model(IMG_SHAPE, preprocess_input, base_model)

  # Compile the model
  optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)
  model.compile(optimizer=optimizer,
                loss=loss_fn,
                metrics=metrics)

  # Train a large number of steps with early stopping criterion
  history = model.fit(train_ds, validation_data=val_ds, epochs=epochs, 
                      callbacks=callbacks, verbose = 0)

  # Display loss and accuracy
  acc = history.history['accuracy']
  val_acc = history.history['val_accuracy']
  loss = history.history['loss']
  val_loss = history.history['val_loss']

  # FINE TUNING      
  base_model = freeze_layers(base_model, fine_tune_at)

  # Recompile your model after you make any changes
  optimizer_ft = tf.keras.optimizers.RMSprop(learning_rate=learning_rate/10)
  model.compile(optimizer = optimizer_ft,  # Lower learning rate
                loss=loss_fn,
                metrics=metrics)
  # ---------------------------------------------------------------------
  # Train (continue training from the last completed epoch, which may be
  # earlier than `epochs` because of early stopping)
  initial_epoch = len(history.epoch)
  total_epochs = initial_epoch + fine_tune_epochs
  history_fine = model.fit(train_ds,
                           epochs=total_epochs,
                           initial_epoch=initial_epoch,
                           validation_data=val_ds,
                           callbacks=callbacks,
                           verbose=0)

  # Display losses
  acc += history_fine.history['accuracy']
  val_acc += history_fine.history['val_accuracy']

  loss += history_fine.history['loss']
  val_loss += history_fine.history['val_loss']

  epochs_range = range(len(acc))

  name_save_this = os.path.join(save_dir, f"{name_save}_{base_model_name}_loss_ep{epochs}_ft{fine_tune_epochs}.png")
  display_training_curves(acc, val_acc, loss, val_loss, name_save_this)

  # Predict 
  data_pred = model.predict(data_test)
  data_pred_class = np.argmax(data_pred, axis=1)

  test_acc = acc_fn(data_pred_class, label_test)
  print('Accuracy for %s with fine tuning is %.2f' % (base_model_name, test_acc))
  test_acc_models.append(test_acc)

  # Save the model in Keras format (default).
  # Assumption: `i_ensemble` is the index of this model within the ensemble.
  i_ensemble = len(models)
  name_model = ('flower_photos_' + base_model_name + '_TF' + str(history.epoch[-1]) + 'it'
                + '_FTat' + str(fine_tune_at) + '_it' + str(history_fine.epoch[-1])
                + '_Ens' + str(i_ensemble))
  model.save(os.path.join(save_dir, name_model + ".keras"))
  print(f"Model saved in {os.path.abspath(os.path.join(save_dir, name_model + '.keras'))}")
  models.append(model)

  # Save history
  with open(os.path.join(save_dir, name_model + '.json'), 'w') as file:
    json.dump(history_fine.history, file)

### Ensemble assessment 

We combine the individually trained models by averaging their outputs (soft voting) and selecting, for each image, the class with the highest mean score.

In [None]:
def ensemble_voting(data_pred_list):
    # Soft voting: average the model outputs, then pick the best class
    data_pred_ensemble = np.mean(data_pred_list, axis=0)
    data_pred_class_ensemble = np.argmax(data_pred_ensemble, axis=1)
    return data_pred_class_ensemble
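
The function above averages the raw model outputs. Two common variants are sketched below in NumPy: soft voting on softmax probabilities, and hard (majority) voting on the predicted classes. The function names and the toy logits are assumptions for illustration, not part of the notebook's pipeline.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def soft_vote(logits_list):
    """Average the class probabilities of all models, then take the argmax."""
    probs = np.stack([softmax(np.asarray(l, dtype=np.float64)) for l in logits_list])
    return probs.mean(axis=0).argmax(axis=1)

def hard_vote(logits_list, num_classes):
    """Each model votes for one class; the most voted class wins (ties go to the lowest index)."""
    votes = np.stack([np.asarray(l).argmax(axis=1) for l in logits_list])
    counts = np.stack([(votes == c).sum(axis=0) for c in range(num_classes)])
    return counts.argmax(axis=0)

# Toy logits for 3 models, 2 samples and 3 classes (made-up values)
model_a = [[5.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
model_b = [[0.0, 0.1, 0.0], [0.0, 0.0, 3.0]]
model_c = [[0.0, 0.1, 0.0], [0.0, 0.0, 3.0]]
logits_list = [model_a, model_b, model_c]

print("soft voting:", soft_vote(logits_list))
print("hard voting:", hard_vote(logits_list, num_classes=3))
```

In this toy case the two rules disagree on the first sample: one very confident model outweighs two barely decided ones under soft voting, while the majority wins under hard voting.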

Another option is to create a combined ensemble model and train it, as described in this [blog article](https://blog.paperspace.com/ensembling-neural-network-models/).

### Model assessment

Alternatively, we can load the data, predict and compute the desired metric.

In [None]:
# Predict for all models
data_pred_list = []
test_acc_models = []
for model in models:
    data_pred = model.predict(data_test)
    data_pred_list.append(data_pred)
    data_pred_class = np.argmax(data_pred, axis=1)
    test_acc = acc_fn(data_pred_class, label_test)
    test_acc_models.append(test_acc)

print('Accuracy for each model is: ', test_acc_models)

# Ensemble voting
data_pred_class_ensemble = ensemble_voting(data_pred_list)
test_acc_ensemble = acc_fn(data_pred_class_ensemble, label_test)
print('Accuracy for ensemble is %.2f' % test_acc_ensemble)