# Hybridization of fine-tuned VGG16, MobileNet and Xception Networks
- In this kernel, we ensemble (hybridize) three fine-tuned models: VGG16, MobileNet and Xception.
- We consider two types of ensembling. First, we take a weighted average of the predicted class probabilities from the three models and predict the class label from the result. Second, we take a majority vote over the class labels predicted by the three models to obtain the final class label.

### Reference Kernels
- [Fine-tuned MobileNet](https://www.kaggle.com/code/mitishaagarwal/mobile-net-2)
- [Fine-tuned VGG16](https://www.kaggle.com/code/mitishaagarwal/vgg16)
- [Fine-tuned Xception](https://www.kaggle.com/code/mitishaagarwal/xception)

# 1. Importing the Packages & Boilerplate Code

In [1]:
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from tqdm import tqdm
import statistics
from tabulate import tabulate
from sklearn.metrics import accuracy_score, log_loss, f1_score

# Suppressing TensorFlow log messages; this must be set before importing tensorflow
# https://www.kaggle.com/c/ventilator-pressure-prediction/discussion/274717
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 

import tensorflow as tf
import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import Xception

In [2]:
# Making sure that TensorFlow is able to detect the GPU
device_name = tf.test.gpu_device_name()
if "GPU" not in device_name:
    print("GPU device not found")
else:
    print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


In [3]:
# These are the usual ipython objects
ipython_vars = ['In', 'Out', 'exit', 'quit', 'get_ipython', 'ipython_vars']

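# Defining a function to list the memory consumed by the variables in a namespace
# Only outputs variables taking at least 1 MB of space
# Note: sys.getsizeof is shallow, so containers such as lists report only their own size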
def list_storage(inp_dir):
    # Get a sorted list of the objects and their sizes
    vars_defined = [x for x in inp_dir if not x.startswith('_') and x not in sys.modules and x not in ipython_vars]
    sto = sorted([(x, sys.getsizeof(globals().get(x))) for x in vars_defined], key=lambda x: x[1], reverse=True)
    sto = [(x[0], str(round((x[1] / 2**20), 2)) + ' MB') for x in sto if x[1] >= 2**20]
    print(tabulate(sto, headers = ['Variable', 'Storage (in MB)']))

# In order to use this function, use the below line of code
# list_storage(dir())

# 2. Importing the Train and Test Sets

In [4]:
# Importing the Labelled Training Dataset
print("For Train Dataset:")
df_train = pd.read_csv("../input/cifar10/train_lab_x.csv")
y_train = pd.read_csv("../input/cifar10/train_lab_y.csv")
df_train = np.array(df_train)
y_train = np.array(y_train)
print(df_train.shape, y_train.shape)

# Reshaping to channel-last images and rescaling to [0, 1]
df_train = np.reshape(df_train, (-1, 3, 32, 32))
df_train = np.transpose(df_train, (0, 2, 3, 1))
df_train = df_train / 255
print(df_train.shape)

# Importing the Test Dataset
print("For Test Dataset:")
df_test = pd.read_csv("../input/cifar10/test_x.csv")
y_test = pd.read_csv("../input/cifar10/test_y.csv")
df_test = np.array(df_test)
y_test = np.array(y_test)
print(df_test.shape, y_test.shape)

# Reshaping the dataset
df_test = np.reshape(df_test, (-1, 3, 32, 32))
print(df_test.shape)

# Transposing to channel-last, rescaling to [0, 1] and one-hot encoding the labels
df_test = np.transpose(df_test, (0, 2, 3, 1))
df_test = df_test / 255
y_test_oh = tf.one_hot(np.ravel(y_test), depth = 10)
print(df_test.shape, y_test_oh.shape)

For Train Dataset:
(40006, 3072) (40006, 1)
(40006, 32, 32, 3)
For Test Dataset:
(10000, 3072) (10000, 1)
(10000, 3, 32, 32)
(10000, 32, 32, 3) (10000, 10)
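The reshape-then-transpose step above assumes that each CSV row stores an image plane by plane (all red values, then all green, then all blue), which is how the original CIFAR-10 batches are laid out. The sketch below uses a made-up single "image" (the names `flat`, `chw`, `hwc` and `wrong` are illustrative only) to show why reshaping straight to `(-1, 32, 32, 3)` would scramble the channels under that assumption.

```python
import numpy as np

# One fake "image" in flat channel-first order: 1024 reds, 1024 greens, 1024 blues
flat = np.concatenate([
    np.full(1024, 10),   # red plane
    np.full(1024, 20),   # green plane
    np.full(1024, 30),   # blue plane
])[np.newaxis, :]        # shape (1, 3072)

chw = flat.reshape(-1, 3, 32, 32)      # (N, channels, height, width)
hwc = chw.transpose(0, 2, 3, 1)        # (N, height, width, channels)

# Reshaping directly to channel-last mixes values from the same plane into one pixel
wrong = flat.reshape(-1, 32, 32, 3)

print(hwc.shape)        # (1, 32, 32, 3)
print(hwc[0, 0, 0])     # [10 20 30]  -> one value per channel, as intended
print(wrong[0, 0, 0])   # [10 10 10]  -> three red values, not an RGB pixel
```

If the CSV rows were instead stored pixel by pixel, the direct reshape would be the right one, so it is worth plotting a few images to confirm the layout.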


# 3. Loading the fine-tuned Models

In [5]:
to_res = (64,64)
model1 = tf.keras.models.load_model('../input/dcai-rw/xcptn_model.h5')
model2 = tf.keras.models.load_model('../input/dcai-rw/mobilenet_model.h5')
model3 = tf.keras.models.load_model('../input/dcai-rw/vgg16_model.h5')

In [6]:
preds_test1 = model1.predict(df_test)
cls_test1 = np.argmax(preds_test1, axis = 1)
print("For Xception Model:")
print("Accuracy for Test Dataset: ", accuracy_score(y_test, cls_test1))
print("Log-loss for Test Dataset: ", log_loss(y_test_oh, preds_test1))
print("Weighted F1 Score for Test Dataset = ", f1_score(y_test, cls_test1, average = 'weighted'))
print()

preds_test2 = model2.predict(df_test)
cls_test2 = np.argmax(preds_test2, axis = 1)
print("For MobileNet Model:")
print("Accuracy for Test Dataset: ", accuracy_score(y_test, cls_test2))
print("Log-loss for Test Dataset: ", log_loss(y_test_oh, preds_test2))
print("Weighted F1 Score for Test Dataset = ", f1_score(y_test, cls_test2, average = 'weighted'))
print()

preds_test3 = model3.predict(df_test)
cls_test3 = np.argmax(preds_test3, axis = 1)
print("For VGG16 Model:")
print("Accuracy for Test Dataset: ", accuracy_score(y_test, cls_test3))
print("Log-loss for Test Dataset: ", log_loss(y_test_oh, preds_test3))
print("Weighted F1 Score for Test Dataset = ", f1_score(y_test, cls_test3, average = 'weighted'))
print()

For Xception Model:
Accuracy for Test Dataset:  0.8957
Log-loss for Test Dataset:  0.8017136794626224
Weighted F1 Score for Test Dataset =  0.8956651393697893

For MobileNet Model:
Accuracy for Test Dataset:  0.8668
Log-loss for Test Dataset:  1.0053337612657762
Weighted F1 Score for Test Dataset =  0.8667049942870388

For VGG16 Model:
Accuracy for Test Dataset:  0.7992
Log-loss for Test Dataset:  1.6821733511582482
Weighted F1 Score for Test Dataset =  0.798454250467614



# 4. Performing the Ensembling
## 4.1. Relative Weighting of the Predictions
- Xception has the highest test-set accuracy, followed by MobileNet and then VGG16, so the candidate weights below give Xception the largest (or joint-largest) weight and VGG16 the smallest.
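To make the weighting concrete, here is a minimal sketch on made-up probabilities. The arrays `p_a`, `p_b`, `p_c` and the helper `weighted_ensemble` are illustrative and not part of the kernel. It shows that a weighted average with weights summing to 1 is still a valid probability distribution, and that a heavily weighted, confident model can outvote the other two.

```python
import numpy as np

def weighted_ensemble(preds, weights):
    """Blend per-model class probabilities with relative weights.

    preds:   list of arrays, each of shape (n_samples, n_classes)
    weights: one non-negative weight per model; normalised here to sum to 1
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack(preds, axis=0)               # (n_models, n_samples, n_classes)
    return np.tensordot(weights, stacked, axes=1)   # (n_samples, n_classes)

# Toy probabilities for 2 samples and 3 classes (made-up numbers)
p_a = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]])
p_b = np.array([[0.2, 0.6, 0.2], [0.2, 0.7, 0.1]])
p_c = np.array([[0.3, 0.5, 0.2], [0.3, 0.3, 0.4]])

blended = weighted_ensemble([p_a, p_b, p_c], [0.5, 0.3, 0.2])
labels = blended.argmax(axis=1)
print(blended)
print(labels)
```

For the first toy sample, two of the three models prefer class 1, yet the blend picks class 0 because the most heavily weighted model is confident. This is the main way weighted averaging can differ from a majority vote.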

In [7]:
# List of relative weighting to try
rel_weights = [
    [0.5, 0.3, 0.2],
    [0.45, 0.35, 0.2],
    [0.40, 0.35, 0.25],
    [0.35, 0.35, 0.3],
    [0.6, 0.3, 0.1]
]

test_acc, test_log_loss, test_f1_score = [], [], []

for weights in rel_weights:
    print("Xception: ", weights[0], " MobileNet: ", weights[1], " VGG16: ", weights[2])
    preds_test = weights[0] * preds_test1 + weights[1] * preds_test2 + weights[2] * preds_test3
    cls_test = np.argmax(preds_test, axis = 1)
    test_acc.append(accuracy_score(y_test, cls_test))
    test_log_loss.append(log_loss(y_test_oh, preds_test))
    test_f1_score.append(f1_score(y_test, cls_test, average = 'weighted'))
    print("Accuracy for Test Dataset: ", test_acc[-1])
    print("Log-loss for Test Dataset: ", test_log_loss[-1])
    print("Weighted F1 Score for Test Dataset = ", test_f1_score[-1])
    print()
    
best_index = np.argmax(test_acc)
print("For ensembling based on Relative Weighting:")
print("The optimal relative weights are ", rel_weights[best_index])
print("Accuracy for Test Dataset: ", test_acc[best_index])
print("Log-loss for Test Dataset: ", test_log_loss[best_index])
print("Weighted F1 Score for Test Dataset = ", test_f1_score[best_index])

Xception:  0.5  MobileNet:  0.3  VGG16:  0.2
Accuracy for Test Dataset:  0.905
Log-loss for Test Dataset:  0.4107004970640091
Weighted F1 Score for Test Dataset =  0.90481535464968

Xception:  0.45  MobileNet:  0.35  VGG16:  0.2
Accuracy for Test Dataset:  0.9025
Log-loss for Test Dataset:  0.4111881832950033
Weighted F1 Score for Test Dataset =  0.9022976729760547

Xception:  0.4  MobileNet:  0.35  VGG16:  0.25
Accuracy for Test Dataset:  0.9013
Log-loss for Test Dataset:  0.4162126540434949
Weighted F1 Score for Test Dataset =  0.9010838673888512

Xception:  0.35  MobileNet:  0.35  VGG16:  0.3
Accuracy for Test Dataset:  0.9016
Log-loss for Test Dataset:  0.4232345016994275
Weighted F1 Score for Test Dataset =  0.9014082700565781

Xception:  0.6  MobileNet:  0.3  VGG16:  0.1
Accuracy for Test Dataset:  0.9026
Log-loss for Test Dataset:  0.4093135084820644
Weighted F1 Score for Test Dataset =  0.9024842394961444

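For ensembling based on Relative Weighting:
The optimal relative weights are  [0.5, 0.3, 0.2]
Accuracy for Test Dataset:  0.905
Log-loss for Test Dataset:  0.4107004970640091
Weighted F1 Score for Test Dataset =  0.90481535464968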
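The blended log-loss (about 0.41) is well below that of every individual model (0.80 or higher). One plausible reason is that averaging tempers overconfident mistakes: because the negative logarithm is convex, the log-loss of an averaged prediction never exceeds the average of the individual log-losses. The sketch below illustrates this for a single sample, using made-up numbers and a hand-written helper `log_loss_manual` (not part of the kernel).

```python
import numpy as np

def log_loss_manual(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    probs = np.clip(np.asarray(probs, dtype=float), eps, 1.0)
    rows = np.arange(len(y_true))
    return float(-np.mean(np.log(probs[rows, y_true])))

y_true = np.array([0])

# Made-up outputs: model A is confidently wrong, model B is moderately right
p_a = np.array([[0.01, 0.99]])
p_b = np.array([[0.80, 0.20]])
blend = 0.5 * p_a + 0.5 * p_b

loss_a = log_loss_manual(y_true, p_a)
loss_b = log_loss_manual(y_true, p_b)
loss_blend = log_loss_manual(y_true, blend)
print(loss_a, loss_b, loss_blend)
```

Here the blend is penalised far less than model A and less than the mean of the two individual losses. Whether this is the dominant effect in the results above is an assumption and has not been checked against the actual predictions.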

## 4.2. Majority Vote of the Predicted Class Labels
- In the previous method, we could compute both the log-loss and the accuracy, because the ensembling was performed on the predicted probabilities themselves.
- In this method, we can only compute label-based metrics (accuracy and F1 score), because the ensembling is performed on the predicted class labels and not on the predicted probabilities.
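With exactly three voters, the voting rule can be written without a loop. The sketch below is one possible vectorised formulation, assuming the same tie-break as this kernel (fall back to the first model when all three disagree). The function name `majority_vote` and the toy labels are illustrative.

```python
import numpy as np

def majority_vote(cls_a, cls_b, cls_c):
    """Majority vote over three label arrays; falls back to cls_a on a three-way tie."""
    cls_a, cls_b, cls_c = map(np.asarray, (cls_a, cls_b, cls_c))
    # If b and c agree, their label has at least two votes.
    # Otherwise either a agrees with one of them (a's label wins 2-1)
    # or all three differ (tie-break in favour of a), so a's label is used.
    return np.where(cls_b == cls_c, cls_b, cls_a)

# Toy labels: unanimous, a+b agree, b+c agree, all different
a = [0, 1, 2, 3]
b = [0, 1, 5, 4]
c = [0, 7, 5, 5]

voted = majority_vote(a, b, c)
print(voted)
```

The result contains only hard labels, which is why a probability-based metric such as log-loss cannot be computed from it.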

In [8]:
cls_test = []
for i in range(len(cls_test1)):
    classes = [cls_test1[i], cls_test2[i], cls_test3[i]]
    
    # Case 1: When the 3 models predict different class labels
    # Selecting the class label corresponding to Xception Net
    if len(np.unique(classes)) == len(classes):
        cls_test.append(cls_test1[i])
    
    # Case 2: When at least 2 models predict the same class label
    # Selecting the majority class
    else:
        cls_test.append(np.bincount(classes).argmax())
        
print("For ensembling based on Majority Vote:")
print("Accuracy for Test Dataset: ", accuracy_score(y_test, cls_test))
print("Weighted F1 Score for Test Dataset = ", f1_score(y_test, cls_test, average = 'weighted'))

For ensembling based on Majority Vote:
Accuracy for Test Dataset:  0.9001
Weighted F1 Score for Test Dataset =  0.8999355290166248
