# Mask Detection using Transfer Learning based on InceptionV3 Neural Network
Author: [Tianyi Liang](https://www.linkedin.com/in/tianyi-liang-at-bu/)

## Introduction
This project aims to detect whether a person is wearing a mask or not using a Convolutional Neural Network (CNN) model. The model is built using transfer learning based on the InceptionV3 architecture. The dataset used for this project is publicly available on Kaggle.

This project is divided into several sections:

1. [Import Necessary Packages](#import-packages)
2. [Download the Dataset](#download-dataset)
3. [Read the Data into Memory](#read-data)
4. [Perform Data Visualization](#data-visualization)
5. [Normalize the Data and Format it into the Required Shape](#normalize-data)
6. [Check the Balance of the Data](#check-balance)
7. [Implement Image Augmentation to Prevent Overfitting](#image-augmentation)
8. [Build the Network using InceptionV3 and a Custom Classifier](#build-network)
9. [Train the Model on the Augmented Data](#train-model)
10. [Evaluate the Model and Analyze Wrong Cases](#evaluate-model)

---

<a id="import-packages"></a>
## Import Necessary Packages
We install and import the packages the project needs: TensorFlow, Keras, OpenCV, opendatasets, NumPy, Pandas, Seaborn, and Matplotlib.

### Requirements:
- Python 3.7
- TensorFlow
- Keras
- OpenCV
- opendatasets
- Numpy
- Pandas
- Seaborn
- Matplotlib

---

In [None]:
# Install and import the necessary packages
!pip install opendatasets
# Install only one OpenCV wheel; opencv-contrib-python already contains the main modules
!pip install opencv-contrib-python

import gc
import os
import glob
import cv2
import numpy as np
import pandas as pd
import seaborn as sns
import tensorflow as tf
import opendatasets as od

from matplotlib import pyplot as plt

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.utils import plot_model
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Activation, Input, Add, \
                   BatchNormalization, Flatten, Conv2D, \
                   AveragePooling2D, MaxPooling2D, ZeroPadding2D

<a id="download-dataset"></a>
## Download the Dataset
We download the dataset from Kaggle using the opendatasets package. The dataset contains images of people with and without masks.
Link to the dataset: [Face Mask 12k Images Dataset by ASHISH JANGRA](https://www.kaggle.com/datasets/ashishjangra27/face-mask-12k-images-dataset)

---

In [None]:
od.download("https://www.kaggle.com/datasets/ashishjangra27/face-mask-12k-images-dataset")
root_path = "face-mask-12k-images-dataset/Face Mask Dataset/"

<a id="read-data"></a>
## Read the Data into Memory
We read every image into memory, resize it to 224×224, and store it in a Pandas DataFrame together with its label (1 for mask, 0 for no mask) and the split it belongs to (train, test, or validation).

---

In [None]:
dataset = {"path": [], "classification": [], "set": [], "value": []}

# Train, Test, or Validation folder
for train_test in os.listdir(root_path):

    # WithMask or WithoutMask folder
    for true_false in os.listdir(os.path.join(root_path, train_test)):

        # Each image inside; glob already returns the full path
        pattern = os.path.join(root_path, train_test, true_false, "*.png")
        for this_path in glob.glob(pattern):

            # Read as 3-channel BGR, resize, then convert to RGB for plotting
            img = cv2.imread(this_path, cv2.IMREAD_COLOR)
            img = cv2.resize(img, dsize=(224, 224))
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

            dataset["path"].append(this_path)
            # Label 1 = WithMask, 0 = WithoutMask
            if true_false == "WithMask":
                dataset["classification"].append(np.uint8(1))
            else:
                dataset["classification"].append(np.uint8(0))
            dataset["set"].append(train_test.lower())
            dataset["value"].append(img)

df_dataset = pd.DataFrame(dataset)
df_dataset.head()

<a id="data-visualization"></a>
## Perform Data Visualization
To get a better understanding of the data, we plot 25 randomly sampled images from the dataset together with their labels.

---

In [None]:
plt.figure(figsize = (20, 15))
np.random.seed(1)
sample = np.random.choice(range(len(df_dataset)), size=25, replace=False)
for i in range(25):
    plt.subplot(5, 5, i+1)
    img = df_dataset.loc[sample[i],"value"]
    plt.imshow(img)
    classification = 'WithMask' if df_dataset.loc[sample[i],"classification"] == 1 else "WithoutMask"
    plt.title(classification, size = 15)
    plt.xticks([])
    plt.yticks([])
plt.show()

<a id="normalize-data"></a>
## Normalize the Data and Format it into the Required Shape
We scale the pixel values of the images to the range [0, 1]. We also stack the images of each split into one array of shape (N, 224, 224, 3), the input shape the InceptionV3 base model is built with here.

---
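As a minimal sketch of this step, assume the images arrive as `uint8` arrays of shape 224×224×3. The random arrays below are hypothetical stand-ins for real photos; stacking and scaling them could look like this:

```python
import numpy as np

# Hypothetical stand-ins for two decoded 224x224 RGB images (uint8, 0..255)
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8) for _ in range(2)]

# Stack into one (N, 224, 224, 3) array and scale the pixels to [0, 1]
x = np.asarray(images, dtype=np.float32) / 255.0

print(x.shape, x.dtype, x.min(), x.max())
```

Casting to `float32` before dividing keeps the array at half the size of NumPy's default `float64`.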

In [None]:
def set_seed():
    InitSeed = 23
    tf.random.set_seed(InitSeed)
    np.random.seed(InitSeed)


train_set = pd.DataFrame(df_dataset[df_dataset["set"] == "train"]).loc[:, ["value", "classification"]]
test_set = pd.DataFrame(df_dataset[df_dataset["set"] == "test"]).loc[:, ["value", "classification"]]
valid_set = pd.DataFrame(df_dataset[df_dataset["set"] == "validation"]).loc[:, ["value", "classification"]]

# Cast to float32 before scaling; the float64 default would double the memory use
x_train = np.asarray(train_set["value"].tolist(), dtype=np.float32) / 255
y_train = np.asarray(train_set["classification"].tolist())
x_test = np.asarray(test_set["value"].tolist(), dtype=np.float32) / 255
y_test = np.asarray(test_set["classification"].tolist())
x_valid = np.asarray(valid_set["value"].tolist(), dtype=np.float32) / 255
y_valid = np.asarray(valid_set["classification"].tolist())

del df_dataset, train_set, test_set, valid_set, img, dataset, this_path
del glob, od, pd, cv2
gc.collect()
print("x_train dimension:", x_train.shape, "\ny_train dimension:", y_train.shape,
   "\nx_test dimension:", x_test.shape, "\ny_test dimension:", y_test.shape,
   "\nx_valid dimension:", x_valid.shape, "\ny_valid dimension:", y_valid.shape,)

<a id="check-balance"></a>
## Check the Balance of the Data
We check that the dataset is balanced, i.e., that both classes (mask and no mask) have a roughly equal number of images. An imbalanced dataset can lead to biased results, where the model favors the class with more instances.

### Data Balance Verification

We calculate the number of instances for each class in the training, testing, and validation sets. The results are then visualized using a bar chart for a more intuitive understanding of the data distribution.

---
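The same check can be expressed as class shares instead of raw counts. The sketch below uses a short hypothetical label list rather than the real `y_train`; the helper name `class_balance` is an assumption made for illustration:

```python
from collections import Counter


def class_balance(labels):
    """Return the share of each class label as a fraction of all labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in sorted(counts.items())}


# Hypothetical labels: 0 = WithoutMask, 1 = WithMask
y_example = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
shares = class_balance(y_example)
print(shares)
```

For a NumPy label array, passing `y_train.tolist()` gives plain integer keys.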



In [None]:
classes_distribution = {"WithoutMask in Train": np.sum(y_train == 0), 
              "WithMask in Train": np.sum(y_train == 1),
              "WithoutMask in Test": np.sum(y_test == 0),
              "WithMask in Test": np.sum(y_test == 1),
              "WithoutMask in Validation": np.sum(y_valid == 0),
              "WithMask in Validation": np.sum(y_valid == 1)}
plt.figure(figsize = (16, 9))
color_A = (0.1, 0.4, 0.7)
color_B = color_A[::-1]
# Alternate the two colors so each class keeps its color across the three splits
plt.bar(range(len(classes_distribution)), 
        list(classes_distribution.values()), 
        tick_label=list(classes_distribution.keys()),
        color=[color_A, color_B] * 3)
plt.show()

<a id="image-augmentation"></a>
## Implement Image Augmentation to Prevent Overfitting
To increase the diversity of the training data and prevent overfitting, we implement image augmentation techniques. These techniques include rotation, height shift, width shift, zoom, and horizontal flip.

### Image Augmentation Setup

We use the `ImageDataGenerator` class from Keras to create an image generator that applies these augmentations to the training images. The parameters for the different augmentations are set as follows:

- **Rotation**: Up to 30 degrees
- **Height Shift**: Up to 10% of the image height
- **Width Shift**: Up to 10% of the image width
- **Zoom**: Up to 10%
- **Horizontal Flip**: Enabled
---
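To make two of these transformations concrete, here is a simplified NumPy stand-in for a random horizontal flip and a random shift of up to 10%. It is not how `ImageDataGenerator` is implemented: the helper name is hypothetical, and `np.roll` wraps pixels around the border, whereas Keras fills the gap with the nearest pixels by default.

```python
import numpy as np


def random_flip_and_shift(img, rng, max_shift=0.1):
    """Randomly flip an HxWxC image horizontally and shift it by up to max_shift of its size."""
    h, w = img.shape[:2]
    if rng.random() < 0.5:
        img = img[:, ::-1, :]  # horizontal flip
    max_dy, max_dx = int(max_shift * h), int(max_shift * w)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    # Wrap-around shift along height (axis 0) and width (axis 1)
    return np.roll(img, shift=(dy, dx), axis=(0, 1))


rng = np.random.default_rng(0)
img = rng.random((224, 224, 3)).astype(np.float32)  # hypothetical image
out = random_flip_and_shift(img, rng)
print(out.shape)
```

Both operations only move pixels around, so the augmented image keeps its shape and its set of pixel values.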

In [None]:
set_seed()
train_aug = ImageDataGenerator(
    rotation_range=30,
    height_shift_range=.1,
    width_shift_range=.1,
    zoom_range=.1,
    horizontal_flip=True,
)
augment = train_aug.flow(x_train[0:1], batch_size=1)

# Check that the augmentation works: draw 25 random variants of the first image
plt.figure(figsize = (20, 15))
for i in range(1, 26):
    plt.subplot(5, 5, i)
    plt.imshow(next(augment).squeeze())
    plt.axis('off')
plt.show()

del augment
gc.collect()

<a id="build-network"></a>
## Build the Network using InceptionV3 and a Custom Classifier
We build the network using the InceptionV3 model and a custom classifier. The model is based on the InceptionV3 architecture, a popular convolutional neural network (CNN) for image classification tasks.


### Custom Classifier: `classifier_`

This function defines a custom classifier that is added on top of the InceptionV3 base model. This classifier consists of several dense (fully connected) layers, each followed by batch normalization, activation, and dropout layers.

- **Batch Normalization Layers**: These help to stabilize the learning process and reduce the generalization error.
- **Dropout Layers**: These help to prevent overfitting by randomly setting a fraction of the input units to 0 at each update during training time.
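
The dropout behavior described above can be sketched in NumPy. This is an illustration of inverted dropout with the rate used in this classifier (0.55), not the Keras implementation; the function name is an assumption:

```python
import numpy as np


def dropout(x, rate, rng):
    """Zero each unit with probability `rate` and rescale the survivors by 1 / (1 - rate)."""
    keep = rng.random(x.shape) >= rate
    return np.where(keep, x / (1.0 - rate), 0.0)


rng = np.random.default_rng(0)
x = np.ones((4, 8), dtype=np.float32)  # hypothetical activations
y = dropout(x, rate=0.55, rng=rng)
print(y)
```

Rescaling the kept units keeps the expected activation unchanged, which is why no extra scaling is needed at inference time, when dropout is switched off.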


### Model Construction: `model_flow`

This function constructs the overall model by first applying the InceptionV3 base model to the input, and then applying the custom classifier to the output of the base model.


### Model Compilation: `input_output_compile`

This function creates the final model by specifying the input and output, and compiles the model with the Stochastic Gradient Descent (SGD) optimizer, the sparse categorical cross-entropy loss function, and accuracy as the evaluation metric.
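
As a worked example of the loss, sparse categorical cross-entropy takes integer class labels and averages the negative log of the probability the model assigned to the true class. The labels and probabilities below are hypothetical, and the function is a plain-Python sketch rather than the Keras implementation:

```python
import math


def sparse_categorical_crossentropy(labels, probs):
    """Mean of -log(p[true class]) over the batch; labels are integer class ids."""
    losses = [-math.log(p[y]) for y, p in zip(labels, probs)]
    return sum(losses) / len(losses)


# Hypothetical batch: true labels and softmax outputs [P(WithoutMask), P(WithMask)]
labels = [1, 0, 1]
probs = [[0.1, 0.9], [0.8, 0.2], [0.6, 0.4]]
loss = sparse_categorical_crossentropy(labels, probs)
print(round(loss, 4))
```

The third sample contributes most to the loss because the model gave the true class only 0.4 probability.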


### Learning Rate Scheduler: `lr_decay`

This function defines a learning rate scheduler that decreases the learning rate as the training progresses, which can help to improve the convergence of the model.
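
To see what this schedule does over the 8 training epochs, the sketch below replays the decay formula outside Keras. The starting rate of 0.01 is an assumption (it is the Keras default for SGD), and the loop mimics how the scheduler callback passes the current rate back into the function:

```python
def lr_decay(epoch, lr):
    """Same decay formula as the notebook's scheduler, without the evaluation printout."""
    return lr / (1 + 0.05 * epoch)


lr = 0.01  # assumed initial learning rate
history = []
for epoch in range(8):
    lr = lr_decay(epoch, lr)  # the callback receives the current rate, not the initial one
    history.append(lr)
    print(f"epoch {epoch}: lr = {lr:.6f}")
```

Because each call starts from the already-decayed rate, the factors multiply: by epoch 2 the rate is the initial value divided by 1.05 × 1.10.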


<a id="train-model"></a>
### Model Training: `fit_generator`

The model is trained with the `fit_generator` method, which draws augmented batches from the training images to increase the diversity of the training data and help prevent overfitting. The learning rate scheduler is passed to `fit_generator` as a callback, so the learning rate is updated at the start of each epoch. Note that `fit_generator` is deprecated in TensorFlow 2.1 and later, where `fit` accepts generators directly.

---

In [None]:
def classifier_(input_):
    initializer = tf.keras.initializers.GlorotNormal(seed=767)
    output_ = keras.layers.GlobalAveragePooling2D()(input_)
    output_ = keras.layers.Flatten()(output_)
    output_ = keras.layers.BatchNormalization()(output_)
    output_prime = output_

    output_ = keras.layers.Dense(2048, kernel_initializer=initializer)(output_)
    output_ = keras.layers.BatchNormalization()(output_)
    output_ = keras.layers.Activation('relu')(output_)
    output_ = keras.layers.Dropout(0.55)(output_)

    output_ = keras.layers.Dense(2048, kernel_initializer=initializer)(output_)
    output_ = keras.layers.BatchNormalization()(output_)
    output_ = keras.layers.Activation('relu')(output_)
    output_ = keras.layers.Dropout(0.55)(output_)

    output_ = keras.layers.Dense(1024, kernel_initializer=initializer)(output_)
    output_ = keras.layers.BatchNormalization()(output_)
    output_ = keras.layers.Activation('relu')(output_)
    output_ = keras.layers.Dropout(0.55)(output_)

    output_ = keras.layers.Concatenate()([output_, output_prime])

    output_ = keras.layers.Dense(512, activation='relu', kernel_initializer=initializer)(output_)
    output_ = keras.layers.Dense(2, activation='softmax')(output_)
    return output_


def model_flow(input_):
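    # weights='imagenet' loads the pretrained backbone, which is what makes this
    # transfer learning; weights=None would train InceptionV3 from scratch
    output_ = tf.keras.applications.InceptionV3(input_shape=(224, 224, 3),
                               include_top=False,
                               weights='imagenet')(input_)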

    # include_top=False drops the 1000-class head, so output_ is a feature map here
    # Add the self-designed classifier to produce the binary result
    output_ = classifier_(output_)
    return output_


def input_output_compile():

    # Shape of the training set
    input_ = keras.layers.Input(shape=(224, 224, 3))
    output_ = model_flow(input_)
    clf = keras.Model(inputs=input_, outputs=output_)

    clf.compile(optimizer='SGD', loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
    return clf


# Define learning rate scheduler
def lr_decay(epoch, lr):
    # Print test accuracy every 5 epochs, skipping the start of training
    # (relies on the global `model` created below)
    if epoch > 0 and epoch % 5 == 0:
        print('Test set accuracy: ' +
              str(np.round(model.evaluate(x_test, y_test)[1] * 100, 2)) + '%')

    # `lr` is the current rate, so the decay compounds from epoch to epoch
    return lr * 1 / (1 + 0.05 * epoch)


set_seed()

# Visualize the neural network structure
model = input_output_compile()
tf.keras.utils.plot_model(model, show_shapes=True,
              show_dtype=False,
              show_layer_names=False,
              show_layer_activations=True)

# Start training on the augmentation generator
set_seed()
lr_scheduler = tf.keras.callbacks.LearningRateScheduler(lr_decay)
hist = model.fit_generator(train_aug.flow(x_train, y_train, batch_size=24),
                    epochs=8,
                    # steps_per_epoch=X_train.shape[0]//batch_size,
                    validation_data=(x_valid, y_valid),
                    # validation_steps=X_valid.shape[0]//batch_size
                    callbacks=[lr_scheduler])

## Visualize the Change of Accuracy and Loss during Training
We plot the accuracy and loss of the training and validation sets against the epoch number, using the history object returned by the training call. The blue line represents the training data, while the orange line represents the validation data.

The first plot shows accuracy, which should rise over time if the model is learning from the training data. The second plot shows loss, which should fall as the model gets better at predicting the correct classes.

---

In [None]:
acc = hist.history['accuracy']
val_acc = hist.history['val_accuracy']
loss = hist.history['loss']
val_loss = hist.history['val_loss']

epochs = range(1, len(acc) + 1)
# First figure: accuracy per epoch
plt.figure(figsize = (16, 8))
plt.plot(epochs, acc, label='Training acc')
plt.plot(epochs, val_acc, label='Validation acc')
plt.title('Accuracy')
plt.xlabel('Epoch')
plt.legend()
plt.grid()

# Second figure: loss per epoch, same size as the first
plt.figure(figsize = (16, 8))
plt.plot(epochs, loss, label='Training loss')
plt.plot(epochs, val_loss, label='Validation loss')
plt.title('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.grid()

plt.show()

<a id="evaluate-model"></a>
## Evaluate the Model and Analyze Wrong Cases
> *    Plot wrongly predicted images
> *    Calculate TN, FN, TP, FP
> *    Calculate recall, precision, and specificity
---
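The three rates follow directly from the four confusion-matrix counts. The sketch below uses hypothetical counts, not results from this model, and the helper name `alarm_metrics` is an assumption:

```python
def alarm_metrics(tp, fp, tn, fn):
    """Return (precision, recall, specificity) as fractions in [0, 1]."""
    precision = tp / (tp + fp)    # share of raised alarms that were correct
    recall = tp / (tp + fn)       # share of actual positives that were caught
    specificity = tn / (tn + fp)  # share of actual negatives left alone
    return precision, recall, specificity


# Hypothetical counts for illustration only
precision, recall, specificity = alarm_metrics(tp=90, fp=10, tn=95, fn=5)
print(f"Precision: {precision:.2%}, Recall: {recall:.2%}, Specificity: {specificity:.2%}")
```

Which class counts as "positive" is a choice; in the cells that follow, the positive class is WithoutMask, the case that should raise an alarm.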

In [None]:
model.evaluate(x_test, y_test)

In [None]:
y_pred = model.predict(x_test)
correct_ = np.equal(np.argmax(y_pred, 1), y_test)
x_wrong = x_test[np.logical_not(correct_)]

# Plot every misclassified test image in a 5-column grid
n_cols = 5
n_rows = max(1, int(np.ceil(len(x_wrong) / n_cols)))
plt.figure(figsize = (20, 4 * n_rows))
for i, img_wrong in enumerate(x_wrong):
    plt.subplot(n_rows, n_cols, i + 1)
    plt.imshow(img_wrong)
    plt.axis('off')
plt.show()

In [None]:
y_pred = np.argmax(y_pred, 1)

# The positive class is WithoutMask (label 0), the case that should raise an alarm
TP = np.sum(np.logical_and(y_pred == 0, y_test == 0))
FP = np.sum(np.logical_and(y_pred == 0, y_test == 1))
TN = np.sum(np.logical_and(y_pred == 1, y_test == 1))
FN = np.sum(np.logical_and(y_pred == 1, y_test == 0))
print('TP:', TP, '\nFP:', FP, '\nTN:', TN, '\nFN:', FN)
print("\nFP: predicted as WithoutMask but actually WithMask",
   "\nFN: predicted as WithMask but actually WithoutMask")

# Precision: important when false alarms are unacceptable
precision = str(np.round((TP/(TP+FP))*100, 2)) + "%"

# Recall: important when missed alarms are unacceptable,
# even at the cost of some false alarms
recall = str(np.round((TP/(TP+FN))*100, 2)) + "%"

# Specificity: important when false alarms are unacceptable
specificity = str(np.round((TN/(TN+FP))*100, 2)) + "%"

print("\nPrecision:", precision,"\nRecall:", recall,
    "\nSpecificity:", specificity)