# Face Mask Detection
## 1. Introduction
**Context**
This dataset is used for face mask detection, framed as image classification. It consists of almost 12K images totalling about 328.92 MB.

**Acknowledgments**
All the images with a face mask (~6K) were scraped from Google search, and all the images without a face mask were preprocessed from the CelebFace dataset created by Jessica Li (https://www.kaggle.com/jessicali9530). Thank you so much, Jessica, for providing a wonderful dataset to the community.

**Inspiration**
The inspiration behind this dataset is to create an algorithm that can directly detect whether a person is wearing a face mask. To make this happen, I scraped images from Google and drew on the CelebFace dataset created by Jessica Li (https://www.kaggle.com/jessicali9530).




The dataset was downloaded from Kaggle: [Face Mask Detection ~12K Images Dataset](https://www.kaggle.com/datasets/ashishjangra27/face-mask-12k-images-dataset)

## 2. Planning
1. Apply data augmentation to increase the dataset size
2. Develop and train a CNN model to detect face masks
3. Label faces
    1. Find and crop faces using a Haar cascade
    2. Use the CNN model to predict the result
    3. Label the faces with the predicted result
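
Step 3 of the plan can be sketched as follows. This is a minimal illustration, not the notebook's implementation: `label_faces`, `detect_faces`, and `predict_prob` are hypothetical names. In practice `detect_faces` could wrap OpenCV's `cv2.CascadeClassifier.detectMultiScale`, which returns `(x, y, w, h)` rectangles, and `predict_prob` could wrap the trained CNN after resizing the crop to the model's input size. The sketch assumes the CNN outputs the probability of class index 1, which is `WithoutMask` under the class order `["WithMask", "WithoutMask"]`.

```python
import numpy as np

CLASS_NAMES = ["WithMask", "WithoutMask"]  # class index 0, class index 1


def label_faces(image, detect_faces, predict_prob, threshold=0.5):
    """Crop every detected face and label it with the classifier's prediction.

    detect_faces(image) -> iterable of (x, y, w, h) boxes (hypothetical callable).
    predict_prob(face)  -> probability of class index 1 (hypothetical callable).
    """
    results = []
    for (x, y, w, h) in detect_faces(image):
        face = image[y:y + h, x:x + w]              # step 3.1: crop the face
        prob = float(predict_prob(face))            # step 3.2: predict
        label = CLASS_NAMES[int(prob > threshold)]  # step 3.3: label
        results.append(((x, y, w, h), label, prob))
    return results


# Usage with stand-in callables (assumptions for illustration only)
image = np.zeros((100, 100, 3), dtype=np.uint8)
fake_detector = lambda img: [(10, 10, 20, 20), (50, 50, 30, 30)]
fake_classifier = lambda face: 0.9 if face.shape[0] == 20 else 0.1
labelled = label_faces(image, fake_detector, fake_classifier)
```

Passing the detector and classifier in as callables keeps the labelling logic independent of any particular face detector or model.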


In [None]:
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import classification_report
from libs.nn.conv import CnnModel
import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers as layers
from tensorflow.keras.optimizers import SGD
import pathlib
import matplotlib.pyplot as plt
import matplotlib.image as img
import opendatasets as od
import numpy as np
from imutils import paths
import os

In [None]:
# Kaggle URL of the dataset
dataset_url = 'https://www.kaggle.com/datasets/ashishjangra27/face-mask-12k-images-dataset'

# Directory the dataset is extracted into
images_dir = './face-mask-12k-images-dataset/Face Mask Dataset'

images_dir_path = pathlib.Path(images_dir)
# Download the dataset only if it is not already on disk
if not os.path.isdir(images_dir):
    od.download(dataset_url)


## 3. Data Exploration

In [None]:
# Collect the image paths of each split and count the images in total
train_dir = f'{images_dir}/Train'
test_dir = f'{images_dir}/Test'
valid_dir = f'{images_dir}/Validation'

train_imgs = list(paths.list_images(train_dir))
test_imgs = list(paths.list_images(test_dir))
valid_imgs = list(paths.list_images(valid_dir))

len(train_imgs) + len(test_imgs) + len(valid_imgs)

In [None]:
random_num_array = np.random.randint(len(train_imgs), size=16)

In [None]:
fig, axs = plt.subplots(nrows=4, ncols=4, figsize=(12, 8))

for ax, num in zip(axs.ravel(), random_num_array):
    _img = img.imread(train_imgs[num])
    ax.set_title(f'{_img.shape}')
    ax.imshow(_img , interpolation='none')
    ax.axis("off")
# plt.subplots_adjust(wspace=0, hspace=0, left=0, right=1, bottom=0, top=1)
plt.show()

Observation:
* The dataset contains 11,792 color images in total, with varying sizes.

In [None]:
# Count the training images in each class to check the class balance
total_with_mask = len(list(paths.list_images(f'{images_dir}/Train/WithMask')))
total_without_mask = len(list(paths.list_images(f'{images_dir}/Train/WithoutMask')))
total_with_mask, total_without_mask

## 4. Data Preparation

In [None]:
BATCH_SIZE = 32
IMG_HEIGHT = 64
IMG_WIDTH = 64

In [None]:
train_data_set = keras.utils.image_dataset_from_directory(
    train_dir,
    labels='inferred',
    label_mode='binary',
    class_names=["WithMask", "WithoutMask"],
    batch_size=BATCH_SIZE,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    shuffle=True,
    seed=42
)
test_data_set = keras.utils.image_dataset_from_directory(
    test_dir,
    labels='inferred',
    label_mode='binary',
    class_names=["WithMask", "WithoutMask"],
    batch_size=BATCH_SIZE,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    seed=42
)
valid_data_set = keras.utils.image_dataset_from_directory(
    valid_dir,
    labels='inferred',
    label_mode='binary',
    class_names=["WithMask", "WithoutMask"],
    batch_size=BATCH_SIZE,
    image_size=(IMG_HEIGHT, IMG_WIDTH),
    seed=42
)
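
Step 1 of the plan, data augmentation, is not implemented in the cells above. A minimal NumPy sketch of two common augmentations (a random horizontal flip and a random brightness shift) is shown below. `augment_batch` and its parameters are hypothetical, and the notebook may well use a different approach, such as Keras preprocessing layers.

```python
import numpy as np


def augment_batch(images, rng, max_brightness_delta=25.0):
    """Randomly flip each image left-right and shift its brightness.

    images: array of shape (batch, height, width, channels) with values in [0, 255].
    rng:    a numpy.random.Generator, so results are reproducible.
    """
    out = np.asarray(images, dtype=np.float32).copy()

    # Flip roughly half of the images along the width axis
    flip = rng.random(len(out)) < 0.5
    out[flip] = out[flip, :, ::-1, :]

    # Add one random brightness offset per image, then clip to the valid range
    delta = rng.uniform(-max_brightness_delta, max_brightness_delta,
                        size=(len(out), 1, 1, 1))
    return np.clip(out + delta, 0.0, 255.0).astype(np.float32)


# Usage on a random stand-in batch with the notebook's 64x64 image size
rng = np.random.default_rng(42)
batch = rng.uniform(0, 255, size=(4, 64, 64, 3)).astype(np.float32)
augmented = augment_batch(batch, rng)
```

Augmentation of this kind would normally be applied to the training split only, so that validation and test metrics reflect unmodified images.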

## 4. Training CNN Model

In [None]:
cnn_model = CnnModel.build(width=IMG_WIDTH, height=IMG_HEIGHT, depth=3, classes=1)

In [None]:
opt = SGD(learning_rate=0.005)
cnn_model.compile(loss="binary_crossentropy", optimizer=opt, metrics=["accuracy"])
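
For reference, binary cross-entropy for a label y in {0, 1} and a predicted probability p is -(y log p + (1 - y) log(1 - p)), averaged over the batch. The NumPy sketch below illustrates the formula. It is not how Keras implements the loss, and the helper name and the `eps` clipping value are assumptions.

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip probabilities away from 0 and 1 to avoid log(0)
    y_prob = np.clip(np.asarray(y_prob, dtype=np.float64), eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))


# An uninformative prediction of 0.5 costs ln(2) per sample
uncertain = binary_crossentropy([1, 0], [0.5, 0.5])
# Confident, correct predictions cost much less
confident = binary_crossentropy([1, 0], [0.95, 0.05])
```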

In [None]:
AUTOTUNE = tf.data.AUTOTUNE

train_ds = train_data_set.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = valid_data_set.cache().prefetch(buffer_size=AUTOTUNE)

In [None]:
# Stop when val_loss has not improved for 10 epochs, then restore the best weights
early_stopping_cb = keras.callbacks.EarlyStopping(
    patience=10,
    restore_best_weights=True
)
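
The callback's behaviour can be illustrated with a simplified, framework-free sketch: training stops once the monitored value (`val_loss` by default) has failed to improve for `patience` consecutive epochs, and `restore_best_weights=True` rolls the model back to the best epoch. `simulate_early_stopping` is a hypothetical helper and glosses over details of the Keras implementation.

```python
def simulate_early_stopping(val_losses, patience=10):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Epochs are 0-indexed. If patience is never exhausted, stop_epoch is the
    last epoch in the sequence.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch


# With patience=2, training stops at epoch 3 and the weights of epoch 1 are kept
stop_epoch, best_epoch = simulate_early_stopping(
    [1.0, 0.8, 0.9, 0.95, 0.85], patience=2)
```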

In [None]:
# The datasets are already batched with BATCH_SIZE, so batch_size is not passed to fit()
history = cnn_model.fit(train_ds, validation_data=val_ds,
                        epochs=100, verbose=1, callbacks=[early_stopping_cb])

## 5. Model Evaluation

In [None]:
plt.figure()
plt.plot(history.history["loss"], label="train_loss")
plt.plot(history.history["val_loss"], label="val_loss")
plt.plot(history.history["accuracy"], label="train_acc")
plt.plot(history.history["val_accuracy"], label="val_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend()
plt.show()

In [None]:
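# Preview the predicted labels on the test set, batch by batch.
# The model has a single output unit (classes=1), so argmax over axis 1 would
# always return 0; threshold at 0.5 instead (assumes a sigmoid output).
for x, y in test_data_set:
    predict = cnn_model.predict(x)
    print((predict > 0.5).astype(int).ravel())
    # print(y)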


In [None]:
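# Collect predicted and true labels over the whole test set
preds = np.array([])
testY = np.array([])
for x, y in test_data_set:
    predict = cnn_model.predict(x)
    # Threshold the single output at 0.5 (assumes a sigmoid output);
    # argmax over axis 1 would always return 0 for a one-unit output
    preds = np.concatenate([preds, (predict > 0.5).astype(int).ravel()])
    # Labels have shape (batch, 1); flatten them to match preds
    testY = np.concatenate([testY, y.numpy().ravel()])

tf.math.confusion_matrix(labels=testY, predictions=preds).numpy()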


In [None]:
# preds already holds one predicted class label per test image
print(classification_report(testY,
                            preds,
                            target_names=["WithMask", "WithoutMask"]))
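
The figures in a classification report can be derived from confusion-matrix counts. The sketch below computes them with NumPy, treating label 1 (`WithoutMask`, given the class order `["WithMask", "WithoutMask"]`) as the positive class. `binary_metrics` is a hypothetical helper, shown only to make the definitions concrete.

```python
import numpy as np


def binary_metrics(y_true, y_pred):
    """Precision, recall, F1 and accuracy for 0/1 labels, with 1 as the positive class."""
    y_true = np.asarray(y_true).astype(int).ravel()
    y_pred = np.asarray(y_pred).astype(int).ravel()

    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # true positives
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # false positives
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # false negatives
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Toy example: 2 true positives, 1 false positive, 1 false negative, 1 true negative
metrics = binary_metrics([0, 0, 1, 1, 1], [0, 1, 1, 1, 0])
```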

## 6. Testing on new (real world) images

## 7. Save the Model

## 8. Future Work

## 9. References