<a href="https://colab.research.google.com/github/janosepah/FaceMaskDetection/blob/main/Face_Mask_Detection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Final Project: Face Mask Detector using Deep Learning** 
by Azadeh Ghaffari and Safoura Janosepah

**Objective:** build a deep learning model that can identify whether a person is wearing a mask.


The novel COVID-19 virus has forced us all to rethink how we live our everyday lives while keeping ourselves and others safe. Face masks have emerged as a simple and effective strategy for reducing the virus's threat. Face mask detection systems are now in high demand in transportation, densely populated areas, residential districts, large-scale manufacturing, and other enterprises to ensure safety. Therefore, the goal of this project is to develop a face mask detector using deep learning.

### Table of Contents

1. About Dataset
2. Choosing Deep Learning Model
3. Training and Evaluation of Models
4. Conclusion
5. Use Cases
6. References

In [None]:
import os
import cv2
import glob
import torch
import shutil
import itertools
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from torch import nn
from torch import optim
from torchvision import transforms, datasets, models
from keras.models import Sequential
from keras.applications.vgg19 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ModelCheckpoint,EarlyStopping
from keras.layers import Flatten, Dense, Conv2D, BatchNormalization, MaxPooling2D, Dropout
from tensorflow.keras.applications import EfficientNetB1, VGG19, ResNet50, InceptionV3, MobileNet, DenseNet201
from tensorflow.keras.applications import ResNet101,ResNet152,ResNet50V2,ResNet101V2,ResNet152V2
from os import path

## 1. About Dataset

The main source of data we used is Kaggle. 

* Data is downloaded from this link
https://www.kaggle.com/andrewmvd/face-mask-detection
* Data1 is downloaded from this link https://www.kaggle.com/ashishjangra27/face-mask-12k-images-dataset
* Data2 is downloaded from this link
https://www.kaggle.com/niharika41298/withwithout-mask

### Displaying sample images
Here we show a few sample images from the `./data` folder.

In [None]:
i = 0
fig, axes = plt.subplots(3, 3, figsize=(10, 10))
axes = axes.flatten()
for dirname, _, filenames in os.walk('./data'):
    for filename in filenames:
        if i >= len(axes):  # the grid only has room for nine images
            break
        img = cv2.imread(os.path.join(dirname, filename))
        if img is not None:
            # OpenCV loads images as BGR; matplotlib expects RGB
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            ax = axes[i]
            ax.imshow(img)
            ax.set_title('Image ' + str(i+1))
            ax.axis('off')
            i += 1
plt.show()

Collect a small group of original images and a small group of augmented images from the "WithMask" folders.

In [None]:
images = []
augmented = []
path = './data1/Face Mask Dataset'
#PROJECT_ROOT_DIR = "."
CHAPTER_ID = "Face Mask Dataset"
#path = os.path.join(PROJECT_ROOT_DIR, "data1", CHAPTER_ID)

for set_ in os.listdir(path):
    i, j = 4, 4
    
    for img in os.listdir(path+'/'+set_+'/WithMask'):
        if img[0] != 'A':
            if i > 0:
                images.append(path+'/'+set_+'/WithMask/'+img)
                i -= 1
        else:
            if j > 0:
                augmented.append(path+'/'+set_+'/WithMask/'+img)
                j -= 1

Display the original images that were grouped in the previous step.

In [None]:
fig, axes = plt.subplots(3, 4, figsize=(15, 15))
fig.tight_layout()
fig.subplots_adjust(hspace=-0.5)

for ax, image_path in zip(axes.flatten(), images):
    img = cv2.imread(image_path)
    # OpenCV loads images as BGR; matplotlib expects RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(img)
    ax.axis('off')

plt.suptitle('Original Images', size=30)
plt.show()

#### Image Augmentation
Data augmentation is a technique for enlarging a dataset by generating new samples from its existing data. A larger, more varied dataset helps the model train better.

Image augmentation is one such technique for image datasets; it reduces overfitting and helps the model generalize well.
In this project we needed it to get better results.
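To make the idea concrete, here is a minimal sketch of image augmentation using only NumPy. The augmented files in the dataset were produced elsewhere, so the random horizontal flip and the brightness range used here are illustrative assumptions, not the dataset's actual recipe.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly flipped and brightness-shifted copy of an H x W x 3 uint8 image."""
    out = image.astype(np.float32)
    if rng.random() < 0.5:
        out = out[:, ::-1, :]          # horizontal flip
    out = out * rng.uniform(0.8, 1.2)  # brightness jitter (assumed range)
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
# Hypothetical random image standing in for a real photo
original = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
variants = [augment(original, rng) for _ in range(4)]
print(len(variants), variants[0].shape, variants[0].dtype)
```

Each call yields a slightly different image with the same shape and label, which is how one photo can contribute several training samples.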

In [None]:
fig, axes = plt.subplots(3, 4, figsize=(15, 15))
fig.tight_layout()
fig.subplots_adjust(hspace=-0.5)

for ax, image_path in zip(axes.flatten(), augmented):
    img = cv2.imread(image_path)
    # OpenCV loads images as BGR; matplotlib expects RGB
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(img)
    ax.axis('off')

plt.suptitle('Augmented Images', size=30)
plt.show()

### Data preprocessing
The preprocessing steps below were applied to all raw input images to convert them into clean versions that can be fed to a neural network model.
1. Resizing the input image (128 x 128)
2. Scaling / normalizing images
3. Setting up the train and test groups
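The first two steps can be sketched on a single image with NumPy alone. This is only an illustration: it uses nearest-neighbour sampling for the resize, which is an assumption made to keep the example dependency-free, whereas the notebook relies on Keras' `ImageDataGenerator` for resizing and rescaling.

```python
import numpy as np

def preprocess(image, size=128):
    """Resize an H x W x 3 uint8 image to size x size (nearest neighbour) and scale to [0, 1]."""
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    resized = image[rows][:, cols]
    return resized.astype(np.float32) / 255.0

demo = np.full((300, 200, 3), 255, dtype=np.uint8)  # hypothetical all-white image
x = preprocess(demo)
print(x.shape, x.min(), x.max())
```

Dividing by 255 is the same scaling as `rescale=1.0/255` in the generators; any image passed to a trained model later should go through the same scaling.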

In [None]:
path, batch_size = './data2/maskdata/maskdata', 16

train_datagen = ImageDataGenerator(rescale=1.0/255, horizontal_flip=True, zoom_range=0.2,
                                  shear_range=0.2)
test_datagen = ImageDataGenerator(rescale=1.0/255)

train_generator = train_datagen.flow_from_directory(path+'/train', target_size=(128, 128), 
                                               batch_size=batch_size, class_mode='categorical')
test_generator = test_datagen.flow_from_directory(path+'/test', target_size=(128, 128), 
                                             batch_size=batch_size, class_mode='categorical')

### Plot the distribution of classes
There are two classes, WithMask and WithoutMask. The diagrams below show the ratio of these two classes in the train and test sets.

In [None]:
path = './data2/maskdata/maskdata'
fig, axes = plt.subplots(1, 2, figsize=(15, 9))

for ax, set_ in zip(axes, sorted(os.listdir(path))):
    # Label each bar with its class folder name so labels always match the counts
    class_names = sorted(os.listdir(path+'/'+set_))
    counts = [len(os.listdir(path+'/'+set_+'/'+class_)) for class_ in class_names]
    ax.bar(class_names, counts, color='skyblue')
    ax.set_title(set_)
    ax.set_xlabel('Classes')
    ax.set_ylabel('Number of samples')

plt.suptitle('Distribution of classes', size=25)
plt.show()

## 2. Choosing Deep Learning Model

In this project, we implement multiple deep learning models and detection tools, including:

*   ConvNet
*   VGG19
*   DenseNet201
*   OpenCV
*   MobileNet
*   ResNet (6 models)
*   YOLOv5

Drawing on existing research on face mask detection, we build the above models and then compare their results by accuracy and loss.

## 3. Training and Evaluation of Models

### Detecting the face and mask using ConvNet model
In deep learning, a convolutional neural network (ConvNet) is a class of deep neural networks most commonly applied to analyzing visual imagery. Below we train three variants with one, two, and three convolutional blocks.

In [None]:
histories = []
for i in range(3):
    model = Sequential()
    model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(128, 128, 3)))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    
    if i > 0: 
        model.add(Conv2D(64, kernel_size=(3, 3), activation='relu', input_shape=(128, 128, 3)))
        model.add(BatchNormalization())
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.2))
    
        if i > 1: 
            model.add(Conv2D(128, kernel_size=(3, 3), activation='relu', input_shape=(128, 128, 3)))
            model.add(BatchNormalization())
            model.add(MaxPooling2D(pool_size=(2, 2)))
            model.add(Dropout(0.2))
    
    model.add(Flatten())
    model.add(Dense(2, activation='sigmoid'))
    model.summary()

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    # model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
    histories.append(model.fit(train_generator,
                               validation_data=test_generator,
                               steps_per_epoch=len(train_generator)//3,
                               validation_steps=len(test_generator)//3,
                               epochs=10))
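The three networks trained above differ only in depth, so it can help to trace how the feature-map size shrinks through each Conv2D and MaxPooling2D block. The sketch below assumes the Keras defaults used in the cell (3x3 kernels with `'valid'` padding and stride 1, 2x2 pooling); `conv_stack_shape` is a hypothetical helper, not part of Keras.

```python
def conv_stack_shape(size=128, filters=(32, 64, 128), depth=1, kernel=3, pool=2):
    """Trace the feature-map size through `depth` Conv2D -> MaxPooling2D blocks."""
    channels = 3
    for n_filters in filters[:depth]:
        size = size - kernel + 1   # 'valid' convolution, stride 1
        size = size // pool        # non-overlapping max pooling
        channels = n_filters
    return size, size, channels

for depth in (1, 2, 3):
    h, w, c = conv_stack_shape(depth=depth)
    print(depth, (h, w, c), "flattened:", h * w * c)
```

Batch normalization and dropout leave the shape unchanged, so the flattened sizes printed here should match the input width of the final `Dense` layer reported by `model.summary()`.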

#### Plot Loss and Accuracy 
Plot the loss and accuracy of all three networks (one, two, and three convolutional blocks).

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
for metric in histories[0].history:
    index = list(histories[0].history).index(metric)
    ax = axes.flatten()[index]
    layer_num = 0
    for history in histories:
        layer_num += 1
        ax.plot(history.history[metric], label=str(layer_num)+' layer(s)')
    ax.set_title(metric)
    ax.legend()
plt.show()

### Detecting the face and mask using OpenCV model
OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in commercial products. Its permissive license (BSD for releases before 4.5.0, Apache 2.0 since then) makes it easy for businesses to use and modify the code.
We use OpenCV's Haar cascade classifier to detect faces in images, and check a sample image to see the result.

In [None]:
faceCascade = cv2.CascadeClassifier('./haarcascade/haarcascade_frontalface_default.xml')

In [None]:
import matplotlib.pyplot as plt
#trying it out on a sample image
img = cv2.imread('./data/meWithMask.png')

# The Haar cascade works on a single-channel grayscale image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

faces = faceCascade.detectMultiScale(gray,1.3,5) #returns a list of (x,y,w,h) tuples

out_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) #colored output image in RGB order for matplotlib

#plotting
for (x,y,w,h) in faces:
    cv2.rectangle(out_img,(x,y),(x+w,y+h),(0,0,255),1)
plt.figure(figsize=(12,12))
plt.imshow(out_img)

Define a function that runs a trained model on every face detected in an image, and try it on a sample image.

In [None]:
sample = "./data/mewithoutMask.png"

def checkImages(model, image):
    plt.figure(figsize=(8, 7))
    label = {0: "With Mask", 1: "Without Mask"}
    color_label = {0: (0, 255, 0), 1: (0, 0, 255)}  # BGR: green / red
    frame = cv2.imread(image)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = faceCascade.detectMultiScale(gray, 1.3, 5)
    for x, y, w, h in faces:
        # Crop the face and convert it to RGB, the channel order Keras used during training
        face_image = cv2.cvtColor(frame[y:y+h, x:x+w], cv2.COLOR_BGR2RGB)
        resize_img = cv2.resize(face_image, (128, 128))
        normalized = resize_img / 255.0
        reshape = np.reshape(normalized, (1, 128, 128, 3))
        # predict_classes was removed from recent Keras releases; argmax gives the same class index
        result = int(np.argmax(model.predict(reshape), axis=1)[0])

        cv2.rectangle(frame, (x, y), (x+w, y+h), color_label[result], 3)
        cv2.rectangle(frame, (x, y-50), (x+w, y), color_label[result], -1)
        cv2.putText(frame, label[result], (x, y-10), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
    # Convert to RGB once, after all boxes are drawn, so colors stay consistent with several faces
    plt.imshow(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    plt.show()

checkImages(model,sample)


### Comparing the performance of 11 models
### (ConvNet, InceptionV3, MobileNet, DenseNet201, VGG19, and six ResNet variants)
* Inception V3: a convolutional neural network for image analysis and object detection that got its start as a module for GoogLeNet. It is the third edition of Google's Inception convolutional neural network, originally introduced during the ImageNet Recognition Challenge.
* MobileNet: based on a streamlined architecture that uses depth-wise separable convolutions to build lightweight deep neural networks.
* DenseNet201: a convolutional neural network that is 201 layers deep.
* VGG19: a variant of the VGG model with 19 weight layers (16 convolutional and 3 fully connected), along with 5 max-pooling layers and 1 softmax layer.
* Residual convolutional neural networks (ResNet):
ResNet is one of the most useful models for image recognition tasks. The most noticeable feature of this architecture is its direct (skip) connections, which bypass intermediate layers instead of feeding only the next layer (the details vary between ResNet models). The Keras API provides 6 different ResNet architectures (ResNet50, ResNet101, ResNet152, ResNet50V2, ResNet101V2, ResNet152V2), which we use with transfer learning. For more information about how to use these pre-built ResNet models, see the Keras documentation.
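The skip connection described above can be illustrated in a few lines of NumPy. This is a simplified sketch that uses small dense transforms in place of the convolutions found in a real ResNet block; `residual_block` and its weights are hypothetical names chosen for the example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """y = relu(x + F(x)), with F a two-layer transform; the `x +` term is the skip connection."""
    fx = relu(x @ w1) @ w2
    return relu(x + fx)

x = np.array([[1.0, 2.0, 3.0]])
zeros = np.zeros((3, 3))
print(residual_block(x, zeros, zeros))  # with F(x) = 0 the block passes x through unchanged
```

Because the input is added back to the transformed signal, a block whose weights contribute nothing still behaves like the identity, which is one reason very deep ResNets remain trainable.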

In [None]:
model_histories = []
all_models=[]
height = 128
width = 128
models = [InceptionV3(include_top=False, input_shape=(height, width, 3)), 
                   MobileNet(include_top=False, input_shape=(height, width, 3)), 
                   DenseNet201(include_top=False, input_shape=(height, width, 3)),
                   VGG19(include_top=False, input_shape=(height, width, 3)),ResNet50(include_top=False,input_shape=(height, width,3)),
                    ResNet101(include_top=False, input_shape=(height, width, 3)), ResNet152(include_top=False, input_shape=(height, width, 3)),
                    ResNet50V2(include_top=False, input_shape=(height, width, 3)),ResNet101V2(include_top=False, input_shape=(height, width, 3)),
                    ResNet152V2(include_top=False, input_shape=(height, width, 3))]
names = ['ConvNet', 'InceptionV3', 'MobileNet', 'DenseNet', 'VGG19', 'ResNet50','ResNet101','ResNet152','ResNet50V2','ResNet101V2','ResNet152V2']
for layer in [Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(height, width, 3))]:
    model = Sequential()
    model.add(layer)
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.2))
    
    model.add(Flatten())
    model.add(Dense(2, activation='sigmoid'))
    model.summary()

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    all_models.append(model)
    # model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
    model_histories.append(model.fit(train_generator,
                                     validation_data=test_generator,
                                     steps_per_epoch=len(train_generator)//3,
                                     validation_steps=len(test_generator)//3,
                                     epochs=10))

for functional in models:
    
    for layer in functional.layers:
        layer.trainable = False

    model = Sequential()
    model.add(functional)
    model.add(Flatten())
    model.add(Dense(2, activation='sigmoid'))
    model.summary()

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    all_models.append(model)
    # model.fit accepts generators directly; fit_generator is deprecated in TensorFlow 2.x
    model_histories.append(model.fit(train_generator,
                                     validation_data=test_generator,
                                     steps_per_epoch=len(train_generator)//3,
                                     validation_steps=len(test_generator)//3,
                                     epochs=10))


### Plot Loss / Accuracy 
Plot loss and accuracy for all eleven models over the epochs and compare them.

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(20, 15))
fig.subplots_adjust(hspace=0.3)
for metric in model_histories[0].history:
    index = list(model_histories[0].history).index(metric)
    ax = axes.flatten()[index]
    name_index = 0
    for history in model_histories:
        ax.plot(history.history[metric], label=names[name_index])
        name_index += 1
    ax.set_title(metric+' over epochs', size=15)
    ax.set_xlabel('epochs')
    ax.set_ylabel(metric)
    ax.legend()
plt.show()

## Result
Having looked at the results:
* After epoch 7, most of the models reach high accuracy and low loss.
* MobileNet, ResNet50V2, DenseNet, VGG19, and ResNet152V2 have the best accuracy and loss.
* ConvNet, ResNet101V2, ResNet152, and ResNet50 are at a different level and do not produce good results.
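A comparison like the one above can also be made numerically by ranking the models on their final validation accuracy. The sketch below is a hypothetical helper; the numbers in `example` are made up for illustration and are not results from this notebook, and the metric key (`val_accuracy`) is an assumption that depends on the Keras version.

```python
def rank_models(histories, metric="val_accuracy"):
    """Sort model names by their final-epoch value of `metric`, best first."""
    final = {name: values[metric][-1] for name, values in histories.items()}
    return sorted(final, key=final.get, reverse=True)

# Hypothetical numbers for illustration only, not results from this notebook.
example = {
    "model_a": {"val_accuracy": [0.70, 0.85, 0.91]},
    "model_b": {"val_accuracy": [0.60, 0.75, 0.80]},
    "model_c": {"val_accuracy": [0.88, 0.93, 0.97]},
}
print(rank_models(example))
```

With the notebook's own variables, the input could be built as `dict(zip(names, (h.history for h in model_histories)))`.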

## Test the models
* Test the last model with new data and try to detect masks on faces
* Test models with another source of data to get the accuracy based on detection

#### Based on the training results, we run the test on the top models
* VGG19
* ResNet152V2
* ResNet50V2

In [None]:
import sys
test_dir="./data2/maskdata/maskdata/test"
def test_model(model):
    total_comparison = 0
    total_incorrect = 0
    class_list = list(train_generator.class_indices.keys())
    for class_name in os.listdir(test_dir):
        class_dir = os.path.join(test_dir, class_name)
        for image_name in os.listdir(class_dir):
            image_path = os.path.join(class_dir, image_name)
            image = cv2.imread(image_path, cv2.IMREAD_COLOR)
            if image is None:  # skip files that are not readable images
                continue
            # Match the training pipeline: RGB order, 128x128, pixels scaled to [0, 1]
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            image = cv2.resize(image, (128, 128))
            batch = np.expand_dims(image / 255.0, axis=0)
            total_comparison += 1
            pred_class = class_list[np.argmax(model.predict(batch))]
            if class_name != pred_class:
                total_incorrect += 1
                print('')
                print('picture=' + class_name + ' but model predicted=' + pred_class)
                plt.imshow(image)
                plt.show()
            sys.stdout.write('img= ' + image_name + ' total_comparisons= ' + str(total_comparison) + '  total_incorrect: ' + str(total_incorrect) + '\r')
            sys.stdout.flush()
    accuracy = (total_comparison - total_incorrect) / total_comparison
    print('')
    print("Total comparisons=" + str(total_comparison) + " incorrect pred=" + str(total_incorrect) + " Accuracy=" + str(accuracy))
    return accuracy

In [None]:
imagePath = "./data"
i = 0
for myModel in all_models:
    print(names[i] + " : " + str(myModel))
    i = i + 1
    if (i == 5 or i == 11):
        for image_name in os.listdir(imagePath):
            print(image_name)
            image_path = os.path.join(imagePath,image_name)
            checkImages(myModel,image_path)
    

Using OpenCV face detection on new data, we can see that "ResNet152V2" gives good results.

In [None]:
i = 0
for myModel in all_models:
    print(names[i] + " : " + str(myModel))
    i = i + 1
    if (i == 9 or i == 5) :
        test_model(myModel)

## Test Result
Based on the tests we ran, "ResNet50V2" reached 63% accuracy and "VGG19" reached 94% accuracy.
* Based on our test results, VGG19 is a well-trained model.
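Accuracy alone does not show which class a model confuses. A small sketch of how overall accuracy and per-pair error counts could be derived from true and predicted labels is given below; `summarize` is a hypothetical helper and the labels are invented for illustration.

```python
from collections import Counter

def summarize(true_labels, predicted_labels):
    """Return overall accuracy and a Counter of (true, predicted) pairs."""
    pairs = Counter(zip(true_labels, predicted_labels))
    correct = sum(n for (t, p), n in pairs.items() if t == p)
    return correct / len(true_labels), pairs

# Hypothetical labels for illustration only.
y_true = ["with_mask", "with_mask", "without_mask", "without_mask"]
y_pred = ["with_mask", "without_mask", "without_mask", "without_mask"]
accuracy, pairs = summarize(y_true, y_pred)
print(accuracy, pairs[("with_mask", "without_mask")])
```

The off-diagonal pairs (true label differs from predicted label) indicate whether a model tends to miss masks or to report masks that are not there.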

## 4. Conclusion
As deep learning technology matures, a face mask detector like this one can contribute to public healthcare. An architecture with MobileNet as the backbone can be used in both high- and low-computation scenarios. To extract more robust features, we use transfer learning to adopt weights from a similar task, face detection, trained on a very large dataset.
We used OpenCV, TensorFlow, Keras, PyTorch, and CNNs to detect whether people were wearing face masks. The models were tested with images. Optimizing the model is a continuous process, and accuracy can be improved further by tuning the hyperparameters.


## 5. Use Cases
Here are a few use cases where this mask detection technology could be leveraged.

* Airports: 
The Face Mask Detection System could be used at airports to detect travelers without masks. Face data of travelers can be captured in the system at the entrance. If a traveler is found to be without a face mask, their picture is sent to the airport authorities so that they could take quick action.
* Hospitals: 
Using Face Mask Detector System, Hospitals can monitor if quarantined people required to wear a mask are doing so or not. The same holds good for monitoring staff on duty too.
* Offices & Working Spaces: 
The Face Mask Detection System can be used at office premises to ascertain if employees are maintaining safety standards at work. It monitors employees without masks and sends them a reminder to wear a mask.
* Government: 
To limit the spread of coronavirus, the police could deploy the face mask detector on its fleet of surveillance cameras to enforce the compulsory wearing of face masks in public places.


## 6. References
* Kaggle
    * https://www.kaggle.com/andrewmvd/face-mask-detection
    * https://www.kaggle.com/ashishjangra27/face-mask-12k-images-dataset
    * https://www.kaggle.com/niharika41298/withwithout-mask
* https://www.irjet.net/archives/V7/i8/IRJET-V7I8530.pdf 
* https://www.ideas2it.com/blogs/face-mask-detector-using-deep-learning-pytorch-and-computer-vision-opencv/


#### Why YOLO?
Using an object detection model such as YOLOv5 is most likely the simplest and most reasonable approach to this problem. This is because we’re limiting the computer vision pipeline to a single step, since object detectors are trained to detect a:

*   Bounding box and a
*   Corresponding label

This is precisely what we’re trying to achieve for this problem. In our case, the bounding boxes will be the detected faces, and the corresponding labels will indicate whether the person is wearing a mask or not.
Alternatively, if we wanted to build our own deep learning model, it would be more complex, since it would have to be 2-fold: we’d need a model to detect faces in an image, and a second model to detect the presence or absence of face mask in the found bounding boxes.

A drawback of doing so, apart from the complexity, is that the inference time would be much slower, especially in images with many faces.

#### Training on custom data

In [None]:
from pathlib import Path
from xml.dom.minidom import parse
from shutil import copyfile

#### Project layout
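The first thing we need to do is clone the YOLOv5 code (the original project is ultralytics/yolov5; the cell below uses the rkuo2000/yolov5 repository) and change into its directory. The required dependencies are installed later, before training: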

In [None]:
!git clone https://github.com/rkuo2000/yolov5
%cd yolov5

#### Prepare the dataset folders

In [None]:
!mkdir -p Dataset/FaceMask/Images
!mkdir -p Dataset/FaceMask/Labels

In [None]:
!cp -rf /content/drive/MyDrive/FinalProject/MaskDetestion/data3/images/* Dataset/FaceMask/Images

In [None]:
!mkdir -p Dataset/images Dataset/labels

#### Create Test and Train Dataset

In [None]:
FILE_ROOT = "/content/"
IMAGE_PATH = FILE_ROOT + "images"  
ANNOTATIONS_PATH = FILE_ROOT + "annotations"

DATA_ROOT = "Dataset/"
LABELS_ROOT = DATA_ROOT + "FaceMask/Labels"
IMAGES_ROOT = DATA_ROOT + "FaceMask/Images"  

DEST_IMAGES_PATH = "images"
DEST_LABELS_PATH = "labels" 

In [None]:
classes = ['with_mask', 'without_mask', 'mask_weared_incorrect']

#### Converting annotations (from Pascal VOC .xml to YOLO format .txt)

In [None]:
def cord_converter(size, box):
    """
    Convert an XML annotation box to darknet-format coordinates.
    :param size: [w, h] of the image in pixels
    :param box: box corners [upper-left x, upper-left y, lower-right x, lower-right y]
    :return: normalized [x_center, y_center, w, h]
    """
    x1 = int(box[0])
    y1 = int(box[1])
    x2 = int(box[2])
    y2 = int(box[3])

    dw = np.float32(1. / int(size[0]))
    dh = np.float32(1. / int(size[1]))

    w = x2 - x1
    h = y2 - y1
    x = x1 + (w / 2)
    y = y1 + (h / 2)

    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return [x, y, w, h]

def save_file(img_jpg_file_name, size, img_box):
    save_file_name = LABELS_ROOT + '/' + img_jpg_file_name + '.txt'
    print(save_file_name)
    # Open with "w" rather than "a+" so re-running the cell does not duplicate annotations
    with open(save_file_name, "w") as label_file:
        for box in img_box:
            cls_num = classes.index(box[0])
            new_box = cord_converter(size, box[1:])
            label_file.write(f"{cls_num} {new_box[0]} {new_box[1]} {new_box[2]} {new_box[3]}\n")
    
def get_xml_data(file_path, img_xml_file):
    img_path = file_path + '/' + img_xml_file + '.xml'
    print(img_path)

    dom = parse(img_path)
    root = dom.documentElement
    img_name = root.getElementsByTagName("filename")[0].childNodes[0].data
    img_size = root.getElementsByTagName("size")[0]
    objects = root.getElementsByTagName("object")
    img_w = img_size.getElementsByTagName("width")[0].childNodes[0].data
    img_h = img_size.getElementsByTagName("height")[0].childNodes[0].data
    img_c = img_size.getElementsByTagName("depth")[0].childNodes[0].data
    # print("img_name:", img_name)
    # print("image_info:(w,h,c)", img_w, img_h, img_c)
    img_box = []
    for box in objects:
        cls_name = box.getElementsByTagName("name")[0].childNodes[0].data
        x1 = int(box.getElementsByTagName("xmin")[0].childNodes[0].data)
        y1 = int(box.getElementsByTagName("ymin")[0].childNodes[0].data)
        x2 = int(box.getElementsByTagName("xmax")[0].childNodes[0].data)
        y2 = int(box.getElementsByTagName("ymax")[0].childNodes[0].data)
        # print("box:(c,xmin,ymin,xmax,ymax)", cls_name, x1, y1, x2, y2)
        img_jpg_file_name = img_xml_file + '.jpg'
        img_box.append([cls_name, x1, y1, x2, y2])
    # print(img_box)

    # test_dataset_box_feature(img_jpg_file_name, img_box)
    save_file(img_xml_file, [img_w, img_h], img_box)

In [None]:
files = os.listdir(ANNOTATIONS_PATH)
for file in files:
    print("file name: ", file)
    file_xml = file.split(".")
    get_xml_data(ANNOTATIONS_PATH, file_xml[0])

#### Splitting the image dataset

In [None]:
from sklearn.model_selection import train_test_split
image_list = os.listdir('Dataset/FaceMask/Images')
train_list, test_list = train_test_split(image_list, test_size=0.2, random_state=7)
val_list, test_list = train_test_split(test_list, test_size=0.5, random_state=8)

print('total =',len(image_list))
print('train :',len(train_list))
print('val   :',len(val_list))
print('test  :',len(test_list))
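The two calls to `train_test_split` above produce an 80/10/10 split: 20% is held out first, then that portion is halved into validation and test. The same arithmetic can be sketched with the standard library alone; `split_80_10_10` is a hypothetical helper and its shuffling will not reproduce scikit-learn's exact ordering.

```python
import random

def split_80_10_10(items, seed=7):
    """Shuffle a copy of `items` and cut it into train / validation / test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * 0.8)
    n_val = (len(shuffled) - n_train) // 2
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

names_demo = [f"img{i}.png" for i in range(100)]  # hypothetical file names
train, val, test = split_80_10_10(names_demo)
print(len(train), len(val), len(test))
```

Fixing the seed keeps the split reproducible, so the same images end up in the same subset on every run.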

To train the model, the .xml annotation files must be converted to the darknet format. Each image needs an associated .txt file whose rows have the format:
`<object-class> <x> <y> <width> <height>`
Each row is the annotation for one object in the image, where `<x> <y>` are the coordinates of the centre of the bounding box and `<width> <height>` are its width and height, all normalized by the image dimensions.
For example, img1.jpg must have an associated img1.txt.
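As a worked example of the label format described above, the sketch below converts one box from corner coordinates in pixels to a normalized darknet row. The image size and box are hypothetical values chosen so the arithmetic is easy to follow, and `voc_to_yolo` is an illustrative helper that mirrors `cord_converter`.

```python
def voc_to_yolo(img_w, img_h, xmin, ymin, xmax, ymax):
    """Convert corner coordinates in pixels to normalized (x_center, y_center, width, height)."""
    w = xmax - xmin
    h = ymax - ymin
    return ((xmin + w / 2) / img_w, (ymin + h / 2) / img_h, w / img_w, h / img_h)

# Hypothetical 400 x 300 image with one box of class index 0.
x, y, w, h = voc_to_yolo(400, 300, xmin=100, ymin=50, xmax=300, ymax=250)
print(f"0 {x:.4f} {y:.4f} {w:.4f} {h:.4f}")  # prints "0 0.5000 0.5000 0.5000 0.6667"
```

The box is 200 pixels wide and 200 pixels tall, centred at (200, 150), so dividing by the image width 400 and height 300 gives the four normalized values.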

In [None]:
def copy_data(file_list, img_labels_root, imgs_source, type):

    root_file = Path(DATA_ROOT + DEST_IMAGES_PATH + '/' + type)
    if not root_file.exists():
        print(f"Path {root_file} does not exist; creating it")
        os.makedirs(root_file)

    root_file = Path(DATA_ROOT + DEST_LABELS_PATH + '/' + type)
    if not root_file.exists():
        print(f"Path {root_file} does not exist; creating it")
        os.makedirs(root_file)

    for file in file_list:
        img_name = file.replace('.png', '')
        img_src_file = imgs_source + '/' + img_name + '.png'
        label_src_file = img_labels_root + '/' + img_name + '.txt'

        # print(img_sor_file)
        # print(label_sor_file)
        # im = Image.open(rf"{img_sor_file}")
        # im.show()

        # Copy image
        DICT_DIR = DATA_ROOT + DEST_IMAGES_PATH + '/' + type
        img_dict_file = DICT_DIR + '/' + img_name + '.png'

        copyfile(img_src_file, img_dict_file)

        # Copy label
        DICT_DIR = DATA_ROOT + DEST_LABELS_PATH + '/' + type
        img_dict_file = DICT_DIR + '/' + img_name + '.txt'
        copyfile(label_src_file, img_dict_file)

In [None]:
copy_data(train_list, LABELS_ROOT, IMAGES_ROOT, "train")
copy_data(val_list,   LABELS_ROOT, IMAGES_ROOT, "val")
copy_data(test_list,  LABELS_ROOT, IMAGES_ROOT, "test")

#### Creating data/facemask.yaml

In [None]:
!echo "train: Dataset/images/train" > data/facemask.yaml
!echo "val:   Dataset/images/val" >> data/facemask.yaml
!echo "nc : 3" >> data/facemask.yaml
!echo "names: ['With_Mask', 'Without_Mask', 'Incorrect_Mask']" >> data/facemask.yaml

!cat data/facemask.yaml
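The same file can be produced from Python instead of shell `echo` commands, which keeps the class count in sync with the class names. This is a sketch under the assumption that only the four keys written above are needed; `dataset_yaml` is a hypothetical helper.

```python
def dataset_yaml(train_dir, val_dir, class_names):
    """Build the text of a YOLOv5-style dataset description file."""
    quoted = ", ".join(f"'{name}'" for name in class_names)
    return (f"train: {train_dir}\n"
            f"val: {val_dir}\n"
            f"nc: {len(class_names)}\n"
            f"names: [{quoted}]\n")

text = dataset_yaml("Dataset/images/train", "Dataset/images/val",
                    ["With_Mask", "Without_Mask", "Incorrect_Mask"])
print(text)
```

The text could then be saved with `pathlib.Path("data/facemask.yaml").write_text(text)`.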

#### Training YOLOv5


In [None]:
!pip install PyYAML==5.1

In [None]:
!pip install torchvision==0.7.0

In [None]:
# Train with default Yolov5.weight
!python train.py --img 320 --batch 16 --epochs 50 --data data/facemask.yaml --cfg models/yolov5s.yaml --weights yolov5s.pt

In [None]:
# save trained weights for detection
!cp runs/train/exp2/weights/best.pt weights
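The run folder name (`exp2` here) changes with every training run, so a hard-coded path is easy to get wrong. One possible way to locate the newest weights file is sketched below; `latest_best_weights` is a hypothetical helper, and it assumes the `runs/train/<run>/weights/best.pt` layout used by the command above.

```python
from pathlib import Path

def latest_best_weights(runs_dir="runs/train"):
    """Return the most recently modified best.pt under runs_dir, or None if there is none."""
    root = Path(runs_dir)
    if not root.is_dir():
        return None
    candidates = list(root.glob("*/weights/best.pt"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)

print(latest_best_weights())  # None when no training run exists yet
```

The returned path could then be copied into the `weights` folder with `shutil.copy`.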

#### Testing YOLOv5
Detecting the facemask

In [None]:
!python detect.py --source Dataset/images/test --img-size 320 --conf 0.4 --weights weights/best.pt 

#### Display detected images
As soon as the first epoch is complete we will have a mosaic of images showing both the ground truth and prediction results on test images, which will look like:

In [None]:
# display detected images
from IPython.display import Image

In [None]:
!python detect.py --source /content/drive/MyDrive/FinalProject/MaskDetestion/data3/images/maksssksksss103.png --img-size 320 --conf 0.4 --weights weights/best.pt 

In [None]:
Image('runs/detect/exp2/maksssksksss103.png')

In [None]:
!python detect.py --source /content/drive/MyDrive/FinalProject/MaskDetestion/data3/images/maksssksksss130.png --img-size 320 --conf 0.4 --weights weights/best.pt 

In [None]:
Image('runs/detect/exp5/maksssksksss130.png')

All in all, YOLOv5 is one of the best models for detecting objects correctly. In our project we used it to check the results and see how it works; the code was adapted from Kaggle and https://medium.com/.