<a href="https://colab.research.google.com/github/goldentrex/FaceMaskRecognition/blob/main/FaceMaskDetection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook walks you through our approach to a project that detects whether people are wearing face masks.
You can run the whole notebook on Google Colab, but you won't be able to try the model with your own webcam there.


---


First, the challenge of our project was to build our own dataset: we had to create the data and then label it to train our model.

Second, we used YOLOv5 and one of its models as a starting point to train our own model to detect and classify masks.

At the end, you will be able to try the models out and see how they work, either on pre-recorded videos or, if you are running the model on your local machine, in real time with your webcam (which is quite fun).


---


Note that if you want to run this whole notebook on your local machine, you will need a CUDA-enabled NVIDIA GPU!

First, clone the YOLOv5 repository from the Ultralytics GitHub so we can use their training script and their models.

In [None]:
!git clone https://github.com/ultralytics/yolov5  # clone

fatal: destination path 'yolov5' already exists and is not an empty directory.
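
The `fatal: destination path 'yolov5' already exists` message above only means that the clone cell was run more than once. Here is a minimal sketch of a guard that skips the clone when the folder is already there. The helper name `clone_if_missing` and its `runner` parameter are our own illustration, not part of git or YOLOv5.

```python
import os
import subprocess

def clone_if_missing(url, dest, runner=subprocess.run):
    """Clone `url` into `dest` unless `dest` already exists.

    Returns True when a clone was started and False when it was skipped.
    `runner` (hypothetical parameter) only exists so the function can be
    exercised without network access.
    """
    if os.path.isdir(dest):
        return False
    runner(["git", "clone", url, dest], check=True)
    return True

# usage in the notebook (commented out here because it needs network access):
# clone_if_missing("https://github.com/ultralytics/yolov5", "yolov5")
```

Running the cell a second time then becomes a no-op instead of printing an error.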


Now go to the yolov5 folder and install the packages listed in requirements.txt to make sure you have all the Python libraries needed to train and detect with YOLOv5.

In [None]:
%cd yolov5
!pip install -r requirements.txt  # install
%cd ..

/content


Since we will download data from other websites, make sure the wget Python package is installed; we will use it to download the data later on.

In [None]:
!pip3 install wget



We made our own dataset from our own pictures and videos and labeled it with [Supervise.ly](https://). We will download the dataset from that website as a tar file.


---


If you want more information on the labeling process and on how we used the website collaboratively, feel free to ask us on our GitHub.
However, most of the relevant information can be found directly on the website.

In [None]:
import wget

!mkdir data
%cd data

link = 'https://app.supervise.ly/h5un6l2bnaz1vj8a9qgms4-public/teams_storage/46385/p/b/Wu/s9B7kTZ7aYk4xFXx5i67t91dtuPtlbW45xiQaDF0tLpQKkWsWEdZOSdyMFHioT5UuyOfTLdEdtgQAH7yVF82vFedartZsMtztE07AHscZIGjBzuOlJVYXVlmwc4L.tar'

file_name = wget.download(link)

/content/data


Here we look up the name of the archive, because it can change over time depending on which batch of data the link above points to.

In [None]:
#get the name of the archive
import glob

targetPattern = r"*.tar"
file_name = glob.glob(targetPattern)[0]
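
Note that `glob.glob(targetPattern)[0]` raises an `IndexError` when no archive is present and picks an arbitrary file when several are. The sketch below shows one possible, more defensive lookup. The helper name `find_archive` and the choice of "most recently modified" as the tie-breaker are our assumptions.

```python
import glob
import os

def find_archive(directory=".", pattern="*.tar"):
    """Return the most recently modified file matching `pattern` in `directory`."""
    matches = glob.glob(os.path.join(directory, pattern))
    if not matches:
        raise FileNotFoundError("no file matching %s in %s" % (pattern, directory))
    # if several archives are present, keep the newest one
    return max(matches, key=os.path.getmtime)

# usage in the notebook, from inside the data folder:
# file_name = find_archive()
```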

Now we can extract all the data from the tar file that the website provided.

In [None]:
# get data and extract

import tarfile

# the with block closes the archive even if the extraction fails
with tarfile.open(file_name) as tar:
    tar.extractall()

Remove the tar file to clean up our environment a bit.

In [None]:
#remove tarfile
import os

os.remove(file_name)

%cd ..

Even though the website provides the data and the labels we worked on in a YOLOv5 format, the folder layout isn't right for training.

To make it work, we need at least a training set and a validation set so the training can be done properly, and a test set is a good idea too if we want to test our model later on.


---


So now we want to split the data. To do that, let's first import the libraries that our functions need.

In [None]:
#here we will split the data into train validation and test folders

import os
import shutil
import glob
from random import shuffle
import yaml

To reference the dataset when we train our model, we need a YAML file that gives the number of classes, their names, and the paths of the training and validation folders.

Note that if you are working on a local machine, you will need to change the paths, as they were written for Colab in the first place.

In [None]:
#this will make the yaml file we need to train yolov5

def make_yaml(src_path, dest_path):
    #open the config file provided by the website
    with open(src_path + '/data_config.yaml') as file:
        documents = yaml.full_load(file)

    #get the class names and the number of classes from this config
    item_names = documents["names"]
    item_nc = documents["nc"]

    #the paths are built from dest_path, so no doubled slash ends up in them
    lines = ["train: " + dest_path + "/train/images\n",
             "val: " + dest_path + "/valid/images\n",
             "test: " + dest_path + "/test/images\n", "\n",
             "nc: {}\n".format(item_nc),
             "names: {}".format(item_names)]

    #the with block closes data.yaml, so the content is flushed to disk
    with open(dest_path + "/data.yaml", "w") as data_file:
        data_file.writelines(lines) #write the lines we need for our training
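
For reference, here is a small sketch of the text that a dataset file for this folder layout is expected to contain. The helper name `render_data_yaml` and the class names `mask` and `no_mask` are hypothetical placeholders; the real names come from the `data_config.yaml` provided by the website.

```python
def render_data_yaml(root, names):
    """Build the text of a dataset file for the train/valid/test layout used here."""
    root = root.rstrip("/")  # avoid a doubled slash in the paths
    lines = [
        "train: %s/train/images" % root,
        "val: %s/valid/images" % root,
        "test: %s/test/images" % root,
        "",
        "nc: %d" % len(names),       # number of classes
        "names: %s" % list(names),   # class names, written as a YAML flow list
    ]
    return "\n".join(lines) + "\n"

# hypothetical class names, for illustration only
print(render_data_yaml("/content/data_ready/", ["mask", "no_mask"]))
```

Comparing this text with the generated `data.yaml` is a quick way to spot a wrong path before launching a training run.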

Then we need to split the pictures and labels into the different sets. To do that, we will use this function.

In [None]:
def split_pictures(coef_train,coef_valid,coef_test):

    src_path = "/content/data" #paths for the folders, can be replaced by where you put the data or where you want to put the data processed
    dest_path = "/content/data_ready"

    #list the image files (the names are shuffled further down)
    files_temp = glob.glob(src_path + "/images/train/" + "*.jpg")
    files = []

    #get rid of the extension to shuffle both folders with the same names
    for i in range(len(files_temp)):
        name_temp = files_temp[i]
        name_temp_2 = name_temp[:-4]
        name = name_temp_2[len(src_path + "/images/train/"):]
        files.append(name)

    shuffle(files)

    files_images = []
    files_labels = []
    files_images_path = src_path + "/images/train/"
    files_labels_path = src_path + "/labels/train/"

    for i in range(len(files)):
        name_temp = files[i]
        name_images = files_images_path + name_temp + ".jpg"
        name_labels = files_labels_path + name_temp + ".txt"
        files_images.append(name_images)
        files_labels.append(name_labels)

    #get the number of images for the different directories
    nb_files = len(glob.glob(src_path + "/images/train/" + "*.jpg"))
    nb_train = round(coef_train * nb_files)
    nb_valid = round(coef_valid * nb_files)
    nb_test = round(coef_test * nb_files)

    if ((nb_train + nb_valid + nb_test) > nb_files):
        nb_test -= 1
    elif ((nb_train + nb_valid + nb_test) < nb_files):
        nb_test += 1

    print("Number of files : ",nb_files)
    print("Number of train files : ",nb_train)
    print("Number of valid files : ",nb_valid)
    print("Number of test files : ",nb_test)

    #create directories
    dest_train = dest_path + "/train"
    dest_valid = dest_path + "/valid"
    dest_test = dest_path + "/test"

    for directory in (dest_path, dest_train, dest_valid, dest_test):
        try:
            os.mkdir(directory)
        except OSError:
            print ("Creation of the directory %s failed or it already exists" % directory)

    #create image and labels directories

    dest_train_images = dest_train + "/images"
    dest_train_labels = dest_train + "/labels"
    dest_valid_images = dest_valid + "/images"
    dest_valid_labels = dest_valid + "/labels"
    dest_test_images = dest_test + "/images"
    dest_test_labels = dest_test + "/labels"

    for directory in (dest_train_images, dest_train_labels,
                      dest_valid_images, dest_valid_labels,
                      dest_test_images, dest_test_labels):
        try:
            os.mkdir(directory)
        except OSError:
            print ("Creation of the directory %s failed or it already exists" % directory)

    #put the images in the directories
    for i in range(len(files_images)):
        if i < nb_train:
            if not os.path.exists(dest_train_images + '/' + os.path.basename(files_images[i])): #manage file duplicates
                shutil.move(files_images[i], dest_train_images)
        elif i < nb_valid + nb_train:
            if not os.path.exists(dest_valid_images + '/' + os.path.basename(files_images[i])):
                shutil.move(files_images[i], dest_valid_images)
        else:
            if not os.path.exists(dest_test_images + '/' + os.path.basename(files_images[i])):
                shutil.move(files_images[i], dest_test_images)

    #put the labels in the directories
    for i in range(len(files_labels)):
        if i < nb_train:
            if not os.path.exists(dest_train_labels + '/' + os.path.basename(files_labels[i])):
                shutil.move(files_labels[i], dest_train_labels)
        elif i < nb_valid + nb_train:
            if not os.path.exists(dest_valid_labels + '/' + os.path.basename(files_labels[i])):
                shutil.move(files_labels[i], dest_valid_labels)
        else:
            if not os.path.exists(dest_test_labels + '/' + os.path.basename(files_labels[i])):
                shutil.move(files_labels[i], dest_test_labels)

    #make yaml
    make_yaml(src_path,dest_path)

Now that our function is ready, let's use it!
You can change the coefficients if you want to, but a common split is 70% for training, 20% for validation and 10% for testing.

In [None]:
#make sure you keep the sum of the coefficients at 1
#usual training takes 0.7, 0.2, 0.1 for train, valid, test

split_pictures(0.7,0.2,0.1)

Number of files :  530
Number of train files :  371
Number of valid files :  106
Number of test files :  53
Creation of the directory /content/data_ready failed or it already exists
Creation of the directory /content/data_ready/train failed or it already exists
Creation of the directory /content/data_ready/valid failed or it already exists
Creation of the directory /content/data_ready/test failed or it already exists
Creation of the directory /content/data_ready/train/images failed or it already exists
Creation of the directory /content/data_ready/train/labels failed or it already exists
Creation of the directory /content/data_ready/valid/images failed or it already exists
Creation of the directory /content/data_ready/valid/labels failed or it already exists
Creation of the directory /content/data_ready/test/images failed or it already exists
Creation of the directory /content/data_ready/test/labels failed or it already exists
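
The counts printed above follow directly from the coefficients. As a minimal sketch of that arithmetic, the function below rounds the training and validation sizes and gives the remainder to the test set, so the three sizes always add up to the number of files. `split_counts` is a hypothetical helper, a simplified variant of the adjustment done inside `split_pictures`.

```python
def split_counts(nb_files, coef_train, coef_valid):
    """Return (train, valid, test) sizes that always add up to nb_files."""
    nb_train = round(coef_train * nb_files)
    nb_valid = round(coef_valid * nb_files)
    nb_test = nb_files - nb_train - nb_valid  # the test set takes the remainder
    if nb_test < 0:
        raise ValueError("train and valid sizes exceed the number of files")
    return nb_train, nb_valid, nb_test

# the 530 files from the run above
print(split_counts(530, 0.7, 0.2))
```

With 530 files this gives 371, 106 and 53, the same numbers as in the output above.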


Just some cleaning for the old folders

In [None]:
#delete data directory
!rm -r data

Make sure to use a GPU for the training; otherwise it will take forever, even for a few epochs.
If you're using Colab, go to Runtime > Change runtime type and select GPU as the hardware accelerator.
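
Before starting a long training run, it can be worth checking that a GPU is actually visible. The sketch below uses PyTorch, which the YOLOv5 requirements install; `describe_device` is a hypothetical helper name of ours.

```python
def describe_device():
    """Return a short description of the device PyTorch would train on."""
    try:
        import torch  # installed with the YOLOv5 requirements
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        return "GPU: " + torch.cuda.get_device_name(0)
    return "no CUDA GPU visible, training would run on the CPU"

print(describe_device())
```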


---


Now we can finally train our model!

In [None]:
#now we want to train our own model
#to make our model stronger, we will use transfer learning later on, but for now we start with a short, simple training run

%cd yolov5

!python3 train.py --data ../data_ready/data.yaml --cfg yolov5s.yaml --batch-size 15 --epochs 5


/content/yolov5
train: weights=yolov5s.pt, cfg=yolov5s.yaml, data=../data_ready/data.yaml, hyp=data/hyps/hyp.scratch.yaml, epochs=5, batch_size=15, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, linear_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
github: up to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v6.0-241-gf627bc5 torch 1.10.0+cu111 CUDA:0 (Tesla T4, 15110MiB)

hyperparameters: lr0=0.01, lrf=0.1, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0

For a first overview of our training, we can display the confusion matrix, the precision and the recall to get a first impression of the results.

In [None]:
#show the matrixes or other things we can find easily after training
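
The cell above is still a placeholder. Here is one possible sketch of how it could be filled in, under two assumptions: that each training run is saved in a numbered folder (`exp`, `exp2`, ...) under `runs/train`, as the `project=runs/train, name=exp` settings in the log suggest, and that the plot files have the names listed in `PLOT_NAMES`. All helper names are our own; check the run folder if a plot does not show up.

```python
from pathlib import Path
import re

# plot files assumed to be written into each training run folder
PLOT_NAMES = ["results.png", "confusion_matrix.png", "P_curve.png", "R_curve.png", "PR_curve.png"]

def run_number(run_dir):
    """Map 'exp' to 1, 'exp2' to 2, and so on; any other name maps to 0."""
    match = re.fullmatch(r"exp(\d*)", run_dir.name)
    if match is None:
        return 0
    return int(match.group(1) or 1)

def latest_run(runs_root):
    """Return the highest-numbered exp* folder under runs_root, or None."""
    runs = [p for p in Path(runs_root).glob("exp*") if p.is_dir()]
    return max(runs, key=run_number, default=None)

def available_plots(run_dir):
    """Return the files from PLOT_NAMES that exist in run_dir."""
    return [Path(run_dir) / name for name in PLOT_NAMES if (Path(run_dir) / name).is_file()]

def show_plots(run_dir):
    """Display the plots inline; IPython is imported here because it is only needed in a notebook."""
    from IPython.display import Image, display
    for plot in available_plots(run_dir):
        display(Image(filename=str(plot)))

# usage from inside the yolov5 folder:
# run = latest_run("runs/train")
# if run is not None:
#     show_plots(run)
```

Looking up the latest run this way also avoids hard-coding a run folder such as `exp2`, which changes every time the training cell is executed.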

Now you can try some detection with the model that we just trained together!
Here we will download some videos that we put on a drive so you can try the model on them, but you can also import your own pictures or videos and see how it does.
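
One thing to watch for when downloading from a drive: a Google Drive share link of the form `.../file/d/<id>/view?usp=sharing`, like the one in the next cell, points to a preview page rather than to the file itself, so a plain download of that URL is unlikely to return the video. The sketch below rewrites such a link into the `uc?export=download` form. Both the helper name `drive_direct_url` and the assumption that this URL form serves the file directly are ours; large files may still show a confirmation page first.

```python
import re

def drive_direct_url(share_link):
    """Turn a Google Drive 'file/d/<id>/view' share link into a direct download URL."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_link)
    if match is None:
        raise ValueError("not a Google Drive file share link: " + share_link)
    # assumed direct-download form; large files may still need a confirmation step
    return "https://drive.google.com/uc?export=download&id=" + match.group(1)

link = 'https://drive.google.com/file/d/1M4z1gaA0ESOF3QhoUHzcoEzi2pSc4c2M/view?usp=sharing'
print(drive_direct_url(link))
```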

In [None]:
#let's import a quick video to test our model

#this download does not work very well yet and is tedious to redo, so it is left commented out

# %cd ..
# !mkdir test_videos
# %cd test_videos
# link = 'https://drive.google.com/file/d/1M4z1gaA0ESOF3QhoUHzcoEzi2pSc4c2M/view?usp=sharing'
#
# file_name = wget.download(link)
#
# %cd ../yolov5

And now launch the detection on the data you chose

In [None]:
#now let's try a quick detect to see how this first model works on a test folder with prerecorded videos

!python3 detect.py --source /content/test_videos --weights /content/yolov5/runs/train/exp2/weights/best.pt

detect: weights=['/content/yolov5/runs/train/exp2/weights/best.pt'], source=/content/test_videos, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.25, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.0-241-gf627bc5 torch 1.10.0+cu111 CUDA:0 (Tesla T4, 15110MiB)

Fusing layers... 
Model Summary: 213 layers, 7015519 parameters, 0 gradients, 15.8 GFLOPs
video 1/4 (1/321) /content/test_videos/Video2.mp4: 384x640 4 masks, Done. (0.016s)
video 1/4 (2/321) /content/test_videos/Video2.mp4: 384x640 3 masks, Done. (0.015s)
video 1/4 (3/321) /content/test_videos/Video2.mp4: 384x640 3 masks, Done. (0.012s)
video 1/4 (4/321) /content/test_videos/Video2.mp4: 384x640 3 masks, Done. (0.012s)
video 1/4 (

You can go to the folder that the script indicates at the end of its execution and see the results for yourself!
Usually you can just go to yolov5/runs/detect/exp?? (where ?? is the run number) and you'll find your detection results there.

In [None]:
#you can also try out the model with your own webcam if you want; just note that you have to be working in a local environment for it to work properly

!python3 detect.py --source 0 --weights /content/yolov5/runs/train/exp2/weights/best.pt

Not so good? A bit disappointed? Did you think it was too easy to be true?
You were right.
As you can see, even a short training run already gives quite promising results, but to get better detection you have to train with a lot more data and more epochs.
The problem is that finding data and waiting for the model to train take a lot of time. That's why we did the training for you, so you won't have to suffer the pain we endured waiting for hours.


---


With that said, you can now download our own pretrained weights, trained beforehand by our team, to fully enjoy this project.

In [None]:
#link to download .pt

Now you can launch a new detection with the weights that we trained and see the results for yourself!

In [None]:
#detect with the new weights