
<a href="https://blog.cloudcommander.net" target="_parent"><img src="https://raw.githubusercontent.com/cloud-commander/hexoblog/master/cloud.png" alt="Visit my Blog">
</a>
<br> 
# <span style="font-family:Didot; font-size:3em;"> Cloud Commander </span>


<a href="https://colab.research.google.com/github/cloud-commander/face-mask-detection/blob/master/1_Prepare_Data_Part1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"></a>
&nbsp;&nbsp;&nbsp;&nbsp;
<a href="https://github.com/cloud-commander/face-mask-detection/blob/master/1_Prepare_Data_Part1.ipynb" target="_parent"><img src="https://img.shields.io/static/v1?logo=GitHub&label=&color=333333&style=flat&message=View%20on%20GitHub" alt="View in GitHub"></a>



# Automatic Face Image Annotation

We start off the data preparation phase by pre-processing images in our dataset for the unmasked category.

We have a selection of facial images and we need to draw a bounding box around each face. We could do that manually using a tool such as [labelImg](https://github.com/tzutalin/labelImg), but that would be slow, tedious and unnecessary for our purposes.

Instead, we will use the face_recognition library (which employs a CNN) to automatically detect faces and generate an accompanying XML file for each image containing the coordinates of the face. This method is far from foolproof, as it only works reliably with frontal images of faces, but it's a good starting point. To avoid errors further down the line, you should check the results manually.
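To make the idea concrete, here is a minimal sketch of how one detected face box could be turned into an XML annotation. It is not the code in the repository's `annotate.py`; the function names `box_to_voc_xml` and `annotate_image` are hypothetical, and the Pascal VOC layout (the format labelImg writes by default) is an assumption about the output format.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def box_to_voc_xml(image_name, width, height, box, label="UNMASKED"):
    """Build a Pascal VOC-style annotation tree for a single face box.

    `box` is (top, right, bottom, left), the order that
    face_recognition.face_locations() returns.
    """
    top, right, bottom, left = box
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image_name

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"

    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = label
    bndbox = ET.SubElement(obj, "bndbox")
    # Clamp the coordinates so the box never leaves the image.
    ET.SubElement(bndbox, "xmin").text = str(max(0, left))
    ET.SubElement(bndbox, "ymin").text = str(max(0, top))
    ET.SubElement(bndbox, "xmax").text = str(min(width, right))
    ET.SubElement(bndbox, "ymax").text = str(min(height, bottom))
    return ET.ElementTree(root)


def annotate_image(image_path, out_dir, label="UNMASKED"):
    """Detect a face in one image and write its XML file (hypothetical helper)."""
    import face_recognition  # third-party; imported lazily

    image = face_recognition.load_image_file(str(image_path))
    height, width = image.shape[:2]
    boxes = face_recognition.face_locations(image)
    if len(boxes) != 1:
        # Skip images with no face or several faces; review these by hand.
        return None
    tree = box_to_voc_xml(Path(image_path).name, width, height, boxes[0], label)
    out_path = Path(out_dir) / (Path(image_path).stem + ".xml")
    tree.write(str(out_path))
    return out_path
```

Skipping images where the detector does not find exactly one face is a design choice of this sketch: it means an image can end up without an XML file, so images and annotations need to be matched up before use.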



## Import required libraries

In [1]:
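# Fetch the shared constants, then the helper modules whose URLs they define
!wget https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/config/constants.py
from constants import *
!wget {ANNOTATE}
!wget {MOVE_FILES}

# Install third-party dependencies
!pip install face_recognition
!pip install wget

# Helper functions from the modules downloaded above
from annotate import *
from move_files import *

import os
import glob
from pathlib import Path
import shutil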


--2020-06-16 18:21:31--  https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/config/constants.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6059 (5.9K) [text/plain]
Saving to: ‘constants.py’


2020-06-16 18:21:31 (74.6 MB/s) - ‘constants.py’ saved [6059/6059]

--2020-06-16 18:21:32--  https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/utils/annotate.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2289 (2.2K) [text/plain]
Saving to: ‘annotate.py’


2020-06-16 18:21:33 (35.3 MB/s) - ‘

## Prepare dataset

### Connect to Google Drive

In [2]:
from google.colab import drive

drive.mount('/content/drive/')


Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive/


### Download faces dataset

In [3]:
%cd {DATASET_DIR_TMP1}
!wget https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/data/100_faces_320x320_dataset_set_1.zip
!unzip -j 100_faces_320x320_dataset_set_1.zip '*.jpg'
!rm 100_faces_320x320_dataset_set_1.zip

/content/Face-Mask-Detection/dataset/tmp1
--2020-06-16 18:22:48--  https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/data/100_faces_320x320_dataset_set_1.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2585344 (2.5M) [application/zip]
Saving to: ‘100_faces_320x320_dataset_set_1.zip’


2020-06-16 18:22:49 (15.0 MB/s) - ‘100_faces_320x320_dataset_set_1.zip’ saved [2585344/2585344]

Archive:  100_faces_320x320_dataset_set_1.zip
  inflating: 000N7AIAFT.jpg          
  inflating: 00H858UYSD.jpg          
  inflating: 00KPGHV40E.jpg          
  inflating: 00P2LUXJW3.jpg          
  inflating: 00PYA83V1P.jpg          
  inflating: 00SD82OK2A.jpg          
  inflating: 00TMYXA5DF.jpg          
  inflating: 00V5CZZSSO.jpg          
  inflati

## Annotate the images
Call the `annotate` function to detect the faces in our image folder and write an XML annotation file for each image.

In [4]:
annotate(DATASET_DIR_TMP1,DATASET_DIR_TMP2,"UNMASKED")
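Because automatic detection can miss or misplace a face, a quick programmatic sanity check is a useful complement to inspecting the results by eye. The sketch below is illustrative only: `check_annotation` is a hypothetical helper, and it assumes the XML files follow the Pascal VOC layout with a `size` element and one `object` per face.

```python
import xml.etree.ElementTree as ET


def check_annotation(xml_text):
    """Return a list of problems found in one VOC-style annotation string."""
    root = ET.fromstring(xml_text)
    problems = []

    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))

    objects = root.findall("object")
    if len(objects) != 1:
        problems.append(f"expected 1 face, found {len(objects)}")

    for obj in objects:
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        # A valid box has positive area and lies inside the image.
        if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
            problems.append(f"box out of bounds: {(xmin, ymin, xmax, ymax)}")
    return problems
```

Running this over every XML file in the annotation folder and printing the non-empty results gives a short list of files worth opening in labelImg.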

## Save Dataset

### Move files from temp to correct location

In [5]:
# Return a list of items that have both an image and an XML file
matches = compare_intersect(DATASET_DIR_TMP1, DATASET_DIR_TMP2)

# Move the images and the annotations from the temp location
move_files(DATASET_DIR_TMP1, DATASET_DIR_UNPREP_IMG, matches)
move_files(DATASET_DIR_TMP2, DATASET_DIR_UNPREP_ANNO, matches)

Files moved: 100
Files moved: 100
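The matching step keeps only images that received an annotation. The helpers used in the cell above come from the repository and their source is not shown here, so the following is only a rough sketch of the idea using hypothetical names (`stems_in_both`, `move_matching`): intersect the file names (without extension) found in the two folders, then move just those files.

```python
import shutil
from pathlib import Path


def stems_in_both(dir_a, dir_b):
    """Return the sorted base names (without extension) present in both folders."""
    stems_a = {p.stem for p in Path(dir_a).iterdir() if p.is_file()}
    stems_b = {p.stem for p in Path(dir_b).iterdir() if p.is_file()}
    return sorted(stems_a & stems_b)


def move_matching(src_dir, dst_dir, stems):
    """Move files whose base name is in `stems`; return how many were moved."""
    wanted = set(stems)
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = 0
    # Materialise the listing first so moving files does not disturb iteration.
    for path in list(Path(src_dir).iterdir()):
        if path.is_file() and path.stem in wanted:
            shutil.move(str(path), str(dst / path.name))
            moved += 1
    return moved
```

With this approach, an image for which no face was detected simply stays behind in the temporary folder, where it can be reviewed or annotated by hand.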


### Compress the dataset and save it to Google Drive


In [9]:
%cd /content/
!zip -r part1-datasetv1.zip {DATASET_DIR_UNPREP}
!gsutil cp part1-datasetv1.zip {DRIVE_DEV}

/content
  adding: content/Face-Mask-Detection/dataset/unprepared/ (stored 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/ (stored 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0GHTF562EI.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/00KPGHV40E.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0J87GRUOWJ.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0GMDP1DZOX.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0KGASNTJV4.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0E48SPR74Z.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0ENKTDGKFN.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0DI8PN1TGQ.jpg (deflated 0%)
  adding: content/Face-Mask-Detection/dataset/unprepared/images/0CV9DYBJPJ.jpg (deflated 0%)
  addin
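The same compress-and-copy step can also be done in pure Python, which may be handy outside Colab. This is a minimal sketch, not the notebook's method: `archive_and_copy` is a hypothetical helper, and it assumes the destination is an ordinary folder such as a mounted Drive directory.

```python
import shutil
from pathlib import Path


def archive_and_copy(src_dir, archive_stem, dest_dir):
    """Zip `src_dir` and copy the archive into `dest_dir`.

    `archive_stem` is the archive path without the ".zip" suffix. Keep it
    outside `src_dir` so the archive does not try to include itself.
    """
    archive = shutil.make_archive(str(archive_stem), "zip", root_dir=str(src_dir))
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(archive, str(dest)))
```

Note that `shutil.make_archive` stores paths relative to `root_dir`, whereas `zip -r` with an absolute path keeps the full directory prefix, so the two archives unpack to different folder layouts.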