
<a href="https://blog.cloudcommander.net" target="_parent"><img src="https://raw.githubusercontent.com/cloud-commander/hexoblog/master/cloud.png" alt="Visit my Blog">
</a>
<br> 
# <span style="font-family:Didot; font-size:3em;"> Cloud Commander </span>


<a href="https://colab.research.google.com/github/cloud-commander/face-mask-detection/blob/master/1_Prepare_Data_Annotate_Images_Part1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open in Colab"></a>
&nbsp;&nbsp;&nbsp;&nbsp;
<a href="https://github.com/cloud-commander/face-mask-detection/blob/master/1_Prepare_Data_Annotate_Images_Part1.ipynb" target="_parent"><img src="https://img.shields.io/static/v1?logo=GitHub&label=&color=333333&style=flat&message=View%20on%20GitHub" alt="View in GitHub"></a>



## Automatic Face Image Annotation  ##

We start off the data preparation phase by pre-processing images in our dataset for the unmasked category.

We have a selection of facial images and need to draw a bounding box around each face. We could do that manually using a tool such as [labelImg](https://github.com/tzutalin/labelImg), but that would be slow, tedious and unnecessary for our purposes.

Instead, we will use a face detection algorithm (a Haar cascade) to detect faces automatically and generate an accompanying XML file for each image containing the coordinates of the face. This method is far from foolproof, as it only works reliably on full frontal images of faces, but it's a good starting point.
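
To make the idea concrete, here is a minimal sketch of Haar cascade face detection with OpenCV. It is not the code in `annotate.py`; the function names `to_corner_box` and `detect_faces` are hypothetical, and the `scaleFactor` and `minNeighbors` values are illustrative starting points rather than tuned settings. It assumes the `opencv-python` package, which ships the stock frontal face cascade.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax)


def to_corner_box(x: int, y: int, w: int, h: int, img_w: int, img_h: int) -> Box:
    """Convert an (x, y, width, height) detection to corner coordinates clamped to the image."""
    xmin = max(0, x)
    ymin = max(0, y)
    xmax = min(img_w, x + w)
    ymax = min(img_h, y + h)
    return xmin, ymin, xmax, ymax


def detect_faces(image_path: str) -> List[Box]:
    """Return one corner box per frontal face found by OpenCV's stock Haar cascade."""
    import cv2  # opencv-python; imported here so the helper above has no dependencies

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Haar cascades operate on single-channel images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    img_h, img_w = gray.shape

    # Illustrative parameters (assumption): adjust them for your own images.
    detections = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return [
        to_corner_box(int(x), int(y), int(w), int(h), img_w, img_h)
        for x, y, w, h in detections
    ]
```

Calling `detect_faces("some_face.jpg")` on a frontal portrait would typically return a single box; profile or partly hidden faces often return an empty list, which is the limitation mentioned above.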



### Import required libraries

In [1]:
!wget https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/config/constants.py
!wget https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/utils/annotate.py

!pip install wget

from annotate import *
from constants import * 

import os
import glob
from pathlib import Path
import shutil


--2020-06-10 21:33:12--  https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/config/constants.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4839 (4.7K) [text/plain]
Saving to: ‘constants.py’


2020-06-10 21:33:13 (67.4 MB/s) - ‘constants.py’ saved [4839/4839]

--2020-06-10 21:33:13--  https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/utils/annotate.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2725 (2.7K) [text/plain]
Saving to: ‘annotate.py’


2020-06-10 21:33:14 (52.9 MB/s) - ‘

### Prepare dataset ###

#### Connect to Google Drive

In [2]:
from google.colab import drive

drive.mount('/content/drive/')


Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly

Enter your authorization code:
··········
Mounted at /content/drive/


#### Download faces dataset and extract to temp folder


In [3]:
os.chdir(DATASET_DIR_TMP1)
!wget https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/data/1k_faces_00.zip
!unzip -j 1k_faces_00.zip '*.jpg'
!rm 1k_faces_00.zip

--2020-06-10 21:33:47--  https://raw.githubusercontent.com/cloud-commander/face-mask-detection/master/data/1k_faces_00.zip
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 23867089 (23M) [application/zip]
Saving to: ‘1k_faces_00.zip’


2020-06-10 21:33:48 (91.6 MB/s) - ‘1k_faces_00.zip’ saved [23867089/23867089]

Archive:  1k_faces_00.zip
  inflating: 0004STET6P.jpg          
  inflating: 000N7AIAFT.jpg          
  inflating: 00H858UYSD.jpg          
  inflating: 00KPGHV40E.jpg          
  inflating: 00P2LUXJW3.jpg          
  inflating: 00PYA83V1P.jpg          
  inflating: 00SD82OK2A.jpg          
  inflating: 00TMYXA5DF.jpg          
  inflating: 00V5CZZSSO.jpg          
  inflating: 00W5NPIX4S.jpg          
  inflating: 00XN46VW5G.jpg          
  inflating: 01
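
If you would rather not shell out to `unzip`, the same flattened extraction can be sketched with Python's standard `zipfile` module. The helper name `extract_flat` is hypothetical, and unlike `unzip -j` it silently overwrites files that share a name.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_flat(zip_path, dest_dir, suffix=".jpg"):
    """Extract members ending in `suffix` into dest_dir, dropping any folder structure."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            # Zip member names always use forward slashes.
            name = PurePosixPath(member.filename).name
            if member.is_dir() or not name.endswith(suffix):
                continue
            (dest / name).write_bytes(archive.read(member))
            extracted.append(name)
    return extracted
```

For this notebook the call would look like `extract_flat("1k_faces_00.zip", DATASET_DIR_TMP1)`.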

### Annotate the images
We call the `annotate` function to detect the faces in our image folder (`DATASET_DIR_TMP1`) and write an XML annotation file for each image to `DATASET_DIR_TMP2`, with `"UNMASKED"` passed as the category.

In [0]:
annotate(DATASET_DIR_TMP1,DATASET_DIR_TMP2,"UNMASKED")
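
For readers curious what such an XML file might contain, below is a minimal sketch that writes a Pascal VOC style annotation, the format labelImg produces. Whether `annotate.py` emits exactly these fields is an assumption; the function name `write_voc_annotation` is hypothetical, and the fixed `depth` of 3 assumes colour images.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_voc_annotation(image_name, width, height, boxes, label, out_dir):
    """Write a Pascal VOC style XML file for one image and return its path."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = image_name

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"  # assumption: 3-channel colour images

    # One <object> element per bounding box, all sharing the same label.
    for box in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(value)

    out_path = Path(out_dir) / (Path(image_name).stem + ".xml")
    ET.ElementTree(root).write(str(out_path), encoding="utf-8", xml_declaration=True)
    return out_path
```

A call such as `write_voc_annotation("face.jpg", 640, 480, [(10, 20, 110, 220)], "UNMASKED", "annotations")` would create `annotations/face.xml`, provided the `annotations` folder already exists.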

### Save Dataset

#### Prepare files for archiving

In [0]:
def move_files(source, destination):
    """Move every image and XML annotation file from source into destination."""
    files = os.listdir(source)
    for f in files:
        if f.endswith(('.png', '.jpg', '.jpeg', '.xml')):
            shutil.move(os.path.join(source, f), destination + "/")

# Move the images and the annotations from the temp folders to the unprepared dataset folders
move_files(DATASET_DIR_TMP1, DATASET_DIR_UNPREP_IMG)
move_files(DATASET_DIR_TMP2, DATASET_DIR_UNPREP_ANNO)
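
Because Haar cascade detection can miss faces, it may be worth checking that every image ended up with an annotation before archiving. The sketch below assumes each XML file shares its stem with its image (for example `0004STET6P.jpg` and `0004STET6P.xml`), which we have not verified against `annotate.py`; the helper name `images_missing_annotations` is hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def images_missing_annotations(image_dir, annotation_dir):
    """Return sorted image file names with no same-stem .xml file in annotation_dir."""
    annotated = {p.stem for p in Path(annotation_dir).glob("*.xml")}
    return sorted(
        p.name
        for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem not in annotated
    )
```

In this notebook the call would be `images_missing_annotations(DATASET_DIR_UNPREP_IMG, DATASET_DIR_UNPREP_ANNO)`; any names it returns are candidates for manual labelling or removal.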

#### Compress Dataset and Save to Google Drive


In [6]:
!zip -r unmasked-datasetv2.zip {DATASET_DIR_UNPREP}
!gsutil cp unmasked-datasetv2.zip {DRIVE_DEV}

  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/ (stored 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/ (stored 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/1HJCAS6U4B.jpg (deflated 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/2HWQ2T1E3O.jpg (deflated 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/03EZPUXP7Y.jpg (deflated 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/2RUF1V05NB.jpg (deflated 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/2W8WZ3XVFE.jpg (deflated 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/4BHJRM78NR.jpg (deflated 0%)
  adding: content/tensorflow/workspace/Face-Mask-Detection/dataset/unprepared/images/4MHTUTPWQ3.jpg (deflated 0%)
  adding: con