<a href="https://colab.research.google.com/github/cloud-commander/face-mask-detection/blob/master/utils/Annotate_Entire_Images.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


<a href="https://blog.cloudcommander.net" target="_parent"><img src="https://raw.githubusercontent.com/cloud-commander/hexoblog/master/cloud.png" alt="Visit my Blog">
</a>
<br> 
# <span style="font-family:Didot; font-size:3em;"> Cloud Commander </span>

# Pascal VOC Annotation Generation
Creates a Pascal VOC XML annotation file for each image in the supplied folders.

It assumes that each image has already been cropped to the object of interest, so the bounding box is set to cover the entire image.
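As a rough illustration of that assumption, the sketch below builds only the bounding-box element for a hypothetical 120x160 image. It uses Python's standard `xml.etree.ElementTree` rather than the `lxml` import used later in this notebook, and `full_image_box` is a made-up helper name, not part of the notebook.

```python
import xml.etree.ElementTree as ET

def full_image_box(width, height):
    """Return a <bndbox> element that spans the whole image."""
    box = ET.Element("bndbox")
    ET.SubElement(box, "xmin").text = "0"
    ET.SubElement(box, "ymin").text = "0"
    ET.SubElement(box, "xmax").text = str(width)
    ET.SubElement(box, "ymax").text = str(height)
    return box

# Hypothetical image size, chosen only for the example
box = full_image_box(120, 160)
print(ET.tostring(box, encoding="unicode"))
```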



## Connect to Google Drive

In [1]:
from google.colab import drive

drive.mount('/content/drive/')

Mounted at /content/drive/


## Define file locations, copy and extract

In [2]:
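# Shell-escaped path: it is substituted into the %cd and gsutil commands below.
# The raw string keeps the backslash without an invalid-escape warning.
GDRIVE_FOLDER = r"/content/drive/My\ Drive/'Machine Learning'/Datasets"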
FILE = "RMFD.zip"

In [3]:
%cd {GDRIVE_FOLDER}
!gsutil cp {FILE} /content
%cd /content/
!unzip -o {FILE}

Streaming output truncated to the last 5000 lines.
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0009.jpg  
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0010.jpg  
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0011.jpg  
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0012.jpg  
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0013.jpg  
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0014.jpg  
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0015.jpg  
  inflating: self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhouhuimin/1_0_zhouhuimin_0016.jpg  
  inflating: self-built-masked-face-recognition

## Rename files in folders
File names are only unique within each person's folder, so moving all the files into a single folder later could cause name collisions. Prefixing each file with the name of its parent folder avoids this.

In [4]:
import os

def rename(folder_path):
    """Prefix every file under folder_path with the name of its parent directory."""
    # Note: running this twice on the same folder adds the prefix twice.
    for path, subdirs, files in os.walk(folder_path):
        for name in files:
            existing_filename = os.path.join(path, name)
            current_dir_name = os.path.basename(path)
            new_filename = os.path.join(path, current_dir_name + "_" + name)
            os.rename(existing_filename, new_filename)

In [5]:
rename("/content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/")
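One way to check whether a folder tree is safe to flatten is to look for file names that occur in more than one directory. The sketch below is an assumption about how such a check could look: `find_duplicate_names` is a hypothetical helper, and the demonstration runs on a throwaway directory with made-up file names rather than on the dataset.

```python
import os
import tempfile
from collections import defaultdict

def find_duplicate_names(folder_path):
    """Map each file name found in more than one directory to its full paths."""
    seen = defaultdict(list)
    for path, subdirs, files in os.walk(folder_path):
        for name in files:
            seen[name].append(os.path.join(path, name))
    return {name: paths for name, paths in seen.items() if len(paths) > 1}

# Demonstration on a temporary tree with two folders sharing one file name
with tempfile.TemporaryDirectory() as root:
    for person in ("person_a", "person_b"):
        os.makedirs(os.path.join(root, person))
        open(os.path.join(root, person, "0_0_1.jpg"), "w").close()
    duplicates = find_duplicate_names(root)

print(sorted(duplicates))  # names that would collide in a single folder
```

An empty result for a dataset folder would suggest that its files can be moved into one directory without overwriting each other.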

## Generate Annotations
Generates a Pascal VOC XML annotation next to each image, using the full image as the bounding box.

In [7]:
# Generates XML annotations for images that have already been cropped

import os
from lxml import etree as ET
from PIL import Image

def annotate_entire_image(IMG_INPUT, CLASS):
	"""Write a Pascal VOC XML file next to every image under IMG_INPUT, labelled as CLASS."""
	file_counter = 0
	img_counter = 0

	for subdir, dirs, files in os.walk(IMG_INPUT):
		for file in files:

			file_counter += 1
			img_path=os.path.join(subdir, file)
			img_name=os.path.basename(img_path)

			if img_path.lower().endswith(('.png', '.jpg', '.jpeg')):
				# Read the size and channel count from the image itself. Unlike
				# unpacking np.shape(img) into three values, this also works for
				# grayscale images, which have no third dimension.
				with Image.open(img_path) as img:
					w, h = img.size
					bpp = len(img.getbands())
					
				subdir_path, subdir_name = os.path.split(subdir)
				root = ET.Element("annotation", verified="yes")
				ET.SubElement(root, "folder").text = CLASS

				ET.SubElement(root, "filename").text = img_name
				#ET.SubElement(root, "path").text = img_path

				source=ET.SubElement(root, "source")
				ET.SubElement(source, "database").text = "Cloud Commander"

				size=ET.SubElement(root, "size")
				ET.SubElement(size, "width").text = str(w)
				ET.SubElement(size, "height").text = str(h)
				ET.SubElement(size, "depth").text = str(bpp)

				ET.SubElement(root, "segmented").text = "0"

				obj=ET.SubElement(root, "object")
				ET.SubElement(obj, "name").text = CLASS
				ET.SubElement(obj, "pose").text = "Frontal"
				ET.SubElement(obj, "truncated").text = "0"
				ET.SubElement(obj, "difficult").text = "0"

				box=ET.SubElement(obj, "bndbox")
				ET.SubElement(box, "xmin").text = str(0)
				ET.SubElement(box, "ymin").text = str(0)
				ET.SubElement(box, "xmax").text = str(w)
				ET.SubElement(box, "ymax").text = str(h)

				tree = ET.ElementTree(root)
				tree.write(os.path.join(subdir, os.path.splitext(img_name)[0] + '.xml'))
				img_counter += 1

			else:
				# Record non-image files so they can be reviewed or deleted later
				with open("delete.txt", "a") as myfile:
					myfile.write(img_path+"\n")
	
	print(f"Result: {file_counter} files were processed and {img_counter} images were annotated")


In [8]:
annotate_entire_image("/content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/","MASKED")

Result: 2203 files were processed and 2203 images were annotated


In [9]:
annotate_entire_image("/content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/","UNMASKED")

Result: 90468 files were processed and 90468 images were annotated
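To spot-check the generated files, an annotation can be read back and its box compared with the recorded image size. This is only a sketch: `box_covers_image` is a hypothetical helper, it uses the standard-library `xml.etree.ElementTree` parser, and the sample XML is a hand-written stand-in that reuses the element names written by `annotate_entire_image`.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def box_covers_image(xml_path):
    """Return True if the first object's bndbox spans the full image size."""
    root = ET.parse(xml_path).getroot()
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    box = root.find("object/bndbox")
    coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
    return coords == [0, 0, width, height]

# Hand-written sample annotation (values are illustrative, not from the dataset)
SAMPLE = """<annotation verified="yes">
  <size><width>120</width><height>160</height><depth>3</depth></size>
  <object><name>MASKED</name>
    <bndbox><xmin>0</xmin><ymin>0</ymin><xmax>120</xmax><ymax>160</ymax></bndbox>
  </object>
</annotation>"""

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.xml")
    with open(sample_path, "w") as f:
        f.write(SAMPLE)
    ok = box_covers_image(sample_path)

print(ok)
```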


## Save

In [10]:
%cd /content/
!zip -r RMFD_masked_annotated.zip /content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/
!gsutil cp RMFD_masked_annotated.zip {GDRIVE_FOLDER}


/content
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/ (stored 0%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/huhaiquan/ (stored 0%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/huhaiquan/huhaiquan_0_0_2.xml (deflated 43%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/huhaiquan/huhaiquan_0_0_3.jpg (deflated 5%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/huhaiquan/huhaiquan_0_0_3.xml (deflated 43%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/huhaiquan/huhaiquan_0_0_2.jpg (deflated 4%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/yingcaier/ (stored 0%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_masked_face_dataset/yingcaier/yingcaier_0_0_4.jpg (deflated 11%)
  adding: content/sel

In [11]:
%cd /content/
!zip -r RMFD_unmasked_annotated.zip /content/self-built-masked-face-recognition-dataset/AFDB_face_dataset
!gsutil cp RMFD_unmasked_annotated.zip {GDRIVE_FOLDER}

Streaming output truncated to the last 5000 lines.
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/0_0_zhoumi_0025.jpg (deflated 3%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/0_0_zhoumi_0081.xml (deflated 43%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/1_0_zhoumi_0074.jpg (deflated 3%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/1_0_zhoumi_0143.jpg (deflated 3%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/1_0_zhoumi_0121.xml (deflated 43%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/1_0_zhoumi_0120.xml (deflated 43%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/0_0_zhoumi_0014.jpg (deflated 3%)
  adding: content/self-built-masked-face-recognition-dataset/AFDB_face_dataset/zhoumi/1_0_zh
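The `zip -r` calls above are given an absolute path, so every archive entry starts with `content/`, as the output shows. If entries relative to the dataset folder are preferred, one possible alternative is `shutil.make_archive` with `root_dir` and `base_dir`. The sketch below is an assumption rather than part of the original workflow: `zip_folder` is a hypothetical helper, demonstrated on a temporary directory with made-up names.

```python
import os
import shutil
import tempfile
import zipfile

def zip_folder(folder, archive_base):
    """Zip `folder` so entries start at the folder's own name; return the archive path."""
    folder = os.path.abspath(folder)  # abspath also drops any trailing slash
    parent, name = os.path.split(folder)
    return shutil.make_archive(archive_base, "zip", root_dir=parent, base_dir=name)

# Demonstration on a throwaway folder tree
with tempfile.TemporaryDirectory() as tmp:
    data = os.path.join(tmp, "dataset", "person_a")
    os.makedirs(data)
    open(os.path.join(data, "person_a_0_0_1.jpg"), "w").close()
    archive = zip_folder(os.path.join(tmp, "dataset"), os.path.join(tmp, "out"))
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()

print(names)  # entries begin with "dataset/", not with the absolute path
```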