# Masked Enforcerer – Face Mask Detection with YOLOv8

Author: **Jose Moreno**  
Course: **ITAI 1378 – Computer Vision & AI**  
Tier 1 Core Project: **Automated Mask Compliance Detection**

This notebook implements the Tier‑1 project described in the *Masked Enforcerer* proposal. It trains a YOLOv8 object detection model to classify people as **with mask**, **without mask**, or **mask worn incorrectly**, and then lets the user **upload three images** to test the trained model.

**High‑level pipeline:**
1. Set up the environment (PyTorch + Ultralytics YOLOv8).
2. Download / upload a public *Face Mask Detection* dataset (Kaggle, 3‑class).
3. Convert PASCAL VOC XML annotations to YOLO format.
4. Split data into **train (70%)**, **val (15%)**, **test (15%)**.
5. Train a **YOLOv8n** model (fast & light) on the dataset.
6. Evaluate basic metrics (YOLO’s training summary).
7. Let the user **upload three images** and run inference to label mask usage.


## 1. Environment setup

This notebook was developed on **Google Colab**, but it also runs in a local Jupyter environment as long as a GPU‑enabled PyTorch + Ultralytics stack is installed on the machine. The cell below installs the dependencies and reports the PyTorch version and whether a GPU is available.


In [6]:
# Install dependencies quietly (%%capture would also hide the print output below)
!pip install -q ultralytics kaggle gdown

import torch
from ultralytics import YOLO

print('PyTorch version:', torch.__version__)
print('CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('GPU name:', torch.cuda.get_device_name(0))
else:
    print('Running on CPU – training will be slower.')

## 2. Dataset: 3‑class Face Mask Detection (public)

For the final project I have decided to work with the public **Face Mask Detection** dataset (Kaggle, by `andrewmvd`), which contains:
- **853 images**
- **3 classes**: `with_mask`, `without_mask`, `mask_weared_incorrect`
- **PASCAL VOC** XML annotations with bounding boxes for each face

Because the Kaggle API requires personal credentials, **two workflows** were explored:

### Option A – Download automatically via Kaggle API (recommended)
1. Go to Kaggle → Account → *Create New API Token* → download `kaggle.json`.
2. Upload `kaggle.json` to `/content`.
3. Run the cell below (it will install Kaggle CLI, move the token, and download the dataset).
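
The three steps above could be scripted roughly as follows. This is a minimal sketch rather than the notebook's actual cell: the helper names (`install_kaggle_token`, `build_download_command`, `download_with_kaggle`) are hypothetical, the dataset slug `andrewmvd/face-mask-detection` is an assumption based on the dataset author's name, and the `kaggle` CLI is assumed to be installed already by the setup cell.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed Kaggle dataset slug (author/dataset-name); verify it on the dataset page.
KAGGLE_DATASET = "andrewmvd/face-mask-detection"


def install_kaggle_token(token_src, kaggle_dir=None):
    """Copy kaggle.json into the Kaggle config folder and restrict its permissions."""
    kaggle_dir = Path(kaggle_dir) if kaggle_dir else Path.home() / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)
    token_dst = kaggle_dir / "kaggle.json"
    shutil.copy2(Path(token_src), token_dst)
    token_dst.chmod(0o600)  # the Kaggle CLI warns if the token is readable by others
    return token_dst


def build_download_command(dataset=KAGGLE_DATASET, dest="."):
    """Build the CLI call that downloads the dataset zip into `dest`."""
    return ["kaggle", "datasets", "download", "-d", dataset, "-p", str(dest)]


def download_with_kaggle(token_src="/content/kaggle.json", dest="."):
    """Install the token, then download the dataset zip via the Kaggle CLI."""
    install_kaggle_token(token_src)
    subprocess.run(build_download_command(dest=dest), check=True)


# Example (needs a valid kaggle.json and network access):
# download_with_kaggle()
```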
-----------------------------------------------------
### Option B – Manual upload
Manually download the dataset zip file from the Kaggle website, then upload it to the notebook environment (Colab in this case).

-----------------------------------------------------

To make the project more accessible, the dataset zip was later uploaded to Google Drive with a public link, so the notebook can download it automatically instead of requiring a manual upload to Colab.


In [7]:
import os, zipfile, pathlib, gdown

DATASET_DIR = pathlib.Path('face-mask-detection')
DATA_ZIP_PATH = pathlib.Path('archive.zip')

# File ID from your shared Google Drive link (already public)
DRIVE_FILE_ID = "1peUij_PsdCdWfE5GV6ZPi1i6tH7Wu1qG"

def download_from_gdrive():
    """Download the face-mask-detection dataset ZIP from Google Drive (public link).

    This will download 'archive.zip' and save it in the current working directory.
    """
    url = f"https://drive.google.com/uc?id={DRIVE_FILE_ID}"
    print("Downloading dataset zip from Google Drive...")
    gdown.download(url, str(DATA_ZIP_PATH), quiet=False)
    print("Download complete. Saved to", DATA_ZIP_PATH.resolve())

print("Starting Google Drive dataset download automatically...")
download_from_gdrive()


Starting Google Drive dataset download automatically...
Downloading dataset zip from Google Drive...


Downloading...
From (original): https://drive.google.com/uc?id=1peUij_PsdCdWfE5GV6ZPi1i6tH7Wu1qG
From (redirected): https://drive.google.com/uc?id=1peUij_PsdCdWfE5GV6ZPi1i6tH7Wu1qG&confirm=t&uuid=01f58438-6578-4272-89d4-7be837aac006
To: /content/archive.zip
100%|██████████| 417M/417M [00:02<00:00, 140MB/s]

Download complete. Saved to /content/archive.zip





## 3. Unzip dataset and inspect structure

This cell looks for a **`.zip`** file in the current directory (for example, the Kaggle download) and
extracts it into a `face-mask-detection/` folder that contains `images/` and `annotations/`.
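
Once the archive is extracted, it can be worth confirming that every image has a matching XML annotation, since the conversion step later relies on matching file stems. The helper below is a hypothetical sketch (`find_unpaired` is not part of the notebook) and assumes the flat `images/` and `annotations/` layout described above.

```python
from pathlib import Path


def find_unpaired(images_dir, annotations_dir, image_exts=(".png", ".jpg")):
    """Return (image stems without XML, XML stems without image), both sorted."""
    images_dir, annotations_dir = Path(images_dir), Path(annotations_dir)
    # Sub-folders have an empty suffix, so they are skipped by the extension filter.
    image_stems = {
        p.stem for p in images_dir.iterdir() if p.suffix.lower() in image_exts
    }
    xml_stems = {p.stem for p in annotations_dir.glob("*.xml")}
    return sorted(image_stems - xml_stems), sorted(xml_stems - image_stems)


# Example usage after extraction:
# missing_xml, missing_img = find_unpaired(
#     "face-mask-detection/images", "face-mask-detection/annotations"
# )
# print("Images without annotation:", missing_xml)
# print("Annotations without image:", missing_img)
```

Two empty lists would indicate that images and annotations line up one to one.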


In [8]:
import os, zipfile, glob
from pathlib import Path

# Find any zip file if the dataset dir does not already exist
if not DATASET_DIR.exists():
    zip_candidates = [f for f in os.listdir('.') if f.lower().endswith('.zip')]
    if not zip_candidates:
        raise FileNotFoundError(
            'No .zip file found. Either run download_from_gdrive() first or manually upload the Kaggle dataset zip.'
        )

    zip_path = zip_candidates[0]
    print('Using zip file:', zip_path)

    with zipfile.ZipFile(zip_path, 'r') as zf:
        zf.extractall(DATASET_DIR)

    print('Extracted dataset to:', DATASET_DIR.resolve())
else:
    print('Dataset directory already exists at', DATASET_DIR.resolve())

# Inspect basic structure
for root, dirs, files in os.walk(DATASET_DIR):
    level = root.replace(str(DATASET_DIR), '').count(os.sep)
    indent = ' ' * 2 * level
    print(f"{indent}{os.path.basename(root)}/")
    subindent = ' ' * 2 * (level + 1)
    for f in files[:5]:  # show only a few files per directory
        print(f"{subindent}{f}")

Using zip file: archive.zip
Extracted dataset to: /content/face-mask-detection
face-mask-detection/
  annotations/
    maksssksksss356.xml
    maksssksksss33.xml
    maksssksksss722.xml
    maksssksksss484.xml
    maksssksksss833.xml
  images/
    maksssksksss490.png
    maksssksksss256.png
    maksssksksss402.png
    maksssksksss405.png
    maksssksksss3.png


## 4. Convert PASCAL VOC XML → YOLO format + Train/Val/Test split

The Kaggle dataset annotations are in **PASCAL VOC** XML format. YOLOv8 expects one `.txt` file per image, with each line containing:

```text
<class_id> <cx> <cy> <w> <h>
```

All coordinates are **normalized** to `[0, 1]` relative to the image width/height.
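
As a worked example of this normalization, the sketch below converts one VOC box (corner coordinates in pixels) into a YOLO label line. The function name `voc_to_yolo` and the sample box are illustrative assumptions, not values from the dataset.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert VOC pixel corners to YOLO normalized (cx, cy, w, h)."""
    cx = (xmin + xmax) / 2.0 / img_w   # box center x, as a fraction of image width
    cy = (ymin + ymax) / 2.0 / img_h   # box center y, as a fraction of image height
    bw = (xmax - xmin) / img_w         # box width, as a fraction of image width
    bh = (ymax - ymin) / img_h         # box height, as a fraction of image height
    return cx, cy, bw, bh


# Hypothetical box with corners (100, 50) and (300, 250) in a 400x500 image, class 0
cx, cy, bw, bh = voc_to_yolo(100, 50, 300, 250, img_w=400, img_h=500)
print(f"0 {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")
# prints: 0 0.500000 0.300000 0.500000 0.400000
```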

In this step the notebook will:
1. Map classes to indices:
   - `with_mask` → **0**
   - `without_mask` → **1**
   - `mask_weared_incorrect` → **2**
2. Randomly split the images into **train (70%)**, **val (15%)**, **test (15%)**.
3. Copy images into `images/{train,val,test}/` and create labels in `labels/{train,val,test}/`.
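
The split arithmetic can be checked in isolation. The sketch below uses hypothetical helpers (`split_counts`, `split_items`) and assumes the same 70/15/15 fractions; because the train and val sizes are truncated with `int()`, any remainder ends up in the test split.

```python
import random


def split_counts(n_total, train_frac=0.70, val_frac=0.15):
    """Return (n_train, n_val, n_test); the test split absorbs the rounding remainder."""
    n_train = int(train_frac * n_total)
    n_val = int(val_frac * n_total)
    n_test = n_total - n_train - n_val
    return n_train, n_val, n_test


def split_items(items, seed=42):
    """Shuffle deterministically with a local RNG, then slice into the three splits."""
    items = sorted(items)        # fixed starting order, so the seed fully decides the split
    rng = random.Random(seed)    # local RNG leaves the global random state untouched
    rng.shuffle(items)
    n_train, n_val, _ = split_counts(len(items))
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],
    }


print(split_counts(853))  # (597, 127, 129)
```

For the 853 images in this dataset, that gives 597 / 127 / 129 images.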


In [9]:
import xml.etree.ElementTree as ET
from pathlib import Path
import random, shutil

random.seed(42)

annotations_dir = next(DATASET_DIR.glob('**/annotations'))
images_dir = next(DATASET_DIR.glob('**/images'))

print('Annotations dir:', annotations_dir)
print('Images dir     :', images_dir)

# Class mapping for YOLO
CLASS_MAP = {
    'with_mask': 0,
    'without_mask': 1,
    'mask_weared_incorrect': 2
}

image_paths = sorted(list(images_dir.glob('*.png')) + list(images_dir.glob('*.jpg')))
print('Total images found:', len(image_paths))

random.shuffle(image_paths)
n_total = len(image_paths)
n_train = int(0.70 * n_total)
n_val = int(0.15 * n_total)
n_test = n_total - n_train - n_val

splits = {
    'train': image_paths[:n_train],
    'val': image_paths[n_train:n_train + n_val],
    'test': image_paths[n_train + n_val:]
}

print(f'Train: {len(splits["train"])}  Val: {len(splits["val"])}  Test: {len(splits["test"])}')

for split, paths in splits.items():
    img_out_dir = DATASET_DIR / 'images' / split
    lbl_out_dir = DATASET_DIR / 'labels' / split
    img_out_dir.mkdir(parents=True, exist_ok=True)
    lbl_out_dir.mkdir(parents=True, exist_ok=True)

    for img_path in paths:
        # Copy image
        shutil.copy2(img_path, img_out_dir / img_path.name)

        # Build XML path and parse
        xml_path = annotations_dir / f"{img_path.stem}.xml"
        if not xml_path.exists():
            print('Warning: missing annotation for', img_path.name)
            continue

        tree = ET.parse(str(xml_path))
        root_xml = tree.getroot()
        w = float(root_xml.find('./size/width').text)
        h = float(root_xml.find('./size/height').text)

        yolo_lines = []
        for obj in root_xml.findall('object'):
            cls_name = obj.find('name').text
            if cls_name not in CLASS_MAP:
                continue
            cls_id = CLASS_MAP[cls_name]

            bbox = obj.find('bndbox')
            xmin = float(bbox.find('xmin').text)
            ymin = float(bbox.find('ymin').text)
            xmax = float(bbox.find('xmax').text)
            ymax = float(bbox.find('ymax').text)

            # Convert to YOLO (normalized center x,y + width/height)
            cx = (xmin + xmax) / 2.0 / w
            cy = (ymin + ymax) / 2.0 / h
            bw = (xmax - xmin) / w
            bh = (ymax - ymin) / h

            yolo_lines.append(f"{cls_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")

        label_path = lbl_out_dir / f"{img_path.stem}.txt"
        with open(label_path, 'w') as f:
            f.write('\n'.join(yolo_lines))

print('VOC → YOLO conversion complete.')


Annotations dir: face-mask-detection/annotations
Images dir     : face-mask-detection/images
Total images found: 853
Train: 597  Val: 127  Test: 129
VOC → YOLO conversion complete.


## 5. Create YOLOv8 dataset config (`face_mask.yaml`)

YOLOv8 uses a small YAML file to describe where the images and labels live, and what the class names are.
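
One way to keep the class names in the YAML in sync with the label indices is to generate the `names:` block from the same mapping used during conversion. The helper `build_data_yaml` below is a hypothetical sketch under that assumption; it only builds the text and does not write any file.

```python
from pathlib import Path


def build_data_yaml(dataset_root, class_map):
    """Build the dataset YAML text from a {class_name: index} mapping."""
    lines = [
        f"path: {Path(dataset_root).resolve()}",
        "train: images/train",
        "val: images/val",
        "test: images/test",
        "",
        "names:",
    ]
    # Emit names ordered by index so the YAML matches the label files.
    for name, idx in sorted(class_map.items(), key=lambda kv: kv[1]):
        lines.append(f"  {idx}: {name}")
    return "\n".join(lines) + "\n"


class_map = {"with_mask": 0, "without_mask": 1, "mask_weared_incorrect": 2}
print(build_data_yaml("face-mask-detection", class_map))
```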


In [10]:
data_yaml_path = Path('face_mask.yaml')

data_yaml_content = f"""
path: {DATASET_DIR.resolve()}
train: images/train
val: images/val
test: images/test

names:
  0: with_mask
  1: without_mask
  2: mask_weared_incorrect
""".strip() + "\n"

with open(data_yaml_path, 'w') as f:
    f.write(data_yaml_content)

print('Wrote YOLO dataset config to', data_yaml_path.resolve())
print('\n=== face_mask.yaml ===')
print(data_yaml_content)


Wrote YOLO dataset config to /content/face_mask.yaml

=== face_mask.yaml ===
path: /content/face-mask-detection
train: images/train
val: images/val
test: images/test

names:
  0: with_mask
  1: without_mask
  2: mask_weared_incorrect



## 6. Train YOLOv8n on the face mask dataset

This step fine‑tunes a pre‑trained YOLOv8n model (small, fast) on the face mask dataset. This is where training proper begins; the number of epochs determines how many full passes the model makes over the training set.

You can adjust hyperparameters (epochs, image size, batch size) as needed. For Colab Free, **20–30 epochs** is usually a good balance between speed and performance; the cell below uses `epochs=10` for a quick run. Note that with `patience=10`, early stopping cannot trigger within a 10‑epoch run.
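
To get a feel for how batch size and epochs translate into training work, the arithmetic can be sketched as below. It assumes the 597 training images produced by the split and the `batch=16`, `epochs=10` settings used in this notebook; `iterations_per_epoch` is a hypothetical helper.

```python
import math


def iterations_per_epoch(n_images, batch_size):
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(n_images / batch_size)


n_train_images = 597  # size of the training split in this notebook
batch = 16
epochs = 10

per_epoch = iterations_per_epoch(n_train_images, batch)
print(per_epoch, per_epoch * epochs)  # 38 380
```

The 38 iterations per epoch agree with the `38/38` progress counter in the training log.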


In [11]:
from ultralytics import YOLO

# Load pre-trained YOLOv8n model
model = YOLO('yolov8n.pt')

results = model.train(
    data='face_mask.yaml',
    epochs=10,
    imgsz=640,
    batch=16,
    patience=10,
    verbose=True,
    project='runs',
    name='mask_yolov8n',
    exist_ok=True
)

print('Training complete. Best weights saved to runs/mask_yolov8n/weights/best.pt')


Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt to 'yolov8n.pt': 100% ━━━━━━━━━━━━ 6.2MB 99.6MB/s 0.1s
Downloading https://ultralytics.com/assets/Arial.ttf to '/root/.config/Ultralytics/Arial.ttf': 100% ━━━━━━━━━━━━ 755.1KB 26.5MB/s 0.0s
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt': 100% ━━━━━━━━━━━━ 5.4MB 120.7MB/s 0.0s
train: Scanning /content/face-mask-detection/labels/train... 597 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 597/597 889.7it/s 0.7s
val: Scanning /content/face-mask-detection/labels/val... 127 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 127/127 438.5it/s 0.3s
       1/10      2.76G        1.8      3.005      1.458         23        640: 100% ━━━━━━━━━━━━ 38/38 2.1it/s 17.8s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 4/4 1.0s/it 4.0s
       2/10   

## 7. Quick evaluation summary

YOLOv8 logs metrics like **mAP**, **precision**, and **recall** during training. In Colab, you can open the `runs/mask_yolov8n` folder to inspect:
- `results.png` – overall training curves (loss, mAP, precision, recall).
- `confusion_matrix.png` – class confusion.
- `PR_curve.png` – precision–recall curve.
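
Besides the plots, the final‑epoch numbers can also be read programmatically. The sketch below assumes the run folder contains a `results.csv` with one row per epoch and columns such as `metrics/mAP50(B)`; the exact column names can vary between Ultralytics versions, so treat them as an assumption. `read_last_epoch_metrics` is a hypothetical helper.

```python
import csv


def read_last_epoch_metrics(results_csv):
    """Return the last row of a results CSV as {column_name: float}."""
    with open(results_csv, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        raise ValueError(f"No epochs logged in {results_csv}")

    metrics = {}
    for key, value in rows[-1].items():
        if key is None:
            continue  # extra, unnamed columns
        try:
            # Some versions pad headers and values with spaces, so strip the key.
            metrics[key.strip()] = float(value)
        except (TypeError, ValueError):
            continue  # skip empty or non-numeric cells
    return metrics


# Example usage after training (path assumed from the training cell's settings):
# metrics = read_last_epoch_metrics("runs/mask_yolov8n/results.csv")
# print("mAP50:", metrics.get("metrics/mAP50(B)"))
```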

The cell below just prints where to find these artifacts.


In [12]:
from pathlib import Path

run_dir = Path('runs') / 'mask_yolov8n'
print('Training artifacts located at:', run_dir.resolve())
print('Check this folder for results.png, confusion_matrix.png, PR_curve.png, etc.')


Training artifacts located at: /content/runs/mask_yolov8n
Check this folder for results.png, confusion_matrix.png, PR_curve.png, etc.


## 8. Inference: upload three images and detect mask usage

**Important: if you have no sample images available, go to the main page of the repository, open the folder named "stock", and use any of the images in that folder for the test.**

Now that the model is trained, you can **upload any three images** containing people’s faces. The model will:
- Detect faces.
- Classify each detection as `with_mask`, `without_mask`, or `mask_weared_incorrect`.
- Save annotated images to a new `runs/detect/` folder.

**Instructions:**
1. Run the cell below.
2. Use the file picker to upload up to three images from your computer.
3. After inference, open the `runs/detect/` folder in the Colab file browser to view the labeled outputs.
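
The upload cell passes every uploaded file to the model, so nothing enforces the three‑image limit or the file type. A small filter such as the hypothetical `select_images` helper below could be applied to the uploaded file names first; the allowed extensions are an assumption based on the JPG/PNG prompt.

```python
from pathlib import Path

ALLOWED_EXTS = {".jpg", ".jpeg", ".png"}  # assumed from the "JPG/PNG" upload prompt


def select_images(filenames, limit=3):
    """Keep only image files (by extension) and cap the list at `limit` entries."""
    valid = [name for name in filenames if Path(name).suffix.lower() in ALLOWED_EXTS]
    return valid[:limit]


print(select_images(["a.JPG", "notes.txt", "b.png", "c.jpeg", "d.png"]))
# ['a.JPG', 'b.png', 'c.jpeg']
```

In the notebook, the input would be `list(uploaded.keys())` from the Colab upload dialog.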


In [13]:
from google.colab import files

# Load best model weights from training
inference_model = YOLO('runs/mask_yolov8n/weights/best.pt')

print('Please select up to three images (JPG/PNG)...')
uploaded = files.upload()

image_paths = list(uploaded.keys())
print('Running inference on:', image_paths)

results = inference_model.predict(
    source=image_paths,
    conf=0.35,
    imgsz=640,
    save=True
)

print('Inference complete. Check the latest folder inside runs/detect/ for annotated images.')


Please select up to three images (JPG/PNG)...


Saving 106467352-1585602933667virus-medical-flu-mask-health-protection-woman-young-outdoor-sick-pollution-protective-danger-face_t20_o07dbe.jpg to 106467352-1585602933667virus-medical-flu-mask-health-protection-woman-young-outdoor-sick-pollution-protective-danger-face_t20_o07dbe.jpg
Saving images (8).jpeg to images (8).jpeg
Saving istockphoto-1044252330-612x612.jpg to istockphoto-1044252330-612x612.jpg
Running inference on: ['106467352-1585602933667virus-medical-flu-mask-health-protection-woman-young-outdoor-sick-pollution-protective-danger-face_t20_o07dbe.jpg', 'images (8).jpeg', 'istockphoto-1044252330-612x612.jpg']
Inference complete. Check the latest folder inside runs/detect/ for annotated images.


In [14]:
# Class names in index order (must match face_mask.yaml):
class_names = ["with_mask", "without_mask", "mask_weared_incorrect"]

print("DETECTION RESULTS:\n")

for i, r in enumerate(results):
    print(f"Image {i+1}: {r.path}")

    boxes = r.boxes
    if boxes is None or len(boxes) == 0:
        print("  No detections.\n")
        continue

    for b in boxes:
        cls_id = int(b.cls[0])
        conf = float(b.conf[0])
        label = class_names[cls_id]

        print(f"  → {label}  (confidence: {conf:.2f})")

    print()  # blank line between images


DETECTION RESULTS:

Image 1: 106467352-1585602933667virus-medical-flu-mask-health-protection-woman-young-outdoor-sick-pollution-protective-danger-face_t20_o07dbe.jpg
  → with_mask  (confidence: 0.95)

Image 2: images (8).jpeg
  → with_mask  (confidence: 0.74)

Image 3: istockphoto-1044252330-612x612.jpg
  → without_mask  (confidence: 0.76)



## 9. Optional: batch evaluation on the held‑out test set

This optional cell runs inference on the **test split** and saves the annotated predictions to the `runs/mask_test/` folder. You can use this for more detailed error analysis if desired. It is not required for the project to function; however, because it covers the entire held‑out test split, it is a better basis for judging the model as a whole than the three uploaded images.
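
For a quick error‑analysis starting point, the predicted classes can be tallied across all test images. The sketch below works on plain lists of class indices so it runs on its own; `tally_detections` is a hypothetical helper, and building its input from the Ultralytics results (see the comment) is an assumption about how the prediction objects would be unpacked.

```python
from collections import Counter

CLASS_NAMES = {0: "with_mask", 1: "without_mask", 2: "mask_weared_incorrect"}


def tally_detections(class_ids_per_image):
    """Count detections per class name over a list of per-image class-id lists."""
    counts = Counter()
    for class_ids in class_ids_per_image:
        counts.update(CLASS_NAMES[int(c)] for c in class_ids)
    # Report every class, including those with zero detections.
    return {name: counts.get(name, 0) for name in CLASS_NAMES.values()}


# With the notebook's predictions, the input could be built as (assumption):
#   [r.boxes.cls.tolist() for r in test_results]
example = [[0, 0, 1], [], [2, 0]]  # made-up class ids for three images
print(tally_detections(example))
# {'with_mask': 3, 'without_mask': 1, 'mask_weared_incorrect': 1}
```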


In [15]:
test_images_dir = DATASET_DIR / 'images' / 'test'
print('Test images dir:', test_images_dir)

test_results = inference_model.predict(
    source=str(test_images_dir),
    conf=0.35,
    imgsz=640,
    save=True,
    project='runs',
    name='mask_test',
    exist_ok=True
)
print('Saved test predictions to runs/mask_test')

Test images dir: face-mask-detection/images/test
Saved test predictions to runs/mask_test
