# Face Mask Classification with Transfer Learning (ResNet18)

This notebook implements a full training pipeline for classifying face-mask usage
with the **Face Mask Detection** dataset from Kaggle.

We use:

- **Transfer learning** with `ResNet18` pretrained on ImageNet  
- PyTorch + TorchVision's new **multi-weight API**  
- A custom dataset that reads bounding boxes from VOC XML annotations  
- Dynamic face cropping inside `__getitem__`  
- A clean project structure with `data_utils.py` and `model_utils.py`

The goal is to build a model that assigns each cropped face to one of three classes:
1. `with_mask`
2. `without_mask`
3. `mask_weared_incorrect`

This notebook handles:
- downloading the dataset  
- parsing annotations  
- creating DataLoaders  
- building and training the model  
- visualising learning curves  
- final evaluation on the held-out test set  

In [4]:
import sys
sys.platform

'linux'

In [None]:
!ls /content

sample_data


In [5]:
from data_utils import download_kaggle_dataset, create_dataloaders
from model_utils import create_model, train_model, evaluate_model

import torch
import matplotlib.pyplot as plt

ModuleNotFoundError: No module named 'data_utils'

## 1. Download the Kaggle Dataset

The dataset is downloaded automatically using the Kaggle CLI.  
If the dataset already exists locally, it will be reused.

This step creates a directory containing:
- images/
- annotations/

where each annotation is a PASCAL VOC XML file containing bounding boxes and labels.
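The reuse-or-download behaviour described above could be sketched roughly as follows. This is not the code in `data_utils.py`: the function name `ensure_dataset` is hypothetical, and the dataset slug `andrewmvd/face-mask-detection` is an assumption, since the notebook does not state it. The sketch also assumes the Kaggle CLI is installed and API credentials are configured.

```python
import subprocess
from pathlib import Path

# Assumption: the notebook does not name the dataset slug; this is a guess.
DATASET_SLUG = "andrewmvd/face-mask-detection"


def ensure_dataset(root="data/face-mask-detection", slug=DATASET_SLUG):
    """Return the dataset directory, downloading it only if it is missing."""
    root = Path(root)

    # Reuse a previous download if both expected sub-folders are present.
    if (root / "images").is_dir() and (root / "annotations").is_dir():
        print(f"Using existing dataset at {root}")
        return root

    # Otherwise fetch and unzip the archive with the Kaggle CLI.
    root.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        ["kaggle", "datasets", "download", "-d", slug, "-p", str(root), "--unzip"],
        check=True,  # raise CalledProcessError if the CLI reports a failure
    )
    return root
```

Checking for both sub-folders, rather than only the root directory, avoids treating an interrupted download as a complete dataset.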

In [4]:
dataset_root = download_kaggle_dataset()
dataset_root

Using existing dataset at data\face-mask-detection


WindowsPath('data/face-mask-detection')

## 2. Create DataLoaders

We use a custom parsing function that reads each XML file and extracts:

- the image path  
- the bounding box for each face  
- the class label  

Each face becomes one training sample.
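As an illustration of that parsing step, here is a minimal sketch using only the standard library. The function name `parse_voc_annotation`, the dictionary keys, and the sample XML are assumptions made for this example; the actual helper in `data_utils.py` may be organised differently.

```python
import io
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_source, image_dir="images"):
    """Return one sample dict per <object> (face) in a PASCAL VOC XML file."""
    root = ET.parse(xml_source).getroot()
    filename = root.findtext("filename")

    samples = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # VOC stores corners as text; some tools write them as floats.
        bbox = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        samples.append(
            {
                "image_path": f"{image_dir}/{filename}",
                "bbox": bbox,
                "label": obj.findtext("name"),
            }
        )
    return samples


# Hypothetical annotation with two faces, only for demonstration.
SAMPLE_XML = """<annotation>
  <filename>example.png</filename>
  <object>
    <name>without_mask</name>
    <bndbox><xmin>79</xmin><ymin>105</ymin><xmax>109</xmax><ymax>142</ymax></bndbox>
  </object>
  <object>
    <name>with_mask</name>
    <bndbox><xmin>185</xmin><ymin>100</ymin><xmax>226</xmax><ymax>144</ymax></bndbox>
  </object>
</annotation>"""

faces = parse_voc_annotation(io.StringIO(SAMPLE_XML))
```

`ET.parse` accepts either a file path or a file-like object, so the same function works on the real files under `annotations/`.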

Inside the Dataset class, faces are **cropped dynamically** using PIL, then
transformed using the preprocessing associated with the pretrained weights.

This step returns:

- `train_loader`
- `val_loader`
- `test_loader`
- `class_to_idx` (mapping from class names to label indices)
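The dynamic cropping can be sketched as a small map-style dataset. The class name `FaceCropDataset` and the sample format (dicts with `image_path`, `bbox`, and `label` keys) are assumptions for this example, not the notebook's actual class. Any object with `__len__` and `__getitem__` can be passed to a PyTorch `DataLoader`, so the sketch depends only on Pillow.

```python
from PIL import Image


class FaceCropDataset:
    """Map-style dataset: one item per annotated face, cropped on access."""

    def __init__(self, samples, class_to_idx, transform=None):
        self.samples = samples            # list of dicts, one per face
        self.class_to_idx = class_to_idx  # e.g. {"with_mask": 1, ...}
        self.transform = transform        # optional preprocessing callable

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        sample = self.samples[index]
        # Open the full image lazily and keep only the face region.
        with Image.open(sample["image_path"]) as img:
            face = img.convert("RGB").crop(sample["bbox"])  # (xmin, ymin, xmax, ymax)
        if self.transform is not None:
            face = self.transform(face)
        return face, self.class_to_idx[sample["label"]]
```

In the notebook's setting, `transform` would presumably be the preprocessing bundled with the pretrained weights, for example `ResNet18_Weights.DEFAULT.transforms()` from TorchVision. Cropping at access time avoids writing a second copy of every face to disk.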

In [5]:
train_loader, val_loader, test_loader, class_to_idx = create_dataloaders(dataset_root)
class_to_idx

{'mask_weared_incorrect': 0, 'with_mask': 1, 'without_mask': 2}

## 3. Build a List of Class Names

`class_to_idx` maps class names → indices (e.g. `"with_mask": 1`).

For evaluation we need the inverse mapping in the correct order (index → name).

In [7]:
# Order the class names by their label index (index → name).
class_names = sorted(class_to_idx, key=class_to_idx.get)

class_names

['mask_weared_incorrect', 'with_mask', 'without_mask']

## 4. Create the Transfer Learning Model

We use `ResNet18` with pretrained ImageNet weights.  
The final fully connected layer is replaced with a custom classifier head:

`Linear(512 → 256) → ReLU → Dropout → Linear(256 → num_classes)`

Freezing the backbone is optional, but here we fine-tune all layers (`freeze_backbone=False`)
to increase performance on this relatively small dataset.

In [8]:
model = create_model(num_classes=len(class_to_idx),
                     backbone="resnet18",
                     pretrained=True,
                     freeze_backbone=False)

model

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
  

## 5. Train the Model

The training loop:

- moves batches to the GPU (if available)
- performs forward and backward passes
- updates model weights using Adam
- computes training and validation accuracy/loss each epoch

All metrics are stored in a `history` dictionary so we can plot them afterwards.
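The steps listed above can be sketched as follows. The names `run_epoch` and `fit`, the `history` keys, and the use of cross-entropy loss are assumptions for this example; the real `train_model` in `model_utils.py` may differ in its details.

```python
import torch
from torch import nn


def run_epoch(model, loader, criterion, device, optimizer=None):
    """One pass over `loader`; trains only when an optimizer is supplied."""
    training = optimizer is not None
    model.train(training)
    total_loss, correct, seen = 0.0, 0, 0

    with torch.set_grad_enabled(training):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            logits = model(images)
            loss = criterion(logits, labels)
            if training:
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
            # Accumulate sample-weighted loss and the number of correct predictions.
            total_loss += loss.item() * labels.size(0)
            correct += (logits.argmax(dim=1) == labels).sum().item()
            seen += labels.size(0)

    return total_loss / seen, correct / seen


def fit(model, train_loader, val_loader, num_epochs=10, lr=1e-3):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    criterion = nn.CrossEntropyLoss()  # assumed loss for 3-class classification
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    history = {"train_loss": [], "train_acc": [], "val_loss": [], "val_acc": []}

    for epoch in range(1, num_epochs + 1):
        train_loss, train_acc = run_epoch(model, train_loader, criterion, device, optimizer)
        val_loss, val_acc = run_epoch(model, val_loader, criterion, device)
        for key, value in zip(history, (train_loss, train_acc, val_loss, val_acc)):
            history[key].append(value)
        print(f"Epoch {epoch}/{num_epochs} "
              f"Train loss: {train_loss:.4f} acc: {train_acc:.3f} | "
              f"Val loss: {val_loss:.4f} acc: {val_acc:.3f}")

    return model, history
```

Sharing one `run_epoch` function between training and validation keeps the metric computation identical for both, so the curves plotted from `history` are directly comparable.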

In [9]:
model, history = train_model(
    model,
    train_loader,
    val_loader,
    num_epochs=10,
    lr=1e-3
)

Epoch 1/10 Train loss: 0.3246 acc: 0.892 | Val loss: 0.2460 acc: 0.934
Epoch 2/10 Train loss: 0.2525 acc: 0.922 | Val loss: 0.2525 acc: 0.918


KeyboardInterrupt: 