# Facemask Live Detection Documentation
This notebook explains how I built a real-time face mask detection system using a webcam, OpenCV, and a trained CNN model. The script uses Haar cascades for face detection and PyTorch for mask classification.

> Note: This notebook does not launch the live detection feed. To do that, run `live_detect.py` directly. This notebook is only a walkthrough of how `live_detect.py` was built.

## 1. Introduction & Overview

This notebook walks through the real-time face mask detection system, built with a CNN model, OpenCV, and PyTorch. It focuses on the `live_detect.py` script, which connects webcam input to the trained classifier.

**What I learnt:**

- Loading and running a trained CNN model
- Real-time face detection using Haar cascades
- Preprocessing webcam frames for classification
- Displaying model predictions live on the video feed

**Project Structure:**

- `src/`: Source code for training, evaluation, and live detection
- `models/`: CNN model definition and trained weights
- `data/`: Dataset used for training and testing
- `notebooks/`: Documentation and demo notebooks
- `requirements.txt`: Dependencies
- `README.md`: Project instructions

## 2. Live Detection Script Overview

The script performs the following key tasks:
- Loads the trained CNN model
- Starts a webcam video stream
- Detects faces using OpenCV’s Haar cascade
- Applies preprocessing to each face
- Uses the CNN to classify mask presence
- Overlays the prediction on the video feed in real time

## 3. Setup & Model Loading

We begin by importing the required libraries, setting the device (CPU or GPU), and loading the trained CNN model.

The model is stored in `models/facemask_cnn.pth` and is defined in `models/cnn.py`.

The image transformation pipeline that resizes and converts images for model input is defined in Section 4, alongside the face detector.

### Imports

In [6]:
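import cv2
import torch
import torch.nn as nn              # nn.Module and layer classes used by the CNN below
import torch.nn.functional as F    # F.relu used in forward()
import sys
from torchvision import transforms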

### CNN Model Definition
> An explanation for this was outlined in `ModelDocumentation.ipynb`

In [7]:
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(64 * 56 * 56, 128)
        self.fc2 = nn.Linear(128, 2)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 64 * 56 * 56)
        x = F.relu(self.fc1(x))
        return self.fc2(x)
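
The `64 * 56 * 56` input size of `fc1` follows from the 224x224 input used later in this notebook: each 3x3 convolution with `padding=1` keeps the spatial size, and each 2x2 max pool halves it. The sketch below checks that arithmetic in plain Python. The helper names are hypothetical and are not part of `models/cnn.py`.

```python
def conv2d_out(size, kernel_size, padding=0, stride=1):
    """Output height/width of a square convolution (standard size formula)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def maxpool2d_out(size, kernel_size, stride):
    """Output height/width of a square max-pooling layer."""
    return (size - kernel_size) // stride + 1


def flattened_features(input_size=224):
    """Trace the spatial size through conv1 -> pool -> conv2 -> pool."""
    size = conv2d_out(input_size, kernel_size=3, padding=1)  # conv1 keeps the size
    size = maxpool2d_out(size, kernel_size=2, stride=2)      # first pool halves it
    size = conv2d_out(size, kernel_size=3, padding=1)        # conv2 keeps the size
    size = maxpool2d_out(size, kernel_size=2, stride=2)      # second pool halves it
    channels = 64                                            # conv2 output channels
    return channels, size, channels * size * size


print(flattened_features())  # (64, 56, 200704)
```

If the input resolution changes, the `fc1` input size has to change with it, which is why the transform must resize faces to 224x224.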

Loading a trained model:
> This block, and blocks from here onwards, are meant for explanation purposes only. They may not run.

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = CNN()
model.load_state_dict(torch.load("models/facemask_cnn.pth", map_location=device))
model = model.to(device)
model.eval()

We define the same CNN architecture used during training. We then load the saved model weights (`facemask_cnn.pth`) and switch the model to evaluation mode using `.eval()`.

## 4. Face Detection Setup

We use OpenCV’s pre-trained Haar Cascade classifier to detect faces in each frame from the webcam feed.

The detected face regions are passed to the model for mask classification.
We also define an image transformation pipeline that:
- Converts the image to PIL format
- Resizes it to 224x224 (expected by the CNN)
- Converts it into a PyTorch tensor with pixel values scaled to [0, 1]
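
To make the last step concrete, here is a plain-Python stand-in for what `transforms.ToTensor()` does to an 8-bit image: it reorders the data from height x width x channels to channels x height x width and divides each value by 255. This is only an illustration on nested lists; the function name `to_tensor_like` is hypothetical, and the real transform returns a `torch.Tensor`.

```python
def to_tensor_like(image_hwc):
    """Mimic ToTensor on an H x W x C image of 0-255 ints.

    Returns nested lists in C x H x W order with floats in [0.0, 1.0].
    """
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [[image_hwc[r][c][ch] / 255.0 for c in range(width)] for r in range(height)]
        for ch in range(channels)
    ]


# A tiny 2x2 RGB image: red, green / blue, white
tiny = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

chw = to_tensor_like(tiny)
print(len(chw), len(chw[0]), len(chw[0][0]))  # 3 2 2
print(chw[0])  # red channel: [[1.0, 0.0], [0.0, 1.0]]
```

Note that the pipeline has no `transforms.Normalize` step, so no mean/std normalization is applied. Whatever preprocessing was used during training should be matched here.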


In [10]:
import cv2

# Load Haar cascade for face detection
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# Image preprocessing transform
transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((224, 224)),
    transforms.ToTensor()
])

## 5. Webcam Feed and Inference Loop

We now start the webcam feed using OpenCV. For each video frame:

1. Detect faces using Haar cascade
2. Preprocess each detected face
3. Pass it through the CNN model
4. Display the prediction (mask / no mask) with confidence on the live frame

Press **'q'** to exit the video stream.

In [None]:
cap = cv2.VideoCapture(1)  # Use 0 if your webcam doesn't open
class_names = ['with_mask', 'without_mask']

while True:
    ret, frame = cap.read()
    if not ret:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=3, minSize=(30, 30), maxSize=(300, 300)
    )

    if len(faces) == 0:
        faces = face_cascade.detectMultiScale(
            gray, scaleFactor=1.05, minNeighbors=2, minSize=(20, 20)
        )

    for (x, y, w, h) in faces:
        face = frame[y:y+h, x:x+w]
        face_rgb = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
        face_tensor = transform(face_rgb).unsqueeze(0).to(device)

        with torch.no_grad():
            output = model(face_tensor)
            probs = torch.softmax(output, dim=1)
            confidence, predicted = torch.max(probs, 1)

        predicted_class = class_names[int(predicted.item())]
        confidence_val = float(confidence.item()) * 100
        mask_prob = float(probs[0, 0].item()) * 100
        no_mask_prob = float(probs[0, 1].item()) * 100

        display_text = f"{predicted_class}: {confidence_val:.1f}%"
        detailed_text = f"Mask: {mask_prob:.1f}% | No Mask: {no_mask_prob:.1f}%"
        color = (0, 255, 0) if predicted_class == 'with_mask' else (0, 0, 255)

        cv2.rectangle(frame, (x, y), (x+w, y+h), color, 2)
        cv2.putText(frame, display_text, (x, y - 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
        cv2.putText(frame, detailed_text, (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

    cv2.imshow("Facemask Detection", frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
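
One detail in the loop above: the labels are drawn at `y - 30` and `y - 10`, so for a face near the top of the frame they can fall outside the visible area. A possible workaround, sketched below, is to move the labels inside the box when there is not enough room above it. The helper `label_baselines` and the 20-pixel threshold are assumptions for illustration, not part of `live_detect.py`.

```python
def label_baselines(box_top, min_baseline=20):
    """Pick text baselines for the two label lines of one face box.

    box_top is the y coordinate of the top edge of the bounding box.
    min_baseline is an assumed minimum y for readable text at the top of the frame.
    """
    main_y = box_top - 30    # same offsets as the loop above
    detail_y = box_top - 10
    if main_y < min_baseline:
        # Not enough room above the box: draw both lines just inside its top edge.
        main_y = box_top + 20
        detail_y = box_top + 40
    return main_y, detail_y


print(label_baselines(200))  # (170, 190)
print(label_baselines(15))   # (35, 55)
```

In the loop, the returned values would replace `y - 30` and `y - 10` in the two `cv2.putText` calls.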

### Notes on Inference Loop

- `cap = cv2.VideoCapture(1)`  
  Starts the webcam stream. The argument is the camera device index; use `0` if your webcam doesn't open on `1`.

- `gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)`  
  Converts the frame to grayscale, since Haar cascades operate on grayscale intensity values.

- `detectMultiScale(...)`  
  Detects faces in the frame. If the first pass finds none, a second pass runs with looser settings (smaller `scaleFactor`, lower `minNeighbors`, smaller `minSize`, and no `maxSize`).

- `face = frame[y:y+h, x:x+w]`  
  Crops the face region from the frame based on bounding box coordinates.

- `transform(...)`  
  Applies resizing and tensor conversion to prepare the cropped face for the CNN.

- `torch.no_grad()`  
  Disables gradient tracking, which saves memory and computation during inference. It is not strictly required, but it is recommended whenever gradients are not needed.

- `probs = torch.softmax(output, dim=1)`  
  Converts logits to class probabilities.

- `cv2.rectangle(...)` and `cv2.putText(...)`  
  Draw a colored box around the face and display prediction text directly on the frame.

- `'q'` to quit  
  Pressing `'q'` stops the loop, releases the camera, and closes OpenCV windows.
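
To illustrate the softmax step from the notes above, the sketch below converts a pair of logits into class probabilities and picks the most likely class, mirroring `torch.softmax` followed by `torch.max`. It uses only the standard library, and the logit values are made up for the example rather than taken from the trained model.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                              # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ['with_mask', 'without_mask']
logits = [2.0, 0.0]                              # hypothetical model output for one face

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)   # index of the highest probability

print(class_names[best], f"{probs[best] * 100:.1f}%")  # with_mask 88.1%
```

Because there are only two classes, the two probabilities always sum to 100%, which is what the detailed label on the video feed shows.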


## 6. Summary & Takeaways

This notebook documented the real-time face mask detection system step-by-step, based on the `live_detect.py` script.

### What I built:
- A webcam-based application using OpenCV for live video feed
- Face detection using Haar cascades
- Face mask classification using a trained CNN model
- Real-time prediction display with confidence percentages

### Key learnings:
- How to preprocess webcam frames for deep learning models
- How to integrate PyTorch models with OpenCV pipelines
- How to use Haar cascades for efficient face detection
- How to visualize model predictions in real time

> This notebook is for documentation only and not meant to run live camera code inside Jupyter.
