## Case Study - Comparing Efficacy for Real-Time Face Detection Models

Efficacy is a crucial factor for the success of a real-time face detection system.
In this case study, we compare the efficacy of three face detection methods:

- HOG (Histogram of Oriented Gradients)
- Haar Cascade
- DNN (deep neural network)

Run the following code block to set up the imports:

In [2]:
# Importing packages for CV2, DLIB and Plotting 
import cv2
import dlib
import matplotlib.pyplot as pyplot

# For benchmarking the process of generating the graph
import time

# For creating the progress bar when looping over the dataset
from tqdm.notebook import tqdm


### 1. Code snippets for each model

The following code defines one function per face detection model. The code for the individual models can be found within `src/models/code/` in the `OpenCV_Server` repository. Each file has a function that is called on a single frame. The output is either a list of bounding boxes (`Rect`) or `None`. The functions are copied into this file for the sake of simplicity.

Run the cell to define the functions:

In [3]:
# Haar Cascade
def detect_face_haar(img, detectMultipleFaces=False, scale=1.1, neighbors=10, size=50):
    """Detect a face in an image using a pre-trained Haar Cascade model.

    The model has been trained by OpenCV.
    See: https://opencv.org/

    Args:
        img (numpy.ndarray):
            Image read with cv2.imread, i.e. a NumPy array in BGR channel order.
        detectMultipleFaces (bool, optional):
            Toggle for returning every detected face instead of only the first. Defaults to False.
        scale (float, optional):
            Factor by which the image is shrunk at each step of the multi-scale search. Values closer to 1 test more scales but run slower. Defaults to 1.1.
        neighbors (int, optional):
            Number of neighboring candidate rectangles required before a face counts as detected. Defaults to 10.
        size (int, optional):
            Minimum face size in pixels, passed as minSize=(size, size). Smaller candidates are ignored. Defaults to 50.

    Returns:
        The first detected face as an (x, y, width, height) array, or None if
        no face is found. With detectMultipleFaces=True, all detected boxes
        (an empty sequence if no face is found).
    """

    # Turning the image into a grayscale image
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Printing the grayscale image dimensions
    # print(f"Gray-Scale Image dimension: ({gray_image.shape})")

    # Loading the classifier from a pretrained dataset (reloaded on every call)
    face_classifier = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )

    # Performing the face detection
    faces = face_classifier.detectMultiScale(
        gray_image, scaleFactor=scale, minNeighbors=neighbors, minSize=(size, size)
    )

    # Return all detections, or only the first one (None if there are none)
    if detectMultipleFaces:
        return faces
    return faces[0] if len(faces) > 0 else None


# HOG
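def detect_face_hog(img, detectMultipleFaces=False):
    """Detect a face in an image using dlib's HOG-based frontal face detector.

    Args:
        img (numpy.ndarray): Image in BGR channel order, as read by cv2.imread.
        detectMultipleFaces (bool, optional): Return all detected faces instead of only the first. Defaults to False.

    Returns:
        The first detected face as a dlib.rectangle, or None if no face is
        found. With detectMultipleFaces=True, all detections (dlib.rectangles).
    """
    # Turning the image into a grayscale image
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Define the HOG detector from dlib
    hog_face_detector = dlib.get_frontal_face_detector()

    # Detect faces from the grayscale image (1 = upsample the image once)
    faces = hog_face_detector(gray_image, 1)

    # Return all detections, or only the first one (None if there are none)
    if detectMultipleFaces:
        return faces
    return faces[0] if len(faces) > 0 else None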


# DNN
def detect_face_dnn(img, net, framework="caffe", conf_threshold=0.7, detectMultipleFaces=False):
    """
    Detect faces in an image using a deep neural network (DNN).

    Parameters:
    - img: The input image (BGR, as read by cv2.imread).
    - net: The pre-trained DNN model for face detection.
    - framework: The framework used for the DNN model ("caffe" or "tensorflow").
    - conf_threshold: The confidence threshold for detecting faces.
    - detectMultipleFaces: Boolean flag to return all detected faces or just the first one.

    Returns:
    - A list of (x, y, width, height) bounding boxes, or, if detectMultipleFaces is False, the first bounding box (None if no face is detected).
    """
    frameHeight = img.shape[0]
    frameWidth = img.shape[1]
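    # Build a 300x300 input blob with per-channel mean subtraction.
    # The fifth argument (swapRB) is False for Caffe and True for TensorFlow.
    if framework == "caffe":
        blob = cv2.dnn.blobFromImage(img, 1.0, (300, 300), [104, 117, 123], False, False)
    else:
        blob = cv2.dnn.blobFromImage(img, 1.0, (300, 300), [104, 117, 123], True, False)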

    net.setInput(blob)
    detections = net.forward()
    bboxes = []
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence > conf_threshold:
            x1 = int(detections[0, 0, i, 3] * frameWidth)
            y1 = int(detections[0, 0, i, 4] * frameHeight)
            x2 = int(detections[0, 0, i, 5] * frameWidth)
            y2 = int(detections[0, 0, i, 6] * frameHeight)
            width = x2 - x1
            height = y2 - y1
            bboxes.append((x1, y1, width, height))

    if detectMultipleFaces:
        return bboxes  # Return all detected faces
    # Return the first face or None if no faces are detected
    return bboxes[0] if bboxes else None
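
The three functions return boxes in different shapes: the Haar function yields NumPy rows ordered `(x, y, width, height)`, the HOG function yields `dlib.rectangle` objects with `left()`, `top()`, `width()` and `height()` methods, and the DNN function yields plain tuples. When comparing the models it can help to bring these into one form. The helper below is a minimal sketch of that idea; the name `to_xywh` is hypothetical and not part of the `OpenCV_Server` repository.

```python
def to_xywh(box):
    """Convert a single detector result to an (x, y, w, h) tuple of ints.

    Hypothetical helper for this notebook; returns None when box is None.
    """
    if box is None:
        return None
    # dlib.rectangle exposes its geometry through methods
    if hasattr(box, "left") and hasattr(box, "top"):
        return (int(box.left()), int(box.top()), int(box.width()), int(box.height()))
    # Haar rows (NumPy arrays) and DNN tuples are already ordered x, y, w, h
    x, y, w, h = box
    return (int(x), int(y), int(w), int(h))
```

For example, `to_xywh(detect_face_hog(frame))` and `to_xywh(detect_face_dnn(frame, net))` would then be directly comparable, assuming `frame` and `net` have been loaded.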

### 2. The Dataset



In [5]:
# TODO: TEST THAT DATASET CAN BE LOADED IN

# Path to videos 
video_path = "../data/test_data/videos/"
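
One way to check that the dataset can be loaded is to list the video files found under `video_path` before running any model. The following is a sketch only: the function name `list_videos` and the set of file extensions are assumptions, since the notebook does not state which formats the test videos use.

```python
from pathlib import Path


def list_videos(folder, extensions=(".mp4", ".avi", ".mov")):
    """Return the video files directly inside `folder`, sorted by path.

    The extension list is an assumption; adjust it to the actual dataset.
    """
    folder = Path(folder)
    if not folder.is_dir():
        return []  # Missing folder: report no videos instead of raising
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


videos = list_videos("../data/test_data/videos/")
print(f"Found {len(videos)} video file(s)")
```

An empty result points to a wrong path or an unexpected file format.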

### 3. Metrics

This case study measures two things:

- FPS (frames per second)
- Memory usage

FPS measures how many frames each model processes per second, which indicates whether it can keep up with a real-time video stream. Memory usage shows how memory-heavy each algorithm is.

### 4. Measuring FPS
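
As a starting point, FPS can be estimated by timing how long a detector takes to process a sequence of frames and dividing the frame count by the elapsed time. The sketch below assumes a hypothetical helper `measure_fps` that accepts any callable taking a single frame, so it works with the three functions above.

```python
import time


def measure_fps(detector, frames):
    """Run `detector` on every frame and return frames processed per second.

    Hypothetical helper; `frames` can be any iterable of images.
    """
    count = 0
    start = time.perf_counter()
    for frame in frames:
        detector(frame)
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0:
        return 0.0  # Nothing was processed
    return count / elapsed if elapsed > 0 else float("inf")
```

Two caveats apply. If `frames` is a generator that decodes a video lazily, the decoding time is included in the result. The Haar and HOG functions above also construct their detector on every call, so that setup cost is part of each per-frame timing.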

### TODO: WRITE THIS


Run th

### 5. Measuring Memory Usage
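
One possible approach is Python's built-in `tracemalloc` module, which reports the peak amount of memory allocated while a function runs. This is a sketch under an important assumption: `tracemalloc` only sees allocations made through Python's allocators, so memory allocated inside native libraries such as OpenCV or dlib may not be counted. The helper name `measure_peak_memory` is hypothetical.

```python
import tracemalloc


def measure_peak_memory(func, *args, **kwargs):
    """Call `func` and return (result, peak traced bytes during the call)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

For instance, `measure_peak_memory(detect_face_haar, frame)` would return the detection together with the peak traced bytes, assuming `frame` is a loaded image. A process-level measurement would be needed to capture native allocations as well.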

### TODO: WRITE THIS


Run th

### 6. Plotting the results in graphs
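
A bar chart with one bar per model is a simple way to compare a single metric. The sketch below uses the same `pyplot` alias as the imports at the top of the notebook. The function name `plot_metric` is hypothetical, and the values are zero placeholders, not measurements.

```python
import matplotlib.pyplot as pyplot


def plot_metric(values, ylabel, title):
    """Draw one bar per model for a single metric and return the figure."""
    fig, ax = pyplot.subplots(figsize=(6, 4))
    ax.bar(list(values.keys()), list(values.values()))
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    return fig


# Placeholder values only, NOT measurements; replace with the real results
placeholder_fps = {"HOG": 0.0, "Haar": 0.0, "DNN": 0.0}
fig = plot_metric(placeholder_fps, "Frames per second", "FPS per model")
```

The same function could be reused for memory usage by passing a different dictionary and axis label.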

### Resources