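First, we need to download the sample data. Because the sample video is hosted on Google Drive, we use the "subprocess" library to call the gdown command-line tool, which downloads the file (gdown can be installed with "pip install gdown" if it is not already available).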

In [None]:
import subprocess  # Used to call the gdown CLI, which downloads the sample video from the Google Drive link

subprocess.run(["gdown", "--fuzzy", "https://drive.google.com/file/d/1nyeeqBJyDr2zphBDQ9ruh99JBdYm4nPH/view?usp=sharing", "-O", "test_video.mp4"])


Downloading...
From: https://drive.google.com/uc?id=1nyeeqBJyDr2zphBDQ9ruh99JBdYm4nPH
To: /home/happy/AIML-projects/face_detection/test_video.mp4
100%|██████████| 7.54M/7.54M [00:01<00:00, 4.41MB/s]


CompletedProcess(args=['gdown', '--fuzzy', 'https://drive.google.com/file/d/1nyeeqBJyDr2zphBDQ9ruh99JBdYm4nPH/view?usp=sharing', '-O', 'test_video.mp4'], returncode=0)

The download is complete, and we now have a sample video named "test_video.mp4".
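
By default, subprocess.run does not raise an error when the command fails, so it can help to verify the download before moving on. The following is a minimal sketch and not part of the original notebook: the helper name run_and_verify is hypothetical, and it assumes the command writes its result to a known path.

```python
import subprocess
from pathlib import Path


def run_and_verify(command, expected_file):
    """Run a command and confirm it produced a non-empty file.

    Hypothetical helper for illustration; returns the file size in bytes.
    """
    # check=True raises CalledProcessError on a non-zero exit code
    subprocess.run(command, check=True)

    path = Path(expected_file)
    if not path.is_file() or path.stat().st_size == 0:
        raise FileNotFoundError(f"Expected a non-empty file at {path}")
    return path.stat().st_size
```

For the download above, this could be called with the same gdown argument list as the command and "test_video.mp4" as the expected file.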

To build the detection pipeline, we first need to import the required libraries and packages:
1. cv2 (OpenCV)
2. NumPy
3. os
4. torch (PyTorch)
5. YOLO (You Only Look Once), from the ultralytics package


In [None]:
import cv2
import numpy as np
import os
import torch
from ultralytics import YOLO

cv2 (OpenCV): Used for video processing (reading, writing, and displaying video frames).

numpy: Handles numerical operations like manipulating image arrays and bounding box coordinates.

os: Provides file system helpers, used here to create the output directory.

torch: Provides deep learning capabilities to run and optimize YOLO models on CPU or GPU.

ultralytics: Contains the YOLO implementation for loading pre-trained models and performing object detection.
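
Before running the imports, it can be useful to check that these packages are installed. The sketch below is an optional addition, not part of the original notebook; the function name missing_modules is hypothetical. It uses the standard library's importlib.util.find_spec, which looks a module up without importing it. Note that the cv2 module is usually installed through the opencv-python package.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Module names used by this notebook
required = ["cv2", "numpy", "torch", "ultralytics"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```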

For this project, I am using the pretrained "yolov8m-face-lindevs.pt" model for face detection, which I downloaded from the lindevs GitHub repository.

In [21]:
# Load YOLOv8 face detection model
model = YOLO('yolov8m-face-lindevs.pt')

Let's define variables for the input and output video paths.

In [None]:
# Input and output video paths variables
input_video = 'test_video.mp4'
output_video = 'output_video.mp4'

Now we run inference on the input video using the pretrained YOLOv8m face model.

In [23]:
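# Run inference on the video, using the GPU when available and the CPU otherwise
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# stream=True returns a generator, so frames are processed lazily as we iterate
results = model.predict(source=input_video, device=device, stream=True)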

In [None]:
# Capture input video properties
cap = cv2.VideoCapture(input_video)
fps = cap.get(cv2.CAP_PROP_FPS)                         # Frames per second (kept as a float so rates like 29.97 are not truncated)
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))    # Frame width
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # Frame height
cap.release()
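
When a video cannot be opened, cap.get typically reports 0 for these properties, and a writer configured with zero FPS or a zero frame size will not produce a usable file. The sketch below shows one way to guard against that. It is an illustrative addition: the function name validate_video_properties is hypothetical and it works on plain numbers, so it does not depend on OpenCV.

```python
def validate_video_properties(fps, width, height):
    """Check video properties read from a capture object.

    Raises ValueError if any value is zero or negative; otherwise returns
    (fps, width, height) with fps as a float and the sizes as ints.
    """
    if fps <= 0:
        raise ValueError(f"Invalid FPS: {fps}")
    if width <= 0 or height <= 0:
        raise ValueError(f"Invalid frame size: {width}x{height}")
    return float(fps), int(width), int(height)
```

In this notebook it could be applied to the values read above, for example validate_video_properties(fps, frame_width, frame_height), before the video writer is created.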


Next, we make sure the output directory exists, creating it if necessary, and set up a video writer that stores the output video there.

In [25]:
# Ensure output directory exists
os.makedirs('runs/detect', exist_ok=True)

# Set up video writer to save output
fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # Codec for mp4 format
out = cv2.VideoWriter(f'runs/detect/{output_video}', fourcc, fps, (frame_width, frame_height))
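
To make the codec argument less opaque: a FourCC code is four ASCII characters packed into a single 32-bit integer, lowest byte first. The sketch below reproduces that packing in plain Python for illustration only. The function names fourcc_to_int and int_to_fourcc are hypothetical, and the notebook itself should keep using cv2.VideoWriter_fourcc.

```python
def fourcc_to_int(code):
    """Pack a four-character codec code into a 32-bit integer (lowest byte first)."""
    if len(code) != 4:
        raise ValueError("A FourCC code must have exactly four characters")
    value = 0
    for i, ch in enumerate(code):
        value |= (ord(ch) & 0xFF) << (8 * i)
    return value


def int_to_fourcc(value):
    """Unpack a 32-bit FourCC integer back into its four characters."""
    return "".join(chr((value >> (8 * i)) & 0xFF) for i in range(4))


print(hex(fourcc_to_int("mp4v")))  # → 0x7634706d
```

The reverse function can be handy for decoding the integer codec value that a capture object reports for an existing video.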


Now we process the video frame by frame. We set face_class_id = 0 so that only face detections are kept; whenever a face is found in a frame, a bounding box is drawn around it.

In [26]:
# Process frames and draw face-only boxes
for result in results:
    frame = result.orig_img  # Original frame
    face_class_id = 0        # Face class ID

    # Draw bounding boxes around faces only
    for box in result.boxes:
        if int(box.cls) == face_class_id:  # Filter face detections
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)  # Green bounding box

    # Write frame to output video
    out.write(frame)

# Release video writer
out.release()
print(f"✅ Face detection video saved as: runs/detect/{output_video}")



video 1/1 (frame 1/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 32.8ms
video 1/1 (frame 2/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 30.1ms
video 1/1 (frame 3/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 29.9ms
video 1/1 (frame 4/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 25.7ms
video 1/1 (frame 5/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 25.6ms
video 1/1 (frame 6/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 25.6ms
video 1/1 (frame 7/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 25.4ms
video 1/1 (frame 8/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no detections), 25.3ms
video 1/1 (frame 9/956) /home/happy/AIML-projects/face_detection/test_video.mp4: 384x640 (no de

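Let's check which classes the model is able to detect.


In [20]:
print(model.names)


{0: 'face'}


The model defines a single class, "face", with class ID 0, which matches the face_class_id used above. Note that this only lists what the model can detect; it does not by itself show that faces are present in the sample video.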
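
To confirm whether faces were actually detected in the video, one option is to record how many boxes were found in each frame and summarize the counts, since model.names only describes the classes the model knows. The sketch below is an illustrative addition: the function name summarize_detections is hypothetical, and the example counts are made up rather than taken from test_video.mp4. In the processing loop, the per-frame count could be collected with something like len(result.boxes), assuming the installed ultralytics version supports it.

```python
def summarize_detections(faces_per_frame):
    """Summarize a list of per-frame face counts (one int per frame)."""
    return {
        "total_frames": len(faces_per_frame),
        "frames_with_faces": sum(1 for n in faces_per_frame if n > 0),
        "total_faces": sum(faces_per_frame),
        "max_faces_in_frame": max(faces_per_frame, default=0),
    }


# Made-up counts for illustration only, not results from test_video.mp4
example_counts = [0, 0, 1, 2, 1, 0]
print(summarize_detections(example_counts))
```

If frames_with_faces is 0 for the whole video, no bounding boxes were drawn and the output video will look identical to the input.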

Now let's play the output video to check whether the bounding boxes are correct. If VLC is not already installed, open a terminal and install it with "sudo apt update && sudo apt install vlc". You can then play the video written by the loop above with the command "vlc runs/detect/output_video.mp4". (A file at "runs/detect/predict/test_video.avi" is typically created only when predict is also run with save=True.)


In [27]:
import shutil
# Move the video written by cv2.VideoWriter above into the working directory
shutil.move(f'runs/detect/{output_video}', 'output_video.mp4')

print("Output saved to: output_video.mp4")

Output saved to: output_video.mp4
