# __Objective:__ Given an input video file, localize the faces of the characters and draw bounding boxes around them.

In [None]:
#downloading the video file as test_video.mp4
! gdown --fuzzy https://drive.google.com/file/d/1nyeeqBJyDr2zphBDQ9ruh99JBdYm4nPH/view?usp=sharing --o test_video.mp4

Downloading...
From: https://drive.google.com/uc?id=1nyeeqBJyDr2zphBDQ9ruh99JBdYm4nPH
To: /content/test_video.mp4
100% 7.54M/7.54M [00:00<00:00, 42.1MB/s]


### Approach/Steps

I am using OpenCV's Haar cascade classifier, documented [here](https://docs.opencv.org/3.4/db/d28/tutorial_cascade_classifier.html).

This requires importing the OpenCV library (`cv2`).

My approach is as follows:

1. Upload the video to Colab.
2. Open the video as an OpenCV `VideoCapture` object.
3. For each frame of the video:<br>
   3.1 Convert it to grayscale.<br>
   3.2 Detect faces in the grayscale frame with the `detectMultiScale` method of the `face_cascade` object.<br>
   3.3 Draw a rectangle around each detected face in the frame.<br>
   3.4 Save the processed frame to the `output_images` directory as a JPG image.

4. Download the files from the `output_images` folder. There will be as many files as there are frames in the video. Downloading that many files one by one in Colab is impractical, so I wrote code to download them all at once as a zip file, with some help from Google and Stack Overflow.

5. On my local system, I then used FFmpeg to create a video from the downloaded images (frames). I did this locally because my earlier attempt to produce the output video directly in Colab with OpenCV's `VideoWriter` object resulted in a video that would not load.



In [None]:
# imports
import cv2
import os
from zipfile import ZipFile

In [None]:
#Step 1
! gdown --fuzzy https://drive.google.com/file/d/1guTg2NijehjIWREaMhS8pz-B8L29VH5A/view?usp=sharing --o test_video.mp4

In [None]:
#Creating a directory to store the processed images with faces tracked
output_dir = "output_images"
os.makedirs(output_dir, exist_ok=True)

In [None]:
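# Load the pre-trained Haar cascade classifier for frontal faces bundled with cv2
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

# Step 2: open the video
video_path = "test_video.mp4"
video = cv2.VideoCapture(video_path)

# Step 3: process the video frame by frame
while True:
    ret, frame = video.read()
    if not ret:
        break  # no more frames (or the video could not be read)
    # Step 3.1: the cascade operates on grayscale images
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Step 3.2: detect faces; each result is an (x, y, w, h) box
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    # Step 3.3: draw a blue (BGR order) rectangle, 2 px thick, around each face
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (255, 0, 0), 2)

    # Step 3.4: save the processed frame as an image file
    # CAP_PROP_POS_FRAMES is the index of the next frame, so numbering normally starts at 1
    frame_number = int(video.get(cv2.CAP_PROP_POS_FRAMES))
    image_path = os.path.join(output_dir, f"frame_{frame_number:04d}.jpg")
    cv2.imwrite(image_path, frame)

# Release the video file handle
video.release()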




In [None]:
#Step 4
folder_path = '/content/output_images'

# Create a zip file with the folder name
zip_filename = os.path.basename(folder_path) + '.zip'
with ZipFile(zip_filename, 'w') as zip_file:
  for root, _, files in os.walk(folder_path):
    for filename in files:
      file_path = os.path.join(root, filename)
      zip_file.write(file_path, os.path.relpath(file_path, os.path.join(folder_path, '..')))

# Download the zip file using Colab files.download
from google.colab import files
files.download(zip_filename)
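
As an alternative to walking the folder by hand, the standard library's `shutil.make_archive` can produce the same kind of archive. The sketch below is only an illustration: the helper name `zip_folder` and its `out_dir` parameter are hypothetical, and the Colab-specific `files.download` call is left out so the snippet runs anywhere.

```python
import os
import shutil


def zip_folder(folder_path, out_dir="."):
    """Zip folder_path into <out_dir>/<folder name>.zip and return the archive path.

    Entries are stored with the folder name as a prefix
    (for example output_images/frame_0001.jpg), like the manual loop above.
    """
    folder_path = os.path.abspath(folder_path)
    parent, name = os.path.split(folder_path)
    base_name = os.path.join(os.path.abspath(out_dir), name)
    # root_dir is the directory the archive is built from; base_dir is the folder to include
    return shutil.make_archive(base_name, "zip", root_dir=parent, base_dir=name)
```

In Colab, the returned path could then be passed to `files.download`.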

For step 5, I used the following command:

`ffmpeg -r 30 -f image2 -s 640x480 -i output_images/frame_%04d.jpg -vcodec libx264 -crf 25 -pix_fmt yuv420p output.mp4`

Here `output_images` is the folder containing all the face-annotated frames. The command requests a resolution of 640x480 and a frame rate of 30 FPS for the output video.
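
The same command can also be assembled from Python, which makes the link between the frame filenames and FFmpeg's `frame_%04d.jpg` pattern explicit. This is a minimal sketch: the function names `build_ffmpeg_command` and `run_ffmpeg` are hypothetical, the option values are copied from the command above, and actually running it assumes `ffmpeg` is installed and on the `PATH`.

```python
import shlex
import subprocess


def build_ffmpeg_command(frames_dir="output_images", fps=30, size="640x480", output="output.mp4"):
    """Return the FFmpeg command from step 5 as an argument list."""
    return [
        "ffmpeg",
        "-r", str(fps),              # frame rate
        "-f", "image2",              # read a numbered image sequence
        "-s", size,                  # requested frame size
        "-i", f"{frames_dir}/frame_%04d.jpg",
        "-vcodec", "libx264",
        "-crf", "25",                # constant rate factor (quality setting)
        "-pix_fmt", "yuv420p",
        output,
    ]


def run_ffmpeg(cmd):
    """Run the command; raises CalledProcessError if FFmpeg exits with an error."""
    subprocess.run(cmd, check=True)


# The printf-style pattern matches the names written by f"frame_{frame_number:04d}.jpg"
print("frame_%04d.jpg" % 7 == f"frame_{7:04d}.jpg")

# Print the command as a shell string; call run_ffmpeg(...) to execute it
print(shlex.join(build_ffmpeg_command()))
```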

In [None]:
# result video
! gdown --fuzzy https://drive.google.com/file/d/1Hw-UbixS3GzXc_ThYlY6wf8DLcb5jGxu/view?usp=sharing --o output_video.mp4

Downloading...
From: https://drive.google.com/uc?id=1Hw-UbixS3GzXc_ThYlY6wf8DLcb5jGxu
To: /content/output_video.mp4
100% 6.10M/6.10M [00:00<00:00, 12.2MB/s]


## FAQs

#### Which video processing library and localization model did you use?

Video processing library: OpenCV <br>
Localization model: Pre-trained Haar cascade classifier for detecting frontal faces (part of OpenCV's object detection models)



#### What better approach might I have implemented?

First, I would like to increase the accuracy of my current approach. Beyond that, one option would be to build a new object detection model and train it for this problem statement. A better option, however, would be to apply transfer learning to one of the current state-of-the-art CNN models.

#### Explain some real-life use cases of object detection or localization.


1. Attendance monitoring systems: Log the attendance of employees or students by scanning their faces at entry and exit. Strictly speaking, this is object recognition rather than object detection.

2. Traffic rule monitoring systems: Identify the owners of law-breaking vehicles from their number plates; the number plate data can be obtained from the traffic department's CCTV footage.

3. Obstacle avoidance in autonomous vehicles: Self-driving cars can use object detection systems to classify, differentiate, and avoid obstacles, other vehicles, or even humans on the road.