# Exercise 1.1 

## Overview

This notebook solves the first task of Exercise 1: detecting cars in the provided video and drawing bounding boxes around them using frame differencing and background subtraction. It covers only the implementation and walks through it step by step. The theory behind these techniques is discussed in the report submitted alongside this notebook.

## Implementation

### Installing Dependencies

In [1]:
%pip install opencv-python numpy

Note: you may need to restart the kernel to use updated packages.


### Detecting moving cars

We start by creating a function that tracks the cars passing through the detection bounding box. The function follows these steps:

1. **Loading Video**: The video is loaded using the `load_video_file()` function from the `cv_utils.py` file, which internally uses the Python OpenCV library.

2. **Calculate Background Frame**: The background frame is estimated as the median of a sample of 200 random frames from the video, using the `get_median_frame()` function in the `cv_utils.py` file. Because a moving car covers any given pixel in only a few of the sampled frames, the median keeps the static scene and discards the cars.

3. **Reading Frames**: Then each frame of the video is read until the end or until the 'q' key is pressed.

4. **Grayscale Conversion**: A grayscale version of the current frame is created using the `cv2.cvtColor()` function. Grayscale images simplify the analysis by removing color information.

5. **Background Subtraction**: The background frame is subtracted from the current grayscale frame to isolate moving objects using the `subtract_background()` function from `cv_utils`.

6. **Preprocessing the Subtracted Frame**: The subtracted frame is preprocessed to enhance the moving objects:
   * **Thresholding**: A threshold is applied to the foreground mask to highlight moving objects.
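   * **Dilation**: The shapes in the foreground mask are dilated to fill dark gaps inside them and to merge nearby fragments of the same object.
   * **Erosion**: The boundaries of the shapes are then eroded to smooth the edges of the moving objects. The erosion kernel is smaller than the dilation kernel, so the shapes remain slightly enlarged.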

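7. **Car Detection**: Contours are extracted from the preprocessed foreground mask using the `cv2.findContours()` function, and a bounding box is computed for each contour with `cv2.boundingRect()`. A bounding box is recorded as a car only if the contour area exceeds `min_car_bounding_box_area` and at least 80% of the bounding box lies within the area of detection.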

8. **Drawing Bounding Boxes**: The detected bounding boxes are drawn on a copy of the original video frame, where the area of detection is highlighted with a red bounding box and the bounding boxes around detected cars are drawn in green.

9. **Rendering and Playback**: The modified frame with the bounding boxes is rendered in a window. The frames are updated based on the frame time of the video to maintain proper playback. If the `render_intermediate_frames` argument is set to `True`, the intermediate frames will be rendered in a separate window.
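The `cv_utils.py` helpers are not shown in this notebook, so the following is only a minimal sketch of the idea behind steps 2 and 5, not the actual implementation. The names `median_background` and `absolute_difference` are hypothetical, and the sketch works on a small synthetic list of grayscale frames instead of a video file.

```python
import numpy as np


def median_background(frames):
    """Estimate a static background as the per-pixel median of the frames."""
    stack = np.stack(frames, axis=0)
    return np.median(stack, axis=0).astype(np.uint8)


def absolute_difference(background, frame):
    """Per-pixel |frame - background|, computed without uint8 wrap-around."""
    diff = frame.astype(np.int16) - background.astype(np.int16)
    return np.abs(diff).astype(np.uint8)


# Synthetic 8x8 grayscale "video": a constant background of 50 with a bright
# 2x2 "car" that moves one pixel to the right in each frame.
frames = []
for i in range(5):
    frame = np.full((8, 8), 50, dtype=np.uint8)
    frame[3:5, i:i + 2] = 200
    frames.append(frame)

background = median_background(frames)
foreground = absolute_difference(background, frames[0])

print(background[3])   # the car has vanished from the background estimate
print(foreground[3])   # only the car's pixels in frame 0 are non-zero
```

Each pixel is covered by the car in at most two of the five frames, so the median recovers the background value everywhere. Casting to a signed type before subtracting avoids the wrap-around that plain `uint8` subtraction would produce where the frame is darker than the background.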

In [2]:
import cv2
import cv_utils
import numpy as np
from rect import Rect
from vector import Vector

def track_cars_in_video(video_path: str,  
                        area_of_detection_bounding_box: Rect, 
                        min_car_bounding_box_area: int,
                        render_intermediate_frames: bool=False):
    # Loading video
    video, video_metadata = cv_utils.load_video_file(video_path)
    # Extracting background frame
    background_frame = cv_utils.get_median_frame(video_path, 200)
    background_gray_frame = cv2.cvtColor(background_frame, cv2.COLOR_BGR2GRAY)

    frame_time_ms = int(round(1000 / video_metadata["frame_rate"]))
    
    # Read until video is completed or we press 'q'
    while True:
        # Reading frame
        check, frame = video.read()

        if check:
            # Creating frame copy
            org_frame_copy = frame.copy()

            # Creating grayscale version of frame
            gray_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

            # Subtracting background frame from current frame
            foreground_frame = cv_utils.subtract_background(background_gray_frame, gray_frame)

            ## Preprocess subtracted frame
            # Thresholding the difference frame to obtain a binary foreground mask of the moving objects
            _, foreground_mask_1 = cv2.threshold(foreground_frame, 25, 255, cv2.THRESH_BINARY)
            # Dilate the shapes in the foreground mask to fill out black regions within the shape
            foreground_mask_2 = cv2.dilate(foreground_mask_1, np.ones((9, 9), np.uint8), iterations = 2)
            # Erode the boundary of the shapes in the foreground mask to smoothen out the edges of the moving objects
            foreground_mask_3 = cv2.erode(foreground_mask_2, np.ones((5, 5), np.uint8), iterations = 1)

            # Detect bounding boxes of cars from foreground_mask using contours method from opencv
            car_bounding_boxes = []
            contours, _ = cv2.findContours(foreground_mask_3, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

            for contour in contours:
                # Getting the position (x, y) and size (width, height) of the contour's bounding rectangle
                x, y, width, height = cv2.boundingRect(contour)
                contourBoundingRect = Rect(Vector(x, y), width, height)

                if cv2.contourArea(contour) > min_car_bounding_box_area and Rect.is_rect_within_rect(area_of_detection_bounding_box, contourBoundingRect, percentage_threshold=80):
                    # Recording bounding boxes of the cars if 80% of the car's bounding box is within the area of detection
                    car_bounding_boxes.append(contourBoundingRect)

            # Draw bounding box for area of detection
            cv_utils.draw_rect_in_frame(org_frame_copy, area_of_detection_bounding_box, (0, 0, 255))
            # Drawing bounding box for all the cars
            for car_bounding_box in car_bounding_boxes:
                cv_utils.draw_rect_in_frame(org_frame_copy, car_bounding_box, (0, 255, 0))
                
            # Rendering the frame with bounding boxes to the window
            cv2.imshow("tracking", org_frame_copy)

            if render_intermediate_frames:
                # Stacking pre-processing frames
                stacked_frame = np.vstack((
                    np.hstack((frame, cv2.cvtColor(gray_frame, cv2.COLOR_GRAY2BGR), cv2.cvtColor(foreground_frame, cv2.COLOR_GRAY2BGR))),
                    np.hstack((cv2.cvtColor(foreground_mask_1, cv2.COLOR_GRAY2BGR), cv2.cvtColor(foreground_mask_2, cv2.COLOR_GRAY2BGR), cv2.cvtColor(foreground_mask_3, cv2.COLOR_GRAY2BGR))),
                ))

                # Rendering the intermediate frames to the window
                cv2.imshow("intermediate-frames", stacked_frame)

            # Sleeping for a time equal to the frame time of the video to maintain proper playback of video
            if cv2.waitKey(frame_time_ms) & 0xFF == ord("q"):
                break
        else:
            break

    # Release the video object
    video.release()

    # Destroy all the windows
    cv2.destroyAllWindows()

    # Waiting for the windows to close properly
    cv2.waitKey(10)

With the tracking function in place, we need to declare the position and dimensions of the bounding box within which detection should work. As instructed in the task sheet, we only focus on cars moving on `Main Street`. This information is stored as an instance of the `Rect` class, which represents a rectangle in 2D space and provides utility functions for certain calculations. We also define the minimum contour area a detection must have to be considered a car, which helps to separate moving cars from smaller moving objects.

In [3]:
# Defining detection bounding box
area_of_detection_bounding_box = Rect(
    position=Vector(1, 300),
    width= 1037,
    height= 298
)

# Declaring minimum area of the calculated bounding boxes to be considered a car
min_car_bounding_box_area = 4000
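The `Rect` and `Vector` classes come from local modules that are not shown in this notebook. As an illustration of the 80% containment check, here is a minimal sketch under the assumption that the check compares the overlapping area with the area of the candidate box. The `Box` class and both function names are hypothetical stand-ins, and whether the real `Rect.is_rect_within_rect` uses `>=` or `>` is not known.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle; (x, y) is the top-left corner."""
    x: int
    y: int
    width: int
    height: int


def fraction_inside(outer: Box, inner: Box) -> float:
    """Fraction of inner's area that lies inside outer (0.0 to 1.0)."""
    inner_area = inner.width * inner.height
    if inner_area <= 0:
        return 0.0
    overlap_w = min(outer.x + outer.width, inner.x + inner.width) - max(outer.x, inner.x)
    overlap_h = min(outer.y + outer.height, inner.y + inner.height) - max(outer.y, inner.y)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0  # the rectangles do not overlap
    return (overlap_w * overlap_h) / inner_area


def is_mostly_inside(outer: Box, inner: Box, percentage_threshold: float = 80) -> bool:
    """True if at least percentage_threshold percent of inner lies inside outer."""
    return fraction_inside(outer, inner) * 100 >= percentage_threshold


# Same zone dimensions as the detection bounding box defined above.
zone = Box(1, 300, 1037, 298)
car_inside = Box(100, 350, 120, 80)       # entirely within the zone
car_straddling = Box(100, 260, 120, 80)   # top half lies above the zone

print(is_mostly_inside(zone, car_inside))       # True
print(is_mostly_inside(zone, car_straddling))   # False
```

A percentage threshold below 100 lets a car be detected while it is still entering or leaving the zone, at the cost of briefly accepting boxes that extend past the zone's edge.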

We then use the function and the initialized variables to track cars in the provided video.

In [5]:
track_cars_in_video('videos/Traffic_Laramie_1.mp4',
                    area_of_detection_bounding_box,
                    min_car_bounding_box_area,
                    render_intermediate_frames=False)