
# Highway Vehicle Counting Practice Exercise


In this exercise, you will use the YOLO (You Only Look Once) object detection model to analyze a video of highway traffic.
Your task is to count the number of cars that are leaving the highway (coming toward the camera) on the right side and the number of cars that are joining the road on the left side. The video can be found under `Datasets/Example.mp4`.

## Objectives
- Load and process a video using OpenCV.
- Use the YOLO model to detect vehicles in each frame.
- Use OpenCV to manipulate the video.
- Track vehicles as they move through the video frames.
- Count the number of vehicles leaving the highway on the right side.
- Count the number of vehicles joining the road on the left side.

## Setup Environment

Before you begin, ensure you have the necessary libraries installed. You will need `opencv` and `ultralytics`, among others.
If they are not installed, install them first.
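
As a minimal sketch of that check, the snippet below reports which modules are missing from the current environment. The helper name `missing_packages` is hypothetical; it assumes the import names `cv2` (installed by the `opencv-python` package) and `ultralytics`.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# `cv2` is the import name of OpenCV; `ultralytics` provides the YOLO class.
required = ["cv2", "ultralytics"]
missing = missing_packages(required)
if missing:
    print("Missing modules:", ", ".join(missing))
    print("In a notebook cell, try: !pip install opencv-python ultralytics")
else:
    print("All required modules are available.")
```

Running this before the imports makes a missing dependency visible up front rather than as an `ImportError` halfway through the notebook.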

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
import cv2
import torch
from google.colab.patches import cv2_imshow

## Load the YOLO Model

You will first need to load the YOLO model. You can use a pre-trained YOLO model for this task.
Write the code to load the YOLO model below:


In [None]:
#model=torch.hub.load('ultralytics/yolov5', 'yolov5s')

Downloading: "https://github.com/ultralytics/yolov5/zipball/master" to /root/.cache/torch/hub/master.zip

requirements: Ultralytics requirements ['gitpython>=3.1.30', 'pillow>=10.3.0'] not found, attempting AutoUpdate...

YOLOv5 🚀 2024-8-30 Python-3.10.12 torch-2.4.0+cu121 CPU

Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...
100%|██████████| 14.1M/14.1M [00:00<00:00, 160MB/s]

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 


In [None]:
from ultralytics import YOLO
model = YOLO(model="yolov8n.pt")

## Prepare the Video Capture

Create a variable to capture the video frames; you can use `cv2.VideoCapture()` to achieve this.

In [None]:
Vid_path='/content/drive/MyDrive/T5/Week 6/6- Introduction to computer vision/Practice Exercise/Datasets/Example.mp4'
cap=cv2.VideoCapture(Vid_path)

## Get Video Information

You can use the `cv2` library to read the following properties from the `VideoCapture()` variable you created:
* `height`: Video's height in pixels.
* `width`: Video's width in pixels.
* `fps`: Video's frame rate (frames per second).

In [None]:
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # Height of the video
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))    # Width of the video
fps = cap.get(cv2.CAP_PROP_FPS)                   # Frames per second of the video

print(f"Video Height: {height}, Video Width: {width}, FPS: {fps}")

Video Height: 720, Video Width: 1280, FPS: 25.03699088856008


## Prepare Video Writer to Store the Output

Create a variable that uses `cv2.VideoWriter()` to save the video with the bounding boxes and the counted cars on both sides. The output video must use the same `fps`, `width`, and `height` as the input, and you need to specify the codec and the output path.

In [None]:
# Define the codec and create VideoWriter object
output_path = 'output_with_detections.avi'  # Path to save the output video
fourcc = cv2.VideoWriter_fourcc(*'XVID')  # Define the codec (XVID is widely used)
output_video = cv2.VideoWriter(output_path, fourcc, fps, (width, height))  # Initialize the video writer

In [None]:
# Define the codec and create VideoWriter object for MP4 format
output_path = 'output_with_detections.mp4'  # Path to save the output video in MP4 format
fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # Define the codec for MP4
output_video = cv2.VideoWriter(output_path, fourcc, fps, (width, height))  # Initialize the video writer

## Process Video Frames and Identify Vehicles on the Right and Left Sides

For each frame in the video, use the YOLO model to detect and track vehicles. You'll need to write a loop that processes each frame and applies the YOLO model.
In each frame, after detecting the vehicles, determine whether they are on the left or right side of the highway.
You can use the position of the bounding boxes provided by YOLO to do this.
* The video should display bounding boxes around the detected objects.
* The video should display the confidence alongside the object ID and class ID of each detected and tracked object.
* The video should display the number of vehicles on the left side.
* The video should display the number of vehicles on the right side.
* The video should display the counting line, that is, the line an object must cross to be counted.
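
The counting-line requirement can be sketched independently of YOLO. The class below is a hypothetical helper, not part of OpenCV or `ultralytics`: it assumes each vehicle arrives with a persistent track ID (for example from a tracker such as `model.track(frame, persist=True)` in `ultralytics`), a horizontal counting line at `line_y`, and a vertical split at `split_x` between the left and right sides. A track is counted once, when its box centre moves from one side of the line to the other in either direction.

```python
class LineCrossingCounter:
    """Count tracked objects once when their centre crosses a horizontal line."""

    def __init__(self, line_y, split_x):
        self.line_y = line_y      # y coordinate of the counting line (assumed value)
        self.split_x = split_x    # x coordinate separating left from right
        self.left_count = 0
        self.right_count = 0
        self._last_y = {}         # track ID -> centre y seen in the previous frame
        self._counted = set()     # track IDs that have already been counted

    def update(self, track_id, x1, y1, x2, y2):
        """Record one box for one frame; return True if it was counted just now."""
        cx = (x1 + x2) // 2
        cy = (y1 + y2) // 2
        prev_y = self._last_y.get(track_id)
        self._last_y[track_id] = cy
        if prev_y is None or track_id in self._counted:
            return False
        # A crossing means the centre changed sides of the line between frames.
        crossed = (prev_y < self.line_y) != (cy < self.line_y)
        if not crossed:
            return False
        self._counted.add(track_id)
        if cx < self.split_x:
            self.left_count += 1
        else:
            self.right_count += 1
        return True


# Usage with made-up boxes: track 1 moves down across the line on the right side.
counter = LineCrossingCounter(line_y=400, split_x=640)
counter.update(1, 700, 350, 800, 390)   # centre y = 370, above the line
counter.update(1, 700, 390, 800, 450)   # centre y = 420, below the line
print(counter.left_count, counter.right_count)  # prints "0 1"
```

Keeping the set of counted IDs is what prevents a vehicle from being counted again on every frame in which it stays visible.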

In [None]:
left_count = 0  # Unique tracked vehicles first seen on the left
right_count = 0  # Unique tracked vehicles first seen on the right
middle_line_position = width // 2  # Position to divide left and right in the frame
counted_ids = set()  # Track IDs already counted, so each object is counted only once

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Run YOLO with tracking so each object keeps the same ID across frames
    results = model.track(frame, persist=True)

    # Process each detection in the frame
    for result in results:
        for box in result.boxes:
            # Extract bounding box coordinates, confidence, class ID, and track ID
            x1, y1, x2, y2 = map(int, box.xyxy[0].cpu().numpy())  # Bounding box coordinates
            conf = box.conf.cpu().numpy()[0]  # Confidence of the detection
            cls = int(box.cls.cpu().numpy()[0])  # Class ID (e.g., 2 for cars)
            track_id = int(box.id[0]) if box.id is not None else None  # None if untracked

            # Draw bounding box around detected object
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

            # Display object (track) ID, class ID, and confidence
            label = f"ID: {track_id}, Class: {cls}, Conf: {conf:.2f}"
            cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)

            # Count each tracked object once, by the side on which it is first seen
            # (incrementing on every frame would count the same vehicle repeatedly)
            if track_id is not None and track_id not in counted_ids:
                counted_ids.add(track_id)
                if (x1 + x2) // 2 < middle_line_position:
                    left_count += 1  # Vehicle on the left side
                else:
                    right_count += 1  # Vehicle on the right side

    # Draw middle line to divide left and right
    cv2.line(frame, (middle_line_position, 0), (middle_line_position, height), (0, 0, 255), 2)

    # Display counts of vehicles on each side
    cv2.putText(frame, f"Left side vehicles: {left_count}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)
    cv2.putText(frame, f"Right side vehicles: {right_count}", (width - 300, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

    # Write processed frame to output video
    output_video.write(frame)

    # Display the frame (optional, can slow down processing in Colab)
    # cv2.imshow('Processed Video', frame)
    # if cv2.waitKey(1) & 0xFF == ord('q'):
    #     break

# Release video resources
cap.release()
output_video.release()
# cv2.destroyAllWindows()  # Uncomment if using cv2.imshow()
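
The loop above draws and counts every detection, whatever its class. A minimal sketch of restricting the count to road vehicles follows; the helper `keep_vehicles`, the tuple layout, and the `min_conf` threshold are assumptions for illustration, while the class indices are the COCO indices used by the pre-trained weights (2 for cars, as noted in the loop's comments).

```python
# COCO class indices for road vehicles in the pre-trained YOLO weights.
VEHICLE_CLASSES = {2: "car", 3: "motorcycle", 5: "bus", 7: "truck"}


def keep_vehicles(detections, min_conf=0.3):
    """Keep (class_id, confidence, box) tuples that are vehicles with conf >= min_conf."""
    return [d for d in detections if d[0] in VEHICLE_CLASSES and d[1] >= min_conf]


# Made-up detections in the same order of fields as extracted in the loop.
sample = [
    (2, 0.91, (100, 200, 180, 260)),   # car, kept
    (6, 0.40, (0, 300, 1280, 500)),    # train (COCO index 6), dropped
    (7, 0.25, (600, 220, 700, 300)),   # truck, dropped for low confidence
]
print(keep_vehicles(sample))  # prints [(2, 0.91, (100, 200, 180, 260))]
```

In the loop, the same test (`cls in VEHICLE_CLASSES`) could guard the drawing and counting steps. `ultralytics` also accepts a `classes` argument at inference time, which filters before the results are returned.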

In [None]:
left_count = 0  # Unique tracked vehicles first seen on the left
right_count = 0  # Unique tracked vehicles first seen on the right
middle_line_position = width // 2  # Position to divide left and right in the frame
counted_ids = set()  # Track IDs already counted, so each object is counted only once

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Run YOLO with tracking so each object keeps the same ID across frames
    results = model.track(frame, persist=True)

    # Process each detection in the frame
    for result in results:
        for box in result.boxes:
            # Extract bounding box coordinates, confidence, class ID, and track ID
            x1, y1, x2, y2 = map(int, box.xyxy[0].cpu().numpy())  # Bounding box coordinates
            conf = box.conf.cpu().numpy()[0]  # Confidence of the detection
            cls = int(box.cls.cpu().numpy()[0])  # Class ID (e.g., 2 for cars)
            track_id = int(box.id[0]) if box.id is not None else None  # None if untracked

            # Draw bounding box around detected object
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)

            # Display object (track) ID, class ID, and confidence
            label = f"ID: {track_id}, Class: {cls}, Conf: {conf:.2f}"
            cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 2)

            # Count each tracked object once, by the side on which it is first seen
            # (incrementing on every frame would count the same vehicle repeatedly)
            if track_id is not None and track_id not in counted_ids:
                counted_ids.add(track_id)
                if (x1 + x2) // 2 < middle_line_position:
                    left_count += 1  # Vehicle on the left side
                else:
                    right_count += 1  # Vehicle on the right side

    # Draw middle line to divide left and right
    cv2.line(frame, (middle_line_position, 0), (middle_line_position, height), (0, 0, 255), 2)

    # Display counts of vehicles on each side
    cv2.putText(frame, f"Left side vehicles: {left_count}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)
    cv2.putText(frame, f"Right side vehicles: {right_count}", (width - 300, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

    # Write processed frame to output video
    output_video.write(frame)

    # Display the frame (optional, can slow down processing in Colab).
    # cv2.imshow() is disabled in Colab, so use the cv2_imshow patch imported earlier.
    cv2_imshow(frame)

# Release video resources
cap.release()
output_video.release()


0: 384x640 11 cars, 1 bus, 1 train, 2 trucks, 160.3ms
Speed: 3.7ms preprocess, 160.3ms inference, 1.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 11 cars, 1 train, 2 trucks, 156.6ms
Speed: 4.7ms preprocess, 156.6ms inference, 1.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 11 cars, 1 bus, 1 train, 1 truck, 203.4ms
Speed: 4.4ms preprocess, 203.4ms inference, 2.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 11 cars, 1 bus, 1 train, 2 trucks, 211.8ms
Speed: 5.7ms preprocess, 211.8ms inference, 1.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 11 cars, 1 train, 1 truck, 241.1ms
Speed: 6.0ms preprocess, 241.1ms inference, 9.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 11 cars, 1 bus, 1 train, 1 truck, 368.0ms
Speed: 5.8ms preprocess, 368.0ms inference, 1.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 11 cars, 1 bus, 1 train, 1 truck, 201.6ms
Speed: 6.6ms preprocess, 201.6ms inference, 1

DisabledFunctionError: cv2.imshow() is disabled in Colab, because it causes Jupyter sessions
to crash; see https://github.com/jupyter/notebook/issues/3935.
As a substitution, consider using
  from google.colab.patches import cv2_imshow


## Save and Submit Your Work

In [None]:
from google.colab import files

# Download the output video
files.download('output_with_detections.avi')


In [None]:
from google.colab import files

# Download the output video
files.download('output_with_detections.mp4')


In the actual exam, you will be asked to submit both the notebook and the output video.