# **Track and Count Vehicles with YOLOv8**

This Jupyter notebook tracks and counts vehicles in video footage. It detects vehicles in each frame with YOLOv8, follows them across frames with ByteTrack, and counts those that cross a virtual line using the supervision library. Counts of this kind are useful for traffic flow management, urban planning, and automated surveillance systems.

## **GPU Status Check**

We begin by checking the availability and status of our GPU, which is crucial for the computationally intensive tasks of video processing and running the YOLOv8 model. The nvidia-smi command gives us a snapshot of the GPU's model, memory usage, and active processes, ensuring our setup is ready for the subsequent operations.

In [None]:
!nvidia-smi
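
If the notebook may also run on machines without an NVIDIA driver, the same check can be done from Python without raising an error. The sketch below is one possible approach; `gpu_available` is a hypothetical helper name, not part of any library used in this notebook.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if the nvidia-smi executable exists and exits successfully."""
    # hypothetical helper: nvidia-smi is only present when an NVIDIA driver is installed
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("GPU available:", gpu_available())
```

A `False` result does not stop the notebook from running, but inference on the CPU is likely to be much slower.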

## **Importing Libraries and Setting Up the Workspace**

This cell imports the Python libraries and modules used throughout the notebook and stores the current working directory in HOME, so that all later file paths can be built relative to it.

In [None]:
import os
import sys
from typing import List

import numpy as np
from IPython import display  # module import, so display.clear_output() works in later cells
from tqdm.notebook import tqdm

HOME = os.getcwd()
print(HOME)

## **Downloading Video for Analysis**

Ensure the working directory is set to HOME and download the vehicle-counting.mp4 video using a Google Drive link. This step is critical for making sure the necessary video file is present for our vehicle tracking and counting tasks.

In [None]:
%cd {HOME}
!wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1pz68D1Gsx80MoPg-_q-IbEdESEmyVLm-' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1pz68D1Gsx80MoPg-_q-IbEdESEmyVLm-" -O vehicle-counting.mp4 && rm -rf /tmp/cookies.txt

## **Setting the Source Video Path**

Define the path to the video file used for vehicle tracking and counting analysis, ensuring it's correctly located within the notebook's home directory for easy access during processing.

In [None]:
SOURCE_VIDEO_PATH = f"{HOME}/vehicle-counting.mp4"


## **Installing Ultralytics and Preparing the Environment**

Install the ultralytics package, clear the output to maintain a clean notebook, and verify the installation by checking the environment setup.

In [None]:
!pip install ultralytics

display.clear_output()

import ultralytics
ultralytics.checks()


## **Setting Up ByteTrack for Enhanced Vehicle Tracking**

This section installs ByteTrack, a high-performance multi-object tracker, and prepares it for integration with our vehicle detection and counting project.

In [None]:
%cd {HOME}
!git clone https://github.com/ifzhang/ByteTrack.git
%cd {HOME}/ByteTrack

!sed -i 's/onnx==1.8.1/onnx==1.9.0/g' requirements.txt

!pip3 install -q -r requirements.txt
!python3 setup.py -q develop
!pip install -q cython_bbox
!pip install -q onemetric

!pip install -q loguru lap thop

display.clear_output()

sys.path.append(f"{HOME}/ByteTrack")

import yolox
print("yolox.__version__:", yolox.__version__)

## **Integrating BYTETracker for Vehicle Tracking**

This section imports BYTETracker, the multi-object tracker from the ByteTrack framework, and defines a BYTETrackerArgs dataclass holding the settings passed to it, such as the tracking confidence threshold (track_thresh) and the number of frames a lost track is kept (track_buffer).



In [None]:
from yolox.tracker.byte_tracker import BYTETracker, STrack
from onemetric.cv.utils.iou import box_iou_batch
from dataclasses import dataclass

@dataclass(frozen=True)
class BYTETrackerArgs:
    track_thresh: float = 0.25
    track_buffer: int = 30
    match_thresh: float = 0.8
    aspect_ratio_thresh: float = 3.0
    min_box_area: float = 1.0
    mot20: bool = False
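
Because BYTETrackerArgs is declared with `frozen=True`, its fields cannot be changed after creation. To try other settings, one option is to build a new instance with `dataclasses.replace`. The sketch below repeats the dataclass so that it runs on its own; the override values (0.5 and 60) are arbitrary illustrations, not recommended settings.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class BYTETrackerArgs:
    track_thresh: float = 0.25
    track_buffer: int = 30
    match_thresh: float = 0.8
    aspect_ratio_thresh: float = 3.0
    min_box_area: float = 1.0
    mot20: bool = False


default_args = BYTETrackerArgs()
# replace() returns a new instance; the original stays untouched
strict_args = replace(default_args, track_thresh=0.5, track_buffer=60)

try:
    default_args.track_thresh = 0.5  # assignment on a frozen dataclass fails
    mutated = True
except FrozenInstanceError:
    mutated = False

print(default_args.track_thresh, strict_args.track_thresh, mutated)
```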

## **Installing a Specific Version of the Supervision Library**

To ensure compatibility and access to specific features required for vehicle tracking and counting, this step involves installing version 0.1.0 of the supervision library. Following the installation, the output is cleared to maintain a clean notebook workspace, and the installed version of supervision is verified.



In [None]:
!pip install supervision==0.1.0

display.clear_output()

import supervision
print("supervision.__version__:", supervision.__version__)

## **Importing Supervision Tools for Video Processing and Annotation**

This step involves importing various modules and classes from the supervision library that are essential for handling video data, drawing annotations, and processing detections in our vehicle tracking and counting project.



In [None]:
from supervision.draw.color import ColorPalette
from supervision.geometry.dataclasses import Point
from supervision.video.dataclasses import VideoInfo
from supervision.video.source import get_video_frames_generator
from supervision.video.sink import VideoSink
from supervision.notebook.utils import show_frame_in_notebook
from supervision.tools.detections import Detections, BoxAnnotator
from supervision.tools.line_counter import LineCounter, LineCounterAnnotator


## **Utility Functions for Processing Detections and Tracks**

This section introduces utility functions designed to facilitate the handling and association of detection results with tracked objects in our vehicle tracking and counting project. These functions convert detection and track information into a format suitable for further processing and matching.

In [None]:
def detections2boxes(detections: Detections) -> np.ndarray:
    return np.hstack((
        detections.xyxy,
        detections.confidence[:, np.newaxis]
    ))

def tracks2boxes(tracks: List[STrack]) -> np.ndarray:
    return np.array([
        track.tlbr
        for track
        in tracks
    ], dtype=float)

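# assigns a tracker ID to each detection by matching it with the track of highest IoU;
# detections without a matching track keep None
def match_detections_with_tracks(
    detections: Detections,
    tracks: List[STrack]
) -> list:
    if not np.any(detections.xyxy) or len(tracks) == 0:
        # one None per detection, so the result always lines up with detections
        return [None] * len(detections)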

    tracks_boxes = tracks2boxes(tracks=tracks)
    iou = box_iou_batch(tracks_boxes, detections.xyxy)
    track2detection = np.argmax(iou, axis=1)

    tracker_ids = [None] * len(detections)

    for tracker_index, detection_index in enumerate(track2detection):
        if iou[tracker_index, detection_index] != 0:
            tracker_ids[detection_index] = tracks[tracker_index].track_id

    return tracker_ids
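
To make the matching step concrete, here is a minimal sketch of the same idea in plain NumPy. It assumes boxes in xyxy format and uses a hypothetical `pairwise_iou` helper as a stand-in for `box_iou_batch`; the boxes and track IDs are made-up values for illustration.

```python
import numpy as np


def pairwise_iou(boxes_a: np.ndarray, boxes_b: np.ndarray) -> np.ndarray:
    """IoU between every box in boxes_a (N, 4) and boxes_b (M, 4), xyxy format."""
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    top_left = np.maximum(boxes_a[:, None, :2], boxes_b[None, :, :2])
    bottom_right = np.minimum(boxes_a[:, None, 2:], boxes_b[None, :, 2:])
    # negative width or height means the boxes do not overlap
    wh = np.clip(bottom_right - top_left, 0, None)
    intersection = wh[..., 0] * wh[..., 1]
    union = area_a[:, None] + area_b[None, :] - intersection
    return intersection / union


# made-up example data: two tracks and three detections
track_boxes = np.array([[0, 0, 10, 10], [20, 20, 30, 30]], dtype=float)
track_ids = [7, 9]
detection_boxes = np.array(
    [[21, 21, 31, 31], [100, 100, 110, 110], [1, 1, 11, 11]], dtype=float
)

iou = pairwise_iou(track_boxes, detection_boxes)
# for each track, index of the detection it overlaps most
best_detection = np.argmax(iou, axis=1)

tracker_ids = [None] * len(detection_boxes)
for track_index, detection_index in enumerate(best_detection):
    if iou[track_index, detection_index] > 0:
        tracker_ids[detection_index] = track_ids[track_index]

print(tracker_ids)
```

The second detection overlaps no track, so it keeps `None` and would be filtered out before counting.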


## **Specifying the YOLOv8 Model for Vehicle Detection**

This cell selects yolov8x.pt, the largest of the pretrained YOLOv8 detection checkpoints. It favors accuracy over speed, so a smaller checkpoint such as yolov8n.pt or yolov8s.pt may be preferable on limited hardware.

In [None]:
MODEL = "yolov8x.pt"


## **Loading and Optimizing the YOLOv8x Model for Vehicle Detection**

This section imports the YOLO class from the ultralytics package, loads the checkpoint named in MODEL, and calls fuse(), which merges convolution and batch-normalization layers to speed up inference.

In [None]:
from ultralytics import YOLO
model = YOLO(MODEL)
model.fuse()

## **Configuring Class IDs for Vehicle Detection**

This step stores the model's mapping from class IDs to class names and lists the class IDs of interest for vehicle detection. In the COCO label set used by the pretrained checkpoint, 2, 3, 5, and 7 correspond to car, motorcycle, bus, and truck.

In [None]:
# dict mapping class_id to class_name
CLASS_NAMES_DICT = model.model.names
# class_ids of interest - car, motorcycle, bus and truck
CLASS_ID = [2, 3, 5, 7]
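
As an illustration of how this list is used later, the sketch below filters a made-up array of predicted class IDs with `np.isin`. The class IDs and confidence values are invented for the example.

```python
import numpy as np

CLASS_ID = [2, 3, 5, 7]

# made-up predictions: class IDs and confidences for six detections
class_ids = np.array([0, 2, 7, 9, 3, 2])
confidence = np.array([0.91, 0.88, 0.75, 0.60, 0.52, 0.47])

# True where the predicted class is one of the vehicle classes
mask = np.isin(class_ids, CLASS_ID)
kept_class_ids = class_ids[mask]
kept_confidence = confidence[mask]

print(mask.tolist())
print(kept_class_ids.tolist())
```

The same boolean mask can be applied to every per-detection array (boxes, confidences, class IDs) so that they stay aligned.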

## **Initializing Video Processing and Detection**

This section outlines the steps to generate frames from the video, annotate detected vehicles, and display the processed frame. It involves setting up a frame generator, initializing a BoxAnnotator for drawing detections, performing a model prediction on the first frame, and visually presenting the results.

In [None]:
# create frame generator
generator = get_video_frames_generator(SOURCE_VIDEO_PATH)
# create instance of BoxAnnotator
box_annotator = BoxAnnotator(color=ColorPalette(), thickness=4, text_thickness=4, text_scale=2)
# acquire first video frame
iterator = iter(generator)
frame = next(iterator)
# model prediction on single frame and conversion to supervision Detections
results = model(frame)
detections = Detections(
    xyxy=results[0].boxes.xyxy.cpu().numpy(),
    confidence=results[0].boxes.conf.cpu().numpy(),
    class_id=results[0].boxes.cls.cpu().numpy().astype(int)
)
# format custom labels
labels = [
    f"{CLASS_NAMES_DICT[class_id]} {confidence:0.2f}"
    for _, confidence, class_id, tracker_id
    in detections
]
# annotate and display frame
frame = box_annotator.annotate(frame=frame, detections=detections, labels=labels)

%matplotlib inline
show_frame_in_notebook(frame, (16, 16))

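## **Configuring the Counting Line and Output Video Path**

This step defines the endpoints, in pixel coordinates, of the line used for vehicle counting and sets the path for saving the annotated video. The x-coordinate 3840-50 assumes a frame width of 3840 pixels, so adjust the endpoints for videos of other resolutions.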



In [None]:
# settings
LINE_START = Point(50, 1500)
LINE_END = Point(3840-50, 1500)

TARGET_VIDEO_PATH = f"{HOME}/vehicle-counting-result.mp4"
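
For intuition about what counting against a line involves, the sketch below shows one common geometric test: the sign of a 2D cross product tells which side of the line a point lies on, and a change of sign between two frames indicates a crossing. This is a general technique and not necessarily how supervision's LineCounter is implemented; `side_of_line` and `crossed` are hypothetical helper names, and the point positions are made up.

```python
from typing import Tuple

Point2D = Tuple[float, float]


def side_of_line(start: Point2D, end: Point2D, point: Point2D) -> float:
    """Cross product of (end - start) and (point - start); its sign gives the side."""
    return (end[0] - start[0]) * (point[1] - start[1]) - (
        end[1] - start[1]
    ) * (point[0] - start[0])


def crossed(start: Point2D, end: Point2D, previous: Point2D, current: Point2D) -> bool:
    """True if the point moved from one side of the (infinite) line to the other."""
    return side_of_line(start, end, previous) * side_of_line(start, end, current) < 0


# same endpoints as LINE_START and LINE_END above
line_start = (50.0, 1500.0)
line_end = (3790.0, 1500.0)

# made-up box centers in consecutive frames
moved_across = crossed(line_start, line_end, (1000.0, 1480.0), (1000.0, 1520.0))
stayed_above = crossed(line_start, line_end, (1000.0, 1400.0), (1000.0, 1480.0))

print(moved_across, stayed_above)
```

Note that this simplified test treats the line as infinite and looks at a single point per object. A full counter also has to remember the previous side for each tracker ID, which is why tracking is needed before counting.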

## **Extracting Video Information**

This step involves obtaining essential information about the source video, such as frame rate, resolution, and total frames. This information serves as the basis for subsequent video processing steps.

In [None]:
VideoInfo.from_video_path(SOURCE_VIDEO_PATH)
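
One use of this information is to derive the counting line from the frame size instead of hard-coding pixel values. The sketch below is an assumption-laden illustration: `horizontal_line` is a hypothetical helper, and the frame sizes passed to it (3840x2160 and 1920x1080) are example values, not read from the actual video.

```python
from typing import Tuple

PixelPoint = Tuple[int, int]


def horizontal_line(
    width: int, height: int, margin: int, relative_y: float
) -> Tuple[PixelPoint, PixelPoint]:
    """Endpoints of a horizontal line inset by `margin` pixels on both sides.

    relative_y is the vertical position as a fraction of the frame height.
    """
    y = int(round(height * relative_y))
    return (margin, y), (width - margin, y)


# assumed 3840x2160 frame: reproduces the endpoints (50, 1500) and (3790, 1500)
line_4k = horizontal_line(3840, 2160, margin=50, relative_y=1500 / 2160)
# the same relative position on an assumed 1920x1080 frame
line_hd = horizontal_line(1920, 1080, margin=50, relative_y=1500 / 2160)

print(line_4k)
print(line_hd)
```

In the notebook, the width and height would come from the VideoInfo instance, and the resulting tuples would be wrapped in Point objects.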

## **Implementing Vehicle Tracking and Counting**

This section combines vehicle detection, tracking, and counting into a single workflow, utilizing BYTETracker for tracking, annotations for visual feedback, and a line counter for vehicle counting.

**Initialize Tracking and Annotation Tools:** Set up BYTETracker, video metadata retrieval, frame generation, line counting, and annotations for boxes and lines.

**Process Video Frames:**
- Detect vehicles using YOLO and filter detections by class.
- Update BYTETracker with detections for tracking.
- Match detections to tracks and update tracker IDs.
- Count vehicles crossing a predefined line.
- Annotate frames with detection boxes and line crossings.

**Output:** Write annotated frames to a target video file.

In [None]:
# create BYTETracker instance
byte_tracker = BYTETracker(BYTETrackerArgs())
# create VideoInfo instance
video_info = VideoInfo.from_video_path(SOURCE_VIDEO_PATH)
# create frame generator
generator = get_video_frames_generator(SOURCE_VIDEO_PATH)
# create LineCounter instance
line_counter = LineCounter(start=LINE_START, end=LINE_END)
# create instance of BoxAnnotator and LineCounterAnnotator
box_annotator = BoxAnnotator(color=ColorPalette(), thickness=4, text_thickness=4, text_scale=2)
line_annotator = LineCounterAnnotator(thickness=4, text_thickness=4, text_scale=2)

# open target video file
with VideoSink(TARGET_VIDEO_PATH, video_info) as sink:
    # loop over video frames
    for frame in tqdm(generator, total=video_info.total_frames):
        # model prediction on single frame and conversion to supervision Detections
        results = model(frame)
        detections = Detections(
            xyxy=results[0].boxes.xyxy.cpu().numpy(),
            confidence=results[0].boxes.conf.cpu().numpy(),
            class_id=results[0].boxes.cls.cpu().numpy().astype(int)
        )
        # filtering out detections with unwanted classes
        mask = np.array([class_id in CLASS_ID for class_id in detections.class_id], dtype=bool)
        detections.filter(mask=mask, inplace=True)
        # tracking detections
        tracks = byte_tracker.update(
            output_results=detections2boxes(detections=detections),
            img_info=frame.shape,
            img_size=frame.shape
        )
        tracker_id = match_detections_with_tracks(detections=detections, tracks=tracks)
        detections.tracker_id = np.array(tracker_id)
        # filtering out detections without trackers
        mask = np.array([tracker_id is not None for tracker_id in detections.tracker_id], dtype=bool)
        detections.filter(mask=mask, inplace=True)
        # format custom labels
        labels = [
            f"#{tracker_id} {CLASS_NAMES_DICT[class_id]} {confidence:0.2f}"
            for _, confidence, class_id, tracker_id
            in detections
        ]
        # updating line counter
        line_counter.update(detections=detections)
        # annotate frame and write it to the target video
        frame = box_annotator.annotate(frame=frame, detections=detections, labels=labels)
        line_annotator.annotate(frame=frame, line_counter=line_counter)
        sink.write_frame(frame)