**Stage 1**: Vehicle Detection and Tracking
Vehicle detection was implemented using YOLOv8 (Ultralytics), restricted to the "car" class from the COCO dataset. The integrated BoT-SORT tracker was enabled to assign a consistent unique ID to each detected vehicle across frames. Bounding boxes were drawn around vehicles and labeled with their class and ID so that each vehicle can be followed visually from frame to frame.

**Stage 2**: Defining the Counting Line
A virtual counting line was defined using two coordinate points (x1, y1) and (x2, y2), creating either a horizontal or vertical line on the video frame. This line was rendered on-screen to provide a visual reference for crossing detection.

**Stage 3**: Crossing Logic and Counting Mechanism
For each tracked object, the center of its bounding box was computed. The system monitored the movement trajectory of this center point to determine whether the object crossed the defined line. To ensure reliable counting:

Only one direction of crossing (e.g., left-to-right) was considered valid.

Each object was counted only once by tracking its unique ID and marking it as “counted” after crossing.
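The two rules above can be sketched as a small helper. This is a minimal illustration, not the notebook's code: the `LineCounter` class and its method names are hypothetical, and it assumes a horizontal line with upward motion (from below the line to on or above it) as the valid direction.

```python
from typing import Dict, Set


class LineCounter:
    """Counts each track ID at most once when its center crosses a horizontal line upward."""

    def __init__(self, line_y: int) -> None:
        self.line_y = line_y
        self.prev_cy: Dict[int, int] = {}  # last seen center y per track ID
        self.counted: Set[int] = set()     # IDs that have already been counted

    def update(self, track_id: int, cy: int) -> bool:
        """Store the new center y for a track; return True if this update counted it."""
        prev = self.prev_cy.get(track_id, cy)  # first sighting: no movement yet
        self.prev_cy[track_id] = cy
        # In image coordinates y grows downward, so "upward" means y decreases.
        crossed = prev > self.line_y and cy <= self.line_y
        if crossed and track_id not in self.counted:
            self.counted.add(track_id)
            return True
        return False

    @property
    def count(self) -> int:
        return len(self.counted)


counter = LineCounter(line_y=100)
for cy in (130, 110, 95, 120, 90):  # one track moving up, back down, then up again
    counter.update(track_id=7, cy=cy)
print(counter.count)  # prints 1: the second upward crossing is ignored
```

Keeping the counted IDs in a set makes the "count once" rule independent of how often a vehicle jitters around the line.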

**Stage 4**: Visualization
The visualization layer displayed:

Bounding boxes with object class and ID labels.

The defined counting line.

Center points of tracked objects.

A dynamic counter showing the total number of crossings, updated in real time.

Additionally, objects that successfully crossed the line were visually distinguished (e.g., with a different bounding box color or label) to reflect their counted status.

**Stage 5**: Testing and Evaluation
The complete system was tested on sample video footage containing moving vehicles. The tracking and counting pipeline operated correctly, identifying and counting vehicles as they crossed the predefined line. Additional features were considered for further development, such as multi-class support, bidirectional counting, and speed estimation.

**As an additional feature**, a designated lane area was defined using a bounding box over a specific region of the frame (e.g., a traffic lane). The system monitored tracked vehicles to determine whether their center points entered this predefined zone.

In [12]:
!pip install lap ultralytics opencv-python shapely



Let's import everything we need.

In [34]:
from ultralytics import YOLO
import cv2
from shapely.geometry import Point, Polygon
import time
import math
import numpy as np

In [35]:
import torch
import torchvision
print(torch.__version__)
print(torchvision.__version__)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Used device: {device}")


2.5.1
0.20.1
Used device: cuda


In [36]:
import warnings
warnings.filterwarnings("ignore")


This section reads a video file and creates a new version with a reduced frame rate (10 FPS) and smaller resolution (half the original size). It calculates the appropriate frame skip interval to match the target FPS and writes only the selected, resized frames to a new output file (video_10fps.mp4). This preprocessing step helps optimize performance for further processing.

In [37]:
# Open the source video and read its geometry and frame rate.
cap = cv2.VideoCapture("video/video_cropped.mp4")
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = cap.get(cv2.CAP_PROP_FPS)

target_fps = 10
skip = max(1, round(fps / target_fps))  # keep every skip-th frame
size = (w//2, h//2)  # half the original resolution
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter('video/video_10fps.mp4', fourcc, target_fps, size)

frame_id = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break  # end of video
    frame_id += 1
    if frame_id % skip:
        continue  # drop frames whose index is not a multiple of skip
    small = cv2.resize(frame, size)
    out.write(small)

cap.release()
out.release()
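Because the skip interval is rounded to an integer, the output plays at real-time speed only when the source frame rate is a multiple of the target. The sketch below makes that arithmetic explicit. The helper names and the example source frame rates are hypothetical; the notebook does not state the frame rate of the input video.

```python
def frame_skip(source_fps: float, target_fps: float) -> int:
    """Same formula as the preprocessing cell: keep every skip-th frame."""
    return max(1, round(source_fps / target_fps))


def playback_speed(source_fps: float, target_fps: float) -> float:
    """Playback speed relative to real time when the writer is tagged with target_fps."""
    skip = frame_skip(source_fps, target_fps)
    kept_per_second = source_fps / skip  # frames kept per second of source footage
    return target_fps / kept_per_second


for src in (30.0, 25.0, 60.0):  # assumed example frame rates
    print(src, frame_skip(src, 10), playback_speed(src, 10))
```

With a 30 or 60 FPS source the speed is 1.0. With a 25 FPS source Python rounds 2.5 to 2, so 12.5 frames are kept per second of footage and the 10 FPS output plays at 0.8x speed. This does not affect counting, only the timing of the rendered video.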

This section sets up YOLOv8 object detection and tracking for the preprocessed video (video_10fps.mp4). The model is configured to use the BoT-SORT tracker and to keep only the "car" class (class ID 2 in the COCO dataset).

For each frame:

YOLOv8 performs tracking with persistent object IDs.

Bounding boxes and labels are rendered directly on the frame.

The annotated frame is written to an output file (final_video.mp4) that keeps the reduced frame rate and resolution of the preprocessed video.



In [38]:
model = YOLO("yolov8m.pt")
model.to("cuda")
model.tracker = "botsort.yaml"
classes = (2,)  # class "car"

cap = cv2.VideoCapture("video/video_10fps.mp4")

fps = cap.get(cv2.CAP_PROP_FPS)
frame_time = 1.0 / fps
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
out = cv2.VideoWriter("video/final_video.mp4", fourcc, fps, (w, h))



A horizontal counting line was positioned at the vertical center of the frame to detect vehicle crossings.

In addition, a rotated rectangular zone was defined in the lower part of the frame to monitor whether vehicles entered a specific area (e.g., a traffic lane). The zone was rotated by approximately 15 degrees. This setup enabled detection of vehicles entering that marked lane area.

In [39]:
line_start = (0, h // 2)
line_end = (w, h // 2)
line_y = h // 2

In [40]:
polygon = Polygon([
    (-95, 332),
    (213, 249),
    (223, 287),
    (-85, 370)
])

rect_pts_np = np.array(polygon.exterior.coords[:-1], dtype=np.int32)


The four corner points of the rotated rectangular zone, hard-coded in the cell above, were obtained by rotating a rectangle about its center. Each corner is calculated with trigonometric functions from the rectangle's width, height, center coordinates, and rotation angle. The resulting points define the polygon representing the restricted lane area for vehicle entry detection.
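The corner computation can be sketched as follows. The notebook only contains the resulting coordinates, so the function name `rotated_rect_corners` and the parameter values in the usage example are assumptions: they were estimated here from the hard-coded corners and are not stated in the source.

```python
import math
from typing import List, Tuple


def rotated_rect_corners(cx: float, cy: float, width: float, height: float,
                         angle_deg: float) -> List[Tuple[float, float]]:
    """Return the four corners of a rectangle rotated about its center (cx, cy).

    Uses the standard 2D rotation. In image coordinates (y pointing down)
    a positive angle appears clockwise on screen.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2.0, height / 2.0
    # Corner offsets from the center before rotation, in drawing order.
    offsets = [(-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h)]
    corners = []
    for dx, dy in offsets:
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


# Illustrative parameters estimated from the hard-coded polygon (assumption).
corners = rotated_rect_corners(cx=64, cy=309.5, width=319, height=39.3, angle_deg=-15)
print([(round(x), round(y)) for x, y in corners])
```

With these estimated values the rounded corners land within about one pixel of the polygon used in the notebook. Note that two corners have negative x-coordinates, so part of the zone lies outside the left edge of the frame.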

A dictionary and two sets were initialized: the dictionary stores the previous center of each tracked object, and the sets ensure that each tracked vehicle is counted only once when crossing the counting line or entering the designated zone.

In [41]:
prev_centers = {}
counted_ids_line = set()
counted_ids_zone = set()
line_count = 0
zone_count = 0




**Vehicle Tracking, Counting, and Visualization Loop**

The video is processed frame-by-frame in a loop:

For each frame, YOLOv8 with the BoT-SORT tracker detects and tracks cars, outputting bounding boxes, unique IDs, and object coordinates.

For each detected vehicle:

The center point coordinates are calculated.

The vehicle's previous center position is retrieved to track movement direction.

Crossing the horizontal counting line is detected by comparing previous and current y-coordinates: a vehicle is counted when its center moves from below the line to on or above it (upward in the image), and each vehicle is counted only once.

Entry into the rotated restricted zone is checked with Shapely's polygon.contains on the defined polygon; vehicles are counted only once upon entering.

Vehicles that have entered the zone are marked with red bounding boxes, vehicles that have only crossed the line with green ones, and all others with blue ones.

The counting line and restricted zone polygon are drawn on the frame with corresponding counters displayed.

The annotated frame is written to the output video.
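The zone-entry step can be illustrated without any geometry library. The loop code itself calls Shapely's `polygon.contains`; the even-odd ray-casting test below is a dependency-free stand-in, and the names `point_in_polygon` and `register_entry` are hypothetical. Points lying exactly on an edge may be classified either way.

```python
from typing import Sequence, Set, Tuple

Point2D = Tuple[float, float]

# Zone corners as hard-coded in the notebook.
ZONE = [(-95, 332), (213, 249), (223, 287), (-85, 370)]


def point_in_polygon(point: Point2D, vertices: Sequence[Point2D]) -> bool:
    """Even-odd ray casting: cast a ray to the right and count edge crossings."""
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # The edge straddles the horizontal line through the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def register_entry(track_id: int, center: Point2D, counted: Set[int]) -> bool:
    """Return True only the first time a given track's center is inside the zone."""
    if track_id in counted or not point_in_polygon(center, ZONE):
        return False
    counted.add(track_id)
    return True


counted_ids: Set[int] = set()
print(register_entry(1, (64, 310), counted_ids))   # center inside the zone
print(register_entry(1, (64, 310), counted_ids))   # same ID again
print(register_entry(2, (300, 300), counted_ids))  # center outside the zone
```

Checking the ID set before the geometry test also skips the polygon check for vehicles that were already counted, which mirrors the order used in the loop.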



In [42]:
while True:
    start = time.time()
    ret, frame = cap.read()
    if not ret:
        break  # end of video

    # Detect and track cars; persist=True keeps track IDs across frames.
    results = model.track(frame, persist=True, classes=classes, tracker=model.tracker)
    boxes = results[0].boxes
    annotated = results[0].orig_img.copy()

    if boxes is not None:
        for box in boxes:
            if box.id is None:
                continue  # detection without an assigned track ID

            track_id = int(box.id.item())  # avoids shadowing the built-in id()
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            cx = (x1 + x2) // 2
            cy = (y1 + y2) // 2

            # On first sighting the previous center defaults to the current one.
            prev_center = prev_centers.get(track_id, (cx, cy))
            prev_cy = prev_center[1]

            # Count an upward crossing (from below the line to on or above it) once per ID.
            if track_id not in counted_ids_line and prev_cy > line_y and cy <= line_y:
                line_count += 1
                counted_ids_line.add(track_id)

            # Count the first frame in which the center lies inside the zone.
            if track_id not in counted_ids_zone:
                point = Point(cx, cy)
                if polygon.contains(point):
                    zone_count += 1
                    counted_ids_zone.add(track_id)

            prev_centers[track_id] = (cx, cy)

            # Colors are BGR: red = entered zone, green = crossed line, blue = neither.
            if track_id in counted_ids_zone:
                color = (0, 0, 255)
            elif track_id in counted_ids_line:
                color = (0, 255, 0)
            else:
                color = (255, 0, 0)

            cv2.rectangle(annotated, (x1, y1), (x2, y2), color, 2)
            cv2.circle(annotated, (cx, cy), 4, color, -1)
            cv2.putText(annotated, f"ID: {track_id}", (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)

    # Draw the counting line, the zone outline, and both counters.
    cv2.line(annotated, line_start, line_end, (0, 255, 255), 2)
    cv2.putText(annotated, f"Line crossed: {line_count}", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)

    cv2.polylines(annotated, [rect_pts_np], isClosed=True, color=(255, 255, 0), thickness=2)
    cv2.putText(annotated, f"Restricted zone crossed: {zone_count}", (10, 70),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 0), 2)

    out.write(annotated)

    # Throttle the loop to the video frame rate.
    elapsed = time.time() - start
    to_wait = frame_time - elapsed
    if to_wait > 0:
        time.sleep(to_wait)

    # Esc aborts the loop (only effective while an OpenCV window is open).
    if cv2.waitKey(1) & 0xFF == 27:
        break



0: 384x640 7 cars, 17.0ms
Speed: 1.3ms preprocess, 17.0ms inference, 0.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 cars, 16.1ms
Speed: 0.7ms preprocess, 16.1ms inference, 0.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 cars, 16.2ms
Speed: 2.6ms preprocess, 16.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 9 cars, 16.3ms
Speed: 2.4ms preprocess, 16.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 8 cars, 16.4ms
Speed: 2.9ms preprocess, 16.4ms inference, 0.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 8 cars, 16.2ms
Speed: 2.2ms preprocess, 16.2ms inference, 0.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 10 cars, 16.2ms
Speed: 1.9ms preprocess, 16.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 10 cars, 16.3ms
Speed: 1.5ms preprocess, 16.3ms inference, 0.9ms postprocess per image at shape (1, 3, 384, 640)

0: 38

In [43]:
cap.release()
out.release()

**All five main stages and the additional task have been successfully implemented. Both counting mechanisms—line crossing and zone entry—function correctly without errors. Object detection is stable and reliable, and the visualization of bounding boxes, tracking IDs, counting lines, and zones is clearly presented.**