# Object Tracking

In this notebook, I learn to use a YOLO model together with the DeepSORT tracker to track a race car in a video.

## Importing Libraries

In [1]:
import cv2
import numpy as np
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort


## Initializing

In [2]:
model = YOLO("yolov8n.pt")


In [3]:
tracker = DeepSort(max_age=30, nn_budget=100, override_track_class=None)

In [4]:
video_path = "images/race_car.mp4"
cap = cv2.VideoCapture(video_path)

In [5]:
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

In [6]:
output_path = "images/output_race_car_tracking.mp4"
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
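
The cell above casts the frame rate to `int`, which truncates a fractional rate such as 29.97 to 29, and OpenCV can report 0 when the backend cannot determine the rate. A minimal sketch of a guard is shown below; `safe_fps` and its `default` value of 30.0 are hypothetical names and choices made for this illustration, not part of the original notebook.

```python
import math


def safe_fps(raw_fps, default=30.0):
    """Return a usable frame rate for a video writer.

    raw_fps is the value read from the capture object.
    default is an assumed fallback picked for this sketch.
    """
    # Reject missing, NaN, infinite, zero, or negative values.
    if raw_fps is None or not math.isfinite(raw_fps) or raw_fps <= 0:
        return float(default)
    # Keep the fractional part instead of truncating it.
    return float(raw_fps)


print(safe_fps(29.97))  # a valid fractional rate is kept as-is
print(safe_fps(0.0))    # an unusable rate falls back to the default
```

In the notebook this could be used as `fps = safe_fps(cap.get(cv2.CAP_PROP_FPS))`, since `cv2.VideoWriter` accepts a floating-point frame rate.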

## Tracking

In [8]:
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    results = model(frame)
    detections = []
    
    for result in results:
        boxes = result.boxes
        for box in boxes:
            x1, y1, x2, y2 = map(int, box.xyxy[0])  # Bounding box corners in pixels
            conf = float(box.conf[0])  # Confidence score as a plain float
            cls = int(box.cls[0])  # Class ID

            # In the COCO label set used by yolov8n.pt, class ID 2 is "car"
            if cls == 2 and conf > 0.5:  # Adjust confidence threshold as needed
                # deep_sort_realtime expects each detection as a tuple:
                # ([left, top, width, height], confidence, detection_class)
                w = x2 - x1
                h = y2 - y1
                detections.append(([x1, y1, w, h], conf, cls))

    tracks = tracker.update_tracks(detections, frame=frame)

    for track in tracks:
        if not track.is_confirmed():
            continue
        track_id = track.track_id
        bbox = track.to_ltrb()
        x1, y1, x2, y2 = map(int, bbox)

        # Draw the bounding box and the track ID label
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, f"ID: {track_id}", (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 255, 0), 2)

    # Save each annotated frame to the output video
    out.write(frame)


0: 416x640 2 cars, 226.7ms
Speed: 18.6ms preprocess, 226.7ms inference, 2.7ms postprocess per image at shape (1, 3, 416, 640)

0: 416x640 2 cars, 219.2ms
Speed: 5.2ms preprocess, 219.2ms inference, 2.9ms postprocess per image at shape (1, 3, 416, 640)

0: 416x640 2 cars, 216.2ms
Speed: 6.6ms preprocess, 216.2ms inference, 2.5ms postprocess per image at shape (1, 3, 416, 640)

0: 416x640 2 cars, 214.6ms
Speed: 4.7ms preprocess, 214.6ms inference, 2.4ms postprocess per image at shape (1, 3, 416, 640)

0: 416x640 2 cars, 224.8ms
Speed: 6.3ms preprocess, 224.8ms inference, 2.3ms postprocess per image at shape (1, 3, 416, 640)

0: 416x640 2 cars, 259.8ms
Speed: 4.7ms preprocess, 259.8ms inference, 3.1ms postprocess per image at shape (1, 3, 416, 640)

0: 416x640 2 cars, 190.3ms
Speed: 5.8ms preprocess, 190.3ms inference, 2.0ms postprocess per image at shape (1, 3, 416, 640)

0: 416x640 2 cars, 426.3ms
Speed: 5.2ms preprocess, 426.3ms inference, 3.8ms postprocess per image at shape (1, 3, 4
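
The filtering and box conversion inside the tracking loop can be isolated into a small function that is easy to check without a model or a video. The sketch below assumes the detections have already been unpacked into plain `(x1, y1, x2, y2, conf, cls)` tuples; `to_deepsort_detections`, `target_cls`, and `min_conf` are hypothetical names introduced for this illustration.

```python
def to_deepsort_detections(boxes, target_cls=2, min_conf=0.5):
    """Convert corner-format boxes into ([left, top, w, h], conf, cls) tuples.

    boxes: iterable of (x1, y1, x2, y2, conf, cls) tuples in pixels.
    Only boxes of target_cls with conf above min_conf are kept,
    mirroring the `cls == 2 and conf > 0.5` check in the loop.
    """
    detections = []
    for x1, y1, x2, y2, conf, cls in boxes:
        if cls != target_cls or conf <= min_conf:
            continue
        # Width and height are the differences between opposite corners.
        detections.append(([x1, y1, x2 - x1, y2 - y1], conf, cls))
    return detections


sample = [
    (100, 50, 300, 200, 0.91, 2),  # confident car: kept
    (10, 10, 60, 40, 0.30, 2),     # low-confidence car: dropped
    (0, 0, 50, 50, 0.88, 0),       # different class: dropped
]
print(to_deepsort_detections(sample))
# → [([100, 50, 200, 150], 0.91, 2)]
```

Keeping this step separate makes it simple to change the class ID or threshold in one place and to verify the width and height arithmetic on hand-written boxes.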

In [10]:
# Release the video capture and writer so the output file is finalized
cap.release()
out.release()
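
If the tracking loop raises an exception or is interrupted, the release calls above never run, and the output video may be left unfinalized. One possible way to guard against that is a small context manager that releases everything on exit. This is only a sketch: `released` and `FakeResource` are hypothetical names, and `FakeResource` stands in for `cv2.VideoCapture` and `cv2.VideoWriter` so the example runs without OpenCV.

```python
from contextlib import contextmanager


@contextmanager
def released(*resources):
    """Yield the resources and call release() on each one afterwards."""
    try:
        yield resources
    finally:
        # Runs on normal exit and when the body raises.
        for resource in resources:
            resource.release()


class FakeResource:
    """Stand-in for an object with a release() method."""

    def __init__(self):
        self.is_released = False

    def release(self):
        self.is_released = True


cap_like, out_like = FakeResource(), FakeResource()
try:
    with released(cap_like, out_like):
        raise RuntimeError("simulated failure inside the frame loop")
except RuntimeError:
    pass

print(cap_like.is_released, out_like.is_released)
# → True True
```

In the notebook, the `while cap.isOpened():` loop would sit inside `with released(cap, out):`, which replaces the explicit release cell.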