# Object Tracking Tutorial

In this tutorial, we will learn how to track objects in a video with Deep SORT.
We use a YOLOv11 model for object detection and pass its detections to Deep SORT for tracking.


### Importing Libraries

In [1]:
import cv2
import os
from ultralytics import YOLO
from deep_sort_realtime.deepsort_tracker import DeepSort

### Loading the YOLOv11 Model, Tracker, Input Video, and Output Directory

In [None]:
# Load the YOLOv11 nano model
model = YOLO("./yolo11n.pt")

# Initialize Deep SORT tracker
tracker = DeepSort(max_age=30)

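# Set input video (adjust this path to point at your own file)
video_path = "./home/tang/FRA532_objectdetechtion/video/mavic_2_pro.mp4"
cap = cv2.VideoCapture(video_path)
# Fail early if the video cannot be opened, instead of silently saving no frames
if not cap.isOpened():
    raise RuntimeError(f"Could not open video: {video_path}")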

# Create output folder
output_folder = "track_output_custom"
os.makedirs(output_folder, exist_ok=True)
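The `max_age=30` argument controls roughly how many consecutive frames a track may go without a matching detection before the tracker drops it. The toy class below is a simplified sketch of that idea only. `ToyTrack` and its fields are hypothetical names, and the real library also handles track confirmation and appearance features, which are not modeled here.

```python
class ToyTrack:
    """Simplified, hypothetical stand-in for a tracker's per-object state."""

    def __init__(self, max_age):
        self.max_age = max_age
        self.missed = 0        # consecutive frames without a matching detection
        self.deleted = False

    def step(self, matched):
        """Advance one frame; `matched` says whether a detection was assigned."""
        if matched:
            self.missed = 0
        else:
            self.missed += 1
            # Drop the track once it has been unmatched for more than max_age frames
            if self.missed > self.max_age:
                self.deleted = True


track = ToyTrack(max_age=3)
for matched in [True, False, False, False]:
    track.step(matched)
print(track.deleted)  # → False (3 misses, still within max_age)

track.step(False)
print(track.deleted)  # → True (4th consecutive miss exceeds max_age)
```

A larger `max_age` keeps IDs stable through longer occlusions, at the cost of stale tracks lingering after an object has left the scene.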



### Processing the Video
The pretrained YOLOv11 model uses the COCO class IDs. The first few are:
- 0: person
- 1: bicycle
- 2: car
- 3: motorcycle
- 4: airplane
- 5: bus
- 6: train
- 7: truck
- 8: boat
- 9: traffic light
- 10: fire hydrant
...
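To avoid hard-coding numeric IDs, the listed classes can be looked up by name. The sketch below copies only the eleven entries shown above. `COCO_SUBSET` and `ids_for` are illustrative names, not part of any library. With a loaded Ultralytics model, `model.names` should provide the full ID-to-name mapping in the same dictionary form.

```python
# First eleven COCO class IDs, copied from the list above (the full set is longer).
COCO_SUBSET = {
    0: "person", 1: "bicycle", 2: "car", 3: "motorcycle", 4: "airplane",
    5: "bus", 6: "train", 7: "truck", 8: "boat", 9: "traffic light",
    10: "fire hydrant",
}


def ids_for(names, id_to_name=COCO_SUBSET):
    """Return the set of class IDs whose names appear in `names`."""
    name_to_id = {name: cls_id for cls_id, name in id_to_name.items()}
    missing = [n for n in names if n not in name_to_id]
    if missing:
        raise KeyError(f"Unknown class names: {missing}")
    return {name_to_id[n] for n in names}


focused_ids = ids_for(["person", "car"])
print(sorted(focused_ids))  # → [0, 2]
```

Returning a set makes it easy to track several classes at once with a membership test such as `cls_id in focused_ids`.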

In [3]:
# Set the class to track and the minimum detection confidence
focused_classes = 0  # COCO class ID 0 = person
confidence_threshold = 0.5

# Start processing video
frame_index = 0
while True:
    ret, frame = cap.read()
    if not ret:
        break

    # Run YOLO detection
    results = model(frame)[0]

    detections = []

    # Collect only person class detections
    for box in results.boxes:
        cls_id = int(box.cls[0])
        conf = float(box.conf[0])
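        # Keep only detections of the focused class above the confidence threshold
        if cls_id == focused_classes and conf > confidence_threshold:
            # Deep SORT expects ([left, top, width, height], confidence, class),
            # so convert the corner coordinates from YOLO accordingly
            x1, y1, x2, y2 = map(int, box.xyxy[0])
            detections.append(([x1, y1, x2 - x1, y2 - y1], conf, 'person'))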

    # Track using Deep SORT
    tracks = tracker.update_tracks(detections, frame=frame)

    # Draw results on frame
    for track in tracks:
        if not track.is_confirmed():
            continue
        track_id = track.track_id
        x1, y1, x2, y2 = map(int, track.to_ltrb())
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, f'ID {track_id}', (x1, y1 - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

    # Save frame to output folder
    frame_filename = os.path.join(output_folder, f"frame_{frame_index:05d}.jpg")
    cv2.imwrite(frame_filename, frame)
    frame_index += 1

cap.release()
print(f"Done. Frames saved to '{output_folder}/'")
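The filtering and box conversion inside the loop can be isolated and checked without a model or a video. The sketch below assumes a simplified input of plain `(x1, y1, x2, y2, confidence, class_id)` tuples in place of the YOLO results object. The helper names `xyxy_to_ltwh` and `to_deepsort_detections` are hypothetical.

```python
def xyxy_to_ltwh(x1, y1, x2, y2):
    """Convert corner coordinates to [left, top, width, height]."""
    return [x1, y1, x2 - x1, y2 - y1]


def to_deepsort_detections(boxes, focused_ids, id_to_name, min_conf=0.5):
    """Filter raw boxes and format them as (ltwh, confidence, class_name).

    `boxes` is a list of (x1, y1, x2, y2, confidence, class_id) tuples,
    a simplified stand-in for the YOLO results object.
    """
    detections = []
    for x1, y1, x2, y2, conf, cls_id in boxes:
        # Keep only focused classes whose confidence exceeds the threshold
        if cls_id in focused_ids and conf > min_conf:
            detections.append(
                (xyxy_to_ltwh(x1, y1, x2, y2), conf, id_to_name[cls_id])
            )
    return detections


raw = [
    (10, 20, 50, 100, 0.90, 0),   # person, kept
    (30, 40, 90, 80, 0.40, 0),    # person, below the threshold
    (5, 5, 200, 120, 0.95, 2),    # car, not a focused class
]
dets = to_deepsort_detections(raw, focused_ids={0}, id_to_name={0: "person", 2: "car"})
print(dets)  # → [([10, 20, 40, 80], 0.9, 'person')]
```

Taking the class name from a mapping instead of the fixed string `'person'` keeps the labels correct if more than one class is tracked.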


0: 384x640 6 cars, 2 trucks, 3 traffic lights, 42.6ms
Speed: 3.0ms preprocess, 42.6ms inference, 111.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 bicycle, 6 cars, 2 trucks, 3 traffic lights, 6.6ms
Speed: 1.6ms preprocess, 6.6ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 bicycle, 7 cars, 2 trucks, 3 traffic lights, 6.0ms
Speed: 1.2ms preprocess, 6.0ms inference, 1.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 bicycle, 8 cars, 2 trucks, 2 traffic lights, 5.9ms
Speed: 0.6ms preprocess, 5.9ms inference, 0.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 bicycle, 8 cars, 2 trucks, 3 traffic lights, 8.2ms
Speed: 0.9ms preprocess, 8.2ms inference, 1.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 bicycle, 8 cars, 3 trucks, 3 traffic lights, 8.4ms
Speed: 0.9ms preprocess, 8.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons