# TASK 2: Re-Identification in a Single Feed

## Objective
The goal of this task is to perform **player re-identification** in a short video clip. Specifically, each player detected in the video should retain a **consistent identity (ID)** throughout the 15-second footage, even if they **leave the frame and return later**.

---

## Instructions

1. **Video Input**: Use the provided `15sec_input_720p.mp4` sports footage file as input.
2. **Object Detection**: Use the provided YOLOv11 model (fine-tuned for player and ball detection) to detect players in each frame.
3. **Initial ID Assignment**: Assign unique IDs to players based on the first few seconds of the video.
4. **Re-Identification**: When a player leaves and re-enters the frame (e.g., during a goal event), ensure they are **assigned the same ID** as before.
5. **Real-Time Simulation**: The solution should simulate real-time re-identification and player tracking.

---

## Given Model Details

- **Model Link**: [Provided YOLOv11 Object Detection Model](https://drive.google.com/file/d/1.5fOSHOSB9UXvPenOoZNAMScrePVcMD/view)
- A fine-tuned version of **Ultralytics YOLOv11**, trained to detect:
  - Players
  - Ball

---

## Expected Output
- A video output with bounding boxes and player IDs.
- Players should have **consistent IDs** across frames, even after occlusions or brief exits.
- A script or notebook implementation that works frame-by-frame to detect, assign, and re-identify players.



# Mounting the Drive

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Setting up the working directory and path

In [2]:
WORKDIR = '/content/drive/MyDrive/Player_Re_Identification_Assignment'

MODEL_PATH = f"{WORKDIR}/best.pt"
VIDEO_PATH = f"{WORKDIR}/15sec_input_720p.mp4"
OUTPUT_VIDEO_PATH = f"{WORKDIR}/output_tracked.mp4"

print(f"Model path: {MODEL_PATH}")
print(f"Video path: {VIDEO_PATH}")
print(f"Output will be saved to: {OUTPUT_VIDEO_PATH}")

Model path: /content/drive/MyDrive/Player_Re_Identification_Assignment/best.pt
Video path: /content/drive/MyDrive/Player_Re_Identification_Assignment/15sec_input_720p.mp4
Output will be saved to: /content/drive/MyDrive/Player_Re_Identification_Assignment/output_tracked.mp4


# Install Required Libraries  
We install `ultralytics` for YOLOv11-based object detection, `opencv-python-headless` for video and image processing, and `scipy` for computing feature similarity during player re-identification.

In [3]:
!pip install ultralytics opencv-python-headless scipy

Collecting ultralytics
  Downloading ultralytics-8.3.163-py3-none-any.whl.metadata (37 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.14-py3-none-any.whl.metadata (9.4 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=1.8.0->ultralytics)
  Downloading n

# Load Dependencies and Configure Tracking Parameters  
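This cell imports the required libraries and sets the key tracking parameters: the detection confidence threshold, the IoU threshold for frame-to-frame tracking, the feature similarity threshold for re-identification, and the number of missed frames after which a track is marked inactive. It also initializes the global tracking state and checks whether a CUDA-enabled GPU is available for acceleration.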


In [4]:
import cv2
import numpy as np
import torch
from ultralytics import YOLO
from collections import defaultdict
from scipy.spatial.distance import cosine
import os

# Tracking settings
CONFIDENCE_THRESHOLD = 0.5
IOU_TRACKING_THRESHOLD = 0.3
FEATURE_SIMILARITY_THRESHOLD = 0.5
MAX_LOST_FRAMES = 15
IMG_SIZE = 512

# Global state
_next_player_id = 0
_active_players = {}
_inactive_players = {}
_yolo_model = None
_model_loaded = False
_player_class_id = -1
device = 'cuda' if torch.cuda.is_available() else 'cpu'

print(f"[INFO] Torch version: {torch.__version__}")
print(f"[INFO] Using device: {device.upper()}")
if device == 'cuda':
    print(f"[INFO] CUDA Device: {torch.cuda.get_device_name(0)}")

Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
[INFO] Torch version: 2.6.0+cu124
[INFO] Using device: CUDA
[INFO] CUDA Device: Tesla T4
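
To make `MAX_LOST_FRAMES` concrete, here is a minimal sketch of the bookkeeping that the tracking loop later in this notebook applies to unseen players. Plain dictionaries stand in for `Player` objects, and the helper name `retire_lost_tracks` is hypothetical; it is an illustration of the counter logic, not the notebook's actual function.

```python
MAX_LOST_FRAMES = 15  # same value as in the settings cell above

def retire_lost_tracks(active, inactive, frame_num, max_lost=MAX_LOST_FRAMES):
    """Move tracks that were missed for more than `max_lost` frames to `inactive`.

    `active` and `inactive` map player ID -> dict with the keys
    'last_seen_frame' and 'lost_frames_count' (a stand-in for Player).
    """
    for pid in list(active):  # copy the keys because we pop while iterating
        track = active[pid]
        if track["last_seen_frame"] < frame_num:
            track["lost_frames_count"] += 1
            if track["lost_frames_count"] > max_lost:
                inactive[pid] = active.pop(pid)

# A player seen at frame 1 and never detected again.
active = {7: {"last_seen_frame": 1, "lost_frames_count": 0}}
inactive = {}
retired_at = None
for frame_num in range(2, 30):
    retire_lost_tracks(active, inactive, frame_num)
    if retired_at is None and 7 in inactive:
        retired_at = frame_num

print(f"Track 7 retired at frame {retired_at}")  # prints: Track 7 retired at frame 17
```

Because the comparison is strict (`>`), a track survives 15 consecutive misses and becomes inactive on the 16th, at which point it is eligible for appearance-based re-identification.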


# Define Player Class and Core Utility Functions  
This cell defines the `Player` class to manage player state, including ID, bounding box, last seen frame, and feature vector. It also includes helper functions for computing IoU between bounding boxes, extracting color histogram features from player crops, and identifying the best match among known players using cosine similarity.


In [5]:
class Player:
    def __init__(self, player_id, bbox, frame_num, features=None):
        self.player_id = player_id
        self.bbox = bbox
        self.last_seen_frame = frame_num
        self.features = features
        self.lost_frames_count = 0

    def update_bbox(self, new_bbox, frame_num):
        self.bbox = new_bbox
        self.last_seen_frame = frame_num
        self.lost_frames_count = 0

def calculate_iou(boxA, boxB):
    xA, yA = max(boxA[0], boxB[0]), max(boxA[1], boxB[1])
    xB, yB = min(boxA[2], boxB[2]), min(boxA[3], boxB[3])
    interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)
    boxAArea = (boxA[2] - boxA[0] + 1) * (boxA[3] - boxA[1] + 1)
    boxBArea = (boxB[2] - boxB[0] + 1) * (boxB[3] - boxB[1] + 1)
    return interArea / float(boxAArea + boxBArea - interArea + 1e-6)

def extract_features(image, bbox):
    x1, y1, x2, y2 = map(int, bbox)
    h, w = image.shape[:2]
    x1, y1 = max(0, x1), max(0, y1)
    x2, y2 = min(w, x2), min(h, y2)
    if x2 <= x1 or y2 <= y1:
        return None
    crop = image[y1:y2, x1:x2]
    if crop.size == 0:
        return None
    hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [16, 16], [0, 180, 0, 256])
    return cv2.normalize(hist, hist).flatten()

def get_best_match(feat, candidates):
    if feat is None:
        return None, 0
    sims = [(pid, 1 - cosine(feat, player.features))
            for pid, player in candidates.items() if player.features is not None]
    if not sims:
        return None, 0
    return max(sims, key=lambda x: x[1])
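
As a quick sanity check of the two similarity measures used above, the sketch below evaluates IoU on toy boxes and cosine similarity on toy histograms. It uses only NumPy; the 4-bin vectors are made-up stand-ins for the 256-bin hue/saturation histograms, and `cosine_similarity` is a hypothetical helper equivalent to `1 - scipy.spatial.distance.cosine` for non-zero vectors.

```python
import numpy as np

def calculate_iou(box_a, box_b):
    """IoU of two [x1, y1, x2, y2] boxes, using the inclusive-pixel (+1) convention."""
    x_a, y_a = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x_b, y_b = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x_b - x_a + 1) * max(0, y_b - y_a + 1)
    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / float(area_a + area_b - inter + 1e-6)

def cosine_similarity(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Two 10x10 boxes overlapping in a 5x5 corner: IoU = 25 / (100 + 100 - 25).
iou_overlap = calculate_iou([0, 0, 9, 9], [5, 5, 14, 14])
iou_disjoint = calculate_iou([0, 0, 9, 9], [20, 20, 29, 29])

# Toy color histograms: two similar kits and one clearly different kit.
hist_red = np.array([0.9, 0.1, 0.0, 0.0])
hist_red_again = np.array([0.8, 0.2, 0.0, 0.0])
hist_blue = np.array([0.0, 0.0, 0.1, 0.9])
sim_same = cosine_similarity(hist_red, hist_red_again)
sim_diff = cosine_similarity(hist_red, hist_blue)

print(f"IoU overlap={iou_overlap:.3f}, disjoint={iou_disjoint:.3f}")
print(f"similarity same kit={sim_same:.3f}, different kit={sim_diff:.3f}")
```

With the thresholds from the settings cell, the overlapping pair (IoU of about 0.14) would not count as the same track under `IOU_TRACKING_THRESHOLD = 0.3`, while the two similar histograms would pass `FEATURE_SIMILARITY_THRESHOLD = 0.5`. Note that a color histogram cannot separate teammates in identical kits, so appearance matching of this kind is inherently ambiguous within a team.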

# Load YOLOv11 Model  
This function loads the fine-tuned YOLOv11 model onto the available device (GPU or CPU), switches it to evaluation mode with half-precision weights, and looks up the class ID labeled `player` so that only player detections are tracked.


In [6]:
def _load_yolo_model():
    global _yolo_model, _model_loaded, _player_class_id
    if _model_loaded:
        return True

    print(f"[INFO] Loading YOLO model from {MODEL_PATH} on {device.upper()}...")
    _yolo_model = YOLO(MODEL_PATH)
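    _yolo_model.to(device).eval()
    if device == 'cuda':
        # Half precision is only requested on GPU; keep FP32 weights on CPU.
        _yolo_model.half()

    # Find the class ID whose label is "player"; it stays -1 if none matches.
    for class_id, name in _yolo_model.names.items():
        if name.lower() == "player":
            _player_class_id = class_id
            break
    else:
        print("[WARN] No 'player' class in model.names; no detections will be tracked.")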

    _model_loaded = True
    print(f"[INFO] Model loaded successfully. Tracking class ID: {_player_class_id}")
    return True
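
The class lookup can be tried without loading any weights. The sketch below applies the same case-insensitive search to a plain dictionary; `find_class_id` is a hypothetical helper and the `example_names` mapping is an assumption for illustration (the run log only confirms that `player` maps to class ID 2).

```python
def find_class_id(names, target="player"):
    """Return the first class ID whose label equals `target` (case-insensitive), else -1."""
    for class_id, name in names.items():
        if name.lower() == target.lower():
            return class_id
    return -1

# Hypothetical label map; only "player" -> 2 is taken from the notebook output.
example_names = {0: "ball", 1: "goalkeeper", 2: "player", 3: "referee"}

player_id = find_class_id(example_names)
missing_id = find_class_id({0: "ball"})

print(f"player class ID: {player_id}")
if missing_id == -1:
    print("no 'player' class found; every detection would be filtered out")
```

Checking for the `-1` sentinel before processing is worthwhile, because a silent mismatch in the label name would leave the tracker with nothing to track.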

# Process Video for Player Re-Identification  
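This function processes the video frame by frame. YOLOv11 detects players in each frame; a detection that overlaps an active player (IoU at or above the tracking threshold) continues that player's track. Each remaining detection is compared, via its color-histogram features, against players that have gone inactive, and is re-identified if the similarity is high enough. Otherwise it receives a new ID until the cap of 22 IDs is reached, after which an existing ID is reused. The function writes the annotated frames, with bounding boxes and player IDs, to the output video and prints per-frame and summary statistics to the cell output.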
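
The order in which a detection is resolved can be summarized as a small pure function. This is a simplified sketch of the decision cascade only: the constant names `MAX_PLAYER_IDS` and `FALLBACK_SIMILARITY` are hypothetical labels for the values 22 and 0.4 that the function below hard-codes, and `decide` is not part of the notebook.

```python
IOU_TRACKING_THRESHOLD = 0.3        # from the settings cell
FEATURE_SIMILARITY_THRESHOLD = 0.5  # from the settings cell
MAX_PLAYER_IDS = 22                 # hypothetical name for the hard-coded ID cap
FALLBACK_SIMILARITY = 0.4           # hypothetical name for the hard-coded fallback threshold

def decide(max_iou, inactive_sim, active_sim, next_player_id):
    """Return which branch handles a detection, given its best scores.

    max_iou       best IoU against any active player
    inactive_sim  best feature similarity against inactive players
    active_sim    best feature similarity against active players
    """
    if max_iou >= IOU_TRACKING_THRESHOLD:
        return "track"                 # continue an active track
    if inactive_sim >= FEATURE_SIMILARITY_THRESHOLD:
        return "reidentify"            # revive an inactive player
    if next_player_id < MAX_PLAYER_IDS:
        return "new_id"                # room for another ID
    if active_sim >= FALLBACK_SIMILARITY:
        return "reuse_similar_active"  # ID cap reached, closest active look-alike
    return "reuse_oldest_active"       # ID cap reached, forced reuse

print(decide(0.6, 0.0, 0.0, 5))    # track
print(decide(0.1, 0.8, 0.0, 5))    # reidentify
print(decide(0.1, 0.2, 0.0, 5))    # new_id
print(decide(0.1, 0.2, 0.45, 22))  # reuse_similar_active
print(decide(0.1, 0.2, 0.1, 22))   # reuse_oldest_active
```

The last two branches explain why the run log can report re-identifications while the active count stays at 22: once the cap is reached, unmatched detections are folded into existing IDs rather than creating new ones.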


In [7]:
def process_reid_video(input_video_path):
    global _next_player_id, _active_players, _inactive_players

    _next_player_id = 0
    _active_players = {}
    _inactive_players = {}

    if not _model_loaded and not _load_yolo_model():
        return None

    cap = cv2.VideoCapture(input_video_path)
    if not cap.isOpened():
        print(f"[ERROR] Could not open video {input_video_path}")
        return None

    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    fps = cap.get(cv2.CAP_PROP_FPS)
    out = cv2.VideoWriter(OUTPUT_VIDEO_PATH, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))

    frame_num = 0
    total_new_players = 0
    total_reidentified = 0
    total_detections = 0
    total_active_sum = 0

    while True:
        ret, frame = cap.read()
        if not ret:
            break
        frame_num += 1

        with torch.inference_mode():
            # Request FP16 inference only when running on the GPU.
            results = _yolo_model.predict(source=frame, imgsz=IMG_SIZE, half=(device == 'cuda'), device=device, verbose=False)[0]

        detections = []
        track_detections = []

        if results.boxes is not None:
            for *xyxy, conf, cls in results.boxes.data:
                bbox = list(map(int, xyxy))
                class_id = int(cls)
                conf = float(conf)
                name = _yolo_model.names.get(class_id, 'unknown')
                detections.append({'bbox': bbox, 'class': name, 'conf': conf})
                if class_id == _player_class_id and conf > CONFIDENCE_THRESHOLD:
                    track_detections.append(bbox)

        total_detections += len(detections)

        assigned = []
        reidentified = 0
        new_players = 0

        for i, det_bbox in enumerate(track_detections):
            best_id, max_iou = -1, 0
            for pid, player in _active_players.items():
                iou = calculate_iou(player.bbox, det_bbox)
                if iou > max_iou:
                    best_id, max_iou = pid, iou
            if max_iou >= IOU_TRACKING_THRESHOLD:
                _active_players[best_id].update_bbox(det_bbox, frame_num)
                assigned.append(i)

        unmatched = [b for i, b in enumerate(track_detections) if i not in assigned]
        for bbox in unmatched:
            feat = extract_features(frame, bbox)
            best_id, max_sim = get_best_match(feat, _inactive_players)

            if max_sim >= FEATURE_SIMILARITY_THRESHOLD:
                p = _inactive_players.pop(best_id)
                p.update_bbox(bbox, frame_num)
                p.features = feat
                _active_players[best_id] = p
                reidentified += 1

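            # Hard cap on the number of distinct IDs; once it is reached,
            # the else branch below reuses an existing ID instead.
            elif _next_player_id < 22: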
                new = Player(_next_player_id, bbox, frame_num, feat)
                _active_players[_next_player_id] = new
                _next_player_id += 1
                new_players += 1

            else:
                fallback_id, fallback_sim = get_best_match(feat, _active_players)
                if fallback_sim >= 0.4:
                    _active_players[fallback_id].update_bbox(bbox, frame_num)
                    _active_players[fallback_id].features = feat
                    reidentified += 1
                else:
                    oldest_id = min(_active_players.items(), key=lambda x: x[1].last_seen_frame)[0]
                    _active_players[oldest_id].update_bbox(bbox, frame_num)
                    _active_players[oldest_id].features = feat
                    print(f" Forced ID reuse: ID {oldest_id}")
                    reidentified += 1

        total_reidentified += reidentified
        total_new_players += new_players
        total_active_sum += len(_active_players)

        for pid in list(_active_players.keys()):
            if _active_players[pid].last_seen_frame < frame_num:
                _active_players[pid].lost_frames_count += 1
                if _active_players[pid].lost_frames_count > MAX_LOST_FRAMES:
                    _inactive_players[pid] = _active_players.pop(pid)

        for d in detections:
            x1, y1, x2, y2 = d['bbox']
            color = (0, 255, 0) if d['class'] == 'player' else (0, 0, 255) if d['class'] == 'referee' else (255, 0, 0)
            cv2.rectangle(frame, (x1, y1), (x2, y2), color, 1)
            cv2.putText(frame, f"{d['class']} {d['conf']:.2f}", (x1, y1 - 5),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 1)

        for pid, player in _active_players.items():
            x1, y1, x2, y2 = map(int, player.bbox)
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 255), 3)
            cv2.putText(frame, f"ID: {pid}", (x1, y1 - 10),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 255), 2)

        out.write(frame)

        print(f"[Frame {frame_num:03d}] Detections: {len(detections)} | New: {new_players} | ReID: {reidentified} | Active: {len(_active_players)}")

    cap.release()
    out.release()

    # Print final metrics
    print("\nFinal Stats:")
    print(f" Total frames processed: {frame_num}")
    print(f" Total unique players (IDs): {_next_player_id}")
    print(f" Total new player detections: {total_new_players}")
    print(f" Total re-identifications: {total_reidentified}")
    print(f" Avg active players/frame: {total_active_sum / frame_num:.2f}")
    print(f" Total detections (all classes): {total_detections}")
    print(f" Output saved at: {OUTPUT_VIDEO_PATH}")
    return OUTPUT_VIDEO_PATH
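
One limitation of the IoU stage above is that each detection independently picks its best active player, so two nearby detections can both update the same track in one frame. A possible alternative, sketched here under the assumption that a one-to-one assignment is wanted, is to rank all candidate pairs by IoU and accept each track and each detection at most once. The function `match_one_to_one` is hypothetical and is not what the notebook runs.

```python
def calculate_iou(box_a, box_b):
    """IoU of two [x1, y1, x2, y2] boxes (inclusive-pixel convention, as in the notebook)."""
    x_a, y_a = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x_b, y_b = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x_b - x_a + 1) * max(0, y_b - y_a + 1)
    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / float(area_a + area_b - inter + 1e-6)

def match_one_to_one(track_boxes, det_boxes, iou_threshold=0.3):
    """Greedy one-to-one matching of detections to tracks by descending IoU.

    track_boxes: dict of player ID -> box; det_boxes: list of boxes.
    Returns (matches, unmatched), where matches maps detection index -> player ID.
    """
    pairs = []
    for pid, track_box in track_boxes.items():
        for j, det_box in enumerate(det_boxes):
            iou = calculate_iou(track_box, det_box)
            if iou >= iou_threshold:
                pairs.append((iou, pid, j))
    pairs.sort(key=lambda p: p[0], reverse=True)  # best overlaps first

    matches, used_tracks, used_dets = {}, set(), set()
    for iou, pid, j in pairs:
        if pid in used_tracks or j in used_dets:
            continue  # this track or detection is already taken
        matches[j] = pid
        used_tracks.add(pid)
        used_dets.add(j)
    unmatched = [j for j in range(len(det_boxes)) if j not in used_dets]
    return matches, unmatched

# One active track, two detections that both overlap it above the threshold.
tracks = {0: [0, 0, 9, 9]}
detections = [[1, 1, 10, 10], [2, 2, 11, 11]]
matches, unmatched = match_one_to_one(tracks, detections)
print(matches, unmatched)  # prints: {0: 0} [1]
```

Here the closer detection keeps ID 0 and the second one falls through to the appearance-matching stage. Swapping this in would change which detections reach that stage, so the per-frame counts would likely differ from the logged run.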

# Run YOLOv11 Model and Start Player Re-Identification  
This cell loads the YOLOv11 model and processes the input video to detect, track, and consistently re-identify players throughout the footage. The final annotated video is saved to the output path.


In [8]:
_load_yolo_model()
output = process_reid_video(VIDEO_PATH)

[INFO] Loading YOLO model from /content/drive/MyDrive/Player_Re_Identification_Assignment/best.pt on CUDA...
[INFO] Model loaded successfully. Tracking class ID: 2
[Frame 001] Detections: 17 | New: 15 | ReID: 0 | Active: 15
[Frame 002] Detections: 19 | New: 0 | ReID: 0 | Active: 15
[Frame 003] Detections: 18 | New: 0 | ReID: 0 | Active: 15
[Frame 004] Detections: 18 | New: 2 | ReID: 0 | Active: 17
[Frame 005] Detections: 17 | New: 0 | ReID: 0 | Active: 17
[Frame 006] Detections: 17 | New: 1 | ReID: 0 | Active: 18
[Frame 007] Detections: 17 | New: 1 | ReID: 0 | Active: 19
[Frame 008] Detections: 16 | New: 1 | ReID: 0 | Active: 20
[Frame 009] Detections: 16 | New: 2 | ReID: 0 | Active: 22
[Frame 010] Detections: 17 | New: 0 | ReID: 2 | Active: 22
[Frame 011] Detections: 16 | New: 0 | ReID: 2 | Active: 22
[Frame 012] Detections: 15 | New: 0 | ReID: 2 | Active: 22
[Frame 013] Detections: 17 | New: 0 | ReID: 1 | Active: 22
[Frame 014] Detections: 16 | New: 0 | ReID: 0 | Active: 22
[Frame 01

# Download the Output Video  
This cell checks whether the re-identification output video was generated and, if so, triggers a browser download of the file from Google Colab.


In [9]:
from google.colab import files
if os.path.exists(OUTPUT_VIDEO_PATH):
    print("Output video ready.")
    files.download(OUTPUT_VIDEO_PATH)
else:
    print("Output not found.")

Output video ready.

