# Pedestrian Counting with YOLOv11 (A: Detection, B: Tracking, C: Tracking + Scene Counting)

This notebook runs **YOLOv11** detection and Ultralytics tracking on a video of pedestrians, in three phases:

- **Phase A**: Detection-only → saves **`people_detection.mp4`**.
- **Phase B**: Detection + Tracking (IDs) → saves **`people_tracked.mp4`**.
- **Phase C**: Detection + Tracking + Scene Counting (number of people currently in the frame) → saves **`people_counted.mp4`**.

> Put your weights `yolo11m.pt` and a sample `Pedestrians.mp4` in the working directory.


## 0) Install packages

In [None]:
!pip -q install ultralytics opencv-python shapely lapx

## 1) Imports, configuration & helpers

In [None]:
from pathlib import Path

import cv2

from ultralytics import YOLO

# --- Configuration ---
WEIGHTS = 'yolo11m.pt'        # YOLOv11 weights path
SOURCE  = 'Pedestrians.mp4'   # input video path (or 0 for the default webcam)

# Sanity checks (test SOURCE == 0 first: Path(0) would raise a TypeError)
assert Path(WEIGHTS).exists(), f'Missing weights: place YOLOv11 weights at ./{WEIGHTS}'
assert SOURCE == 0 or Path(SOURCE).exists(), f'Missing input video: {SOURCE}'

# COCO class id
PERSON_ID = 0


print('Config OK:', WEIGHTS, SOURCE)


Config OK: yolo11m.pt Pedestrians.mp4
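
The sanity checks above can be folded into a small helper, sketched below. The name `check_source` is hypothetical and is not part of Ultralytics or OpenCV. It assumes that an integer source is a camera index (which it does not probe) and that anything else is a path to a video file.

```python
from pathlib import Path
from typing import Union


def check_source(source: Union[int, str, Path]) -> Union[int, str]:
    """Return `source` if it looks usable, otherwise raise.

    Hypothetical helper: integers are treated as camera indices and are
    not opened; strings and paths must point to an existing file.
    """
    # bool is a subclass of int, so reject it explicitly
    if isinstance(source, bool):
        raise TypeError("source must be an int camera index or a file path")
    if isinstance(source, int):
        if source < 0:
            raise ValueError(f"camera index must be >= 0, got {source}")
        return source
    path = Path(source)
    if not path.is_file():
        raise FileNotFoundError(f"input video not found: {path}")
    return str(path)
```

Checking the integer case before building a `Path` avoids the `TypeError` that `Path(0)` would raise.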


## 2) PHASE A — Detection-only → `people_detection.mp4`

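This cell:
- runs **detection** on the input video,
- draws person boxes with confidence scores,
- and saves the result to **`people_detection.mp4`**.

Notes:
- Detections are restricted to the **person** class (COCO ID 0) by masking `r.boxes.cls` after inference; passing `classes=[PERSON_ID]` to `predict` would apply the same filter inside the call.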
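
The `iou=0.5` argument in the cell below sets the IoU threshold used by non-maximum suppression, which discards lower-scoring boxes that overlap a kept box by more than that amount. As a rough illustration of the quantity being thresholded, here is a sketch of IoU for two `[x1, y1, x2, y2]` boxes. `box_iou` is a hypothetical helper written for this note and is not the Ultralytics implementation.

```python
from typing import Sequence


def box_iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two axis-aligned boxes in [x1, y1, x2, y2] form."""
    # Overlap rectangle (zero area if the boxes do not intersect)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = max(0.0, a[2] - a[0]) * max(0.0, a[3] - a[1])
    area_b = max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 100x100 boxes offset by 50 px horizontally: IoU is one third
print(box_iou([0, 0, 100, 100], [50, 0, 150, 100]))
```

With a threshold of 0.5, a pair of boxes like this one would both be kept, which is usually what you want for two people standing close together.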

In [None]:
# PHASE A — Detect (person only) → people_detection.mp4
# Expects these to be defined earlier in the notebook:
# WEIGHTS (e.g., 'yolo11m.pt'), SOURCE (e.g., 'Pedestrians.mp4' or 0), PERSON_ID = 0
# Requires: ultralytics, opencv-python

# 1) Load model
model = YOLO(WEIGHTS)


# 2) Stream detection results frame-by-frame (Ultralytics reads SOURCE internally)
results = model.predict(
    source=SOURCE,
    stream=True,            # yields one result per frame
    conf=0.35,              # raise to be stricter, lower to see more
    iou=0.5,                # NMS IoU threshold for suppressing overlapping boxes
    imgsz=960,
    verbose=False,
)

writer = None
frames = 0

for r in results:
    # Original BGR frame to draw on
    frame = r.orig_img.copy()

    # Create the writer on the first frame (match width/height; fixed 30 FPS output)
    if writer is None:
        h, w = frame.shape[:2]
        fourcc = cv2.VideoWriter_fourcc(*"mp4v")
        writer = cv2.VideoWriter("people_detection.mp4", fourcc, 30.0, (w, h))

    # If detections exist, keep only 'person' and draw boxes + confidence
    if r.boxes is not None and len(r.boxes) > 0:
        cls = r.boxes.cls.int().cpu().numpy()
        msk = (cls == PERSON_ID)
        if msk.any():
            xyxy  = r.boxes.xyxy.cpu().numpy()[msk]   # [x1,y1,x2,y2]
            confs = r.boxes.conf.cpu().numpy()[msk]   # confidence per box
            for (x1, y1, x2, y2), cf in zip(xyxy, confs):
                cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
                cv2.putText(frame, f"person {cf:.2f}",
                            (int(x1), max(0, int(y1) - 5)),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)

    # Write the annotated frame
    writer.write(frame)
    frames += 1

# 3) Clean up
if writer is not None:
    writer.release()

print(f"Saved: people_detection.mp4 ({frames} frames)")

Saved: people_detection.mp4 (597 frames)
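
The cell above writes at a fixed 30 FPS, so a source recorded at a different frame rate will play back faster or slower than the original. One possible way around this, assuming `SOURCE` is a video file that OpenCV can open, is to read the frame rate from the file and fall back to 30 when the value is missing or implausible. `safe_fps` and `source_fps` are hypothetical helper names.

```python
import math


def safe_fps(reported: float, fallback: float = 30.0) -> float:
    """Return `reported` if it is a finite, positive frame rate, else `fallback`."""
    if reported is None or not math.isfinite(reported) or reported <= 0:
        return fallback
    return float(reported)


def source_fps(path: str, fallback: float = 30.0) -> float:
    """Read the frame rate OpenCV reports for a video file."""
    import cv2  # imported here so safe_fps stays usable without OpenCV

    cap = cv2.VideoCapture(path)
    try:
        # CAP_PROP_FPS can come back as 0 when no rate metadata is available
        return safe_fps(cap.get(cv2.CAP_PROP_FPS), fallback)
    finally:
        cap.release()
```

The writer could then be created with `cv2.VideoWriter("people_detection.mp4", fourcc, source_fps(SOURCE), (w, h))`. This does not apply to a webcam source, where the reported rate is often unreliable.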


## 3) PHASE B — Detection + Tracking (IDs only) → `people_tracked.mp4`

This cell:
- runs **tracking** (BoT-SORT) on the input video,
- draws person boxes + IDs,
- and saves the result to **`people_tracked.mp4`**.

Notes:
- `classes=[PERSON_ID]` restricts detections to the **person** class (COCO ID 0).
- `tracker="botsort.yaml"` selects the BoT-SORT tracker, chosen here to reduce ID switches.
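
A track ID is what lets later code tie boxes in different frames to the same person. As a small illustration, the sketch below groups box centers by ID across frames. `update_trails` and the sample IDs and boxes are made up for this example; in the notebook the IDs would come from `r.boxes.id` and the boxes from `r.boxes.xyxy`.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

Point = Tuple[float, float]


def update_trails(
    trails: Dict[int, List[Point]],
    ids: Iterable[int],
    boxes: Iterable[Sequence[float]],
    max_len: int = 30,
) -> None:
    """Append each box center to the trail of its track ID (in place)."""
    for tid, (x1, y1, x2, y2) in zip(ids, boxes):
        trail = trails[int(tid)]
        trail.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0))
        # Keep only the most recent `max_len` points per ID
        del trail[:-max_len]


trails: Dict[int, List[Point]] = defaultdict(list)
# Frame 1: two people; frame 2: person 1 moved right, person 2 is gone
update_trails(trails, [1, 2], [(0, 0, 10, 20), (100, 0, 110, 20)])
update_trails(trails, [1], [(4, 0, 14, 20)])
print(dict(trails))
```

Each trail could be drawn with `cv2.polylines` to visualize a person's recent path, though that is outside what this notebook does.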

In [23]:
# PHASE B — Detect + Track (IDs only) → people_tracked.mp4
# Expects these to be defined earlier in the notebook:
# WEIGHTS (e.g., 'yolo11m.pt'), SOURCE (e.g., 'Pedestrians.mp4' or 0), PERSON_ID = 0
# Requires: ultralytics, opencv-python


# Inference params
# Start tracking stream:
# - BoT-SORT for fewer ID switches
# - Only 'person' class
# - imgsz 960 = more stable than 640 (but slower)
results = model.track(
    source=SOURCE,
    tracker="botsort.yaml",
    stream=True,            # yields one result per frame
    classes=[PERSON_ID],    # filter to people
    conf=0.35,              # raise to be stricter, lower to see more
    iou=0.5,
    imgsz=960,
    persist=True,           # keep tracker state across frames
    verbose=False,
)


writer = None

for r in results:
    # Get the current frame (BGR) to draw on
    frame = r.orig_img.copy()

    # Create the video writer on the first frame (match input width/height)
    if writer is None:
        h, w = frame.shape[:2]
        fourcc = cv2.VideoWriter_fourcc(*"mp4v")
        writer = cv2.VideoWriter("people_tracked.mp4", fourcc, 30.0, (w, h))  # fixed 30 FPS output

    # If we have detections with IDs, draw boxes + ID labels
    if r.boxes is not None and len(r.boxes) > 0 and getattr(r.boxes, "id", None) is not None:
        ids = r.boxes.id.int().cpu().numpy()       # tracker IDs (one per detection)
        cls = r.boxes.cls.int().cpu().numpy()      # class index per detection
        msk = (cls == PERSON_ID)                   # keep only 'person'
        xyxy = r.boxes.xyxy.cpu().numpy()[msk]     # boxes for people only
        ids_f = ids[msk]                           # matching IDs for people

        # Draw each person's box and ID above it
        for (x1, y1, x2, y2), tid in zip(xyxy, ids_f):
            cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (255, 200, 0), 2)
            cv2.putText(frame, f"ID {int(tid)}",
                        (int(x1), max(0, int(y1) - 5)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 200, 0), 2)

    # Write the annotated frame to the output video
    writer.write(frame)

# Close the file cleanly
if writer is not None:
    writer.release()

print("Saved: people_tracked.mp4")


Saved: people_tracked.mp4


## 4) PHASE C — Detection + Tracking + Scene Counting → `people_counted.mp4`

This cell:
- runs **tracking** (BoT-SORT) on the input video,
- draws person boxes + IDs,
- overlays the **current number of people in the scene** (distinct track IDs in the frame),
- and saves the result to **`people_counted.mp4`**.

Notes:
- `classes=[PERSON_ID]` restricts detections to the **person** class (COCO ID 0).
- `tracker="botsort.yaml"` selects the BoT-SORT tracker, chosen here to reduce ID switches.
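
The overlay in this phase is the number of distinct track IDs in the current frame. The sketch below shows that count next to a running total of every ID seen so far, which is a different quantity and is not computed by this notebook. `SceneCounter` is a hypothetical class, and the total assumes the tracker never gives one person a second ID; ID switches would inflate it.

```python
from typing import Iterable, Set


class SceneCounter:
    """Keep the per-frame count and the set of all track IDs seen so far."""

    def __init__(self) -> None:
        self.seen: Set[int] = set()
        self.current: int = 0

    def update(self, frame_ids: Iterable[int]) -> int:
        """Record one frame's IDs and return how many distinct IDs it holds."""
        unique = {int(i) for i in frame_ids}
        self.current = len(unique)
        self.seen |= unique
        return self.current

    @property
    def total(self) -> int:
        """Number of distinct IDs observed across all frames so far."""
        return len(self.seen)


counter = SceneCounter()
for ids in ([1, 2, 3], [2, 3], [3, 4]):   # made-up IDs for three frames
    counter.update(ids)
print(counter.current, counter.total)
```

In the loop below, `counter.update(current_ids)` would return the same value as `len(set(current_ids))`.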

In [26]:
# PHASE C — Detection + Tracking + Scene Counting (current only) → people_counted.mp4
# Expects these to be defined earlier in the notebook:
# WEIGHTS (e.g., 'yolo11m.pt'), SOURCE (e.g., 'Pedestrians.mp4' or 0), PERSON_ID = 0
# Requires: ultralytics, opencv-python


results = model.track(
    source=SOURCE,
    tracker="botsort.yaml",
    stream=True,            # yields one result per frame
    classes=[PERSON_ID],    # filter to people
    conf=0.35,              # raise to be stricter, lower to see more
    iou=0.5,
    imgsz=960,
    persist=True,           # keep tracker state across frames
    verbose=False,
)

writer = None

for r in results:
    frame = r.orig_img.copy()

    # Create the video writer on the first frame
    if writer is None:
        h, w = frame.shape[:2]
        fourcc = cv2.VideoWriter_fourcc(*"mp4v")
        writer = cv2.VideoWriter("people_counted.mp4", fourcc, 30.0, (w, h))  # 30 FPS output

    # Collect IDs for people in this frame
    current_ids = []
    if r.boxes is not None and len(r.boxes) > 0 and getattr(r.boxes, "id", None) is not None:
        ids = r.boxes.id.int().cpu().numpy()
        cls = r.boxes.cls.int().cpu().numpy()
        msk = (cls == PERSON_ID)
        current_ids = ids[msk].tolist()

        # draw boxes + IDs just for clarity
        xyxy = r.boxes.xyxy.cpu().numpy()[msk]
        for (x1, y1, x2, y2), tid in zip(xyxy, current_ids):
            cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 170, 255), 2)
            cv2.putText(frame, f"ID {int(tid)}",
                        (int(x1), max(0, int(y1) - 5)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 170, 255), 2)

    # Show current count
    current_count = len(set(current_ids))
    cv2.putText(frame, f"Current in scene: {current_count}",
                (15, 35), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (50, 255, 50), 2)

    writer.write(frame)

if writer is not None:
    writer.release()

print("Saved: people_counted.mp4")


Saved: people_counted.mp4
