# 02 — Bounding Box Visualization

**Objective:**  
Visualize object bounding boxes from the EPIC-KITCHENS dataset to check that the annotations are correct and aligned with their frames.

This notebook:
1. Loads the EPIC-KITCHENS object annotation CSV.  
2. Parses bounding boxes from the dataset.  
3. Draws them on top of video frames.  
4. Exports annotated images for visual inspection.

**Based on:** `epic55-bbox.py`

In [91]:
%load_ext autoreload
%autoreload 2

import pandas as pd
import cv2
import os, sys, ast, random
from tqdm import tqdm

sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))
from config import DATA_ROOT, ANNOTATION_CSV, FRAMES_ROOT

# === Paths ===
FRAMES_DIR = os.path.join(FRAMES_ROOT, "P04", "object_detection_images")  # Example participant folder
OUTPUT_DIR = "./data/annotated_frames"

os.makedirs(OUTPUT_DIR, exist_ok=True)

print(f"Config loaded successfully\nDATA_ROOT: {DATA_ROOT}\nANNOTATION_CSV: {ANNOTATION_CSV}\nOUTPUT_DIR: {OUTPUT_DIR}")

Config loaded successfully
DATA_ROOT: ../annotations
ANNOTATION_CSV: ../annotations/EPIC_train_object_labels.csv
OUTPUT_DIR: ./data/annotated_frames


## Step 1 — Load and Filter Annotations

We start by reading the main EPIC-KITCHENS annotation CSV and dropping rows whose `bounding_boxes` field is missing or is an empty list (`[]`). Malformed entries are removed later, in Step 2.

In [92]:
# === Load Annotations ===
print("[INFO] Loading annotations...")
df = pd.read_csv(ANNOTATION_CSV)

# Keep only rows with valid bounding boxes
df = df[df["bounding_boxes"].notna() & (df["bounding_boxes"] != "[]")]

print(f"Total valid rows before parsing: {len(df)}")
print(df)

[INFO] Loading annotations...
Total valid rows before parsing: 299546
        noun_class      noun participant_id video_id  frame  \
0               20       bag            P01   P01_01  56581   
1               20       bag            P01   P01_01  56611   
2               20       bag            P01   P01_01  56641   
3               20       bag            P01   P01_01  56671   
4               20       bag            P01   P01_01  56701   
...            ...       ...            ...      ...    ...   
389802           7  teaspoon            P31   P31_14  12031   
389803           7  teaspoon            P31   P31_14  12061   
389804           7  teaspoon            P31   P31_14  12091   
389805           7  teaspoon            P31   P31_14  12121   
389806           7  teaspoon            P31   P31_14  12151   

                bounding_boxes  
0       [(76, 1260, 462, 186)]  
1       [(84, 1190, 446, 204)]  
2       [(584, 936, 358, 268)]  
3       [(472, 836, 412, 342)]  
4       

## Step 2 — Parse Bounding Boxes

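Each row stores its bounding boxes as a string holding a list of tuples (e.g., `'[(top, left, height, width)]'`).  
We parse each string with `ast.literal_eval`, which evaluates Python literals only and never executes code, and keep only the 4-element tuples. Rows that fail to parse, or that end up with no boxes, are dropped.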

In [93]:
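def parse_boxes(x):
    """Parse a stringified list of (top, left, height, width) tuples."""
    try:
        boxes = ast.literal_eval(x)
        # A bare tuple such as "(1, 2, 3, 4)" is treated as a single box.
        if isinstance(boxes, tuple):
            boxes = [boxes]
        # Keep only 4-element entries and coerce their values to int.
        return [tuple(map(int, b)) for b in boxes if len(b) == 4]
    except (ValueError, SyntaxError, TypeError):
        # Malformed or non-string input yields no boxes.
        return []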

df["boxes"] = df["bounding_boxes"].apply(parse_boxes)
df = df[df["boxes"].map(len) > 0]

print(f"✅ Parsed {len(df)} rows with valid bounding boxes")

✅ Parsed 299546 rows with valid bounding boxes
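
To make the parsing rules concrete, here is a small standalone sketch that runs a version of `parse_boxes` on a few hand-written strings. The first sample is copied from the table in Step 1; the other three are made-up inputs chosen to exercise the multi-box, wrong-length, and unparsable cases.

```python
import ast

def parse_boxes(x):
    """Parse a stringified list of (top, left, height, width) tuples."""
    try:
        boxes = ast.literal_eval(x)
        if isinstance(boxes, tuple):
            boxes = [boxes]
        return [tuple(map(int, b)) for b in boxes if len(b) == 4]
    except (ValueError, SyntaxError, TypeError):
        return []

# Illustrative inputs only; just the first one appears in the dataset preview.
samples = {
    "single box": "[(76, 1260, 462, 186)]",
    "two boxes": "[(76, 1260, 462, 186), (84, 1190, 446, 204)]",
    "wrong arity": "[(1, 2, 3)]",
    "not a literal": "not a list",
}
parsed = {name: parse_boxes(text) for name, text in samples.items()}
for name, boxes in parsed.items():
    print(f"{name:>14}: {boxes}")
```

Entries with the wrong number of values and strings that are not Python literals both come back as an empty list, which is why the filter `df["boxes"].map(len) > 0` is enough to drop them.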


## Step 3 — Assign Colors per Class

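We assign a random color to each unique object noun so that classes are easy to tell apart. Two caveats: OpenCV interprets color tuples in BGR order rather than RGB, and because the random generator is not seeded, the colors change from run to run.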

In [98]:
unique_nouns = df["noun"].unique()
colors = {noun: tuple(random.randint(0, 255) for _ in range(3)) for noun in unique_nouns}

print(f"Assigned colors for {len(unique_nouns)} classes.")

Assigned colors for 784 classes.
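
If stable colors across runs matter (for example, when comparing exports from different sessions), one option is to derive each color from the noun itself instead of from `random`. The sketch below is an alternative to the cell above, not what the notebook does; `color_for` is a hypothetical helper name.

```python
import hashlib

def color_for(noun):
    """Derive a stable color tuple from the noun's name (hypothetical helper)."""
    digest = hashlib.sha256(noun.encode("utf-8")).digest()
    # The first three bytes become the channel values (0-255);
    # OpenCV reads such a tuple as (B, G, R).
    return (digest[0], digest[1], digest[2])

# Two nouns taken from the annotation preview in Step 1.
demo_nouns = ["bag", "teaspoon"]
demo_colors = {noun: color_for(noun) for noun in demo_nouns}
print(demo_colors)
```

The same noun always maps to the same color, at the cost of occasionally producing two visually similar colors for different nouns.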


## Step 4 — Group Frames by Video and Frame ID

We group the annotation rows by `(video_id, frame)` so that all boxes belonging to the same frame are drawn in a single pass over that image.

In [99]:
grouped = df.groupby(["video_id", "frame"])
print(f"Total frames to process: {len(grouped)}")

Total frames to process: 200390


## Step 5 — Draw Bounding Boxes

For each frame:
- Load the corresponding image.
- Draw all bounding boxes associated with that frame.
- Save the annotated frame to the output directory.
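
The drawing step depends on converting each `(top, left, height, width)` tuple into the two corner points that `cv2.rectangle` expects, with x before y. A minimal sketch of that conversion follows. The clamping to the frame bounds is an optional addition that the notebook does not perform, `box_to_corners` is a hypothetical helper name, and the 1920x1080 frame size is an assumption used only for the demo.

```python
def box_to_corners(box, frame_width, frame_height):
    """Convert (top, left, height, width) to ((x1, y1), (x2, y2)), clamped to the frame."""
    top, left, height, width = box
    x1 = min(max(left, 0), frame_width - 1)
    y1 = min(max(top, 0), frame_height - 1)
    x2 = min(max(left + width, 0), frame_width - 1)
    y2 = min(max(top + height, 0), frame_height - 1)
    return (x1, y1), (x2, y2)

# Box from the first row of the annotation preview; the frame size is assumed.
corners = box_to_corners((76, 1260, 462, 186), frame_width=1920, frame_height=1080)
print(corners)  # ((1260, 76), (1446, 538))
```

Note that the annotation tuple lists the vertical coordinate first, so swapping `top` and `left` is an easy mistake that shows up as boxes mirrored across the image diagonal.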

---
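Note: frame filenames may be zero-padded to **10 digits** (e.g., `0000000123.jpg`) or to **6 digits** (e.g., `000123.jpg`).
The loop tries the 10-digit name first and falls back to the 6-digit one. Frames found under neither name are skipped silently.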
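
The filename fallback can also be factored into a small helper, sketched here under the assumption that frames are named only by their zero-padded number plus `.jpg`. `resolve_frame_path` and its `paddings` parameter are hypothetical names, and the demo runs against a throwaway temporary directory rather than real dataset frames.

```python
import os
import tempfile

def resolve_frame_path(video_dir, frame_number, paddings=(10, 6)):
    """Return the first existing '<zero-padded frame>.jpg' path, or None."""
    for digits in paddings:
        candidate = os.path.join(video_dir, f"{int(frame_number):0{digits}d}.jpg")
        if os.path.exists(candidate):
            return candidate
    return None

# Demo on a temporary directory holding a single 6-digit frame file.
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "000123.jpg"), "wb").close()
    found = resolve_frame_path(demo_dir, 123)
    missing = resolve_frame_path(demo_dir, 124)
    found_name = os.path.basename(found)

print(found_name, missing)  # 000123.jpg None
```

Returning `None` for an unresolved frame lets the caller decide whether to skip it, log it, or count it.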

In [100]:
for (video_id, frame_number), rows in tqdm(grouped, desc="[INFO] Drawing"):
    video_dir = os.path.join(FRAMES_DIR, video_id)
    frame_path = os.path.join(video_dir, f"{int(frame_number):010d}.jpg")

    # Handle 6-digit frame name variants
    if not os.path.exists(frame_path):
        frame_path = os.path.join(video_dir, f"{int(frame_number):06d}.jpg")
        if not os.path.exists(frame_path):
            continue  # skip missing frames

    frame = cv2.imread(frame_path)
    if frame is None:
        print(f"[WARN] Could not read {frame_path}")
        continue

    # Draw bounding boxes
    for _, row in rows.iterrows():
        noun = row["noun"]
        color = colors[noun]
        for (top, left, height, width) in row["boxes"]:
            x1, y1 = int(left), int(top)
            x2, y2 = int(left + width), int(top + height)
            cv2.rectangle(frame, (x1, y1), (x2, y2), color, 2)
            cv2.putText(frame, noun, (x1, max(0, y1 - 5)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, color, 2)

    # Save output
    out_dir = os.path.join(OUTPUT_DIR, video_id)
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, f"frame_{int(frame_number):010d}.jpg")
    cv2.imwrite(out_path, frame)

print("\n✅ All annotated frames saved to:", OUTPUT_DIR)

[INFO] Drawing: 100%|██████████| 200390/200390 [00:07<00:00, 26432.83it/s]


✅ All annotated frames saved to: ./data/annotated_frames
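
The progress bar above reports roughly 26,000 iterations per second, which is far faster than reading, drawing on, and re-encoding JPEG frames would normally run. That suggests most iterations took the `continue` branch for missing files, which would be consistent with `FRAMES_DIR` pointing at a single participant folder while the annotations cover all participants. A sketch for checking this before drawing follows. `tally_frames` is a hypothetical helper, and the demo keys and directory are made up.

```python
import os
import tempfile
from collections import Counter

def tally_frames(keys, frames_dir, paddings=(10, 6)):
    """Count how many (video_id, frame) keys resolve to an existing image file."""
    counts = Counter()
    for video_id, frame_number in keys:
        names = [f"{int(frame_number):0{d}d}.jpg" for d in paddings]
        exists = any(
            os.path.exists(os.path.join(frames_dir, video_id, name)) for name in names
        )
        counts["found" if exists else "missing"] += 1
    return counts

# Demo: one frame present on disk, two keys without a matching file.
with tempfile.TemporaryDirectory() as demo_dir:
    os.makedirs(os.path.join(demo_dir, "P04_01"))
    open(os.path.join(demo_dir, "P04_01", "0000000031.jpg"), "wb").close()
    demo_keys = [("P04_01", 31), ("P04_01", 61), ("P01_01", 56581)]
    counts = tally_frames(demo_keys, demo_dir)

print(dict(counts))  # {'found': 1, 'missing': 2}
```

In the notebook, `grouped.groups.keys()` could supply the keys and `FRAMES_DIR` the directory. A `found` count near zero would mean the final "saved" message does not reflect what was actually written.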





## ✅ Summary

- **Input:** `EPIC_train_object_labels.csv`  
- **Output:** Annotated frames with colored bounding boxes per object  
- **Goal:** Verify that bounding boxes align correctly with their objects.

You can now inspect the annotated frames in the output directory:

In [97]:
print(f"Annotated frames saved in:\n{OUTPUT_DIR}")

Annotated frames saved in:
./data/annotated_frames
