# Notebook 03: Bounding Box Visualization

**Course:** Deep Neural Network Architectures (21CSE558T)  
**Module 5:** Object Detection and Localization  
**Week 13:** Object Localization Fundamentals  
**Duration:** ~15 minutes

## Learning Objectives
By the end of this notebook, you will be able to:
- Draw bounding boxes on images using different formats
- Convert between coordinate formats (XYXY, XYWH, COCO)
- Visualize multiple objects with labels and confidence scores
- Understand normalized coordinates and their importance

## Introduction
Bounding boxes are the fundamental representation for object detection. Understanding how to work with different coordinate formats is crucial for:
- Reading datasets (COCO, PASCAL VOC, YOLO)
- Implementing detection models
- Visualizing predictions
- Computing metrics like IoU

## Setup and Imports

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image, ImageDraw, ImageFont
import cv2

# Set random seed for reproducibility
np.random.seed(42)

# Configure matplotlib for better visualization
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 10

print("Libraries imported successfully!")
print(f"NumPy version: {np.__version__}")
print(f"OpenCV version: {cv2.__version__}")

## Coordinate Formats Explained

There are three main coordinate formats for bounding boxes:

### 1. XYXY Format (Corner Points)
- **[x_min, y_min, x_max, y_max]**
- Top-left corner: (x_min, y_min)
- Bottom-right corner: (x_max, y_max)
- Used by: PASCAL VOC, torchvision detection models
- Example: [50, 30, 200, 180]

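### 2. XYWH Format (Center + Size)
- **[x_center, y_center, width, height]**
- Center point: (x_center, y_center)
- Box dimensions: width × height
- Used by: YOLO annotations (which additionally normalize the values to [0, 1])
- Example: [125, 105, 150, 150]
- Naming note: some libraries (for example torchvision) call this format `cxcywh` and reserve `xywh` for the top-left variant; this notebook uses XYWH for the center-based form throughout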

### 3. COCO Format (Top-Left + Size)
- **[x_top_left, y_top_left, width, height]**
- Top-left corner: (x_top_left, y_top_left)
- Box dimensions: width × height
- Used by: MS COCO dataset
- Example: [50, 30, 150, 150]

### Visual Comparison:
```
        (x_min, y_min) ●─────────┐
                       │         │
                       │    ●    │  ← (x_center, y_center)
                       │         │
                       └─────────● (x_max, y_max)
                       
        Width = x_max - x_min
        Height = y_max - y_min
```
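
To make the comparison concrete, the sketch below derives the center-based and COCO encodings from the XYXY example `[50, 30, 200, 180]` listed above. It uses plain Python arithmetic only; the variable names are illustrative and are not reused elsewhere in the notebook.

```python
# One box, three encodings (all values in pixels).
x_min, y_min, x_max, y_max = 50, 30, 200, 180   # XYXY example from above

width = x_max - x_min     # 150
height = y_max - y_min    # 150

# Center-based form (called XYWH in this notebook)
xywh_center = [x_min + width / 2, y_min + height / 2, width, height]

# COCO form: top-left corner plus size
coco = [x_min, y_min, width, height]

print("XYXY:         ", [x_min, y_min, x_max, y_max])
print("XYWH (center):", xywh_center)   # [125.0, 105.0, 150, 150]
print("COCO:         ", coco)          # [50, 30, 150, 150]
```

All three lists describe the same rectangle, which is why the examples in sections 1 to 3 share the same width and height.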

## Create Sample Image

Let's create a synthetic image to demonstrate bounding boxes:

In [None]:
# Create a simple synthetic image (400x300 pixels)
width, height = 400, 300
image = np.ones((height, width, 3), dtype=np.uint8) * 240  # Light gray background

# Add some colored shapes to simulate objects
# Note: OpenCV colors are (B, G, R), and the image is converted to RGB only for display
# Object 1: Red rectangle
cv2.rectangle(image, (50, 30), (150, 130), (50, 50, 220), -1)
# Object 2: Blue circle
cv2.circle(image, (250, 80), 40, (220, 50, 50), -1)
# Object 3: Green triangle
pts = np.array([[100, 200], [150, 250], [50, 250]], np.int32)
cv2.fillPoly(image, [pts], (50, 220, 50))
# Object 4: Yellow rectangle
cv2.rectangle(image, (280, 180), (370, 260), (50, 220, 220), -1)

# Display the image
plt.figure(figsize=(10, 7))
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.title("Sample Image with Objects", fontsize=14, fontweight='bold')
plt.axis('off')
plt.tight_layout()
plt.show()

print(f"Image shape: {image.shape}")
print(f"Image size: {width} × {height} pixels")

## Draw Box in XYXY Format

In [None]:
def draw_box_xyxy(image, box, color=(0, 255, 0), thickness=2, label=None):
    """
    Draw bounding box in XYXY format.
    
    Args:
        image: Image array (BGR format)
        box: [x_min, y_min, x_max, y_max]
        color: Box color in BGR format
        thickness: Line thickness in pixels
        label: Optional text label
    
    Returns:
        Image with drawn box
    """
    img_copy = image.copy()
    x_min, y_min, x_max, y_max = [int(c) for c in box]
    
    # Draw rectangle
    cv2.rectangle(img_copy, (x_min, y_min), (x_max, y_max), color, thickness)
    
    # Add label if provided
    if label:
        # Background for text
        (text_width, text_height), _ = cv2.getTextSize(
            label, cv2.FONT_HERSHEY_SIMPLEX, 0.6, 1
        )
        cv2.rectangle(img_copy, (x_min, y_min - text_height - 10), 
                     (x_min + text_width + 5, y_min), color, -1)
        cv2.putText(img_copy, label, (x_min + 2, y_min - 5), 
                   cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 1)
    
    return img_copy

# Example: Draw box around red rectangle (Object 1)
box_xyxy = [40, 20, 160, 140]  # [x_min, y_min, x_max, y_max]
result = draw_box_xyxy(image, box_xyxy, color=(0, 255, 0), thickness=3, label="XYXY Format")

# Display
plt.figure(figsize=(10, 7))
plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
plt.title("Bounding Box in XYXY Format", fontsize=14, fontweight='bold')
plt.text(10, 280, f"Box coordinates: {box_xyxy}", 
         bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8),
         fontsize=12)
plt.axis('off')
plt.tight_layout()
plt.show()

print(f"XYXY box: {box_xyxy}")
print(f"Width: {box_xyxy[2] - box_xyxy[0]} pixels")
print(f"Height: {box_xyxy[3] - box_xyxy[1]} pixels")
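
The label background in `draw_box_xyxy` is always drawn above the box, so it can extend past the top edge of the image when a box sits close to the top. The sketch below shows one possible way to choose the label position: above the box when there is room, otherwise just inside it. `place_label` is a hypothetical helper (not part of OpenCV), and the text size of 110 × 13 pixels is an assumed value standing in for the result of `cv2.getTextSize`.

```python
def place_label(x_min, y_min, text_w, text_h, pad=5):
    """Return (rect_top_left, rect_bottom_right, text_origin) for a label.

    Hypothetical helper: the label goes above the box when it fits,
    otherwise just inside the top edge of the box.
    """
    label_h = text_h + 2 * pad
    if y_min - label_h >= 0:
        top = y_min - label_h      # enough room above the box
    else:
        top = y_min                # fall back to inside the box
    rect_tl = (x_min, top)
    rect_br = (x_min + text_w + pad, top + label_h)
    # cv2.putText expects the bottom-left corner of the text as its origin
    text_origin = (x_min + 2, top + label_h - pad)
    return rect_tl, rect_br, text_origin

# Box with room above it: the label sits above the box
print(place_label(40, 100, text_w=110, text_h=13))
# Box near the top edge: the label moves inside the box
print(place_label(40, 10, text_w=110, text_h=13))
```

The returned points can be passed to `cv2.rectangle` and `cv2.putText` in place of the hard-coded offsets.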

## Draw Box in XYWH Format

In [None]:
def draw_box_xywh(image, box, color=(255, 0, 0), thickness=2, label=None):
    """
    Draw bounding box in XYWH format (center-based).
    
    Args:
        image: Image array (BGR format)
        box: [x_center, y_center, width, height]
        color: Box color in BGR format
        thickness: Line thickness
        label: Optional text label
    
    Returns:
        Image with drawn box
    """
    img_copy = image.copy()
    x_center, y_center, w, h = box
    
    # Convert to corner points
    x_min = int(x_center - w / 2)
    y_min = int(y_center - h / 2)
    x_max = int(x_center + w / 2)
    y_max = int(y_center + h / 2)
    
    # Draw rectangle
    cv2.rectangle(img_copy, (x_min, y_min), (x_max, y_max), color, thickness)
    
    # Draw center point
    cv2.circle(img_copy, (int(x_center), int(y_center)), 5, color, -1)
    
    # Add label if provided
    if label:
        (text_width, text_height), _ = cv2.getTextSize(
            label, cv2.FONT_HERSHEY_SIMPLEX, 0.6, 1
        )
        cv2.rectangle(img_copy, (x_min, y_min - text_height - 10), 
                     (x_min + text_width + 5, y_min), color, -1)
        cv2.putText(img_copy, label, (x_min + 2, y_min - 5), 
                   cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 1)
    
    return img_copy

# Example: Draw box around blue circle (Object 2)
box_xywh = [250, 80, 100, 100]  # [x_center, y_center, width, height]
# (0, 0, 255) is red in BGR, so the center point is drawn as a red dot
result = draw_box_xywh(image, box_xywh, color=(0, 0, 255), thickness=3, label="XYWH Format")

# Display
plt.figure(figsize=(10, 7))
plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
plt.title("Bounding Box in XYWH Format (Center-based)", fontsize=14, fontweight='bold')
plt.text(10, 280, f"Box coordinates: {box_xywh}\nRed dot = center point", 
         bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8),
         fontsize=12)
plt.axis('off')
plt.tight_layout()
plt.show()

print(f"XYWH box: {box_xywh}")
print(f"Center: ({box_xywh[0]}, {box_xywh[1]})")
print(f"Size: {box_xywh[2]} × {box_xywh[3]} pixels")
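
A center-based box near the image border can convert to corner points that lie outside the image. The sketch below shows one way to clamp an XYXY box to the image bounds before drawing or computing areas. `clip_xyxy` is a hypothetical helper name, and the 400 × 300 size matches the sample image.

```python
def clip_xyxy(box, img_width, img_height):
    """Clamp an XYXY box to the rectangle [0, img_width] x [0, img_height]."""
    x_min, y_min, x_max, y_max = box
    x_min = min(max(x_min, 0), img_width)
    y_min = min(max(y_min, 0), img_height)
    x_max = min(max(x_max, 0), img_width)
    y_max = min(max(y_max, 0), img_height)
    return [x_min, y_min, x_max, y_max]

# Center-based box close to the top-left corner of a 400x300 image
x_c, y_c, w, h = 30, 20, 100, 100
raw = [x_c - w / 2, y_c - h / 2, x_c + w / 2, y_c + h / 2]

print("Raw corners:    ", raw)                       # [-20.0, -30.0, 80.0, 70.0]
print("Clipped corners:", clip_xyxy(raw, 400, 300))  # [0, 0, 80.0, 70.0]
```

Clipping changes the box size, so it is usually applied only for display or after all format conversions are done.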

## Format Conversion Functions

These functions convert between different bounding box formats:

In [None]:
def xyxy_to_xywh(box):
    """
    Convert XYXY format to XYWH format (center-based).
    
    Args:
        box: [x_min, y_min, x_max, y_max]
    
    Returns:
        [x_center, y_center, width, height]
    """
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    x_center = x_min + width / 2
    y_center = y_min + height / 2
    return [x_center, y_center, width, height]

def xywh_to_xyxy(box):
    """
    Convert XYWH format (center-based) to XYXY format.
    
    Args:
        box: [x_center, y_center, width, height]
    
    Returns:
        [x_min, y_min, x_max, y_max]
    """
    x_center, y_center, width, height = box
    x_min = x_center - width / 2
    y_min = y_center - height / 2
    x_max = x_center + width / 2
    y_max = y_center + height / 2
    return [x_min, y_min, x_max, y_max]

def xyxy_to_coco(box):
    """
    Convert XYXY format to COCO format.
    
    Args:
        box: [x_min, y_min, x_max, y_max]
    
    Returns:
        [x_top_left, y_top_left, width, height]
    """
    x_min, y_min, x_max, y_max = box
    width = x_max - x_min
    height = y_max - y_min
    return [x_min, y_min, width, height]

def coco_to_xyxy(box):
    """
    Convert COCO format to XYXY format.
    
    Args:
        box: [x_top_left, y_top_left, width, height]
    
    Returns:
        [x_min, y_min, x_max, y_max]
    """
    x, y, width, height = box
    x_max = x + width
    y_max = y + height
    return [x, y, x_max, y_max]

def xywh_to_coco(box):
    """
    Convert XYWH (center-based) to COCO format.
    
    Args:
        box: [x_center, y_center, width, height]
    
    Returns:
        [x_top_left, y_top_left, width, height]
    """
    x_center, y_center, width, height = box
    x_top_left = x_center - width / 2
    y_top_left = y_center - height / 2
    return [x_top_left, y_top_left, width, height]

def coco_to_xywh(box):
    """
    Convert COCO format to XYWH (center-based).
    
    Args:
        box: [x_top_left, y_top_left, width, height]
    
    Returns:
        [x_center, y_center, width, height]
    """
    x, y, width, height = box
    x_center = x + width / 2
    y_center = y + height / 2
    return [x_center, y_center, width, height]

print("All conversion functions defined successfully!")
print("\nAvailable conversions:")
print("  - xyxy_to_xywh()")
print("  - xywh_to_xyxy()")
print("  - xyxy_to_coco()")
print("  - coco_to_xyxy()")
print("  - xywh_to_coco()")
print("  - coco_to_xywh()")
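
Six pairwise functions are manageable for three formats, but the count grows quickly as formats are added. One common alternative, sketched below under that assumption, is to route every conversion through a single hub format (XYXY here). The names `to_xyxy`, `from_xyxy`, and `convert_box` are hypothetical and independent of the functions defined above.

```python
def to_xyxy(box, fmt):
    """Convert a box from 'xyxy', 'xywh' (center-based), or 'coco' to XYXY."""
    a, b, c, d = box
    if fmt == "xyxy":
        return [a, b, c, d]
    if fmt == "xywh":   # center-based, as in this notebook
        return [a - c / 2, b - d / 2, a + c / 2, b + d / 2]
    if fmt == "coco":   # top-left corner plus size
        return [a, b, a + c, b + d]
    raise ValueError(f"Unknown box format: {fmt!r}")

def from_xyxy(box, fmt):
    """Convert an XYXY box to 'xyxy', 'xywh' (center-based), or 'coco'."""
    x_min, y_min, x_max, y_max = box
    w, h = x_max - x_min, y_max - y_min
    if fmt == "xyxy":
        return [x_min, y_min, x_max, y_max]
    if fmt == "xywh":
        return [x_min + w / 2, y_min + h / 2, w, h]
    if fmt == "coco":
        return [x_min, y_min, w, h]
    raise ValueError(f"Unknown box format: {fmt!r}")

def convert_box(box, src, dst):
    """Convert between any two supported formats via XYXY."""
    return from_xyxy(to_xyxy(box, src), dst)

print(convert_box([125, 105, 150, 150], "xywh", "coco"))  # [50.0, 30.0, 150.0, 150.0]
print(convert_box([50, 30, 150, 150], "coco", "xyxy"))    # [50, 30, 200, 180]
```

With this layout, supporting a new format needs two branches rather than a new function for every existing format.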

## Test Conversions

Let's verify that our conversions work correctly:

In [None]:
# Original box in XYXY format
original_xyxy = [50, 30, 200, 180]
print("Original XYXY:", original_xyxy)
print("=" * 60)

# Convert to XYWH and back
converted_xywh = xyxy_to_xywh(original_xyxy)
print(f"\nConverted to XYWH: {converted_xywh}")
back_to_xyxy = xywh_to_xyxy(converted_xywh)
print(f"Back to XYXY: {back_to_xyxy}")
print(f"Match: {np.allclose(original_xyxy, back_to_xyxy)}")

# Convert to COCO and back
converted_coco = xyxy_to_coco(original_xyxy)
print(f"\nConverted to COCO: {converted_coco}")
back_to_xyxy2 = coco_to_xyxy(converted_coco)
print(f"Back to XYXY: {back_to_xyxy2}")
print(f"Match: {np.allclose(original_xyxy, back_to_xyxy2)}")

# Convert XYWH to COCO and back
converted_coco2 = xywh_to_coco(converted_xywh)
print(f"\nXYWH to COCO: {converted_coco2}")
print(f"Match with previous COCO: {np.allclose(converted_coco, converted_coco2)}")
back_to_xywh = coco_to_xywh(converted_coco2)
print(f"Back to XYWH: {back_to_xywh}")
print(f"Match: {np.allclose(converted_xywh, back_to_xywh)}")

print("\n" + "=" * 60)
print("All conversions verified successfully!")

# Summary table
print("\nFormat Comparison Table:")
print("─" * 60)
print(f"{'Format':<15} {'Coordinates':<45}")
print("─" * 60)
print(f"{'XYXY':<15} {str([f'{x:.1f}' for x in original_xyxy]):<45}")
print(f"{'XYWH (center)':<15} {str([f'{x:.1f}' for x in converted_xywh]):<45}")
print(f"{'COCO':<15} {str([f'{x:.1f}' for x in converted_coco]):<45}")
print("─" * 60)
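
The check above uses a single hand-picked box. A randomized round trip is a cheap way to gain more confidence, and the sketch below runs one for the center-based conversion. It redefines two small converters so that it is self-contained; `xyxy_to_center` and `center_to_xyxy` are illustrative names, and the comparison uses a tolerance because the boxes are floats.

```python
import math
import random

def xyxy_to_center(box):
    x_min, y_min, x_max, y_max = box
    w, h = x_max - x_min, y_max - y_min
    return [x_min + w / 2, y_min + h / 2, w, h]

def center_to_xyxy(box):
    x_c, y_c, w, h = box
    return [x_c - w / 2, y_c - h / 2, x_c + w / 2, y_c + h / 2]

rng = random.Random(0)   # fixed seed so the run is repeatable
failures = 0
for _ in range(1000):
    x_min = rng.uniform(0, 300)
    y_min = rng.uniform(0, 200)
    box = [x_min, y_min, x_min + rng.uniform(1, 100), y_min + rng.uniform(1, 100)]
    restored = center_to_xyxy(xyxy_to_center(box))
    # Compare with a tolerance: float arithmetic may not round-trip bit for bit
    if not all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(box, restored)):
        failures += 1

print(f"Round-trip failures: {failures} of 1000")
```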

## Normalized Coordinates

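Many deep learning models use normalized coordinates in the range [0, 1]: x values are divided by the image width and y values by the image height. A normalized box therefore describes the same region of the image at any resolution.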

In [None]:
def normalize_box(box, img_width, img_height, format='xyxy'):
    """
    Normalize box coordinates to [0, 1] range.
    
    Args:
        box: Box coordinates
        img_width: Image width in pixels
        img_height: Image height in pixels
        format: 'xyxy', 'xywh', or 'coco'
    
    Returns:
        Normalized box coordinates
    """
    if format == 'xyxy':
        x_min, y_min, x_max, y_max = box
        return [
            x_min / img_width,
            y_min / img_height,
            x_max / img_width,
            y_max / img_height
        ]
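    elif format in ['xywh', 'coco']:
        x, y, w, h = box
        return [
            x / img_width,
            y / img_height,
            w / img_width,
            h / img_height
        ]
    else:
        # Fail loudly instead of silently returning None
        raise ValueError(f"Unknown box format: {format!r}")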

def denormalize_box(box, img_width, img_height, format='xyxy'):
    """
    Convert normalized coordinates back to pixel coordinates.
    
    Args:
        box: Normalized box coordinates [0, 1]
        img_width: Image width in pixels
        img_height: Image height in pixels
        format: 'xyxy', 'xywh', or 'coco'
    
    Returns:
        Pixel coordinates
    """
    if format == 'xyxy':
        x_min, y_min, x_max, y_max = box
        return [
            x_min * img_width,
            y_min * img_height,
            x_max * img_width,
            y_max * img_height
        ]
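    elif format in ['xywh', 'coco']:
        x, y, w, h = box
        return [
            x * img_width,
            y * img_height,
            w * img_width,
            h * img_height
        ]
    else:
        # Fail loudly instead of silently returning None
        raise ValueError(f"Unknown box format: {format!r}")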

# Example: Normalize and denormalize
box_pixel = [50, 30, 200, 180]
img_w, img_h = 400, 300

print("Image size:", img_w, "×", img_h)
print("Original box (pixels):", box_pixel)

# Normalize
box_norm = normalize_box(box_pixel, img_w, img_h, format='xyxy')
print(f"\nNormalized box: {[f'{x:.4f}' for x in box_norm]}")

# Denormalize
box_denorm = denormalize_box(box_norm, img_w, img_h, format='xyxy')
print(f"Denormalized box: {[f'{x:.1f}' for x in box_denorm]}")
print(f"Match: {np.allclose(box_pixel, box_denorm)}")

print("\n" + "=" * 60)
print("Why Normalized Coordinates?")
print("=" * 60)
print("1. Resolution Independence: A box stays valid when the image is resized")
print("2. Training Stability: Regression targets share a small, bounded range")
print("3. Transfer Learning: Some pre-trained detectors expect normalized coords")
print("4. Data Augmentation: Uniform resizing leaves normalized boxes unchanged")
print("\nExample: YOLO label files store normalized, center-based boxes.")
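
As a concrete case of normalized coordinates, a YOLO label file stores one object per line as `class x_center y_center width height`, with all four box values normalized to [0, 1]. The sketch below parses one such line into a pixel XYXY box for two image sizes. `yolo_line_to_xyxy` is a hypothetical helper, and the label line is made up for illustration.

```python
def yolo_line_to_xyxy(line, img_width, img_height):
    """Parse 'class x_center y_center width height' (normalized) into pixel XYXY."""
    parts = line.split()
    class_id = int(parts[0])
    x_c, y_c, w, h = (float(v) for v in parts[1:5])
    x_min = (x_c - w / 2) * img_width
    y_min = (y_c - h / 2) * img_height
    x_max = (x_c + w / 2) * img_width
    y_max = (y_c + h / 2) * img_height
    return class_id, [x_min, y_min, x_max, y_max]

label_line = "0 0.3125 0.35 0.375 0.5"   # illustrative label, not from a real dataset

# The same normalized label maps to different pixel boxes per resolution
for size_w, size_h in [(400, 300), (800, 600)]:
    class_id, box = yolo_line_to_xyxy(label_line, size_w, size_h)
    print(f"{size_w}x{size_h}: class {class_id}, box {[round(v, 2) for v in box]}")
```

On the 400 × 300 sample image this label corresponds to the pixel box [50, 30, 200, 180], up to floating-point rounding; on an 800 × 600 version of the same image every coordinate doubles.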

## Multiple Boxes with Labels

Real object detection involves detecting multiple objects in a single image:

In [None]:
def draw_multiple_boxes(image, boxes, labels, colors=None, thickness=2):
    """
    Draw multiple bounding boxes with labels.
    
    Args:
        image: Image array (BGR)
        boxes: List of boxes in XYXY format
        labels: List of label strings
        colors: List of colors (BGR). If None, uses random colors
        thickness: Line thickness
    
    Returns:
        Image with drawn boxes
    """
    img_copy = image.copy()
    
    # Generate random colors if not provided
    # (a local generator avoids resetting NumPy's global random seed)
    if colors is None:
        rng = np.random.default_rng(42)
        colors = [tuple(rng.integers(0, 256, 3).tolist()) for _ in boxes]
    
    for box, label, color in zip(boxes, labels, colors):
        x_min, y_min, x_max, y_max = [int(c) for c in box]
        
        # Draw rectangle
        cv2.rectangle(img_copy, (x_min, y_min), (x_max, y_max), color, thickness)
        
        # Add label with background
        (text_width, text_height), baseline = cv2.getTextSize(
            label, cv2.FONT_HERSHEY_SIMPLEX, 0.7, 2
        )
        cv2.rectangle(img_copy, 
                     (x_min, y_min - text_height - baseline - 5), 
                     (x_min + text_width + 5, y_min), 
                     color, -1)
        cv2.putText(img_copy, label, (x_min + 2, y_min - 5), 
                   cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
    
    return img_copy

# Define multiple objects
boxes = [
    [40, 20, 160, 140],    # Red rectangle
    [200, 30, 300, 130],   # Blue circle
    [40, 190, 160, 260],   # Green triangle
    [270, 170, 380, 270],  # Yellow rectangle
    [150, 100, 250, 180]   # Additional object
]

labels = [
    "Rectangle",
    "Circle",
    "Triangle",
    "Square",
    "Unknown"
]

# Custom colors (BGR format)
colors = [
    (0, 255, 0),    # Green
    (255, 0, 0),    # Blue
    (0, 165, 255),  # Orange
    (203, 192, 255), # Pink
    (0, 255, 255)   # Yellow
]

# Draw all boxes
result = draw_multiple_boxes(image, boxes, labels, colors, thickness=3)

# Display
plt.figure(figsize=(12, 8))
plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
plt.title("Multiple Object Detection with Labels", fontsize=14, fontweight='bold')
plt.axis('off')
plt.tight_layout()
plt.show()

print(f"Total objects detected: {len(boxes)}")
print("\nDetections:")
for i, (box, label) in enumerate(zip(boxes, labels), 1):
    print(f"{i}. {label:<12} at {box}")
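
The `colors` list above was chosen by hand. When the number of classes is larger, one option is to spread hues evenly around the color wheel so that each class gets a visibly different color. The sketch below does this with the standard-library `colorsys` module; `class_colors_bgr` is a hypothetical helper, and the saturation and brightness values of 0.9 are arbitrary choices.

```python
import colorsys

def class_colors_bgr(class_names):
    """Map each class name to an evenly spaced hue, returned as a BGR tuple."""
    n = len(class_names)
    palette = {}
    for i, name in enumerate(class_names):
        r, g, b = colorsys.hsv_to_rgb(i / n, 0.9, 0.9)
        # OpenCV drawing functions expect (B, G, R) with values in 0..255
        palette[name] = (int(b * 255), int(g * 255), int(r * 255))
    return palette

palette = class_colors_bgr(["Rectangle", "Circle", "Triangle", "Square", "Unknown"])
for name, color in palette.items():
    print(f"{name:<12} {color}")
```

A list such as `[palette[label] for label in labels]` can then be passed as the `colors` argument of `draw_multiple_boxes`.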

## Confidence Scores

Object detectors output confidence scores (0-1) indicating how certain they are about each detection:

In [None]:
def draw_boxes_with_confidence(image, boxes, labels, confidences, 
                               colors=None, thickness=2, conf_threshold=0.0):
    """
    Draw bounding boxes with confidence scores.
    
    Args:
        image: Image array (BGR)
        boxes: List of boxes in XYXY format
        labels: List of class labels
        confidences: List of confidence scores (0-1)
        colors: List of colors (BGR)
        thickness: Line thickness
        conf_threshold: Only draw boxes above this confidence
    
    Returns:
        Image with drawn boxes and count of drawn boxes
    """
    img_copy = image.copy()
    
    if colors is None:
        # Local generator, so NumPy's global random seed is left untouched
        rng = np.random.default_rng(42)
        colors = [tuple(rng.integers(0, 256, 3).tolist()) for _ in boxes]
    
    drawn_count = 0
    for box, label, conf, color in zip(boxes, labels, confidences, colors):
        # Skip low-confidence detections
        if conf < conf_threshold:
            continue
        
        drawn_count += 1
        x_min, y_min, x_max, y_max = [int(c) for c in box]
        
        # Draw rectangle
        cv2.rectangle(img_copy, (x_min, y_min), (x_max, y_max), color, thickness)
        
        # Create label with confidence
        text = f"{label} {conf*100:.1f}%"
        
        # Add label with background
        (text_width, text_height), baseline = cv2.getTextSize(
            text, cv2.FONT_HERSHEY_SIMPLEX, 0.6, 2
        )
        cv2.rectangle(img_copy, 
                     (x_min, y_min - text_height - baseline - 5), 
                     (x_min + text_width + 5, y_min), 
                     color, -1)
        cv2.putText(img_copy, text, (x_min + 2, y_min - 5), 
                   cv2.FONT_HERSHEY_SIMPLEX, 0.6, (255, 255, 255), 2)
    
    return img_copy, drawn_count

# Simulate detection results with varying confidences
boxes = [
    [40, 20, 160, 140],
    [200, 30, 300, 130],
    [40, 190, 160, 260],
    [270, 170, 380, 270],
    [150, 100, 250, 180],
    [10, 120, 80, 200]     # Low confidence detection
]

labels = ["Rectangle", "Circle", "Triangle", "Square", "Unknown", "Noise"]
confidences = [0.95, 0.89, 0.92, 0.78, 0.45, 0.23]  # Varying confidence scores

# Create figure with subplots
fig, axes = plt.subplots(1, 3, figsize=(18, 6))

# Different confidence thresholds
thresholds = [0.0, 0.5, 0.8]

for ax, threshold in zip(axes, thresholds):
    result, count = draw_boxes_with_confidence(
        image, boxes, labels, confidences, 
        colors=colors + [(128, 128, 128)],  # Gray for low confidence
        thickness=3,
        conf_threshold=threshold
    )
    
    ax.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
    ax.set_title(f"Confidence Threshold: {threshold:.1f}\n({count} detections)", 
                fontweight='bold')
    ax.axis('off')

plt.suptitle("Effect of Confidence Threshold on Detections", 
            fontsize=16, fontweight='bold', y=1.02)
plt.tight_layout()
plt.show()

print("Detection Summary:")
print("=" * 60)
for label, conf in zip(labels, confidences):
    status = "✓" if conf >= 0.5 else "✗"
    print(f"{status} {label:<12} Confidence: {conf*100:5.1f}%")

print("\nConfidence Threshold Rules of Thumb (tune on validation data):")
print("  - 0.5: Common starting point for many applications")
print("  - 0.3-0.4: When recall is more important (medical, security)")
print("  - 0.7-0.9: When precision is critical (autonomous vehicles)")
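
The filtering step can also be separated from drawing, which makes the effect of the threshold easy to inspect without rendering an image. The sketch below repeats the labels and scores from the cell above and counts how many detections survive each threshold. `filter_by_confidence` is a hypothetical helper that keeps detections with `conf >= threshold`, matching the skip rule in `draw_boxes_with_confidence`.

```python
labels = ["Rectangle", "Circle", "Triangle", "Square", "Unknown", "Noise"]
confidences = [0.95, 0.89, 0.92, 0.78, 0.45, 0.23]
detections = list(zip(labels, confidences))

def filter_by_confidence(detections, threshold):
    """Keep (label, confidence) pairs at or above the threshold, best first."""
    kept = [d for d in detections if d[1] >= threshold]
    return sorted(kept, key=lambda d: d[1], reverse=True)

for threshold in (0.0, 0.5, 0.8):
    kept = filter_by_confidence(detections, threshold)
    names = ", ".join(label for label, _ in kept)
    print(f"threshold {threshold:.1f}: {len(kept)} kept ({names})")
```

Raising the threshold from 0.0 to 0.5 to 0.8 reduces the count from 6 to 4 to 3, which is the precision and recall trade-off in miniature: fewer false alarms, but more risk of dropping a real object.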

## Exercise: Coordinate Format Practice

Try these exercises to test your understanding:

In [None]:
print("EXERCISE: Coordinate Format Conversions")
print("=" * 60)

# Exercise 1: Convert between formats
print("\n1. Given a box in XYXY format [100, 50, 300, 250]:")
print("   a) Convert to XYWH (center-based) format")
print("   b) Convert to COCO format")
print("   c) Normalize coordinates for a 640×480 image")

# Solution (uncomment to see)
# exercise_box = [100, 50, 300, 250]
# print("\n   Solutions:")
# print(f"   a) XYWH: {xyxy_to_xywh(exercise_box)}")
# print(f"   b) COCO: {xyxy_to_coco(exercise_box)}")
# print(f"   c) Normalized: {normalize_box(exercise_box, 640, 480, 'xyxy')}")

print("\n" + "=" * 60)
print("\n2. Draw bounding boxes for these detections:")
print("   - Person at [50, 100, 150, 300] with 0.92 confidence")
print("   - Car at [200, 80, 350, 200] with 0.87 confidence")
print("   - Dog at [180, 190, 280, 290] with 0.65 confidence")
print("\n   Filter out detections below 0.7 confidence threshold")

# Solution template (all boxes fit inside the 400×300 sample image)
# exercise_boxes = [[50, 100, 150, 300], [200, 80, 350, 200], [180, 190, 280, 290]]
# exercise_labels = ["Person", "Car", "Dog"]
# exercise_confs = [0.92, 0.87, 0.65]
# result, count = draw_boxes_with_confidence(image, exercise_boxes, exercise_labels, 
#                                            exercise_confs, conf_threshold=0.7)
# plt.figure(figsize=(10, 7))
# plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
# plt.title(f"Exercise Result ({count} detections above threshold)")
# plt.axis('off')
# plt.show()

print("\n" + "=" * 60)
print("\n3. Calculate the area of these boxes:")
print("   a) XYXY: [10, 20, 110, 120]")
print("   b) XYWH: [60, 70, 100, 100]")
print("   c) COCO: [10, 20, 100, 100]")

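# Hint: Area = width × height. XYWH and COCO store width and height directly;
# for XYXY, compute width = x_max - x_min and height = y_max - y_min first.

print("\n" + "=" * 60)
print("\nUncomment the solution code for exercises 1 and 2 above to see the answers!")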

## Summary

### Key Takeaways

1. **Three Main Coordinate Formats:**
   - **XYXY**: [x_min, y_min, x_max, y_max] - Corner points
   - **XYWH**: [x_center, y_center, width, height] - Center-based
   - **COCO**: [x_top_left, y_top_left, width, height] - Top-left + size

2. **Format Conversions:**
   - Essential for working with different datasets and models
   - Simple mathematical transformations
   - Verify each conversion with a round trip: convert, convert back, and compare

3. **Normalized Coordinates:**
   - Range: [0, 1] instead of pixel values
   - Benefits: Resolution independence, training stability
   - Used by: YOLO, many modern detectors

4. **Visualization Best Practices:**
   - Use distinct colors for different classes
   - Display confidence scores with labels
   - Apply confidence thresholds to filter predictions
   - Draw boxes with appropriate thickness for visibility

5. **Confidence Thresholds:**
   - Trade-off between precision and recall
   - Typical values: 0.5 (default), 0.3-0.4 (high recall), 0.7-0.9 (high precision)
   - Application-specific tuning required

### Next Steps
- **Notebook 04**: Learn evaluation metrics (IoU, Precision, Recall, mAP)
- **Notebook 05**: Understand classical sliding window detection
- **Week 14**: Modern detection architectures (YOLO, R-CNN)

### Common Dataset Formats
- **PASCAL VOC**: Uses XYXY format
- **MS COCO**: Uses COCO format (top-left + size)
- **YOLO**: Uses normalized XYWH format

### Useful Resources
- OpenCV Documentation: Drawing functions
- COCO Dataset API: Standard annotation format
- YOLO Documentation: Normalized coordinates

---

**Completion Time:** ~15 minutes  
**Estimated Mastery:** Practice with real datasets for 2-3 hours