**Implementing the Fundamental Functions**

Write a function to compute [IoU (Intersection over Union)](https://pyimagesearch.com/2016/11/07/intersection-over-union-iou-for-object-detection/) between two axis-aligned bounding boxes specified in the Ultralytics YOLO format. You MUST use the [shapely](https://pypi.org/project/shapely/) library and its functionality to write your function.

Show that your function gives the same or a similar answer as the IoU computed using the `supervision` library.

# **Procedure for Computing IoU (Intersection over Union) using Shapely**

## **1. Understanding YOLO Format**
The YOLO format represents each bounding box as one line of five values, with the box coordinates normalized by the image size:

`category x_center y_center width height`

where:
- **category**: The class label of the object.
- **x_center, y_center**: The normalized center coordinates of the bounding box (values between 0 and 1).
- **width, height**: The normalized width and height of the bounding box (values between 0 and 1).
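
To make the field layout concrete, here is a minimal sketch that parses one label line into named fields. The names `YoloBox` and `parse_yolo_line` are hypothetical helpers introduced for illustration, and the range check assumes all four box values are normalized to [0, 1] as described above.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    """One YOLO label: class id plus a normalized center/size box."""
    category: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_line(line: str) -> YoloBox:
    """Parse one 'category x_center y_center width height' label line."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    category = int(parts[0])
    x_center, y_center, width, height = map(float, parts[1:])
    # Normalized values must lie in [0, 1]
    for value in (x_center, y_center, width, height):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"normalized value out of range: {value}")
    return YoloBox(category, x_center, y_center, width, height)


box = parse_yolo_line("0 0.5 0.5 0.25 0.25")
print(box)
```

A named tuple still unpacks like the plain five-element lists used elsewhere in this notebook, so it can be passed to functions that expect `(class_id, cx, cy, w, h)`.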

## **2. Steps to Compute IoU**
### **Step 1: Convert YOLO Box to Pixel Coordinates**
Since the YOLO format provides normalized coordinates, we first convert them to absolute pixel values.

1. Convert **normalized width and height** to pixels:
   $$
   \text{width}_{px} = \text{width}_{norm} \times \text{image\_width}
   $$

   $$
   \text{height}_{px} = \text{height}_{norm} \times \text{image\_height}
   $$

2. Convert the **normalized center coordinates** to pixel coordinates:
   $$
   cx = x_{norm} \times \text{image\_width}
   $$

   $$
   cy = y_{norm} \times \text{image\_height}
   $$

3. Compute the **bounding box corners**:
   $$
   x_{\text{min}} = cx - \frac{\text{width}_{px}}{2}
   $$
   $$
   x_{\text{max}} = cx + \frac{\text{width}_{px}}{2}
   $$
   $$
   y_{\text{min}} = cy - \frac{\text{height}_{px}}{2}
   $$
   $$
   y_{\text{max}} = cy + \frac{\text{height}_{px}}{2}
   $$
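
As a worked example of the conversion above, take the box $(x_{norm}, y_{norm}, \text{width}_{norm}, \text{height}_{norm}) = (0.5, 0.5, 0.25, 0.25)$ on a $640 \times 480$ image (values chosen for illustration):

$$
\text{width}_{px} = 0.25 \times 640 = 160, \qquad \text{height}_{px} = 0.25 \times 480 = 120
$$

$$
cx = 0.5 \times 640 = 320, \qquad cy = 0.5 \times 480 = 240
$$

$$
x_{\text{min}} = 320 - \frac{160}{2} = 240, \quad x_{\text{max}} = 320 + \frac{160}{2} = 400, \quad y_{\text{min}} = 240 - \frac{120}{2} = 180, \quad y_{\text{max}} = 240 + \frac{120}{2} = 300
$$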

### **Step 2: Create Bounding Box Polygons**
Using the **Shapely** library, we create a polygon for each bounding box using the computed corner points:
   $$
   \text{Polygon}([(x_{\text{min}}, y_{\text{min}}), (x_{\text{max}}, y_{\text{min}}), (x_{\text{max}}, y_{\text{max}}), (x_{\text{min}}, y_{\text{max}})])
   $$
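
The polygon construction above can be sketched as follows. The corner values are illustrative, and `shapely.geometry.box` is shown only as an alternative shorthand for axis-aligned rectangles; whether to use it or the explicit corner list is a matter of taste.

```python
from shapely.geometry import Polygon, box

# Illustrative pixel corners (a 160x120 box)
x_min, y_min, x_max, y_max = 240.0, 180.0, 400.0, 300.0

# Explicit corner list, in the same order as the formula above
poly_from_corners = Polygon(
    [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
)

# Shapely's rectangle helper builds the same geometry from (minx, miny, maxx, maxy)
poly_from_helper = box(x_min, y_min, x_max, y_max)

print(poly_from_corners.area)                      # 19200.0
print(poly_from_corners.bounds)                    # (240.0, 180.0, 400.0, 300.0)
print(poly_from_corners.equals(poly_from_helper))  # True
```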

### **Step 3: Compute Intersection and Union**
1. Compute the **intersection area** of the two bounding-box polygons:
   $$
   \text{intersection area} = \text{area}(\text{polygon}_1 \cap \text{polygon}_2)
   $$

2. Compute the **union area**:
   $$
   \text{union area} = \text{area}(\text{polygon}_1 \cup \text{polygon}_2)
   $$

### **Step 4: Compute IoU**
Finally, IoU is computed using:
   $$
   IoU = \frac{\text{intersection area}}{\text{union area}}
   $$

If the **union area is zero**, IoU is defined as 0.
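
Steps 3 and 4 can be sketched on three easy cases: partial overlap, no overlap, and identical boxes. The helper name `polygon_iou` and the toy rectangles are assumptions made for illustration. For two 2x2 squares sharing a 1x1 corner, the union area is 4 + 4 - 1 = 7, so the IoU should be 1/7.

```python
from shapely.geometry import box


def polygon_iou(poly_a, poly_b):
    """IoU of two Shapely polygons; returns 0.0 when the union area is zero."""
    intersection_area = poly_a.intersection(poly_b).area
    union_area = poly_a.union(poly_b).area
    return intersection_area / union_area if union_area > 0 else 0.0


a = box(0, 0, 2, 2)
overlapping = box(1, 1, 3, 3)  # shares a 1x1 square with a
disjoint = box(5, 5, 7, 7)     # no overlap with a

iou_overlap = polygon_iou(a, overlapping)
iou_disjoint = polygon_iou(a, disjoint)
iou_same = polygon_iou(a, a)

print(f"{iou_overlap:.4f}")   # 0.1429
print(f"{iou_disjoint:.4f}")  # 0.0000
print(f"{iou_same:.4f}")      # 1.0000
```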




In [14]:

import numpy as np
from shapely.geometry import Polygon

def yolo_to_polygon(yolo_box, img_width, img_height):
    """
    Convert a YOLO box (class_id, cx, cy, w, h; normalized) to a Shapely
    Polygon in pixel coordinates.
    """
    _, cx_norm, cy_norm, w_norm, h_norm = yolo_box

    # Scale the normalized center and size to pixels
    cx = cx_norm * img_width
    cy = cy_norm * img_height
    w = w_norm * img_width
    h = h_norm * img_height

    # Each edge lies half a width/height away from the center
    x_min = cx - w / 2
    x_max = cx + w / 2
    y_min = cy - h / 2
    y_max = cy + h / 2

    return Polygon([(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)])

def calculate_iou(box1, box2, img_width, img_height):
    """Compute the IoU of two YOLO-format boxes using Shapely polygons."""
    poly1 = yolo_to_polygon(box1, img_width, img_height)
    poly2 = yolo_to_polygon(box2, img_width, img_height)

    intersection = poly1.intersection(poly2).area
    union = poly1.union(poly2).area

    # A zero union area means both boxes are degenerate; define IoU as 0
    return intersection / union if union != 0 else 0.0

In [15]:
import supervision as sv
import numpy as np

# YOLO boxes (class_id, cx, cy, w, h)
box1 = [0, 0.3125, 0.5208, 0.3125, 0.4167]
box2 = [0, 0.5, 0.5, 0.25, 0.25]

# Image dimensions
img_width, img_height = 640, 480


def yolo_to_xyxy(yolo_box, img_width, img_height):
    """Convert a normalized YOLO box to pixel [x_min, y_min, x_max, y_max]."""
    cx = yolo_box[1] * img_width
    cy = yolo_box[2] * img_height
    w = yolo_box[3] * img_width
    h = yolo_box[4] * img_height
    x_min = cx - w / 2
    y_min = cy - h / 2
    x_max = cx + w / 2
    y_max = cy + h / 2
    return [x_min, y_min, x_max, y_max]

# Convert boxes to xyxy
xyxy_box1 = yolo_to_xyxy(box1, img_width, img_height)
xyxy_box2 = yolo_to_xyxy(box2, img_width, img_height)

# Calculate IoU with supervision
iou_supervision = sv.box_iou_batch(
    np.array([xyxy_box1]),
    np.array([xyxy_box2])
)

# Calculate IoU with Shapely
iou_shapely = calculate_iou(box1, box2, img_width, img_height)

print(f"Shapely IoU: {iou_shapely:.4f}")       
print(f"Supervision IOU: {iou_supervision[0][0]:.4f}")

Shapely IoU: 0.1385
Supervision IOU: 0.1385
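
As an independent sanity check on the value 0.1385, the same two boxes can be run through a plain-Python IoU that uses neither Shapely nor `supervision`. This is a minimal sketch; the helper names `to_xyxy` and `iou_xyxy` are hypothetical. By hand, the boxes overlap in a 60x120 region (area 7200) and the union is 40003.2 + 19200 - 7200 = 52003.2, giving 7200 / 52003.2 ≈ 0.1385.

```python
def to_xyxy(yolo_box, img_w, img_h):
    """Normalized (class_id, cx, cy, w, h) to pixel (x_min, y_min, x_max, y_max)."""
    _, cx, cy, w, h = yolo_box
    return (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )


def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


box1 = [0, 0.3125, 0.5208, 0.3125, 0.4167]
box2 = [0, 0.5, 0.5, 0.25, 0.25]
iou = iou_xyxy(to_xyxy(box1, 640, 480), to_xyxy(box2, 640, 480))
print(f"Plain-Python IoU: {iou:.4f}")  # Plain-Python IoU: 0.1385
```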


Write a function to compute Average Precision (AP).

**Q2.a** Use the Pascal VOC 11-point interpolation method to implement the function.


In [16]:
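# VOC 11-point AP
def voc_ap(recall, precision):
    """Pascal VOC 11-point interpolated AP from NumPy arrays of recall and precision."""
    r_levels = np.arange(0, 1.1, 0.1)  # recall levels 0.0, 0.1, ..., 1.0
    ap = 0.0
    for r in r_levels:
        # Interpolated precision: best precision at any recall >= r (0 if none)
        prec_above_r = precision[recall >= r]
        p = np.max(prec_above_r) if prec_above_r.size > 0 else 0.0
        ap += p / 11.0
    return ap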


**Q2.b** Use the COCO 101-point interpolation method to implement the function.



In [17]:
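def coco_ap(recall, precision):
    """COCO 101-point interpolated AP from NumPy arrays of recall and precision."""
    r_levels = np.linspace(0, 1, 101)  # recall levels 0.00, 0.01, ..., 1.00
    ap = 0.0
    for r in r_levels:
        # Interpolated precision: best precision at any recall >= r (0 if none)
        prec_above_r = precision[recall >= r]
        p = np.max(prec_above_r) if prec_above_r.size > 0 else 0.0
        ap += p / 101.0
    return ap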
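
The 11-point and 101-point methods differ only in the number of recall levels, which the following plain-Python sketch makes explicit. The function name `interpolated_ap` and the toy curve are assumptions for illustration, not part of the assignment. The sketch also builds each level as `i / (n_points - 1)`: multiplying a step by an index can land slightly above the intended level (for example `6 * 0.1` is `0.6000000000000001`), which would exclude a recall value of exactly 0.6 from that level.

```python
def interpolated_ap(recall, precision, n_points):
    """N-point interpolated AP over plain Python sequences."""
    # i / (n_points - 1) is correctly rounded, so level 6 of 11 is exactly 0.6
    levels = [i / (n_points - 1) for i in range(n_points)]
    total = 0.0
    for level in levels:
        # Best precision among points whose recall reaches this level (0 if none)
        total += max(
            (p for r, p in zip(recall, precision) if r >= level), default=0.0
        )
    return total / n_points


# Toy precision-recall points (illustrative values only)
recall = [0.2, 0.4, 0.6]
precision = [1.0, 0.5, 0.6]

print(6 * 0.1 > 0.6)  # True: a multiplied level overshoots 0.6

ap11 = interpolated_ap(recall, precision, 11)
ap101 = interpolated_ap(recall, precision, 101)
print(f"{ap11:.4f}")   # 0.4909
print(f"{ap101:.4f}")  # 0.4455
```

For the toy curve, levels 0.0 to 0.2 see precision 1.0 and levels up to 0.6 see 0.6, so the 11-point sum is 3 + 4 × 0.6 = 5.4 and the 101-point sum is 21 + 40 × 0.6 = 45.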

**Q2.c** Use the Area under the Precision-Recall Curve (AUC) method to implement the function.

Randomly generate 10 images of size 100x100. In each image, randomly generate 10 ground-truth boxes and 10 predicted boxes, each of size 20x20. Assume there is only one class of objects. Compare the AP50 (Average Precision at an IoU threshold of 0.5) computed by your three methods.

In [18]:
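def auc_ap(recall, precision):
    """AP as the area under the precision-recall curve (trapezoidal rule)."""
    # Integrate over recall in ascending order
    sorted_indices = np.argsort(recall)
    recall_sorted = recall[sorted_indices]
    precision_sorted = precision[sorted_indices]
    # np.trapz was renamed np.trapezoid in NumPy 2.0; use whichever is available
    trapezoid = getattr(np, "trapezoid", None) or np.trapz
    return trapezoid(precision_sorted, recall_sorted)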
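
One property of the trapezoidal approach is worth seeing on a toy curve: it only integrates between the first and last recall values it is given, so any area to the left of the first point is not counted. The sketch below uses a hypothetical pure-Python helper `trapezoid_area`. Anchoring the curve at recall 0 with the first precision value is one possible convention shown for comparison, not what `auc_ap` above does.

```python
def trapezoid_area(xs, ys):
    """Trapezoidal-rule area under the points (xs, ys); xs must be ascending."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


# Toy precision-recall points (illustrative values only)
recall = [0.5, 1.0]
precision = [1.0, 0.5]

# Area between recall 0.5 and 1.0 only
area_raw = trapezoid_area(recall, precision)

# Same curve with an extra anchor point at recall 0 (an assumed convention)
area_anchored = trapezoid_area([0.0] + recall, [precision[0]] + precision)

print(area_raw)       # 0.375
print(area_anchored)  # 0.875
```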

In [19]:
import numpy as np

def compute_iou(box1, box2):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    x1 = max(box1[0], box2[0])
    y1 = max(box1[1], box2[1])
    x2 = min(box1[2], box2[2])
    y2 = min(box1[3], box2[3])
    if x2 < x1 or y2 < y1:
        return 0.0
    intersection = (x2 - x1) * (y2 - y1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - intersection
    return intersection / union if union != 0 else 0.0

# Generate 10 images with ground truth and predicted boxes
np.random.seed(42)  # For reproducibility
images = []
for _ in range(10):
    # Top-left corners in [0, 80] keep every 20x20 box inside the 100x100 image
    gt_boxes = np.random.randint(0, 81, size=(10, 2))
    pred_boxes = np.random.randint(0, 81, size=(10, 2))
    confidences = np.random.rand(10)
    preds = [(x, y, conf) for (x, y), conf in zip(pred_boxes, confidences)]
    images.append({'gt': gt_boxes, 'preds': preds})

# Collect all predictions with image index
all_preds = []
for img_idx, img in enumerate(images):
    for pred in img['preds']:
        x, y, conf = pred
        all_preds.append((conf, x, y, img_idx))

# Sort predictions by confidence descending
sorted_preds = sorted(all_preds, key=lambda x: -x[0])

# Process each prediction to determine TP/FP
tp = []
fp = []
n_gt = sum(len(img['gt']) for img in images)
# Ground-truth boxes already matched in each image (each may be matched once)
matched_gt = {i: set() for i in range(len(images))}

for pred in sorted_preds:
    conf, x, y, img_idx = pred
    img_gts = images[img_idx]['gt']
    matched = matched_gt[img_idx]
    
    pred_box = (x, y, x + 20, y + 20)
    max_iou = 0.0
    best_gt_idx = -1
    for gt_idx, (gt_x, gt_y) in enumerate(img_gts):
        if gt_idx in matched:
            continue
        gt_box = (gt_x, gt_y, gt_x + 20, gt_y + 20)
        iou = compute_iou(pred_box, gt_box)
        if iou > max_iou:
            max_iou = iou
            best_gt_idx = gt_idx
    if max_iou >= 0.5:
        matched.add(best_gt_idx)
        tp.append(1)
        fp.append(0)
    else:
        tp.append(0)
        fp.append(1)

# Calculate cumulative TP and FP
tp_cumulative = np.cumsum(tp)
fp_cumulative = np.cumsum(fp)

# Compute precision and recall
precision = tp_cumulative / (tp_cumulative + fp_cumulative)
recall = tp_cumulative / n_gt

# Compute AP50 with each of the three methods
ap_voc = voc_ap(recall, precision)
ap_coco = coco_ap(recall, precision)
ap_auc = auc_ap(recall, precision)

print(f"VOC 11-point AP50: {ap_voc:.4f}")
print(f"COCO 101-point AP50: {ap_coco:.4f}")
print(f"AUC AP50: {ap_auc:.4f}")

VOC 11-point AP50: 0.1098
COCO 101-point AP50: 0.0582
AUC AP50: 0.0359
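
The precision and recall arrays fed to the three AP functions come from cumulative true-positive counts over the confidence-sorted predictions. A minimal plain-Python sketch of that step on a toy ranking follows. The helper name `precision_recall` and the four TP/FP flags are illustrative assumptions.

```python
from itertools import accumulate


def precision_recall(tp_flags, n_gt):
    """Cumulative precision and recall for predictions sorted by confidence.

    tp_flags[i] is 1 if the i-th ranked prediction is a true positive, else 0.
    """
    tp_cum = list(accumulate(tp_flags))
    # After `rank` predictions, precision = TPs so far / predictions so far
    precision = [tp / rank for rank, tp in enumerate(tp_cum, start=1)]
    # Recall = TPs so far / total number of ground-truth boxes
    recall = [tp / n_gt for tp in tp_cum]
    return precision, recall


precision, recall = precision_recall([1, 0, 1, 0], n_gt=4)
print([round(p, 3) for p in precision])  # [1.0, 0.5, 0.667, 0.5]
print(recall)                            # [0.25, 0.25, 0.5, 0.5]
```

Each false positive lowers precision while leaving recall unchanged, which is why the raw curve zigzags and why the interpolated methods take the maximum precision at or beyond each recall level.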
