### Mean Average Precision (mAP)

mAP is the most widely used metric for evaluating object-detection models.

mAP is a comprehensive metric that combines both precision and recall across all object classes in a detection system. Here's a breakdown of how it works:

1. **Precision and Recall**
   - **Precision**: The fraction of detections that are correct, TP / (TP + FP) (measured against the predictions).
   - **Recall**: The fraction of ground-truth objects that are detected, TP / (TP + FN) (measured against the ground truth).

2. **Average Precision (AP)**
   - For each class, AP calculates the area under the precision-recall curve
   - It's computed by varying the confidence threshold and plotting precision vs recall
   - A higher AP means the model is better at detecting that specific class

3. **Mean Average Precision (mAP)**
   - mAP is simply the mean of AP values across all classes
   - Formula: mAP = (AP1 + AP2 + ... + APn) / n
   - Where n is the number of classes
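
As a quick illustration of the definitions above, the sketch below computes precision and recall from raw counts and then averages per-class AP values into mAP. The function names (`precision_recall`, `mean_ap`) and all numbers are made up for illustration; they are not part of any library.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true-positive, false-positive
    and false-negative counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


def mean_ap(ap_per_class):
    """Average the per-class AP values: mAP = (AP1 + ... + APn) / n."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical counts: 8 correct detections, 2 spurious ones, 4 missed objects.
p, r = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.80, recall=0.67

# Hypothetical per-class AP values.
print(f"mAP={mean_ap({'cat': 0.5, 'dog': 0.75, 'car': 1.0}):.2f}")  # mAP=0.75
```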

### How it's Calculated

1. **For each class**:
   - Sort all detections by confidence score in descending order (as in the non-max suppression code)

2. **For each detection**:
   - Calculate IoU with ground truth boxes
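   - A detection counts as a true positive if its best IoU exceeds the threshold (typically 0.5) and that ground-truth box has not already been matched; otherwise it is a false positive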
   - Update precision and recall values

3. **Compute AP**:
   - Plot precision vs recall curve
   - Calculate area under the curve
   - This gives AP for one class

4. **Calculate mAP**:
   - Take the mean of all class APs
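
The steps above can be sketched in plain Python for a single class. This is a simplified illustration, not the notebook's implementation: it assumes corner-format boxes `(x1, y1, x2, y2)`, a single image, and hypothetical helper names (`iou_corners`, `average_precision`). It uses the same conventions as the steps: the curve starts at recall 0 and precision 1, and the area is taken with the trapezoidal rule.

```python
def iou_corners(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class in one image (illustrative assumption).

    detections: list of (confidence, box); ground_truths: list of boxes.
    """
    if not ground_truths:
        return 0.0
    # Step 1: sort detections by confidence, highest first.
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = [False] * len(ground_truths)
    tp = fp = 0
    recalls, precisions = [0.0], [1.0]  # curve starts at recall 0, precision 1
    for _, box in detections:
        # Step 2: find the ground-truth box with the highest IoU.
        ious = [iou_corners(box, gt) for gt in ground_truths]
        best = max(range(len(ious)), key=ious.__getitem__)
        if ious[best] > iou_threshold and not matched[best]:
            matched[best] = True  # each ground-truth box is matched once
            tp += 1
        else:
            fp += 1  # low IoU or duplicate detection
        recalls.append(tp / len(ground_truths))
        precisions.append(tp / (tp + fp))
    # Step 3: area under the precision-recall curve (trapezoidal rule).
    return sum(
        (recalls[i] - recalls[i - 1]) * (precisions[i] + precisions[i - 1]) / 2
        for i in range(1, len(recalls))
    )


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
print(f"AP={average_precision(dets, gts):.3f}")  # AP=0.792
```

Here the second detection is a false positive, so precision drops to 0.5 before the third detection raises recall to 1.0. Step 4 is then just the mean of such AP values over all classes.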

### Why mAP is Important

1. **Comprehensive Evaluation**: Considers both precision and recall
2. **Class Balance**: Equally weights performance across all classes
3. **Industry Standard**: Widely used in competitions (COCO, Pascal VOC) and research
4. **Confidence Threshold Independent**: Evaluates model performance across all confidence thresholds (the IoU threshold is still fixed)

The metric is particularly useful because it:
- Penalizes both false positives and false negatives
- Accounts for confidence scores of predictions
- Handles multi-class detection scenarios
- Provides a single number to compare different models


In [None]:
import torch
from collections import Counter

def intersection_over_union(predictions, labels, box_format="midpoint"):
  '''
  Calculates intersection over union

  Parameters:
    predictions: predicted bounding boxes, shape (..., 4)
    labels: ground-truth bounding boxes, shape (..., 4)
    box_format: "midpoint" for (x, y, w, h) or "corners" for (x1, y1, x2, y2)

  Returns:
    tensor of IoU values, shape (..., 1)
  '''

  if box_format == "midpoint":
      # convert (center x, center y, width, height) to corner coordinates
      box1_x1 = predictions[..., 0:1] - predictions[..., 2:3] / 2
      box1_y1 = predictions[..., 1:2] - predictions[..., 3:4] / 2
      box1_x2 = predictions[..., 0:1] + predictions[..., 2:3] / 2
      box1_y2 = predictions[..., 1:2] + predictions[..., 3:4] / 2
      box2_x1 = labels[..., 0:1] - labels[..., 2:3] / 2
      box2_y1 = labels[..., 1:2] - labels[..., 3:4] / 2
      box2_x2 = labels[..., 0:1] + labels[..., 2:3] / 2
      box2_y2 = labels[..., 1:2] + labels[..., 3:4] / 2

  elif box_format == "corners":
      box1_x1 = predictions[..., 0:1]
      box1_y1 = predictions[..., 1:2]
      box1_x2 = predictions[..., 2:3]
      box1_y2 = predictions[..., 3:4]
      box2_x1 = labels[..., 0:1]
      box2_y1 = labels[..., 1:2]
      box2_x2 = labels[..., 2:3]
      box2_y2 = labels[..., 3:4]

  else:
      raise ValueError(f"Unknown box_format: {box_format!r}")

  x1 = torch.max(box1_x1, box2_x1)
  y1 = torch.max(box1_y1, box2_y1)
  x2 = torch.min(box1_x2, box2_x2)
  y2 = torch.min(box1_y2, box2_y2)

  # clamp(0) for boxes with no intersection  
  intersection = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
  box1_area = abs((box1_x2 - box1_x1) * (box1_y2 - box1_y1))
  box2_area = abs((box2_x2 - box2_x1) * (box2_y2 - box2_y1))

  return intersection / (box1_area + box2_area - intersection + 1e-6)


def mean_average_precision(
    pred_boxes, true_boxes, iou_threshold=0.5, box_format="midpoint", num_classes=20
):
    """
    Calculates mean average precision 

    Parameters:
        pred_boxes (list): list of predicted bboxes, each specified as
        [train_idx, class_prediction, prob_score, x1, y1, x2, y2]
        (the four coordinates are interpreted according to box_format)
        true_boxes (list): same layout as pred_boxes, but for the ground-truth boxes
        iou_threshold (float): IoU above which a predicted bbox counts as correct
        box_format (str): "midpoint" or "corners" used to specify bboxes
        num_classes (int): number of classes

    Returns:
        torch.Tensor: scalar mAP value across all classes at the given IoU threshold
    """

    # list storing all AP for respective classes
    average_precisions = []

    # used for numerical stability later on
    epsilon = 1e-6

    for c in range(num_classes):
        class_detections = []
        ground_truths = []

        # collect the detections predicted as the current class
        for detection in pred_boxes:
            if detection[1] == c:
                class_detections.append(detection)

        # collect the ground-truth boxes of the current class
        for true_box in true_boxes:
            if true_box[1] == c:
                ground_truths.append(true_box)

        # Count the ground-truth boxes per image. For example, if image 0
        # has 3 boxes and image 1 has 5, then amount_bboxes = {0: 3, 1: 5}
        amount_bboxes = Counter([gt[0] for gt in ground_truths])

        # Replace each count with a zero tensor of that length, used to flag
        # which ground-truth boxes have already been matched:
        # amount_bboxes = {0: torch.tensor([0, 0, 0]), 1: torch.tensor([0, 0, 0, 0, 0])}
        for key, val in amount_bboxes.items():
            amount_bboxes[key] = torch.zeros(val)

        # sort by box probabilities which is index 2
        class_detections.sort(key=lambda x: x[2], reverse=True)

        TP = torch.zeros((len(class_detections)))
        FP = torch.zeros((len(class_detections)))

        total_true_bboxes = len(ground_truths)
        
        # If none exists for this class then we can safely skip
        if total_true_bboxes == 0:
            continue

        for detection_idx, detection in enumerate(class_detections):
            # Only take out the ground_truths that have the same
            # training idx as detection
            ground_truth_img = [
                bbox for bbox in ground_truths if bbox[0] == detection[0]
            ]

            num_gts = len(ground_truth_img)
            best_iou = 0

            for idx, gt in enumerate(ground_truth_img):
                iou = intersection_over_union(
                    torch.tensor(detection[3:]),
                    torch.tensor(gt[3:]),
                    box_format=box_format,
                )

                if iou > best_iou:
                    best_iou = iou
                    best_gt_idx = idx

            if best_iou > iou_threshold:
                # each ground-truth box may be matched only once
                if amount_bboxes[detection[0]][best_gt_idx] == 0:
                    # first detection of this box: true positive
                    TP[detection_idx] = 1
                    amount_bboxes[detection[0]][best_gt_idx] = 1
                else:
                    # box was already matched: duplicate detection
                    FP[detection_idx] = 1

            # if IOU is lower then the detection is a false positive
            else:
                FP[detection_idx] = 1

        TP_cumsum = torch.cumsum(TP, dim=0)
        FP_cumsum = torch.cumsum(FP, dim=0)

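        precisions = TP_cumsum / (TP_cumsum + FP_cumsum + epsilon)
        # prepend the point (recall=0, precision=1); float literals keep
        # the dtype consistent with the float tensors being concatenated
        precisions = torch.cat((torch.tensor([1.0]), precisions))

        recalls = TP_cumsum / (total_true_bboxes + epsilon)
        recalls = torch.cat((torch.tensor([0.0]), recalls))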

        # append current class average precision
        average_precisions.append(torch.trapz(precisions, recalls))

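    # no class had any ground-truth boxes: avoid dividing by zero
    if not average_precisions:
        return torch.tensor(0.0)

    return sum(average_precisions) / len(average_precisions)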
