# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

# **Intersection over Union (IoU) in YOLO**

---

## 1. Introduction

In **object detection**, models like **YOLO**, **Faster R-CNN**, and **SSD** predict **bounding boxes** around objects.
To evaluate how well a predicted box matches the ground truth, we use **Intersection over Union (IoU)**.

**IoU** is a metric that measures the **overlap** between two bounding boxes:

* The **predicted bounding box** from the model, and
* The **ground truth bounding box** (actual labeled object).

---

## 2. Definition

The **Intersection over Union (IoU)** is defined as:

$$
IoU = \frac{\text{Area of Overlap}}{\text{Area of Union}}
$$

That is:

$$
IoU = \frac{|B_p \cap B_{gt}|}{|B_p \cup B_{gt}|}
$$

Where:

* $B_p$: predicted bounding box
* $B_{gt}$: ground truth bounding box
* $\cap$: intersection area (common area between boxes)
* $\cup$: union area (combined area of both boxes)

---

## 3. Intuition

* **IoU = 1.0** → perfect overlap (prediction is exactly correct).
* **IoU = 0.0** → no overlap (completely wrong).
* **IoU ≥ 0.5** → the conventional cutoff for counting a detection as correct (e.g., PASCAL VOC); stricter benchmarks such as COCO also evaluate at higher thresholds, up to 0.95.

---

## 4. IoU in YOLO

In **YOLO (You Only Look Once)**, IoU plays two main roles:

### (a) During Training

YOLO divides the image into a grid of cells (e.g., 13×13 or 19×19).
In anchor-based versions (YOLOv2 onward), each cell predicts multiple boxes relative to **anchor boxes** (predefined box widths and heights).

For each ground-truth object:

* YOLO computes IoU between the object and all anchors.
* The anchor with the **highest IoU** is assigned to predict that object.

This ensures:

* Each object is detected by **only one anchor box**.
* Anchors specialize in detecting objects of certain sizes or shapes.
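
The matching step above can be sketched as follows. This is a minimal illustration, assuming the common convention of comparing only widths and heights (both boxes are treated as sharing the same centre). The names `wh_iou` and `best_anchor` and the anchor sizes are hypothetical and not taken from any YOLO codebase.

```python
def wh_iou(wh_a, wh_b):
    """IoU of two boxes that share the same centre, given only (width, height)."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def best_anchor(gt_wh, anchors):
    """Return (index, IoU) of the anchor whose shape best matches the ground-truth box."""
    ious = [wh_iou(gt_wh, a) for a in anchors]
    best = max(range(len(anchors)), key=lambda k: ious[k])
    return best, ious[best]


# Hypothetical anchor shapes (width, height) in pixels: tall, square, wide
anchors = [(30, 60), (60, 60), (120, 60)]
gt_wh = (100, 50)  # a wide ground-truth object

idx, iou = best_anchor(gt_wh, anchors)
print(f"best anchor: {idx}, IoU = {iou:.3f}")
```

Here the wide anchor wins, which is how anchors end up specializing in particular object shapes.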

### (b) During Inference (Post-Processing)

YOLO outputs multiple bounding boxes per object.
To remove duplicates, YOLO applies **Non-Maximum Suppression (NMS)** based on IoU:

1. Keep the box with the **highest confidence score**.
2. Remove all other boxes that have **IoU > threshold (e.g., 0.5)** with it.
3. Repeat with the next highest-scoring box among those left, until no boxes remain.

This ensures that each object is represented by only one bounding box.

---

## 5. Example Calculation

### Given:

Ground Truth Box: `(x1=2, y1=2, x2=6, y2=6)`
Predicted Box: `(x1=4, y1=4, x2=8, y2=8)`

### Step 1: Find Intersection

Intersection coordinates:

```
x_left = max(2, 4) = 4
y_top = max(2, 4) = 4
x_right = min(6, 8) = 6
y_bottom = min(6, 8) = 6
```

Intersection area = (6 - 4) × (6 - 4) = 4

### Step 2: Find Union

Area of GT = 4×4 = 16
Area of Prediction = 4×4 = 16
Union = 16 + 16 - 4 = 28

### Step 3: IoU

$$
IoU = \frac{4}{28} \approx 0.1429
$$

So the IoU ≈ **0.14** (poor match).

---

## 6. Python Implementation Example

```python
def calculate_iou(box1, box2):
    """
    Compute the IoU of two axis-aligned boxes.
    box format: [x1, y1, x2, y2]
    """
    # Corners of the intersection rectangle
    x_left = max(box1[0], box2[0])
    y_top = max(box1[1], box2[1])
    x_right = min(box1[2], box2[2])
    y_bottom = min(box1[3], box2[3])

    # Disjoint or merely touching boxes have zero intersection area
    if x_right <= x_left or y_bottom <= y_top:
        return 0.0

    intersection_area = (x_right - x_left) * (y_bottom - y_top)

    box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
    box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])

    # Subtract the intersection once so it is not counted twice
    union_area = box1_area + box2_area - intersection_area

    return intersection_area / union_area

# Example
boxA = [2, 2, 6, 6]
boxB = [4, 4, 8, 8]

print("IoU:", calculate_iou(boxA, boxB))
```

Output:

```
IoU: 0.14285714285714285
```
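
In practice IoU is usually needed for many box pairs at once. The following is a minimal NumPy sketch of a pairwise version, assuming all boxes have positive area. The name `pairwise_iou` is hypothetical and chosen for illustration only.

```python
import numpy as np

def pairwise_iou(boxes_a, boxes_b):
    """IoU matrix of shape (len(boxes_a), len(boxes_b)); boxes are [x1, y1, x2, y2]."""
    a = np.asarray(boxes_a, dtype=float)[:, None, :]  # shape (N, 1, 4)
    b = np.asarray(boxes_b, dtype=float)[None, :, :]  # shape (1, M, 4)

    # Width and height of each pairwise intersection, clipped at zero for disjoint boxes
    inter_w = np.clip(np.minimum(a[..., 2], b[..., 2]) - np.maximum(a[..., 0], b[..., 0]), 0, None)
    inter_h = np.clip(np.minimum(a[..., 3], b[..., 3]) - np.maximum(a[..., 1], b[..., 1]), 0, None)
    inter = inter_w * inter_h

    area_a = (a[..., 2] - a[..., 0]) * (a[..., 3] - a[..., 1])
    area_b = (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    union = area_a + area_b - inter
    return inter / union

gt_boxes = [[2, 2, 6, 6]]
pred_boxes = [[4, 4, 8, 8], [2, 2, 6, 6], [10, 10, 14, 14]]
iou_matrix = pairwise_iou(gt_boxes, pred_boxes)
print(iou_matrix.round(3))
```

Broadcasting the `(N, 1, 4)` and `(1, M, 4)` arrays avoids an explicit Python loop over all pairs.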

---

## 7. IoU Thresholds in YOLO

| IoU Range     | Typical Interpretation                    |
| ------------- | ----------------------------------------- |
| < 0.3         | Poor prediction (ignored or background)   |
| 0.3 – 0.5     | Ambiguous region                          |
| ≥ 0.5         | Good detection (counted as True Positive) |
| ≥ 0.75        | High-quality detection                    |

These cutoffs are common conventions, not fixed rules; the exact values depend on the benchmark and configuration.

In evaluation metrics:

* **mAP@0.5** → Mean Average Precision at IoU = 0.5
* **mAP@0.5:0.95** → Mean AP averaged over IoU thresholds (0.5 to 0.95, step 0.05)
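
To make the threshold sweep concrete, the sketch below lists which of the ten thresholds a single detection would satisfy. It covers only the per-detection IoU criterion; a full mAP computation also ranks detections by confidence and integrates precision over recall. The function name `passed_thresholds` is hypothetical.

```python
def passed_thresholds(iou, start=0.50, stop=0.95, step=0.05):
    """List the IoU thresholds in [start, stop] that a detection with this IoU satisfies."""
    n = int(round((stop - start) / step)) + 1
    # Round to two decimals to avoid floating-point drift such as 0.6000000000000001
    thresholds = [round(start + k * step, 2) for k in range(n)]
    return [t for t in thresholds if iou >= t]

print(passed_thresholds(0.72))  # counts as a match at the five loosest thresholds
print(passed_thresholds(0.14))  # matches at none of them
```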

---

## 8. IoU Variants in YOLOv5–YOLOv8

YOLOv5 and later use **IoU-based loss variants**, originally proposed in separate research papers, for better localization and convergence:

| Metric                     | Description                                  | Purpose                                   |
| -------------------------- | -------------------------------------------- | ----------------------------------------- |
| **IoU**                    | Basic overlap measure                        | Simple baseline                           |
| **GIoU (Generalized IoU)** | Adds penalty from the smallest enclosing box | Better gradient when boxes don’t overlap  |
| **DIoU (Distance IoU)**    | Adds distance between box centers            | Improves localization                     |
| **CIoU (Complete IoU)**    | Combines distance, overlap, and aspect ratio | Most effective for YOLOv5–YOLOv8 training |
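
As a rough sketch of how the first three rows differ, the function below computes IoU, GIoU, and DIoU for one pair of boxes using the standard definitions: GIoU subtracts the fraction of the enclosing box not covered by the union, and DIoU subtracts the normalized squared centre distance. The function name `iou_giou_diou` is hypothetical.

```python
def iou_giou_diou(box_a, box_b):
    """Return (IoU, GIoU, DIoU) for two boxes in [x1, y1, x2, y2] format."""
    # Intersection, clipped at zero for disjoint boxes
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs
    cw = max(box_a[2], box_b[2]) - min(box_a[0], box_b[0])
    ch = max(box_a[3], box_b[3]) - min(box_a[1], box_b[1])
    enclose_area = cw * ch

    # GIoU: penalize the part of the enclosing box not covered by the union
    giou = iou - (enclose_area - union) / enclose_area

    # DIoU: penalize squared centre distance relative to the squared enclosing diagonal
    dx = (box_a[0] + box_a[2]) / 2 - (box_b[0] + box_b[2]) / 2
    dy = (box_a[1] + box_a[3]) / 2 - (box_b[1] + box_b[3]) / 2
    diou = iou - (dx ** 2 + dy ** 2) / (cw ** 2 + ch ** 2)
    return iou, giou, diou

overlapping = iou_giou_diou([2, 2, 6, 6], [4, 4, 8, 8])
disjoint = iou_giou_diou([2, 2, 6, 6], [10, 10, 14, 14])
print("overlapping:", [round(x, 3) for x in overlapping])
print("disjoint:   ", [round(x, 3) for x in disjoint])
```

For the disjoint pair, plain IoU is 0 regardless of how far apart the boxes are, while GIoU and DIoU become negative and therefore still carry information about the gap.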

### CIoU Loss (used in YOLOv5)

$$
L_{CIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v
$$

Where:

* $\rho(b, b^{gt})$: Euclidean distance between box centers
* $c$: diagonal length of the smallest enclosing box
* $v$: aspect ratio consistency term
* $\alpha$: weighting factor

This ensures boxes not only overlap but are **well-aligned and proportional**.
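
The loss above can be sketched in plain Python. The text does not spell out $v$ and $\alpha$; this sketch assumes the definitions from the CIoU paper, $v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$ and $\alpha = \frac{v}{(1 - IoU) + v}$. The function name `ciou_loss` and the small `eps` guard are illustrative choices, not part of any YOLO API.

```python
import math

def ciou_loss(pred, gt, eps=1e-9):
    """CIoU loss for two boxes in [x1, y1, x2, y2] format (assumes positive width and height)."""
    # Plain IoU
    iw = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    w_p, h_p = pred[2] - pred[0], pred[3] - pred[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
    union = w_p * h_p + w_gt * h_gt - inter
    iou = inter / union

    # rho^2: squared distance between the two box centres
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2

    # c^2: squared diagonal of the smallest enclosing box
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio term v and its weight alpha (definitions assumed from the CIoU paper)
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w_p / h_p)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v

loss = ciou_loss([4, 4, 8, 8], [2, 2, 6, 6])
print(f"CIoU loss: {loss:.4f}")
```

For the two boxes from the worked example, both are squares, so $v = 0$ and the loss reduces to $1 - \frac{1}{7} + \frac{8}{72} = \frac{61}{63}$.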

---

## 9. Visualization Example

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches

fig, ax = plt.subplots(1)
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)

# Ground truth
gt = patches.Rectangle((2, 2), 4, 4, linewidth=2, edgecolor='g', facecolor='none', label='Ground Truth')
# Prediction
pred = patches.Rectangle((4, 4), 4, 4, linewidth=2, edgecolor='r', facecolor='none', label='Prediction')

ax.add_patch(gt)
ax.add_patch(pred)
ax.legend()
plt.gca().invert_yaxis()
plt.title("IoU = 0.14 (Low Overlap)")
plt.show()
```

---

## 10. Summary

| Concept              | Description                                   |
| -------------------- | --------------------------------------------- |
| **IoU Definition**   | Ratio of overlap area to union area           |
| **Used in YOLO for** | Anchor matching and NMS                       |
| **High IoU ⇒**       | Better localization                           |
| **IoU Thresholds**   | ≥0.5 = good detection                         |
| **Variants**         | GIoU, DIoU, CIoU for better training feedback |
| **In Evaluation**    | Used in mAP@IoU metrics                       |

---

### Key Insight

In YOLO, **IoU is the foundation** for:

* Assigning anchors during training
* Measuring localization accuracy
* Filtering detections using NMS
* Computing final evaluation metrics (mAP)

# **Non-Maximum Suppression (NMS) in YOLO**

---

## 1. Introduction

In object detection, models like **YOLO** predict **multiple bounding boxes** per object.
These predictions often **overlap heavily**, because:

* Each grid cell predicts multiple anchors.
* Nearby grid cells may detect the same object.

To keep only the **best detection per object**, YOLO applies **Non-Maximum Suppression (NMS)**.

---

## 2. Purpose

**Goal of NMS**:

* Remove duplicate bounding boxes for the same object.
* Keep only the box with the **highest confidence score**.

Without NMS:

* A single object could have 3–10 overlapping boxes.
* Evaluation metrics like **mAP** would drop, because every duplicate box for an already-matched object counts as a false positive.

---

## 3. How NMS Works

### Step-by-step

1. **Sort predictions by confidence score** (objectness × class probability).
2. Take the **highest-scoring box** and keep it.
3. Compute **IoU** of this box with all remaining boxes.
4. Remove boxes with **IoU > threshold** (e.g., 0.5).
5. Repeat steps 2–4 until no boxes remain.

---

### Example Workflow

| Step | Boxes (with scores)              |
| ---- | -------------------------------- |
| 1    | B1(0.9), B2(0.8), B3(0.6)        |
| 2    | Keep B1 (highest score 0.9)      |
| 3    | Compute IoU(B1, B2), IoU(B1, B3) |
| 4    | Remove boxes with IoU > 0.5      |
| 5    | Repeat for remaining boxes       |

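**Result:** Within each group of heavily overlapping boxes, only the highest-scoring one remains; boxes whose IoU with every kept box is at or below the threshold survive.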

---

## 4. NMS Algorithm (Python Example)

```python
import numpy as np

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """
    boxes: list of [x1, y1, x2, y2]
    scores: confidence scores for each box
    Returns the indices of the boxes to keep, highest score first.
    Relies on calculate_iou() defined in the IoU section above.
    """
    boxes = np.array(boxes)
    scores = np.array(scores)
    idxs = scores.argsort()[::-1]  # sort by score descending
    keep = []

    while len(idxs) > 0:
        i = idxs[0]
        keep.append(int(i))  # store a plain int so the list prints cleanly
        if len(idxs) == 1:
            break

        # Compute IoU of highest-score box with remaining boxes
        ious = np.array([calculate_iou(boxes[i], boxes[j]) for j in idxs[1:]])

        # Keep boxes with IoU <= threshold
        idxs = idxs[1:][ious <= iou_threshold]

    return keep

# Example usage
boxes = [[2, 2, 6, 6], [4, 4, 8, 8], [10, 10, 14, 14]]
scores = [0.9, 0.8, 0.7]

# IoU(box 0, box 1) is only about 0.14, so a low threshold of 0.1 is used here
# to make the suppression visible; at the typical 0.5 all three boxes would be kept.
keep_idx = non_max_suppression(boxes, scores, iou_threshold=0.1)
print("Boxes to keep:", keep_idx)
```

**Output**:

```
Boxes to keep: [0, 2]
```

---

## 5. NMS in YOLO

* YOLO predicts **multiple boxes per grid cell**.
* Each box has:

  * **Objectness score** (confidence that object exists)
  * **Class probabilities** (probability for each class)
* Final box score = **objectness × class probability**
* NMS is applied **per class**, i.e., remove duplicates for each object class independently.

### Key Parameters in YOLO:

* **Confidence threshold**: Minimum score to consider a box.
* **IoU threshold for NMS**: Remove boxes with IoU > threshold.
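
The scoring, confidence filtering, and per-class suppression described above can be combined as in the sketch below. It assumes each box is assigned only its most probable class; the function names `box_iou` and `classwise_nms`, the input layout, and the default threshold values are illustrative assumptions rather than any YOLO implementation's API.

```python
from collections import defaultdict

def box_iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def classwise_nms(detections, conf_threshold=0.25, iou_threshold=0.5):
    """detections: list of (box, objectness, class_probs). Returns kept (box, score, class_id)."""
    by_class = defaultdict(list)
    for box, objectness, class_probs in detections:
        class_id = max(range(len(class_probs)), key=lambda c: class_probs[c])
        score = objectness * class_probs[class_id]  # final box score
        if score >= conf_threshold:                 # confidence threshold
            by_class[class_id].append((box, score))

    kept = []
    for class_id, candidates in by_class.items():
        candidates.sort(key=lambda d: d[1], reverse=True)
        while candidates:
            best = candidates.pop(0)
            kept.append((best[0], best[1], class_id))
            # Suppress only boxes of the same class that overlap the kept box too much
            candidates = [d for d in candidates if box_iou(best[0], d[0]) <= iou_threshold]
    return kept

detections = [
    ([2, 2, 6, 6], 0.9, [0.9, 0.1]),      # class 0, score 0.81
    ([2, 3, 6, 7], 0.8, [0.8, 0.2]),      # class 0, score 0.64, IoU 0.6 with the first box
    ([2, 2, 6, 6], 0.7, [0.1, 0.9]),      # class 1, score 0.63, same location, other class
    ([10, 10, 14, 14], 0.3, [0.5, 0.5]),  # score 0.15, below the confidence threshold
]
kept = classwise_nms(detections)
for box, score, class_id in kept:
    print(class_id, box, round(score, 2))
```

The third detection survives even though it coincides with the first box, because suppression never crosses class boundaries.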

---

## 6. Soft-NMS (Optional Advanced)

Instead of removing overlapping boxes completely, **Soft-NMS** reduces their scores gradually:

$$
score_i = score_i \times \left(1 - IoU(B_{max}, B_i)\right)
$$

* Prevents missing detections in crowded scenes.
* Sometimes used in **YOLOv5+** for better multi-object detection.
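
A minimal sketch of this linear decay follows. It applies the formula above to every remaining box; the original Soft-NMS formulation decays scores only above an overlap threshold and also offers a Gaussian decay, both omitted here. The names `box_iou` and `soft_nms_linear` and the `score_threshold` cutoff are hypothetical.

```python
def box_iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def soft_nms_linear(boxes, scores, score_threshold=0.001):
    """Return (index, final_score) pairs in selection order, decaying overlapping scores."""
    scores = list(scores)  # copy so the caller's list is not modified
    remaining = list(range(len(boxes)))
    selected = []
    while remaining:
        best = max(remaining, key=lambda k: scores[k])
        remaining.remove(best)
        selected.append((best, scores[best]))
        # Decay each remaining score by its overlap with the box just selected
        for k in remaining:
            scores[k] *= 1.0 - box_iou(boxes[best], boxes[k])
        # Drop boxes whose score has decayed to almost nothing
        remaining = [k for k in remaining if scores[k] >= score_threshold]
    return selected

boxes = [[2, 2, 6, 6], [2, 3, 6, 7], [10, 10, 14, 14]]
scores = [0.9, 0.8, 0.7]
result = soft_nms_linear(boxes, scores)
for idx, score in result:
    print(idx, round(score, 3))
```

Box 1 overlaps box 0 with IoU 0.6, so its score drops from 0.8 to 0.32 instead of being discarded; a later confidence threshold decides whether it is reported.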

---

## 7. Visualization Example

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches

boxes = [[2, 2, 6, 6], [4, 4, 8, 8], [10, 10, 14, 14]]
scores = [0.9, 0.8, 0.7]
keep = [0, 2]  # indices kept by NMS in the example above

fig, ax = plt.subplots(1)
ax.set_xlim(0, 15)
ax.set_ylim(0, 15)

colors = ['g', 'r', 'b']
for i, box in enumerate(boxes):
    # Solid outline for kept boxes, dashed outline for suppressed ones
    style = '-' if i in keep else '--'
    rect = patches.Rectangle((box[0], box[1]), box[2]-box[0], box[3]-box[1],
                             linewidth=2, linestyle=style, edgecolor=colors[i], facecolor='none')
    ax.add_patch(rect)
    ax.text(box[0], box[1]-0.5, f"{scores[i]:.2f}", color=colors[i])

plt.gca().invert_yaxis()
plt.title("Non-Maximum Suppression Example")
plt.show()
```

* Green (solid): highest score, kept
* Red (dashed): overlaps the green box, removed
* Blue (solid): separate box, kept

---

## 8. Summary

| Concept           | Description                                                          |
| ----------------- | -------------------------------------------------------------------- |
| **NMS**           | Filters overlapping predictions, keeps only highest confidence boxes |
| **IoU threshold** | Determines overlap cutoff for suppression (typical 0.5)              |
| **Class-wise**    | NMS applied separately for each object class                         |
| **Soft-NMS**      | Gradually decreases scores instead of hard removal                   |
| **Used in YOLO**  | Post-processing of raw predictions at inference time                 |

---

**Key Takeaway:**
IoU tells you **how well a predicted box matches the ground truth**, while NMS uses IoU to **eliminate duplicate detections**, ensuring each object is represented by a single, confident bounding box.

[https://github.com/AlexeyAB/darknet#how-to-compile-on-windows-using-cmake](https://github.com/AlexeyAB/darknet#how-to-compile-on-windows-using-cmake)