# 📉 Task 8: Loss Function Mathematics

## 🎯 Objective
Understand the loss functions used in YOLO training and implement them from scratch in NumPy.

---

## 📚 Why Loss Functions Matter

The loss function guides learning by measuring how wrong predictions are:
- **Lower loss** = predictions closer to the targets
- **Negative gradient of the loss** = direction in which to adjust the weights
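
As a minimal sketch of these two points, the snippet below runs plain gradient descent on a toy one-parameter squared-error loss. The loss, the learning rate, and all names (`toy_loss`, `toy_grad`, `lr`) are illustrative assumptions, not part of the YOLO pipeline.

```python
def toy_loss(w, target=3.0):
    """Squared error between a single parameter and its target."""
    return (w - target) ** 2


def toy_grad(w, target=3.0):
    """Analytic derivative of toy_loss with respect to w."""
    return 2.0 * (w - target)


w = 0.0    # initial guess
lr = 0.1   # learning rate (assumed value)
history = [toy_loss(w)]

for _ in range(20):
    w -= lr * toy_grad(w)          # step against the gradient
    history.append(toy_loss(w))

print(f"final w = {w:.4f}")
print(f"loss: {history[0]:.4f} -> {history[-1]:.6f}")
```

Each step moves `w` against the gradient, so the recorded loss shrinks at every iteration.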

### ML Rules Applied:
- **Rule #21**: The number you optimize is not the one you want to maximize
- **Rule #22**: Keep your code modular for fast experimentation

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from pathlib import Path

np.random.seed(42)
PROJECT_ROOT = Path(r"D:\het\SELF\RP\YOLO-V11-PRO")
print("✅ Libraries imported (NumPy for the math, Matplotlib for plots)")

---

# Part 1: YOLO Total Loss

## 📐 YOLO Loss Formula

$$
\mathcal{L}_{total} = \lambda_{box} \cdot \mathcal{L}_{box} + \lambda_{cls} \cdot \mathcal{L}_{cls} + \lambda_{obj} \cdot \mathcal{L}_{obj}
$$

Where:
- **L_box**: Bounding box regression loss
- **L_cls**: Classification loss
- **L_obj**: Objectness/confidence loss
- **λ**: Weight coefficients
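
A small worked example of the weighted sum may help. The component loss values below are made up for illustration; the weights reuse the defaults that the `YOLOLoss` class later in this notebook takes (0.05, 0.5, 1.0).

```python
# Hypothetical component losses, chosen only for illustration.
component_losses = {"box": 0.20, "cls": 0.40, "obj": 0.10}

# Weights matching the YOLOLoss defaults used later in this notebook.
weights = {"box": 0.05, "cls": 0.5, "obj": 1.0}

# Each term contributes lambda * L to the total.
weighted = {name: weights[name] * component_losses[name] for name in component_losses}
total_loss = sum(weighted.values())

for name, value in weighted.items():
    print(f"{name}: {value:.3f}")
print(f"total: {total_loss:.3f}")
```

With these numbers the box term contributes only 0.01 to the total even though its raw loss is 0.20, which shows how the λ coefficients control the balance between components.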

```
YOLO Predictions
       │
       ├──────────────┬──────────────┐
       │              │              │
       ↓              ↓              ↓
   Box Loss      Class Loss     Obj Loss
   (CIoU)        (BCE/Focal)    (BCE)
       │              │              │
       └──────────────┴──────────────┘
                      │
                Total Loss
```

---

# Part 2: Binary Cross-Entropy (BCE) Loss

## 📐 Mathematical Definition

BCE is used for the objectness score and for per-class classification:

$$
BCE(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]
$$

Where:
- **y**: Ground truth (0 or 1)
- **ŷ**: Predicted probability in [0, 1]
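
To check the formula by hand, the sketch below evaluates the per-sample term for four (label, probability) pairs. The pairs and the helper name `bce_term` are illustrative assumptions.

```python
import math

# (label, predicted probability) pairs, chosen only for illustration
samples = [(1, 0.9), (1, 0.1), (0, 0.1), (0, 0.9)]


def bce_term(y, p):
    """Per-sample BCE term: -[y*log(p) + (1-y)*log(1-p)]."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


terms = [bce_term(y, p) for y, p in samples]
mean_bce = sum(terms) / len(terms)

for (y, p), t in zip(samples, terms):
    print(f"y={y}, p={p}: loss={t:.4f}")
print(f"mean BCE: {mean_bce:.4f}")
```

Confident correct predictions (y=1, p=0.9 and y=0, p=0.1) cost about 0.105 each, while confident wrong ones cost about 2.303, roughly 22 times more.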

In [None]:
# ============================================================
# BINARY CROSS-ENTROPY - NumPy Implementation
# ============================================================

def binary_cross_entropy(y_true, y_pred, epsilon=1e-7):
    """
    Binary Cross-Entropy Loss (NumPy).
    
    Formula: BCE = -[y·log(ŷ) + (1-y)·log(1-ŷ)]
    
    Args:
        y_true: Ground truth labels (0 or 1)
        y_pred: Predicted probabilities [0, 1]
        epsilon: Small value to prevent log(0)
    
    Returns:
        BCE loss value
    """
    # Clip predictions to prevent log(0)
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    
    # Calculate BCE
    loss = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    
    return np.mean(loss)

# Test
y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([0.9, 0.1, 0.8, 0.7, 0.3])

bce = binary_cross_entropy(y_true, y_pred)
print(f"✅ BCE Loss: {bce:.4f}")
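
`binary_cross_entropy` above avoids `log(0)` by clipping probabilities with `epsilon`. An alternative, sketched here under the assumption that the model outputs raw logits rather than probabilities, is to compute BCE directly from the logits with the identity `max(z, 0) - z*y + log(1 + exp(-|z|))`. The function name `bce_with_logits` is hypothetical and is not used elsewhere in this notebook.

```python
import numpy as np


def bce_with_logits(y_true, logits):
    """Numerically stable BCE computed from raw logits (pre-sigmoid scores)."""
    y_true = np.asarray(y_true, dtype=float)
    logits = np.asarray(logits, dtype=float)
    # max(z, 0) - z*y + log(1 + exp(-|z|)) equals the usual BCE of sigmoid(z),
    # but never evaluates log(0) or exp of a large positive number.
    per_elem = (np.maximum(logits, 0.0)
                - logits * y_true
                + np.log1p(np.exp(-np.abs(logits))))
    return float(np.mean(per_elem))


# Same example as above, with probabilities converted to logits
probs = np.array([0.9, 0.1, 0.8, 0.7, 0.3])
labels = np.array([1, 0, 1, 1, 0])
example_logits = np.log(probs / (1 - probs))

print(f"BCE from logits: {bce_with_logits(labels, example_logits):.4f}")
print(f"Extreme logits:  {bce_with_logits([1, 0], [1000.0, -1000.0]):.4f}")
```

For moderate logits this agrees with the clipped version; for extreme logits it stays finite without needing a clipping constant.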

In [None]:
# Visualize BCE
def visualize_bce():
    """Visualize BCE loss behavior."""
    fig, axes = plt.subplots(1, 2, figsize=(12, 4))
    
    p = np.linspace(0.01, 0.99, 100)
    
    # When y=1 (should predict high)
    loss_y1 = -np.log(p)
    axes[0].plot(p, loss_y1, 'b-', linewidth=2)
    axes[0].set_xlabel('Predicted Probability ŷ')
    axes[0].set_ylabel('Loss')
    axes[0].set_title('BCE when y=1 (should predict HIGH)')
    axes[0].axvline(x=1, color='g', linestyle='--', label='Target')
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    
    # When y=0 (should predict low)
    loss_y0 = -np.log(1 - p)
    axes[1].plot(p, loss_y0, 'r-', linewidth=2)
    axes[1].set_xlabel('Predicted Probability ŷ')
    axes[1].set_ylabel('Loss')
    axes[1].set_title('BCE when y=0 (should predict LOW)')
    axes[1].axvline(x=0, color='g', linestyle='--', label='Target')
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    
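    plt.suptitle('📉 Binary Cross-Entropy Behavior', fontsize=14, fontweight='bold')
    plt.tight_layout()
    # Create the output folder first; savefig fails if it does not exist
    out_dir = PROJECT_ROOT / 'docs' / 'assets'
    out_dir.mkdir(parents=True, exist_ok=True)
    plt.savefig(out_dir / 'bce_loss.png', dpi=150)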
    plt.show()

visualize_bce()

---

# Part 3: Focal Loss

## 📐 Mathematical Definition

Focal Loss addresses class imbalance by down-weighting easy examples:

$$
FL(p_t) = -\alpha_t (1 - p_t)^\gamma \log(p_t)
$$

Where:
- **p_t**: Probability assigned to the true class (p if y=1, else 1-p)
- **γ** (gamma): Focusing parameter (typically 2)
- **α** (alpha): Class weight
- **(1-p_t)^γ**: Modulating factor that reduces loss for easy examples
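
The effect of the modulating factor can be seen on three example values of p_t. This is a sketch with α_t fixed to 1 so that only γ matters; the chosen p_t values are illustrative.

```python
import numpy as np

gamma = 2.0
# Probability assigned to the true class: easy, ambiguous, hard example
p_t = np.array([0.9, 0.5, 0.1])

ce = -np.log(p_t)                 # plain cross-entropy per example
modulating = (1 - p_t) ** gamma   # focal modulating factor
focal = modulating * ce           # alpha_t omitted (set to 1) to isolate gamma

for p, m, c, f in zip(p_t, modulating, ce, focal):
    print(f"p_t={p:.1f}  factor={m:.2f}  CE={c:.4f}  focal={f:.4f}")
```

With γ = 2 the easy example (p_t = 0.9) keeps only 1% of its cross-entropy loss, while the hard example (p_t = 0.1) keeps 81%.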

In [None]:
# ============================================================
# FOCAL LOSS - NumPy Implementation
# ============================================================

def focal_loss(y_true, y_pred, gamma=2.0, alpha=0.25, epsilon=1e-7):
    """
    Focal Loss for handling class imbalance.
    
    Formula: FL = -α(1-p_t)^γ × log(p_t)
    
    Args:
        y_true: Ground truth labels
        y_pred: Predicted probabilities
        gamma: Focusing parameter (reduces easy example loss)
        alpha: Weight for the positive class (negatives get 1 - alpha)
        epsilon: Small value to prevent log(0)
    
    Returns:
        Focal loss value
    """
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    
    # p_t = p if y=1, else 1-p
    p_t = y_true * y_pred + (1 - y_true) * (1 - y_pred)
    
    # Alpha weighting
    alpha_t = y_true * alpha + (1 - y_true) * (1 - alpha)
    
    # Focal weight: (1 - p_t)^gamma
    focal_weight = (1 - p_t) ** gamma
    
    # Focal loss
    loss = -alpha_t * focal_weight * np.log(p_t)
    
    return np.mean(loss)

# Compare BCE vs Focal
fl = focal_loss(y_true, y_pred)
print(f"✅ BCE Loss: {bce:.4f}")
print(f"✅ Focal Loss: {fl:.4f}")

In [None]:
# Visualize Focal vs BCE
def visualize_focal_vs_bce():
    """Compare Focal Loss vs BCE."""
    fig, ax = plt.subplots(figsize=(10, 6))
    
    p = np.linspace(0.01, 0.99, 100)
    
    # BCE (when y=1)
    bce = -np.log(p)
    ax.plot(p, bce, 'b-', linewidth=2, label='BCE')
    
    # Focal Loss with different gamma
    for gamma in [0.5, 1, 2, 5]:
        focal = -(1 - p)**gamma * np.log(p)
        ax.plot(p, focal, '--', linewidth=1.5, label=f'Focal γ={gamma}')
    
    ax.set_xlabel('Predicted Probability (for positive class)', fontsize=12)
    ax.set_ylabel('Loss', fontsize=12)
    ax.set_title('📉 BCE vs Focal Loss (when y=1)', fontsize=14, fontweight='bold')
    ax.legend()
    ax.grid(True, alpha=0.3)
    ax.set_ylim(0, 5)
    
    plt.tight_layout()
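    # Create the output folder first; savefig fails if it does not exist
    out_dir = PROJECT_ROOT / 'docs' / 'assets'
    out_dir.mkdir(parents=True, exist_ok=True)
    plt.savefig(out_dir / 'focal_vs_bce.png', dpi=150)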
    plt.show()

visualize_focal_vs_bce()

---

# Part 4: IoU-Based Box Loss (CIoU)

## 📐 Evolution of Box Loss

### 1. L1/L2 Loss (Basic)
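$$L_{box} = \sum \left( |x - \hat{x}| + |y - \hat{y}| + |w - \hat{w}| + |h - \hat{h}| \right)$$

This is the L1 form; the L2 variant squares each difference instead of taking its absolute value.

**Problem**: Treats the four coordinates independently and doesn't consider box overlap directly.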

### 2. IoU Loss
$$L_{IoU} = 1 - IoU$$

**Problem**: When the boxes don't overlap, IoU is 0 no matter how far apart they are, so the loss gives no gradient.
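
The problem is easy to reproduce numerically. In the sketch below (the helper `iou_xyxy` and the box coordinates are illustrative assumptions), a prediction that misses the target by one pixel and one that misses by ninety pixels receive the same IoU loss.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


gt = [0, 0, 10, 10]
near_miss = [11, 0, 21, 10]    # 1 px gap to the ground truth
far_miss = [100, 0, 110, 10]   # 90 px gap to the ground truth

loss_near = 1 - iou_xyxy(gt, near_miss)
loss_far = 1 - iou_xyxy(gt, far_miss)
print(f"IoU loss (near miss): {loss_near:.4f}")
print(f"IoU loss (far miss):  {loss_far:.4f}")
```

Both losses equal 1, so the loss surface is flat for non-overlapping boxes and carries no signal about which direction to move the prediction.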

### 3. GIoU (Generalized IoU)
$$GIoU = IoU - \frac{|C \setminus (A \cup B)|}{|C|}$$

Where A and B are the two boxes and C is the smallest box enclosing both.
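
A minimal sketch of GIoU for axis-aligned boxes follows. The function name `giou_xyxy` and the test boxes are illustrative assumptions, and the code assumes both boxes have positive area.

```python
def giou_xyxy(a, b):
    """GIoU of two boxes [x1, y1, x2, y2]; assumes both have positive area."""
    # Intersection and union, as for plain IoU
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest enclosing box C
    cx1, cy1 = min(a[0], b[0]), min(a[1], b[1])
    cx2, cy2 = max(a[2], b[2]), max(a[3], b[3])
    c_area = (cx2 - cx1) * (cy2 - cy1)

    # Penalty: fraction of C not covered by the union of A and B
    return iou - (c_area - union) / c_area


gt = [0, 0, 10, 10]
near_miss = [11, 0, 21, 10]
far_miss = [100, 0, 110, 10]

print(f"GIoU (near miss): {giou_xyxy(gt, near_miss):.4f}")
print(f"GIoU (far miss):  {giou_xyxy(gt, far_miss):.4f}")
```

Unlike plain IoU, GIoU separates the two non-overlapping cases: the near miss scores about -0.048 and the far miss about -0.818, so the loss `1 - GIoU` decreases as the boxes approach each other.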

### 4. CIoU (Complete IoU) - Used in YOLO
$$CIoU = IoU - \frac{\rho^2(b, b^{gt})}{c^2} - \alpha v$$

Where:
- **ρ**: Euclidean distance between the centers of the predicted box b and the ground-truth box b^gt
- **c**: Diagonal length of the smallest enclosing box
- **v**: Aspect-ratio consistency term
- **α**: Trade-off parameter
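
For reference, the two remaining terms can be written out explicitly. These are the forms that the `ciou_loss` implementation in the next cell computes (up to the small constants it adds for numerical stability), with w, h the predicted width and height and w^{gt}, h^{gt} those of the ground truth:

$$
v = \frac{4}{\pi^2} \left( \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h} \right)^2,
\qquad
\alpha = \frac{v}{(1 - IoU) + v}
$$

When the two boxes have the same aspect ratio, v = 0 and the CIoU penalty reduces to the center-distance term alone.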

In [None]:
# ============================================================
# CIoU LOSS - NumPy Implementation
# ============================================================

def calculate_iou(box1, box2):
    """Calculate IoU between two boxes [x1,y1,x2,y2]."""
    x1_a, y1_a, x2_a, y2_a = box1
    x1_b, y1_b, x2_b, y2_b = box2
    
    inter_x1 = max(x1_a, x1_b)
    inter_y1 = max(y1_a, y1_b)
    inter_x2 = min(x2_a, x2_b)
    inter_y2 = min(y2_a, y2_b)
    
    inter_w = max(0, inter_x2 - inter_x1)
    inter_h = max(0, inter_y2 - inter_y1)
    inter_area = inter_w * inter_h
    
    area_a = (x2_a - x1_a) * (y2_a - y1_a)
    area_b = (x2_b - x1_b) * (y2_b - y1_b)
    union_area = area_a + area_b - inter_area
    
    return inter_area / (union_area + 1e-6)

def ciou_loss(box_pred, box_gt):
    """
    Complete IoU Loss (NumPy).
    
    Formula: CIoU = IoU - (ρ²/c²) - αv
    
    Args:
        box_pred: Predicted [x1, y1, x2, y2]
        box_gt: Ground truth [x1, y1, x2, y2]
    
    Returns:
        CIoU loss value
    """
    # Basic IoU
    iou = calculate_iou(box_pred, box_gt)
    
    # Box centers
    center_pred = [(box_pred[0] + box_pred[2])/2, (box_pred[1] + box_pred[3])/2]
    center_gt = [(box_gt[0] + box_gt[2])/2, (box_gt[1] + box_gt[3])/2]
    
    # Squared distance between centers (ρ²)
    rho2 = (center_pred[0] - center_gt[0])**2 + (center_pred[1] - center_gt[1])**2
    
    # Smallest enclosing box
    enclose_x1 = min(box_pred[0], box_gt[0])
    enclose_y1 = min(box_pred[1], box_gt[1])
    enclose_x2 = max(box_pred[2], box_gt[2])
    enclose_y2 = max(box_pred[3], box_gt[3])
    
    # Squared diagonal of enclosing box (c²)
    c2 = (enclose_x2 - enclose_x1)**2 + (enclose_y2 - enclose_y1)**2 + 1e-6
    
    # Width and height
    w_pred = box_pred[2] - box_pred[0]
    h_pred = box_pred[3] - box_pred[1]
    w_gt = box_gt[2] - box_gt[0]
    h_gt = box_gt[3] - box_gt[1]
    
    # Aspect ratio term (v)
    v = (4 / np.pi**2) * (np.arctan(w_gt / (h_gt + 1e-6)) - np.arctan(w_pred / (h_pred + 1e-6)))**2
    
    # Trade-off parameter (α)
    alpha = v / (1 - iou + v + 1e-6)
    
    # CIoU
    ciou = iou - (rho2 / c2) - alpha * v
    
    return 1 - ciou  # Loss = 1 - CIoU

# Test
box_pred = [100, 100, 200, 200]
box_gt = [110, 105, 195, 205]

ciou = ciou_loss(box_pred, box_gt)
print(f"✅ CIoU Loss: {ciou:.4f}")

---

# Part 5: Complete YOLO Loss

Combining all components:

In [None]:
# ============================================================
# COMPLETE YOLO LOSS - NumPy Implementation
# ============================================================

class YOLOLoss:
    """
    Complete YOLO Loss Function (NumPy only).
    
    Total Loss = λ_box × L_box + λ_cls × L_cls + λ_obj × L_obj
    """
    
    def __init__(self, lambda_box=0.05, lambda_cls=0.5, lambda_obj=1.0):
        self.lambda_box = lambda_box
        self.lambda_cls = lambda_cls
        self.lambda_obj = lambda_obj
    
    def box_loss(self, pred_boxes, gt_boxes):
        """CIoU-based box regression loss."""
        losses = [ciou_loss(p, g) for p, g in zip(pred_boxes, gt_boxes)]
        return np.mean(losses) if losses else 0
    
    def cls_loss(self, pred_cls, gt_cls):
        """Classification loss (BCE here; focal_loss could be swapped in)."""
        return binary_cross_entropy(gt_cls, pred_cls)
    
    def obj_loss(self, pred_obj, gt_obj):
        """Objectness loss (BCE)."""
        return binary_cross_entropy(gt_obj, pred_obj)
    
    def __call__(self, predictions, targets):
        """
        Calculate total loss.
        
        Args:
            predictions: Dict with 'boxes', 'classes', 'objectness'
            targets: Dict with 'boxes', 'classes', 'objectness'
        """
        l_box = self.box_loss(predictions['boxes'], targets['boxes'])
        l_cls = self.cls_loss(predictions['classes'], targets['classes'])
        l_obj = self.obj_loss(predictions['objectness'], targets['objectness'])
        
        total = (self.lambda_box * l_box + 
                self.lambda_cls * l_cls + 
                self.lambda_obj * l_obj)
        
        return {
            'total': total,
            'box_loss': l_box,
            'cls_loss': l_cls,
            'obj_loss': l_obj
        }

# Test
loss_fn = YOLOLoss()

predictions = {
    'boxes': [[100, 100, 200, 200], [300, 300, 400, 400]],
    'classes': np.array([0.9, 0.8]),
    'objectness': np.array([0.95, 0.9])
}

targets = {
    'boxes': [[105, 105, 195, 195], [305, 305, 395, 395]],
    'classes': np.array([1, 1]),
    'objectness': np.array([1, 1])
}

losses = loss_fn(predictions, targets)
print("\n✅ YOLO Loss Components:")
print(f"   Box Loss: {losses['box_loss']:.4f}")
print(f"   Cls Loss: {losses['cls_loss']:.4f}")
print(f"   Obj Loss: {losses['obj_loss']:.4f}")
print(f"   TOTAL: {losses['total']:.4f}")

## 📝 Summary

### Loss Functions Implemented:

| Loss | Formula | Use |
|------|---------|-----|
| **BCE** | -[y·log(ŷ) + (1-y)·log(1-ŷ)] | Classification, Objectness |
| **Focal** | -α(1-p_t)^γ·log(p_t) | Imbalanced classification |
| **CIoU** | 1 - IoU + ρ²/c² + αv | Box regression |

### YOLO Total Loss:
$$\mathcal{L} = \lambda_{box} \cdot (1 - CIoU) + \lambda_{cls} \cdot BCE + \lambda_{obj} \cdot BCE$$

### Next: Phase 3 - Model Development!

In [None]:
print("\n" + "="*60)
print("✅ TASK 8 COMPLETE: Loss Function Mathematics")
print("="*60)
print("\n📋 Implemented (NumPy only):")
print("   ✓ Binary Cross-Entropy (BCE)")
print("   ✓ Focal Loss")
print("   ✓ CIoU Loss")
print("   ✓ Complete YOLOLoss class")
print("\n🎉 Phase 2 Complete (Theory)!")
print("\n➡️ Ready for Phase 3: Model Development")