## YOLO11s Detection Training

This notebook fine-tunes the **YOLO11s** model for **polyp detection** using a dataset derived from **PolypGen** and additional **clinical sources**.

---

### **Dataset Composition**

#### **Training Set**
- **Single images (Centers C1–C5, PolypGen)** — ~**1,100 images**  
  - Static frames augmented offline using Albumentations with a custom script  
    `augment_single_pos_det.py`.  
  - Augmentations include **random cropping**, **geometric transforms**,  
    **brightness/contrast adjustments**, **mild Gaussian noise**, and **flips** — all applied  
    in a **bbox-safe** manner and saved as new images and YOLO labels.  
  - → These are referred to as **static augmentations**.

- **Positive sequences (PolypGen and REAL-Colon)** — cleaned video sequences with redundant or highly similar frames removed.

- **Negative sequences** — frames without polyps from **hospital** data.
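
The cleaning step for the positive sequences is not described in detail here. As one possible illustration, the sketch below drops a frame when it is nearly identical to the last kept frame, measured by mean absolute pixel difference. The function names, the flattened-grayscale frame representation, and the threshold of `8.0` are all hypothetical and are not taken from the project's scripts.

```python
from typing import List, Sequence

Frame = Sequence[int]  # flattened grayscale pixels, values 0-255 (assumed representation)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def drop_similar_frames(frames: List[Frame], threshold: float = 8.0) -> List[int]:
    """Return the indices of frames to keep.

    A frame is kept only if it differs from the last kept frame by more
    than `threshold` (hypothetical value; tune per dataset).
    """
    kept: List[int] = []
    for i, frame in enumerate(frames):
        if not kept or mean_abs_diff(frames[kept[-1]], frame) > threshold:
            kept.append(i)
    return kept


# Toy 4-pixel "frames": 0 and 1 are near-identical, 2 differs, 3 nearly repeats 2.
demo = [[10, 10, 10, 10], [11, 10, 9, 10], [60, 60, 60, 60], [60, 61, 60, 60]]
kept_indices = drop_similar_frames(demo)
print(kept_indices)  # [0, 2]
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents a slow drift of tiny changes from removing an entire sequence.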

---

#### **Validation Set**
- ~**140 single images** from Centers **C1–C5**.  
- **Kvasir-SEG** images (~**1,000**) to test **generalization across distributions**.  
- One **short negative sequence** to ensure negative coverage during validation.

---

### **Training Configuration**

Training uses the configuration file:  
`../configs/data_yolo_split4.yaml`  
and initializes from **`yolo11s.pt`**.

- **Image size:** 832 × 832  
- **Base model:** YOLO11s  
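
The contents of the data configuration file are not shown in this notebook. For orientation, an Ultralytics detection dataset YAML normally defines `path`, `train`, `val`, and `names`. The sketch below holds such a file as a Python string; the directory names and the class name `polyp` are hypothetical placeholders, not the project's actual values.

```python
# Hypothetical layout of an Ultralytics-style dataset YAML (all paths are placeholders).
DATASET_YAML = """\
path: ../datasets/polyp_split4   # hypothetical dataset root
train: images/train              # hypothetical, relative to `path`
val: images/val                  # hypothetical, relative to `path`
names:
  0: polyp                       # assumed single class name
"""


def top_level_keys(yaml_text: str) -> list:
    """Return the top-level keys of a simple YAML document (indented lines are skipped)."""
    keys = []
    for line in yaml_text.splitlines():
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            keys.append(line.split(":", 1)[0])
    return keys


print(top_level_keys(DATASET_YAML))  # ['path', 'train', 'val', 'names']
```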
 
---

### **Augmentation Strategy**

#### **1. Static Albumentations (Offline)**
- Applied **only to single-image PolypGen data (Centers C1–C5)**.  
- Includes:
  - Cropping and rotation  
  - Color and brightness variation  
  - Gaussian noise and motion blur  
  - Horizontal flips  
- Augmented samples are saved as **additional training images and labels**.
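
To make "bbox-safe" concrete, here is a minimal sketch of a single transform, a horizontal flip, applied consistently to an image and its YOLO labels. It is a pure-Python illustration under simplifying assumptions (the image is a list of pixel rows, labels are normalized `class x_center y_center width height` tuples). It is not the code in `augment_single_pos_det.py`, which uses Albumentations.

```python
from typing import List, Tuple

# One YOLO label row: (class_id, x_center, y_center, width, height), normalized to [0, 1].
YoloBox = Tuple[int, float, float, float, float]


def hflip_image(rows: List[List[int]]) -> List[List[int]]:
    """Mirror an image (list of pixel rows) left to right."""
    return [row[::-1] for row in rows]


def hflip_boxes(boxes: List[YoloBox]) -> List[YoloBox]:
    """Mirror YOLO boxes: only x_center changes; y_center and the box size are unchanged."""
    return [(c, 1.0 - x, y, w, h) for c, x, y, w, h in boxes]


image = [[0, 0, 0, 9], [0, 0, 0, 9]]   # 4-pixel-wide toy image, bright column on the right
boxes = [(0, 0.875, 0.5, 0.25, 1.0)]   # box covering that right-hand column

flipped_image = hflip_image(image)
flipped_boxes = hflip_boxes(boxes)
print(flipped_image)  # [[9, 0, 0, 0], [9, 0, 0, 0]]
print(flipped_boxes)  # [(0, 0.125, 0.5, 0.25, 1.0)]
```

The same principle applies to crops and rotations: every geometric change to the pixels needs a matching change to the box coordinates, and boxes that fall mostly outside the new frame should be dropped rather than clipped to a sliver.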

#### **2. YOLO Built-in Augmentations (Online)**
- Applied **dynamically during training**:
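  - **Mosaic:** probability 0.10, disabled for the final 25 epochs (`close_mosaic=25`)  
  - **MixUp:** probability 0.08  
  - **Multi-scale training** to expose the model to varied resolutions around the 832 px base size.  
  - Controlled **geometric** augmentation (rotation up to ±90°, translation 0.05, scale 0.25, shear 3°, horizontal flip 0.5, vertical flip 0.1) and conservative **color jitter** (`hsv_h=0.005`, `hsv_s=0.15`, `hsv_v=0.15`) for robustness to illumination and orientation changes.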
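
As a rough illustration of what multi-scale training does, the sketch below samples a square training resolution per batch and snaps it to a multiple of the model stride. The 0.5x to 1.5x range and the stride of 32 are assumptions made for this example; the exact behavior depends on the installed Ultralytics version, and `sample_train_size` is a hypothetical helper, not a library function.

```python
import random


def sample_train_size(rng: random.Random, imgsz: int = 832, stride: int = 32) -> int:
    """Pick a square size in roughly [0.5, 1.5] x imgsz, snapped down to a stride multiple."""
    low = int(imgsz * 0.5)
    high = int(imgsz * 1.5) + stride  # exclusive upper bound for randrange
    return rng.randrange(low, high) // stride * stride


rng = random.Random(0)  # seeded only to make the demo repeatable
sizes = [sample_train_size(rng) for _ in range(1000)]
print(min(sizes), max(sizes))
```

With `imgsz=832`, every sampled size lies between 416 and 1248 pixels and is divisible by 32, so the model sees polyps at a range of apparent scales without breaking the stride alignment the detection head expects.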

---

### **Results**

| Class | Images | Instances | Box Precision | Box Recall | mAP@50 | mAP@50–95 |
|:------|:-------:|:----------:|:--------------:|:-----------:|:--------:|:-----------:|
| **all** | **1417** | **1211** | **0.916** | **0.870** | **0.937** | **0.743** |
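
The table reports precision and recall separately. A single combined figure, the F1 score, can be derived from the two values as a quick sanity check. The helper below is a small illustrative function, not part of the training pipeline.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.916, 0.870)
print(f"F1 = {f1:.3f}")  # F1 = 0.892
```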

---
<p align="center">
  <!-- Top large plot -->
  <img src="../demo_images/plots/results_det.png" alt="Training Results" width="75%">
</p>

<p align="center">
  <!-- Two smaller plots side by side -->
  <img src="../demo_images/plots/BoxPR_curve_det.png" alt="Precision-Recall Curve" width="38%">
  <img src="../demo_images/plots/confusion_matrix_det.png" alt="Confusion Matrix" width="38%">
</p>

### **Summary**

The two complementary augmentation stages, **static (offline)** and **dynamic (online)**,  
improved **generalization** and **recall** while preserving **medical image realism**.  

**Result:** Runs that combined **MixUp**, **Mosaic**, and **multi-scale training** reached  
**notably higher validation performance** than runs without them.

from ultralytics import YOLO

DATA_YAML = "../configs/data_yolo_split4.yaml"   
model = YOLO("yolo11s.pt")                       

model.train(
    data=DATA_YAML,
    imgsz=832,
    epochs=200,
    batch=8,
    device=0,
    single_cls=True,

    optimizer="AdamW",
    lr0=2e-4,
    lrf=0.01,
    cos_lr=True,
    warmup_epochs=10,
    weight_decay=7e-4,

    # Augmentations (detection)
    multi_scale=True,
    mosaic=0.10,         # low mosaic probability
    mixup=0.08,          # built-in MixUp probability
    copy_paste=0.0,      # copy-paste augmentation disabled
    close_mosaic=25,     # disable mosaic for the final 25 epochs

    # Geometry
    degrees=90.0,
    translate=0.05,
    scale=0.25,
    shear=3.0,
    perspective=0.0,

    # Flips
    fliplr=0.5,
    flipud=0.1,

    # Color jitter (conservative)
    hsv_h=0.005,
    hsv_s=0.15,
    hsv_v=0.15,

    amp=True,
    cache="disk",
    workers=2,
    seed=0,
    patience=60,
    save_period=10,
    name="y11s_det_split4_832_mosaicmix",
)


Ultralytics 8.3.221 Python-3.11.9 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070, 12282MiB)
engine\trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=8, bgr=0.0, box=7.5, cache=disk, cfg=None, classes=None, close_mosaic=25, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=True, cutmix=0.0, data=../configs/data_yolo_split4_2.yaml, degrees=90.0, deterministic=True, device=0, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=200, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.1, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.005, hsv_s=0.15, hsv_v=0.15, imgsz=832, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.0002, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.08, mode=train, model=yolo11s.pt, momentum=0.937, mosaic=0.1, multi_scale=True, name=y11s_

ultralytics.utils.metrics.DetMetrics object with attributes:

ap_class_index: array([0])
box: ultralytics.utils.metrics.Metric object
confusion_matrix: <ultralytics.utils.metrics.ConfusionMatrix object at 0x000001B7C936EE90>
curves: ['Precision-Recall(B)', 'F1-Confidence(B)', 'Precision-Confidence(B)', 'Recall-Confidence(B)']
curves_results: [array output truncated in the source]