# Generate YOLO Segmentation Labels From Masks and Train YOLO11s-seg

### Purpose of This Notebook

- **Converts binary masks** to YOLO segmentation labels (`.txt`) using polygon extraction.  
- **Generates labels per folder**, so train/val splits can be recomposed later without regenerating labels.  
- **Trains segmentation models** directly after label generation, using the created YOLO splits.
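
As a rough illustration of the target format: a YOLO segmentation label file holds one line per instance, made of a class index followed by the polygon's `x y` pairs normalised by image width and height. The helper below is a minimal sketch of that last formatting step only; `polygon_to_yolo_line` is a hypothetical name, not a function from `masks_to_yolo_seg_labels.py`, and the clamping to `[0, 1]` is an assumption.

```python
def polygon_to_yolo_line(polygon_px, img_w, img_h, class_id=0):
    """Format one polygon (pixel coordinates) as a YOLO segmentation label line.

    Hypothetical helper for illustration; the notebook's script is not shown.
    """
    if len(polygon_px) < 3:
        raise ValueError("a polygon needs at least 3 points")
    coords = []
    for x, y in polygon_px:
        # Normalise by image size and clamp to [0, 1] (clamping is an assumption).
        nx = min(max(x / img_w, 0.0), 1.0)
        ny = min(max(y / img_h, 0.0), 1.0)
        coords.append(f"{nx:.6f}")
        coords.append(f"{ny:.6f}")
    return f"{class_id} " + " ".join(coords)


# A rectangular "polyp" in a 400x500 image, single class 0.
line = polygon_to_yolo_line(
    [(100, 50), (300, 50), (300, 250), (100, 250)], img_w=400, img_h=500
)
print(line)
# 0 0.250000 0.100000 0.750000 0.100000 0.750000 0.500000 0.250000 0.500000
```

An image with several polyps would get one such line per instance in its `.txt` file.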

In [None]:
!python ../src/label_generation/masks_to_yolo_seg_labels.py \
  --img_root "../data/detection2/train/images_single" \
  --mask_roots "../data/PolypGen2021_MultiCenterData_v3/data_C1/masks_C1" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C2/masks_C2" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C3/masks_C3" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C4/masks_C4" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C5/masks_C5" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C6/masks_C6" \
  --out_labels "../data/segmentation2/train_yolo_labels/images_single_yolo_labels" \
  --approx_eps 1.5 \
  --min_area_px 25

[progress] 200/1105 | with_mask=153 | instances=167
[progress] 400/1105 | with_mask=334 | instances=360
[progress] 600/1105 | with_mask=534 | instances=581
[progress] 800/1105 | with_mask=718 | instances=800
[progress] 1000/1105 | with_mask=908 | instances=1010
[progress] 1105/1105 | with_mask=1013 | instances=1135
[done] images=1105 | with_mask=1013 | instances=1135 | labels_dir=..\data\segmentation2\train_yolo_labels\images_single_yolo_labels
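
The `--min_area_px 25` flag suggests that very small mask fragments are dropped before labels are written. The script's internals are not shown here, so the following is only a sketch of one way such a filter could work, using the shoelace formula on polygon vertices; the function names and the `>=` comparison are assumptions.

```python
def polygon_area_px(points):
    """Shoelace formula: absolute area of a simple polygon, in square pixels."""
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def filter_small_polygons(polygons, min_area_px=25):
    """Keep polygons whose area reaches the threshold (assumed semantics)."""
    return [p for p in polygons if polygon_area_px(p) >= min_area_px]


speck = [(0, 0), (4, 0), (4, 4), (0, 4)]          # 16 px^2, below the threshold
polyp = [(10, 10), (60, 10), (60, 40), (10, 40)]  # 1500 px^2, kept
kept = filter_small_polygons([speck, polyp], min_area_px=25)
print(len(kept))  # 1
```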


In [None]:
!python ../src/label_generation/masks_to_yolo_seg_labels.py \
  --img_root "../data/detection2/val/images_single" \
  --mask_roots "../data/PolypGen2021_MultiCenterData_v3/data_C1/masks_C1" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C2/masks_C2" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C3/masks_C3" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C4/masks_C4" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C5/masks_C5" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C6/masks_C6" \
  --out_labels "../data/segmentation2/val_yolo_labels/val_images_single_yolo_labels" \
  --approx_eps 1.5 \
  --min_area_px 25

[progress] 139/139 | with_mask=128 | instances=147
[done] images=139 | with_mask=128 | instances=147 | labels_dir=..\data\segmentation2\val_yolo_labels\val_images_single_yolo_labels


## Training Sequence Images: Generate YOLO Labels Per Folder

In [None]:
!python ../src/label_generation/build_seq_yolo_seg_labels.py \
  --root "../data/detection2/train/seq" \
  --out_labels "../data/segmentation2/train_yolo_labels" \
  --mirror True \
  --approx_eps 1.5 \
  --min_area_px 25 

[pos]  seq10: images=25 | with_instances=7
[pos]  seq11: images=228 | with_instances=136
[pos]  seq12: images=250 | with_instances=250
[pos]  seq13: images=250 | with_instances=199
[pos]  seq14: images=204 | with_instances=157
[neg]  seq16_neg: wrote 141 empty labels
[neg]  seq1_neg: wrote 315 empty labels
[pos]  seq2: images=63 | with_instances=60
[neg]  seq22_neg: wrote 82 empty labels
[neg]  seq2_neg: wrote 302 empty labels
[pos]  seq3: images=15 | with_instances=15
[neg]  seq3_neg: wrote 40 empty labels
[pos]  seq4: images=48 | with_instances=46
[neg]  seq4_neg: wrote 72 empty labels
[pos]  seq5: images=250 | with_instances=199
[neg]  seq5_neg: wrote 61 empty labels
[pos]  seq6: images=91 | with_instances=62
[neg]  seq6_neg: wrote 90 empty labels
[neg]  seq7_neg: wrote 207 empty labels
[pos]  seq8: images=73 | with_instances=54
[neg]  seq8_neg: wrote 311 empty labels
[pos]  seq9: images=51 | with_instances=42
[neg]  seq9_neg: wrote 99 empty labels

[done]
  positive images: 1548 | 

## Validation Sequence Images: Generate YOLO Labels Per Folder

In [None]:
!python ../src/label_generation/build_seq_yolo_seg_labels.py \
  --root "../data/detection2/val/seq" \
  --out_labels "../data/segmentation2/val_yolo_labels" \
  --mirror True \
  --approx_eps 1.5 \
  --min_area_px 25 

[neg]  seq15_neg: wrote 278 empty labels

[done]
  positive images: 0 | with instances: 0
  negative images (empty labels): 278
  labels root: ..\data\segmentation2\val_yolo_labels
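
Negative sequences get an empty `.txt` file per frame, which marks the frame as background while still giving it a label file that the later combine step can pair with. A minimal sketch of that idea, assuming a flat folder of frames and a hypothetical `write_empty_labels` helper (not the code in `build_seq_yolo_seg_labels.py`):

```python
import tempfile
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed set of image extensions


def write_empty_labels(image_dir, label_dir):
    """Create one empty .txt per image; return the number of files written."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    label_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for img in sorted(image_dir.iterdir()):
        if img.suffix.lower() in IMAGE_EXTS:
            (label_dir / f"{img.stem}.txt").write_text("")
            count += 1
    return count


# Demo on a throwaway directory that mimics a negative sequence folder.
with tempfile.TemporaryDirectory() as tmp:
    frames = Path(tmp) / "seq15_neg"
    frames.mkdir()
    for i in range(3):
        (frames / f"frame_{i}.jpg").touch()
    out_dir = Path(tmp) / "labels" / "seq15_neg"
    written = write_empty_labels(frames, out_dir)
    label_names = sorted(p.name for p in out_dir.iterdir())

print(written, label_names)
```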


## Kvasir-SEG YOLO Labels

In [None]:
!python ../src/label_generation/masks_to_yolo_seg_labels.py \
  --img_root "../data/kvasir-seg/images" \
  --mask_roots "../data/kvasir-seg/masks" \
  --out_labels "../data/segmentation2/val_yolo_labels/kvasir_yolo_labels" \
  --approx_eps 1.5 \
  --min_area_px 25

[progress] 200/1000 | with_mask=200 | instances=208
[progress] 400/1000 | with_mask=400 | instances=423
[progress] 600/1000 | with_mask=600 | instances=645
[progress] 800/1000 | with_mask=800 | instances=855
[progress] 1000/1000 | with_mask=1000 | instances=1060
[done] images=1000 | with_mask=1000 | instances=1060 | labels_dir=..\data\segmentation2\val_yolo_labels\kvasir_yolo_labels


## Static Augmentation of Single Images

In [None]:
!python ../src/augmentation/augment_single_pos_seg.py \
  --img_root "../data/detection2/train/images_single" \
  --mask_roots "../data/PolypGen2021_MultiCenterData_v3/data_C1/masks_C1" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C2/masks_C2" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C3/masks_C3" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C4/masks_C4" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C5/masks_C5" \
               "../data/PolypGen2021_MultiCenterData_v3/data_C6/masks_C6" \
  --out_images "../data/segmentation2/aug_pos_images" \
  --out_masks  "../data/segmentation2/aug_pos_masks" \
  --out_labels "../data/segmentation2/train_yolo_labels/aug_pos_labels_seg" \
  --copies_per_img 2 \
  --approx_eps 1.5 --min_area_px 25


[info] images found: 1105
[progress] 100/1105 images processed, total_aug=200
[progress] 200/1105 images processed, total_aug=400
[progress] 300/1105 images processed, total_aug=600
[progress] 400/1105 images processed, total_aug=800
[progress] 500/1105 images processed, total_aug=1000
[progress] 600/1105 images processed, total_aug=1200
[progress] 700/1105 images processed, total_aug=1400
[progress] 800/1105 images processed, total_aug=1600
[progress] 900/1105 images processed, total_aug=1800
[progress] 1000/1105 images processed, total_aug=2000
[progress] 1100/1105 images processed, total_aug=2200
[progress] 1105/1105 images processed, total_aug=2210
[done] Augmented images=2210 written to ..\data\segmentation2\aug_pos_images


## Combine All Images and Labels

In [None]:
!python ../src/label_generation/combine_images_and_labels.py \
  --train_images_root "../data/detection2/train" \
  --extra_images "../data/segmentation2/aug_neg_images" "../data/segmentation2/aug_pos_images" \
  --labels_root "../data/segmentation2/train_yolo_labels" \
  --output_root "../data/segmentation2/yolo_split2/train"

[info] found 9932 total images before filtering

[done]
  image-label pairs copied : 9679
  images skipped (no label): 45
  duplicates skipped       : 208

Targets:
  images -> ..\data\segmentation2\yolo_split2\train\images
  labels -> ..\data\segmentation2\yolo_split2\train\labels


In [None]:
!python ../src/label_generation/combine_images_and_labels.py \
  --train_images_root "../data/detection2/val" \
  --extra_images "../data/kvasir-seg/images" \
  --labels_root "../data/segmentation2/val_yolo_labels" \
  --output_root "../data/segmentation2/yolo_split2/val"

[info] found 1417 total images before filtering

[done]
  image-label pairs copied : 1416
  images skipped (no label): 0
  duplicates skipped       : 1

Targets:
  images -> ..\data\segmentation2\yolo_split2\val\images
  labels -> ..\data\segmentation2\yolo_split2\val\labels
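
The summaries above are internally consistent: copied pairs, images without a label, and duplicates add up to the images found (9679 + 45 + 208 = 9932 for train, 1416 + 0 + 1 = 1417 for val). The sketch below shows one way such a pairing pass could be organised. Matching by file stem and treating a repeated stem as a duplicate are assumptions, since `combine_images_and_labels.py` is not shown, and `pair_images_with_labels` is a hypothetical name.

```python
from pathlib import Path


def pair_images_with_labels(image_paths, label_paths):
    """Split images into (paired, missing-label, duplicate) by file stem."""
    labels_by_stem = {Path(p).stem: p for p in label_paths}
    pairs, no_label, duplicates = [], [], []
    seen = set()
    for img in image_paths:
        stem = Path(img).stem
        if stem in seen:
            duplicates.append(img)          # stem already paired once
        elif stem not in labels_by_stem:
            no_label.append(img)            # nothing to pair with
        else:
            seen.add(stem)
            pairs.append((img, labels_by_stem[stem]))
    return pairs, no_label, duplicates


images = ["single/a.jpg", "seq2/b.jpg", "aug/a.jpg", "seq2/c.jpg"]
labels = ["labels/a.txt", "labels/b.txt"]
pairs, no_label, duplicates = pair_images_with_labels(images, labels)

# Every image lands in exactly one bucket, so the counts sum to the total.
assert len(pairs) + len(no_label) + len(duplicates) == len(images)
```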


# Model Training
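
Training reads its dataset layout from `../configs/data_seg3.yaml`, whose contents are not shown in this notebook. Ultralytics dataset configs typically list a dataset root, the train and val image directories, and the class names, and the label files are looked up in a `labels` directory parallel to each `images` directory, which matches the `images`/`labels` targets written by the combine step. The snippet below renders a plausible config as text; the root path, the class name `polyp`, and the helper `render_data_yaml` are all assumptions for illustration.

```python
def render_data_yaml(dataset_root, class_names):
    """Render a minimal Ultralytics-style dataset config as YAML text."""
    lines = [
        f"path: {dataset_root}",   # dataset root (placeholder value below)
        "train: train/images",     # relative to `path`
        "val: val/images",         # relative to `path`
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Placeholder root; the real location of yolo_split2 depends on the machine.
yaml_text = render_data_yaml("/abs/path/to/yolo_split2", ["polyp"])
print(yaml_text)
```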

In [2]:
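from ultralytics import YOLO

# ---- paths and pretrained weights ----
DATA_YAML = "../configs/data_seg3.yaml"  # dataset config (train/val image dirs, class names)
model = YOLO("yolo11s-seg.pt")           # pretrained YOLO11s segmentation checkpoint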


# ---- fine-tune ----
model.train(
    data=DATA_YAML,
    imgsz=832,           
    epochs=200,
    batch=8,             
    device=0,
    single_cls=True,

    # Optimizer & schedule 
    optimizer="AdamW",
    lr0=2e-4,              
    lrf=0.01,
    cos_lr=True,
    warmup_epochs=10,
    weight_decay=7e-4,

    # augmentations
    multi_scale=True,                        
    mosaic=0.0, mixup=0.0, copy_paste=0.0,
    # geometry (union, but stable)
    degrees=180.0,             
    translate=0.05,             # small shifts (≈ random crop spirit)
    scale=0.25,                 # zoom in/out ±25%
    shear=3.0,                  # mild shear to avoid warping shapes too much
    perspective=0.0,            

    # flips
    fliplr=0.5,                
    flipud=0.2,                 

    hsv_h=0.005,                # tiny hue jitter (mucosa colors are sensitive)
    hsv_s=0.18,                 # moderate saturation jitter
    hsv_v=0.20,                 # moderate value (brightness) jitter

    # Stability / speed
    amp=True,
    cache="disk",
    workers=2,
    seed=0,
    patience=60,           
    save_period=10,
    name="aug2_seg3_832",
)     



New https://pypi.org/project/ultralytics/8.3.223 available  Update with 'pip install -U ultralytics'
Ultralytics 8.3.221  Python-3.11.9 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070, 12282MiB)
[34m[1mengine\trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=8, bgr=0.0, box=7.5, cache=disk, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=True, cutmix=0.0, data=../configs/data_seg3.yaml, degrees=180.0, deterministic=True, device=0, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=200, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.2, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.005, hsv_s=0.18, hsv_v=0.2, imgsz=832, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.0002, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolo11s-seg.pt, momentum=0.937, mosaic=0.0, multi_scale=True, name=aug2_seg3_8

KeyboardInterrupt: 
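
For reference, the schedule settings in the training call (`lr0=2e-4`, `lrf=0.01`, `cos_lr=True`, `epochs=200`) imply a learning rate that decays along a cosine curve from `lr0` to `lr0 * lrf`. The sketch below reproduces that curve as I understand the Ultralytics cosine option; it ignores the 10 warmup epochs and is not taken from the trainer's source, so treat the exact values as approximate.

```python
import math


def cosine_lr(epoch, epochs=200, lr0=2e-4, lrf=0.01):
    """Learning rate after `epoch` epochs under a cosine decay from lr0 to lr0 * lrf."""
    # Multiplier goes smoothly from 1.0 (epoch 0) down to lrf (final epoch).
    factor = (1 - math.cos(math.pi * epoch / epochs)) / 2 * (lrf - 1) + 1
    return lr0 * factor


for epoch in (0, 100, 200):
    print(epoch, f"{cosine_lr(epoch):.3e}")
# 0 2.000e-04
# 100 1.010e-04
# 200 2.000e-06
```

With these settings the final learning rate is 2e-6, a hundredth of the starting value.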