# Plan: VinBigData Chest X-ray Abnormalities Detection

Objectives:
- Establish GPU-ready environment and fast, correct baseline.
- Build robust data pipeline: DICOM -> PNG with proper windowing; labels -> YOLO format.
- Lock cross-validation (CV) folds and training procedure; produce out-of-fold (OOF) predictions for diagnostics.
- Train a strong object detector (Ultralytics YOLO, pretrained) with efficient epochs.
- Inference on test; format per competition requirements; iterate and improve; aim for medal.

Phases:
1) Environment & GPU check
   - Verify nvidia-smi and torch CUDA 12.1 install.
   - Pin the torch stack in constraints.txt so later installs cannot drift.

2) Data audit & EDA
   - Inspect train.csv, sample_submission.csv; count images/classes; check box stats.
   - Verify DICOM integrity; test reading and conversion.

3) Preprocessing
   - DICOM -> 8-bit PNG (or JPEG) with chest-appropriate windowing and histogram normalization; cache at 1024px.
   - Create YOLO labels per image (one .txt with class_id x_center y_center w h in normalized coords).
   - Treat 'No finding' (class_id 14) as background: such images get no boxes.

4) Validation protocol
   - Stratified KFold by presence of each class (multilabel stratification) into 5 folds; save folds.

5) Baseline model
   - Ultralytics YOLOv8n or v5s pretrained on COCO; train 5-10 epochs @ 1024 with strong aug (mixup off initially).
   - Use GPU, AMP, cosine LR schedule, and early-stopping patience; log per-epoch mAP on the validation fold.

6) Inference & submission
   - Optional flip TTA; tune confidence and NMS IoU thresholds on OOF predictions.
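   - Convert YOLO outputs to competition format (PredictionString of space-separated groups: class_id conf xmin ymin xmax ymax ...; '14 1 0 0 1 1' when nothing is detected, as in sample_submission.csv).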
   - Save submission.csv.

7) Iterations for medal
   - Resolution sweep (640 -> 1024), model size sweep (n/s), augment tuning, EMA.
   - Ensemble diverse seeds/models.
   - Error analysis on OOF by class; adjust thresholds per class.
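
The label format from phase 3 can be sketched as follows. This is a minimal illustration, not the notebook's actual implementation: the helper name `to_yolo_line` is hypothetical, the box values are taken from the Cardiomegaly row of train.csv, and the 2000x2500 image size is an assumption made only for the example.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert one pixel-space box into a YOLO label line (hypothetical helper)."""
    # Clip to image bounds so the normalized values stay within [0, 1]
    x_min, x_max = max(0.0, x_min), min(float(img_w), x_max)
    y_min, y_max = max(0.0, y_min), min(float(img_h), y_max)
    if x_max <= x_min or y_max <= y_min:
        return None  # degenerate box: skip it
    xc = (x_min + x_max) / 2.0 / img_w   # normalized box center
    yc = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w          # normalized box size
    h = (y_max - y_min) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# Image size (2000 wide, 2500 high) is assumed for illustration only
line = to_yolo_line(3, 691.0, 1375.0, 1653.0, 1831.0, img_w=2000, img_h=2500)
print(line)
```

If images are padded to a square before training, boxes need to be shifted and scaled into the padded frame first, and then normalized by the padded size rather than the original one.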

Checkpoints for expert review:
- After this plan.
- After data/labels pipeline built & validated.
- After baseline training (OOF metrics).
- After first LB submission; then iterate for improvements.
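
The conversion step in phase 6 might look like the sketch below. It assumes detections come out in the coordinate frame of a square, padded (letterboxed) image and must be mapped back to original pixels before formatting. The function names, the tuple layout, and the example geometry are illustrative assumptions; the empty-image placeholder is the one shown in sample_submission.csv.

```python
def unletterbox(box, scale, left, top, orig_w, orig_h):
    """Map (x1, y1, x2, y2) from the padded square back to original pixels."""
    x1, y1, x2, y2 = box
    xs = [(v - left) / scale for v in (x1, x2)]
    ys = [(v - top) / scale for v in (y1, y2)]
    # Clip to the original image extent
    xs = [min(max(v, 0.0), float(orig_w)) for v in xs]
    ys = [min(max(v, 0.0), float(orig_h)) for v in ys]
    return xs[0], ys[0], xs[1], ys[1]

def format_prediction_string(dets):
    """dets: list of (class_id, conf, x_min, y_min, x_max, y_max) in original pixels."""
    if not dets:
        # Placeholder used in sample_submission.csv when there is nothing to report
        return "14 1 0 0 1 1"
    return " ".join(
        f"{int(c)} {conf:.4f} {round(x1)} {round(y1)} {round(x2)} {round(y2)}"
        for c, conf, x1, y1, x2, y2 in dets
    )

# Assumed geometry: a 2000x2500 (w x h) image letterboxed into 1024x1024
scale, left, top = 1024 / 2500, 102, 0
padded_box = (691 * scale + left, 1375 * scale + top,
              1653 * scale + left, 1831 * scale + top)
x1, y1, x2, y2 = unletterbox(padded_box, scale, left, top, orig_w=2000, orig_h=2500)
submission_cell = format_prediction_string([(3, 0.9, x1, y1, x2, y2)])
print(submission_cell)
print(format_prediction_string([]))
```

Keeping the geometry mapping separate from the string formatting makes it easier to test each piece on its own before the first leaderboard submission.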

In [1]:
# Environment & GPU check: ensure CUDA 12.1 torch stack is correctly installed
import os, sys, subprocess, shutil, time
from pathlib import Path

def run(cmd):
    print('>>', ' '.join(cmd), flush=True)
    return subprocess.run(cmd, check=False, text=True, capture_output=True)

print('Checking nvidia-smi...', flush=True)
res = run(['bash','-lc','nvidia-smi || true'])
print(res.stdout)
print(res.stderr, file=sys.stderr)

# Uninstall any existing torch stack to avoid conflicts
for pkg in ('torch','torchvision','torchaudio'):
    subprocess.run([sys.executable, '-m', 'pip', 'uninstall', '-y', pkg], check=False)

# Clean stray site dirs that can shadow correct wheels (idempotent)
for d in (
    '/app/.pip-target/torch',
    '/app/.pip-target/torchvision',
    '/app/.pip-target/torchaudio',
    '/app/.pip-target/torchgen',
    '/app/.pip-target/functorch',
):
    if os.path.exists(d):
        print('Removing', d, flush=True)
        shutil.rmtree(d, ignore_errors=True)

def pip(*args):
    print('> pip', *args, flush=True)
    subprocess.run([sys.executable, '-m', 'pip', *args], check=True)

# Install EXACT cu121 stack
pip('install',
    '--index-url','https://download.pytorch.org/whl/cu121',
    '--extra-index-url','https://pypi.org/simple',
    'torch==2.4.1','torchvision==0.19.1','torchaudio==2.4.1'
)

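# Freeze versions for later installs. Bare pins such as torch==2.4.1 also match the plain PyPI wheel,
# so they fix the version number but not the +cu121 build tag.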
Path('constraints.txt').write_text('torch==2.4.1\ntorchvision==0.19.1\ntorchaudio==2.4.1\n')

import torch
print('torch:', torch.__version__, 'CUDA build:', getattr(torch.version, 'cuda', None), flush=True)
print('CUDA available:', torch.cuda.is_available(), flush=True)
if not torch.cuda.is_available():
    raise SystemExit('CUDA not available. Exiting to avoid wasting time.')
assert str(getattr(torch.version,'cuda','')).startswith('12.1'), f'Wrong CUDA build: {torch.version.cuda}'
print('GPU:', torch.cuda.get_device_name(0), flush=True)

print('Environment ready.', flush=True)

Checking nvidia-smi...


>> bash -lc nvidia-smi || true


Thu Sep 25 02:14:50 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.06             Driver Version: 550.144.06     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA A10-24Q                 On  |   00000002:00:00.0 Off |                    0 |
| N/A   N/A    P0             N/A /  N/A  |     182MiB /  24512MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                
> pip install --index-url https://download.pytorch.org/whl/cu121 --extra-index-url https://pypi.org/simple torch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1




Looking in indexes: https://download.pytorch.org/whl/cu121, https://pypi.org/simple


Collecting torch==2.4.1
  Downloading https://download.pytorch.org/whl/cu121/torch-2.4.1%2Bcu121-cp311-cp311-linux_x86_64.whl (799.0 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 799.0/799.0 MB 381.5 MB/s eta 0:00:00


Collecting torchvision==0.19.1
  Downloading https://download.pytorch.org/whl/cu121/torchvision-0.19.1%2Bcu121-cp311-cp311-linux_x86_64.whl (7.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.1/7.1 MB 391.8 MB/s eta 0:00:00


Collecting torchaudio==2.4.1
  Downloading https://download.pytorch.org/whl/cu121/torchaudio-2.4.1%2Bcu121-cp311-cp311-linux_x86_64.whl (3.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.4/3.4 MB 487.6 MB/s eta 0:00:00
Collecting filelock
  Downloading filelock-3.19.1-py3-none-any.whl (15 kB)


Collecting nvidia-cusparse-cu12==12.1.0.106
  Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 79.4 MB/s eta 0:00:00


Collecting nvidia-cudnn-cu12==9.1.0.70
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 664.8/664.8 MB 144.1 MB/s eta 0:00:00


Collecting triton==3.0.0
  Downloading triton-3.0.0-1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (209.4 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.4/209.4 MB 182.8 MB/s eta 0:00:00


Collecting nvidia-cuda-runtime-cu12==12.1.105
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 KB 481.9 MB/s eta 0:00:00


Collecting nvidia-cublas-cu12==12.1.3.1
  Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 122.2 MB/s eta 0:00:00


Collecting nvidia-cufft-cu12==11.0.2.54
  Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 42.6 MB/s eta 0:00:00
Collecting nvidia-curand-cu12==10.3.2.106
  Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 258.5 MB/s eta 0:00:00


Collecting fsspec
  Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.3/199.3 KB 475.5 MB/s eta 0:00:00


Collecting nvidia-nccl-cu12==2.20.5
  Downloading nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176.2/176.2 MB 81.7 MB/s eta 0:00:00


Collecting typing-extensions>=4.8.0
  Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 KB 379.4 MB/s eta 0:00:00


Collecting nvidia-cuda-nvrtc-cu12==12.1.105
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 120.4 MB/s eta 0:00:00


Collecting nvidia-cusolver-cu12==11.4.5.107
  Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 138.6 MB/s eta 0:00:00


Collecting sympy
  Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 233.7 MB/s eta 0:00:00


Collecting jinja2
  Downloading jinja2-3.1.6-py3-none-any.whl (134 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.9/134.9 KB 443.1 MB/s eta 0:00:00


Collecting nvidia-nvtx-cu12==12.1.105
  Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 KB 394.7 MB/s eta 0:00:00


Collecting networkx
  Downloading networkx-3.5-py3-none-any.whl (2.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 440.3 MB/s eta 0:00:00


Collecting nvidia-cuda-cupti-cu12==12.1.105
  Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 166.2 MB/s eta 0:00:00


Collecting pillow!=8.3.*,>=5.3.0
  Downloading pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.6/6.6 MB 107.2 MB/s eta 0:00:00


Collecting numpy
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.3/18.3 MB 196.2 MB/s eta 0:00:00


Collecting nvidia-nvjitlink-cu12
  Downloading nvidia_nvjitlink_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (39.7 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.7/39.7 MB 282.2 MB/s eta 0:00:00


Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23 kB)


Collecting mpmath<1.4,>=1.1.0
  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 KB 512.3 MB/s eta 0:00:00


Installing collected packages: mpmath, typing-extensions, sympy, pillow, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, MarkupSafe, fsspec, filelock, triton, nvidia-cusparse-cu12, nvidia-cudnn-cu12, jinja2, nvidia-cusolver-cu12, torch, torchvision, torchaudio


Successfully installed MarkupSafe-3.0.2 filelock-3.19.1 fsspec-2025.9.0 jinja2-3.1.6 mpmath-1.3.0 networkx-3.5 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.9.86 nvidia-nvtx-cu12-12.1.105 pillow-11.3.0 sympy-1.14.0 torch-2.4.1+cu121 torchaudio-2.4.1+cu121 torchvision-0.19.1+cu121 triton-3.0.0 typing-extensions-4.15.0


torch: 2.4.1+cu121 CUDA build: 12.1


CUDA available: True


GPU: NVIDIA A10-24Q


Environment ready.


In [2]:
# Quick EDA: inspect CSVs and dataset structure
import pandas as pd, os, json, sys
from collections import Counter

print('Listing data dirs...', flush=True)
train_dir = 'train'
test_dir = 'test'
print('train files:', len(os.listdir(train_dir)) if os.path.exists(train_dir) else 'missing')
print('test files:', len(os.listdir(test_dir)) if os.path.exists(test_dir) else 'missing')

print('Reading CSVs...', flush=True)
train_csv = pd.read_csv('train.csv')
ss = pd.read_csv('sample_submission.csv')
print('train.csv shape:', train_csv.shape)
print('train.csv columns:', list(train_csv.columns))
print(train_csv.head(3))
print('sample_submission.csv shape:', ss.shape)
print(ss.head(3))

# Basic label stats
if {'image_id','class_name','class_id','x_min','y_min','x_max','y_max'}.issubset(train_csv.columns):
    n_images = train_csv['image_id'].nunique()
    classes = sorted(train_csv['class_id'].unique().tolist())
    print('Unique images:', n_images)
    print('Classes:', classes)
    cnt = train_csv['class_id'].value_counts().sort_index()
    print('Counts per class_id:\n', cnt.to_string())
    # Count images whose every row is 'No finding' (class_id 14), rather than counting raw rows
    only_nf = (train_csv['class_id'] == 14).groupby(train_csv['image_id']).all()
    print('Images with only No finding:', int(only_nf.sum()))
else:
    print('Unexpected train.csv schema; will inspect later in detail.')

# Validate DICOM presence for a few samples
sample_ids = train_csv['image_id'].drop_duplicates().head(5).tolist()
missing = 0
for iid in sample_ids:
    p = os.path.join(train_dir, f'{iid}.dicom')
    if not os.path.exists(p):
        missing += 1
print('Missing dicoms among 5-sample check:', missing, 'out of', len(sample_ids))
print('EDA done.', flush=True)

Listing data dirs...


train files: 13500
test files: 1500
Reading CSVs...


train.csv shape: (61171, 8)
train.csv columns: ['image_id', 'class_name', 'class_id', 'rad_id', 'x_min', 'y_min', 'x_max', 'y_max']
                           image_id    class_name  class_id rad_id  x_min  \
0  50a418190bc3fb1ef1633bf9678929b3    No finding        14    R11    NaN   
1  21a10246a5ec7af151081d0cd6d65dc9    No finding        14     R7    NaN   
2  9a5094b2563a1ef3ff50dc5c7ff71345  Cardiomegaly         3    R10  691.0   

    y_min   x_max   y_max  
0     NaN     NaN     NaN  
1     NaN     NaN     NaN  
2  1375.0  1653.0  1831.0  
sample_submission.csv shape: (1500, 2)
                           image_id PredictionString
0  24b3c4ccc0e19044935c8f40ab37fc18     14 1 0 0 1 1
1  295add70002001e13d65c0d0d4a100a0     14 1 0 0 1 1
2  c2a691b7f3af90af7d2b09985f75ae2d     14 1 0 0 1 1
Unique images: 13500
Classes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
Counts per class_id:
 class_id
0      6479
1       249
2       827
3      4894
4       507
5       879
6      1127
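
The train.csv schema above has one row per annotation (note the rad_id column), while fold assignment works per image. A dependency-free sketch of the image-level class-presence matrix is shown below. The rows are toy values in the same schema, made up for illustration, and `presence_matrix` is a hypothetical helper name.

```python
import csv
import io

N_CLASSES = 14  # class_id 0..13 carry boxes; class_id 14 is 'No finding'

# Toy rows in the train.csv schema (made-up values, for illustration only)
TOY_CSV = """image_id,class_name,class_id,rad_id,x_min,y_min,x_max,y_max
img_a,No finding,14,R1,,,,
img_a,No finding,14,R2,,,,
img_b,Cardiomegaly,3,R1,691,1375,1653,1831
img_b,Cardiomegaly,3,R2,700,1380,1660,1840
img_b,Aortic enlargement,0,R1,1000,800,1300,1100
"""

def presence_matrix(csv_text):
    """Return (image_ids, rows) where rows[i][c] is 1 if image i has any box of class c."""
    present = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        flags = present.setdefault(row['image_id'], [0] * N_CLASSES)
        cid = int(row['class_id'])
        if cid < N_CLASSES:  # 'No finding' rows leave the vector all-zero
            flags[cid] = 1
    image_ids = sorted(present)
    return image_ids, [present[i] for i in image_ids]

image_ids, Y = presence_matrix(TOY_CSV)
for iid, flags in zip(image_ids, Y):
    print(iid, flags)
```

A matrix of this shape is what a multilabel stratifier expects as its label argument. Adding an extra column that flags all-zero rows is one option for keeping the share of 'No finding' images balanced across folds as well.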


In [4]:
# Install non-torch dependencies (honor torch constraints)
import sys, subprocess, os
def pip(*args):
    print('> pip', *args, flush=True)
    subprocess.run([sys.executable, '-m', 'pip', *args], check=True)

constrained = os.path.exists('constraints.txt')
cmd = ['install']
if constrained:
    cmd += ['-c','constraints.txt','--upgrade-strategy','only-if-needed']
cmd += [
    'pydicom==2.4.4',
    'opencv-python-headless==4.10.0.84',
    'tqdm==4.66.5',
    'iterative-stratification==0.1.7',
    'scikit-learn==1.5.2',
    'albumentations==1.4.18',
    'matplotlib==3.9.2',
    'ensemble-boxes==1.0.9',
    'ultralytics==8.3.60'
]
pip(*cmd)
print('Deps installed.', flush=True)

> pip install -c constraints.txt --upgrade-strategy only-if-needed pydicom==2.4.4 opencv-python-headless==4.10.0.84 tqdm==4.66.5 iterative-stratification==0.1.7 scikit-learn==1.5.2 albumentations==1.4.18 matplotlib==3.9.2 ensemble-boxes==1.0.9 ultralytics==8.3.60


Collecting pydicom==2.4.4
  Downloading pydicom-2.4.4-py3-none-any.whl (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 51.2 MB/s eta 0:00:00
Collecting opencv-python-headless==4.10.0.84
  Downloading opencv_python_headless-4.10.0.84-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (49.9 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.9/49.9 MB 226.2 MB/s eta 0:00:00
Collecting tqdm==4.66.5
  Downloading tqdm-4.66.5-py3-none-any.whl (78 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.4/78.4 KB 440.7 MB/s eta 0:00:00


Collecting iterative-stratification==0.1.7
  Downloading iterative_stratification-0.1.7-py3-none-any.whl (8.5 kB)
Collecting scikit-learn==1.5.2


  Downloading scikit_learn-1.5.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.3/13.3 MB 112.7 MB/s eta 0:00:00


Collecting albumentations==1.4.18
  Downloading albumentations-1.4.18-py3-none-any.whl (224 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 224.0/224.0 KB 183.0 MB/s eta 0:00:00


Collecting matplotlib==3.9.2
  Downloading matplotlib-3.9.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.3/8.3 MB 114.6 MB/s eta 0:00:00
Collecting ensemble-boxes==1.0.9
  Downloading ensemble_boxes-1.0.9-py3-none-any.whl (23 kB)


Collecting ultralytics==8.3.60
  Downloading ultralytics-8.3.60-py3-none-any.whl (906 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 906.9/906.9 KB 377.0 MB/s eta 0:00:00


Collecting numpy>=1.17.0
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.3/18.3 MB 173.3 MB/s eta 0:00:00


Collecting scipy
  Downloading scipy-1.16.2-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (35.9 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 35.9/35.9 MB 143.9 MB/s eta 0:00:00
Collecting joblib>=1.2.0
  Downloading joblib-1.5.2-py3-none-any.whl (308 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 308.4/308.4 KB 522.5 MB/s eta 0:00:00
Collecting threadpoolctl>=3.1.0
  Downloading threadpoolctl-3.6.0-py3-none-any.whl (18 kB)
Collecting eval-type-backport
  Downloading eval_type_backport-0.2.2-py3-none-any.whl (5.8 kB)


Collecting scikit-image>=0.21.0
  Downloading scikit_image-0.25.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (14.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.8/14.8 MB 305.1 MB/s eta 0:00:00
Collecting albucore==0.0.17
  Downloading albucore-0.0.17-py3-none-any.whl (10 kB)


Collecting PyYAML
  Downloading PyYAML-6.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (762 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 763.0/763.0 KB 542.7 MB/s eta 0:00:00
Collecting pydantic>=2.7.0


  Downloading pydantic-2.11.9-py3-none-any.whl (444 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 444.9/444.9 KB 36.6 MB/s eta 0:00:00


Collecting fonttools>=4.22.0
  Downloading fonttools-4.60.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (5.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 239.1 MB/s eta 0:00:00


Collecting pillow>=8
  Downloading pillow-11.3.0-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.6/6.6 MB 228.0 MB/s eta 0:00:00
Collecting python-dateutil>=2.7
  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 KB 486.1 MB/s eta 0:00:00
Collecting packaging>=20.0
  Downloading packaging-25.0-py3-none-any.whl (66 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.5/66.5 KB 428.7 MB/s eta 0:00:00
Collecting contourpy>=1.0.1


  Downloading contourpy-1.3.3-cp311-cp311-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (355 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 355.2/355.2 KB 479.7 MB/s eta 0:00:00
Collecting cycler>=0.10
  Downloading cycler-0.12.1-py3-none-any.whl (8.3 kB)
Collecting pyparsing>=2.3.1
  Downloading pyparsing-3.2.5-py3-none-any.whl (113 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 113.9/113.9 KB 476.1 MB/s eta 0:00:00
Collecting kiwisolver>=1.3.1
  Downloading kiwisolver-1.4.9-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (1.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 405.4 MB/s eta 0:00:00


Collecting numba
  Downloading numba-0.62.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (3.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.5/3.5 MB 82.3 MB/s eta 0:00:00


Collecting pandas
  Downloading pandas-2.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.4 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.4/12.4 MB 325.6 MB/s eta 0:00:00
Collecting psutil


  Downloading psutil-7.1.0-cp36-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (291 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 291.2/291.2 KB 536.3 MB/s eta 0:00:00
Collecting torch>=1.8.0
  Downloading torch-2.4.1-cp311-cp311-manylinux1_x86_64.whl (797.1 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 797.1/797.1 MB 86.1 MB/s eta 0:00:00


Collecting py-cpuinfo
  Downloading py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Collecting ultralytics-thop>=2.0.0
  Downloading ultralytics_thop-2.0.17-py3-none-any.whl (28 kB)
Collecting torchvision>=0.9.0
  Downloading torchvision-0.19.1-cp311-cp311-manylinux1_x86_64.whl (7.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.0/7.0 MB 325.9 MB/s eta 0:00:00
Collecting seaborn>=0.11.0
  Downloading seaborn-0.13.2-py3-none-any.whl (294 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 294.9/294.9 KB 478.1 MB/s eta 0:00:00


Collecting opencv-python>=4.6.0
  Downloading opencv_python-4.12.0.88-cp37-abi3-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (67.0 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.0/67.0 MB 236.6 MB/s eta 0:00:00
Collecting requests>=2.23.0
  Downloading requests-2.32.5-py3-none-any.whl (64 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.7/64.7 KB 343.1 MB/s eta 0:00:00
Collecting opencv-python>=4.6.0
  Downloading opencv_python-4.11.0.86-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (63.0 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 63.0/63.0 MB 68.0 MB/s eta 0:00:00
Collecting tzdata>=2022.7
  Downloading tzdata-2025.2-py2.py3-none-any.whl (347 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 347.8/347.8 KB 517.1 MB/s eta 0:00:00
Collecting pytz>=2020.1
  Downloading pytz-2025.2-py2.py3-none-any.whl (509 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 509.2/509.2 KB 518.0 MB/s eta 0:00:00


Collecting annotated-types>=0.6.0
  Downloading annotated_types-0.7.0-py3-none-any.whl (13 kB)
Collecting typing-inspection>=0.4.0
  Downloading typing_inspection-0.4.1-py3-none-any.whl (14 kB)
Collecting typing-extensions>=4.12.2
  Downloading typing_extensions-4.15.0-py3-none-any.whl (44 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.6/44.6 KB 378.8 MB/s eta 0:00:00


Collecting pydantic-core==2.33.2
  Downloading pydantic_core-2.33.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 387.1 MB/s eta 0:00:00
Collecting six>=1.5
  Downloading six-1.17.0-py2.py3-none-any.whl (11 kB)
Collecting certifi>=2017.4.17
  Downloading certifi-2025.8.3-py3-none-any.whl (161 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 161.2/161.2 KB 496.3 MB/s eta 0:00:00


Collecting charset_normalizer<4,>=2
  Downloading charset_normalizer-3.4.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (150 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 150.3/150.3 KB 448.5 MB/s eta 0:00:00
Collecting urllib3<3,>=1.21.1
  Downloading urllib3-2.5.0-py3-none-any.whl (129 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.8/129.8 KB 450.8 MB/s eta 0:00:00
Collecting idna<4,>=2.5
  Downloading idna-3.10-py3-none-any.whl (70 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 70.4/70.4 KB 409.0 MB/s eta 0:00:00
Collecting networkx>=3.0
  Downloading networkx-3.5-py3-none-any.whl (2.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 362.0 MB/s eta 0:00:00


Collecting tifffile>=2022.8.12
  Downloading tifffile-2025.9.20-py3-none-any.whl (230 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 230.1/230.1 KB 106.0 MB/s eta 0:00:00
Collecting imageio!=2.35.0,>=2.33
  Downloading imageio-2.37.0-py3-none-any.whl (315 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 315.8/315.8 KB 471.9 MB/s eta 0:00:00
Collecting lazy-loader>=0.4
  Downloading lazy_loader-0.4-py3-none-any.whl (12 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 664.8/664.8 MB 175.1 MB/s eta 0:00:00


Collecting nvidia-nccl-cu12==2.20.5
  Downloading nvidia_nccl_cu12-2.20.5-py3-none-manylinux2014_x86_64.whl (176.2 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 176.2/176.2 MB 206.5 MB/s eta 0:00:00
Collecting nvidia-cuda-nvrtc-cu12==12.1.105
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 165.2 MB/s eta 0:00:00
Collecting filelock
  Downloading filelock-3.19.1-py3-none-any.whl (15 kB)
Collecting nvidia-cufft-cu12==11.0.2.54
  Downloading nvidia_cufft_cu12-11.0.2.54-py3-none-manylinux1_x86_64.whl (121.6 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.6/121.6 MB 124.7 MB/s eta 0:00:00
Collecting nvidia-cusolver-cu12==11.4.5.107
  Downloading nvidia_cusolver_cu12-11.4.5.107-py3-none-manylinux1_x86_64.whl (124.2 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 124.2/124.2 MB 227.9 MB/s eta 0:00:00
Collecting nvidia-curand-cu12==10.3.2.106
  Downloading nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (56.5 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.5/56.5 MB 217.6 MB/s eta 0:00:00
Collecting nvidia-cuda-cupti-cu12==12.1.105
  Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 167.4 MB/s eta 0:00:00


Collecting nvidia-cusparse-cu12==12.1.0.106
  Downloading nvidia_cusparse_cu12-12.1.0.106-py3-none-manylinux1_x86_64.whl (196.0 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 196.0/196.0 MB 130.6 MB/s eta 0:00:00
Collecting sympy
  Downloading sympy-1.14.0-py3-none-any.whl (6.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.3/6.3 MB 207.5 MB/s eta 0:00:00


Collecting jinja2
  Downloading jinja2-3.1.6-py3-none-any.whl (134 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.9/134.9 KB 482.9 MB/s eta 0:00:00
Collecting triton==3.0.0
  Downloading triton-3.0.0-1-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (209.4 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 209.4/209.4 MB 114.3 MB/s eta 0:00:00
Collecting fsspec
  Downloading fsspec-2025.9.0-py3-none-any.whl (199 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 199.3/199.3 KB 484.6 MB/s eta 0:00:00


Collecting nvidia-nvtx-cu12==12.1.105
  Downloading nvidia_nvtx_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (99 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 99.1/99.1 KB 431.5 MB/s eta 0:00:00
Collecting nvidia-cublas-cu12==12.1.3.1
  Downloading nvidia_cublas_cu12-12.1.3.1-py3-none-manylinux1_x86_64.whl (410.6 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.6/410.6 MB 181.1 MB/s eta 0:00:00


Collecting nvidia-cuda-runtime-cu12==12.1.105
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 823.6/823.6 KB 269.2 MB/s eta 0:00:00
Collecting nvidia-nvjitlink-cu12
  Downloading nvidia_nvjitlink_cu12-12.9.86-py3-none-manylinux2010_x86_64.manylinux_2_12_x86_64.whl (39.7 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 39.7/39.7 MB 153.9 MB/s eta 0:00:00
Collecting llvmlite<0.46,>=0.45.0dev0
  Downloading llvmlite-0.45.0-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (56.3 MB)


     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.3/56.3 MB 74.1 MB/s eta 0:00:00


Collecting MarkupSafe>=2.0
  Downloading MarkupSafe-3.0.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (23 kB)
Collecting mpmath<1.4,>=1.1.0
  Downloading mpmath-1.3.0-py3-none-any.whl (536 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 536.2/536.2 KB 502.9 MB/s eta 0:00:00


Installing collected packages: pytz, py-cpuinfo, mpmath, urllib3, tzdata, typing-extensions, tqdm, threadpoolctl, sympy, six, PyYAML, pyparsing, pydicom, psutil, pillow, packaging, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, numpy, networkx, MarkupSafe, llvmlite, kiwisolver, joblib, idna, fsspec, fonttools, filelock, eval-type-backport, cycler, charset_normalizer, certifi, annotated-types, typing-inspection, triton, tifffile, scipy, requests, python-dateutil, pydantic-core, opencv-python-headless, opencv-python, nvidia-cusparse-cu12, nvidia-cudnn-cu12, numba, lazy-loader, jinja2, imageio, contourpy, scikit-learn, scikit-image, pydantic, pandas, nvidia-cusolver-cu12, matplotlib, albucore, torch, seaborn, iterative-stratification, ensemble-boxes, albumentations, ultralytics-thop, torchvision, ultralytics


Successfully installed MarkupSafe-3.0.2 PyYAML-6.0.2 albucore-0.0.17 albumentations-1.4.18 annotated-types-0.7.0 certifi-2025.8.3 charset_normalizer-3.4.3 contourpy-1.3.3 cycler-0.12.1 ensemble-boxes-1.0.9 eval-type-backport-0.2.2 filelock-3.19.1 fonttools-4.60.0 fsspec-2025.9.0 idna-3.10 imageio-2.37.0 iterative-stratification-0.1.7 jinja2-3.1.6 joblib-1.5.2 kiwisolver-1.4.9 lazy-loader-0.4 llvmlite-0.45.0 matplotlib-3.9.2 mpmath-1.3.0 networkx-3.5 numba-0.62.0 numpy-1.26.4 nvidia-cublas-cu12-12.1.3.1 nvidia-cuda-cupti-cu12-12.1.105 nvidia-cuda-nvrtc-cu12-12.1.105 nvidia-cuda-runtime-cu12-12.1.105 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.0.2.54 nvidia-curand-cu12-10.3.2.106 nvidia-cusolver-cu12-11.4.5.107 nvidia-cusparse-cu12-12.1.0.106 nvidia-nccl-cu12-2.20.5 nvidia-nvjitlink-cu12-12.9.86 nvidia-nvtx-cu12-12.1.105 opencv-python-4.11.0.86 opencv-python-headless-4.10.0.84 packaging-25.0 pandas-2.3.2 pillow-11.3.0 psutil-7.1.0 py-cpuinfo-9.0.0 pydantic-2.11.9 pydantic-core-2.33.2

Deps installed.
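
The log above shows pip collecting torch and torchvision again from PyPI while the constraints file was active, so it seems worth confirming afterwards that the pinned versions are still the ones installed. The sketch below is one possible check. `parse_constraints` and `find_drift` are hypothetical helper names, and the `fake_env` mapping stands in for a real environment so the example runs anywhere.

```python
from importlib import metadata

def parse_constraints(text):
    """Parse 'name==version' lines into a dict (comments and blank lines ignored)."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split('#', 1)[0].strip()
        if not line or '==' not in line:
            continue
        name, version = line.split('==', 1)
        pins[name.strip().lower()] = version.strip()
    return pins

def find_drift(pins, get_version=metadata.version):
    """Return {package: (pinned, installed)} for every pin that no longer matches."""
    drift = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Compare public versions only: '2.4.1+cu121' satisfies a pin of '2.4.1'
        public = installed.split('+', 1)[0] if installed else None
        if public != pinned:
            drift[name] = (pinned, installed)
    return drift

pins = parse_constraints('torch==2.4.1\ntorchvision==0.19.1\ntorchaudio==2.4.1\n')
# Stand-in for a real environment (made-up versions, for illustration only)
fake_env = {'torch': '2.4.1+cu121', 'torchvision': '0.19.1', 'torchaudio': '2.5.0'}
print(find_drift(pins, get_version=fake_env.__getitem__))
```

In the notebook itself, calling `find_drift(parse_constraints(Path('constraints.txt').read_text()))` with the default lookup would query the live environment. This check covers version numbers only, so re-running the `torch.version.cuda` assertion from the first cell is still needed to confirm the CUDA build.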


In [6]:
# Install system libraries required by OpenCV (fix libGL error)
import subprocess, sys
def sh(cmd):
    print('>>', cmd, flush=True)
    subprocess.run(['bash','-lc', cmd], check=True)
sh('apt-get update -y')
sh('apt-get install -y libgl1 libglib2.0-0')
print('System deps installed.')

>> apt-get update -y


Reading package lists...


E: List directory /var/lib/apt/lists/partial is missing. - Acquire (30: Read-only file system)


CalledProcessError: Command '['bash', '-lc', 'apt-get update -y']' returned non-zero exit status 100.
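
Since apt-get cannot write to this filesystem, a pip-only route may be an alternative. The earlier install log lists both opencv-python and opencv-python-headless, and the non-headless wheel is the one that typically needs libGL. The sketch below probes the import in a subprocess and only prints the commands a swap would use. It rests on the assumption that the libGL error comes from the opencv-python wheel, and the helper names are hypothetical.

```python
import subprocess
import sys

def can_import(module):
    """Try importing `module` in a fresh interpreter so a failure cannot affect this kernel."""
    proc = subprocess.run([sys.executable, '-c', f'import {module}'],
                          capture_output=True, text=True)
    return proc.returncode == 0

def headless_swap_commands(headless_pin='opencv-python-headless==4.10.0.84'):
    """Build (but do not run) pip commands that replace the GUI OpenCV wheel."""
    py = [sys.executable, '-m', 'pip']
    return [
        # Both wheels install into the same cv2 package, so remove the GUI build first
        py + ['uninstall', '-y', 'opencv-python'],
        # Reinstall the headless build to restore any files the uninstall removed
        py + ['install', '--force-reinstall', '--no-deps', headless_pin],
    ]

if not can_import('cv2'):
    for cmd in headless_swap_commands():
        print('would run:', ' '.join(cmd))
```

The commands are printed rather than executed so the sketch stays side-effect free; running them through `subprocess.run(cmd, check=True)` would be the next step once the diagnosis is confirmed.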

In [8]:
# Preprocessing: DICOM -> multi-window PNG@1024 and YOLO labels + 5-fold multilabel CV (with WBF + proper MONOCHROME1 handling)
import os, json, math, numpy as np, pandas as pd
from pathlib import Path
from tqdm import tqdm
import pydicom
from iterstrat.ml_stratifiers import MultilabelStratifiedKFold
from PIL import Image
from ensemble_boxes import weighted_boxes_fusion

TRAIN_DIR = Path('train')
TEST_DIR = Path('test')
IMG_OUT_DIR = Path('images_1024')  # cached 1024x1024 PNGs
LBL_OUT_DIR = Path('labels_yolo')  # YOLO txt labels (train only, excluding class 14)
META_DIR = Path('meta')
for d in (IMG_OUT_DIR, LBL_OUT_DIR, META_DIR):
    d.mkdir(parents=True, exist_ok=True)

def load_dicom_pixel(ds: pydicom.Dataset) -> np.ndarray:
    arr = ds.pixel_array.astype(np.float32)
    # Apply rescale slope/intercept
    slope = float(getattr(ds, 'RescaleSlope', 1.0))
    inter = float(getattr(ds, 'RescaleIntercept', 0.0))
    arr = arr * slope + inter
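    # Do NOT invert here for MONOCHROME1; we'll invert the final 8-bit image after windowing
    # Clamp to a fixed range. NOTE: chest radiographs (CR/DX) are not calibrated in Hounsfield
    # units, so check this range against the actual pixel-value distribution before relying on it
    arr = np.clip(arr, -2000, 2000)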
    return arr

def window_image(img: np.ndarray, center: float, width: float) -> np.ndarray:
    low = center - width / 2.0
    high = center + width / 2.0
    img_w = np.clip(img, low, high)
    img_w = (img_w - low) / max(1e-6, (high - low))
    return (img_w * 255.0).astype(np.uint8)

def multi_window_stack(img: np.ndarray) -> np.ndarray:
    # CT-style window presets (lung, soft tissue/mediastinum, bone); they assume HU-like values,
    # which radiograph pixel data does not guarantee
    lung = window_image(img, center=-600, width=1500)
    soft = window_image(img, center=40, width=400)
    bone = window_image(img, center=300, width=2000)
    return np.stack([lung, soft, bone], axis=-1)

def letterbox_square(img: np.ndarray, target: int = 1024):
    # img: HxWxC uint8
    h, w = img.shape[:2]
    scale = target / max(h, w)
    nh, nw = int(round(h * scale)), int(round(w * scale))
    pil = Image.fromarray(img)
    resized = pil.resize((nw, nh), resample=Image.BILINEAR)
    canvas = Image.new('RGB', (target, target), (0, 0, 0))
    top = (target - nh) // 2
    left = (target - nw) // 2
    canvas.paste(resized, (left, top))
    meta = {'orig_h': h, 'orig_w': w, 'scale': scale, 'top': top, 'left': left, 'target': target}
    return np.array(canvas), meta

def save_png(img: np.ndarray, out_path: Path):
    out_path.parent.mkdir(parents=True, exist_ok=True)
    Image.fromarray(img).save(str(out_path), format='PNG', compress_level=3)

def fuse_gt_wbf(df_img: pd.DataFrame, orig_w: int, orig_h: int, iou_thr: float = 0.5):
    # df_img contains rows for a single image (all classes and rads); we'll fuse per class_id != 14 across rad_id
    fused = []
    df_img = df_img[df_img['class_id'] != 14]
    if df_img.empty:
        return fused
    for cid, df_c in df_img.groupby('class_id'):
        # group by radiologist to build per-detector lists
        box_lists, score_lists, label_lists = [], [], []
        for rad, df_r in df_c.groupby('rad_id'):
            boxes = []
            for _, r in df_r.iterrows():
                x1, y1, x2, y2 = r['x_min'], r['y_min'], r['x_max'], r['y_max']
                if not (np.isfinite(x1) and np.isfinite(y1) and np.isfinite(x2) and np.isfinite(y2)):
                    continue
                if x2 <= x1 or y2 <= y1:
                    continue
                # normalize to [0,1]
                boxes.append([x1 / orig_w, y1 / orig_h, x2 / orig_w, y2 / orig_h])
            if len(boxes) == 0:
                continue
            box_lists.append(boxes)
            score_lists.append([1.0] * len(boxes))
            label_lists.append([int(cid)] * len(boxes))
        if len(box_lists) == 0:
            continue
        b, s, l = weighted_boxes_fusion(box_lists, score_lists, label_lists, iou_thr=iou_thr, skip_box_thr=0.0)
        # denormalize back to pixels
        for (x1n, y1n, x2n, y2n), lab, sc in zip(b, l, s):
            x1 = float(x1n * orig_w); y1 = float(y1n * orig_h); x2 = float(x2n * orig_w); y2 = float(y2n * orig_h)
            fused.append({'class_id': int(cid), 'x_min': x1, 'y_min': y1, 'x_max': x2, 'y_max': y2})
    return fused

def convert_boxes_to_yolo(rows: pd.DataFrame, meta: dict) -> list[str]:
    # rows contains columns: class_id, x_min, y_min, x_max, y_max in ORIGINAL pixels
    S = meta['scale']; L = meta['left']; T = meta['top']; tgt = meta['target']
    lines = []
    for _, r in rows.iterrows():
        cid = int(r['class_id'])
        x1, y1, x2, y2 = float(r['x_min']), float(r['y_min']), float(r['x_max']), float(r['y_max'])
        if not (np.isfinite(x1) and np.isfinite(y1) and np.isfinite(x2) and np.isfinite(y2)):
            continue
        if x2 <= x1 or y2 <= y1:
            continue
        # scale and letterbox shift
        x1p = x1 * S + L; x2p = x2 * S + L
        y1p = y1 * S + T; y2p = y2 * S + T
        # clip
        x1p = np.clip(x1p, 0, tgt); x2p = np.clip(x2p, 0, tgt)
        y1p = np.clip(y1p, 0, tgt); y2p = np.clip(y2p, 0, tgt)
        if x2p <= x1p or y2p <= y1p:
            continue
        xc = (x1p + x2p) / 2.0 / tgt
        yc = (y1p + y2p) / 2.0 / tgt
        bw = (x2p - x1p) / tgt
        bh = (y2p - y1p) / tgt
        lines.append(f"{cid} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}")
    return lines

def process_split(csv_path='train.csv', limit=None):
    df = pd.read_csv(csv_path)
    img_ids = df['image_id'].unique().tolist()
    if limit is not None:
        img_ids = img_ids[:limit]
    meta_records = {}
    for iid in tqdm(img_ids, desc='DICOM->PNG'):
        dcm_path = TRAIN_DIR / f'{iid}.dicom'
        try:
            ds = pydicom.dcmread(str(dcm_path))
            base = load_dicom_pixel(ds)
            rgb = multi_window_stack(base)
            # Proper MONOCHROME1 handling: invert AFTER windowing
            if getattr(ds, 'PhotometricInterpretation', 'MONOCHROME2') == 'MONOCHROME1':
                rgb = 255 - rgb
            out_img_path = IMG_OUT_DIR / f'{iid}.png'
            img_1024, meta = letterbox_square(rgb, 1024)
            save_png(img_1024, out_img_path)
            meta_records[iid] = meta
            # labels: WBF merge across rad_id per class
            # Take the original size from the decoded array so it always matches the letterbox meta
            orig_h, orig_w = base.shape[:2]
            fused = fuse_gt_wbf(df[df.image_id == iid], orig_w=orig_w, orig_h=orig_h, iou_thr=0.5)
            if len(fused):
                rows = pd.DataFrame(fused)
            else:
                rows = pd.DataFrame(columns=['class_id','x_min','y_min','x_max','y_max'])
            yolo_lines = convert_boxes_to_yolo(rows, meta)
            out_lbl_path = LBL_OUT_DIR / f'{iid}.txt'
            with open(out_lbl_path, 'w') as f:
                if len(yolo_lines):
                    f.write('\n'.join(yolo_lines))
                else:
                    f.write('')
        except Exception as e:
            print(f'Failed {iid}: {e}', flush=True)
    # Save meta for later inverse-mapping to original size
    with open(META_DIR / 'image_meta.json', 'w') as f:
        json.dump(meta_records, f)
    print('Processed images:', len(meta_records))

def build_folds(csv_path='train.csv', n_splits=5, seed=42):
    df = pd.read_csv(csv_path)
    # Build multilabel presence per image for classes 0-13 (exclude 14 No finding)
    pos = df[df['class_id'] != 14][['image_id','class_id']].dropna()
    pos['class_id'] = pos['class_id'].astype(int)
    classes = list(range(14))
    img_ids = df['image_id'].unique()
    img_to_idx = {iid:i for i,iid in enumerate(img_ids)}
    Y = np.zeros((len(img_ids), len(classes)), dtype=int)
    for iid, cid in zip(pos['image_id'].values, pos['class_id'].values):
        Y[img_to_idx[iid], cid] = 1
    mskf = MultilabelStratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    folds = np.full(len(img_ids), -1, dtype=int)
    for k, (_, val_idx) in enumerate(mskf.split(np.zeros(len(img_ids)), Y)):
        folds[val_idx] = k
    folds_df = pd.DataFrame({'image_id': img_ids, 'fold': folds})
    folds_df.to_csv(META_DIR / 'folds.csv', index=False)
    print('Folds saved:', (META_DIR / 'folds.csv'))

print('Preprocessing utilities ready. Next steps:')
print('- Run process_split(limit=50) for smoke test, then full run (may take minutes).')
print('- Run build_folds() to create 5-fold split for training.')

Preprocessing utilities ready. Next steps:
- Run process_split(limit=50) for smoke test, then full run (may take minutes).
- Run build_folds() to create 5-fold split for training.
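
The label conversion above depends on the letterbox bookkeeping (`scale`, `left`, `top`) being invertible. The following is a minimal, dependency-free sketch of that round trip; `letterbox_meta`, `to_yolo`, and `from_yolo` are hypothetical helper names that mirror the arithmetic in `letterbox_square` and `convert_boxes_to_yolo`, and the image size is made up for illustration.

```python
def letterbox_meta(orig_w, orig_h, target=1024):
    """Scale and padding offsets for a centered square letterbox."""
    scale = target / max(orig_h, orig_w)
    new_w, new_h = int(round(orig_w * scale)), int(round(orig_h * scale))
    return {"scale": scale, "left": (target - new_w) // 2,
            "top": (target - new_h) // 2, "target": target}

def to_yolo(box, meta):
    """Original-pixel (x1, y1, x2, y2) -> normalized (xc, yc, w, h) in the letterboxed frame."""
    x1, y1, x2, y2 = box
    s, left, top, tgt = meta["scale"], meta["left"], meta["top"], meta["target"]
    x1, x2 = x1 * s + left, x2 * s + left
    y1, y2 = y1 * s + top, y2 * s + top
    return ((x1 + x2) / 2 / tgt, (y1 + y2) / 2 / tgt, (x2 - x1) / tgt, (y2 - y1) / tgt)

def from_yolo(yolo, meta):
    """Inverse of to_yolo: normalized letterboxed box -> original-pixel (x1, y1, x2, y2)."""
    xc, yc, w, h = yolo
    s, left, top, tgt = meta["scale"], meta["left"], meta["top"], meta["target"]
    x1, x2 = (xc - w / 2) * tgt, (xc + w / 2) * tgt
    y1, y2 = (yc - h / 2) * tgt, (yc + h / 2) * tgt
    return ((x1 - left) / s, (y1 - top) / s, (x2 - left) / s, (y2 - top) / s)

# Hypothetical portrait image: 2000 px wide, 2500 px tall
meta = letterbox_meta(orig_w=2000, orig_h=2500)
box = (400.0, 500.0, 1200.0, 1500.0)
recovered = from_yolo(to_yolo(box, meta), meta)
```

If `recovered` drifts from `box` by more than floating-point noise, the forward and inverse mappings disagree and predictions would land in the wrong place on the original image.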


In [10]:
# Smoke test: convert 50 images and build 5-folds
import time
t0 = time.time()
print('Starting smoke conversion for 50 images...', flush=True)
process_split(csv_path='train.csv', limit=50)
print('Building folds...', flush=True)
build_folds(csv_path='train.csv', n_splits=5, seed=42)
print('Elapsed: %.2f sec' % (time.time() - t0), flush=True)
print('Check outputs: images_1024/*.png, labels_yolo/*.txt, meta/image_meta.json, meta/folds.csv', flush=True)

Starting smoke conversion for 50 images...


DICOM->PNG: 100%|██████████| 50/50 [00:21<00:00,  2.30it/s]

Processed images: 50
Building folds...





Folds saved: meta/folds.csv
Elapsed: 21.99 sec


Check outputs: images_1024/*.png, labels_yolo/*.txt, meta/image_meta.json, meta/folds.csv
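
Before scaling up from the smoke test, it may be worth checking the generated label files mechanically. The sketch below validates the text of one YOLO label file; `check_yolo_label_text` is a hypothetical helper, and the 14-class limit is taken from the class range 0..13 used above.

```python
def check_yolo_label_text(text, num_classes=14):
    """Return a list of problems found in the contents of one YOLO label file."""
    problems = []
    for n, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # empty files/lines are valid (image without boxes)
        parts = line.split()
        if len(parts) != 5:
            problems.append(f"line {n}: expected 5 fields, got {len(parts)}")
            continue
        try:
            cid = int(parts[0])
            xc, yc, w, h = (float(p) for p in parts[1:])
        except ValueError:
            problems.append(f"line {n}: non-numeric field")
            continue
        if not 0 <= cid < num_classes:
            problems.append(f"line {n}: class id {cid} out of range")
        if not all(0.0 <= v <= 1.0 for v in (xc, yc, w, h)):
            problems.append(f"line {n}: coordinate outside [0, 1]")
        elif w <= 0.0 or h <= 0.0:
            problems.append(f"line {n}: non-positive box size")
    return problems

ok = check_yolo_label_text("0 0.500000 0.500000 0.200000 0.200000")
bad = check_yolo_label_text("14 0.5 0.5 0.2 0.2\n3 1.2 0.5 0.1 0.1")
```

Applied to each file under `labels_yolo/` (for example via `Path.read_text()`), an empty result for every file is a quick signal that the conversion produced well-formed labels.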


In [11]:
# Utilities: Full conversion runners (train full, test), and YOLO fold file lists + data.yaml
import pandas as pd, json, os
from pathlib import Path

IMG_OUT_DIR = Path('images_1024')
IMG_OUT_TEST_DIR = Path('images_1024_test')
META_DIR = Path('meta')
TRAIN_DIR = Path('train')
TEST_DIR = Path('test')
LBL_OUT_DIR = Path('labels_yolo')
IMG_OUT_TEST_DIR.mkdir(parents=True, exist_ok=True)

def process_test(target=1024):
    from tqdm import tqdm
    import pydicom, numpy as np
    meta_records = {}
    iids = [p.stem for p in sorted(TEST_DIR.glob('*.dicom'))]
    for iid in tqdm(iids, desc='TEST DICOM->PNG'):
        dcm_path = TEST_DIR / f'{iid}.dicom'
        try:
            ds = pydicom.dcmread(str(dcm_path))
            base = load_dicom_pixel(ds)
            rgb = multi_window_stack(base)
            if getattr(ds, 'PhotometricInterpretation', 'MONOCHROME2') == 'MONOCHROME1':
                rgb = 255 - rgb
            out_img_path = IMG_OUT_TEST_DIR / f'{iid}.png'
            img_1024, meta = letterbox_square(rgb, target)
            save_png(img_1024, out_img_path)
            meta_records[iid] = meta
        except Exception as e:
            print(f'Failed TEST {iid}: {e}', flush=True)
    with open(META_DIR / 'test_image_meta.json', 'w') as f:
        json.dump(meta_records, f)
    print('Processed TEST images:', len(meta_records))

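def prepare_yolo_folds_files(folds_csv='meta/folds.csv', out_dir='meta/fold_files', ds_root='yolo_ds'):
    out_dir = Path(out_dir); out_dir.mkdir(parents=True, exist_ok=True)
    # Ultralytics derives each label path by swapping the last '/images/' directory for '/labels/',
    # so expose the caches as <ds_root>/images and <ds_root>/labels via symlinks.
    # ('yolo_ds' is an arbitrary directory name chosen here.)
    ds_root = Path(ds_root).resolve(); ds_root.mkdir(parents=True, exist_ok=True)
    for link, target in ((ds_root / 'images', IMG_OUT_DIR), (ds_root / 'labels', LBL_OUT_DIR)):
        if not link.exists():
            link.symlink_to(target.resolve(), target_is_directory=True)
    folds = pd.read_csv(folds_csv)
    # ensure all images exist
    exist_set = set(p.stem for p in IMG_OUT_DIR.glob('*.png'))
    folds = folds[folds['image_id'].isin(exist_set)].copy()
    # write per-fold train/val lists (absolute paths through the 'images' symlink)
    for k in sorted(folds['fold'].unique()):
        val_ids = folds.loc[folds['fold']==k, 'image_id'].tolist()
        train_ids = folds.loc[folds['fold']!=k, 'image_id'].tolist()
        with open(out_dir / f'train_fold{k}.txt', 'w') as f:
            for iid in train_ids:
                f.write(str(ds_root / 'images' / f'{iid}.png') + '\n')
        with open(out_dir / f'val_fold{k}.txt', 'w') as f:
            for iid in val_ids:
                f.write(str(ds_root / 'images' / f'{iid}.png') + '\n')
    print('Wrote YOLO filelists to', out_dir)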

def write_classes_and_yaml(train_csv='train.csv', out_dir='meta'):
    out_dir = Path(out_dir); out_dir.mkdir(parents=True, exist_ok=True)
    df = pd.read_csv(train_csv)
    df = df[df['class_id'] != 14]
    # Build class names mapping 0..13
    names = df[['class_id','class_name']].drop_duplicates().sort_values('class_id')
    # Some datasets may not list all 0..13 in this subset; ensure full 0..13 order by filling names if missing
    class_names = {}
    for cid in range(14):
        rows = names[names['class_id']==cid]['class_name'].tolist()
        class_names[cid] = (rows[0] if rows else f'class_{cid}')
    with open(out_dir / 'classes.json','w') as f:
        json.dump(class_names, f, indent=2)
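    # Write a base data.yaml (we will override train/val per fold when launching training).
    # Use an absolute dataset root: a relative 'path' may be resolved against the Ultralytics
    # datasets directory rather than the current working directory.
    root = Path('.').resolve()
    yaml_content = [
        f'path: {root.as_posix()}',
        f'train: {(Path(out_dir) / "fold_files" / "train_fold0.txt").as_posix()}',
        f'val: {(Path(out_dir) / "fold_files" / "val_fold0.txt").as_posix()}',
        'nc: 14',
        'names: [' + ', '.join([f'"{class_names[i]}"' for i in range(14)]) + ']'
    ]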
    with open(out_dir / 'data.yaml','w') as f:
        f.write('\n'.join(yaml_content))
    print('Wrote classes.json and base data.yaml in', out_dir)

print('Runners ready: process_test(), prepare_yolo_folds_files(), write_classes_and_yaml().')
print('Next: run full train conversion (process_split with limit=None), process_test(), make fold lists, and write data.yaml.')

Runners ready: process_test(), prepare_yolo_folds_files(), write_classes_and_yaml().
Next: run full train conversion (process_split with limit=None), process_test(), make fold lists, and write data.yaml.
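
When images are supplied through file lists like the ones written above, it helps to know where the trainer will look for labels. As far as I understand the Ultralytics convention, the label path is obtained by replacing the last `/images/` directory in the image path with `/labels/` and changing the extension to `.txt`. The function below is a hypothetical re-implementation of that rule for POSIX-style paths, written for illustration only; it is not the library's code.

```python
def expected_label_path(img_path: str) -> str:
    """Mimic the '/images/' -> '/labels/' substitution used to locate YOLO labels."""
    head_tail = img_path.rsplit("/images/", 1)    # split on the LAST '/images/' only
    swapped = "/labels/".join(head_tail)          # no-op if '/images/' is absent
    return swapped.rsplit(".", 1)[0] + ".txt"     # replace the file extension

conventional = expected_label_path("ds/images/abc.png")
unconventional = expected_label_path("images_1024/abc.png")
```

Under this rule, a path such as `images_1024/abc.png` maps to `images_1024/abc.txt`, so label files kept in a separately named directory would only be picked up if the listed image paths contain an `images` directory with a sibling `labels` directory.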


In [None]:
# Full conversion + lists/yaml
import time
t0 = time.time()
print('Starting FULL train conversion...', flush=True)
process_split(csv_path='train.csv', limit=None)
print('Train conversion done in %.2f sec' % (time.time()-t0), flush=True)
t1 = time.time()
print('Processing TEST conversion...', flush=True)
process_test(target=1024)
print('Test conversion done in %.2f sec' % (time.time()-t1), flush=True)
print('Preparing YOLO fold filelists...', flush=True)
prepare_yolo_folds_files(folds_csv='meta/folds.csv', out_dir='meta/fold_files')
print('Writing classes and base data.yaml...', flush=True)
write_classes_and_yaml(train_csv='train.csv', out_dir='meta')
print('All preprocessing steps completed in %.2f sec' % (time.time()-t0), flush=True)

Starting FULL train conversion...


DICOM->PNG:   0%|          | 0/13500 [00:00<?, ?it/s]

DICOM->PNG:   0%|          | 6/13500 [00:04<2:06:15,  1.78it/s]

In [None]:
# YOLOv8m training utilities (per-fold) - tuned per expert advice
from ultralytics import YOLO
import yaml, shutil, os, time
from pathlib import Path

META_DIR = Path('meta')
FOLDS_DIR = META_DIR / 'fold_files'
RUNS_DIR = Path('runs')
RUNS_DIR.mkdir(exist_ok=True, parents=True)

def make_fold_yaml(base_yaml='meta/data.yaml', fold=0) -> str:
    base = yaml.safe_load(Path(base_yaml).read_text())
    base['train'] = str((FOLDS_DIR / f'train_fold{fold}.txt').as_posix())
    base['val'] = str((FOLDS_DIR / f'val_fold{fold}.txt').as_posix())
    out = META_DIR / f'data_fold{fold}.yaml'
    out.write_text(yaml.safe_dump(base))
    return str(out)

def train_yolov8m_fold(
    fold=0,
    epochs=25,
    batch=16,
    imgsz=1024,
    seed=42,
    workers=8,
    cache='ram',  # 'ram' or 'disk' or False
    mosaic=0.3,
    close_mosaic=10,
    degrees=5.0,
    scale=0.2,
    translate=0.05,
    fliplr=0.5,
    hsv_v=0.1,
):
    data_yaml = make_fold_yaml(fold=fold)
    model = YOLO('yolov8m.pt')
    name = f'v8m_{imgsz}_fold{fold}_e{epochs}_b{batch}'
    print(f'Start training: {name}', flush=True)
    t0 = time.time()
    results = model.train(
        data=data_yaml,
        epochs=epochs,
        imgsz=imgsz,
        batch=batch,
        device=0,
        seed=seed,
        workers=workers,
        project=str(RUNS_DIR),
        name=name,
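        pretrained=True,
        amp=True,
        # No 'ema' flag: the Ultralytics trainer keeps EMA weights by default, and 'ema' is not a recognized train() argument
        cos_lr=True,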
        warmup_epochs=3,
        patience=5,
        cache=cache,
        rect=False,
        # Augs (medical-safe)
        fliplr=fliplr,
        flipud=0.0,
        degrees=degrees,
        scale=scale,
        translate=translate,
        shear=0.0,
        hsv_h=0.0, hsv_s=0.0, hsv_v=hsv_v,
        mixup=0.0,
        copy_paste=0.0,
        mosaic=mosaic,
        close_mosaic=close_mosaic,
    )
    print(f'Training done in {(time.time()-t0)/60:.1f} min', flush=True)
    return results

print('Training utilities ready. After full conversion completes, call train_yolov8m_fold(fold=0) as a smoke, then scale to 5 folds.')

In [None]:
# Inspect sanity training metrics (results.csv) live - posonly2 run
import os, time, pandas as pd, glob
from pathlib import Path

# Point to the corrected sanity run
run_dir = Path('runs/partial_v8m_1024_fold0_e4_b12_posonly2')
csv_path = run_dir / 'results.csv'
if not csv_path.exists():
    print('results.csv not found at', csv_path)
else:
    try:
        df = pd.read_csv(csv_path)
        print('results.csv rows:', len(df), 'cols:', list(df.columns))
        print(df.tail(5))
    except Exception as e:
        print('Failed to read results.csv:', e)
    st = os.stat(csv_path)
    print('Last modified (epoch):', st.st_mtime, '->', time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(st.st_mtime)))
    print('results.csv size (bytes):', os.path.getsize(csv_path))

    # List artifacts to gauge training progress
    files = sorted(glob.glob(str(run_dir / '*')))
    print('Artifacts:', [Path(f).name for f in files])
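
Beyond printing the tail of `results.csv`, picking out the best epoch so far can be done without pandas. The sketch below assumes the file has an `epoch` column and a metric column named `metrics/mAP50(B)`, possibly padded with spaces; both names are assumptions about the Ultralytics log layout and may differ between versions. `best_epoch` is a hypothetical helper and the sample rows are made up.

```python
import csv
import io

def best_epoch(csv_text, metric="metrics/mAP50(B)"):
    """Return (epoch, value) of the row with the highest metric, or None if unavailable."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Strip whitespace padding from both header names and values
    rows = [{k.strip(): v.strip() for k, v in row.items() if k is not None}
            for row in reader]
    if not rows or metric not in rows[0]:
        return None
    best = max(rows, key=lambda r: float(r[metric]))
    return int(float(best["epoch"])), float(best[metric])

# Made-up sample in the padded style some versions write
sample = """   epoch,  metrics/mAP50(B)
       1,   0.10
       2,   0.25
       3,   0.20
"""
result = best_epoch(sample)
```

In the cell above this could be called as `best_epoch(csv_path.read_text())`, which avoids surprises from padded column names.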

In [None]:
# Inference, WBF, and submission utilities (prep while conversion runs)
import json, os, math, numpy as np, pandas as pd
from pathlib import Path
from ultralytics import YOLO
from ensemble_boxes import weighted_boxes_fusion

META_DIR = Path('meta')
IMG_TEST_DIR = Path('images_1024_test')

def load_meta(kind='train'):
    if kind == 'train':
        mp = META_DIR / 'image_meta.json'
    else:
        mp = META_DIR / 'test_image_meta.json'
    return json.loads(mp.read_text()) if mp.exists() else {}

def unletterbox_to_original(xyxy, meta):
    # xyxy in letterboxed 1024 space -> back to ORIGINAL pixel coordinates.
    # The competition's PredictionString takes original-image pixels, so do not normalize here.
    x1, y1, x2, y2, conf, cid = xyxy
    S = meta['scale']; L = meta['left']; T = meta['top']
    orig_w = meta['orig_w']; orig_h = meta['orig_h']
    # remove letterbox/pad
    x1p = (x1 - L) / max(S, 1e-6)
    x2p = (x2 - L) / max(S, 1e-6)
    y1p = (y1 - T) / max(S, 1e-6)
    y2p = (y2 - T) / max(S, 1e-6)
    # clip to original frame
    x1p = float(np.clip(x1p, 0, orig_w)); x2p = float(np.clip(x2p, 0, orig_w))
    y1p = float(np.clip(y1p, 0, orig_h)); y2p = float(np.clip(y2p, 0, orig_h))
    if x2p <= x1p or y2p <= y1p:
        return None
    # return class id, conf, and xyxy in original pixels
    return int(cid), float(conf), x1p, y1p, x2p, y2p

def predict_single_with_tta(model, img_path, tta_hflip=True):
    # Returns list of (boxes_xyxy, scores, labels) for each TTA view in letterboxed 1024 coords
    outs = []
    # base
    r = model.predict(source=str(img_path), conf=0.001, iou=0.5, device=0, imgsz=1024, half=False, verbose=False, max_det=300)[0]
    if r.boxes is not None and len(r.boxes) > 0:
        b = r.boxes.xyxy.cpu().numpy(); s = r.boxes.conf.cpu().numpy(); l = r.boxes.cls.cpu().numpy().astype(int)
        outs.append((b, s, l))
    else:
        outs.append((np.zeros((0,4)), np.zeros((0,)), np.zeros((0,), int)))
    if tta_hflip:
        r = model.predict(source=str(img_path), conf=0.001, iou=0.5, device=0, imgsz=1024, half=False, verbose=False, max_det=300, augment=True)[0]
        # Ultralytics augment includes hflip among others; we don't need to manually invert
        if r.boxes is not None and len(r.boxes) > 0:
            b = r.boxes.xyxy.cpu().numpy(); s = r.boxes.conf.cpu().numpy(); l = r.boxes.cls.cpu().numpy().astype(int)
            outs.append((b, s, l))
        else:
            outs.append((np.zeros((0,4)), np.zeros((0,)), np.zeros((0,), int)))
    return outs

def wbf_fuse_views(views, iou_thr=0.55):
    # views: list of (boxes_xyxy, scores, labels) in SAME coord space
    if not views:
        return np.zeros((0,4)), np.zeros((0,)), np.zeros((0,), int)
    bxs = []; scs = []; lbs = []
    for b, s, l in views:
        # normalize to [0,1] for WBF fusion in 1024 space
        if len(b) == 0:
            bxs.append([]); scs.append([]); lbs.append([]); continue
        b_norm = b / 1024.0
        b_norm = b_norm.tolist()
        bxs.append(b_norm); scs.append(s.tolist()); lbs.append(l.tolist())
    if sum(len(x) for x in bxs) == 0:
        return np.zeros((0,4)), np.zeros((0,)), np.zeros((0,), int)
    fb, fs, fl = weighted_boxes_fusion(bxs, scs, lbs, iou_thr=iou_thr, skip_box_thr=0.0, conf_type='avg')
    fb = np.asarray(fb) * 1024.0
    fs = np.asarray(fs); fl = np.asarray(fl, int)
    return fb, fs, fl

def build_submission_from_preds(preds_norm):
    # preds_norm: dict[iid] -> list of tuples (cid, conf, x1, y1, x2, y2).
    # Coordinates are written verbatim, so they must already be in original-image pixels as the competition expects.
    rows = []
    for iid, items in preds_norm.items():
        if not items:
            rows.append({'image_id': iid, 'PredictionString': '14 1 0 0 1 1'})
            continue
        parts = []
        for cid, conf, x1, y1, x2, y2 in items:
            parts.extend([str(cid), f'{conf:.6f}', f'{x1:.6f}', f'{y1:.6f}', f'{x2:.6f}', f'{y2:.6f}'])
        rows.append({'image_id': iid, 'PredictionString': ' '.join(parts) if parts else '14 1 0 0 1 1'})
    return pd.DataFrame(rows, columns=['image_id','PredictionString'])

def infer_test_fold(model_path, tta_hflip=True, min_conf_per_class=None):
    # min_conf_per_class: optional dict[cid] -> threshold to filter after WBF
    model = YOLO(model_path)
    test_meta = load_meta('test')
    preds_norm = {}
    iids = [p.stem for p in sorted(IMG_TEST_DIR.glob('*.png'))]
    for i, iid in enumerate(iids):
        img_path = IMG_TEST_DIR / f'{iid}.png'
        views = predict_single_with_tta(model, img_path, tta_hflip=tta_hflip)
        fb, fs, fl = wbf_fuse_views(views, iou_thr=0.55)
        # per-class thresholding
        keep = []
        for (x1, y1, x2, y2), sc, cl in zip(fb, fs, fl):
            if min_conf_per_class is not None:
                thr = min_conf_per_class.get(int(cl), 0.0)
                if sc < thr:
                    continue
            keep.append((x1, y1, x2, y2, sc, cl))
        items = []
        meta = test_meta.get(iid, None)
        if meta is None:
            preds_norm[iid] = []
            continue
        for obj in keep:
            unm = unletterbox_to_original(obj, meta)
            if unm is not None:
                items.append(unm)
        preds_norm[iid] = items
        if (i+1) % 100 == 0:
            print(f'Infer {i+1}/{len(iids)} done', flush=True)
    sub_df = build_submission_from_preds(preds_norm)
    return sub_df

print('Inference utilities ready: use infer_test_fold(model_path, tta_hflip=True, min_conf_per_class=...) after training completes.')
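
The submission row format (`class_id conf x_min y_min x_max y_max` repeated, with `14 1 0 0 1 1` standing in for "No finding") can be illustrated in isolation. The sketch below is a hypothetical helper, `format_prediction_string`, that assumes boxes are already in original-image pixel coordinates; rounding coordinates to integers and confidences to four decimals are choices made here, not confirmed competition requirements.

```python
def format_prediction_string(dets, no_finding="14 1 0 0 1 1"):
    """dets: iterable of (class_id, conf, x_min, y_min, x_max, y_max) in original pixels."""
    parts = []
    for cid, conf, x1, y1, x2, y2 in dets:
        if x2 <= x1 or y2 <= y1:
            continue  # skip degenerate boxes
        parts.append(f"{int(cid)} {conf:.4f} {round(x1)} {round(y1)} {round(x2)} {round(y2)}")
    # Fall back to the 'No finding' placeholder when nothing survives
    return " ".join(parts) if parts else no_finding

empty_row = format_prediction_string([])
one_box = format_prediction_string([(0, 0.9, 10.4, 20.6, 110.2, 220.7)])
```

A quick sanity check on any finished submission is that box coordinates span the range of the original image sizes rather than sitting inside [0, 1].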

In [None]:
# Launch fold-0 full training after preprocessing completes
import os, glob, time
from pathlib import Path

def clear_ultralytics_caches():
    removed = 0
    for p in list(Path('.').glob('*.cache')) + list(Path('meta').glob('*.cache')) + list(Path('meta/fold_files').glob('*.cache')):
        try:
            p.unlink(); removed += 1
        except Exception:
            pass
    print('Cleared cache files:', removed, flush=True)

def sanity_preflight():
    # quick checks before training
    imgs = list(Path('images_1024').glob('*.png'))
    lbls = list(Path('labels_yolo').glob('*.txt'))
    print('Found images:', len(imgs), 'labels:', len(lbls), flush=True)
    assert Path('meta/folds.csv').exists(), 'Missing meta/folds.csv'
    assert Path('meta/fold_files/train_fold0.txt').exists(), 'Missing fold filelists; run prepare_yolo_folds_files()'
    assert Path('meta/data.yaml').exists(), 'Missing meta/data.yaml'

def train_fold0():
    clear_ultralytics_caches()
    sanity_preflight()
    # Start training with tuned settings (adjust batch if OOM)
    train_yolov8m_fold(
        fold=0,
        epochs=25,
        batch=16,
        imgsz=1024,
        workers=8,
        cache='ram',
        mosaic=0.3,
        close_mosaic=10,
        degrees=5.0,
        scale=0.2,
        translate=0.05,
        fliplr=0.5,
        hsv_v=0.1,
    )

print('When Cell 8 finishes, run train_fold0() to launch full fold-0 training.', flush=True)