<div align="center">

  <a href="https://ultralytics.com/yolo" target="_blank">
    <img width="1024", src="https://raw.githubusercontent.com/ultralytics/assets/main/yolov8/banner-yolov8.png"></a>

  [中文](https://docs.ultralytics.com/zh/) | [한국어](https://docs.ultralytics.com/ko/) | [日本語](https://docs.ultralytics.com/ja/) | [Русский](https://docs.ultralytics.com/ru/) | [Deutsch](https://docs.ultralytics.com/de/) | [Français](https://docs.ultralytics.com/fr/) | [Español](https://docs.ultralytics.com/es/) | [Português](https://docs.ultralytics.com/pt/) | [Türkçe](https://docs.ultralytics.com/tr/) | [Tiếng Việt](https://docs.ultralytics.com/vi/) | [العربية](https://docs.ultralytics.com/ar/)

  <a href="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yml"><img src="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yml/badge.svg" alt="Ultralytics CI"></a>
  <a href="https://colab.research.google.com/github/ultralytics/notebooks/blob/main/notebooks/how-to-train-ultralytics-yolo-on-crack-segmentation-dataset.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>


  <a href="https://ultralytics.com/discord"><img alt="Discord" src="https://img.shields.io/discord/1089800235347353640?logo=discord&logoColor=white&label=Discord&color=blue"></a>
  <a href="https://community.ultralytics.com"><img alt="Ultralytics Forums" src="https://img.shields.io/discourse/users?server=https%3A%2F%2Fcommunity.ultralytics.com&logo=discourse&label=Forums&color=blue"></a>
  <a href="https://reddit.com/r/ultralytics"><img alt="Ultralytics Reddit" src="https://img.shields.io/reddit/subreddit-subscribers/ultralytics?style=flat&logo=reddit&logoColor=white&label=Reddit&color=blue"></a>
  
  Welcome to the Crack segmentation with Ultralytics YOLO11 🚀 notebook! <a href="https://github.com/ultralytics/ultralytics">YOLO11</a> is the latest version of the YOLO (You Only Look Once) AI models developed by <a href="https://ultralytics.com">Ultralytics</a>. We hope that the resources in this notebook will help you get the most out of YOLO11. Please browse the YOLO11 <a href="https://docs.ultralytics.com/">Docs</a> for details, raise an issue on <a href="https://github.com/ultralytics/ultralytics">GitHub</a> for support, and join our <a href="https://ultralytics.com/discord">Discord</a> community for questions and discussions!</div>

# Crack Segmentation using Ultralytics YOLO11

This notebook acts as a starting point for training the YOLO11 model using the [crack segmentation dataset](https://docs.ultralytics.com/datasets/segment/crack-seg/).

## Dataset Structure

The division of data within the Crack Segmentation Dataset is outlined as follows:

- **Training set**: Consists of 3717 images with corresponding annotations.

- **Testing set**: Comprises 112 images along with their respective annotations.

- **Validation set**: Includes 200 images with their corresponding annotations.

## Applications

Crack segmentation finds practical applications in infrastructure maintenance, aiding in the identification and assessment of structural damage. It also plays a crucial role in enhancing road safety by enabling automated systems to detect and address pavement cracks for timely repairs.


In [1]:
#1 기본설치 및 경로설정 import 추가
# 1. 기본설치 및 경로설정 import 추가
!pip install -q scipy scikit-learn tensorboard

# 2. 필요 모듈 임포트
from google.colab import files
import zipfile
import os
import glob
import shutil
import numpy as np
import scipy.io
from PIL import Image
import matplotlib.pyplot as plt
from tqdm import tqdm
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from sklearn.model_selection import StratifiedKFold
from torch.utils.tensorboard import SummaryWriter

# 3. 압축 파일 업로드
print("▶️ 압축 파일을 선택하세요 (.zip)")
uploaded = files.upload()
zip_filename = next(iter(uploaded))
print(f"✅ 업로드 완료: {zip_filename}")

# 4. 압축 해제
extract_root = '.'
with zipfile.ZipFile(zip_filename, 'r') as zip_ref:
    zip_ref.extractall(extract_root)
print(f"✅ 압축 해제 완료: '{extract_root}'에 내용이 풀렸습니다")

# 5. CrackForest-dataset 폴더명 정리
target_dir = "CrackForest-dataset"
src_dir = None

for item in os.listdir(extract_root):
    if item.startswith("CrackForest") and os.path.isdir(item) and item != target_dir:
        src_dir = item
        break

if src_dir:
    if os.path.exists(target_dir):
        print(f"기존 '{target_dir}' 폴더가 이미 존재합니다. 삭제 후 이름 변경을 진행합니다.")
        shutil.rmtree(target_dir)
    os.rename(src_dir, target_dir)
    print(f"📁 폴더 이름을 '{src_dir}' ➜ '{target_dir}' 으로 변경했습니다.")
else:
    print(f"변경할 폴더가 없거나 이미 '{target_dir}' 폴더가 존재합니다.")

# 6. 데이터셋 경로 설정 및 검증
dataset_dir = os.path.join(extract_root, target_dir)
if not os.path.exists(dataset_dir):
    raise FileNotFoundError(f"❌ '{target_dir}' 폴더가 존재하지 않습니다. 압축 구조를 확인해 주세요.")

img_dir = os.path.join(dataset_dir, 'image')
gt_dir = os.path.join(dataset_dir, 'groundTruth')
seg_dir = os.path.join(dataset_dir, 'seg')

for subdir, name in zip([img_dir, gt_dir, seg_dir], ['image', 'groundTruth', 'seg']):
    if not os.path.exists(subdir):
        raise FileNotFoundError(f"❌ '{name}/' 폴더가 '{dataset_dir}' 내에 없습니다. 압축 구조를 확인해 주세요.")

# 7. 출력 디렉토리 생성
os.makedirs('runs', exist_ok=True)
os.makedirs('saved_models', exist_ok=True)

# 8. 경로 출력
print(f"✅ 데이터셋 루트: {dataset_dir}")
print(f"📁 이미지 폴더: {img_dir}")
print(f"📁 마스크 폴더: {gt_dir}")
print(f"📁 Segmentation 폴더: {seg_dir}")

# 9. 이미지 및 .mat 파일 경로 수집 (확장자 자동 감지)
supported_img_exts = ['png', 'jpg', 'jpeg', 'bmp', 'tif', 'tiff']
img_paths = []
for ext in supported_img_exts:
    img_paths.extend(glob.glob(os.path.join(img_dir, f"*.{ext}")))
img_paths = sorted(img_paths)
mat_paths = sorted(glob.glob(os.path.join(gt_dir, "*.mat")))

# 🔍 디버깅 로그
print(f"\n✅ 이미지 디렉토리 경로: {img_dir}")
print(f"🖼️ 감지된 이미지 수: {len(img_paths)}")
if len(img_paths) > 0:
    print("✅ 이미지 샘플:", img_paths[:3])
else:
    print("⚠️ 지원되는 이미지 확장자로 된 파일이 없습니다. 실제 확장자 확인 필요")
print(f"📐 감지된 GroundTruth 수: {len(mat_paths)}")

# 10. 이미지와 마스크 이름 기준으로 공통 매칭
img_names = set([os.path.splitext(os.path.basename(p))[0] for p in img_paths])
mat_names = set([os.path.splitext(os.path.basename(p))[0] for p in mat_paths])

common_keys = sorted(img_names & mat_names)
img_dict = {os.path.splitext(os.path.basename(p))[0]: p for p in img_paths}
mat_dict = {os.path.splitext(os.path.basename(p))[0]: p for p in mat_paths}

img_paths_matched = [img_dict[k] for k in common_keys]
mat_paths_matched = [mat_dict[k] for k in common_keys]

# 샘플 라벨 (예: 모두 0으로 설정)
labels = [0] * len(common_keys)

print(f"\n✅ 매칭된 이미지 수: {len(img_paths_matched)}")
print(f"✅ 매칭된 마스크 수: {len(mat_paths_matched)}")
print(f"🧷 공통된 파일 prefix 예시: {common_keys[:3]}")

# 11. Train/Validation 분리 (필요 시)
if len(img_paths_matched) == 0:
    raise ValueError("❌ 이미지와 GroundTruth가 매칭되지 않았습니다. 파일 이름을 확인하세요.")

tr = list(range(0, int(0.8 * len(img_paths_matched))))
va = list(range(int(0.8 * len(img_paths_matched)), len(img_paths_matched)))

print(f"\n🧪 Train indices: {min(tr)} ~ {max(tr)}")
print(f"🧪 Validation indices: {min(va)} ~ {max(va)}")

# 12. 내부 폴더 구조 출력
print("\n📂 'CrackForest-dataset' 폴더 내부 구조:")
for root, dirs, files in os.walk(dataset_dir):
    level = root.replace(dataset_dir, '').count(os.sep)
    indent = ' ' * 2 * level
    print(f"{indent}{os.path.basename(root)}/")
    subindent = ' ' * 2 * (level + 1)
    for f in files[:5]:
        print(f"{subindent}{f}")
    if len(files) > 5:
        print(f"{subindent}... 외 {len(files)-5}개 더")


▶️ 압축 파일을 선택하세요 (.zip)


Saving CrackForest-dataset-master.zip to CrackForest-dataset-master.zip
✅ 업로드 완료: CrackForest-dataset-master.zip
✅ 압축 해제 완료: '.'에 내용이 풀렸습니다
📁 폴더 이름을 'CrackForest-dataset-master' ➜ 'CrackForest-dataset' 으로 변경했습니다.
✅ 데이터셋 루트: ./CrackForest-dataset
📁 이미지 폴더: ./CrackForest-dataset/image
📁 마스크 폴더: ./CrackForest-dataset/groundTruth
📁 Segmentation 폴더: ./CrackForest-dataset/seg

✅ 이미지 디렉토리 경로: ./CrackForest-dataset/image
🖼️ 감지된 이미지 수: 155
✅ 이미지 샘플: ['./CrackForest-dataset/image/001.jpg', './CrackForest-dataset/image/002.jpg', './CrackForest-dataset/image/003.jpg']
📐 감지된 GroundTruth 수: 118

✅ 매칭된 이미지 수: 118
✅ 매칭된 마스크 수: 118
🧷 공통된 파일 prefix 예시: ['001', '002', '003']

🧪 Train indices: 0 ~ 93
🧪 Validation indices: 94 ~ 117

📂 'CrackForest-dataset' 폴더 내부 구조:
CrackForest-dataset/
  README.md
  seg/
    047.seg
    010.seg
    114.seg
    086.seg
    103.seg
    ... 외 113개 더
  groundTruth/
    090.mat
    023.mat
    075.mat
    012.mat
    071.mat
    ... 외 113개 더
  image/
    324.jpg
    046.jpg
  

In [2]:
#1-1 아래를 실행해야 훈련셀이 실됨
#데이터셋 경로설정

!pip install -q scipy scikit-learn tensorboard

import os, random, zipfile, urllib.request
import numpy as np
import scipy.io
from PIL import Image
import matplotlib.pyplot as plt
from tqdm import tqdm

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from sklearn.model_selection import StratifiedKFold
from torch.utils.tensorboard import SummaryWriter

# 경로 설정
root = '/content/CrackForest-dataset'
img_dir = os.path.join(root, 'image')
gt_dir = os.path.join(root, 'groundTruth')
os.makedirs('runs', exist_ok=True)
os.makedirs('saved_models', exist_ok=True)

# 데이터셋 다운로드
if not os.path.exists(root):
    url = 'https://www.dropbox.com/s/06e4r7dk6n31xgk/CrackForest-dataset.zip?dl=1'
    zip_path = '/content/CrackForest.zip'
    urllib.request.urlretrieve(url, zip_path)
    with zipfile.ZipFile(zip_path, 'r') as zip_ref:
        zip_ref.extractall('/content')

In [3]:
#2 시드 및 디바이스 설정
import random
import numpy as np
import torch

def seed_everything(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)

seed_everything()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"사용 중인 디바이스: {device}")


사용 중인 디바이스: cuda


In [4]:
#3 파일 리스트 필터링
def get_file_num(filename):
    return os.path.splitext(filename)[0]

all_imgs = sorted(os.listdir(img_dir))
all_mats = sorted(os.listdir(gt_dir))
mat_nums = set(get_file_num(f) for f in all_mats)

filtered_imgs = [f for f in all_imgs if get_file_num(f) in mat_nums]
filtered_mats = [f for f in all_mats if get_file_num(f) in [get_file_num(i) for i in filtered_imgs]]

img_paths = [os.path.join(img_dir, f) for f in filtered_imgs]
mat_paths = [os.path.join(gt_dir, f) for f in filtered_mats]
labels = [1] * len(img_paths)

print(f"전체 이미지 수: {len(all_imgs)}")
print(f"groundTruth .mat 파일 수: {len(all_mats)}")
print(f"필터링된 이미지 수: {len(img_paths)}")


전체 이미지 수: 156
groundTruth .mat 파일 수: 118
필터링된 이미지 수: 118


In [5]:
#4 CrackDataset 클래스 정의
from torch.utils.data import Dataset
from PIL import Image
import numpy as np
import scipy.io
from torchvision import transforms

class CrackDataset(Dataset):
    def __init__(self, image_paths, mat_paths, transform=None):
        self.image_paths = image_paths
        self.mat_paths = mat_paths
        self.transform = transform

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")

        try:
            mat = scipy.io.loadmat(self.mat_paths[idx])
            gt = mat.get("groundTruth")

            if gt is None:
                raise ValueError("groundTruth 키가 없음")

            # (1,1) 구조체 배열에서 Boundaries 추출
            boundaries_obj_array = gt[0, 0]['Boundaries']  # (1,) object array
            mask_array = np.array(boundaries_obj_array[0])  # 실제 (H,W) numpy 배열

        except Exception as e:
            raise ValueError(f"Mask 추출 오류 - {self.mat_paths[idx]}: {e}")

        # 마스크 후처리
        mask = Image.fromarray((mask_array > 0.5).astype(np.uint8) * 255)

        # 이미지 및 마스크 리사이즈
        image = image.resize((64, 64))
        mask = mask.resize((64, 64))

        # Tensor 변환
        image = self.transform(image) if self.transform else transforms.ToTensor()(image)
        mask = transforms.ToTensor()(mask)  # → [0,1] 범위 float tensor

        #return image, (mask > 0.5).float()
        return image, mask  # threshold 적용 X


In [6]:
#5 모델정의 UNet + CBAM (Channel & Spatial Attention)
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    def __init__(self, in_planes, ratio=8):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)
        self.fc = nn.Sequential(
            nn.Conv2d(in_planes, in_planes // ratio, 1, bias=False),
            nn.ReLU(),
            nn.Conv2d(in_planes // ratio, in_planes, 1, bias=False)
        )
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = self.fc(self.avg_pool(x))
        max_out = self.fc(self.max_pool(x))
        return self.sigmoid(avg_out + max_out) * x

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        avg_out = torch.mean(x, dim=1, keepdim=True)
        max_out, _ = torch.max(x, dim=1, keepdim=True)
        x_cat = torch.cat([avg_out, max_out], dim=1)
        return self.sigmoid(self.conv(x_cat)) * x

class CBAM(nn.Module):
    def __init__(self, in_planes):
        super().__init__()
        self.channel_attention = ChannelAttention(in_planes)
        self.spatial_attention = SpatialAttention()

    def forward(self, x):
        x = self.channel_attention(x)
        x = self.spatial_attention(x)
        return x

#격자무늬 해결중
class UNetWithAttention(nn.Module):
    def __init__(self):
        super().__init__()

        def conv_block(in_ch, out_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True)
            )

        # 인코더
        self.enc1 = nn.Sequential(conv_block(3, 64), CBAM(64))
        self.enc2 = nn.Sequential(conv_block(64, 128), CBAM(128))
        self.enc3 = nn.Sequential(conv_block(128, 256), CBAM(256))
        self.pool = nn.MaxPool2d(2)

        # Bottleneck
        self.mid = conv_block(256, 512)

        # 업샘플링: ConvTranspose → Upsample + Conv2d
        self.up2 = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(512, 256, kernel_size=1)
        )
        self.dec2 = conv_block(512, 256)

        self.up1 = nn.Sequential(
            nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False),
            nn.Conv2d(256, 128, kernel_size=1)
        )
        self.dec1 = conv_block(256, 128)

        # 출력 레이어
        self.final = nn.Conv2d(128, 1, kernel_size=1)

    def forward(self, x):
        e1 = self.enc1(x)                   # (B, 64, 64, 64)
        e2 = self.enc2(self.pool(e1))      # (B, 128, 32, 32)
        e3 = self.enc3(self.pool(e2))      # (B, 256, 16, 16)
        m = self.mid(self.pool(e3))        # (B, 512, 8, 8)

        d2 = self.up2(m)                   # (B, 256, 16, 16)
        d2 = self.dec2(torch.cat([d2, e3], dim=1))  # (B, 256, 16, 16)

        d1 = self.up1(d2)                  # (B, 128, 32, 32)
        d1 = self.dec1(torch.cat([d1, e2], dim=1))  # (B, 128, 32, 32)

        #out = torch.sigmoid(self.final(d1))  # (B, 1, 32, 32)
        #BCEDiceLoss와 모델 추론 시점에 다시 sigmoid를 중복호출 제거
        out = self.final(d1)

        # 최종적으로 입력 크기(64x64)에 맞춰보정
        #데이터셋에서 resize를 먼저 했기 때문에 모델 출력과 Ground Truth는 이미 같은 크기 그냥둬도 무방
        #이 경우 F.interpolate(..., size=(64,64))는 중복일 수 있습니다
        return F.interpolate(out, size=(64, 64), mode='bilinear', align_corners=False)

In [7]:
#6 손실함수 및 메트릭
import torch
import torch.nn as nn

class DiceLoss(nn.Module):
    def forward(self, inputs, targets, smooth=1e-6):
        inputs = inputs.view(-1)
        targets = targets.view(-1)
        intersection = (inputs * targets).sum()
        return 1 - (2 * intersection + smooth) / (inputs.sum() + targets.sum() + smooth)

class BCEDiceLoss(nn.Module):
    def __init__(self):
        super().__init__()
        self.bce = nn.BCELoss()
        self.dice = DiceLoss()

    def forward(self, inputs, targets):
        inputs = torch.sigmoid(inputs)  #모델 출력은 로짓(logits)이고 확률로 변환
        return self.bce(inputs, targets) + self.dice(inputs, targets)

class IoUMetric:
    def __call__(self, preds, masks):
        preds = (preds > 0.5).float()
        masks = masks.float()
        intersection = (preds * masks).sum((1, 2, 3))
        union = (preds + masks - preds * masks).sum((1, 2, 3))
        iou = (intersection + 1e-6) / (union + 1e-6)
        return iou.mean().item()


In [8]:
#7 기본 Transform 설정정
transform = transforms.Compose([
    transforms.ToTensor(),
])


In [9]:
# 8 K-Fold 학습 및 평가 루프 함수 정의
#train_and_evaluate()
from torch.utils.tensorboard import SummaryWriter
from sklearn.model_selection import StratifiedKFold
import torch
from torch.utils.data import Dataset, DataLoader

def train_and_evaluate(img_paths, mat_paths, labels, num_epochs=20, batch_size=32, learning_rate=1e-3, k_folds=5):
    skf = StratifiedKFold(n_splits=k_folds, shuffle=True, random_state=42)
    criterion = BCEDiceLoss()
    metric_fn = IoUMetric()

    for fold, (train_idx, val_idx) in enumerate(skf.split(img_paths, labels)):
        print(f"\n🔁 Fold {fold + 1}/{k_folds}")
        writer = SummaryWriter(log_dir=f"runs/fold_{fold+1}")

        train_dataset = CrackDataset([img_paths[i] for i in train_idx],
                                     [mat_paths[i] for i in train_idx],
                                     transform=transform)
        val_dataset = CrackDataset([img_paths[i] for i in val_idx],
                                   [mat_paths[i] for i in val_idx],
                                   transform=transform)

        train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=2)
        val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=2)

        model = UNetWithAttention().to(device)
        optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

        best_iou = 0

        for epoch in range(num_epochs):
            model.train()
            train_loss = 0
            for batch_idx, (images, masks) in enumerate(tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs} [Train]")):
            #for images, masks in tqdm(train_loader, desc=f"Epoch {epoch+1}/{num_epochs} [Train]"):
                images, masks = images.to(device), masks.to(device)
                preds = model(images)
                loss = criterion(preds, masks)

                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
                train_loss += loss.item()

                # 🔍 여기서 예측 마스크 통계 확인 (첫 배치만)
                #if batch_idx % 10 == 0:  #정해진 간격마다 출력
                #if batch_idx == 0:  #매 에폭 첫 배치만 출력
                #if batch_idx == 0 and epoch == 0:
                if batch_idx == 0 and epoch % 5 == 0:  #학습진행상태에 따른 변화확인
                    print(f"\n🔍 Epoch {epoch+1}, Batch {batch_idx+1}")
                    print("  Predicted mask stats:")
                    print("    Min: ", preds.min().item())
                    print("    Max: ", preds.max().item())
                    print("    Mean:", preds.mean().item())

            train_loss /= len(train_loader)

            # Validation
            model.eval()
            val_loss = 0
            iou_score = 0

            with torch.no_grad():
                for images, masks in tqdm(val_loader, desc=f"Epoch {epoch+1}/{num_epochs} [Val]"):
                    images, masks = images.to(device), masks.to(device)
                    preds = model(images)
                    loss = criterion(preds, masks)
                    val_loss += loss.item()
                    iou_score += metric_fn(preds, masks)

            val_loss /= len(val_loader)
            iou_score /= len(val_loader)

            print(f"[Fold {fold+1}] Epoch {epoch+1} | Train Loss: {train_loss:.4f} | Val Loss: {val_loss:.4f} | IoU: {iou_score:.4f}")
            writer.add_scalars("Loss", {"train": train_loss, "val": val_loss}, epoch)
            writer.add_scalar("IoU", iou_score, epoch)

            # 모델 저장
            if iou_score > best_iou:
                best_iou = iou_score
                model_path = f"saved_models/best_model_fold_{fold+1}.pth"
                torch.save(model.state_dict(), model_path)
                print(f"✅ Best model saved to {model_path} (IoU={best_iou:.4f})")

        writer.close()


In [10]:
## 9 추론 및 시각화 함수 2D로 추출로 수정 코드
#segmentation 추가
#자동 vmin/max + contrast 스케일링 포함
def predict_and_visualize(model_path, img_path, mat_path):
    import matplotlib.pyplot as plt
    import scipy.io
    from torchvision import transforms
    from PIL import Image
    import numpy as np
    import torch
    from tqdm import tqdm

    # 이미지 로드 및 전처리
    img = Image.open(img_path).convert("RGB")
    transform = transforms.Compose([
        transforms.Resize((64, 64)),
        transforms.ToTensor()
    ])
    input_tensor = transform(img).unsqueeze(0)  # (1, 3, 64, 64)

    # 모델 로드
    model = UNetWithAttention()  # 사용 중인 모델 클래스명에 맞게 수정
    model.load_state_dict(torch.load(model_path, map_location=torch.device("cpu")))
    model.eval()

    # 예측 (logits → 확률 → numpy)
    with torch.no_grad():
        output = model(input_tensor)
        pred_mask = torch.sigmoid(output).squeeze().numpy()

    # .mat 파일에서 GT 추출
    mat = scipy.io.loadmat(mat_path)
    gt_struct = mat["groundTruth"][0, 0]

    # Boundaries
    boundaries = gt_struct["Boundaries"]
    if isinstance(boundaries, np.ndarray) and boundaries.dtype == object:
        boundary_mask = boundaries[0, 0]
    else:
        boundary_mask = boundaries
    if isinstance(boundary_mask, np.ndarray) and boundary_mask.dtype == object:
        boundary_mask = boundary_mask[0]

    if not isinstance(boundary_mask, np.ndarray) or boundary_mask.ndim != 2:
        raise ValueError(f"Invalid boundary mask shape: {boundary_mask.shape if isinstance(boundary_mask, np.ndarray) else type(boundary_mask)}")

    # Segmentation
    segmentation = gt_struct["Segmentation"]
    if isinstance(segmentation, np.ndarray) and segmentation.dtype == object:
        seg_mask = segmentation[0, 0]
    else:
        seg_mask = segmentation
    if isinstance(seg_mask, np.ndarray) and seg_mask.dtype == object:
        seg_mask = seg_mask[0]

    if not isinstance(seg_mask, np.ndarray) or seg_mask.ndim != 2:
        raise ValueError(f"Invalid segmentation mask shape: {seg_mask.shape if isinstance(seg_mask, np.ndarray) else type(seg_mask)}")

    # GT 마스크 리사이즈
    boundary_resized = Image.fromarray((boundary_mask * 255).astype(np.uint8)).resize((64, 64))
    seg_resized = Image.fromarray((seg_mask * 255).astype(np.uint8)).resize((64, 64))
    boundary_resized = np.array(boundary_resized) / 255.0
    seg_resized = np.array(seg_resized) / 255.0

    # Binary threshold for prediction Thresholding (선택적 추가)
    pred_binary = (pred_mask > 0.5).astype(np.uint8)

    # 예측 마스크 contrast 조정용 스케일링 (선택사항)
    contrast_scale = 20

    # 시각화
    fig, axes = plt.subplots(1, 5, figsize=(20, 4))

    axes[0].imshow(img)
    axes[0].set_title("Original Image")
    axes[0].axis("off")

    axes[1].imshow(boundary_resized, cmap="gray")
    axes[1].set_title("Ground Truth (Boundary)")
    axes[1].axis("off")

    axes[2].imshow(seg_resized, cmap="gray")
    axes[2].set_title("Ground Truth (Segmentation)")
    axes[2].axis("off")

    #axes[3].imshow(pred_mask, cmap="gray", vmin=0, vmax=1)
    #axes[3].set_title("Predicted (Prob)")
    #axes[3].axis("off")

    axes[3].imshow(pred_mask * contrast_scale, cmap="gray")
    axes[3].set_title("Predicted (Prob x {})".format(contrast_scale))
    axes[3].axis("off")

    axes[4].imshow(pred_binary, cmap="gray")
    axes[4].set_title("Predicted (Binary > 0.5)")
    axes[4].axis("off")

    plt.tight_layout()
    plt.show()

In [None]:
#10 최종버전 학습 실행 epoch 40
train_and_evaluate(img_paths_matched, mat_paths_matched, labels, num_epochs=40, batch_size=32, learning_rate=1e-3, k_folds=5)

# 추론 예시 (학습 후 사용)
predict_and_visualize("saved_models/best_model_fold_1.pth", img_paths_matched[0], mat_paths_matched[0])



🔁 Fold 1/5


Epoch 1/40 [Train]:  33%|███▎      | 1/3 [00:03<00:06,  3.01s/it]


🔍 Epoch 1, Batch 1
  Predicted mask stats:
    Min:  -1.0045570135116577
    Max:  3.224926710128784
    Mean: 0.03939331695437431


Epoch 1/40 [Train]: 100%|██████████| 3/3 [00:03<00:00,  1.18s/it]
Epoch 1/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  1.67it/s]


[Fold 1] Epoch 1 | Train Loss: 1.7313 | Val Loss: 1.7561 | IoU: 1.0000
✅ Best model saved to saved_models/best_model_fold_1.pth (IoU=1.0000)


Epoch 2/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  3.49it/s]
Epoch 2/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  2.77it/s]


[Fold 1] Epoch 2 | Train Loss: 1.5106 | Val Loss: 1.7761 | IoU: 1.0000


Epoch 3/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  3.72it/s]
Epoch 3/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  2.95it/s]


[Fold 1] Epoch 3 | Train Loss: 1.3887 | Val Loss: 1.7070 | IoU: 1.0000


Epoch 4/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.06it/s]
Epoch 4/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.72it/s]


[Fold 1] Epoch 4 | Train Loss: 1.2949 | Val Loss: 1.5989 | IoU: 1.0000


Epoch 5/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.96it/s]
Epoch 5/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.77it/s]


[Fold 1] Epoch 5 | Train Loss: 1.2604 | Val Loss: 1.4574 | IoU: 1.0000


Epoch 6/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.37it/s]



🔍 Epoch 6, Batch 1
  Predicted mask stats:
    Min:  -2.180427074432373
    Max:  -0.4370596408843994
    Mean: -1.3873008489608765


Epoch 6/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.38it/s]


[Fold 1] Epoch 6 | Train Loss: 1.2244 | Val Loss: 1.3850 | IoU: 1.0000


Epoch 7/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.37it/s]
Epoch 7/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.87it/s]


[Fold 1] Epoch 7 | Train Loss: 1.1968 | Val Loss: 1.3465 | IoU: 1.0000


Epoch 8/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.95it/s]
Epoch 8/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.63it/s]


[Fold 1] Epoch 8 | Train Loss: 1.1708 | Val Loss: 1.3105 | IoU: 1.0000


Epoch 9/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.33it/s]
Epoch 9/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.36it/s]


[Fold 1] Epoch 9 | Train Loss: 1.1608 | Val Loss: 1.2761 | IoU: 1.0000


Epoch 10/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.36it/s]
Epoch 10/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.72it/s]


[Fold 1] Epoch 10 | Train Loss: 1.1397 | Val Loss: 1.2473 | IoU: 1.0000


Epoch 11/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.07it/s]



🔍 Epoch 11, Batch 1
  Predicted mask stats:
    Min:  -2.689789295196533
    Max:  -0.5752020478248596
    Mean: -1.982884168624878


Epoch 11/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.77it/s]


[Fold 1] Epoch 11 | Train Loss: 1.1238 | Val Loss: 1.2221 | IoU: 1.0000


Epoch 12/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.25it/s]
Epoch 12/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.65it/s]


[Fold 1] Epoch 12 | Train Loss: 1.1087 | Val Loss: 1.2018 | IoU: 1.0000


Epoch 13/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.31it/s]
Epoch 13/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.56it/s]


[Fold 1] Epoch 13 | Train Loss: 1.0974 | Val Loss: 1.1765 | IoU: 1.0000


Epoch 14/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.19it/s]
Epoch 14/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.26it/s]


[Fold 1] Epoch 14 | Train Loss: 1.0847 | Val Loss: 1.1460 | IoU: 1.0000


Epoch 15/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  3.80it/s]
Epoch 15/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.14it/s]


[Fold 1] Epoch 15 | Train Loss: 1.0783 | Val Loss: 1.1255 | IoU: 1.0000


Epoch 16/40 [Train]:  67%|██████▋   | 2/3 [00:00<00:00,  3.98it/s]


🔍 Epoch 16, Batch 1
  Predicted mask stats:
    Min:  -3.558109760284424
    Max:  -1.1151328086853027
    Mean: -2.5579943656921387


Epoch 16/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  4.29it/s]
Epoch 16/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.64it/s]


[Fold 1] Epoch 16 | Train Loss: 1.0731 | Val Loss: 1.1135 | IoU: 1.0000


Epoch 17/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.98it/s]
Epoch 17/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.64it/s]


[Fold 1] Epoch 17 | Train Loss: 1.0639 | Val Loss: 1.1020 | IoU: 1.0000


Epoch 18/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.04it/s]
Epoch 18/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.84it/s]


[Fold 1] Epoch 18 | Train Loss: 1.0611 | Val Loss: 1.0884 | IoU: 1.0000


Epoch 19/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  3.64it/s]
Epoch 19/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  2.97it/s]


[Fold 1] Epoch 19 | Train Loss: 1.0562 | Val Loss: 1.0772 | IoU: 1.0000


Epoch 20/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.12it/s]
Epoch 20/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.66it/s]


[Fold 1] Epoch 20 | Train Loss: 1.0524 | Val Loss: 1.0703 | IoU: 1.0000


Epoch 21/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.08it/s]



🔍 Epoch 21, Batch 1
  Predicted mask stats:
    Min:  -3.933854341506958
    Max:  -1.6097354888916016
    Mean: -3.000680446624756


Epoch 21/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.62it/s]


[Fold 1] Epoch 21 | Train Loss: 1.0485 | Val Loss: 1.0618 | IoU: 1.0000


Epoch 22/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.46it/s]
Epoch 22/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.77it/s]


[Fold 1] Epoch 22 | Train Loss: 1.0460 | Val Loss: 1.0550 | IoU: 1.0000


Epoch 23/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.30it/s]
Epoch 23/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.23it/s]


[Fold 1] Epoch 23 | Train Loss: 1.0444 | Val Loss: 1.0522 | IoU: 1.0000


Epoch 24/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.21it/s]
Epoch 24/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.77it/s]


[Fold 1] Epoch 24 | Train Loss: 1.0399 | Val Loss: 1.0511 | IoU: 1.0000


Epoch 25/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.04it/s]
Epoch 25/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.84it/s]


[Fold 1] Epoch 25 | Train Loss: 1.0387 | Val Loss: 1.0444 | IoU: 1.0000


Epoch 26/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.06it/s]



🔍 Epoch 26, Batch 1
  Predicted mask stats:
    Min:  -4.2526068687438965
    Max:  -2.535902738571167
    Mean: -3.3191471099853516


Epoch 26/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.59it/s]


[Fold 1] Epoch 26 | Train Loss: 1.0354 | Val Loss: 1.0401 | IoU: 1.0000


Epoch 27/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.39it/s]
Epoch 27/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.68it/s]


[Fold 1] Epoch 27 | Train Loss: 1.0353 | Val Loss: 1.0385 | IoU: 1.0000


Epoch 28/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.07it/s]
Epoch 28/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.75it/s]


[Fold 1] Epoch 28 | Train Loss: 1.0320 | Val Loss: 1.0375 | IoU: 1.0000


Epoch 29/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.37it/s]
Epoch 29/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.89it/s]


[Fold 1] Epoch 29 | Train Loss: 1.0307 | Val Loss: 1.0350 | IoU: 1.0000


Epoch 30/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  4.28it/s]
Epoch 30/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.35it/s]


[Fold 1] Epoch 30 | Train Loss: 1.0302 | Val Loss: 1.0333 | IoU: 1.0000


Epoch 31/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  4.76it/s]


🔍 Epoch 31, Batch 1
  Predicted mask stats:
    Min:  -4.657355785369873
    Max:  -2.731320381164551
    Mean: -3.5957584381103516


Epoch 31/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  3.88it/s]
Epoch 31/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.17it/s]


[Fold 1] Epoch 31 | Train Loss: 1.0274 | Val Loss: 1.0319 | IoU: 1.0000


Epoch 32/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.39it/s]
Epoch 32/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.83it/s]


[Fold 1] Epoch 32 | Train Loss: 1.0266 | Val Loss: 1.0304 | IoU: 1.0000


Epoch 33/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.14it/s]
Epoch 33/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.68it/s]


[Fold 1] Epoch 33 | Train Loss: 1.0259 | Val Loss: 1.0286 | IoU: 1.0000


Epoch 34/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.39it/s]
Epoch 34/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.84it/s]


[Fold 1] Epoch 34 | Train Loss: 1.0245 | Val Loss: 1.0273 | IoU: 1.0000


Epoch 35/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.47it/s]
Epoch 35/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.72it/s]


[Fold 1] Epoch 35 | Train Loss: 1.0233 | Val Loss: 1.0260 | IoU: 1.0000


Epoch 36/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.02it/s]



🔍 Epoch 36, Batch 1
  Predicted mask stats:
    Min:  -4.613113880157471
    Max:  -2.7895052433013916
    Mean: -3.788024663925171


Epoch 36/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.78it/s]


[Fold 1] Epoch 36 | Train Loss: 1.0221 | Val Loss: 1.0239 | IoU: 1.0000


Epoch 37/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.29it/s]
Epoch 37/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.70it/s]


[Fold 1] Epoch 37 | Train Loss: 1.0213 | Val Loss: 1.0225 | IoU: 1.0000


Epoch 38/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.28it/s]
Epoch 38/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.82it/s]


[Fold 1] Epoch 38 | Train Loss: 1.0198 | Val Loss: 1.0208 | IoU: 1.0000


Epoch 39/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.85it/s]
Epoch 39/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.68it/s]


[Fold 1] Epoch 39 | Train Loss: 1.0195 | Val Loss: 1.0202 | IoU: 1.0000


Epoch 40/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.12it/s]
Epoch 40/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.73it/s]


[Fold 1] Epoch 40 | Train Loss: 1.0185 | Val Loss: 1.0198 | IoU: 1.0000

🔁 Fold 2/5


Epoch 1/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.30it/s]



🔍 Epoch 1, Batch 1
  Predicted mask stats:
    Min:  -1.667335033416748
    Max:  0.7744898200035095
    Mean: -0.10266555845737457


Epoch 1/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.41it/s]


[Fold 2] Epoch 1 | Train Loss: 1.6304 | Val Loss: 1.6856 | IoU: 1.0000
✅ Best model saved to saved_models/best_model_fold_2.pth (IoU=1.0000)


Epoch 2/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.24it/s]
Epoch 2/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.78it/s]


[Fold 2] Epoch 2 | Train Loss: 1.4189 | Val Loss: 1.5547 | IoU: 1.0000


Epoch 3/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.18it/s]
Epoch 3/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.75it/s]


[Fold 2] Epoch 3 | Train Loss: 1.3193 | Val Loss: 1.3735 | IoU: 1.0000


Epoch 4/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.38it/s]
Epoch 4/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.53it/s]


[Fold 2] Epoch 4 | Train Loss: 1.2623 | Val Loss: 1.2550 | IoU: 1.0000


Epoch 5/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.16it/s]
Epoch 5/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.59it/s]


[Fold 2] Epoch 5 | Train Loss: 1.2224 | Val Loss: 1.1896 | IoU: 1.0000


Epoch 6/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.68it/s]


🔍 Epoch 6, Batch 1
  Predicted mask stats:
    Min:  -2.0704104900360107
    Max:  -0.8188629150390625
    Mean: -1.503871202468872


Epoch 6/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.41it/s]
Epoch 6/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.23it/s]


[Fold 2] Epoch 6 | Train Loss: 1.1932 | Val Loss: 1.1553 | IoU: 1.0000


Epoch 7/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  3.86it/s]
Epoch 7/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.31it/s]


[Fold 2] Epoch 7 | Train Loss: 1.1688 | Val Loss: 1.1332 | IoU: 1.0000


Epoch 8/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  4.13it/s]
Epoch 8/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.37it/s]


[Fold 2] Epoch 8 | Train Loss: 1.1486 | Val Loss: 1.1209 | IoU: 1.0000


Epoch 9/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.19it/s]
Epoch 9/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.75it/s]


[Fold 2] Epoch 9 | Train Loss: 1.1318 | Val Loss: 1.1141 | IoU: 1.0000


Epoch 10/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.37it/s]
Epoch 10/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.70it/s]


[Fold 2] Epoch 10 | Train Loss: 1.1177 | Val Loss: 1.1057 | IoU: 1.0000


Epoch 11/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.14it/s]



🔍 Epoch 11, Batch 1
  Predicted mask stats:
    Min:  -2.596468448638916
    Max:  -1.5297012329101562
    Mean: -2.1608099937438965


Epoch 11/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.76it/s]


[Fold 2] Epoch 11 | Train Loss: 1.1058 | Val Loss: 1.1006 | IoU: 1.0000


Epoch 12/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.05it/s]
Epoch 12/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.62it/s]


[Fold 2] Epoch 12 | Train Loss: 1.0955 | Val Loss: 1.1001 | IoU: 1.0000


Epoch 13/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.23it/s]
Epoch 13/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  1.60it/s]


[Fold 2] Epoch 13 | Train Loss: 1.0867 | Val Loss: 1.1021 | IoU: 1.0000


Epoch 14/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.21it/s]
Epoch 14/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  2.05it/s]


[Fold 2] Epoch 14 | Train Loss: 1.0790 | Val Loss: 1.1133 | IoU: 1.0000


Epoch 15/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  5.37it/s]
Epoch 15/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  2.03it/s]


[Fold 2] Epoch 15 | Train Loss: 1.0723 | Val Loss: 1.1377 | IoU: 1.0000


Epoch 16/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.24it/s]



🔍 Epoch 16, Batch 1
  Predicted mask stats:
    Min:  -3.027567148208618
    Max:  -2.0804381370544434
    Mean: -2.653106212615967


Epoch 16/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.54it/s]


[Fold 2] Epoch 16 | Train Loss: 1.0664 | Val Loss: 1.1536 | IoU: 1.0000


Epoch 17/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.15it/s]
Epoch 17/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.65it/s]


[Fold 2] Epoch 17 | Train Loss: 1.0613 | Val Loss: 1.1521 | IoU: 1.0000


Epoch 18/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.21it/s]
Epoch 18/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.85it/s]


[Fold 2] Epoch 18 | Train Loss: 1.0568 | Val Loss: 1.1374 | IoU: 1.0000


Epoch 19/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.40it/s]
Epoch 19/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.36it/s]


[Fold 2] Epoch 19 | Train Loss: 1.0527 | Val Loss: 1.1090 | IoU: 1.0000


Epoch 20/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.18it/s]
Epoch 20/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.96it/s]


[Fold 2] Epoch 20 | Train Loss: 1.0491 | Val Loss: 1.0881 | IoU: 1.0000


Epoch 21/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.24it/s]


🔍 Epoch 21, Batch 1
  Predicted mask stats:
    Min:  -3.4123892784118652
    Max:  -2.438033103942871
    Mean: -3.0384531021118164


Epoch 21/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  4.90it/s]
Epoch 21/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.21it/s]


[Fold 2] Epoch 21 | Train Loss: 1.0459 | Val Loss: 1.0742 | IoU: 1.0000


Epoch 22/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  4.14it/s]
Epoch 22/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  3.31it/s]


[Fold 2] Epoch 22 | Train Loss: 1.0430 | Val Loss: 1.0635 | IoU: 1.0000


Epoch 23/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  4.24it/s]
Epoch 23/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.65it/s]


[Fold 2] Epoch 23 | Train Loss: 1.0404 | Val Loss: 1.0555 | IoU: 1.0000


Epoch 24/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.06it/s]
Epoch 24/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.15it/s]


[Fold 2] Epoch 24 | Train Loss: 1.0380 | Val Loss: 1.0494 | IoU: 1.0000


Epoch 25/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.18it/s]
Epoch 25/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.56it/s]


[Fold 2] Epoch 25 | Train Loss: 1.0358 | Val Loss: 1.0449 | IoU: 1.0000


Epoch 26/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.08it/s]



🔍 Epoch 26, Batch 1
  Predicted mask stats:
    Min:  -3.705808639526367
    Max:  -2.699151039123535
    Mean: -3.3544163703918457


Epoch 26/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.70it/s]


[Fold 2] Epoch 26 | Train Loss: 1.0338 | Val Loss: 1.0411 | IoU: 1.0000


Epoch 27/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.12it/s]
Epoch 27/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.66it/s]


[Fold 2] Epoch 27 | Train Loss: 1.0320 | Val Loss: 1.0378 | IoU: 1.0000


Epoch 28/40 [Train]: 100%|██████████| 3/3 [00:00<00:00,  6.26it/s]
Epoch 28/40 [Val]: 100%|██████████| 1/1 [00:00<00:00,  4.59it/s]


[Fold 2] Epoch 28 | Train Loss: 1.0303 | Val Loss: 1.0352 | IoU: 1.0000


Epoch 29/40 [Train]:  33%|███▎      | 1/3 [00:00<00:00,  3.10it/s]

## Setup

pip install `ultralytics` and [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) and check software and hardware.

[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://www.pepy.tech/projects/ultralytics) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/)

In [None]:
!pip install ultralytics
import ultralytics
ultralytics.checks()

## Dataset YAML File

A YAML (Yet Another Markup Language) file defines the dataset configuration, including paths, classes, and other pertinent details. 😀

```yaml
# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Crack-seg dataset by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/segment/crack-seg/
# Example usage: yolo train data=crack-seg.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── crack-seg  ← downloads here (91.2 MB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/crack-seg # dataset root dir
train: train/images # train images (relative to 'path') 3717 images
val: valid/images # val images (relative to 'path') 112 images
test: test/images # test images (relative to 'path') 200 images

# Classes
names:
  0: crack

# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/crack-seg.zip
```

## Train

Train YOLO11 on [Detect](https://docs.ultralytics.com/tasks/detect/), [Segment](https://docs.ultralytics.com/tasks/segment/), [Classify](https://docs.ultralytics.com/tasks/classify/) and [Pose](https://docs.ultralytics.com/tasks/pose/) datasets. See [YOLO11 Train Docs](https://docs.ultralytics.com/modes/train/) for more information.

In [None]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="crack-seg.yaml", epochs=3, imgsz=640, batch=64, workers=64)

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n-seg.pt to 'yolo11n-seg.pt'...


100%|██████████| 5.90M/5.90M [00:00<00:00, 99.5MB/s]


Ultralytics 8.3.70 🚀 Python-3.11.11 torch-2.5.1+cu124 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0mtask=segment, mode=train, model=yolo11n-seg.pt, data=crack-seg.yaml, epochs=3, time=None, patience=100, batch=64, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=64, project=None, name=train, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show

100%|██████████| 91.1M/91.1M [00:02<00:00, 40.2MB/s]
Unzipping /content/datasets/crack-seg.zip to /content/datasets/crack-seg...: 100%|██████████| 8061/8061 [00:01<00:00, 4783.99file/s]

Dataset download success ✅ (5.0s), saved to [1m/content/datasets[0m






Downloading https://ultralytics.com/assets/Arial.ttf to '/root/.config/Ultralytics/Arial.ttf'...


100%|██████████| 755k/755k [00:00<00:00, 41.6MB/s]


Overriding model.yaml nc=80 with nc=1

                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      6640  ultralytics.nn.modules.block.C3k2            [32, 64, 1, False, 0.25]      
  3                  -1  1     36992  ultralytics.nn.modules.conv.Conv             [64, 64, 3, 2]                
  4                  -1  1     26080  ultralytics.nn.modules.block.C3k2            [64, 128, 1, False, 0.25]     
  5                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]              
  6                  -1  1     87040  ultralytics.nn.modules.block.C3k2            [128, 128, 1, True]           
  7                  -1  1    295424  ultralytics

100%|██████████| 5.35M/5.35M [00:00<00:00, 123MB/s]


[34m[1mAMP: [0mchecks passed ✅


[34m[1mtrain: [0mScanning /content/datasets/crack-seg/train/labels... 3717 images, 0 backgrounds, 0 corrupt: 100%|██████████| 3717/3717 [00:03<00:00, 1105.57it/s]


[34m[1mtrain: [0mNew cache created: /content/datasets/crack-seg/train/labels.cache
[34m[1malbumentations: [0mBlur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, num_output_channels=3, method='weighted_average'), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))


  check_for_updates()
[34m[1mval: [0mScanning /content/datasets/crack-seg/valid/labels... 200 images, 1 backgrounds, 0 corrupt: 100%|██████████| 200/200 [00:00<00:00, 706.43it/s]


[34m[1mval: [0mNew cache created: /content/datasets/crack-seg/valid/labels.cache
Plotting labels to runs/segment/train/labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.002, momentum=0.9) with parameter groups 90 weight(decay=0.0), 101 weight(decay=0.0005), 100 bias(decay=0.0)
[34m[1mTensorBoard: [0mmodel graph visualization added ✅
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to [1mruns/segment/train[0m
Starting training for 3 epochs...

      Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size


        1/3      11.3G      1.349        2.3      2.286      1.316         15        640: 100%|██████████| 59/59 [01:34<00:00,  1.61s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:04<00:00,  2.14s/it]

                   all        200        249      0.405      0.437      0.361      0.129      0.293      0.317      0.208     0.0421






      Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size


        2/3      11.3G      1.221      1.656      1.575      1.213          5        640: 100%|██████████| 59/59 [01:28<00:00,  1.49s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:06<00:00,  3.37s/it]

                   all        200        249      0.484       0.47      0.372      0.162      0.426      0.402      0.247     0.0643






      Epoch    GPU_mem   box_loss   seg_loss   cls_loss   dfl_loss  Instances       Size


        3/3      11.5G      1.155      1.627      1.323      1.185         11        640: 100%|██████████| 59/59 [01:27<00:00,  1.48s/it]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:06<00:00,  3.41s/it]

                   all        200        249      0.386      0.481      0.369      0.141      0.267      0.333      0.156     0.0343






3 epochs completed in 0.083 hours.
Optimizer stripped from runs/segment/train/weights/last.pt, 6.0MB
Optimizer stripped from runs/segment/train/weights/best.pt, 6.0MB

Validating runs/segment/train/weights/best.pt...
Ultralytics 8.3.70 🚀 Python-3.11.11 torch-2.5.1+cu124 CUDA:0 (Tesla T4, 15095MiB)
YOLO11n-seg summary (fused): 265 layers, 2,834,763 parameters, 0 gradients, 10.2 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95)     Mask(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:04<00:00,  2.02s/it]


                   all        200        249      0.485      0.476      0.371      0.161       0.42      0.398      0.243     0.0637
Speed: 0.2ms preprocess, 3.1ms inference, 0.0ms loss, 3.7ms postprocess per image
Results saved to [1mruns/segment/train[0m


![Dataset sample image](https://github.com/ultralytics/docs/releases/download/0/crack-segmentation-sample.avif)

## Predict

YOLO11 may be used directly in the Command Line Interface (CLI) with a yolo command for a variety of tasks and modes and accepts additional arguments, i.e. imgsz=640. See a full list of available [yolo arguments](https://docs.ultralytics.com/usage/cfg/) and other details in the [YOLO11 Predict Docs](https://docs.ultralytics.com/modes/train/).

In [None]:
from ultralytics import YOLO

# Load a model
model = YOLO("/path/to/best.pt")  # load a fine-tuned model

# Inference using the model (img/video/stream)
results = model.predict("https://github.com/ultralytics/assets/releases/download/v0.0.0/crack-on-wall.jpg", save=True)


Found https://github.com/ultralytics/assets/releases/download/v0.0.0/crack-on-wall.jpg locally at crack-on-wall.jpg
image 1/1 /content/crack-on-wall.jpg: 384x640 1 crack, 10.7ms
Speed: 2.3ms preprocess, 10.7ms inference, 3.3ms postprocess per image at shape (1, 3, 384, 640)
Results saved to [1mruns/segment/predict[0m


&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<img align="left" src="https://github.com/user-attachments/assets/9044c148-c157-47d6-87b4-5608593f5b70" width="600">

## Export

Export a YOLO11 model to any supported format below with the `format` argument, i.e. `format=onnx`. See [YOLO11 Export Docs](https://docs.ultralytics.com/modes/export/) for more information.

- 💡 ProTip: Export to [ONNX](https://docs.ultralytics.com/integrations/onnx/) or [OpenVINO](https://docs.ultralytics.com/integrations/openvino/) for up to 3x CPU speedup.  
- 💡 ProTip: Export to [TensorRT](https://docs.ultralytics.com/integrations/tensorrt/) for up to 5x GPU speedup.

| Format                                                                   | `format` Argument | Model                     | Metadata | Arguments                                                            |
|--------------------------------------------------------------------------|-------------------|---------------------------|----------|----------------------------------------------------------------------|
| [PyTorch](https://pytorch.org/)                                          | -                 | `yolo11n.pt`              | ✅        | -                                                                    |
| [TorchScript](https://docs.ultralytics.com/integrations/torchscript)     | `torchscript`     | `yolo11n.torchscript`     | ✅        | `imgsz`, `optimize`, `batch`                                         |
| [ONNX](https://docs.ultralytics.com/integrations/onnx)                   | `onnx`            | `yolo11n.onnx`            | ✅        | `imgsz`, `half`, `dynamic`, `simplify`, `opset`, `batch`             |
| [OpenVINO](https://docs.ultralytics.com/integrations/openvino)           | `openvino`        | `yolo11n_openvino_model/` | ✅        | `imgsz`, `half`, `dynamic`, `int8`, `batch`                          |
| [TensorRT](https://docs.ultralytics.com/integrations/tensorrt)           | `engine`          | `yolo11n.engine`          | ✅        | `imgsz`, `half`, `dynamic`, `simplify`, `workspace`, `int8`, `batch` |
| [CoreML](https://docs.ultralytics.com/integrations/coreml)               | `coreml`          | `yolo11n.mlpackage`       | ✅        | `imgsz`, `half`, `int8`, `nms`, `batch`                              |
| [TF SavedModel](https://docs.ultralytics.com/integrations/tf-savedmodel) | `saved_model`     | `yolo11n_saved_model/`    | ✅        | `imgsz`, `keras`, `int8`, `batch`                                    |
| [TF GraphDef](https://docs.ultralytics.com/integrations/tf-graphdef)     | `pb`              | `yolo11n.pb`              | ❌        | `imgsz`, `batch`                                                     |
| [TF Lite](https://docs.ultralytics.com/integrations/tflite)              | `tflite`          | `yolo11n.tflite`          | ✅        | `imgsz`, `half`, `int8`, `batch`                                     |
| [TF Edge TPU](https://docs.ultralytics.com/integrations/edge-tpu)        | `edgetpu`         | `yolo11n_edgetpu.tflite`  | ✅        | `imgsz`                                                              |
| [TF.js](https://docs.ultralytics.com/integrations/tfjs)                  | `tfjs`            | `yolo11n_web_model/`      | ✅        | `imgsz`, `half`, `int8`, `batch`                                     |
| [PaddlePaddle](https://docs.ultralytics.com/integrations/paddlepaddle)   | `paddle`          | `yolo11n_paddle_model/`   | ✅        | `imgsz`, `batch`                                                     |
| [MNN](https://docs.ultralytics.com/integrations/mnn)                     | `mnn`             | `yolo11n.mnn`             | ✅        | `imgsz`, `batch`, `int8`, `half`                                     |
| [NCNN](https://docs.ultralytics.com/integrations/ncnn)                   | `ncnn`            | `yolo11n_ncnn_model/`     | ✅        | `imgsz`, `half`, `batch`                                             |
| [IMX500](https://docs.ultralytics.com/integrations/sony-imx500)          | `imx`             | `yolov8n_imx_model/`      | ✅        | `imgsz`, `int8`                                                      |
| [RKNN](https://docs.ultralytics.com/integrations/rockchip-rknn)          | `rknn`            | `yolo11n_rknn_model/`     | ✅        | `imgsz`, `batch`, `name`                                             |

In [None]:
from ultralytics import YOLO

# Load a model
model = YOLO("/path/to/best.pt")  # load a custom trained model

# Export the model
model.export(format="torchscript")

Ultralytics 8.3.70 🚀 Python-3.11.11 torch-2.5.1+cu124 CPU (Intel Xeon 2.00GHz)
YOLO11n-seg summary (fused): 265 layers, 2,834,763 parameters, 0 gradients, 10.2 GFLOPs

[34m[1mPyTorch:[0m starting from '/content/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) ((1, 37, 8400), (1, 32, 160, 160)) (5.7 MB)

[34m[1mTorchScript:[0m starting export with torch 2.5.1+cu124...
[34m[1mTorchScript:[0m export success ✅ 4.8s, saved as '/content/best.torchscript' (11.4 MB)

Export complete (5.7s)
Results saved to [1m/content[0m
Predict:         yolo predict task=segment model=/content/best.torchscript imgsz=640  
Validate:        yolo val task=segment model=/content/best.torchscript imgsz=640 data=/usr/local/lib/python3.11/dist-packages/ultralytics/cfg/datasets/crack-seg.yaml  
Visualize:       https://netron.app


'/content/best.torchscript'