# 🚀 Google Colab Setup

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ogautier1980/sandbox-ml/blob/main/cours/12_vision_avancee/12_demo_segmentation.ipynb)

**If you are running this notebook on Google Colab**, run the following cell to install the dependencies.

In [None]:
# Install dependencies (Google Colab only)
import sys

IN_COLAB = 'google.colab' in sys.modules

if IN_COLAB:
    print('📦 Installing packages...')

    # Core ML packages
    !pip install -q numpy pandas matplotlib seaborn scikit-learn

    # Detect the chapter and install its specific dependencies
    notebook_name = '12_demo_segmentation.ipynb'  # Will be replaced automatically

    # Ch 06-08: Deep Learning
    if any(x in notebook_name for x in ['06_', '07_', '08_']):
        !pip install -q torch torchvision torchaudio

    # Ch 08: NLP
    if '08_' in notebook_name:
        !pip install -q transformers datasets tokenizers
        if 'rag' in notebook_name:
            !pip install -q sentence-transformers faiss-cpu rank-bm25

    # Ch 09: Reinforcement Learning
    if '09_' in notebook_name:
        !pip install -q gymnasium[classic-control]

    # Ch 04: Boosting
    if '04_' in notebook_name and 'boosting' in notebook_name:
        !pip install -q xgboost lightgbm catboost

    # Ch 05: Advanced clustering
    if '05_' in notebook_name:
        !pip install -q umap-learn

    # Ch 11: Time series
    if '11_' in notebook_name:
        !pip install -q statsmodels prophet

    # Ch 12: Advanced vision
    if '12_' in notebook_name:
        !pip install -q ultralytics timm segmentation-models-pytorch

    # Ch 13: Recommendation
    if '13_' in notebook_name:
        !pip install -q scikit-surprise implicit

    # Ch 14: MLOps
    if '14_' in notebook_name:
        !pip install -q mlflow fastapi pydantic

    print('✅ Installation complete!')
else:
    print('ℹ️  Local environment detected; packages are assumed to be installed already.')

# Chapter 12 - Semantic Segmentation with U-Net

This notebook explores **semantic segmentation** with U-Net and other architectures.

## Objectives
- Understand how semantic segmentation differs from object detection
- Implement U-Net from scratch in PyTorch
- Train on a segmentation dataset
- Evaluate with IoU and the Dice coefficient
- Explore DeepLab and Mask R-CNN

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
import torchvision
from torchvision import transforms
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import cv2
from tqdm import tqdm
import requests
from io import BytesIO

# segmentation_models_pytorch (optional)
try:
    import segmentation_models_pytorch as smp
    SMP_AVAILABLE = True
except ImportError:
    print("⚠️ segmentation_models_pytorch not installed")
    print("Install with: pip install segmentation-models-pytorch")
    SMP_AVAILABLE = False

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Device: {device}")

## 1. U-Net Architecture

Full implementation of U-Net: an encoder (contracting path), a bottleneck, and a decoder (expansive path) joined by skip connections.

In [None]:
class DoubleConv(nn.Module):
    """(Conv2D -> BatchNorm -> ReLU) x 2"""
    
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.double_conv = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True)
        )
    
    def forward(self, x):
        return self.double_conv(x)


class Down(nn.Module):
    """Downscaling with max pooling followed by DoubleConv"""
    
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.maxpool_conv = nn.Sequential(
            nn.MaxPool2d(2),
            DoubleConv(in_channels, out_channels)
        )
    
    def forward(self, x):
        return self.maxpool_conv(x)


class Up(nn.Module):
    """Upscaling followed by DoubleConv"""
    
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
        self.conv = DoubleConv(in_channels, out_channels)
    
    def forward(self, x1, x2):
        """
        x1: encoder features (skip connection)
        x2: decoder features
        """
        x2 = self.up(x2)
        
        # Pad so the spatial dimensions match the skip connection (if needed)
        diffY = x1.size()[2] - x2.size()[2]
        diffX = x1.size()[3] - x2.size()[3]
        x2 = F.pad(x2, [diffX // 2, diffX - diffX // 2,
                       diffY // 2, diffY - diffY // 2])
        
        # Concatenate the skip connection
        x = torch.cat([x1, x2], dim=1)
        return self.conv(x)


class UNet(nn.Module):
    """U-Net Architecture"""
    
    def __init__(self, in_channels=3, out_channels=1, features=(64, 128, 256, 512)):
        super().__init__()
        
        # Encoder (Contracting Path)
        self.inc = DoubleConv(in_channels, features[0])
        self.down1 = Down(features[0], features[1])
        self.down2 = Down(features[1], features[2])
        self.down3 = Down(features[2], features[3])
        
        # Bottleneck
        self.down4 = Down(features[3], features[3] * 2)
        
        # Decoder (Expansive Path)
        self.up1 = Up(features[3] * 2, features[3])
        self.up2 = Up(features[3], features[2])
        self.up3 = Up(features[2], features[1])
        self.up4 = Up(features[1], features[0])
        
        # Output
        self.outc = nn.Conv2d(features[0], out_channels, kernel_size=1)
    
    def forward(self, x):
        # Encoder; each stage's output is kept for the skip connections
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        
        # Decoder, consuming the skip connections in reverse order
        x = self.up1(x4, x5)
        x = self.up2(x3, x)
        x = self.up3(x2, x)
        x = self.up4(x1, x)
        
        # Output
        logits = self.outc(x)
        return logits

# Test the architecture
model = UNet(in_channels=3, out_channels=1)
x = torch.randn(2, 3, 256, 256)
y = model(x)

print(f"Input shape: {x.shape}")
print(f"Output shape: {y.shape}")
print(f"\nNumber of parameters: {sum(p.numel() for p in model.parameters()) / 1e6:.2f}M")
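
The padding step in `Up.forward` only matters when the input size is not divisible by 16. The sketch below traces one spatial dimension through the network in plain Python, assuming size-preserving 3x3 convolutions, 2x2 max pooling, and stride-2 transposed convolutions as in the classes above. The helper name `trace_unet_sizes` is hypothetical and not part of the notebook.

```python
def trace_unet_sizes(size, depth=4):
    """Trace the spatial size (one dimension) through a 4-level U-Net.

    Assumes 3x3 convolutions with padding=1 (size-preserving), 2x2 max pooling
    (floor division by 2) and 2x2 transposed convolutions with stride 2 (doubling).
    """
    encoder = [size]
    for _ in range(depth):
        encoder.append(encoder[-1] // 2)   # MaxPool2d(2) floors odd sizes

    steps = []
    current = encoder[-1]                  # bottleneck size
    for skip in reversed(encoder[:-1]):
        upsampled = current * 2            # ConvTranspose2d(kernel_size=2, stride=2)
        steps.append({"skip": skip, "upsampled": upsampled, "pad": skip - upsampled})
        current = skip                     # after padding and concat, size matches the skip
    return encoder, steps


for size in (256, 250):
    encoder, steps = trace_unet_sizes(size)
    print(f"input {size}: encoder sizes {encoder}")
    for step in steps:
        print(f"  skip={step['skip']:3d}  upsampled={step['upsampled']:3d}  pad={step['pad']}")
```

With a 256-pixel input every `pad` value is 0, so `F.pad` is a no-op. With 250 pixels, the odd intermediate sizes (125 and 31) lose one pixel at pooling, and the decoder has to pad it back before concatenating.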

## 2. Synthetic Dataset for Testing

Create a simple dataset to test U-Net.

In [None]:
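class SyntheticSegmentationDataset(Dataset):
    """
    Synthetic dataset of geometric shapes.

    Samples are drawn at random on every access, so `idx` is ignored and
    the same index returns a different image each time.
    """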
    
    def __init__(self, num_samples=1000, img_size=256, transform=None):
        self.num_samples = num_samples
        self.img_size = img_size
        self.transform = transform
    
    def __len__(self):
        return self.num_samples
    
    def __getitem__(self, idx):
        # Generate an image with random shapes
        img = np.zeros((self.img_size, self.img_size, 3), dtype=np.uint8)
        mask = np.zeros((self.img_size, self.img_size), dtype=np.uint8)
        
        # Number of shapes (1-3)
        n_shapes = np.random.randint(1, 4)
        
        for _ in range(n_shapes):
            shape_type = np.random.choice(['circle', 'rectangle', 'triangle'])
            color = tuple(np.random.randint(50, 255, 3).tolist())
            
            if shape_type == 'circle':
                center = (np.random.randint(50, self.img_size-50), 
                         np.random.randint(50, self.img_size-50))
                radius = np.random.randint(20, 60)
                cv2.circle(img, center, radius, color, -1)
                cv2.circle(mask, center, radius, 255, -1)
            
            elif shape_type == 'rectangle':
                # Cast to plain Python ints so OpenCV receives native integer coordinates
                x1, y1 = (int(v) for v in np.random.randint(20, self.img_size-80, 2))
                x2, y2 = x1 + np.random.randint(40, 100), y1 + np.random.randint(40, 100)
                cv2.rectangle(img, (x1, y1), (x2, y2), color, -1)
                cv2.rectangle(mask, (x1, y1), (x2, y2), 255, -1)
            
            elif shape_type == 'triangle':
                # cv2.fillPoly expects int32 vertex arrays (randint may return int64)
                pts = np.random.randint(20, self.img_size-20, (3, 2)).astype(np.int32)
                cv2.fillPoly(img, [pts], color)
                cv2.fillPoly(mask, [pts], 255)
        
        # Convert to PIL
        img = Image.fromarray(img)
        mask = Image.fromarray(mask)
        
        if self.transform:
            img = self.transform(img)
            mask = transforms.ToTensor()(mask)
        
        return img, mask

# Create the datasets
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

train_dataset = SyntheticSegmentationDataset(num_samples=800, transform=transform)
val_dataset = SyntheticSegmentationDataset(num_samples=200, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=8, shuffle=False)

print(f"Train dataset: {len(train_dataset)} samples")
print(f"Val dataset: {len(val_dataset)} samples")

# Visualize examples
fig, axes = plt.subplots(2, 4, figsize=(16, 8))

for i in range(4):
    img, mask = train_dataset[i]
    
    # Denormalize for display
    img_display = img.permute(1, 2, 0).numpy()
    img_display = img_display * np.array([0.229, 0.224, 0.225]) + np.array([0.485, 0.456, 0.406])
    img_display = np.clip(img_display, 0, 1)
    
    axes[0, i].imshow(img_display)
    axes[0, i].set_title(f'Image {i+1}')
    axes[0, i].axis('off')
    
    axes[1, i].imshow(mask.squeeze(), cmap='gray')
    axes[1, i].set_title(f'Mask {i+1}')
    axes[1, i].axis('off')

plt.tight_layout()
plt.show()
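
Because the dataset above draws fresh random shapes on every access, the validation set changes from epoch to epoch. One possible remedy, sketched here in plain Python, is to seed a generator from the sample index so that each index always yields the same shapes. The helper `sample_shape_params` and its `seed` argument are hypothetical. It mirrors the sampling ranges used above but returns only the parameters, leaving the drawing to OpenCV.

```python
import random


def sample_shape_params(idx, img_size=256, seed=0):
    """Draw shape parameters for sample `idx` from a generator seeded by (seed, idx).

    Hypothetical helper: the same index always yields the same shapes, so a
    validation set built this way is identical at every epoch.
    """
    rng = random.Random(seed * 1_000_003 + idx)
    shapes = []
    for _ in range(rng.randint(1, 3)):             # 1 to 3 shapes, inclusive
        kind = rng.choice(["circle", "rectangle", "triangle"])
        color = tuple(rng.randint(50, 254) for _ in range(3))
        if kind == "circle":
            params = {
                "center": (rng.randint(50, img_size - 51), rng.randint(50, img_size - 51)),
                "radius": rng.randint(20, 59),
            }
        elif kind == "rectangle":
            x1, y1 = rng.randint(20, img_size - 81), rng.randint(20, img_size - 81)
            params = {"pt1": (x1, y1),
                      "pt2": (x1 + rng.randint(40, 99), y1 + rng.randint(40, 99))}
        else:
            params = {"points": [(rng.randint(20, img_size - 21), rng.randint(20, img_size - 21))
                                 for _ in range(3)]}
        shapes.append({"kind": kind, "color": color, **params})
    return shapes


# The same index gives the same shapes; a different seed gives a separate split.
assert sample_shape_params(7, seed=0) == sample_shape_params(7, seed=0)
print(sample_shape_params(7, seed=0)[0]["kind"])
```

Note that `random.randint` includes its upper bound while `np.random.randint` excludes it, which is why the upper limits here are one lower than in the dataset class.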

## 3. Segmentation Metrics

Implement IoU and the Dice coefficient.

In [None]:
def dice_coefficient(pred, target, smooth=1e-6):
    """
    Compute the Dice coefficient.

    Dice = 2 * |A ∩ B| / (|A| + |B|)
    """
    pred = pred.view(-1)
    target = target.view(-1)
    
    intersection = (pred * target).sum()
    dice = (2. * intersection + smooth) / (pred.sum() + target.sum() + smooth)
    
    return dice


def iou_score(pred, target, smooth=1e-6):
    """
    Compute the IoU (Jaccard index).

    IoU = |A ∩ B| / |A ∪ B|
    """
    pred = pred.view(-1)
    target = target.view(-1)
    
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    iou = (intersection + smooth) / (union + smooth)
    
    return iou


def pixel_accuracy(pred, target):
    """
    Compute pixel-level accuracy.
    """
    pred = pred.view(-1)
    target = target.view(-1)
    
    correct = (pred == target).sum()
    total = target.numel()
    
    return correct / total


# Test on random binary masks (expected: Dice ≈ 0.50, IoU ≈ 0.33, accuracy ≈ 0.50)
pred = torch.rand(1, 1, 256, 256) > 0.5
target = torch.rand(1, 1, 256, 256) > 0.5

dice = dice_coefficient(pred.float(), target.float())
iou = iou_score(pred.float(), target.float())
acc = pixel_accuracy(pred, target)

print(f"Dice coefficient: {dice:.4f}")
print(f"IoU score: {iou:.4f}")
print(f"Pixel accuracy: {acc:.4f}")

# Dice-IoU relation
print(f"\nRelation: Dice = 2*IoU / (1 + IoU)")
print(f"Verification: {2*iou / (1 + iou):.4f} ≈ {dice:.4f}")
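
The relation between the two metrics follows from their definitions. Writing I for the intersection and S = |A| + |B|, Dice = 2I / S and IoU = I / (S - I), so 2·IoU / (1 + IoU) = 2I / S. The worked example below checks this by hand on two tiny masks. The mask values are made up for illustration, and plain Python lists are used so the arithmetic is easy to follow.

```python
# Two tiny binary masks, flattened to 1-D lists (1 = foreground, 0 = background).
pred   = [1, 1, 1, 0, 0, 0, 1, 0]
target = [1, 1, 0, 0, 0, 1, 1, 1]

intersection = sum(p * t for p, t in zip(pred, target))   # pixels set in both masks
pred_area = sum(pred)
target_area = sum(target)
union = pred_area + target_area - intersection

dice = 2 * intersection / (pred_area + target_area)
iou = intersection / union
accuracy = sum(p == t for p, t in zip(pred, target)) / len(target)

print(f"intersection={intersection}, union={union}")
print(f"Dice={dice:.4f}, IoU={iou:.4f}, accuracy={accuracy:.4f}")
print(f"2*IoU/(1+IoU)={2 * iou / (1 + iou):.4f}")
```

Here the intersection is 3 and the union is 6, giving IoU = 0.5 and Dice = 6/9 ≈ 0.667. Dice is never lower than IoU for the same pair of masks.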

## 4. Loss Functions for Segmentation

In [None]:
class DiceLoss(nn.Module):
    """Dice Loss (1 - Dice Coefficient)"""
    
    def __init__(self, smooth=1e-6):
        super().__init__()
        self.smooth = smooth
    
    def forward(self, pred, target):
        pred = torch.sigmoid(pred)
        
        pred = pred.view(-1)
        target = target.view(-1)
        
        intersection = (pred * target).sum()
        dice = (2. * intersection + self.smooth) / (pred.sum() + target.sum() + self.smooth)
        
        return 1 - dice


class DiceBCELoss(nn.Module):
    """Combination of Dice Loss and Binary Cross Entropy"""
    
    def __init__(self, dice_weight=0.5):
        super().__init__()
        self.dice = DiceLoss()
        self.bce = nn.BCEWithLogitsLoss()
        self.dice_weight = dice_weight
    
    def forward(self, pred, target):
        dice_loss = self.dice(pred, target)
        bce_loss = self.bce(pred, target)
        
        return self.dice_weight * dice_loss + (1 - self.dice_weight) * bce_loss


class FocalLoss(nn.Module):
    """Focal Loss to handle class imbalance"""
    
    def __init__(self, alpha=0.25, gamma=2.0):
        super().__init__()
        self.alpha = alpha
        self.gamma = gamma
    
    def forward(self, pred, target):
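        bce_loss = F.binary_cross_entropy_with_logits(pred, target, reduction='none')
        pt = torch.exp(-bce_loss)  # probability assigned to the true class
        # alpha weights the positive class, (1 - alpha) the negative class
        alpha_t = self.alpha * target + (1 - self.alpha) * (1 - target)
        focal_loss = alpha_t * (1 - pt) ** self.gamma * bce_loss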
        
        return focal_loss.mean()


# Compare the losses
pred = torch.randn(4, 1, 256, 256)
target = torch.randint(0, 2, (4, 1, 256, 256)).float()

dice_loss = DiceLoss()(pred, target)
dice_bce_loss = DiceBCELoss()(pred, target)
focal_loss = FocalLoss()(pred, target)
bce_loss = nn.BCEWithLogitsLoss()(pred, target)

print(f"Dice Loss: {dice_loss:.4f}")
print(f"Dice + BCE Loss: {dice_bce_loss:.4f}")
print(f"Focal Loss: {focal_loss:.4f}")
print(f"BCE Loss: {bce_loss:.4f}")
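
To see what the focal term does, it helps to look at a single positive pixel. The sketch below uses only the standard library and the same default `alpha=0.25`, `gamma=2.0` as `FocalLoss`. The function names `bce_positive` and `focal_positive` are hypothetical helpers written for this illustration, not part of PyTorch.

```python
import math


def bce_positive(p):
    """Binary cross-entropy of one positive pixel predicted with probability p."""
    return -math.log(p)


def focal_positive(p, alpha=0.25, gamma=2.0):
    """Focal term for one positive pixel: alpha * (1 - pt)^gamma * BCE, with pt = p."""
    loss = bce_positive(p)
    pt = math.exp(-loss)          # equals p for a positive pixel
    return alpha * (1 - pt) ** gamma * loss


for p in (0.95, 0.6, 0.1):        # easy, uncertain, and badly wrong predictions
    ratio = focal_positive(p) / bce_positive(p)
    print(f"p={p:.2f}  BCE={bce_positive(p):.4f}  focal={focal_positive(p):.6f}  ratio={ratio:.6f}")
```

The ratio focal / BCE equals `alpha * (1 - p) ** gamma`. A confident, correct pixel (p = 0.95) is therefore scaled down far more strongly than a badly wrong one (p = 0.1), which is how the loss shifts attention towards hard pixels.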

## 5. Training U-Net

In [None]:
def train_epoch(model, loader, criterion, optimizer, device):
    """Train the model for one epoch."""
    model.train()
    
    epoch_loss = 0
    epoch_dice = 0
    epoch_iou = 0
    
    pbar = tqdm(loader, desc='Training')
    for images, masks in pbar:
        images = images.to(device)
        masks = masks.to(device)
        
        # Forward
        outputs = model(images)
        loss = criterion(outputs, masks)
        
        # Backward
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        # Metrics
        with torch.no_grad():
            pred_masks = torch.sigmoid(outputs) > 0.5
            dice = dice_coefficient(pred_masks.float(), masks)
            iou = iou_score(pred_masks.float(), masks)
        
        epoch_loss += loss.item()
        epoch_dice += dice.item()
        epoch_iou += iou.item()
        
        pbar.set_postfix({'loss': f'{loss.item():.4f}', 
                         'dice': f'{dice.item():.4f}'})
    
    return epoch_loss / len(loader), epoch_dice / len(loader), epoch_iou / len(loader)


def validate(model, loader, criterion, device):
    """Validate the model."""
    model.eval()
    
    epoch_loss = 0
    epoch_dice = 0
    epoch_iou = 0
    
    with torch.no_grad():
        for images, masks in tqdm(loader, desc='Validation'):
            images = images.to(device)
            masks = masks.to(device)
            
            outputs = model(images)
            loss = criterion(outputs, masks)
            
            pred_masks = torch.sigmoid(outputs) > 0.5
            dice = dice_coefficient(pred_masks.float(), masks)
            iou = iou_score(pred_masks.float(), masks)
            
            epoch_loss += loss.item()
            epoch_dice += dice.item()
            epoch_iou += iou.item()
    
    return epoch_loss / len(loader), epoch_dice / len(loader), epoch_iou / len(loader)


# Initialize the model
model = UNet(in_channels=3, out_channels=1).to(device)
criterion = DiceBCELoss(dice_weight=0.5)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='max', 
                                                        factor=0.5, patience=3)

# Train
num_epochs = 10
best_dice = 0

history = {
    'train_loss': [], 'train_dice': [], 'train_iou': [],
    'val_loss': [], 'val_dice': [], 'val_iou': []
}

print(f"\n{'='*60}")
print(f"Training U-Net for {num_epochs} epochs")
print(f"{'='*60}\n")

for epoch in range(num_epochs):
    print(f"\nEpoch {epoch+1}/{num_epochs}")
    
    # Train
    train_loss, train_dice, train_iou = train_epoch(model, train_loader, 
                                                     criterion, optimizer, device)
    
    # Validation
    val_loss, val_dice, val_iou = validate(model, val_loader, criterion, device)
    
    # Scheduler
    scheduler.step(val_dice)
    
    # Record history
    history['train_loss'].append(train_loss)
    history['train_dice'].append(train_dice)
    history['train_iou'].append(train_iou)
    history['val_loss'].append(val_loss)
    history['val_dice'].append(val_dice)
    history['val_iou'].append(val_iou)
    
    # Save the best model
    if val_dice > best_dice:
        best_dice = val_dice
        torch.save(model.state_dict(), '/tmp/best_unet.pth')
        print(f"✓ Best model saved (Dice: {best_dice:.4f})")
    
    print(f"\nTrain - Loss: {train_loss:.4f}, Dice: {train_dice:.4f}, IoU: {train_iou:.4f}")
    print(f"Val   - Loss: {val_loss:.4f}, Dice: {val_dice:.4f}, IoU: {val_iou:.4f}")
    print(f"LR: {optimizer.param_groups[0]['lr']:.6f}")

print(f"\n{'='*60}")
print(f"Training complete! Best Dice: {best_dice:.4f}")
print(f"{'='*60}")
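
The scheduler above is stepped with the validation Dice and `mode='max'`, so the learning rate is halved once the metric has stopped improving for more than `patience` epochs. The sketch below is a simplified plain-Python re-implementation of that rule for illustration only. It is not PyTorch's code and ignores the `threshold`, `cooldown`, and `min_lr` options. The Dice values are hypothetical.

```python
def plateau_schedule(metrics, lr=1e-4, factor=0.5, patience=3):
    """Simplified plateau rule for a metric that should increase (mode='max').

    Illustrative only: it ignores the threshold, cooldown and min_lr options
    that torch.optim.lr_scheduler.ReduceLROnPlateau also supports.
    """
    best = float("-inf")
    bad_epochs = 0
    lrs = []
    for value in metrics:
        if value > best:
            best, bad_epochs = value, 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:      # more than `patience` epochs without improvement
            lr *= factor
            bad_epochs = 0
        lrs.append(lr)
    return lrs


# Hypothetical validation Dice values: three epochs of improvement, then a plateau.
val_dice = [0.60, 0.72, 0.80, 0.80, 0.79, 0.80, 0.78, 0.81]
for epoch, (d, lr) in enumerate(zip(val_dice, plateau_schedule(val_dice)), start=1):
    print(f"epoch {epoch}: val_dice={d:.2f}  lr={lr:.6f}")
```

In this trace the best value (0.80) is reached at epoch 3, and the fourth consecutive epoch without improvement is epoch 7, where the rate drops from 1e-4 to 5e-5. With only 10 training epochs and `patience=3`, the scheduler in the notebook can fire at most a couple of times.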

## 6. Visualizing the Results

In [None]:
# Learning curves
fig, axes = plt.subplots(1, 3, figsize=(18, 5))

# Loss
axes[0].plot(history['train_loss'], label='Train', marker='o')
axes[0].plot(history['val_loss'], label='Val', marker='s')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].set_title('Loss')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

# Dice
axes[1].plot(history['train_dice'], label='Train', marker='o')
axes[1].plot(history['val_dice'], label='Val', marker='s')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Dice Coefficient')
axes[1].set_title('Dice Coefficient')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

# IoU
axes[2].plot(history['train_iou'], label='Train', marker='o')
axes[2].plot(history['val_iou'], label='Val', marker='s')
axes[2].set_xlabel('Epoch')
axes[2].set_ylabel('IoU Score')
axes[2].set_title('IoU Score')
axes[2].legend()
axes[2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# Predictions on validation samples
model.eval()

fig, axes = plt.subplots(4, 4, figsize=(16, 16))

with torch.no_grad():
    for i in range(4):
        img, mask = val_dataset[i]
        
        # Prediction
        img_input = img.unsqueeze(0).to(device)
        output = model(img_input)
        pred_mask = torch.sigmoid(output) > 0.5
        
        # Denormalize the image
        img_display = img.permute(1, 2, 0).cpu().numpy()
        img_display = img_display * np.array([0.229, 0.224, 0.225]) + np.array([0.485, 0.456, 0.406])
        img_display = np.clip(img_display, 0, 1)
        
        # Metrics
        dice = dice_coefficient(pred_mask.cpu().float(), mask.unsqueeze(0))
        iou = iou_score(pred_mask.cpu().float(), mask.unsqueeze(0))
        
        # Display
        axes[i, 0].imshow(img_display)
        axes[i, 0].set_title('Original Image')
        axes[i, 0].axis('off')
        
        axes[i, 1].imshow(mask.squeeze(), cmap='gray')
        axes[i, 1].set_title('Ground Truth')
        axes[i, 1].axis('off')
        
        axes[i, 2].imshow(pred_mask.cpu().squeeze(), cmap='gray')
        axes[i, 2].set_title(f'Prediction\nDice={dice:.3f}, IoU={iou:.3f}')
        axes[i, 2].axis('off')
        
        # Overlay
        overlay = img_display.copy()
        pred_mask_np = pred_mask.cpu().squeeze().numpy()
        overlay[pred_mask_np > 0] = [1, 0, 0]  # Red for predicted pixels
        axes[i, 3].imshow(overlay, alpha=0.7)
        axes[i, 3].set_title('Overlay')
        axes[i, 3].axis('off')

plt.tight_layout()
plt.show()

## 7. Segmentation Models PyTorch (smp)

Use the `segmentation_models_pytorch` library for architectures with pre-trained encoders.

In [None]:
if SMP_AVAILABLE:
    # U-Net with a ResNet34 encoder
    model_smp = smp.Unet(
        encoder_name='resnet34',
        encoder_weights='imagenet',
        in_channels=3,
        classes=1,
        activation=None
    )
    
    print("U-Net ResNet34 (smp) loaded")
    print(f"Parameters: {sum(p.numel() for p in model_smp.parameters()) / 1e6:.2f}M")
    
    # Other available architectures
    architectures = [
        'Unet', 'UnetPlusPlus', 'MAnet', 'Linknet', 'FPN', 'PSPNet',
        'DeepLabV3', 'DeepLabV3Plus', 'PAN'
    ]
    
    encoders = [
        'resnet18', 'resnet34', 'resnet50', 'resnet101',
        'efficientnet-b0', 'efficientnet-b7',
        'mobilenet_v2', 'densenet121'
    ]
    
    print(f"\nAvailable architectures: {', '.join(architectures)}")
    print(f"Available encoders: {', '.join(encoders[:5])}...")
else:
    print("segmentation_models_pytorch is not installed")

In [None]:
if SMP_AVAILABLE:
    # Example: DeepLabV3+ with ResNet50
    model_deeplabv3 = smp.DeepLabV3Plus(
        encoder_name='resnet50',
        encoder_weights='imagenet',
        in_channels=3,
        classes=1
    )
    
    model_deeplabv3.to(device)
    
    print("DeepLabV3+ ResNet50 loaded")
    print(f"Parameters: {sum(p.numel() for p in model_deeplabv3.parameters()) / 1e6:.2f}M")
    
    # Inference test
    with torch.no_grad():
        x = torch.randn(1, 3, 256, 256).to(device)
        y = model_deeplabv3(x)
        print(f"\nInput: {x.shape} -> Output: {y.shape}")

## 8. Mask R-CNN for Instance Segmentation

Use torchvision's Mask R-CNN.

In [None]:
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor

# Load pre-trained Mask R-CNN.
# weights="DEFAULT" replaces the deprecated pretrained=True (torchvision >= 0.13).
model_maskrcnn = maskrcnn_resnet50_fpn(weights="DEFAULT")
model_maskrcnn.to(device)
model_maskrcnn.eval()

print("Mask R-CNN loaded (pre-trained on COCO)")
print(f"Parameters: {sum(p.numel() for p in model_maskrcnn.parameters()) / 1e6:.2f}M")

In [None]:
# Mask R-CNN inference example
def load_image_from_url(url):
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail early on HTTP errors
    img = Image.open(BytesIO(response.content)).convert('RGB')
    return img

# Test image
img_url = "https://ultralytics.com/images/bus.jpg"
image = load_image_from_url(img_url)

# Transform (named to_tensor to avoid shadowing the dataset `transform` defined earlier)
to_tensor = transforms.ToTensor()
img_tensor = to_tensor(image).unsqueeze(0).to(device)

# Inference
with torch.no_grad():
    predictions = model_maskrcnn(img_tensor)

pred = predictions[0]
print(f"\nNumber of detections: {len(pred['boxes'])}")
print(f"\nKeys: {pred.keys()}")
print(f"Boxes shape: {pred['boxes'].shape}")
print(f"Masks shape: {pred['masks'].shape}")
print(f"Labels: {pred['labels'][:5]}")
print(f"Scores: {pred['scores'][:5]}")

In [None]:
# Visualize Mask R-CNN results
threshold = 0.7
keep = pred['scores'] > threshold  # keep confident detections only
masks = pred['masks'][keep].cpu().numpy()
boxes = pred['boxes'][keep].cpu().numpy()
labels = pred['labels'][keep].cpu().numpy()
scores = pred['scores'][keep].cpu().numpy()

# First few COCO category names only; other label ids fall back to 'Class <id>' below
COCO_CLASSES = ['__background__', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus']

fig, axes = plt.subplots(1, 2, figsize=(16, 8))

# Original image
axes[0].imshow(image)
axes[0].set_title('Original Image')
axes[0].axis('off')

# Image with masks
axes[1].imshow(image)

# Overlay the masks
for i, (mask, box, label, score) in enumerate(zip(masks, boxes, labels, scores)):
    # Binary mask (probability > 0.5)
    mask = mask[0] > 0.5
    
    # Random color
    color = np.random.rand(3)
    
    # RGBA overlay: transparent except on the mask, so the image is not darkened
    overlay = np.zeros((*mask.shape, 4))
    overlay[mask] = [*color, 0.5]
    axes[1].imshow(overlay)
    
    # Bounding box
    x1, y1, x2, y2 = box
    rect = plt.Rectangle((x1, y1), x2-x1, y2-y1, 
                         fill=False, edgecolor=color, linewidth=2)
    axes[1].add_patch(rect)
    
    # Label
    class_name = COCO_CLASSES[label] if label < len(COCO_CLASSES) else f'Class {label}'
    axes[1].text(x1, y1-5, f'{class_name} {score:.2f}',
               bbox=dict(facecolor=color, alpha=0.7), fontsize=10, color='white')

axes[1].set_title(f'Mask R-CNN Segmentation ({len(masks)} instances)')
axes[1].axis('off')

plt.tight_layout()
plt.show()
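
Mask R-CNN returns one mask per detected object, whereas the U-Net above predicts one label per pixel. The sketch below shows how instance masks collapse into a semantic map, using plain Python lists and a made-up 3x6 scene. The helper `instances_to_semantic` is hypothetical, and letting later instances overwrite earlier ones on overlap is an assumption of this sketch.

```python
def instances_to_semantic(instance_masks, labels, height, width):
    """Merge per-instance binary masks into one class map (0 = background).

    Later instances overwrite earlier ones where masks overlap; this ordering
    rule is an assumption of the sketch.
    """
    semantic = [[0] * width for _ in range(height)]
    for mask, label in zip(instance_masks, labels):
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    semantic[r][c] = label
    return semantic


# Hypothetical 3x6 scene: two separate persons (class 1) and one bus (class 6).
person_a = [[1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0]]
person_b = [[0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 1, 1]]
bus      = [[0, 0, 1, 1, 0, 0], [0, 0, 1, 1, 0, 0], [0, 0, 1, 1, 0, 0]]

semantic = instances_to_semantic([person_a, person_b, bus], [1, 1, 6], height=3, width=6)
for row in semantic:
    print(row)
```

Both persons end up with the same value 1 in the semantic map, so the information about which pixels belong to which person is lost. That is the difference between semantic and instance segmentation.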

## Summary

In this notebook, we explored:

1. **U-Net Architecture**:
   - Encoder-decoder with skip connections
   - Complete implementation from scratch
   - Landmark architecture for medical image segmentation

2. **Segmentation Metrics**:
   - IoU (Intersection over Union)
   - Dice coefficient (equivalent to the F1-score computed over pixels)
   - Pixel accuracy

3. **Loss Functions**:
   - Dice Loss
   - Dice + BCE Loss (combination)
   - Focal Loss (class imbalance)

4. **Training**:
   - Complete pipeline with validation
   - Learning rate scheduling
   - Metric monitoring

5. **Segmentation Models PyTorch**:
   - Architectures with pre-trained encoders (U-Net, DeepLabV3+, etc.)
   - A variety of encoders (ResNet, EfficientNet, etc.)

6. **Mask R-CNN**:
   - Instance segmentation (as opposed to semantic segmentation)
   - Detection and segmentation in a single model

### Key Points
- **U-Net**: reference architecture for segmentation, especially in medical imaging
- **Skip connections**: essential for preserving spatial detail
- **Dice coefficient**: a preferred metric under class imbalance
- **Transfer learning**: reuse ImageNet-pretrained encoders (ResNet, etc.)

### Next Steps
- Next notebook: Vision Transformers (ViT)
- Apply these models to real datasets (Cityscapes, Medical Decathlon)
- Explore atrous convolutions (DeepLab)