# 📓 POC Notebook: Scientific Benchmark (IoU & Latency)

## 1. Context and Objectives
**Objective:** Quantitatively evaluate the accuracy (IoU) and speed of two background-removal models on a real-world dataset.
**Dataset:** [Human Parsing Dataset](https://huggingface.co/datasets/mattmdjaga/human_parsing_dataset) (images of people with segmentation masks labeled by body part).

**Models compared:**
1.  **DeepLabV3 (2017 baseline):** Classic semantic segmentation.
2.  **RMBG-1.4 (SOTA 2024):** Salient-object segmentation optimized for background removal.

In [7]:
import torch
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet101, DeepLabV3_ResNet101_Weights
from transformers import AutoModelForImageSegmentation
from datasets import load_dataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Execution environment: {device}")

print("Downloading the dataset...")
dataset_full = load_dataset("mattmdjaga/human_parsing_dataset", split="train")
dataset = dataset_full
print(f"Dataset loaded. Total number of images available: {len(dataset)}")

Execution environment: cpu
Downloading the dataset...
Dataset loaded. Total number of images available: 17706


## 3. Metrics and Utilities Setup

The dataset provides masks in which each body part (arms, head, legs) has a different value. For background removal, these must be converted into a binary mask: **0 = background, 1 = human**.
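
As a small illustration of this conversion, the sketch below binarizes a toy label map and scores a prediction against it with IoU, taken here as |P ∩ G| / |P ∪ G| for a predicted mask P and a ground-truth mask G. The array values and the names `toy_labels`, `toy_gt`, and `toy_pred` are made up for the example; only the rule "any non-zero label is human" comes from the text above.

```python
import numpy as np

# Hypothetical 3x4 label map: 0 = background, other integers = body parts.
toy_labels = np.array([
    [0, 0, 2, 2],
    [0, 5, 5, 2],
    [0, 5, 5, 0],
])

# Any non-zero label counts as "human".
toy_gt = (toy_labels > 0).astype(np.uint8)

# Hypothetical binary prediction with one false positive and one missed pixel.
toy_pred = np.array([
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
], dtype=np.uint8)

intersection = np.logical_and(toy_pred, toy_gt).sum()
union = np.logical_or(toy_pred, toy_gt).sum()
iou = intersection / union
print(f"intersection={intersection}, union={union}, IoU={iou:.3f}")
```

With these toy arrays, 6 pixels overlap out of a union of 8, so the IoU is 0.75.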

In [None]:
import torch
import time
import numpy as np
import pandas as pd
from PIL import Image
from torchvision import transforms
from torchvision.models.segmentation import deeplabv3_resnet101, DeepLabV3_ResNet101_Weights
from transformers import AutoModelForImageSegmentation

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def get_binary_mask(mask_pil):
    return (np.array(mask_pil) > 0).astype(np.uint8)

def compute_iou(pred, target):
    if pred.shape != target.shape: return 0.0
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union > 0 else 1.0

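def extract_tensor(data):
    # Return the first tensor found in a (possibly nested) list/tuple, else None.
    if isinstance(data, torch.Tensor): return data
    if isinstance(data, (list, tuple)):
        for item in data:
            found = extract_tensor(item)
            # Explicit None check: truth-testing a multi-element tensor raises an error.
            if found is not None: return found
    return None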

def run_deeplab(img, model, prep):
    start = time.time()
    input_batch = prep(img).unsqueeze(0).to(device)
    with torch.no_grad():
        out = model(input_batch)['out'][0]
        preds = out.argmax(0).byte().cpu().numpy()
    mask = (preds == 15).astype(np.uint8)  # class 15 = "person" in the Pascal VOC label set
    if mask.sum() == 0: mask = (preds > 0).astype(np.uint8)  # fallback: any non-background class
    mask_res = Image.fromarray(mask * 255).resize(img.size, resample=Image.NEAREST)
    return (np.array(mask_res) > 128).astype(np.uint8), time.time() - start

def run_rmbg(img, model):
    start = time.time()
    transform = transforms.Compose([
        transforms.Resize((1024, 1024)),
        transforms.ToTensor(),
        transforms.Normalize([0.5, 0.5, 0.5], [1.0, 1.0, 1.0])
    ])
    input_batch = transform(img).unsqueeze(0).to(device)
    with torch.no_grad():
        out = model(input_batch)
        # The None check in extract_tensor prevents the crash here
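        # Caution: if the model's forward pass already applies a sigmoid, this second one
        # squeezes values into roughly [0.5, 0.73] and makes the > 128 cut-off below
        # far more permissive. Worth checking against the model's own code.
        preds = torch.sigmoid(extract_tensor(out)).cpu().squeeze()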
    mask_res = transforms.ToPILImage()(preds).resize(img.size, resample=Image.BILINEAR)
    return (np.array(mask_res) > 128).astype(np.uint8), time.time() - start

# Models
weights = DeepLabV3_ResNet101_Weights.DEFAULT
model_dl = deeplabv3_resnet101(weights=weights).to(device).eval()
prep_dl = weights.transforms()

model_rmbg = AutoModelForImageSegmentation.from_pretrained("briaai/RMBG-1.4", trust_remote_code=True).to(device).eval()

# Benchmark
NUM_SAMPLES = 20
indices = np.random.choice(len(dataset), NUM_SAMPLES, replace=False)
metrics = []

for idx in indices:
    sample = dataset[int(idx)]
    img = sample['image'].convert("RGB")
    gt = get_binary_mask(sample['mask'])
    
    m_dl, t_dl = run_deeplab(img, model_dl, prep_dl)
    m_rmbg, t_rmbg = run_rmbg(img, model_rmbg)
    
    metrics.append({
        "id": idx,
        "iou_deeplab": compute_iou(m_dl, gt),
        "time_deeplab": t_dl,
        "iou_rmbg": compute_iou(m_rmbg, gt),
        "time_rmbg": t_rmbg
    })

df = pd.DataFrame(metrics)
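
The timings recorded by `run_deeplab` and `run_rmbg` wrap one call with `time.time()`, so each figure includes preprocessing and mask resizing, and the first image may also absorb one-time start-up costs. A minimal sketch of a more isolated measurement follows. The helper `time_call` and its `warmup`, `repeats`, and `sync` parameters are hypothetical and not part of the notebook; on a GPU, `sync` could be `torch.cuda.synchronize`.

```python
import time

def time_call(fn, *args, warmup=1, repeats=3, sync=None):
    """Return (last_result, mean_seconds) for fn(*args).

    warmup, repeats, and sync are illustrative parameters.
    sync is an optional zero-argument callable, e.g. torch.cuda.synchronize.
    """
    if repeats < 1:
        raise ValueError("repeats must be >= 1")
    for _ in range(warmup):
        fn(*args)  # untimed runs absorb one-time costs (lazy init, caches)
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(repeats):
        result = fn(*args)
    if sync is not None:
        sync()  # wait for queued GPU work before stopping the clock
    elapsed = time.perf_counter() - start
    return result, elapsed / repeats

# Toy usage with a CPU-only stand-in for a model call.
result, seconds = time_call(lambda n: sum(range(n)), 100_000)
print(f"result={result}, mean time={seconds:.6f} s")
```

`time.perf_counter()` is a monotonic, high-resolution clock, which makes it a better fit for short intervals than `time.time()`.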

In [14]:

df = pd.DataFrame(metrics)

means = df.mean(numeric_only=True)


# Formatted display: {:.2%} for a percentage, {:.4f} for 4 decimal places
print("DeepLabV3:")
print(f"- Accuracy (IoU) : {means['iou_deeplab']:.2%}")
print(f"- Mean time      : {means['time_deeplab']:.4f} seconds")

print("\nRMBG-1.4 (SOTA):")
print(f"- Accuracy (IoU) : {means['iou_rmbg']:.2%}")
print(f"- Mean time      : {means['time_rmbg']:.4f} seconds")


DeepLabV3:
- Accuracy (IoU) : 86.90%
- Mean time      : 2.7151 seconds

RMBG-1.4 (SOTA):
- Accuracy (IoU) : 90.53%
- Mean time      : 1.3381 seconds
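
These averages are computed over `NUM_SAMPLES = 20` randomly chosen images, so a mean alone says little about how consistently one model beats the other. One possible way to look at that is a per-image paired comparison, sketched below with a hypothetical `paired_summary` helper and made-up scores. In the notebook, the inputs could be `df["iou_deeplab"].tolist()` and `df["iou_rmbg"].tolist()`.

```python
import statistics

def paired_summary(scores_a, scores_b):
    """Summarize per-image differences (b - a) between two paired score lists."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    return {
        "mean_diff": statistics.mean(diffs),   # average gain of b over a
        "std_diff": statistics.stdev(diffs),   # sample standard deviation of the gain
        "b_wins": sum(d > 0 for d in diffs),   # images where b scores higher
        "n": len(diffs),
    }

# Made-up IoU values for illustration only; not results from this benchmark.
iou_a = [0.80, 0.90, 0.70, 0.85]
iou_b = [0.85, 0.88, 0.80, 0.90]
summary = paired_summary(iou_a, iou_b)
print(summary)
```

A large `std_diff` relative to `mean_diff`, or a win count close to half of `n`, would suggest that the gap between the two means is not stable at this sample size.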
