# Pipeline End-to-End: YOLO → ResNet (Translation + Rotation) → 6D Pose (Minimal Version)
This minimal version runs the end-to-end pipeline for 6D pose estimation on LineMOD, inspired by the notebook test6_extension_endtoend_pipeline.ipynb.

**Note:** This minimal pipeline runs a single test batch and shows the predicted translation and rotation. For complete evaluation and metrics, see the original notebook test6_extension_endtoend_pipeline.ipynb.

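## 1. Essential imports and minimal setup

In [1]:
# Essential imports and minimal setup
import os
import sys
import pandas as pd
from pathlib import Path
import torch

# Add the parent of the working directory to sys.path so the project modules below can be imported
sys.path.insert(0, str(Path.cwd().parent))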
from config import Config
from models.pose_estimator_endtoend import PoseEstimator
from models.yolo_detector import YOLODetector
from dataset.linemod_pose import create_pose_dataloaders
from utils.validation import run_yolo_endtoend_pipeline, load_validation_results, calc_add_accuracy_per_class
from utils.visualization import plot_add_per_class


## 2. Load pre-trained models (YOLO and end-to-end PoseEstimator)

In [2]:
train_loader, val_loader, test_loader = create_pose_dataloaders(
    dataset_root=Config.LINEMOD_ROOT,
    batch_size=Config.POSE_BATCH_SIZE,
    crop_margin=Config.POSE_CROP_MARGIN,
    output_size=Config.POSE_IMAGE_SIZE,
    num_workers=Config.NUM_WORKERS_POSE
)

In [4]:
# Load pre-trained models (YOLO and end-to-end PoseEstimator)

# Fine-tuned YOLO weights; yolo_detector is None if the file is missing
yolo_finetuned_path = Config.CHECKPOINT_DIR / 'yolo' / 'yolo_train10' / 'weights' / 'best.pt'
yolo_detector = YOLODetector(model_name=str(yolo_finetuned_path), num_classes=Config.NUM_CLASSES) if yolo_finetuned_path.exists() else None
NAME = "pose_stable_train100"
pose_ckpt = Config.CHECKPOINT_DIR / 'pose' / NAME / 'weights' / 'best.pt'
model_endtoend = PoseEstimator(pretrained=True).to(Config.DEVICE)
# Load the trained pose weights if available; otherwise the model keeps its initial weights
if pose_ckpt.exists():
    checkpoint = torch.load(pose_ckpt, map_location=Config.DEVICE)
    model_endtoend.load_state_dict(checkpoint['model_state_dict'])
model_endtoend.eval()  # inference mode: disables dropout, uses BatchNorm running statistics

✅ Loading custom weights from: /Users/nicolotermine/zMellow/GitHub-Poli/Polito/polito-aml-6D_pose_estimation/checkpoints/yolo/yolo_train10/weights/best.pt
✅ PoseEstimator initialized
   Backbone: resnet50 (pretrained=True, frozen=False)
   Feature dim: 2048
   Output: 7 values (4 quaternion + 3 translation)
   Dropout: 0.5


PoseEstimator(
  (backbone): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (4): Sequential(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (downsample): Sequential(
          (0): Conv
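
The model summary above reports a 7-value output (4 quaternion + 3 translation). As a minimal sketch of how such an output can be turned into a 4×4 rigid transform, the snippet below assumes a `(w, x, y, z)` quaternion ordering; the ordering actually used by `PoseEstimator` is not shown here and should be checked. The helper names `quat_to_rotmat` and `pose_to_matrix` are hypothetical.

```python
import numpy as np


def quat_to_rotmat(q):
    """Convert a quaternion to a 3x3 rotation matrix.

    Assumption: components are ordered (w, x, y, z). Reorder the input
    first if the model emits (x, y, z, w).
    """
    q = np.asarray(q, dtype=np.float64)
    q = q / np.linalg.norm(q)  # raw network outputs need not be unit-norm
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def pose_to_matrix(quat, trans):
    """Assemble a 4x4 homogeneous transform from a quaternion and a translation."""
    T = np.eye(4)
    T[:3, :3] = quat_to_rotmat(quat)
    T[:3, 3] = np.asarray(trans, dtype=np.float64)
    return T
```

With predictions on the CPU, a call such as `pose_to_matrix(pred_quat[0].numpy(), pred_trans[0].numpy())` would give one pose per sample, provided the ordering assumption holds.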

## 3. Load a test batch and apply the minimal end-to-end pipeline

In [None]:
# ⚠️ NOTE: This cell uses GT bboxes, not YOLO detections!
# To test the full pipeline with YOLO, run section 4

# Load one test batch and apply the minimal end-to-end pipeline WITH GT CROPS
batch = next(iter(test_loader))

# ✅ Use the images already cropped by the dataset (GT crops)
image = batch['rgb_crop'].to(Config.DEVICE)

# ✅ Minimal pipeline: GT crop → ResNet (rotation + translation)
# (This does NOT use YOLO; it uses GT bboxes)
with torch.no_grad():  # inference only, no gradients needed
    pred_quat, pred_trans = model_endtoend(image)

# Ground truth
gt_translation = batch['translation'][0].cpu().numpy() if 'translation' in batch else None
gt_quat = batch['quaternion'][0].cpu().numpy() if 'quaternion' in batch else None

# Visualize predictions vs GT
import matplotlib.pyplot as plt
import numpy as np

# First crop of the batch, CHW → HWC, then undo the ImageNet mean/std normalization for display
img_vis = image.cpu().numpy()[0].transpose(1, 2, 0)
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
img_vis = img_vis * std + mean
img_vis = np.clip(img_vis, 0, 1)

fig, ax = plt.subplots(figsize=(5, 5))
ax.imshow(img_vis)
ax.axis('off')
title = "Pipeline End-to-End (GT bbox, no YOLO)\n"
title += f"Pred trans: {np.round(pred_trans.cpu().detach().numpy()[0], 2)}\n"
if gt_translation is not None:
    title += f"GT trans: {np.round(gt_translation, 2)}\n"
title += f"Pred quat: {np.round(pred_quat.cpu().detach().numpy()[0], 2)}"
if gt_quat is not None:
    title += f"\nGT quat: {np.round(gt_quat, 2)}"
ax.set_title(title, fontsize=10)
plt.show()

print("\nüí° Per testare la pipeline COMPLETA con YOLO detection, esegui la sezione 4!")

## 4. Full YOLO + ResNet End-to-End pipeline

Evaluate the full end-to-end pipeline with YOLO detection on full-size images.  
You can choose between:
- **Quick debug**: 10 images (cell 4.1)
- **Full validation**: the entire test set (cell 4.2)
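
In the full pipeline the pose network receives crops derived from detected boxes rather than ground-truth boxes. The sketch below shows one plausible way to expand a box by a relative margin and clip it to the image. It assumes an `(x1, y1, x2, y2)` pixel box and a margin expressed as a fraction of the box size; the actual cropping inside `run_yolo_endtoend_pipeline` and the meaning of `Config.POSE_CROP_MARGIN` are not shown here and may differ. `crop_with_margin` is a hypothetical name.

```python
import numpy as np


def crop_with_margin(image, box, margin=0.15):
    """Crop an HxWxC image around box=(x1, y1, x2, y2), enlarged by `margin`.

    Assumption: `margin` is a fraction of the box width/height added on
    each side. The enlarged box is clipped to the image bounds.
    """
    h, w = image.shape[:2]
    x1, y1, x2, y2 = box
    mx = (x2 - x1) * margin  # horizontal padding in pixels
    my = (y2 - y1) * margin  # vertical padding in pixels
    x1 = int(max(0, np.floor(x1 - mx)))
    y1 = int(max(0, np.floor(y1 - my)))
    x2 = int(min(w, np.ceil(x2 + mx)))
    y2 = int(min(h, np.ceil(y2 + my)))
    return image[y1:y2, x1:x2]
```

The crop would still need to be resized to the network input size and normalized the same way as the dataset crops before being passed to the pose model.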

In [5]:
# 🐛 QUICK DEBUG: test the YOLO end-to-end pipeline on only 10 images

# Use max_samples=10 for quick debugging
# (the run logged below processed 64 samples, so the limit appears to be checked per batch)
run_yolo_endtoend_pipeline(
    yolo_detector,
    model_endtoend,
    test_loader,
    name=NAME,
    max_samples=10  # 🔧 Only 10 images for quick debugging
)

print("\n💡 For full validation on the entire test set, run the next cell!")

✅ Using model pose_stable_train100 (already loaded)!
Loading the objects' 3D models into memory.
These are used to compute the ADD metric.
✅ Loaded model 01: 5841 points
✅ Loaded model 02: 38325 points
✅ Loaded model 04: 18995 points
✅ Loaded model 05: 22831 points
✅ Loaded model 06: 15736 points
✅ Loaded model 08: 12655 points
✅ Loaded model 09: 7912 points
✅ Loaded model 10: 18473 points
✅ Loaded model 11: 7479 points
✅ Loaded model 12: 15972 points
✅ Loaded model 13: 18216 points
✅ Loaded model 14: 27435 points
✅ Loaded model 15: 16559 points


YOLO pipeline validation (end-to-end):   0%|          | 0/210 [00:02<?, ?it/s]

📊 Samples processed: 64
⚠️  Detection failures: 0
Computing metrics: ADD full pose (YOLO pipeline end-to-end)
✅ Validation results saved to /Users/nicolotermine/zMellow/GitHub-Poli/Polito/polito-aml-6D_pose_estimation/checkpoints/pose/pose_stable_train100/validation_result.csv
✅ Results saved to /Users/nicolotermine/zMellow/GitHub-Poli/Polito/polito-aml-6D_pose_estimation/checkpoints/pose/pose_stable_train100/validation_result.csv

💡 For full validation on the entire test set, run the next cell!
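
The log above notes that the 3D object models are loaded to compute the ADD metric. ADD is the mean distance between the model points transformed by the predicted pose and the same points transformed by the ground-truth pose. A minimal sketch follows; the function name `add_metric` and its argument layout are assumptions, and the implementation in `utils.validation` is not shown here and may differ, for example in how symmetric objects are handled.

```python
import numpy as np


def add_metric(points, R_pred, t_pred, R_gt, t_gt):
    """Average distance between model points under two rigid poses.

    points: (N, 3) model points; R_*: (3, 3) rotations; t_*: (3,) translations.
    The result is in the same units as `points` and the translations.
    """
    points = np.asarray(points, dtype=np.float64)
    pred = points @ np.asarray(R_pred, dtype=np.float64).T + np.asarray(t_pred, dtype=np.float64)
    gt = points @ np.asarray(R_gt, dtype=np.float64).T + np.asarray(t_gt, dtype=np.float64)
    # Per-point Euclidean distance, averaged over the whole model
    return float(np.linalg.norm(pred - gt, axis=1).mean())
```

Because every model point contributes, both rotation and translation errors show up in the value, which matches the "full pose" label used in the logs.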





In [7]:
# 🚀 FULL VALIDATION: run the end-to-end pipeline on the ENTIRE test set
# ⚠️ This cell can take several minutes. For quick debugging use the previous cell (10 images)

# Process the entire test set (no limit)
run_yolo_endtoend_pipeline(yolo_detector, model_endtoend, test_loader, name=NAME)

✅ Using model pose_stable_train100 (already loaded)!
Loading the objects' 3D models into memory.
These are used to compute the ADD metric.
✅ Loaded model 01: 5841 points
✅ Loaded model 02: 38325 points
✅ Loaded model 04: 18995 points
✅ Loaded model 05: 22831 points
✅ Loaded model 06: 15736 points
✅ Loaded model 08: 12655 points
✅ Loaded model 09: 7912 points
✅ Loaded model 10: 18473 points
✅ Loaded model 11: 7479 points
✅ Loaded model 12: 15972 points
✅ Loaded model 13: 18216 points
✅ Loaded model 14: 27435 points
✅ Loaded model 15: 16559 points


YOLO pipeline validation (end-to-end): 100%|██████████| 210/210 [06:07<00:00,  1.75s/it]


📊 Samples processed: 13388
⚠️  Detection failures: 19
Computing metrics: ADD full pose (YOLO pipeline end-to-end)
✅ Validation results saved to /Users/nicolotermine/zMellow/GitHub-Poli/Polito/polito-aml-6D_pose_estimation/checkpoints/pose/pose_stable_train100/validation_result.csv
✅ Results saved to /Users/nicolotermine/zMellow/GitHub-Poli/Polito/polito-aml-6D_pose_estimation/checkpoints/pose/pose_stable_train100/validation_result.csv


## 5. Load and show the per-class metrics table (ADD full pose)

In [8]:
# Load and show the per-class metrics table (ADD full pose)
val_csv_path = os.path.join(Config.CHECKPOINT_DIR, 'pose', NAME, 'validation_result.csv')
results_full_pose, _ = load_validation_results(val_csv_path)

data, global_add, global_acc = calc_add_accuracy_per_class(results_full_pose, Config.LINEMOD_OBJECTS)

df = pd.DataFrame(data)
display(df)
print(f"\nGlobal mean ADD (full pose): {global_add:.2f}")
print(f"Global accuracy (full pose) (%): {global_acc:.1f}")

| | Class | Mean ADD (rot-only) | Accuracy (%) |
|---|---|---|---|
| 0 | 01 - ape | 21.24 | 20.2 |
| 1 | 02 - benchvise | 30.29 | 51.3 |
| 2 | 04 - camera | 25.73 | 39.7 |
| 3 | 05 - can | 27.2 | 41.8 |
| 4 | 06 - cat | 16.23 | 58.4 |
| 5 | 08 - driller | 28.9 | 52.5 |
| 6 | 09 - duck | 21.87 | 21.8 |
| 7 | 10 - eggbox | 6.62 | 100.0 |
| 8 | 11 - glue | 6.15 | 98.7 |
| 9 | 12 - holepuncher | 25.83 | 32.1 |



Global mean ADD (full pose): 23.63
Global accuracy (full pose) (%): 50.7
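
As a sketch of how per-class figures like those above could be derived from per-sample results, the snippet below assumes the common LineMOD criterion that a pose counts as correct when its ADD is below 10% of the object diameter. The threshold and record layout actually used by `calc_add_accuracy_per_class` are not shown here, so `summarize_add`, the `(class_name, add_value, diameter)` tuples, and the `0.1` factor are all assumptions.

```python
from collections import defaultdict


def summarize_add(records, threshold_ratio=0.1):
    """Aggregate per-sample ADD values into per-class mean and accuracy.

    records: iterable of (class_name, add_value, diameter) tuples, with
    add_value and diameter in the same units (hypothetical layout).
    """
    per_class = defaultdict(list)
    for name, add_value, diameter in records:
        # A sample is a hit when ADD is under the assumed fraction of the diameter
        per_class[name].append((add_value, add_value < threshold_ratio * diameter))

    summary = {}
    for name, items in per_class.items():
        adds = [a for a, _ in items]
        hits = [ok for _, ok in items]
        summary[name] = {
            "mean_add": sum(adds) / len(adds),
            "accuracy_pct": 100.0 * sum(hits) / len(hits),
        }
    return summary
```

The resulting dictionary can be passed to `pd.DataFrame.from_dict(summary, orient="index")` to get a table with one row per class.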


## 6. Bar chart of mean ADD per class
Shows the mean ADD metric for each object class, computed from the predicted rotation and translation.

In [None]:
# Bar chart of mean ADD per class (full pose end-to-end)
results_full_pose = globals().get('results_full_pose', None)
if results_full_pose is None:
    print("⚠️  You must first compute the ADD full pose metric on the entire test set and store the results in 'results_full_pose'.")
else:
    plot_add_per_class(results_full_pose, Config.LINEMOD_OBJECTS)