# Computer Vision with Deep Learning: CNNs and Transfer Learning

## Introduction

**Convolutional Neural Networks (CNNs)** have revolutionized Computer Vision. Unlike fully-connected neural networks, CNNs are designed specifically for data with spatial structure, such as images.

### Why are CNNs so effective?

1. **Parameter sharing**: the same filter is applied across the whole image
2. **Translation equivariance**: a pattern is detected wherever it appears; pooling then adds a degree of invariance to small shifts
3. **Feature hierarchy**: from edges to complex patterns
4. **Fewer parameters**: than a fully-connected layer on the same input
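
To make the parameter savings concrete, the sketch below counts the learnable parameters of a 3x3 convolution and of a fully-connected layer that maps the same 32x32 RGB input to an output of the same size. The helper names `conv2d_params` and `dense_params` are hypothetical and only serve this illustration.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights of all filters plus one bias per output channel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels


def dense_params(in_features, out_features):
    """One weight per input/output pair plus one bias per output."""
    return in_features * out_features + out_features


# 32x32 RGB input mapped to 32 feature maps of the same spatial size
height, width, in_channels, out_channels = 32, 32, 3, 32

conv_count = conv2d_params(in_channels, out_channels, kernel_size=3)
dense_count = dense_params(height * width * in_channels,
                           height * width * out_channels)

print(f"Conv 3x3:        {conv_count:,} parameters")
print(f"Fully-connected: {dense_count:,} parameters")
```

The convolution needs 896 parameters, while the fully-connected layer needs about 100 million, because the filter weights are reused at every image position.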

### Historical milestones

- **1998**: LeNet-5 (Yann LeCun) - digit recognition
- **2012**: AlexNet wins ImageNet (8 layers, 60M parameters)
- **2014**: VGGNet (16-19 layers) and GoogLeNet/Inception
- **2015**: ResNet (up to 152 layers) with skip connections
- **2019**: EfficientNet - optimized scaling
- **2020+**: Vision Transformers (ViT) challenge CNNs

---

## 1. CNN Architecture

### 1.1 Convolutional Layer

The **convolutional layer** applies filters (kernels) to the image through the convolution operation.

#### The Convolution Operation

A **filter** (e.g. 3x3) slides over the image and computes the dot product with each region it covers. Deep learning frameworks implement this as cross-correlation, that is, without flipping the kernel:

```
Image:           Filter:         Output:
[1 2 3]          [1 0 -1]
[4 5 6]    *     [1 0 -1]    =   -6
[7 8 9]          [1 0 -1]
```
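
The worked example above can be checked with a few lines of NumPy. This is a minimal sketch of a "valid" sliding-window product without kernel flipping; the function name `correlate2d_valid` is hypothetical and the loop is written for clarity, not speed.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    sum the element-wise products at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]], dtype=np.float32)
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=np.float32)

result = correlate2d_valid(image, kernel)
print(result)  # a single value: -6

# On a uniform image the same filter responds with zero: no vertical edge
flat_response = correlate2d_valid(np.full((3, 3), 5.0, dtype=np.float32), kernel)
print(flat_response)
```

Each row contributes `left - right`, so the filter responds to horizontal intensity changes (vertical edges) and stays silent on uniform regions.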

**Hyperparameters:**
- **Number of filters**: how many feature maps to produce
- **Kernel size**: typically 3x3 or 5x5
- **Stride**: the step with which the filter slides
- **Padding**: pixels added around the border (same/valid)
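
These hyperparameters determine the spatial size of the output. A small sketch of the standard formula `floor((n + 2p - k) / s) + 1` follows; the function name `conv_output_size` is hypothetical, and dilation is assumed to be 1.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output length along one spatial dimension (dilation assumed to be 1)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# "valid": no padding, the output shrinks
valid_size = conv_output_size(32, kernel_size=3)

# "same": padding (k - 1) / 2 with stride 1 keeps the size unchanged
same_size = conv_output_size(32, kernel_size=3, padding=1)

# stride 2 roughly halves the size
strided_size = conv_output_size(32, kernel_size=5, stride=2, padding=2)

print(valid_size, same_size, strided_size)
```

The same formula applies to pooling layers: a 2x2 window with stride 2 on a 32-pixel side gives `conv_output_size(32, 2, stride=2) == 16`.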

#### What do the filters detect?

- **Layer 1**: edges, lines, basic colors
- **Layers 2-3**: textures, simple patterns
- **Layers 4-5**: object parts (eyes, wheels)
- **Final layers**: whole objects

### 1.2 Pooling Layer

**Pooling** reduces the spatial dimensions of the feature maps:

**Max Pooling** (the most common): takes the maximum value in each region

```
Input 4x4:       Output 2x2 (pool 2x2):
[1 3 2 4]        [3 5]
[2 1 5 3]   ->   [9 8]
[7 9 1 2]
[4 6 8 3]
```

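**Advantages:**
- Shrinks the feature maps, cutting computation and the parameters of the layers that follow (pooling itself has no learnable parameters)
- Adds invariance to small spatial shifts
- Can help reduce overfitting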
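
The 4x4 example above can be reproduced in NumPy. This is a minimal sketch for non-overlapping windows on a single 2D feature map; the function name `max_pool2d` is hypothetical and it assumes the input sides are divisible by the pool size.

```python
import numpy as np


def max_pool2d(x, pool=2):
    """Non-overlapping max pooling on a 2D array."""
    h, w = x.shape
    if h % pool or w % pool:
        raise ValueError("input sides must be divisible by the pool size")
    # Split rows and columns into (block index, offset inside block)
    blocks = x.reshape(h // pool, pool, w // pool, pool)
    # Take the maximum over the two "offset" axes
    return blocks.max(axis=(1, 3))


x = np.array([[1, 3, 2, 4],
              [2, 1, 5, 3],
              [7, 9, 1, 2],
              [4, 6, 8, 3]])

pooled = max_pool2d(x)
print(pooled)
```

The result matches the diagram: `[[3, 5], [9, 8]]`. Sixteen activations become four, with no learnable parameters involved.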

### 1.3 Typical CNN Architecture

```
Input Image
    |
Conv -> ReLU -> Pool  (repeat N times)
    |
Flatten
    |
Fully Connected -> ReLU
    |
Output (Softmax)
```

---

## 2. Setup and Dataset

We will use **CIFAR-10**: 60,000 32x32 color images in 10 classes (50,000 for training and 10,000 for testing).

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset, random_split, Subset
import torchvision
import torchvision.transforms as transforms
import torchvision.models as models

from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

print(f"PyTorch: {torch.__version__}")
print(f"Torchvision: {torchvision.__version__}")
print(f"Device: {device}")
print(f"GPU available: {torch.cuda.is_available()}")

# Load CIFAR-10
transform_basic = transforms.ToTensor()

trainset_full = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True,
    transform=transform_basic
)
testset = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True,
    transform=transform_basic
)

# Also load the raw PIL images to build numpy arrays (useful for visualizations)
trainset_raw = torchvision.datasets.CIFAR10(
    root='./data', train=True, download=True,
    transform=None
)
testset_raw = torchvision.datasets.CIFAR10(
    root='./data', train=False, download=True,
    transform=None
)

# CIFAR-10 class names
class_names = [
    'airplane', 'automobile', 'bird', 'cat', 'deer',
    'dog', 'frog', 'horse', 'ship', 'truck'
]

# Extract the data as numpy arrays for preprocessing/visualization
X_train = np.array([np.array(img) for img, _ in trainset_raw])
y_train = np.array([label for _, label in trainset_raw])
X_test = np.array([np.array(img) for img, _ in testset_raw])
y_test = np.array([label for _, label in testset_raw])

print(f"Training set: {X_train.shape}")
print(f"Test set: {X_test.shape}")
print(f"Value range: [{X_train.min()}, {X_train.max()}]")
print(f"Classes: {class_names}")

# Show some examples
fig, axes = plt.subplots(4, 8, figsize=(16, 8))
for i, ax in enumerate(axes.flat):
    ax.imshow(X_train[i])
    ax.set_title(class_names[y_train[i]])
    ax.axis('off')
plt.suptitle('CIFAR-10 Examples', fontsize=16)
plt.tight_layout()
plt.show()

import os, urllib.request

# GitHub Release URL for pretrained weights (update with actual URL)
WEIGHTS_BASE_URL = os.environ.get('WEIGHTS_URL', '')
WEIGHTS_DIR = 'pretrained_weights'
os.makedirs(WEIGHTS_DIR, exist_ok=True)

def load_or_train(model, train_fn, weights_filename, device='cpu'):
    """Load pretrained weights if available, otherwise train and save."""
    weights_path = os.path.join(WEIGHTS_DIR, weights_filename)
    if os.path.exists(weights_path):
        model.load_state_dict(torch.load(weights_path, map_location=device, weights_only=True))
        print(f"Loaded pretrained weights from {weights_path}")
        return None  # no training history
    elif WEIGHTS_BASE_URL:
        try:
            url = WEIGHTS_BASE_URL + weights_filename
            urllib.request.urlretrieve(url, weights_path)
            model.load_state_dict(torch.load(weights_path, map_location=device, weights_only=True))
            print(f"Downloaded and loaded weights from {url}")
            return None
        except Exception as e:
            print(f"Could not download weights: {e}. Training from scratch...")

    history = train_fn()
    torch.save(model.state_dict(), weights_path)
    print(f"Saved weights to {weights_path}")
    return history


### 2.1 Preprocessing

In [None]:
# Normalize to [0,1]
X_train_norm = X_train.astype('float32') / 255.0
X_test_norm = X_test.astype('float32') / 255.0

# Convert to PyTorch tensors (NCHW format)
X_train_tensor = torch.from_numpy(X_train_norm).permute(0, 3, 1, 2)  # NHWC -> NCHW
X_test_tensor = torch.from_numpy(X_test_norm).permute(0, 3, 1, 2)
y_train_tensor = torch.from_numpy(y_train).long()
y_test_tensor = torch.from_numpy(y_test).long()

print("Shape after preprocessing:")
print(f"  X_train tensor: {X_train_tensor.shape}")
print(f"  y_train tensor: {y_train_tensor.shape}")
print(f"Value range: [{X_train_tensor.min():.2f}, {X_train_tensor.max():.2f}]")
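
The NHWC -> NCHW step is an axis permutation, not a reshape. The NumPy sketch below uses a tiny made-up batch (2 images of 4x4x3, not CIFAR-10 data) to show the difference; `np.transpose` plays the role that `permute` plays on tensors.

```python
import numpy as np

# Hypothetical toy batch: 2 images, 4x4 pixels, 3 channels, values 0..95
batch_nhwc = np.arange(2 * 4 * 4 * 3, dtype=np.uint8).reshape(2, 4, 4, 3)

# Scale to [0, 1] and move the channel axis in front of height and width
scaled = batch_nhwc.astype(np.float32) / 255.0
batch_nchw = np.transpose(scaled, (0, 3, 1, 2))

# A plain reshape yields the same shape but scrambles pixels across channels
wrong = scaled.reshape(2, 3, 4, 4)

print(batch_nchw.shape)
print(np.array_equal(batch_nchw, wrong))
```

After the transpose, `batch_nchw[n, c, h, w]` holds the value that was at `batch_nhwc[n, h, w, c]`, which is what convolutional layers in channels-first frameworks expect.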

## Exercise 1

In [None]:
# ==========================================================
# EXERCISE 1: Preprocessing and Analysis of an Image Dataset
# ==========================================================
# Task: Load, preprocess and analyze a dataset
#       of fashion images
# Dataset: Fashion MNIST (60000 training images 28x28, 10 classes)

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import torchvision

# Load the Fashion MNIST dataset
np.random.seed(123)
fashion_train = torchvision.datasets.FashionMNIST(
    root='./data', train=True, download=True, transform=None
)
fashion_test = torchvision.datasets.FashionMNIST(
    root='./data', train=False, download=True, transform=None
)

X_train_fashion = np.array([np.array(img) for img, _ in fashion_train])
y_train_fashion = np.array([label for _, label in fashion_train])
X_test_fashion = np.array([np.array(img) for img, _ in fashion_test])
y_test_fashion = np.array([label for _, label in fashion_test])

# Dedicated variable name, so that class_names (CIFAR-10) is not overwritten
class_names_fashion = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'
]

print("Fashion MNIST dataset loaded")
print(
    f"Train: {X_train_fashion.shape}, "
    f"Test: {X_test_fashion.shape}"
)
print(
    "Original value range: "
    f"[{X_train_fashion.min()}, {X_train_fashion.max()}]"
)

# Step 1: Normalization and reshape (channels first for PyTorch)
X_train_fprep = (
    X_train_fashion.astype('float32') / 255.0
)
X_test_fprep = (
    X_test_fashion.astype('float32') / 255.0
)
X_train_fprep = X_train_fprep.reshape(-1, 1, 28, 28)  # NCHW
X_test_fprep = X_test_fprep.reshape(-1, 1, 28, 28)

print("\nAfter preprocessing:")
print(f"Train shape: {X_train_fprep.shape}")
print(
    "Value range: "
    f"[{X_train_fprep.min():.2f}, {X_train_fprep.max():.2f}]"
)

# Step 2: Class distribution analysis
class_counts = (
    pd.Series(y_train_fashion)
    .value_counts()
    .sort_index()
)
class_pct = (
    (class_counts / len(y_train_fashion) * 100).round(2)
)
class_distribution = pd.DataFrame({
    'classe': class_counts.index,
    'nome': [class_names_fashion[i]
             for i in class_counts.index],
    'conteggio': class_counts.values,
    'percentuale': class_pct.values
})

print("\nClass distribution in the training set:")
print(class_distribution)

# Step 3: Visualization of samples per class
fig, axes = plt.subplots(10, 5, figsize=(12, 20))

for class_id in range(10):
    class_indices = np.where(
        y_train_fashion == class_id
    )[0]
    sample_indices = np.random.choice(
        class_indices, size=5, replace=False
    )
    for i in range(5):
        ax = axes[class_id, i]
        ax.imshow(
            X_train_fashion[sample_indices[i]],
            cmap='gray'
        )
        ax.axis('off')
        if i == 0:
            ax.set_title(
                class_names_fashion[class_id],
                fontsize=10
            )

plt.tight_layout()
plt.show()

# Step 4: Per-class statistics
stats_per_class = []
for class_id in range(10):
    class_mask = y_train_fashion == class_id
    class_images = X_train_fprep[class_mask]
    stats_per_class.append({
        'classe': class_id,
        'nome': class_names_fashion[class_id],
        'media_pixel': class_images.mean(),
        'std_pixel': class_images.std()
    })

stats_df = pd.DataFrame(stats_per_class)
print("\nPer-class statistics:")
print(stats_df.to_string(index=False))

# Step 5: Visualization of the mean image per class
fig, axes = plt.subplots(2, 5, figsize=(15, 6))
axes = axes.flatten()

for class_id in range(10):
    class_mask = y_train_fashion == class_id
    class_images = X_train_fprep[class_mask]
    mean_image = class_images.mean(axis=0).squeeze()
    axes[class_id].imshow(mean_image, cmap='gray')
    axes[class_id].set_title(class_names_fashion[class_id])
    axes[class_id].axis('off')

plt.tight_layout()
plt.show()

# Compare the train and test class distributions
train_dist = (
    pd.Series(y_train_fashion)
    .value_counts(normalize=True)
    .sort_index()
)
test_dist = (
    pd.Series(y_test_fashion)
    .value_counts(normalize=True)
    .sort_index()
)

fig, ax = plt.subplots(figsize=(12, 6))
x = np.arange(10)
width = 0.35

ax.bar(
    x - width/2, train_dist.values,
    width, label='Train', alpha=0.8
)
ax.bar(
    x + width/2, test_dist.values,
    width, label='Test', alpha=0.8
)
ax.set_ylabel('Proportion')
ax.set_xlabel('Class')
ax.set_title('Class Distribution: Train vs Test')
ax.set_xticks(x)
ax.set_xticklabels(
    class_names_fashion, rotation=45, ha='right'
)
ax.legend()
ax.grid(axis='y', alpha=0.3)
plt.tight_layout()
plt.show()

print("\nExercise 1 completed!")

---

## 3. Implementing a CNN

### 3.1 Simple CNN

In [None]:
class SimpleCNN(nn.Module):
    """
    Simple CNN for CIFAR-10.
    """
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            # First convolutional block
            nn.Conv2d(3, 32, 3, padding=1),   # conv1
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),                # pool1

            # Second block
            nn.Conv2d(32, 64, 3, padding=1),   # conv2
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),                # pool2

            # Third block
            nn.Conv2d(64, 128, 3, padding=1),  # conv3
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),                # pool3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 128),       # fc1
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes)          # output (no softmax - CrossEntropyLoss handles it)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x


# Create the model
model_simple = SimpleCNN().to(device)
print(model_simple)
total_params = sum(p.numel() for p in model_simple.parameters())
print(f"\nTotal parameters: {total_params:,}")

**Architecture analysis** (shapes as height x width x channels):

1. **Conv1**: 32x32x3 -> 32x32x32 (32 3x3 filters)
2. **Pool1**: 32x32x32 -> 16x16x32 (spatial size halved)
3. **Conv2**: 16x16x32 -> 16x16x64 (64 filters)
4. **Pool2**: 16x16x64 -> 8x8x64
5. **Conv3**: 8x8x64 -> 8x8x128 (128 filters)
6. **Pool3**: 8x8x128 -> 4x4x128
7. **Flatten**: 4x4x128 = 2048 features
8. **FC1**: 2048 -> 128
9. **Output**: 128 -> 10
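
The shape trace above, together with the learnable parameter count, can be reproduced by hand. The sketch below assumes the layer sizes of `SimpleCNN` as defined earlier (3x3 convolutions with padding 1 and bias, batch normalization with a learnable scale and shift, 2x2 max pooling); the helper name `conv_block` is hypothetical.

```python
def conv_block(in_ch, out_ch, size, kernel=3, padding=1, pool=2):
    """Conv2d + BatchNorm2d + ReLU + MaxPool2d: return (parameters, output side)."""
    conv_params = in_ch * out_ch * kernel * kernel + out_ch  # weights + biases
    bn_params = 2 * out_ch                                   # scale + shift
    size = size + 2 * padding - kernel + 1                   # stride-1 convolution
    size = size // pool                                      # pooling
    return conv_params + bn_params, size


size, channels, total_params = 32, 3, 0
for out_channels in (32, 64, 128):
    block_params, size = conv_block(channels, out_channels, size)
    total_params += block_params
    channels = out_channels
    print(f"block output: {size}x{size}x{channels}, {block_params:,} parameters")

flat_features = size * size * channels   # input to the first linear layer
fc1_params = flat_features * 128 + 128   # Linear(2048, 128)
out_params = 128 * 10 + 10               # Linear(128, 10)
total_params += fc1_params + out_params

print(f"flattened features: {flat_features}")
print(f"total learnable parameters: {total_params:,}")
```

Under these assumptions the total is 357,258, and about three quarters of it sits in the first fully-connected layer, which is why the three pooling steps before `Flatten` matter so much.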

### 3.2 Training

In [None]:
def train_model(model, train_loader, val_loader, epochs=5,
                lr=0.001, patience=5, patience_lr=3, device=device):
    """
    Generic PyTorch training loop with early stopping and ReduceLROnPlateau.
    Returns a dictionary with the training history (Keras-style).
    """
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=lr)
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode='min', factor=0.5, patience=patience_lr, min_lr=1e-7
    )

    history = {'loss': [], 'accuracy': [], 'val_loss': [], 'val_accuracy': []}
    best_val_loss = float('inf')
    best_model_state = None
    epochs_no_improve = 0

    for epoch in range(epochs):
        # Training
        model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        for inputs, labels in train_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item() * inputs.size(0)
            _, predicted = outputs.max(1)
            total += labels.size(0)
            correct += predicted.eq(labels).sum().item()

        train_loss = running_loss / total
        train_acc = correct / total

        # Validation
        model.eval()
        val_loss = 0.0
        val_correct = 0
        val_total = 0
        with torch.no_grad():
            for inputs, labels in val_loader:
                inputs, labels = inputs.to(device), labels.to(device)
                outputs = model(inputs)
                loss = criterion(outputs, labels)
                val_loss += loss.item() * inputs.size(0)
                _, predicted = outputs.max(1)
                val_total += labels.size(0)
                val_correct += predicted.eq(labels).sum().item()

        val_loss = val_loss / val_total
        val_acc = val_correct / val_total

        scheduler.step(val_loss)

        history['loss'].append(train_loss)
        history['accuracy'].append(train_acc)
        history['val_loss'].append(val_loss)
        history['val_accuracy'].append(val_acc)

        print(f"Epoch {epoch+1}/{epochs} - "
              f"loss: {train_loss:.4f} - accuracy: {train_acc:.4f} - "
              f"val_loss: {val_loss:.4f} - val_accuracy: {val_acc:.4f}")

        # Early stopping
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            best_model_state = {k: v.clone() for k, v in model.state_dict().items()}
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                print(f"Early stopping at epoch {epoch+1}")
                break

    # Restore best weights
    if best_model_state is not None:
        model.load_state_dict(best_model_state)

    return history


# Build the DataLoaders with a validation split
train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
n_val = int(0.2 * len(train_dataset))
n_train = len(train_dataset) - n_val
train_subset, val_subset = random_split(
    train_dataset, [n_train, n_val],
    generator=torch.Generator().manual_seed(42)
)

train_loader = DataLoader(train_subset, batch_size=128, shuffle=True)
val_loader = DataLoader(val_subset, batch_size=128, shuffle=False)

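# Training (with weight caching).
# Note: with epochs=5 and patience=5 early stopping cannot fire before the
# last epoch; raise epochs for it to have any effect.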
history_simple = load_or_train(
    model_simple,
    lambda: train_model(
        model_simple, train_loader, val_loader,
        epochs=5, lr=0.001, patience=5, patience_lr=3
    ),
    'nb05_simple_cnn.pt',
    device=device
)
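
The early-stopping rule inside `train_model` can be isolated in a few lines of plain Python. This is a sketch of the same counting logic applied to a made-up list of validation losses; the function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    epochs_no_improve = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1
            if epochs_no_improve >= patience:
                return epoch
    return None


# Hypothetical validation losses: the best value is reached at epoch 2
losses = [0.90, 0.80, 0.85, 0.83, 0.82, 0.81, 0.84]

stop_patience_3 = early_stopping_epoch(losses, patience=3)
stop_five_epochs = early_stopping_epoch(losses[:5], patience=5)

print(stop_patience_3)   # 5: three epochs in a row without a new best
print(stop_five_epochs)  # None: five epochs are too few for patience=5
```

Since the first epoch always sets a new best for any finite loss, a run of `epochs` epochs can accumulate at most `epochs - 1` epochs without improvement, so `patience` has to be smaller than `epochs` for the rule to shorten training.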


### 3.3 Evaluation

In [None]:
# Plot learning curves
if history_simple is not None:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    axes[0].plot(history_simple['loss'], label='Training')
    axes[0].plot(history_simple['val_loss'], label='Validation')
    axes[0].set_xlabel('Epoch')
    axes[0].set_ylabel('Loss')
    axes[0].set_title('Loss')
    axes[0].legend()
    axes[0].grid(alpha=0.3)

    axes[1].plot(history_simple['accuracy'], label='Training')
    axes[1].plot(history_simple['val_accuracy'], label='Validation')
    axes[1].set_xlabel('Epoch')
    axes[1].set_ylabel('Accuracy')
    axes[1].set_title('Accuracy')
    axes[1].legend()
    axes[1].grid(alpha=0.3)

    plt.tight_layout()
    plt.show()
else:
    print("Using pretrained weights - training curves not available")

# Evaluation on the test set
def evaluate_model(model, X_tensor, y_tensor, device=device):
    """Evaluate model and return loss, accuracy, predictions."""
    model.eval()
    test_dataset = TensorDataset(X_tensor, y_tensor)
    test_loader = DataLoader(test_dataset, batch_size=128, shuffle=False)
    criterion = nn.CrossEntropyLoss()

    test_loss = 0.0
    correct = 0
    total = 0
    all_preds = []
    all_probs = []

    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            test_loss += loss.item() * inputs.size(0)
            probs = torch.softmax(outputs, dim=1)
            _, predicted = outputs.max(1)
            total += labels.size(0)
            correct += predicted.eq(labels).sum().item()
            all_preds.append(predicted.cpu())
            all_probs.append(probs.cpu())

    test_loss = test_loss / total
    test_acc = correct / total
    all_preds = torch.cat(all_preds).numpy()
    all_probs = torch.cat(all_probs).numpy()
    return test_loss, test_acc, all_preds, all_probs


test_loss, test_acc, y_pred, y_pred_proba = evaluate_model(
    model_simple, X_test_tensor, y_test_tensor
)
print(f"Test Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_acc:.4f}")

# Classification report
print("\nClassification Report:")
print(classification_report(
    y_test, y_pred, target_names=class_names
))

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)

plt.figure(figsize=(12, 10))
sns.heatmap(
    cm, annot=True, fmt='d', cmap='Blues',
    xticklabels=class_names,
    yticklabels=class_names
)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix - Simple CNN')
plt.xticks(rotation=45)
plt.yticks(rotation=45)
plt.tight_layout()
plt.show()


## Exercise 2

In [None]:
# ==========================================================
# EXERCISE 2: Building and Training a CNN from Scratch
# ==========================================================
# Task: Build a CNN for binary classification
#       of dogs vs cats
# Dataset: 4000 synthetic 64x64 images
#          (2000 dogs, 2000 cats)
#
# NOTE: we use variables with the _ex2 suffix so as not to
#       overwrite the main CIFAR-10 data.

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report, roc_auc_score

# Generate the synthetic dogs vs cats dataset
np.random.seed(456)

def generate_synthetic_images(
    n_samples, img_size=64, category=0
):
    images = np.random.rand(
        n_samples, img_size, img_size, 3
    ).astype('float32')

    for i in range(n_samples):
        noise = np.random.uniform(0.8, 1.2)
        if category == 0:  # Dogs: vertical brightness gradient, red tint
            gradient = np.linspace(0.3, 0.7, img_size)
            gradient = gradient.reshape(-1, 1, 1)
            images[i] = images[i] * 0.5 + gradient * noise * 0.5
            images[i, :, :, 0] += 0.1 * noise
        else:  # Cats: radial brightness pattern, blue tint
            cx, cy = img_size // 2, img_size // 2
            Y, X = np.ogrid[:img_size, :img_size]
            dist = np.sqrt(
                (X - cx)**2 + (Y - cy)**2
            ) / (img_size / 2)
            dist = dist.clip(0, 1)
            radial = (1 - dist).reshape(img_size, img_size, 1)
            images[i] = images[i] * 0.5 + radial * noise * 0.5
            images[i, :, :, 2] += 0.1 * noise

    images = np.clip(images, 0, 1)
    return images

n_per_class = 2000
X_dogs = generate_synthetic_images(n_per_class, category=0)
X_cats = generate_synthetic_images(n_per_class, category=1)

X_ex2 = np.vstack([X_dogs, X_cats])
y_ex2 = np.array([0] * n_per_class + [1] * n_per_class)

indices_ex2 = np.random.permutation(len(X_ex2))
X_ex2 = X_ex2[indices_ex2]
y_ex2 = y_ex2[indices_ex2]

X_train_ex2, X_temp_ex2, y_train_ex2, y_temp_ex2 = (
    train_test_split(X_ex2, y_ex2, test_size=0.3, random_state=456)
)
X_val_ex2, X_test_ex2, y_val_ex2, y_test_ex2 = (
    train_test_split(X_temp_ex2, y_temp_ex2, test_size=0.5, random_state=456)
)

print("Dogs vs Cats dataset")
print(
    f"Train: {X_train_ex2.shape}, "
    f"Val: {X_val_ex2.shape}, "
    f"Test: {X_test_ex2.shape}"
)
print(
    "Train distribution: "
    f"Dogs={np.sum(y_train_ex2==0)}, "
    f"Cats={np.sum(y_train_ex2==1)}"
)

fig, axes = plt.subplots(2, 5, figsize=(12, 5))
for i in range(5):
    axes[0, i].imshow(X_train_ex2[y_train_ex2==0][i])
    axes[0, i].set_title('Dog')
    axes[0, i].axis('off')
    axes[1, i].imshow(X_train_ex2[y_train_ex2==1][i])
    axes[1, i].set_title('Cat')
    axes[1, i].axis('off')
plt.tight_layout()
plt.show()

# Step 1: Build the CNN architecture
class BinaryCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),

            nn.Conv2d(32, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),

            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 8 * 8, 128),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, 1),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x

model_binary = BinaryCNN().to(device)
print("\nCNN architecture:")
print(model_binary)
total_params = sum(p.numel() for p in model_binary.parameters())
print(f"Total parameters: {total_params:,}")

# Convert to tensors (NCHW); labels become float columns for BCEWithLogitsLoss
X_tr_t = torch.from_numpy(X_train_ex2).permute(0, 3, 1, 2)
y_tr_t = torch.from_numpy(y_train_ex2).float().unsqueeze(1)
X_val_t = torch.from_numpy(X_val_ex2).permute(0, 3, 1, 2)
y_val_t = torch.from_numpy(y_val_ex2).float().unsqueeze(1)
X_test_t = torch.from_numpy(X_test_ex2).permute(0, 3, 1, 2)
y_test_t = torch.from_numpy(y_test_ex2).float().unsqueeze(1)

train_loader_ex2 = DataLoader(TensorDataset(X_tr_t, y_tr_t), batch_size=32, shuffle=True)
val_loader_ex2 = DataLoader(TensorDataset(X_val_t, y_val_t), batch_size=32, shuffle=False)
test_loader_ex2 = DataLoader(TensorDataset(X_test_t, y_test_t), batch_size=32, shuffle=False)

# Steps 2-3: Training with BCEWithLogitsLoss and early stopping
criterion_ex2 = nn.BCEWithLogitsLoss()
optimizer_ex2 = optim.Adam(model_binary.parameters(), lr=0.001)

history_ex2 = {'accuracy': [], 'val_accuracy': [], 'loss': [], 'val_loss': []}
best_val_loss_ex2 = float('inf')
best_state_ex2 = None
patience_counter_ex2 = 0

for epoch in range(30):
    model_binary.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for inputs, labels in train_loader_ex2:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer_ex2.zero_grad()
        outputs = model_binary(inputs)
        loss = criterion_ex2(outputs, labels)
        loss.backward()
        optimizer_ex2.step()
        running_loss += loss.item() * inputs.size(0)
        predicted = (torch.sigmoid(outputs) > 0.5).float()
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    train_loss = running_loss / total
    train_acc = correct / total

    model_binary.eval()
    val_loss = 0.0
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for inputs, labels in val_loader_ex2:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model_binary(inputs)
            loss = criterion_ex2(outputs, labels)
            val_loss += loss.item() * inputs.size(0)
            predicted = (torch.sigmoid(outputs) > 0.5).float()
            val_total += labels.size(0)
            val_correct += (predicted == labels).sum().item()

    val_loss = val_loss / val_total
    val_acc = val_correct / val_total

    history_ex2['loss'].append(train_loss)
    history_ex2['accuracy'].append(train_acc)
    history_ex2['val_loss'].append(val_loss)
    history_ex2['val_accuracy'].append(val_acc)

    print(f"Epoch {epoch+1}/30 - loss: {train_loss:.4f} - accuracy: {train_acc:.4f} - "
          f"val_loss: {val_loss:.4f} - val_accuracy: {val_acc:.4f}")

    if val_loss < best_val_loss_ex2:
        best_val_loss_ex2 = val_loss
        best_state_ex2 = {k: v.clone() for k, v in model_binary.state_dict().items()}
        patience_counter_ex2 = 0
    else:
        patience_counter_ex2 += 1
        if patience_counter_ex2 >= 5:
            print(f"Early stopping at epoch {epoch+1}")
            break

if best_state_ex2 is not None:
    model_binary.load_state_dict(best_state_ex2)

# Step 4: Plot the learning curves
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
axes[0].plot(history_ex2['accuracy'], label='Train Accuracy')
axes[0].plot(history_ex2['val_accuracy'], label='Val Accuracy')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Accuracy')
axes[0].set_title('Model Accuracy')
axes[0].legend()
axes[0].grid(alpha=0.3)

axes[1].plot(history_ex2['loss'], label='Train Loss')
axes[1].plot(history_ex2['val_loss'], label='Val Loss')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Loss')
axes[1].set_title('Model Loss')
axes[1].legend()
axes[1].grid(alpha=0.3)
plt.tight_layout()
plt.show()

# Step 5: Final evaluation
model_binary.eval()
test_loss_ex2 = 0.0
test_correct = 0
test_total = 0
all_preds_ex2 = []
all_probs_ex2 = []

with torch.no_grad():
    for inputs, labels in test_loader_ex2:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model_binary(inputs)
        loss = criterion_ex2(outputs, labels)
        test_loss_ex2 += loss.item() * inputs.size(0)
        probs = torch.sigmoid(outputs)
        predicted = (probs > 0.5).float()
        test_total += labels.size(0)
        test_correct += (predicted == labels).sum().item()
        all_preds_ex2.append(predicted.cpu())
        all_probs_ex2.append(probs.cpu())

test_loss_ex2 = test_loss_ex2 / test_total
test_acc_ex2 = test_correct / test_total
y_pred_ex2 = torch.cat(all_preds_ex2).numpy().flatten().astype(int)
y_probs_ex2 = torch.cat(all_probs_ex2).numpy().flatten()

test_auc_ex2 = roc_auc_score(y_test_ex2, y_probs_ex2)

print("\n" + "=" * 70)
print("TEST SET RESULTS")
print("=" * 70)
print(f"Test Loss: {test_loss_ex2:.4f}")
print(f"Test Accuracy: {test_acc_ex2:.4f}")
print(f"Test AUC: {test_auc_ex2:.4f}")

# Confusion matrix
cm_ex2 = confusion_matrix(y_test_ex2, y_pred_ex2)
plt.figure(figsize=(8, 6))
sns.heatmap(
    cm_ex2, annot=True, fmt='d', cmap='Blues',
    xticklabels=['Dog', 'Cat'],
    yticklabels=['Dog', 'Cat']
)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()

print("\nClassification Report:")
print(classification_report(
    y_test_ex2, y_pred_ex2,
    target_names=['Dog', 'Cat']
))

print("\nExercise 2 completed!")
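
Exercise 2 ends the network with a single raw logit and leaves the sigmoid to the loss. The plain-Python sketch below shows why that works: a commonly used numerically stable form of binary cross-entropy on logits agrees with the textbook formula, and thresholding the sigmoid at 0.5 is the same as checking the sign of the logit. The function names are hypothetical and this is not presented as the library's internal implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_naive(z, y):
    """Textbook binary cross-entropy on the probability sigmoid(z)."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def bce_with_logits(z, y):
    """Stable form on the raw logit: max(z, 0) - z*y + log(1 + exp(-|z|))."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


examples = [(2.0, 1), (2.0, 0), (-1.5, 0), (-1.5, 1), (0.3, 1)]
for z, y in examples:
    print(f"z={z:+.1f} y={y}  naive={bce_naive(z, y):.6f}  "
          f"stable={bce_with_logits(z, y):.6f}")

# The stable form still works where the naive one would hit log(0)
large_logit_loss = bce_with_logits(800.0, 0)
print(large_logit_loss)

# sigmoid(z) > 0.5 is equivalent to z > 0
same_decision = all((sigmoid(z) > 0.5) == (z > 0) for z in (-3.0, -0.1, 0.0, 0.2, 4.0))
print(same_decision)
```

For very large logits the naive version fails because the probability rounds to exactly 1, which is one reason to keep the sigmoid out of the model and apply it only when probabilities are needed, as the evaluation loop does.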

### 3.4 Data Augmentation to Improve Performance

In [None]:
# Data augmentation with torchvision transforms
from torchvision import transforms
from PIL import Image

augmentation_transform = transforms.Compose([
    transforms.RandomRotation(15),
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(0, translate=(0.1, 0.1)),
    transforms.ColorJitter(brightness=0.1),
    transforms.ToTensor(),
])

# Visualize the augmentations
sample_img = X_train[0]  # HWC uint8
sample_pil = Image.fromarray(sample_img)

fig, axes = plt.subplots(2, 5, figsize=(12, 5))
axes = axes.flatten()

axes[0].imshow(sample_img)
axes[0].set_title('Original')
axes[0].axis('off')

for i in range(1, 10):
    aug_tensor = augmentation_transform(sample_pil)
    # Convert CHW tensor back to HWC numpy for display
    aug_img = aug_tensor.permute(1, 2, 0).numpy()
    axes[i].imshow(aug_img)
    axes[i].set_title(f'Aug {i}')
    axes[i].axis('off')

plt.suptitle(
    'Data Augmentation Examples',
    fontsize=14, fontweight='bold'
)
plt.tight_layout()
plt.show()
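
To see what two of these transforms do to the pixel array, here is a rough NumPy sketch of a random horizontal flip and a random translation. It is an illustration under simplifying assumptions (integer shifts, zero fill, a random test image), not the torchvision implementation; the function names are hypothetical.

```python
import numpy as np


def random_horizontal_flip(img_hwc, rng, p=0.5):
    """Mirror the image left-right with probability p."""
    if rng.random() < p:
        return img_hwc[:, ::-1, :]
    return img_hwc


def random_translate(img_hwc, rng, max_fraction=0.1):
    """Shift by up to max_fraction of each side, filling the gap with zeros."""
    h, w, _ = img_hwc.shape
    max_dy, max_dx = int(h * max_fraction), int(w * max_fraction)
    dy = int(rng.integers(-max_dy, max_dy + 1))
    dx = int(rng.integers(-max_dx, max_dx + 1))
    shifted = np.zeros_like(img_hwc)
    # Destination and source windows that overlap after the shift
    dst_rows = slice(max(dy, 0), h + min(dy, 0))
    src_rows = slice(max(-dy, 0), h + min(-dy, 0))
    dst_cols = slice(max(dx, 0), w + min(dx, 0))
    src_cols = slice(max(-dx, 0), w + min(-dx, 0))
    shifted[dst_rows, dst_cols] = img_hwc[src_rows, src_cols]
    return shifted


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3)).astype(np.uint8)  # stand-in image

augmented = random_translate(random_horizontal_flip(image, rng), rng)
print(image.shape, augmented.shape)
```

Both operations keep the image size and the label unchanged, which is what lets the same training image be reused in many slightly different versions.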

### 3.5 Advanced CNN with Data Augmentation

In [None]:
class AdvancedCNN(nn.Module):
    """
    Advanced CNN with more layers and regularization.
    """
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            # Block 1
            nn.Conv2d(3, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Dropout(0.2),

            # Block 2
            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.Conv2d(128, 128, 3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Dropout(0.3),

            # Block 3
            nn.Conv2d(128, 256, 3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
            nn.Dropout(0.4),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 4 * 4, 512),
            nn.ReLU(),
            nn.BatchNorm1d(512),
            nn.Dropout(0.5),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x


model_advanced = AdvancedCNN().to(device)
total_params_adv = sum(p.numel() for p in model_advanced.parameters())
print(f"Total parameters: {total_params_adv:,}")

# FIX: Proper validation split (not using test set)
from sklearn.model_selection import train_test_split as sk_split

# Split training data into train/val
indices_all = np.arange(len(X_train_norm))
idx_tr_adv, idx_val_adv = sk_split(
    indices_all, test_size=0.2, random_state=42
)

X_tr_adv_np = X_train_norm[idx_tr_adv]
y_tr_adv_np = y_train[idx_tr_adv]
X_val_adv_np = X_train_norm[idx_val_adv]
y_val_adv_np = y_train[idx_val_adv]

# Create augmented dataset using torchvision transforms
train_augment_transform = transforms.Compose([
    transforms.RandomRotation(15),
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(0, translate=(0.1, 0.1)),
    transforms.ToTensor(),
])

# Custom dataset for augmentation
class AugmentedDataset(torch.utils.data.Dataset):
    def __init__(self, images_np, labels_np, transform=None):
        """images_np: NHWC float32 [0,1], labels_np: int array"""
        self.images = images_np
        self.labels = labels_np
        self.transform = transform

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        img = self.images[idx]
        label = self.labels[idx]
        # Convert to PIL for transforms
        img_uint8 = (img * 255).astype(np.uint8)
        pil_img = Image.fromarray(img_uint8)
        if self.transform:
            img_tensor = self.transform(pil_img)
        else:
            img_tensor = transforms.ToTensor()(pil_img)
        return img_tensor, torch.tensor(label, dtype=torch.long)

train_aug_dataset = AugmentedDataset(X_tr_adv_np, y_tr_adv_np, train_augment_transform)
val_adv_tensor_x = torch.from_numpy(X_val_adv_np).permute(0, 3, 1, 2).float()
val_adv_tensor_y = torch.from_numpy(y_val_adv_np).long()
val_adv_dataset = TensorDataset(val_adv_tensor_x, val_adv_tensor_y)

train_loader_adv = DataLoader(train_aug_dataset, batch_size=128, shuffle=True)
val_loader_adv = DataLoader(val_adv_dataset, batch_size=128, shuffle=False)

# Training with augmentation (with weight caching)
history_advanced = load_or_train(
    model_advanced,
    lambda: train_model(
        model_advanced, train_loader_adv, val_loader_adv,
        epochs=5, lr=0.001, patience=5, patience_lr=3
    ),
    'nb05_advanced_cnn.pt',
    device=device
)

# Risultati
test_loss_adv, test_acc_adv, _, _ = evaluate_model(
    model_advanced, X_test_tensor, y_test_tensor
)
print(f"\nCNN Avanzata - Test Accuracy: {test_acc_adv:.4f}")


## Exercise 3

In [None]:
# ==========================================================
# EXERCISE 3: Data Augmentation and Performance Comparison
# ==========================================================
# Task: compare CNN performance with and without
#       data augmentation
# Dataset: CIFAR-10 subset with 3 classes (5000 images)
#
# NOTE: variables use the _ex3 suffix to avoid overwriting earlier ones

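import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from PIL import Image  # used below to build PIL inputs for the transforms
from torch.utils.data import DataLoader, TensorDataset
from torchvision import transforms  # used below for the augmentation pipeline
from sklearn.model_selection import train_test_split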

# Caricamento subset CIFAR-10
np.random.seed(789)

selected_classes = [0, 1, 2]  # airplane, automobile, bird
class_names_ex3 = ['airplane', 'automobile', 'bird']

mask_train_ex3 = np.isin(y_train, selected_classes)
mask_test_ex3 = np.isin(y_test, selected_classes)

X_subset_ex3 = X_train[mask_train_ex3][:5000]
y_subset_ex3 = y_train[mask_train_ex3][:5000]
X_test_ex3 = X_test[mask_test_ex3][:1000]
y_test_ex3 = y_test[mask_test_ex3][:1000]

# FIX: remapping label con dizionario
label_map = {old: new for new, old in enumerate(selected_classes)}
y_subset_ex3 = np.array([label_map[y] for y in y_subset_ex3])
y_test_ex3 = np.array([label_map[y] for y in y_test_ex3])

X_subset_ex3 = X_subset_ex3.astype('float32') / 255.0
X_test_ex3 = X_test_ex3.astype('float32') / 255.0

X_tr_ex3, X_val_ex3, y_tr_ex3, y_val_ex3 = (
    train_test_split(
        X_subset_ex3, y_subset_ex3,
        test_size=0.2, random_state=789
    )
)

print("Dataset CIFAR-10 subset (3 classi)")
print(
    f"Train: {X_tr_ex3.shape}, "
    f"Val: {X_val_ex3.shape}, "
    f"Test: {X_test_ex3.shape}"
)
print(f"Classi: {class_names_ex3}")

# Visualizzazione augmentation
augment_ex3 = transforms.Compose([
    transforms.RandomRotation(20),
    transforms.RandomHorizontalFlip(),
    transforms.RandomAffine(0, translate=(0.15, 0.15)),
    transforms.ToTensor(),
])

sample_image_ex3 = X_tr_ex3[0]
sample_pil_ex3 = Image.fromarray((sample_image_ex3 * 255).astype(np.uint8))

fig, axes = plt.subplots(2, 5, figsize=(12, 5))
axes = axes.flatten()
axes[0].imshow(sample_image_ex3)
axes[0].set_title('Originale')
axes[0].axis('off')

for i in range(1, 10):
    aug_t = augment_ex3(sample_pil_ex3)
    aug_np = aug_t.permute(1, 2, 0).numpy()
    axes[i].imshow(aug_np)
    axes[i].set_title(f'Aug {i}')
    axes[i].axis('off')

plt.tight_layout()
plt.show()

# Step 2: Creazione modello CNN
class CNNForAugmentation(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),

            nn.Conv2d(32, 64, 3, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),

            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 4 * 4, 128),
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.classifier(x)
        return x


model_no_aug = CNNForAugmentation().to(device)
model_with_aug = CNNForAugmentation().to(device)

print("\nArchitettura CNN:")
print(model_no_aug)

# Prepare data
X_tr_ex3_t = torch.from_numpy(X_tr_ex3).permute(0, 3, 1, 2).float()
y_tr_ex3_t = torch.from_numpy(y_tr_ex3).long()
X_val_ex3_t = torch.from_numpy(X_val_ex3).permute(0, 3, 1, 2).float()
y_val_ex3_t = torch.from_numpy(y_val_ex3).long()
X_test_ex3_t = torch.from_numpy(X_test_ex3).permute(0, 3, 1, 2).float()
y_test_ex3_t = torch.from_numpy(y_test_ex3).long()

train_ds_no_aug = TensorDataset(X_tr_ex3_t, y_tr_ex3_t)
val_ds_ex3 = TensorDataset(X_val_ex3_t, y_val_ex3_t)

train_loader_no_aug = DataLoader(train_ds_no_aug, batch_size=32, shuffle=True)
val_loader_ex3 = DataLoader(val_ds_ex3, batch_size=32, shuffle=False)

# Step 3: Training senza augmentation
print("\nTraining SENZA data augmentation...")
history_no_aug = train_model(
    model_no_aug, train_loader_no_aug, val_loader_ex3,
    epochs=25, lr=0.001, patience=5, patience_lr=3
)

# Step 4: Training con augmentation
print("\nTraining CON data augmentation...")
train_aug_ds_ex3 = AugmentedDataset(
    X_tr_ex3, y_tr_ex3, augment_ex3
)
train_loader_with_aug = DataLoader(train_aug_ds_ex3, batch_size=32, shuffle=True)

history_with_aug = train_model(
    model_with_aug, train_loader_with_aug, val_loader_ex3,
    epochs=40, lr=0.001, patience=8, patience_lr=3
)

# Step 5: Confronto risultati
test_loss_no, test_acc_no, _, _ = evaluate_model(
    model_no_aug, X_test_ex3_t, y_test_ex3_t
)
test_loss_with, test_acc_with, _, _ = evaluate_model(
    model_with_aug, X_test_ex3_t, y_test_ex3_t
)

train_acc_no = max(history_no_aug['accuracy'])
val_acc_no = max(history_no_aug['val_accuracy'])
train_acc_with = max(history_with_aug['accuracy'])
val_acc_with = max(history_with_aug['val_accuracy'])

import pandas as pd
comparison_data = [
    {
        'modello': 'Senza Augmentation',
        'train_acc': train_acc_no,
        'val_acc': val_acc_no,
        'test_acc': test_acc_no
    },
    {
        'modello': 'Con Augmentation',
        'train_acc': train_acc_with,
        'val_acc': val_acc_with,
        'test_acc': test_acc_with
    }
]
results_ex3 = pd.DataFrame(comparison_data)

print("\n" + "=" * 70)
print("CONFRONTO: SENZA vs CON Data Augmentation")
print("=" * 70)
print(results_ex3.to_string(index=False))

# Visualizzazione confronto
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].plot(history_no_aug['accuracy'], label='Train no aug', linestyle='--')
axes[0].plot(history_no_aug['val_accuracy'], label='Val no aug', linestyle='--')
axes[0].plot(history_with_aug['accuracy'], label='Train with aug')
axes[0].plot(history_with_aug['val_accuracy'], label='Val with aug')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Accuracy')
axes[0].set_title('Accuracy: Con vs Senza Augmentation')
axes[0].legend()
axes[0].grid(alpha=0.3)

model_names_ex3 = ['Senza Aug', 'Con Aug']
train_accs_ex3 = [train_acc_no, train_acc_with]
test_accs_ex3 = [test_acc_no, test_acc_with]

x_ex3 = np.arange(len(model_names_ex3))
width_ex3 = 0.35

axes[1].bar(x_ex3 - width_ex3/2, train_accs_ex3, width_ex3, label='Train Accuracy', alpha=0.8)
axes[1].bar(x_ex3 + width_ex3/2, test_accs_ex3, width_ex3, label='Test Accuracy', alpha=0.8)
axes[1].set_ylabel('Accuracy')
axes[1].set_title('Confronto Performance')
axes[1].set_xticks(x_ex3)
axes[1].set_xticklabels(model_names_ex3)
axes[1].legend()
axes[1].grid(axis='y', alpha=0.3)

plt.tight_layout()
plt.show()

delta_acc = (test_acc_with - test_acc_no) * 100
# Gap is computed as train - val: a larger positive value means more overfitting
gap_no = (train_acc_no - val_acc_no) * 100
gap_with = (train_acc_with - val_acc_with) * 100
print(f"\nTest accuracy change with augmentation: {delta_acc:+.2f} percentage points")
print("Overfitting (train-val accuracy gap, best epoch of each curve):")
print(f"  Without aug: {gap_no:.2f} pp")
print(f"  With aug: {gap_with:.2f} pp")

print("\nEsercizio 3 completato!")

---

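## 4. Transfer Learning with Foundation Models

**Transfer Learning** reuses models pre-trained on very large datasets. The torchvision weights used in this section were trained on ImageNet-1k (about 1.28M training images, 1000 classes), a subset of the full ImageNet collection of roughly 14M images.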

### Advantages:
1. **Less data required**: the backbone already extracts general-purpose features
2. **Faster training**: only the classifier has to be trained
3. **Better performance**: especially when little data is available
4. **Flexibility**: fine-tuning adapts the model to the task
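
To make the "faster training" point concrete, the sketch below counts parameters by hand for a VGG16-style convolutional base (thirteen 3x3 convolutions with bias, in the standard VGG16 channel configuration) and for a small classification head. The helper names `conv_params` and `linear_params` and the head sizes are illustrative assumptions, not part of any library.

```python
# Standard VGG16 configuration: output channels per 3x3 conv, 'M' = max-pool.
VGG16_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']


def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution."""
    return c_in * c_out * k * k + c_out


def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


backbone_params = 0
c_in = 3  # RGB input
for item in VGG16_CFG:
    if item == 'M':
        continue  # pooling layers have no parameters
    backbone_params += conv_params(c_in, item)
    c_in = item

# Hypothetical head: Linear(512, 256) -> BatchNorm1d(256) -> Linear(256, 10).
# BatchNorm1d contributes one scale and one shift per feature.
head_params = linear_params(512, 256) + 2 * 256 + linear_params(256, 10)

print(f"Frozen backbone: {backbone_params:,}")
print(f"Trainable head:  {head_params:,}")
print(f"Trainable share: {head_params / (backbone_params + head_params):.2%}")
```

With the backbone frozen, only about one percent of the weights receive gradient updates, which is where most of the training-time saving comes from.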

### Approaches:
- **Feature Extraction**: freeze the base model and train only the classifier
- **Fine-tuning**: unfreeze the last layers of the base model and keep training at a low learning rate
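
The two approaches differ only in which parameters have `requires_grad=True`. The sketch below shows this on a tiny stand-in network; `backbone`, `head`, and `count_trainable` are hypothetical names, and no pre-trained weights are downloaded.

```python
import torch.nn as nn

# Tiny stand-in for a pre-trained backbone (illustrative only).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
)
head = nn.Sequential(nn.Flatten(), nn.Linear(16, 10))
model = nn.Sequential(backbone, head)


def count_trainable(module):
    """Number of scalar parameters that will receive gradients."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Feature extraction: freeze the whole backbone, train only the head.
for p in backbone.parameters():
    p.requires_grad = False
feature_extraction_trainable = count_trainable(model)

# Fine-tuning: additionally unfreeze the last conv layer of the backbone.
for p in backbone[3].parameters():
    p.requires_grad = True
fine_tuning_trainable = count_trainable(model)

print(feature_extraction_trainable, fine_tuning_trainable)
```

Backbones that contain BatchNorm layers need extra care: freezing `requires_grad` does not stop the running statistics from updating while the model is in training mode.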

### 4.1 VGG16 - Transfer Learning

In [None]:
# VGG16 Transfer Learning
# ImageNet normalization
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

# Normalize CIFAR-10 data with ImageNet stats
def normalize_imagenet(X_np):
    """X_np: NHWC float32 [0,255] -> NCHW tensor normalized with ImageNet stats."""
    X = X_np.astype('float32') / 255.0
    X_t = torch.from_numpy(X).permute(0, 3, 1, 2)  # NCHW
    for c in range(3):
        X_t[:, c] = (X_t[:, c] - imagenet_mean[c]) / imagenet_std[c]
    return X_t

X_train_vgg = normalize_imagenet(X_train)
X_test_vgg = normalize_imagenet(X_test)
y_train_t = torch.from_numpy(y_train).long()
y_test_t = torch.from_numpy(y_test).long()

# Caricamento VGG16 pre-trained su ImageNet
base_model_vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)

# Congela tutti i parametri
for param in base_model_vgg.parameters():
    param.requires_grad = False

n_frozen = sum(1 for p in base_model_vgg.parameters() if not p.requires_grad)
print("VGG16 caricato")
print(f"Layer congelati: {n_frozen} parameter groups")
print(f"Parametri totali: {sum(p.numel() for p in base_model_vgg.parameters()):,}")

# Costruzione modello completo: VGG16 features + custom classifier
class VGG16Transfer(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = base_model_vgg.features
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.BatchNorm1d(256),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x)
        x = self.classifier(x)
        return x

model_vgg = VGG16Transfer().to(device)
print(model_vgg)

# Prepare data loaders
train_vgg_ds = TensorDataset(X_train_vgg, y_train_t)
n_val_vgg = int(0.2 * len(train_vgg_ds))
n_train_vgg = len(train_vgg_ds) - n_val_vgg
train_vgg_sub, val_vgg_sub = random_split(
    train_vgg_ds, [n_train_vgg, n_val_vgg],
    generator=torch.Generator().manual_seed(42)
)

train_loader_vgg = DataLoader(train_vgg_sub, batch_size=128, shuffle=True)
val_loader_vgg = DataLoader(val_vgg_sub, batch_size=128, shuffle=False)

# Training (with weight caching)
history_vgg = load_or_train(
    model_vgg,
    lambda: train_model(
        model_vgg, train_loader_vgg, val_loader_vgg,
        epochs=10, lr=0.001, patience=5, patience_lr=3
    ),
    'nb05_vgg16_feature.pt',
    device=device
)


### 4.2 Fine-tuning

Once the classifier is trained, we can **unfreeze** some of the final layers of the base model and continue training at a lower learning rate.
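
One common variant, shown here as an assumption rather than as what `train_model` does internally, gives the unfrozen backbone layers a smaller learning rate than the head through optimizer parameter groups. The network is a tiny stand-in and all names are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)
head = nn.Sequential(nn.Flatten(), nn.Linear(16, 10))

# Freeze everything in the backbone, then unfreeze only its last conv layer.
for p in backbone.parameters():
    p.requires_grad = False
for p in backbone[2].parameters():
    p.requires_grad = True

# Two parameter groups: small steps for the backbone, larger steps for the head.
optimizer = torch.optim.Adam([
    {"params": [p for p in backbone.parameters() if p.requires_grad], "lr": 1e-5},
    {"params": head.parameters(), "lr": 1e-3},
])

# One training step on random data to check that frozen weights stay fixed.
x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
frozen_before = backbone[0].weight.clone()

loss = nn.functional.cross_entropy(head(backbone(x)), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(frozen_before, backbone[0].weight)
print("Frozen layer unchanged:", frozen_unchanged)
```

The small backbone learning rate limits how far the pre-trained filters can drift while the head adapts.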

In [None]:
# Scongela gli ultimi layer delle features di VGG16
# VGG16 features has layers indexed 0-30
# Unfreeze the last convolutional block (layers 24-30)
for i, layer in enumerate(model_vgg.features):
    if i >= 24:
        for param in layer.parameters():
            param.requires_grad = True

print("Layer trainable dopo fine-tuning:")
for i, layer in enumerate(model_vgg.features):
    has_params = any(True for _ in layer.parameters())
    if has_params:
        trainable = any(p.requires_grad for p in layer.parameters())
        print(f"  features[{i}]: {layer.__class__.__name__} - Trainable: {trainable}")

# Fine-tuning con learning rate piu' basso (with weight caching)
history_finetune = load_or_train(
    model_vgg,
    lambda: train_model(
        model_vgg, train_loader_vgg, val_loader_vgg,
        epochs=10, lr=1e-5, patience=5, patience_lr=3
    ),
    'nb05_vgg16_finetuned.pt',
    device=device
)


### 4.3 ResNet50 - Transfer Learning

In [None]:
# ResNet50 con skip connections
base_model_resnet = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Congela tutti i parametri
for param in base_model_resnet.parameters():
    param.requires_grad = False

class ResNet50Transfer(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Use all layers except the final FC
        self.backbone = nn.Sequential(*list(base_model_resnet.children())[:-1])
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(2048, 256),
            nn.ReLU(),
            nn.BatchNorm1d(256),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes)
        )

    def forward(self, x):
        x = self.backbone(x)
        x = self.classifier(x)
        return x

model_resnet = ResNet50Transfer().to(device)

# Reuse ImageNet-normalized data
X_train_resnet = X_train_vgg  # Same normalization
X_test_resnet = X_test_vgg

train_resnet_ds = TensorDataset(X_train_resnet, y_train_t)
n_val_rn = int(0.2 * len(train_resnet_ds))
n_train_rn = len(train_resnet_ds) - n_val_rn
train_rn_sub, val_rn_sub = random_split(
    train_resnet_ds, [n_train_rn, n_val_rn],
    generator=torch.Generator().manual_seed(42)
)

train_loader_rn = DataLoader(train_rn_sub, batch_size=128, shuffle=True)
val_loader_rn = DataLoader(val_rn_sub, batch_size=128, shuffle=False)

history_resnet = load_or_train(
    model_resnet,
    lambda: train_model(
        model_resnet, train_loader_rn, val_loader_rn,
        epochs=5, lr=0.001, patience=3, patience_lr=2
    ),
    'nb05_resnet50.pt',
    device=device
)

test_loss_rn, test_acc_rn, _, _ = evaluate_model(
    model_resnet, X_test_resnet, y_test_t
)
print(f"ResNet50 - Test Accuracy: {test_acc_rn:.4f}")


### 4.4 EfficientNet - Accuracy/Efficiency Trade-off

In [None]:
# EfficientNetB0 - bilancia accuratezza e efficienza
base_model_eff = models.efficientnet_b0(weights=models.EfficientNet_B0_Weights.IMAGENET1K_V1)

# Congela tutti i parametri
for param in base_model_eff.parameters():
    param.requires_grad = False

class EfficientNetTransfer(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.backbone = base_model_eff.features
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1280, 128),
            nn.ReLU(),
            nn.Dropout(0.4),
            nn.Linear(128, num_classes)
        )

    def forward(self, x):
        x = self.backbone(x)
        x = self.pool(x)
        x = self.classifier(x)
        return x

model_efficient = EfficientNetTransfer().to(device)

# EfficientNet uses same ImageNet normalization
X_train_eff = X_train_vgg
X_test_eff = X_test_vgg

train_eff_ds = TensorDataset(X_train_eff, y_train_t)
n_val_eff = int(0.2 * len(train_eff_ds))
n_train_eff = len(train_eff_ds) - n_val_eff
train_eff_sub, val_eff_sub = random_split(
    train_eff_ds, [n_train_eff, n_val_eff],
    generator=torch.Generator().manual_seed(42)
)

train_loader_eff = DataLoader(train_eff_sub, batch_size=128, shuffle=True)
val_loader_eff = DataLoader(val_eff_sub, batch_size=128, shuffle=False)

history_efficient = load_or_train(
    model_efficient,
    lambda: train_model(
        model_efficient, train_loader_eff, val_loader_eff,
        epochs=5, lr=0.001, patience=3, patience_lr=2
    ),
    'nb05_efficientnet.pt',
    device=device
)

test_loss_eff, test_acc_eff, _, _ = evaluate_model(
    model_efficient, X_test_eff, y_test_t
)
print(f"EfficientNetB0 - Test Accuracy: {test_acc_eff:.4f}")


### 4.5 Model Comparison

In [None]:
# Valutazione tutti i modelli
modelli = {
    'CNN Simple': (model_simple, X_test_tensor, y_test_tensor),
    'CNN Advanced': (model_advanced, X_test_tensor, y_test_tensor),
    'VGG16': (model_vgg, X_test_vgg, y_test_t),
    'ResNet50': (model_resnet, X_test_resnet, y_test_t),
    'EfficientNetB0': (model_efficient, X_test_eff, y_test_t),
}

risultati = []

for nome, (modello, X_eval, y_eval) in modelli.items():
    t_loss, t_acc, _, _ = evaluate_model(modello, X_eval, y_eval)
    n_params = sum(p.numel() for p in modello.parameters())

    risultati.append({
        'Modello': nome,
        'Test Accuracy': t_acc,
        'Parametri': n_params
    })

df_risultati = pd.DataFrame(risultati)
df_risultati = df_risultati.sort_values('Test Accuracy', ascending=False)

print("\n" + "=" * 60)
print("CONFRONTO MODELLI")
print("=" * 60)
print(df_risultati.to_string(index=False))

# Visualizzazione
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Accuracy
df_sorted = df_risultati.sort_values('Test Accuracy', ascending=True)
axes[0].barh(
    df_sorted['Modello'],
    df_sorted['Test Accuracy'],
    color='steelblue', alpha=0.8
)
axes[0].set_xlabel('Test Accuracy')
axes[0].set_title('Confronto Accuracy')
axes[0].grid(axis='x', alpha=0.3)

# Parametri
axes[1].barh(
    df_sorted['Modello'],
    df_sorted['Parametri'] / 1e6,
    color='coral', alpha=0.8
)
axes[1].set_xlabel('Parametri (milioni)')
axes[1].set_title('Confronto Parametri')
axes[1].grid(axis='x', alpha=0.3)

plt.tight_layout()
plt.show()

## Exercise 4

In [None]:
# ==========================================================
# EXERCISE 4: Transfer Learning and Fine-Tuning
# ==========================================================
# Task: use a pre-trained MobileNetV2 for classification
#       with fine-tuning
# Dataset: synthetic "Flowers" (1000 images, 5 flower classes)
#
# NOTE: variables use the _ex4 suffix

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models  # used below to load MobileNetV2
from sklearn.model_selection import train_test_split

# Generate the synthetic flower dataset
np.random.seed(999)

def generate_flower_images(n_samples, img_size=96, flower_type=0):
    """Noisy colored disk (the 'flower') on a uniform-noise background."""
    images = np.random.rand(n_samples, img_size, img_size, 3).astype('float32')
    color_schemes = [
        [1.0, 0.2, 0.2],  # Rose - red
        [1.0, 1.0, 0.3],  # Sunflower - yellow
        [0.6, 0.3, 1.0],  # Tulip - purple
        [1.0, 0.5, 0.8],  # Daisy - pink
        [0.3, 0.5, 1.0],  # Iris - blue
    ]
    # Vectorized replacement for the per-pixel Python loops:
    # boolean mask of the pixels closer to the center than img_size // 3
    center = img_size // 2
    yy, xx = np.mgrid[:img_size, :img_size]
    disk = np.sqrt((xx - center)**2 + (yy - center)**2) < img_size // 3
    # Base color plus independent Gaussian noise for every pixel in the disk
    noise = np.random.randn(n_samples, int(disk.sum()), 3) * 0.1
    images[:, disk] = np.asarray(color_schemes[flower_type]) + noise
    images = np.clip(images, 0, 1)
    return images

flower_names = ['Rose', 'Sunflower', 'Tulip', 'Daisy', 'Iris']
samples_per_class = 200

X_flowers = []
y_flowers = []
for class_id, flower_name in enumerate(flower_names):
    images = generate_flower_images(samples_per_class, flower_type=class_id)
    X_flowers.append(images)
    y_flowers.extend([class_id] * samples_per_class)

X_flowers = np.vstack(X_flowers)
y_flowers = np.array(y_flowers)

indices_ex4 = np.random.permutation(len(X_flowers))
X_flowers = X_flowers[indices_ex4]
y_flowers = y_flowers[indices_ex4]

X_train_ex4, X_temp_ex4, y_train_ex4, y_temp_ex4 = (
    train_test_split(X_flowers, y_flowers, test_size=0.3, random_state=999, stratify=y_flowers)
)
X_val_ex4, X_test_ex4, y_val_ex4, y_test_ex4 = (
    train_test_split(X_temp_ex4, y_temp_ex4, test_size=0.5, random_state=999, stratify=y_temp_ex4)
)

print("Dataset Flowers")
print(f"Train: {X_train_ex4.shape}, Val: {X_val_ex4.shape}, Test: {X_test_ex4.shape}")
print(f"Classi: {flower_names}")

fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for i in range(5):
    idx = np.where(y_train_ex4 == i)[0][0]
    axes[i].imshow(X_train_ex4[idx])
    axes[i].set_title(flower_names[i])
    axes[i].axis('off')
plt.tight_layout()
plt.show()

# Step 1: Caricamento base model MobileNetV2
base_model_ex4 = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)

# Congela tutti i parametri
for param in base_model_ex4.parameters():
    param.requires_grad = False

print(f"\nMobileNetV2 caricato")
print(f"Parametri base model: {sum(p.numel() for p in base_model_ex4.parameters()):,}")

# Step 2: Costruzione modello completo
class MobileNetTransfer(nn.Module):
    def __init__(self, base_model, num_classes=5):
        super().__init__()
        self.features = base_model.features
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(1280, 256),
            nn.ReLU(),
            nn.BatchNorm1d(256),
            nn.Dropout(0.5),
            nn.Linear(256, num_classes)
        )

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x)
        x = self.classifier(x)
        return x

model_transfer_ex4 = MobileNetTransfer(base_model_ex4).to(device)

print("\nModello completo:")
print(model_transfer_ex4)
print(f"Parametri totali: {sum(p.numel() for p in model_transfer_ex4.parameters()):,}")
print(f"Parametri trainable: {sum(p.numel() for p in model_transfer_ex4.parameters() if p.requires_grad):,}")

# Prepare data - normalize with ImageNet stats
def normalize_imagenet_from_float(X_np):
    """X_np: NHWC float32 [0,1] -> NCHW tensor normalized with ImageNet stats."""
    # torch.from_numpy shares memory with X_np: clone first, otherwise the
    # in-place normalization below would silently overwrite the NumPy array.
    X_t = torch.from_numpy(X_np).permute(0, 3, 1, 2).clone()
    for c in range(3):
        X_t[:, c] = (X_t[:, c] - imagenet_mean[c]) / imagenet_std[c]
    return X_t

X_tr_ex4_t = normalize_imagenet_from_float(X_train_ex4)
y_tr_ex4_t = torch.from_numpy(y_train_ex4).long()
X_val_ex4_t = normalize_imagenet_from_float(X_val_ex4)
y_val_ex4_t = torch.from_numpy(y_val_ex4).long()
X_test_ex4_t = normalize_imagenet_from_float(X_test_ex4)
y_test_ex4_t = torch.from_numpy(y_test_ex4).long()

train_loader_ex4 = DataLoader(TensorDataset(X_tr_ex4_t, y_tr_ex4_t), batch_size=32, shuffle=True)
val_loader_ex4 = DataLoader(TensorDataset(X_val_ex4_t, y_val_ex4_t), batch_size=32, shuffle=False)

# Step 3: Training fase 1 - solo classificatore
print("\nFASE 1: Training solo classificatore...")
history_phase1 = load_or_train(
    model_transfer_ex4,
    lambda: train_model(
        model_transfer_ex4, train_loader_ex4, val_loader_ex4,
        epochs=15, lr=0.001, patience=3, patience_lr=2
    ),
    'nb05_mobilenet_phase1.pt',
    device=device
)

_, val_acc_p1, _, _ = evaluate_model(model_transfer_ex4, X_val_ex4_t, y_val_ex4_t)
print(f"Fase 1 - Val Accuracy: {val_acc_p1:.4f}")

# Step 4: Fine-tuning - unfreeze the last layers
# Unfreeze the last 5 modules of `features` (in torchvision's MobileNetV2:
# four inverted residual blocks plus the final 1x1 conv block)
for block in model_transfer_ex4.features[-5:]:
    for param in block.parameters():
        param.requires_grad = True

# Count scalar parameters (numel), not parameter tensors
n_trainable = sum(p.numel() for p in model_transfer_ex4.parameters() if p.requires_grad)
print(f"\nTrainable parameters after unfreezing: {n_trainable:,}")

# Step 5: Training fase 2 - fine-tuning
print("\nFASE 2: Fine-tuning...")
history_phase2 = load_or_train(
    model_transfer_ex4,
    lambda: train_model(
        model_transfer_ex4, train_loader_ex4, val_loader_ex4,
        epochs=10, lr=0.0001, patience=3, patience_lr=2
    ),
    'nb05_mobilenet_phase2.pt',
    device=device
)

_, val_acc_p2, _, _ = evaluate_model(model_transfer_ex4, X_val_ex4_t, y_val_ex4_t)
_, test_acc_ex4, _, _ = evaluate_model(model_transfer_ex4, X_test_ex4_t, y_test_ex4_t)

print(f"\nFase 2 - Val Accuracy: {val_acc_p2:.4f}")
print(f"Test Accuracy finale: {test_acc_ex4:.4f}")

# Visualizzazione risultati
results_ex4 = pd.DataFrame({
    'Fase': ['Feature Extraction', 'Fine-Tuning'],
    'Val Accuracy': [val_acc_p1, val_acc_p2],
    'Miglioramento': [0, val_acc_p2 - val_acc_p1]
})

print("\n" + "=" * 70)
print("RISULTATI TRANSFER LEARNING")
print("=" * 70)
print(results_ex4.to_string(index=False))

# Plot learning curves entrambe le fasi
if history_phase1 is not None and history_phase2 is not None:
    fig, axes = plt.subplots(1, 2, figsize=(14, 5))

    all_train_acc = history_phase1['accuracy'] + history_phase2['accuracy']
    all_val_acc = history_phase1['val_accuracy'] + history_phase2['val_accuracy']
    phase1_epochs = len(history_phase1['accuracy'])

    axes[0].plot(range(len(all_train_acc)), all_train_acc, label='Train')
    axes[0].plot(range(len(all_val_acc)), all_val_acc, label='Validation')
    axes[0].axvline(x=phase1_epochs, color='r', linestyle='--', label='Inizio Fine-Tuning')
    axes[0].set_xlabel('Epoch')
    axes[0].set_ylabel('Accuracy')
    axes[0].set_title('Learning Curves: Feature Extraction + Fine-Tuning')
    axes[0].legend()
    axes[0].grid(alpha=0.3)

    phases = ['Feature\nExtraction', 'Fine-Tuning']
    val_accs_ex4 = [val_acc_p1, val_acc_p2]
    colors_ex4 = ['skyblue', 'lightcoral']

    axes[1].bar(phases, val_accs_ex4, color=colors_ex4, alpha=0.8)
    axes[1].axhline(y=test_acc_ex4, color='green', linestyle='--', label=f'Test Acc: {test_acc_ex4:.4f}')
    axes[1].set_ylabel('Accuracy')
    axes[1].set_title('Confronto Fasi Transfer Learning')
    axes[1].legend()
    axes[1].grid(axis='y', alpha=0.3)

    plt.tight_layout()
    plt.show()
else:
    print("Using pretrained weights - training curves not available")
    phases = ['Feature\nExtraction', 'Fine-Tuning']
    val_accs_ex4 = [val_acc_p1, val_acc_p2]
    colors_ex4 = ['skyblue', 'lightcoral']
    fig, ax = plt.subplots(figsize=(7, 5))
    ax.bar(phases, val_accs_ex4, color=colors_ex4, alpha=0.8)
    ax.axhline(y=test_acc_ex4, color='green', linestyle='--', label=f'Test Acc: {test_acc_ex4:.4f}')
    ax.set_ylabel('Accuracy')
    ax.set_title('Confronto Fasi Transfer Learning')
    ax.legend()
    ax.grid(axis='y', alpha=0.3)
    plt.tight_layout()
    plt.show()

# Confusion matrix
from sklearn.metrics import confusion_matrix, classification_report
import seaborn as sns

_, _, y_pred_ex4, _ = evaluate_model(model_transfer_ex4, X_test_ex4_t, y_test_ex4_t)

cm_ex4 = confusion_matrix(y_test_ex4, y_pred_ex4)
plt.figure(figsize=(8, 6))
sns.heatmap(
    cm_ex4, annot=True, fmt='d', cmap='Blues',
    xticklabels=flower_names,
    yticklabels=flower_names
)
plt.xlabel('Predetto')
plt.ylabel('Reale')
plt.title('Confusion Matrix - Transfer Learning')
plt.tight_layout()
plt.show()

print("\nClassification Report:")
print(classification_report(y_test_ex4, y_pred_ex4, target_names=flower_names))

improvement = (val_acc_p2 - val_acc_p1) * 100
print(f"\nMiglioramento da Feature Extraction a Fine-Tuning: {improvement:.2f}%")

print("\nEsercizio 4 completato!")


---

## 5. Using Foundation Models via API

Many cloud services expose Computer Vision APIs, so no model has to be trained.
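
Such services typically receive the image inside an HTTP request, often as a base64 string in a JSON body. The sketch below shows only that encoding round trip; the field names (`image_base64`, `task`, `top_k`) and both helper functions are hypothetical and do not follow any specific provider's schema.

```python
import base64
import json


def build_vision_request(image_bytes, top_k=5):
    """Client side: wrap raw image bytes in a JSON body (hypothetical fields)."""
    return json.dumps({
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "task": "classification",
        "top_k": top_k,
    })


def read_image_from_request(body):
    """Server side: recover the original bytes from the JSON body."""
    return base64.b64decode(json.loads(body)["image_base64"])


# Stand-in for the contents of a real image file.
fake_image_bytes = bytes(range(256))

body = build_vision_request(fake_image_bytes)
recovered = read_image_from_request(body)

print("Round trip OK:", recovered == fake_image_bytes)
```

Base64 makes the payload roughly one third larger than the raw bytes, which is why some services also accept a storage URL in place of inline content.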

### 5.1 Simulating a Google Cloud Vision-style API

In [None]:
import base64
from io import BytesIO
from PIL import Image

class SimulatedVisionAPI:
    """
    Simulated Computer Vision API backed by a local model.
    In production you would make real HTTP calls to a service such as:
    - Google Cloud Vision API
    - AWS Rekognition
    - Azure Computer Vision
    """

    def __init__(self, model, device=device):
        self.model = model
        self.device = device
        self.class_names = [
            'airplane', 'automobile', 'bird',
            'cat', 'deer', 'dog', 'frog',
            'horse', 'ship', 'truck'
        ]

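    def predict_from_array(self, image_array, top_k=5):
        """Predict from a NumPy array (HWC, uint8 in [0, 255] or float)."""
        img = image_array.astype('float32')
        # Check the dtype as well as the max value, so that a very dark uint8
        # image (max <= 1) is not mistaken for an already-normalized float image.
        if np.issubdtype(image_array.dtype, np.integer) or img.max() > 1:
            img = img / 255.0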

        # Convert to tensor NCHW
        img_tensor = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0).to(self.device)

        self.model.eval()
        with torch.no_grad():
            outputs = self.model(img_tensor)
            probs = torch.softmax(outputs, dim=1)[0].cpu().numpy()

        top_indices = np.argsort(probs)[-top_k:][::-1]

        results = []
        for idx in top_indices:
            results.append({
                'class': self.class_names[idx],
                'confidence': float(probs[idx])
            })

        return {
            'predictions': results,
            'top_class': results[0]['class'],
            'top_confidence': results[0]['confidence']
        }


# Test API simulata
api = SimulatedVisionAPI(model_simple)

# Predizione su alcune immagini di test
print("Test API Computer Vision")
print("=" * 50)

for i in range(5):
    result = api.predict_from_array(X_test[i])
    true_label = class_names[y_test[i]]
    pred_label = result['top_class']
    conf = result['top_confidence']
    status = "OK" if true_label == pred_label else "MISS"
    print(
        f"[{status}] True: {true_label:>12s} | "
        f"Pred: {pred_label:>12s} "
        f"(conf: {conf:.2%})"
    )

### 5.2 Example: REST API with Flask (structure)

In production, you could expose the model through a REST API:

In [None]:
# Illustrative code (kept inside a string, so it is not executed here)
"""
from flask import Flask, request, jsonify
import numpy as np
import torch
from PIL import Image

app = Flask(__name__)

# CIFAR-10 class names, in label order
class_names = [
    'airplane', 'automobile', 'bird', 'cat', 'deer',
    'dog', 'frog', 'horse', 'ship', 'truck'
]

# Load the model at startup (SimpleCNN must be importable from your own code)
model = SimpleCNN()
model.load_state_dict(torch.load('best_model.pth', map_location='cpu'))
model.eval()

@app.route('/predict', methods=['POST'])
def predict():
    # Receive the image; force 3 channels so grayscale/RGBA uploads also work
    file = request.files['image']
    img = Image.open(file).convert('RGB').resize((32, 32))

    # Preprocessing: HWC uint8 -> NCHW float32 in [0, 1]
    img_array = np.array(img).astype('float32') / 255.0
    img_tensor = torch.from_numpy(img_array).permute(2, 0, 1).unsqueeze(0)

    # Prediction
    with torch.no_grad():
        outputs = model(img_tensor)
        probs = torch.softmax(outputs, dim=1)
        class_idx = probs.argmax(1).item()

    return jsonify({
        'class': class_names[class_idx],
        'confidence': float(probs[0][class_idx])
    })

if __name__ == '__main__':
    # Flask's debug mode must not be enabled in production
    app.run(debug=False)
"""
print("Codice Flask illustrativo (non eseguito)")

---

## 6. Visualizing Feature Maps

Let's visualize what the CNN "sees" at different depths.
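
Feature maps shrink with depth. As a minimal sketch, assuming a 32x32 input and three stages of 3x3 convolution (padding 1) followed by 2x2 max pooling, the spatial sizes can be computed with the usual output-size formula. The helper `conv_out` is a hypothetical name.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


size = 32  # CIFAR-10 images are 32x32
conv_sizes = []
for _ in range(3):
    size = conv_out(size, kernel=3, padding=1)  # 3x3 conv with padding=1 keeps the size
    conv_sizes.append(size)                     # size of the feature map at the conv output
    size = conv_out(size, kernel=2, stride=2)   # 2x2 max pooling halves it

print("Feature map sizes at conv1..conv3:", conv_sizes)
print("Size entering the classifier:", size)
```

The final value is why the classifiers in this notebook flatten to `channels * 4 * 4` features.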

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Reuse the already-trained model_simple for meaningful feature maps
# Use PyTorch hooks to extract feature maps

print("Feature extractor creato dal modello trainato (model_simple)!")
print(model_simple)

# Define which layers to visualize
# model_simple.features has: Conv2d(0), BN(1), ReLU(2), MaxPool(3),
#                            Conv2d(4), BN(5), ReLU(6), MaxPool(7),
#                            Conv2d(8), BN(9), ReLU(10), MaxPool(11)
target_layers = {
    'conv1': model_simple.features[0],
    'conv2': model_simple.features[4],
    'conv3': model_simple.features[8],
}

# Register hooks to capture activations
activations = {}
hooks = []

def get_activation(name):
    def hook(model, input, output):
        activations[name] = output.detach().cpu()
    return hook

for name, layer in target_layers.items():
    h = layer.register_forward_hook(get_activation(name))
    hooks.append(h)

# Run a forward pass with a test image
test_img = X_test_norm[0]  # HWC float32 [0,1]
test_tensor = torch.from_numpy(test_img).permute(2, 0, 1).unsqueeze(0).float().to(device)

model_simple.eval()
with torch.no_grad():
    _ = model_simple(test_tensor)

# Remove hooks
for h in hooks:
    h.remove()

print(f"Image shape: {test_img.shape}")

# Show the original image
plt.figure(figsize=(4, 4))
plt.imshow(test_img)
plt.title(
    f"Original image: {class_names[y_test[0]]}",
    fontsize=14, fontweight='bold'
)
plt.axis('off')
plt.tight_layout()
plt.show()

# Show the feature maps
layer_display_names = {
    'conv1': 'Conv1 (32 filters)',
    'conv2': 'Conv2 (64 filters)',
    'conv3': 'Conv3 (128 filters)',
}

for layer_key in ['conv1', 'conv2', 'conv3']:
    layer_act = activations[layer_key][0]  # Remove batch dim: [C, H, W]
    n_features = layer_act.shape[0]
    size = layer_act.shape[1]

    n_cols = 8
    n_rows = 2

    fig, axes = plt.subplots(n_rows, n_cols, figsize=(16, 4))

    for i, ax in enumerate(axes.flat):
        if i < min(n_features, 16):
            feature_map = layer_act[i].numpy()  # [H, W]
            ax.imshow(feature_map, cmap='viridis')
            ax.axis('off')
            ax.set_title(f'F{i}', fontsize=8)
        else:
            ax.axis('off')

    display_name = layer_display_names[layer_key]
    plt.suptitle(
        f'{display_name}: {n_features} filters, '
        f'feature maps {size}x{size}',
        fontsize=14, fontweight='bold'
    )
    plt.tight_layout()
    plt.show()

**Observations** (feature maps from the trained model `model_simple`; each figure shows only the first 16 channels of the layer):
- **Layer 1 (conv1)**: detects edges and basic colors
- **Layer 2 (conv2)**: more complex patterns, combinations of edges
- **Layer 3 (conv3)**: abstract, high-level representations

The feature maps come from the already-trained model, so they reflect patterns actually learned during training.

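The feature maps shrink from one hooked layer to the next because each convolution is followed by a pooling step. The sketch below works through the standard output-size formula, floor((n + 2p - k) / s) + 1. It assumes 3x3 convolutions with padding 1 and 2x2 max pooling with stride 2, as in the `CustomCNN` of section 7; the exact settings of `model_simple` are defined elsewhere, so treat those values as an assumption.

```python
def conv_out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1

size = 32  # CIFAR-10 images are 32x32
sizes = {}
for name in ['conv1', 'conv2', 'conv3']:
    # Assumed 3x3 convolution with padding 1: spatial size is preserved
    size = conv_out_size(size, kernel=3, stride=1, padding=1)
    sizes[name] = size  # this is the size a hook on the conv layer would see
    # Assumed 2x2 max pooling with stride 2: spatial size is halved
    size = conv_out_size(size, kernel=2, stride=2)

print(sizes)
```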
---

## 7. Exercise: A Complete Computer Vision System

Implement an end-to-end system for image classification.

In [None]:
class SistemaComputerVision:
    """
    Complete Computer Vision system.
    Supports training from scratch and transfer learning.
    """

    def __init__(self, approccio='cnn_custom', base_model_name='VGG16'):
        self.approccio = approccio
        self.base_model_name = base_model_name
        self.model = None
        self.history = None
        self.class_names = None
        self.base_model = None
        self.input_shape = None
        self.lr = 0.001  # default, so train() also works without compile_model()
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    def build_model(self, input_shape, num_classes):
        """Builds the model for the selected approach."""
        self.input_shape = input_shape
        C, H, W = input_shape  # PyTorch: CHW

        if self.approccio == 'cnn_custom':
            class CustomCNN(nn.Module):
                def __init__(self):
                    super().__init__()
                    self.features = nn.Sequential(
                        nn.Conv2d(C, 32, 3, padding=1),
                        nn.BatchNorm2d(32), nn.ReLU(),
                        nn.MaxPool2d(2, 2), nn.Dropout(0.2),

                        nn.Conv2d(32, 64, 3, padding=1),
                        nn.BatchNorm2d(64), nn.ReLU(),
                        nn.MaxPool2d(2, 2), nn.Dropout(0.3),

                        nn.Conv2d(64, 128, 3, padding=1),
                        nn.BatchNorm2d(128), nn.ReLU(),
                        nn.MaxPool2d(2, 2), nn.Dropout(0.4),

                        nn.Conv2d(128, 256, 3, padding=1),
                        nn.BatchNorm2d(256), nn.ReLU(),
                        nn.AdaptiveAvgPool2d((1, 1)),
                    )
                    self.classifier = nn.Sequential(
                        nn.Flatten(),
                        nn.Linear(256, 256), nn.ReLU(),
                        nn.BatchNorm1d(256), nn.Dropout(0.5),
                        nn.Linear(256, num_classes)
                    )
                def forward(self, x):
                    return self.classifier(self.features(x))

            self.model = CustomCNN().to(self.device)

        elif self.approccio == 'transfer_learning':
            model_map = {
                'VGG16': (models.vgg16, models.VGG16_Weights.IMAGENET1K_V1),
                'ResNet50': (models.resnet50, models.ResNet50_Weights.IMAGENET1K_V1),
                'EfficientNetB0': (models.efficientnet_b0, models.EfficientNet_B0_Weights.IMAGENET1K_V1),
            }
            if self.base_model_name not in model_map:
                raise ValueError(f"Unsupported base model: {self.base_model_name}")

            ModelClass, weights = model_map[self.base_model_name]
            self.base_model = ModelClass(weights=weights)

            for param in self.base_model.parameters():
                param.requires_grad = False

            # Get feature extractor
            if self.base_model_name == 'VGG16':
                features = self.base_model.features
                feat_dim = 512
            elif self.base_model_name == 'ResNet50':
                features = nn.Sequential(*list(self.base_model.children())[:-1])
                feat_dim = 2048
            elif self.base_model_name == 'EfficientNetB0':
                features = self.base_model.features
                feat_dim = 1280

            class TransferModel(nn.Module):
                def __init__(self, features, feat_dim, num_classes):
                    super().__init__()
                    self.features = features
                    self.pool = nn.AdaptiveAvgPool2d((1, 1))
                    self.classifier = nn.Sequential(
                        nn.Flatten(),
                        nn.Linear(feat_dim, 256), nn.ReLU(),
                        nn.BatchNorm1d(256), nn.Dropout(0.5),
                        nn.Linear(256, num_classes)
                    )
                def forward(self, x):
                    x = self.features(x)
                    x = self.pool(x)
                    return self.classifier(x)

            self.model = TransferModel(features, feat_dim, num_classes).to(self.device)

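        else:
            # Without this branch an unknown approach would leave self.model as None
            raise ValueError(f"Unsupported approach: {self.approccio}")

        total_params = sum(p.numel() for p in self.model.parameters())
        print(f"Model {self.approccio} built")
        print(f"Total parameters: {total_params:,}")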

    def _normalize(self, X_tensor):
        """Applies model-specific normalization (inputs are expected in [0, 1])."""
        if self.approccio == 'transfer_learning':
            # ImageNet normalization
            mean = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1).to(X_tensor.device)
            std = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1).to(X_tensor.device)
            return (X_tensor - mean) / std
        return X_tensor

    def compile_model(self, learning_rate=0.001):
        """Stores the learning rate (PyTorch has no separate compile step)."""
        self.lr = learning_rate
        print("Model configured")

    def train(self, X_np, y_np, X_val=None, y_val=None, epochs=20, batch_size=128, use_augmentation=True):
        """Trains the model."""
        # Convert to tensors
        X_t = torch.from_numpy(X_np.transpose(0, 3, 1, 2) if X_np.ndim == 4 and X_np.shape[-1] in [1,3] else X_np).float()
        y_t = torch.from_numpy(y_np).long()
        X_t = self._normalize(X_t)

        if X_val is not None and y_val is not None:
            X_v = torch.from_numpy(X_val.transpose(0, 3, 1, 2) if X_val.ndim == 4 and X_val.shape[-1] in [1,3] else X_val).float()
            y_v = torch.from_numpy(y_val).long()
            X_v = self._normalize(X_v)
        else:
            n = len(X_t)
            n_val = int(0.2 * n)
            perm = torch.randperm(n)
            X_v, y_v = X_t[perm[:n_val]], y_t[perm[:n_val]]
            X_t, y_t = X_t[perm[n_val:]], y_t[perm[n_val:]]

        train_loader = DataLoader(TensorDataset(X_t, y_t), batch_size=batch_size, shuffle=True)
        val_loader = DataLoader(TensorDataset(X_v, y_v), batch_size=batch_size, shuffle=False)

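        # NOTE: use_augmentation is not forwarded to train_model or the loaders,
        # so here it only changes the message that is printed.
        if use_augmentation and self.approccio == 'cnn_custom':
            print("Training (augmentation requested; no transforms are applied in this method)...")
        else:
            print("Training without data augmentation...")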

        self.history = train_model(
            self.model, train_loader, val_loader,
            epochs=epochs, lr=self.lr, patience=5, patience_lr=3, device=self.device
        )
        print("Training complete")

    def fine_tune(self, X_np, y_np, X_val=None, y_val=None, epochs=10, layers_to_unfreeze=4):
        """Fine-tunes the last feature layers (transfer learning only)."""
        if self.approccio != 'transfer_learning':
            raise ValueError("Fine-tuning is only available for transfer learning")

        # Unfreeze last N layers of features.
        # Guard against N == 0: a [-0:] slice would select every layer.
        feature_layers = list(self.model.features.children())
        if layers_to_unfreeze > 0:
            for layer in feature_layers[-layers_to_unfreeze:]:
                for param in layer.parameters():
                    param.requires_grad = True

        print(f"Fine-tuning: unfreezing the last {layers_to_unfreeze} layers...")

        old_lr = self.lr
        self.lr = 1e-5

        X_t = torch.from_numpy(X_np.transpose(0, 3, 1, 2) if X_np.ndim == 4 and X_np.shape[-1] in [1,3] else X_np).float()
        y_t = torch.from_numpy(y_np).long()
        X_t = self._normalize(X_t)

        if X_val is not None and y_val is not None:
            X_v = torch.from_numpy(X_val.transpose(0, 3, 1, 2) if X_val.ndim == 4 and X_val.shape[-1] in [1,3] else X_val).float()
            y_v = torch.from_numpy(y_val).long()
            X_v = self._normalize(X_v)
        else:
            n = len(X_t)
            n_val = int(0.2 * n)
            perm = torch.randperm(n)
            X_v, y_v = X_t[perm[:n_val]], y_t[perm[:n_val]]
            X_t, y_t = X_t[perm[n_val:]], y_t[perm[n_val:]]

        train_loader = DataLoader(TensorDataset(X_t, y_t), batch_size=128, shuffle=True)
        val_loader = DataLoader(TensorDataset(X_v, y_v), batch_size=128, shuffle=False)

        ft_history = train_model(
            self.model, train_loader, val_loader,
            epochs=epochs, lr=self.lr, patience=3, patience_lr=2, device=self.device
        )

        if self.history is not None:
            for key in self.history:
                if key in ft_history:
                    self.history[key].extend(ft_history[key])
        else:
            self.history = ft_history

        self.lr = old_lr
        print("Fine-tuning complete")

    def evaluate(self, X_np, y_np, class_names=None):
        """Evaluates the model."""
        self.class_names = class_names

        X_t = torch.from_numpy(X_np.transpose(0, 3, 1, 2) if X_np.ndim == 4 and X_np.shape[-1] in [1,3] else X_np).float()
        y_t = torch.from_numpy(y_np).long()
        X_t = self._normalize(X_t)

        t_loss, t_acc, y_pred, y_proba = evaluate_model(self.model, X_t, y_t, device=self.device)

        report = classification_report(y_np, y_pred, target_names=class_names)
        cm = confusion_matrix(y_np, y_pred)

        plt.figure(figsize=(12, 10))
        sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
                    xticklabels=class_names, yticklabels=class_names)
        plt.xlabel('Predicted')
        plt.ylabel('Actual')
        plt.title(f'Confusion Matrix - {self.approccio}')
        plt.xticks(rotation=45)
        plt.yticks(rotation=45)
        plt.tight_layout()
        plt.show()

        print(f"Test Loss: {t_loss:.4f}")
        print(f"Test Accuracy: {t_acc:.4f}")
        print(f"\nClassification Report:\n{report}")

        return {'accuracy': t_acc, 'loss': t_loss, 'confusion_matrix': cm, 'classification_report': report}

    def visualize_predictions(self, X_np, y_np, n=10):
        """Shows predictions on a random sample."""
        indices = np.random.choice(len(X_np), min(n, len(X_np)), replace=False)
        sample_images = X_np[indices]
        sample_labels = y_np[indices]

        X_t = torch.from_numpy(sample_images.transpose(0, 3, 1, 2) if sample_images.ndim == 4 and sample_images.shape[-1] in [1,3] else sample_images).float()
        X_t = self._normalize(X_t)

        self.model.eval()
        with torch.no_grad():
            outputs = self.model(X_t.to(self.device))
            probs = torch.softmax(outputs, dim=1).cpu().numpy()
        pred_labels = probs.argmax(axis=1)

        n_cols = 5
        n_rows = max(1, (len(sample_images) + n_cols - 1) // n_cols)
        fig, axes = plt.subplots(n_rows, n_cols, figsize=(15, 3 * n_rows))
        # With n_cols=5, subplots always returns an array of axes, even for n == 1
        axes = np.atleast_1d(axes).flatten()

        for i in range(min(n, len(sample_images))):
            if i < len(axes):
                axes[i].imshow(sample_images[i])
                tl, pl = sample_labels[i], pred_labels[i]
                color = 'green' if tl == pl else 'red'
                tn = self.class_names[tl] if self.class_names else str(tl)
                pn = self.class_names[pl] if self.class_names else str(pl)
                conf = probs[i][pl]
                axes[i].set_title(f"True: {tn}\nPred: {pn} ({conf:.2f})", color=color, fontsize=10)
                axes[i].axis('off')

        for i in range(len(sample_images), len(axes)):
            axes[i].axis('off')

        plt.suptitle(f'Predictions - {self.approccio}', fontsize=14, fontweight='bold')
        plt.tight_layout()
        plt.show()

    def plot_training_history(self):
        """Plots the learning curves."""
        if self.history is None:
            raise ValueError("No training history available")

        fig, axes = plt.subplots(1, 2, figsize=(14, 5))

        axes[0].plot(self.history['loss'], label='Training Loss')
        if 'val_loss' in self.history:
            axes[0].plot(self.history['val_loss'], label='Validation Loss')
        axes[0].set_xlabel('Epoch')
        axes[0].set_ylabel('Loss')
        axes[0].set_title('Loss during training')
        axes[0].legend()
        axes[0].grid(alpha=0.3)

        axes[1].plot(self.history['accuracy'], label='Training Accuracy')
        if 'val_accuracy' in self.history:
            axes[1].plot(self.history['val_accuracy'], label='Validation Accuracy')
        axes[1].set_xlabel('Epoch')
        axes[1].set_ylabel('Accuracy')
        axes[1].set_title('Accuracy during training')
        axes[1].legend()
        axes[1].grid(alpha=0.3)

        plt.tight_layout()
        plt.show()

    def save_model(self, filepath):
        """Saves the model weights (state_dict)."""
        torch.save(self.model.state_dict(), filepath)
        print(f"Model saved to {filepath}")

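    def load_model(self, filepath):
        """Loads the model weights (state_dict) into an already built model."""
        if self.model is None:
            raise ValueError("Call build_model() before load_model()")
        # weights_only=True matches how weights are loaded elsewhere in this notebook
        self.model.load_state_dict(torch.load(filepath, map_location=self.device, weights_only=True))
        print(f"Model loaded from {filepath}")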


# ===== System test =====
print("=" * 60)
print("COMPUTER VISION SYSTEM TEST")
print("=" * 60)

# Uses X_train_norm and X_test_norm (CIFAR-10)

print("\n" + "=" * 60)
print("Test CNN Custom:")
print("=" * 60)
sistema_custom = SistemaComputerVision(approccio='cnn_custom')
sistema_custom.build_model(input_shape=(3, 32, 32), num_classes=10)
sistema_custom.compile_model()

_wf_custom = os.path.join(WEIGHTS_DIR, 'nb05_sistema_custom.pt')
if os.path.exists(_wf_custom):
    sistema_custom.model.load_state_dict(torch.load(_wf_custom, map_location=device, weights_only=True))
    print(f"Loaded pretrained weights from {_wf_custom}")
else:
    sistema_custom.train(
        X_train_norm[:10000], y_train[:10000],
        epochs=5, use_augmentation=True
    )
    torch.save(sistema_custom.model.state_dict(), _wf_custom)
    print(f"Saved weights to {_wf_custom}")

if sistema_custom.history is not None:
    sistema_custom.plot_training_history()
else:
    print("Using pretrained weights - training curves not available")
metriche = sistema_custom.evaluate(
    X_test_norm[:1000], y_test[:1000], class_names=class_names
)
sistema_custom.visualize_predictions(
    X_test_norm[:1000], y_test[:1000], n=10
)

print("\n" + "=" * 60)
print("Test Transfer Learning:")
print("=" * 60)
sistema_tl = SistemaComputerVision(
    approccio='transfer_learning', base_model_name='VGG16'
)
sistema_tl.build_model(input_shape=(3, 32, 32), num_classes=10)
sistema_tl.compile_model()

_wf_tl = os.path.join(WEIGHTS_DIR, 'nb05_sistema_tl_finetuned.pt')
if os.path.exists(_wf_tl):
    sistema_tl.model.load_state_dict(torch.load(_wf_tl, map_location=device, weights_only=True))
    print(f"Loaded pretrained weights from {_wf_tl}")
else:
    sistema_tl.train(
        X_train_norm[:10000], y_train[:10000],
        epochs=5, use_augmentation=False
    )
    torch.save(sistema_tl.model.state_dict(), os.path.join(WEIGHTS_DIR, 'nb05_sistema_tl_trained.pt'))
    sistema_tl.fine_tune(
        X_train_norm[:10000], y_train[:10000],
        epochs=3, layers_to_unfreeze=4
    )
    torch.save(sistema_tl.model.state_dict(), _wf_tl)
    print(f"Saved weights to {_wf_tl}")

if sistema_tl.history is not None:
    sistema_tl.plot_training_history()
else:
    print("Using pretrained weights - training curves not available")
metriche_tl = sistema_tl.evaluate(
    X_test_norm[:1000], y_test[:1000], class_names=class_names
)
sistema_tl.visualize_predictions(
    X_test_norm[:1000], y_test[:1000], n=10
)

# Final comparison
print("\n" + "=" * 60)
print("FINAL COMPARISON")
print("=" * 60)
print(f"Accuracy CNN Custom: {metriche['accuracy']:.4f}")
print(f"Accuracy Transfer Learning: {metriche_tl['accuracy']:.4f}")


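When no validation set is passed, `train` and `fine_tune` fall back to holding out 20% of the samples through a random permutation. The sketch below reproduces that index logic with the standard library only; `holdout_split` and its `seed` parameter are hypothetical names added for illustration. Because the class draws a fresh permutation on every call, fixing a seed like this (or passing explicit `X_val`/`y_val`) is one way to keep the same validation samples across the training and fine-tuning phases.

```python
import random

def holdout_split(n_samples, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) taken from one shuffled permutation."""
    rng = random.Random(seed)       # seeded generator: same split on every call
    perm = list(range(n_samples))
    rng.shuffle(perm)
    n_val = int(val_fraction * n_samples)
    # First n_val shuffled indices go to validation, the rest to training
    return perm[n_val:], perm[:n_val]

train_idx, val_idx = holdout_split(10_000)
print(len(train_idx), len(val_idx))
```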
---

## Conclusions

In this notebook we explored:

- **CNN architecture**: convolutional layers, pooling, feature hierarchies
- **CNN implementation**: from simple to advanced, on CIFAR-10
- **Data augmentation**: to improve generalization
- **Transfer learning**: VGG16, ResNet50, EfficientNet
- **Fine-tuning**: unfreezing layers and retraining
- **Computer Vision API**: deployment through a REST API
- **Visualization**: feature maps for interpretability

### Key concepts to remember

1. **CNNs are specialized for images**: parameter sharing and translation invariance
2. **Transfer learning is very effective**: it needs less data and less training time
3. **Fine-tuning > feature extraction**: if you have enough data
4. **Low learning rate for fine-tuning**: avoids "breaking" the pre-trained weights
5. **Data augmentation is essential**: for small datasets
6. **Global average pooling > Flatten** (`nn.AdaptiveAvgPool2d((1, 1))` in the code above): fewer parameters, less overfitting

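The parameter saving behind point 6 can be checked with simple arithmetic. The sketch below compares the first fully connected layer after `Flatten` with the same layer after global average pooling, using the shapes that `CustomCNN` reaches on 32x32 inputs (256 channels at 4x4 before pooling, then a 256-unit hidden layer). `linear_params` is a hypothetical helper written for this illustration.

```python
def linear_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return in_features * out_features + out_features

channels, height, width = 256, 4, 4  # last feature map of CustomCNN on 32x32 input
hidden = 256

# Flatten keeps every spatial position: 256 * 4 * 4 = 4096 inputs
flatten_params = linear_params(channels * height * width, hidden)
# Global average pooling leaves one value per channel: 256 inputs
gap_params = linear_params(channels, hidden)

print(flatten_params, gap_params)
```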
### Recommended workflow for new projects

1. **Start with transfer learning**: pre-trained model + custom classifier
2. **Train the classifier**: keep the base model frozen
3. **Evaluate performance**: if it is good enough, stop here
4. **If you need more accuracy**: fine-tune the last layers
5. **Data augmentation**: always, especially with a small dataset
6. **Only if necessary**: train a CNN from scratch

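Steps 2 and 4 of the workflow come down to toggling which parameters are trainable. The toy sketch below illustrates that bookkeeping in plain Python: `ToyParam` and the helper functions are hypothetical stand-ins for `nn.Module` parameters and their `requires_grad` flag, not PyTorch code. It also shows why selecting "the last N blocks" with a negative slice needs a guard: in Python, `blocks[-0:]` is the whole list.

```python
class ToyParam:
    """Hypothetical stand-in for a parameter with a requires_grad flag."""
    def __init__(self):
        self.requires_grad = True

def freeze_all(blocks):
    for block in blocks:
        for p in block:
            p.requires_grad = False

def unfreeze_last(blocks, n):
    """Unfreeze the last n blocks; n == 0 must change nothing."""
    if n <= 0:  # blocks[-0:] would select the whole list
        return
    for block in blocks[-n:]:
        for p in block:
            p.requires_grad = True

def count_trainable(blocks):
    return sum(p.requires_grad for block in blocks for p in block)

backbone = [[ToyParam(), ToyParam()] for _ in range(5)]  # 5 blocks, 2 params each

freeze_all(backbone)               # step 2: backbone frozen, only a new head would train
print(count_trainable(backbone))
unfreeze_last(backbone, 2)         # step 4: last two blocks become trainable again
print(count_trainable(backbone))
```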
### Next steps

The next notebook covers:
- Natural Language Processing
- Text preprocessing
- Traditional models (TF-IDF, Bag of Words)
- Embeddings and Word2Vec

### Further reading

- [CS231n: Convolutional Neural Networks - Stanford](http://cs231n.stanford.edu/)
- [MIT 6.S191: Introduction to Deep Learning](http://introtodeeplearning.com/)
- [PyTorch Image Classification Tutorial](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html)
- [Transfer Learning with PyTorch](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html)
- [EfficientNet Paper](https://arxiv.org/abs/1905.11946)