# Learning Approaches in Computer Vision

Training computer vision models calls for strategies and techniques that account for the particular properties of visual data. This notebook covers several learning approaches, from conventional supervised learning to self-supervised techniques.

## Supervised Learning

**Supervised learning** is the most widely used approach for computer vision tasks.

### Data Annotation

Annotation is the process of labeling the training data:

- **Manual annotation**: Experts label data with dedicated tools
- **Crowdsourcing**: Platforms such as Amazon Mechanical Turk
- **Semi-automatic annotation**: Model-assisted labeling

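### Annotation Challenges

Each task needs its own annotation format, and richer formats take more effort per image:

```python
import numpy as np

# Examples of annotation formats for three common tasks.
# The box coordinates are illustrative placeholder values.
annotations = {
    'classification': ['cat', 'dog', 'bird', 'car'],
    'detection': [
        # bbox is [x1, y1, x2, y2] in pixels
        {'class': 'person', 'bbox': [34, 20, 120, 310], 'confidence': 0.9},
        {'class': 'car', 'bbox': [150, 180, 400, 300], 'confidence': 0.8}
    ],
    'segmentation': {
        'mask': np.array([[0, 1, 1, 0], [1, 1, 0, 0]]),  # Binary mask
        'class': 'person'
    }
}
```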

### Training Process

The supervised learning process consists of the following phases; steps 3 to 6 repeat for every mini-batch:

1. **Data preparation**: Load and preprocess the data
2. **Model initialization**: Start from pre-trained weights or a random initialization
3. **Forward pass**: Generate predictions
4. **Loss calculation**: Compare the predictions with the ground truth
5. **Backward pass**: Compute the gradients
6. **Parameter update**: Adjust the weights
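
The six phases can be sketched as a minimal PyTorch loop. This is an illustration under stated assumptions, not a recommended setup: the data is random noise standing in for images, the model is a single linear layer, and the batch size and learning rate are arbitrary choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# 1. Data preparation: a tiny synthetic dataset (assumption: random tensors
#    stand in for real, preprocessed images)
images = torch.randn(64, 3, 8, 8)
labels = torch.randint(0, 2, (64,))

# 2. Model initialization: random weights, no pre-training
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

with torch.no_grad():
    initial_loss = criterion(model(images), labels).item()

batch_size = 16
for epoch in range(20):
    for start in range(0, len(images), batch_size):
        x = images[start:start + batch_size]
        y = labels[start:start + batch_size]

        optimizer.zero_grad()          # clear gradients from the previous step
        outputs = model(x)             # 3. forward pass
        loss = criterion(outputs, y)   # 4. loss calculation
        loss.backward()                # 5. backward pass
        optimizer.step()               # 6. parameter update

with torch.no_grad():
    final_loss = criterion(model(images), labels).item()

print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Because the labels are random, the falling loss only shows that the loop is wired correctly; it says nothing about generalization.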

## Transfer Learning

**Transfer learning** reuses pre-trained models for new tasks:

### Feature Extraction

Use a pre-trained model as a fixed feature extractor:

```python
import torch
import torch.nn as nn
import torchvision.models as models

num_classes = 10  # Assumption: number of classes in the new task

# Load a ResNet-50 pre-trained on ImageNet
# (the `weights` argument replaces the deprecated `pretrained=True`)
resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

# Freeze all pre-trained layers
for param in resnet.parameters():
    param.requires_grad = False

# Replace the final layer for the new task; the new layer is trainable by default
num_ftrs = resnet.fc.in_features
resnet.fc = nn.Linear(num_ftrs, num_classes)

# Optimize only the parameters of the new final layer
optimizer = torch.optim.Adam(resnet.fc.parameters(), lr=0.001)
```

### Fine-tuning

Update some or all of the pre-trained weights for the new task. Common strategies differ in how the learning rate is split between the backbone and the head:

```python
# Fine-tuning strategies
strategies = {
    'freeze_backbone': {
        'description': 'Freeze early layers, train later layers',
        'lr_backbone': 0.0,
        'lr_head': 0.001
    },
    'discriminative_lr': {
        'description': 'Lower learning rate for early layers',
        'lr_backbone': 1e-5,
        'lr_head': 1e-3
    },
    'full_finetune': {
        'description': 'Train all layers with the same learning rate',
        'lr_backbone': 1e-4,
        'lr_head': 1e-4
    }
}
```

## Self-Supervised Learning

**Self-supervised learning** learns without human-provided labels:

### Contrastive Learning

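Learn representations by comparing positive and negative pairs:

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Simplified SimCLR model: an encoder followed by a projection head
class SimCLR(nn.Module):
    def __init__(self, base_encoder, feature_dim=128):
        super().__init__()
        self.encoder = base_encoder
        # 2048 is the output size of a ResNet-50 encoder without its
        # classification layer (assumption about base_encoder)
        self.projection_head = nn.Sequential(
            nn.Linear(2048, 512),
            nn.ReLU(),
            nn.Linear(512, feature_dim)
        )

    def forward(self, x):
        features = self.encoder(x)
        features = torch.flatten(features, 1)
        projections = self.projection_head(features)
        return projections

# Data augmentations for contrastive learning; apply the pipeline twice
# to the same image to obtain a positive pair
def get_augmentations():
    # transforms.Compose chains PIL and tensor transforms;
    # nn.Sequential only accepts nn.Module instances, which ToTensor is not
    return transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
        transforms.RandomGrayscale(p=0.2),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])
```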

### NT-Xent Loss

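The **Normalized Temperature-scaled Cross Entropy** (NT-Xent) loss for a positive pair $(i, j)$ among $2N$ augmented views, where $\text{sim}$ is the cosine similarity and $\tau$ is the temperature: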

$$\ell_{i,j} = -\log\frac{\exp(\text{sim}(z_i, z_j)/\tau)}{\sum_{k=1}^{2N}\mathbb{1}_{k\neq i}\exp(\text{sim}(z_i, z_k)/\tau)}$$
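
The formula can be sketched in NumPy as follows. The layout is an assumption made for this example: `z1` and `z2` hold the embeddings of the two augmented views, so row `i` of `z1` and row `i` of `z2` form a positive pair. The function name and the default temperature are illustrative.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Mean NT-Xent loss over all 2N anchors (sketch, not a tuned implementation)."""
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit norm, so dot = cosine
    sim = (z @ z.T) / temperature                      # sim(z_i, z_k) / tau

    # The indicator 1[k != i] removes each anchor from its own denominator
    np.fill_diagonal(sim, -np.inf)

    # Index of the positive partner for every row
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log of the denominator sum
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_numerator = sim[np.arange(2 * n), positives]

    return float(np.mean(log_denominator - log_numerator))

rng = np.random.default_rng(0)
view1 = rng.normal(size=(8, 16))
aligned_loss = nt_xent_loss(view1, view1 + 0.01 * rng.normal(size=(8, 16)))
unrelated_loss = nt_xent_loss(view1, rng.normal(size=(8, 16)))
print(aligned_loss, unrelated_loss)
```

When the two views of each sample have nearly identical embeddings, the loss is lower than for unrelated embeddings, which is the behavior the objective rewards.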

### Masked Image Modeling

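Predict the masked parts of an image:

```python
import torch
import torch.nn as nn

# MAE: Masked Autoencoder (simplified sketch)
class MaskedAutoencoder(nn.Module):
    def __init__(self, encoder, decoder, mask_ratio=0.75, num_patches=196):
        super().__init__()
        # Assumed interfaces: encoder(x, visible) encodes only the visible
        # patches, decoder(encoded) reconstructs the full image
        self.encoder = encoder
        self.decoder = decoder
        self.mask_ratio = mask_ratio
        self.num_patches = num_patches  # Assumption: 14 x 14 patches per image

    def generate_mask(self, x):
        """Return a boolean (batch, num_patches) mask; True marks a masked patch."""
        num_masked = int(self.mask_ratio * self.num_patches)
        noise = torch.rand(x.size(0), self.num_patches, device=x.device)
        # Rank the noise per image and mask the lowest-ranked patches
        ranks = noise.argsort(dim=1).argsort(dim=1)
        return ranks < num_masked

    def forward(self, x, mask=None):
        # Generate a random mask if none is given
        if mask is None:
            mask = self.generate_mask(x)

        # Encode only the visible patches
        encoded = self.encoder(x, ~mask)  # ~mask = visible patches

        # Decode back to the original resolution
        reconstructed = self.decoder(encoded)

        return reconstructed, mask
```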

## Data Augmentation

**Data augmentation** artificially enlarges the training set:

### Basic Augmentations

- **Geometric**: Rotation, flipping, cropping, scaling
- **Photometric**: Color adjustments, brightness, contrast
- **Occlusion and mixing**: Cutout, mixup, CutMix

### Advanced Techniques

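```python
import numpy as np
import torch

def rand_bbox(size, lam):
    """Sample a box covering about (1 - lam) of the image.

    Helper assumed here; the original text calls it without defining it.
    """
    dim_a, dim_b = size[2], size[3]
    cut_ratio = np.sqrt(1.0 - lam)
    cut_a, cut_b = int(dim_a * cut_ratio), int(dim_b * cut_ratio)
    # Random box center; the box is clipped to the image borders
    ca, cb = np.random.randint(dim_a), np.random.randint(dim_b)
    bbx1, bbx2 = np.clip([ca - cut_a // 2, ca + cut_a // 2], 0, dim_a)
    bby1, bby2 = np.clip([cb - cut_b // 2, cb + cut_b // 2], 0, dim_b)
    return int(bbx1), int(bby1), int(bbx2), int(bby2)

# CutMix implementation
def cutmix_batch(images, labels, alpha=1.0):
    """
    CutMix: combines two images and their labels.
    `labels` must be one-hot or soft labels (float), not class indices.
    """
    batch_size = images.size(0)
    images = images.clone()  # do not modify the caller's batch

    # Sample the mixing ratio and pair every image with a random partner
    lam = np.random.beta(alpha, alpha)
    rand_index = torch.randperm(batch_size)

    bbx1, bby1, bbx2, bby2 = rand_bbox(images.size(), lam)

    # Paste the patch from the partner image
    images[:, :, bbx1:bbx2, bby1:bby2] = images[rand_index, :, bbx1:bbx2, bby1:bby2]

    # Recompute lam from the actual (clipped) box area, then mix the labels
    lam = 1 - ((bbx2 - bbx1) * (bby2 - bby1) / (images.size(-1) * images.size(-2)))
    mixed_labels = lam * labels + (1 - lam) * labels[rand_index]

    return images, mixed_labels
```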

## Learning Rate Scheduling

**Learning rate scheduling** adjusts the learning rate during training:

### Warmup and Decay

```python
import math

# Cosine annealing with linear warmup
def cosine_with_warmup(epoch, num_epochs, warmup_epochs=5, base_lr=0.001, min_lr=1e-6):
    if epoch < warmup_epochs:
        # Linear warmup from base_lr / warmup_epochs up to base_lr
        return base_lr * (epoch + 1) / warmup_epochs
    # Cosine annealing from base_lr down towards min_lr
    decay_epochs = max(1, num_epochs - warmup_epochs)  # avoid division by zero
    progress = (epoch - warmup_epochs) / decay_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

### Adaptive Scheduling

- **ReduceLROnPlateau**: Reduce the LR when the validation loss plateaus
- **OneCycleLR**: Ramp the LR up to a maximum and then anneal it, in a single cycle that spans the whole run
- **ExponentialLR**: Decay the LR exponentially, by a constant factor each epoch
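
As an illustration of the first scheduler in the list, the sketch below feeds a hand-written sequence of validation losses to `torch.optim.lr_scheduler.ReduceLROnPlateau`. The model, the loss values, and the `factor` and `patience` settings are assumptions chosen to keep the example short.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # placeholder model, only needed to build an optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Halve the LR once the monitored loss has not improved for more than 2 epochs
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode='min', factor=0.5, patience=2
)

# Hypothetical validation losses: one improvement, then a plateau
val_losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.8]
lrs = []
for val_loss in val_losses:
    # In a real loop, the training and validation epoch would run here
    scheduler.step(val_loss)
    lrs.append(optimizer.param_groups[0]['lr'])

print(lrs)
```

Unlike the epoch-based schedulers, this one takes the monitored metric as an argument to `step`, so it has to be called after validation.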

## Regularization Techniques

### Weight Decay

L2 regularization on the model parameters, with $\lambda$ controlling its strength:

$$\mathcal{L}_{total} = \mathcal{L}_{task} + \lambda \|\theta\|_2^2$$
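
A small NumPy sketch of this penalty and its effect on a gradient step. The function names are illustrative. Note the convention: the gradient of $\lambda \|\theta\|_2^2$ is $2\lambda\theta$, whereas many optimizers add $\lambda\theta$ directly, which corresponds to a penalty of $\frac{\lambda}{2}\|\theta\|_2^2$.

```python
import numpy as np

def total_loss(task_loss, params, weight_decay):
    """L_total = L_task + lambda * ||theta||_2^2, summed over all parameter arrays."""
    l2_penalty = sum(float(np.sum(p ** 2)) for p in params)
    return task_loss + weight_decay * l2_penalty

def sgd_step(param, task_grad, lr, weight_decay):
    """One SGD step on L_total; the penalty contributes 2 * lambda * theta."""
    return param - lr * (task_grad + 2.0 * weight_decay * param)

theta = np.array([3.0, -4.0])                        # ||theta||^2 = 25
loss = total_loss(0.5, [theta], weight_decay=0.01)   # 0.5 + 0.01 * 25

# With a zero task gradient, the step only shrinks the weights towards zero
shrunk = sgd_step(theta, np.zeros(2), lr=0.1, weight_decay=0.01)
print(loss, shrunk)
```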

### Dropout

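Randomly deactivate units during training. For convolutional feature maps, spatial dropout drops entire channels instead of individual activations:

```python
import torch.nn as nn

# Spatial dropout for vision: nn.Dropout2d zeroes entire feature-map channels
class SpatialDropout(nn.Module):
    def __init__(self, p=0.5):
        super().__init__()
        self.dropout = nn.Dropout2d(p)

    def forward(self, x):
        return self.dropout(x)
```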

### Batch Normalization

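Normalize the activations per mini-batch using the batch mean $\mu_B$ and variance $\sigma_B^2$, then scale and shift them with the learnable parameters $\gamma$ and $\beta$:

$$y_i = \gamma \cdot \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}} + \beta$$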
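
The normalization can be sketched in NumPy for image tensors of shape `(N, C, H, W)`, with statistics computed per channel. This covers the training-time forward pass only; at inference, frameworks use running averages of the statistics instead, which this sketch omits. The function name is illustrative.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization for x of shape (N, C, H, W)."""
    # Per-channel statistics over the batch and both spatial dimensions
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Learnable per-channel scale (gamma) and shift (beta)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

rng = np.random.default_rng(0)
x = 3.0 + 5.0 * rng.normal(size=(16, 4, 8, 8))   # far from zero mean, unit variance
y = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))

print(y.mean(axis=(0, 2, 3)))  # close to 0 for every channel
print(y.var(axis=(0, 2, 3)))   # close to 1 for every channel
```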

## Multi-Task Learning

Train a model on several tasks at the same time:

### Hard Parameter Sharing

Share one backbone across the tasks:

```python
import torch
import torch.nn as nn

class MultiTaskModel(nn.Module):
    def __init__(self, num_classes_task1, num_classes_task2):
        super().__init__()

        # Shared backbone
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1),
            nn.ReLU(),
            # ... more layers
            # The elided layers must end so that flattening yields 512 features,
            # because both heads below expect an input of size 512
        )

        # Task-specific heads
        self.classification_head = nn.Linear(512, num_classes_task1)
        self.detection_head = nn.Linear(512, num_classes_task2)

    def forward(self, x, task='classification'):
        features = self.backbone(x)
        features = torch.flatten(features, 1)

        if task == 'classification':
            return self.classification_head(features)
        elif task == 'detection':
            return self.detection_head(features)
        raise ValueError(f"Unknown task: {task!r}")
```

### Uncertainty Weighting

Balance the tasks by their uncertainty, where $\sigma_i$ is a learnable noise parameter for task $i$:

$$\mathcal{L}_{total} = \sum_{i=1}^T \left( \frac{1}{2\sigma_i^2} \mathcal{L}_i + \log \sigma_i \right)$$
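
A plain-Python sketch of this weighting, with the $\log \sigma_i$ term taken inside the sum. Parameterizing by $\log \sigma_i$ rather than $\sigma_i$ is an assumption made here to keep $\sigma_i$ positive. In a real model these values would be learnable parameters updated together with the network weights; here they are fixed numbers for illustration.

```python
import math

def uncertainty_weighted_loss(task_losses, log_sigmas):
    """Sum over tasks of L_i / (2 * sigma_i^2) + log(sigma_i)."""
    total = 0.0
    for loss, log_sigma in zip(task_losses, log_sigmas):
        sigma_squared = math.exp(2.0 * log_sigma)
        total += loss / (2.0 * sigma_squared) + log_sigma
    return total

# sigma = 1 for both tasks: each loss is simply halved
equal = uncertainty_weighted_loss([2.0, 0.5], [0.0, 0.0])

# Raising sigma for the first task down-weights it, at the cost of log(sigma)
reweighted = uncertainty_weighted_loss([2.0, 0.5], [math.log(2.0), 0.0])

print(equal, reweighted)
```

The $\log \sigma_i$ term keeps the model from driving every $\sigma_i$ to infinity to zero out all task losses.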

## Domain Adaptation

Adapt a model trained on one domain (the source) so that it works on another (the target):

### Unsupervised Domain Adaptation

Adapt to the target domain without target labels:

- **Adversarial adaptation**: Train a discriminator to tell the domains apart, and train the feature extractor to fool it
- **Maximum mean discrepancy**: Minimize the difference between the source and target feature distributions
- **Self-training**: Use confident predictions as pseudo-labels
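
To make the maximum mean discrepancy (MMD) item concrete, here is a NumPy sketch of a biased estimate of the squared MMD with an RBF kernel. The kernel choice, the bandwidth, and the synthetic Gaussian "features" are assumptions for illustration; in adaptation methods this quantity would be computed on network features and added to the training loss.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """Pairwise RBF kernel values between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a ** 2, axis=1)[:, None]
        + np.sum(b ** 2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point rounding
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))

def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two feature sets."""
    k_ss = rbf_kernel(source, source, bandwidth).mean()
    k_tt = rbf_kernel(target, target, bandwidth).mean()
    k_st = rbf_kernel(source, target, bandwidth).mean()
    return float(k_ss + k_tt - 2.0 * k_st)

rng = np.random.default_rng(0)
source_features = rng.normal(0.0, 1.0, size=(200, 4))
same_domain = rng.normal(0.0, 1.0, size=(200, 4))
shifted_domain = rng.normal(2.0, 1.0, size=(200, 4))

mmd_same = mmd_squared(source_features, same_domain)
mmd_shifted = mmd_squared(source_features, shifted_domain)
print(mmd_same, mmd_shifted)
```

The estimate is small when both sets come from the same distribution and grows when the target features are shifted.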

### Few-Shot Learning

Learn new concepts from only a few examples:

- **Prototypical networks**: Compare queries with class prototypes
- **Relation networks**: Learn a similarity metric
- **Meta-learning**: Learn to learn quickly
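
The prototype idea can be sketched in NumPy: each class prototype is the mean embedding of that class's support examples, and a query is assigned to the nearest prototype. The two-dimensional "embeddings" below are made-up values; in a prototypical network they would come from a trained encoder, and squared Euclidean distance is one common choice of metric.

```python
import numpy as np

def class_prototypes(support_embeddings, support_labels):
    """Return the class ids and the mean support embedding per class."""
    classes = np.unique(support_labels)
    prototypes = np.stack([
        support_embeddings[support_labels == c].mean(axis=0) for c in classes
    ])
    return classes, prototypes

def classify_queries(query_embeddings, classes, prototypes):
    """Assign each query to the class with the nearest prototype."""
    diffs = query_embeddings[:, None, :] - prototypes[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)        # (num_queries, num_classes)
    return classes[sq_dists.argmin(axis=1)]

# Two classes with two support examples each (a 2-way, 2-shot episode)
support = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]])
support_labels = np.array([0, 0, 1, 1])
queries = np.array([[1.0, 1.0], [5.0, 5.0]])

classes, prototypes = class_prototypes(support, support_labels)
predictions = classify_queries(queries, classes, prototypes)
print(prototypes)    # class means: [0, 1] and [5, 4]
print(predictions)   # [0 1]
```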

## Training Stability

### Gradient Clipping

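Limit the gradient magnitude to prevent exploding gradients:

```python
# Gradient clipping: rescale all gradients so that their global norm is at most max_norm.
# Call this after loss.backward() and before optimizer.step(); `model` is assumed to exist.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```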

### Mixed Precision Training

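Use float16 for faster, more memory-efficient training:

```python
# Automatic mixed precision
# `model`, `criterion`, `optimizer` and `dataloader` are assumed to be defined.
# Newer PyTorch releases expose the same classes under torch.amp.
from torch.cuda.amp import GradScaler, autocast

scaler = GradScaler()

for inputs, labels in dataloader:
    optimizer.zero_grad()

    # Run the forward pass and the loss in mixed precision
    with autocast():
        outputs = model(inputs)
        loss = criterion(outputs, labels)

    # Scale the loss so that small float16 gradients do not underflow
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```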

## Evaluation During Training

### Validation Strategies

- **Hold-out validation**: Reserve part of the data for validation
- **K-fold cross-validation**: Train K models, each validated on a different fold
- **Stratified validation**: Preserve the class distribution in every split
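
A stratified hold-out split can be sketched with the standard library alone. The function name, the 20% validation fraction, and the rule of keeping at least one validation sample per class are assumptions for this example; libraries such as scikit-learn offer tested equivalents.

```python
import random
from collections import defaultdict

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split sample indices so each class keeps about the same share in both sets."""
    rng = random.Random(seed)

    # Group the sample indices by class label
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Assumption: every class contributes at least one validation sample
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])

    return sorted(train_idx), sorted(val_idx)

# Imbalanced toy dataset: 80 cats and 20 dogs
labels = ['cat'] * 80 + ['dog'] * 20
train_idx, val_idx = stratified_split(labels)

val_dogs = sum(labels[i] == 'dog' for i in val_idx)
print(len(train_idx), len(val_idx), val_dogs)  # 80 20 4
```

The validation set keeps the 80/20 class ratio of the full dataset, which a purely random split of this size would not guarantee.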

### Early Stopping

Stop training when the validation loss has not improved for `patience` consecutive epochs:

```python
class EarlyStopping:
    def __init__(self, patience=10, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float('inf')
        self.counter = 0
    
    def __call__(self, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.counter = 0
        else:
            self.counter += 1
        
        return self.counter >= self.patience
```

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import matplotlib.pyplot as plt
import numpy as np

def level_index(levels, text):
    """Return the index of the level that `text` starts with (case-insensitive)."""
    text = text.lower()
    for i, level in enumerate(levels):
        if text.startswith(level.lower()):
            return i
    raise ValueError(f"No level in {levels} matches {text!r}")

# Compare different learning approaches
def compare_learning_approaches():
    """Compare different learning approaches"""

    approaches = {
        'Supervised Learning': {
            'data_needed': 'Large labeled dataset',
            'annotation_cost': 'High',
            'performance': 'High',
            'applications': ['Medical imaging', 'Autonomous driving'],
            'challenges': ['Expensive annotation', 'Limited to labeled data']
        },
        'Transfer Learning': {
            'data_needed': 'Small labeled dataset',
            'annotation_cost': 'Medium',
            'performance': 'High',
            'applications': ['Custom object detection', 'Domain-specific tasks'],
            'challenges': ['Requires pre-trained models', 'Domain gap']
        },
        'Self-Supervised Learning': {
            'data_needed': 'Large unlabeled dataset',
            'annotation_cost': 'Low',
            'performance': 'Medium-High',
            'applications': ['Representation learning', 'Pre-training'],
            'challenges': ['Complex training', 'Task-specific fine-tuning needed']
        },
        'Few-Shot Learning': {
            'data_needed': 'Very small labeled dataset',
            'annotation_cost': 'Very Low',
            'performance': 'Medium',
            'applications': ['Novel class detection', 'Rapid prototyping'],
            'challenges': ['Lower performance', 'Complex meta-learning']
        }
    }

    # Visualization
    fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 12))

    # Data needed vs annotation cost
    data_levels = ['Very small', 'Small', 'Large', 'Very large']
    cost_levels = ['Very Low', 'Low', 'Medium', 'High']

    approach_names = list(approaches.keys())
    # Match on the leading words: taking only the first word would turn
    # 'Very small ...' and 'Very Low' into 'Very', which is not a valid level
    data_indices = [level_index(data_levels, approaches[app]['data_needed']) for app in approach_names]
    cost_indices = [level_index(cost_levels, approaches[app]['annotation_cost']) for app in approach_names]

    ax1.scatter(data_indices, cost_indices, s=200, alpha=0.7)
    for i, name in enumerate(approach_names):
        ax1.annotate(name, (data_indices[i], cost_indices[i]),
                     xytext=(5, 5), textcoords='offset points')
    ax1.set_xlabel('Data Needed')
    ax1.set_ylabel('Annotation Cost')
    ax1.set_title('Data vs Annotation Cost Trade-off')
    ax1.set_xticks(range(len(data_levels)))
    ax1.set_xticklabels(data_levels)
    ax1.set_yticks(range(len(cost_levels)))
    ax1.set_yticklabels(cost_levels)
    ax1.grid(True, alpha=0.3)

    # Performance comparison ('Medium-High' is plotted at its lower level, 'Medium')
    perf_levels = ['Low', 'Medium', 'High', 'Very High']
    perf_indices = [level_index(perf_levels, approaches[app]['performance']) for app in approach_names]

    bars = ax2.bar(range(len(approach_names)), perf_indices, alpha=0.7)
    ax2.set_xlabel('Approach')
    ax2.set_ylabel('Performance Level')
    ax2.set_title('Performance Comparison')
    ax2.set_xticks(range(len(approach_names)))
    ax2.set_xticklabels(approach_names, rotation=45)
    ax2.set_yticks(range(len(perf_levels)))
    ax2.set_yticklabels(perf_levels)

    # Applications, rendered as a text list
    ax3.axis('off')
    ax3.set_title('Applications by Approach')

    y_pos = 0.9
    for approach, info in approaches.items():
        ax3.text(0.1, y_pos, f"{approach}:", fontweight='bold', fontsize=10)
        for app in info['applications']:
            y_pos -= 0.08
            ax3.text(0.15, y_pos, f"• {app}", fontsize=9)
        y_pos -= 0.05

    # Challenges, rendered as a text list
    ax4.axis('off')
    ax4.set_title('Challenges by Approach')

    y_pos = 0.9
    for approach, info in approaches.items():
        ax4.text(0.1, y_pos, f"{approach}:", fontweight='bold', fontsize=10)
        for challenge in info['challenges']:
            y_pos -= 0.08
            ax4.text(0.15, y_pos, f"• {challenge}", fontsize=9)
        y_pos -= 0.05

    plt.tight_layout()
    plt.show()

    return approaches

# Compare the learning approaches
approaches = compare_learning_approaches()

print("\nDetailed Comparison of Learning Approaches:")
for approach, info in approaches.items():
    print(f"\n{approach}:")
    print(f"  Data needed: {info['data_needed']}")
    print(f"  Annotation cost: {info['annotation_cost']}")
    print(f"  Performance: {info['performance']}")
    print(f"  Applications: {', '.join(info['applications'])}")
    print(f"  Challenges: {', '.join(info['challenges'])}")
```

## Practical Training Tips

Final tips for effective model training:

```python
import matplotlib.pyplot as plt
import numpy as np

# Handy utilities for computer vision training

class VisionTrainingUtils:
    """Utility class for computer vision training"""

    @staticmethod
    def calculate_iou(box1, box2):
        """Compute Intersection over Union"""
        # box format: [x1, y1, x2, y2]
        x1_inter = max(box1[0], box2[0])
        y1_inter = max(box1[1], box2[1])
        x2_inter = min(box1[2], box2[2])
        y2_inter = min(box1[3], box2[3])

        # No overlap
        if x2_inter <= x1_inter or y2_inter <= y1_inter:
            return 0.0

        # Overlap area
        inter_area = (x2_inter - x1_inter) * (y2_inter - y1_inter)

        # Union area
        box1_area = (box1[2] - box1[0]) * (box1[3] - box1[1])
        box2_area = (box2[2] - box2[0]) * (box2[3] - box2[1])
        union_area = box1_area + box2_area - inter_area

        return inter_area / union_area

    @staticmethod
    def apply_augmentations(image, augmentations):
        """Apply one augmentation, or a list of augmentations in order, to an image"""
        if isinstance(augmentations, list):
            for aug in augmentations:
                image = aug(image)
        else:
            image = augmentations(image)
        return image

    @staticmethod
    def visualize_training_curves(train_losses, val_losses, train_metrics=None, val_metrics=None):
        """Visualize training curves"""
        fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 10))

        epochs = range(1, len(train_losses) + 1)

        # Loss curves
        ax1.plot(epochs, train_losses, 'b-', label='Training Loss')
        ax1.plot(epochs, val_losses, 'r-', label='Validation Loss')
        ax1.set_xlabel('Epoch')
        ax1.set_ylabel('Loss')
        ax1.set_title('Training and Validation Loss')
        ax1.legend()
        ax1.grid(True, alpha=0.3)

        # Metric curves (if provided); otherwise hide the empty panel
        if train_metrics and val_metrics:
            ax2.plot(epochs, train_metrics, 'b-', label='Training Metric')
            ax2.plot(epochs, val_metrics, 'r-', label='Validation Metric')
            ax2.set_xlabel('Epoch')
            ax2.set_ylabel('Metric')
            ax2.set_title('Training and Validation Metrics')
            ax2.legend()
            ax2.grid(True, alpha=0.3)
        else:
            ax2.axis('off')

        # Learning rate (placeholder: a fixed exponential decay, not a recorded schedule)
        ax3.plot(epochs, [0.001 * 0.95**e for e in epochs], 'g-', label='Learning Rate')
        ax3.set_xlabel('Epoch')
        ax3.set_ylabel('Learning Rate')
        ax3.set_title('Learning Rate Schedule')
        ax3.legend()
        ax3.grid(True, alpha=0.3)
        ax3.set_yscale('log')

        # Text summary
        ax4.axis('off')
        ax4.text(0.1, 0.8, 'Training Summary:', fontweight='bold', fontsize=12)
        ax4.text(0.1, 0.6, f'Final training loss: {train_losses[-1]:.4f}', fontsize=10)
        ax4.text(0.1, 0.4, f'Final validation loss: {val_losses[-1]:.4f}', fontsize=10)
        ax4.text(0.1, 0.2, f'Best validation loss: {min(val_losses):.4f}', fontsize=10)

        plt.tight_layout()
        plt.show()

# Example usage
utils = VisionTrainingUtils()

# Test the IoU computation
box1 = [0.1, 0.1, 0.5, 0.5]  # Ground truth
box2 = [0.15, 0.15, 0.55, 0.55]  # Prediction
iou = utils.calculate_iou(box1, box2)
print(f"IoU between boxes: {iou:.3f}")

# Simulate training curves (decaying losses with random noise)
epochs = 20
train_losses = [1.0 * 0.9**i + 0.1*np.random.random() for i in range(epochs)]
val_losses = [1.2 * 0.85**i + 0.15*np.random.random() for i in range(epochs)]

utils.visualize_training_curves(train_losses, val_losses)

print("\nTraining utilities available for:")
print("- IoU computation for object detection")
print("- Data augmentation pipelines")
print("- Training curve visualization")
print("- Model performance monitoring")
```