# Lesson 12: Capstone Project - From Mathematics to AI

**Mathematical Foundations - IT & Artificial Intelligence**

---

## 12.0 Welcome to the Finale!

Congratulations! You have completed an impressive journey through the mathematics of deep learning. In this final lesson we bring everything together in a **capstone project** in which you design, implement, and evaluate a complete machine learning system yourself.

### What you have learned

| Part | Topic | Application in Neural Networks |
|------|-------|--------------------------------|
| 1 | Linear Algebra | Data representation, forward pass |
| 2 | Calculus | Optimization, backpropagation |
| 3 | Statistics | Loss functions, output interpretation |
| 4 | Integration | Building complete systems |

## 12.1 Learning Objectives

After this lesson you can:
- Carry out a machine learning project independently
- Choose the right mathematical tools for a problem
- Design, train, and evaluate a model
- Interpret and present your results

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from abc import ABC, abstractmethod

np.set_printoptions(precision=4, suppress=True)
np.random.seed(42)

print("Libraries loaded!")
print("Ready for the capstone project!")

## 12.2 The Complete Neural Network Library

Here is our complete library from Lesson 11, ready to use:

In [None]:
# ===== BASE CLASSES =====
class Layer(ABC):
    def __init__(self):
        self.params = {}
        self.grads = {}
        self.training = True
    @abstractmethod
    def forward(self, x): pass
    @abstractmethod
    def backward(self, dout): pass
    def __call__(self, x): return self.forward(x)
    def train(self): self.training = True
    def eval(self): self.training = False

class Loss(ABC):
    @abstractmethod
    def forward(self, y_pred, y_true): pass
    @abstractmethod
    def backward(self): pass
    def __call__(self, y_pred, y_true): return self.forward(y_pred, y_true)

# ===== LAYERS =====
class Linear(Layer):
    def __init__(self, in_features, out_features):
        super().__init__()
        std = np.sqrt(2.0 / in_features)
        self.params['W'] = np.random.randn(in_features, out_features) * std
        self.params['b'] = np.zeros(out_features)
    
    def forward(self, x):
        self.x = x
        return x @ self.params['W'] + self.params['b']
    
    def backward(self, dout):
        # The loss functions already divide by the batch size,
        # so the parameter gradients are summed over the batch here
        self.grads['W'] = self.x.T @ dout
        self.grads['b'] = np.sum(dout, axis=0)
        return dout @ self.params['W'].T

class ReLU(Layer):
    def forward(self, x):
        self.mask = (x > 0)
        return np.maximum(0, x)
    def backward(self, dout):
        return dout * self.mask

class Sigmoid(Layer):
    def forward(self, x):
        self.out = 1 / (1 + np.exp(-np.clip(x, -500, 500)))
        return self.out
    def backward(self, dout):
        return dout * self.out * (1 - self.out)

class Tanh(Layer):
    def forward(self, x):
        self.out = np.tanh(x)
        return self.out
    def backward(self, dout):
        return dout * (1 - self.out ** 2)

class Dropout(Layer):
    def __init__(self, p=0.5):
        super().__init__()
        self.p = p
    def forward(self, x):
        if self.training:
            self.mask = (np.random.random(x.shape) > self.p) / (1 - self.p)
            return x * self.mask
        return x
    def backward(self, dout):
        return dout * self.mask if self.training else dout

class BatchNorm(Layer):
    def __init__(self, n_features, momentum=0.9, epsilon=1e-5):
        super().__init__()
        self.momentum = momentum
        self.epsilon = epsilon
        self.params['gamma'] = np.ones(n_features)
        self.params['beta'] = np.zeros(n_features)
        self.running_mean = np.zeros(n_features)
        self.running_var = np.ones(n_features)
    
    def forward(self, x):
        if self.training:
            self.mu = np.mean(x, axis=0)
            self.var = np.var(x, axis=0)
            self.running_mean = self.momentum * self.running_mean + (1 - self.momentum) * self.mu
            self.running_var = self.momentum * self.running_var + (1 - self.momentum) * self.var
        else:
            self.mu = self.running_mean
            self.var = self.running_var
        self.x_centered = x - self.mu
        self.std = np.sqrt(self.var + self.epsilon)
        self.x_norm = self.x_centered / self.std
        return self.params['gamma'] * self.x_norm + self.params['beta']
    
    def backward(self, dout):
        n = dout.shape[0]
        self.grads['gamma'] = np.sum(dout * self.x_norm, axis=0)
        self.grads['beta'] = np.sum(dout, axis=0)
        dx_norm = dout * self.params['gamma']
        dvar = np.sum(dx_norm * self.x_centered * -0.5 * (self.var + self.epsilon)**(-1.5), axis=0)
        dmu = np.sum(dx_norm * -1 / self.std, axis=0) + dvar * np.mean(-2 * self.x_centered, axis=0)
        return dx_norm / self.std + dvar * 2 * self.x_centered / n + dmu / n

# ===== LOSS FUNCTIONS =====
class MSELoss(Loss):
    def forward(self, y_pred, y_true):
        self.y_pred, self.y_true = y_pred, y_true
        return np.mean((y_pred - y_true) ** 2)
    def backward(self):
        # forward() averages over every element, so divide by the total element count
        return 2 * (self.y_pred - self.y_true) / self.y_pred.size

class CrossEntropyLoss(Loss):
    def forward(self, logits, y_true):
        self.y_true = y_true
        n = logits.shape[0]
        exp_logits = np.exp(logits - np.max(logits, axis=1, keepdims=True))
        self.probs = exp_logits / np.sum(exp_logits, axis=1, keepdims=True)
        return -np.mean(np.log(self.probs[np.arange(n), y_true] + 1e-10))
    def backward(self):
        n = self.probs.shape[0]
        grad = self.probs.copy()
        grad[np.arange(n), self.y_true] -= 1
        return grad / n

class BinaryCrossEntropyLoss(Loss):
    def forward(self, y_pred, y_true):
        self.y_pred = np.clip(y_pred, 1e-10, 1-1e-10)
        self.y_true = y_true
        return -np.mean(y_true * np.log(self.y_pred) + (1-y_true) * np.log(1-self.y_pred))
    def backward(self):
        # forward() averages over every element, so divide by the total element count
        return (self.y_pred - self.y_true) / (self.y_pred * (1 - self.y_pred) * self.y_true.size)

# ===== OPTIMIZERS =====
class SGD:
    def __init__(self, layers, lr=0.01, momentum=0, weight_decay=0):
        self.layers = layers
        self.lr, self.momentum, self.weight_decay = lr, momentum, weight_decay
        self.velocity = {}
    
    def step(self):
        for i, layer in enumerate(self.layers):
            for name, param in layer.params.items():
                if name not in layer.grads: continue
                key = (i, name)
                grad = layer.grads[name] + self.weight_decay * param if self.weight_decay else layer.grads[name]
                if self.momentum:
                    if key not in self.velocity: self.velocity[key] = np.zeros_like(param)
                    self.velocity[key] = self.momentum * self.velocity[key] - self.lr * grad
                    layer.params[name] += self.velocity[key]
                else:
                    layer.params[name] -= self.lr * grad
    
    def zero_grad(self):
        for layer in self.layers: layer.grads = {}

class Adam:
    def __init__(self, layers, lr=0.001, beta1=0.9, beta2=0.999, epsilon=1e-8):
        self.layers, self.lr = layers, lr
        self.beta1, self.beta2, self.epsilon = beta1, beta2, epsilon
        self.m, self.v, self.t = {}, {}, 0
    
    def step(self):
        self.t += 1
        for i, layer in enumerate(self.layers):
            for name, param in layer.params.items():
                if name not in layer.grads: continue
                key = (i, name)
                grad = layer.grads[name]
                if key not in self.m:
                    self.m[key], self.v[key] = np.zeros_like(param), np.zeros_like(param)
                self.m[key] = self.beta1 * self.m[key] + (1 - self.beta1) * grad
                self.v[key] = self.beta2 * self.v[key] + (1 - self.beta2) * grad**2
                m_hat = self.m[key] / (1 - self.beta1**self.t)
                v_hat = self.v[key] / (1 - self.beta2**self.t)
                layer.params[name] -= self.lr * m_hat / (np.sqrt(v_hat) + self.epsilon)
    
    def zero_grad(self):
        for layer in self.layers: layer.grads = {}

# ===== SEQUENTIAL MODEL =====
class Sequential:
    def __init__(self, layers):
        self.layers = layers
    def forward(self, x):
        for layer in self.layers: x = layer.forward(x)
        return x
    def backward(self, dout):
        for layer in reversed(self.layers): dout = layer.backward(dout)
        return dout
    def __call__(self, x): return self.forward(x)
    def train(self):
        for layer in self.layers: layer.training = True
    def eval(self):
        for layer in self.layers: layer.training = False
    def parameters(self):
        return [l for l in self.layers if l.params]

print("Neural Network Library loaded!")

## 12.3 Capstone Project: Fashion-MNIST Classifier

We build a classifier for Fashion-MNIST, a dataset with 10 categories of clothing.

This project demonstrates all the concepts from the course:
- **Data preprocessing**: normalization (Lesson 9)
- **Model design**: choosing layers (Lessons 2-4, 11)
- **Training**: loss, optimizer, backprop (Lessons 5-7, 10)
- **Evaluation**: accuracy, confusion matrix (Lesson 8)

In [None]:
# Load Fashion-MNIST
from sklearn.datasets import fetch_openml

print("Loading Fashion-MNIST...")
fashion = fetch_openml('Fashion-MNIST', version=1, as_frame=False, parser='auto')
# Scale pixel values from [0, 255] to [0, 1] and convert the string labels to integers
X, y = fashion.data / 255.0, fashion.target.astype(int)

# Split: the first 60,000 rows for training, the remaining 10,000 for testing
X_train, X_test = X[:60000], X[60000:]
y_train, y_test = y[:60000], y[60000:]

# Labels (index = class id)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

print(f"Training: {X_train.shape}")
print(f"Test: {X_test.shape}")
print(f"Classes: {class_names}")

In [None]:
# Visualize examples
fig, axes = plt.subplots(2, 5, figsize=(14, 6))
for i, ax in enumerate(axes.flatten()):
    idx = np.random.randint(len(X_train))
    ax.imshow(X_train[idx].reshape(28, 28), cmap='gray')
    ax.set_title(class_names[y_train[idx]])
    ax.axis('off')
plt.suptitle('Fashion-MNIST Examples', fontsize=14)
plt.tight_layout()
plt.show()

### Step 1: Designing the Model

We design a network with:
- Input: 784 (28×28 pixels)
- Hidden layers with BatchNorm and Dropout
- Output: 10 classes
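
To get a feel for the size of such a design, the number of trainable parameters can be worked out before any training. The sketch below assumes hidden widths of 512, 256, and 128 (the values used for this lesson's model) and a BatchNorm layer after every hidden Linear layer. The helper `count_parameters` is illustrative and not part of the library.

```python
# Illustrative helper (not part of the lesson's library): count the trainable
# parameters of a fully connected network with BatchNorm after each hidden layer.
def count_parameters(layer_sizes):
    linear = 0
    batchnorm = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        linear += n_in * n_out + n_out   # weight matrix W plus bias vector b
    for n_hidden in layer_sizes[1:-1]:
        batchnorm += 2 * n_hidden        # gamma and beta, one pair per feature
    return linear, batchnorm

layer_sizes = [784, 512, 256, 128, 10]   # assumed widths: input, 3 hidden, output
linear, batchnorm = count_parameters(layer_sizes)
print(f"Linear parameters:    {linear:,}")
print(f"BatchNorm parameters: {batchnorm:,}")
print(f"Total:                {linear + batchnorm:,}")
```

Most of the parameters sit in the first Linear layer, because it connects all 784 input pixels to the widest hidden layer.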

In [None]:
# Model architecture
model = Sequential([
    # Layer 1: 784 → 512
    Linear(784, 512),
    BatchNorm(512),
    ReLU(),
    Dropout(0.3),
    
    # Layer 2: 512 → 256
    Linear(512, 256),
    BatchNorm(256),
    ReLU(),
    Dropout(0.3),
    
    # Layer 3: 256 → 128
    Linear(256, 128),
    BatchNorm(128),
    ReLU(),
    Dropout(0.2),
    
    # Output: 128 → 10
    Linear(128, 10)
])

print("Model: 784 → 512 → 256 → 128 → 10")
print(f"Number of layers: {len(model.layers)}")

### Step 2: Training Setup

In [None]:
# Hyperparameters
batch_size = 128
n_epochs = 15
learning_rate = 0.001

# Loss and optimizer
criterion = CrossEntropyLoss()
optimizer = Adam(model.parameters(), lr=learning_rate)

# Helper functions
def compute_accuracy(model, X, y, batch_size=1000):
    model.eval()
    correct = 0
    for i in range(0, len(X), batch_size):
        X_batch = X[i:i+batch_size]
        y_batch = y[i:i+batch_size]
        logits = model(X_batch)
        preds = np.argmax(logits, axis=1)
        correct += np.sum(preds == y_batch)
    model.train()
    return correct / len(X)

print(f"Batch size: {batch_size}")
print(f"Epochs: {n_epochs}")
print(f"Learning rate: {learning_rate}")

### Step 3: Training Loop

In [None]:
# Training
n_batches = len(X_train) // batch_size
train_losses = []
train_accs = []
test_accs = []

print("Starting training...\n")

for epoch in range(n_epochs):
    model.train()
    
    # Shuffle
    idx = np.random.permutation(len(X_train))
    X_shuffled = X_train[idx]
    y_shuffled = y_train[idx]
    
    epoch_loss = 0
    
    for batch in range(n_batches):
        start = batch * batch_size
        X_batch = X_shuffled[start:start+batch_size]
        y_batch = y_shuffled[start:start+batch_size]
        
        # Forward
        logits = model(X_batch)
        loss = criterion(logits, y_batch)
        epoch_loss += loss
        
        # Backward
        optimizer.zero_grad()
        model.backward(criterion.backward())
        optimizer.step()
    
    # Evaluate
    avg_loss = epoch_loss / n_batches
    train_acc = compute_accuracy(model, X_train[:5000], y_train[:5000])
    test_acc = compute_accuracy(model, X_test, y_test)
    
    train_losses.append(avg_loss)
    train_accs.append(train_acc)
    test_accs.append(test_acc)
    
    print(f"Epoch {epoch+1:2d}: Loss={avg_loss:.4f}, Train={train_acc:.4f}, Test={test_acc:.4f}")

print("\n✓ Training complete!")
print(f"✓ Best test accuracy: {max(test_accs)*100:.2f}%")

### Step 4: Visualizing the Results

In [None]:
# Learning curves
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

axes[0].plot(train_losses, 'b-', linewidth=2)
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Loss')
axes[0].set_title('Training Loss')
axes[0].grid(True, alpha=0.3)

axes[1].plot(train_accs, 'b-', linewidth=2, label='Train')
axes[1].plot(test_accs, 'r-', linewidth=2, label='Test')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Accuracy')
axes[1].set_title('Accuracy')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

In [None]:
# Confusion matrix
model.eval()
all_preds = []
for i in range(0, len(X_test), 1000):
    logits = model(X_test[i:i+1000])
    all_preds.extend(np.argmax(logits, axis=1))
all_preds = np.array(all_preds)

# Compute the confusion matrix (rows = true class, columns = predicted class)
conf_matrix = np.zeros((10, 10), dtype=int)
for true, pred in zip(y_test, all_preds):
    conf_matrix[true, pred] += 1

# Plot
plt.figure(figsize=(12, 10))
plt.imshow(conf_matrix, cmap='Blues')
plt.colorbar()

for i in range(10):
    for j in range(10):
        plt.text(j, i, conf_matrix[i, j], ha='center', va='center', fontsize=10)

plt.xticks(range(10), class_names, rotation=45, ha='right')
plt.yticks(range(10), class_names)
plt.xlabel('Predicted')
plt.ylabel('Actual')
plt.title('Confusion Matrix')
plt.tight_layout()
plt.show()

# Per-class accuracy
print("\nAccuracy per class:")
for i, name in enumerate(class_names):
    class_acc = conf_matrix[i, i] / np.sum(conf_matrix[i, :])
    print(f"  {name:12s}: {class_acc*100:.1f}%")

In [None]:
# Visualize predictions
model.eval()

fig, axes = plt.subplots(3, 5, figsize=(14, 9))
indices = np.random.choice(len(X_test), 15, replace=False)

for ax, idx in zip(axes.flatten(), indices):
    img = X_test[idx].reshape(28, 28)
    logits = model(X_test[idx:idx+1])
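    # Softmax, shifted by the maximum logit so np.exp cannot overflow
    exp_logits = np.exp(logits - np.max(logits))
    probs = exp_logits / np.sum(exp_logits)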
    pred = np.argmax(logits)
    true = y_test[idx]
    
    ax.imshow(img, cmap='gray')
    color = 'green' if pred == true else 'red'
    ax.set_title(f'Pred: {class_names[pred]}\nTrue: {class_names[true]}\nConf: {probs[0, pred]*100:.1f}%', 
                 color=color, fontsize=10)
    ax.axis('off')

plt.suptitle('Model Predictions (green=correct, red=wrong)', fontsize=14)
plt.tight_layout()
plt.show()

## 12.4 Reflection: The Mathematics Behind the Success

Let us look back at the mathematics we used:

### Linear Algebra (Lessons 1-4)
- **Matrix multiplication**: X @ W in every Linear layer
- **Vector addition**: + b for the bias
- **Batch processing**: multiple samples at once
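
The three points above can be seen in a few lines of NumPy. This is a minimal sketch with made-up random data. The batch size of 4 and the names `Z` and `Z_loop` are illustrative assumptions, not values from the project.

```python
import numpy as np

rng = np.random.default_rng(0)

batch_size, n_in, n_out = 4, 784, 10
X = rng.random((batch_size, n_in))                 # one flattened image per row
W = rng.standard_normal((n_in, n_out)) * np.sqrt(2.0 / n_in)
b = np.zeros(n_out)

# Whole batch at once: (4, 784) @ (784, 10) -> (4, 10); b is broadcast over the rows
Z = X @ W + b

# The same computation, one sample at a time
Z_loop = np.stack([x @ W + b for x in X])

print(Z.shape)
print(np.allclose(Z, Z_loop))
```

Processing the batch as one matrix product gives the same numbers as the per-sample loop, but lets NumPy do the work in a single optimized call.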

### Calculus (Lessons 5-7)
- **Derivatives**: every layer has a backward() method
- **Chain rule**: backpropagation through all layers
- **Gradient descent**: optimizer.step() updates the parameters
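
A common way to convince yourself that a chain-rule derivation is right is a gradient check: compare the analytic gradient with a finite-difference estimate. The sketch below does this for a tiny two-layer network with a tanh activation and a mean squared error loss. The network, the data, and all function names are illustrative assumptions and are independent of the library above.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 3))      # 5 samples, 3 features
t = rng.standard_normal((5, 2))      # regression targets
W1 = rng.standard_normal((3, 4))
W2 = rng.standard_normal((4, 2))

def loss_fn(W1, W2):
    h = np.tanh(x @ W1)
    y = h @ W2
    return np.mean((y - t) ** 2)

def analytic_grad_W1(W1, W2):
    # Forward pass, keeping the intermediate values
    h = np.tanh(x @ W1)
    y = h @ W2
    # Backward pass: apply the chain rule layer by layer
    dy = 2 * (y - t) / y.size        # dL/dy for the mean squared error
    dh = dy @ W2.T                   # back through the second linear map
    da = dh * (1 - h ** 2)           # back through tanh
    return x.T @ da                  # back through the first linear map

def numeric_grad_W1(W1, W2, eps=1e-6):
    # Central differences, one weight at a time
    grad = np.zeros_like(W1)
    for idx in np.ndindex(*W1.shape):
        W_plus, W_minus = W1.copy(), W1.copy()
        W_plus[idx] += eps
        W_minus[idx] -= eps
        grad[idx] = (loss_fn(W_plus, W2) - loss_fn(W_minus, W2)) / (2 * eps)
    return grad

g_analytic = analytic_grad_W1(W1, W2)
g_numeric = numeric_grad_W1(W1, W2)
max_diff = np.max(np.abs(g_analytic - g_numeric))
print(f"Max difference: {max_diff:.2e}")
```

If the two gradients disagree by more than a small tolerance, the backward pass most likely contains a mistake. The same check can be applied to any layer's backward() method.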

### Statistics (Lessons 8-10)
- **Softmax**: converts logits into probabilities
- **Cross-entropy**: measures the difference between the predicted and the true distribution
- **Batch normalization**: normalizes each feature to μ=0, σ=1 over the batch
- **Maximum Likelihood**: cross-entropy = negative log-likelihood
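
These statements can be checked numerically. The following sketch uses random logits and labels (illustrative data, not Fashion-MNIST) to show that softmax rows sum to 1, that cross-entropy against one-hot targets equals the negative log-likelihood of the observed labels, and that the normalization step of batch normalization (without the learnable gamma and beta) yields a mean near 0 and a standard deviation near 1 per feature.

```python
import numpy as np

rng = np.random.default_rng(0)
logits = rng.standard_normal((6, 10)) * 3     # 6 samples, 10 classes
y_true = rng.integers(0, 10, size=6)

# Softmax, shifted by the row maximum for numerical stability
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

# Cross-entropy against one-hot targets
one_hot = np.eye(10)[y_true]
cross_entropy = -np.mean(np.sum(one_hot * np.log(probs), axis=1))

# Negative log-likelihood of the observed labels
nll = -np.mean(np.log(probs[np.arange(6), y_true]))

# Normalization step of batch normalization (no gamma/beta)
x = rng.normal(loc=5.0, scale=3.0, size=(256, 4))
x_norm = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + 1e-5)

print(np.allclose(probs.sum(axis=1), 1.0))
print(np.isclose(cross_entropy, nll))
print(x_norm.mean(axis=0).round(4), x_norm.std(axis=0).round(4))
```

Because the one-hot vector selects exactly one log-probability per sample, minimizing cross-entropy is the same as maximizing the likelihood of the training labels.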

## 12.5 Conclusion

### What you have achieved

You have:
1. ✅ Learned the mathematical basis of deep learning
2. ✅ Built a complete neural network library from scratch
3. ✅ Trained a classifier that reaches roughly 88% accuracy on Fashion-MNIST
4. ✅ Understood why each component works

### Next steps

With this foundation you can:
- Understand frameworks such as PyTorch and TensorFlow at a deeper level
- Implement new architectures
- Read and understand papers
- Debug problems with mathematical insight

### Closing

**Congratulations on completing Mathematical Foundations!** 🎉

You now have the mathematical foundations to go further in AI and Machine Learning.

---

**Mathematical Foundations** | Lesson 12 of 12 | IT & Artificial Intelligence

**Course Complete!** 🎓

---