# 01 - Training & Export

## Learning Goals

* Understand the pieces of a PyTorch training loop (model → loss → optimizer → data loader → epochs).
* Implement/inspect a tiny CNN and see how accuracy changes with hyper-parameters.
* Export to ONNX and verify the file loads and produces outputs of the right shape/dtype.

## You Should Be Able To...

- Implement a basic CNN using PyTorch layers
- Write a training loop with loss calculation and accuracy tracking
- Export a trained model to ONNX format with proper input/output specifications
- Explain why ONNX export is useful for edge deployment
- Identify key hyperparameters that affect model performance

---

## Concepts

**Training loop**: forward → compute loss → backward → optimizer step → repeat.

**Evaluation mode**: `model.eval()` turns dropout off and makes batch norm use its stored running statistics instead of updating them, so inference is deterministic.
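
A minimal sketch of the difference, assuming a hypothetical toy model with a dropout layer (not the lab's `TinyCNN`). In training mode dropout randomly zeroes activations, so repeated forward passes on the same input can differ; in evaluation mode they match.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model, used only to illustrate train vs. eval behavior.
toy = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5), nn.Linear(8, 2))
x = torch.randn(4, 8)

toy.train()                      # dropout active: outputs can vary between calls
train_a, train_b = toy(x), toy(x)

toy.eval()                       # dropout disabled: outputs are repeatable
with torch.no_grad():            # gradients are not needed for inference
    eval_a, eval_b = toy(x), toy(x)

print("train passes equal:", torch.equal(train_a, train_b))
print("eval passes equal: ", torch.equal(eval_a, eval_b))
```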

**Export to ONNX**: we trace the model with a sample input and save `models/model.onnx`.

**Preprocessing contract**: whatever normalization/resizing you used during training **must be used at inference** (outside the ONNX graph).
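
One way to honor this contract is to keep preprocessing in a single function that both the training pipeline and the inference code call. The sketch below is illustrative only: the `preprocess` name and the mean/std constants are assumptions, not the lab's actual settings.

```python
import numpy as np

# Assumed normalization constants, for illustration only.
MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)
STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)

def preprocess(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 image to a 1x3xHxW float32 array."""
    x = image_hwc_uint8.astype(np.float32) / 255.0   # scale to [0, 1]
    x = (x - MEAN) / STD                             # per-channel normalization
    x = np.transpose(x, (2, 0, 1))                   # HWC -> CHW
    return x[np.newaxis, ...]                        # add the batch dimension

image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
batch = preprocess(image)
print(batch.shape, batch.dtype)
```

Because the function lives outside the model, the exported ONNX graph expects input that has already been through it.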

## Common Pitfalls

* Forgetting `model.eval()` before export (exporting training behavior).
* Mismatch between training normalization and inference normalization (bad accuracy).
* Exporting with the wrong input shape.

## Success Criteria

* ✅ Training runs and prints accuracy
* ✅ `models/model.onnx` exists and can be loaded
* ✅ Checker says shapes/dtypes are valid

---

## Setup & Environment Check


In [None]:
# ruff: noqa: E401
import os
import sys
from pathlib import Path

# Ensure repo root in path if opened from labs/
if Path.cwd().name == "labs":
    os.chdir(Path.cwd().parent)
    print("→ Working dir set to repo root:", os.getcwd())
if os.getcwd() not in sys.path:
    sys.path.insert(0, os.getcwd())

# Core deps
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
import onnx
from onnx import checker  # noqa: F401
import onnxruntime as ort  # noqa: F401
import matplotlib.pyplot as plt
from torchvision.datasets import FakeData as TVFakeData

# Project package
from piedge_edukit.preprocess import FakeData as PEDFakeData

# Hints & Solutions helper (pure Jupyter, no extra deps)
from IPython.display import Markdown, display

def hints(*lines, solution: str | None = None, title="Need a nudge?"):
    """Render progressive hints + optional collapsible solution."""
    md = [f"### {title}"]
    for i, txt in enumerate(lines, start=1):
        md.append(f"<details><summary>Hint {i}</summary>\n\n{txt}\n\n</details>")
    if solution:
        # keep code fenced as python for readability
        md.append(
            "<details><summary><b>Show solution</b></summary>\n\n"
            f"```python\n{solution.strip()}\n```\n"
            "</details>"
        )
    display(Markdown("\n\n".join(md)))


In [None]:
# Environment self-heal (Python 3.12 + editable install)
import subprocess
import importlib

print(f"Python: {sys.version.split()[0]} (need 3.12)")

try:
    import piedge_edukit  # noqa: F401
    print("✅ PiEdge EduKit package OK")
except ModuleNotFoundError:
    print("ℹ️ Installing package in editable mode …")
    root = os.getcwd()
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-e", root])
    importlib.invalidate_caches()
    import piedge_edukit  # noqa: F401
    print("✅ Package installed")


In [None]:
# All imports are now in the first cell above
print("✅ All imports successful")


## Concept: Convolutional Neural Networks

CNNs are designed to process grid-like data (images) by:
- **Convolutional layers**: Learn spatial patterns (edges, textures, shapes)
- **Pooling layers**: Reduce spatial dimensions while preserving important features
- **Fully connected layers**: Make final classification decisions

For 64×64 RGB images, a typical architecture flows: `[3,64,64] → Conv → ReLU → Pool → Conv → ReLU → Pool → Flatten → Linear → Linear → [num_classes]`
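
The size of the flattened feature vector follows from the layer arithmetic. The sketch below assumes 3×3 convolutions with padding 1, 2×2 max-pooling with stride 2, and 32 channels in the last convolution; the helper name `conv2d_out` is hypothetical.

```python
def conv2d_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output height/width of a conv or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 64
size = conv2d_out(size, kernel=3, padding=1)   # 3x3 conv, padding 1 keeps 64
size = conv2d_out(size, kernel=2, stride=2)    # 2x2 max-pool halves to 32
size = conv2d_out(size, kernel=3, padding=1)   # second conv keeps 32
size = conv2d_out(size, kernel=2, stride=2)    # second pool halves to 16

channels = 32                                  # assumed width of the last conv
flat_features = channels * size * size
print(size, flat_features)  # 16 8192
```

The first `Linear` layer after `Flatten` must take exactly `flat_features` inputs, so this number changes whenever the input resolution or the number of pooling stages changes.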


## Task A: Implement a Simple CNN

Your task is to implement a `TinyCNN` class that can classify 64×64 RGB images into 2 classes.

### TODO A1 — Implement `TinyCNN`
**Goal:** Build a minimal Conv → ReLU → MaxPool stack ending in a linear head.

<details><summary>Hint 1</summary>
Start with 3×3 conv, stride=1, padding=1. Use MaxPool 2×2 to downsample.
</details>

<details><summary>Hint 2</summary>
Two conv blocks are enough for FakeData. Flatten before the linear layer.
</details>

<details><summary>Hint 3</summary>
`forward(x)` should return logits (no softmax).
</details>

<details><summary>Solution</summary>

```python
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Two 2x2 max-pools reduce 64x64 to 16x16, so the flattened size is 32*16*16
        self.head = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.net(x)
        x = x.view(x.size(0), -1)
        return self.head(x)
```

</details>


In [None]:
# TODO A1: implement TinyCNN here (or edit if already stubbed)

# Create model instance
model = TinyCNN(num_classes=2)
print(f"Model created with {sum(p.numel() for p in model.parameters())} parameters")


In [None]:
# TEST: model should accept [1,3,64,64] and output [1,2]
# (torch already imported in first cell)
x = torch.randn(1,3,64,64)
y = model(x)
assert y.shape == (1,2), f"Expected (1,2), got {tuple(y.shape)}"
print("✅ Shape test passed")
print(f"Input shape: {x.shape}")
print(f"Output shape: {y.shape}")
print(f"Output range: [{y.min().item():.3f}, {y.max().item():.3f}]")


## Concept: Training Loop Components

A typical training loop includes:
1. **Forward pass**: Compute predictions
2. **Loss calculation**: Compare predictions to ground truth
3. **Backward pass**: Compute gradients
4. **Optimizer step**: Update model parameters
5. **Metrics tracking**: Monitor loss and accuracy
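
The five steps can be sketched as a single epoch function. This is one possible implementation, not the lab's reference solution; it assumes a classification model that returns logits, a loader that yields `(images, labels)` batches, and cross-entropy loss. The `toy_model` and `toy_loader` names are hypothetical stand-ins.

```python
import torch
import torch.nn as nn

def train_one_epoch(model, loader, optimizer, device):
    """Run one pass over `loader`; return mean loss and accuracy in percent."""
    criterion = nn.CrossEntropyLoss()
    model.to(device)
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()                 # clear gradients from the previous step
        logits = model(images)                # 1. forward pass
        loss = criterion(logits, labels)      # 2. loss calculation
        loss.backward()                       # 3. backward pass
        optimizer.step()                      # 4. optimizer step
        total_loss += loss.item() * labels.size(0)   # 5. metrics tracking
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return {"loss": total_loss / seen, "acc": 100.0 * correct / seen}

# Usage with a hypothetical stand-in model and random batches
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
toy_loader = [(torch.randn(4, 3, 8, 8), torch.randint(0, 2, (4,))) for _ in range(3)]
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)
metrics = train_one_epoch(toy_model, toy_loader, toy_optimizer, torch.device("cpu"))
print(metrics)
```

Weighting the loss by batch size keeps the epoch average correct when the last batch is smaller than the others.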


## Task B: Write the Training Step

Implement a `train_step` function for a single batch, then wrap it in a `train_one_epoch(model, loader, optimizer, device)` function that loops over the data loader and returns a dict with `loss` and `acc`.

### TODO B1 — Implement one training step
**Goal:** zero_grad → forward → compute loss → backward → step

<details><summary>Hint 1</summary>
`optimizer.zero_grad()` must be called before `loss.backward()`.
</details>

<details><summary>Hint 2</summary>
Use `model.train()` during training.
</details>

<details><summary>Solution</summary>

```python
def train_step(model, batch, optimizer, criterion):
    model.train()
    x, y = batch
    optimizer.zero_grad()
    logits = model(x)
    loss = criterion(logits, y)
    loss.backward()
    optimizer.step()
    return float(loss.detach().cpu())
```

</details>


In [None]:
# TODO B1: implement train_step(...) and train_one_epoch(model, loader, optimizer, device)

# Test the function signature
print("✅ Function signature looks correct")


In [None]:
# Create test data and test the training function
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")

# Create fake data loader
# FakeData was imported under two aliases; these keyword arguments match the project version
fake_data = PEDFakeData(num_samples=100, image_size=64, num_classes=2)
train_loader = DataLoader(fake_data, batch_size=16, shuffle=True)

# Create optimizer
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Test training function
metrics = train_one_epoch(model, train_loader, optimizer, device)
assert "loss" in metrics and "acc" in metrics
print("✅ Training loop smoke test passed")
print(f"Loss: {metrics['loss']:.4f}, Accuracy: {metrics['acc']:.2f}%")


## Concept: ONNX Export

ONNX (Open Neural Network Exchange) is a format that allows models to run on different platforms:
- **Cross-platform**: Same model runs on CPU, GPU, mobile, edge devices
- **Optimized inference**: ONNX Runtime provides optimized execution
- **Language agnostic**: Models can be used from Python, C++, C#, JavaScript, etc.

Key requirements for export:
- Model must be in evaluation mode (`model.eval()`)
- Provide a dummy input with correct shape
- Specify input/output names for clarity


## Task C: Export to ONNX

Export your trained model to ONNX format for edge deployment.

### TODO C1 — Export to ONNX with dynamic axes
**Goal:** Put model in `eval()`, feed a dummy input, export to `models/model.onnx`.

<details><summary>Hint 1</summary>
Use `torch.onnx.export(model, dummy, "models/model.onnx", opset_version=17, dynamic_axes=...)`.
</details>

<details><summary>Hint 2</summary>
`dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}}`.
</details>

<details><summary>Solution</summary>

```python
import torch, os
os.makedirs("models", exist_ok=True)
model.eval()
dummy = torch.randn(1, 3, 64, 64)  # must match the 64x64 RGB input used in training
torch.onnx.export(
    model, dummy, "models/model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=17
)
print("[OK] Exported to models/model.onnx")
```

</details>


In [None]:
# TODO C1: export ONNX here

print("✅ ONNX export completed")


In [None]:
# Test ONNX export
# (onnx and os already imported in first cell)
assert os.path.exists("./models/model.onnx"), "ONNX file missing"
m = onnx.load("./models/model.onnx")
onnx.checker.check_model(m)
print("✅ ONNX export verified")

# Show model info
file_size = os.path.getsize("./models/model.onnx") / (1024*1024)
print(f"Model size: {file_size:.2f} MB")
# Dynamic axes carry a symbolic name (dim_param) instead of a fixed dim_value
def dims(value_info):
    return [d.dim_param or d.dim_value for d in value_info.type.tensor_type.shape.dim]

print(f"Input shape: {dims(m.graph.input[0])}")
print(f"Output shape: {dims(m.graph.output[0])}")


## Reflection Questions

Please answer these questions in 2-3 sentences each:


**1. What two hyperparameters most affected your validation accuracy? Why?**

*Your answer here (2-3 sentences):*

---

**2. Why is exporting to ONNX useful for edge deployment?**

*Your answer here (2-3 sentences):*

---

**3. What would happen if you forgot to call `model.eval()` before ONNX export?**

*Your answer here (2-3 sentences):*


## Next Steps

Great work! You've implemented a CNN, trained it, and exported it to ONNX format.

**Next**: Open `02_latency_benchmark.ipynb` to learn about performance measurement and optimization.

---

### Summary
- ✅ Implemented TinyCNN architecture
- ✅ Created training loop with metrics
- ✅ Exported model to ONNX format
- ✅ Verified export integrity


# 🧠 Training & ONNX Export - Understanding What Happens

**Goal**: Understand how training works and experiment with different settings.

In this notebook we will:
- Understand what FakeData is and why we use it
- See how dataset pipeline → model → loss/accuracy fits together
- Experiment with different hyperparameters
- Understand why we export to ONNX

> **💡 Tip**: Run the cells in order and read the explanations. Feel free to experiment with the values!


## 🤔 What is FakeData and why do we use it?

**FakeData** is a torchvision dataset that generates synthetic images on the fly. It is a good fit for:
- **Fast prototyping** - no large datasets to download
- **Reproducibility** - the same data every time
- **Teaching** - focus on algorithms, not data handling

<details>
<summary>🔍 Click to see what FakeData contains</summary>

```python
# FakeData generates:
# - Random RGB images (64x64 pixels in this lab)
# - Random class labels (0, 1, 2, ...)
# - The same structure as real image datasets
```

</details>


In [None]:
# Let's create a small FakeData dataset to see what it contains
import torch
from torchvision import datasets
import matplotlib.pyplot as plt

# Create FakeData with 2 classes; image_size is set explicitly because
# torchvision defaults to 3x224x224 rather than the 64x64 used in this lab
fake_data = datasets.FakeData(size=10, image_size=(3, 64, 64), num_classes=2, transform=None)

# Inspect the first image (a PIL image, since no transform is applied)
image, label = fake_data[0]
print(f"Image size: {image.size}")
print(f"Class: {label}")
print(f"Pixel value range per channel: {image.getextrema()}")

# Show the image
plt.figure(figsize=(6, 4))
plt.imshow(image)
plt.title(f"FakeData - Class {label}")
plt.axis('off')
plt.show()


## 🎯 Experiment with Training

Now we will train a model and see how different settings affect the result.

**Hyperparameters to experiment with**:
- `epochs` - number of passes over the dataset
- `batch_size` - number of images per training step
- `--no-pretrained` - train from scratch instead of starting from pretrained weights
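
Epochs and batch size together determine how many weight updates the optimizer performs. A small sketch of that arithmetic, assuming the last partial batch is kept; the dataset size of 1000 is illustrative, not the lab's actual size, and `optimizer_steps` is a hypothetical helper.

```python
import math

def optimizer_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Total optimizer updates when the last partial batch is kept."""
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs

# Larger batches mean fewer updates for the same number of epochs.
for batch_size in (64, 128, 256):
    print(batch_size, optimizer_steps(num_samples=1000, batch_size=batch_size, epochs=3))
```

This is one reason a change in batch size often calls for revisiting the learning rate or the number of epochs.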


In [None]:
# Experiment 1: Quick training (1 epoch, no pretrained weights)
print("🧪 Experiment 1: Quick training")
!python -m piedge_edukit.train --fakedata --no-pretrained --epochs 1 --batch-size 128 --output-dir ./models_exp1


In [None]:
# Show training results from Experiment 1
import json
import os

def fmt(value):
    """Format numbers to three decimals; pass anything else (e.g. 'N/A') through."""
    return f"{value:.3f}" if isinstance(value, (int, float)) else str(value)

if os.path.exists("./models_exp1/training_info.json"):
    with open("./models_exp1/training_info.json", "r") as f:
        info = json.load(f)

    print("📊 Training results (Experiment 1):")
    print(f"Final accuracy: {fmt(info.get('final_accuracy', 'N/A'))}")
    print(f"Final loss: {fmt(info.get('final_loss', 'N/A'))}")
    print(f"Epochs: {info.get('epochs', 'N/A')}")
    print(f"Batch size: {info.get('batch_size', 'N/A')}")
else:
    print("❌ Training info missing")


## 🤔 Reflection Questions

<details>
<summary>💭 What happens with overfitting when you increase epochs?</summary>

**Answer**: With more epochs the model can fit the training data too closely and generalize poorly to new data. This is called overfitting.

**Experiment**: Run the same training with `--epochs 5` and compare accuracy on training vs. validation data.

</details>

<details>
<summary>💭 Why do we export to ONNX (for Pi/edge)?</summary>

**Answer**: ONNX is a standard format supported on many platforms (CPU, GPU, mobile, edge). It makes the model portable, and runtimes such as ONNX Runtime can optimize it for inference.

**Benefits**:
- Often faster inference than running the model in PyTorch
- Typically lower memory use, since the full training framework is not needed at runtime
- Runs on Raspberry Pi
- Supports quantization (INT8)

</details>


## 🎯 Your Own Experiment

**Task**: Train a model with different settings and compare the results.

**Suggestions**:
- Increase epochs to 3-5
- Change batch_size to 64 or 256
- Try with and without `--no-pretrained`

**Code to modify**:
```python
# Change these values:
EPOCHS = 3
BATCH_SIZE = 64
USE_PRETRAINED = False  # True for pretrained weights

# Pass --no-pretrained only when pretrained weights are disabled
PRETRAINED_FLAG = "" if USE_PRETRAINED else "--no-pretrained"

!python -m piedge_edukit.train --fakedata {PRETRAINED_FLAG} --epochs {EPOCHS} --batch-size {BATCH_SIZE} --output-dir ./models_myexp
```


In [None]:
# TODO: Implement your experiment here
# Change the values below and run the training

EPOCHS = 3
BATCH_SIZE = 64
USE_PRETRAINED = False

# Pass --no-pretrained only when pretrained weights are disabled
PRETRAINED_FLAG = "" if USE_PRETRAINED else "--no-pretrained"

print(f"🧪 My experiment: epochs={EPOCHS}, batch_size={BATCH_SIZE}, pretrained={USE_PRETRAINED}")

# TODO: Run the training with your settings
# !python -m piedge_edukit.train --fakedata {PRETRAINED_FLAG} --epochs {EPOCHS} --batch-size {BATCH_SIZE} --output-dir ./models_myexp


## 🎉 Summary

You have now learned:
- What FakeData is and why we use it
- How training behaves with different hyperparameters
- Why ONNX export matters for edge deployment

**Next step**: Go to `02_latency_benchmark.ipynb` to understand how we measure the model's performance.

**Key terms**:
- **Epochs**: Number of passes over the dataset
- **Batch size**: Number of images per training step
- **Pretrained weights**: Weights pretrained on ImageNet
- **ONNX**: Standard format for edge deployment
