# Lesson 10 - Simplified workflows (Lightning / fastai)

**Assignment: Reduce boilerplate in training loops**

Instructions:
1. Start from a plain PyTorch training loop
2. Refactor to Lightning OR implement with fastai
3. Compare code length and readability

## Task 1: Baseline PyTorch loop
Start with a plain training loop.

In [9]:
# TODO: Build a small model and training loop in PyTorch
# We do this the usual way:
# load the data, then build a training loop

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

# Load the data and split it into X and y
iris = load_iris()
X = iris.data
y = iris.target

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X,y, test_size=0.3, random_state=42)

# Scale the data after the split to avoid data leakage.
# Fit the scaler on the training set only, then reuse it to transform
# the test set, so no test-set statistics influence the preprocessing.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# ============ THIS IS WHERE WE WOULD TYPICALLY DO EDA ============ #

# Define a small network with three layers of nodes (input, hidden, output)
# of sizes 4, 128, and 3 respectively
model = nn.Sequential(
    nn.Linear(4, 128),
    nn.ReLU(),
    nn.Linear(128, 3)
)

# We use CrossEntropyLoss because this is a classification problem with more than 2 classes
criterion = nn.CrossEntropyLoss()

# Adam is our default optimizer
optimizer = optim.Adam(model.parameters(), lr=0.01)

device = "cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"

model.to(device)


Sequential(
  (0): Linear(in_features=4, out_features=128, bias=True)
  (1): ReLU()
  (2): Linear(in_features=128, out_features=3, bias=True)
)
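
The scaling step in the cell above can be checked by hand. One common convention is to compute the scaling statistics on the training split only and reuse them for the test split. The sketch below shows this with plain NumPy on made-up data; the arrays and the names `mean`, `std`, `X_train_s`, and `X_test_s` are illustrative and not part of the assignment. `StandardScaler` standardizes each feature with its mean and population standard deviation, which is what `np.std` computes by default.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up data standing in for a train/test split (105 and 45 rows, 4 features)
X_train = rng.normal(loc=5.0, scale=2.0, size=(105, 4))
X_test = rng.normal(loc=5.0, scale=2.0, size=(45, 4))

# The statistics come from the training split only
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)  # population std (ddof=0)

X_train_s = (X_train - mean) / std
X_test_s = (X_test - mean) / std  # reuse the training statistics

print(np.abs(X_train_s.mean(axis=0)).max())  # practically 0 for every feature
print(X_test_s.mean(axis=0).round(3))        # near 0, but not exactly 0
```

The test split ends up only approximately standardized, which is expected: it is treated as unseen data.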

In [10]:
X.shape

(150, 4)

In [11]:
# Convert the data to tensors and wrap it in TensorDataset and DataLoader to simplify training

train_ds = TensorDataset(
    torch.tensor(X_train, dtype=torch.float32),
    torch.tensor(y_train, dtype=torch.long),
)
test_ds = TensorDataset(
    torch.tensor(X_test, dtype=torch.float32),
    torch.tensor(y_test, dtype=torch.long),
)

# The loader lets us iterate over the data in batches and shuffle it during training.
# Batching also makes the data easier for torch to handle, and shuffling can lead to better convergence.
train_loader = DataLoader(train_ds, batch_size=16, shuffle=True)
test_loader = DataLoader(test_ds, batch_size=16)
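
To see what the loader yields, here is a small sketch on random data with the same shape as the Iris training split (105 rows and 4 features, assuming the 70/30 split of 150 samples used above). The tensors and the name `demo_loader` are made up for illustration.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)
# Random stand-in data: 105 samples, 4 features, 3 classes
features = torch.randn(105, 4)
labels = torch.randint(0, 3, (105,))

demo_loader = DataLoader(TensorDataset(features, labels), batch_size=16, shuffle=True)

batch_sizes = [xb.shape[0] for xb, yb in demo_loader]
print(len(demo_loader))  # number of batches per epoch
print(batch_sizes)       # the last batch holds the leftover samples
```

With 105 samples and `batch_size=16` this gives 7 batches: six of 16 samples and a final one of 9.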

In [12]:
# TODO: Train for a few epochs and record accuracy

epochs = 3
for _ in range(epochs):
    model.train()
    for xb, yb in train_loader:
        xb, yb = xb.to(device), yb.to(device)
        
        # A. Zero grad: Set gradients to zero before backward pass
        optimizer.zero_grad()
    
        # B. Forward: Build the graph & get prediction
        # The outputs are often called logits, but that name is not required
        outputs = model(xb)
        loss = criterion(outputs, yb)
    
        # C. Backward: AutoDiff calculates the "blame" (gradients)
        loss.backward()
        
        # D. Update: Optimizer moves weights down the hill
        optimizer.step()

model.eval()
correct, total = 0, 0
with torch.no_grad():
    for xb, yb in test_loader:
        xb, yb = xb.to(device), yb.to(device)
        preds = torch.argmax(model(xb), dim=1)
        correct += (preds == yb).sum().item()
        total += yb.size(0)
print(f"Baseline accuracy: {correct / total:.4f}")

Baseline accuracy: 0.9111
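
The evaluation loop above can be wrapped in a small helper so that the same measurement is reusable for any model. This is a sketch, not part of the assignment: the function name `accuracy` and the toy model and data below are hypothetical.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader


def accuracy(model: nn.Module, loader: DataLoader, device: str = "cpu") -> float:
    """Return the fraction of samples in `loader` that `model` classifies correctly."""
    model.to(device)
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for xb, yb in loader:
            xb, yb = xb.to(device), yb.to(device)
            preds = torch.argmax(model(xb), dim=1)
            correct += (preds == yb).sum().item()
            total += yb.size(0)
    return correct / total


# Toy check: an identity "model" predicts the index of the largest feature
toy_model = nn.Identity()
toy_x = torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
toy_y = torch.tensor([0, 1, 1])  # the last label deliberately disagrees with the prediction
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=2)

toy_acc = accuracy(toy_model, toy_loader)
print(f"Toy accuracy: {toy_acc:.4f}")
```

With the objects defined in this notebook, a call such as `accuracy(model, test_loader, device)` performs the same steps as the loop above.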


## Task 2: Refactor with Lightning OR fastai
Reduce boilerplate using a higher-level framework.

In [13]:
# TODO: Convert the loop into a LightningModule (or a fastai Learner)
import pytorch_lightning as pl

class IrisModule(pl.LightningModule):
    def __init__(self):
        # The model and the loss function are central to training,
        # so both are instantiated right here in __init__
        super().__init__()
        self.model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = self.loss_fn(logits, y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return optim.Adam(self.parameters(), lr=0.01)
    
    # def test_step(self, batch, batch_idx):
    #     x, y = batch
    #     logits = self(x)
    #     loss = self.loss_fn(logits, y)
    #     acc = (logits.argmax(dim=1) == y).float().mean()

    #     # 👇 These will appear nicely formatted at the end
    #     self.log("test_loss", loss, prog_bar=True)
    #     self.log("test_acc", acc, prog_bar=True)

    #     return loss
    



In [14]:
# TODO: Train the same model and record accuracy

# The class above is missing several of the central steps of a training loop
# (zero_grad, backward, step, train, and so on).
# Lightning, and frameworks like it, run the training loop for us and only
# need to know a few basics: the model, the loss, and the optimizer
trainer = pl.Trainer(max_epochs=30, logger=False, enable_checkpointing=False)
lightning_model = IrisModule()
trainer.fit(lightning_model, train_loader)
print("Lightning training complete.")

# Lightning printed a detailed log, in which it looked for (and found) a GPU.
# It also trained our model

# Find the accuracy of the lightning model
# lightning_model.eval()
# correct, total = 0, 0
# with torch.no_grad():
#     for xb, yb in test_loader:
#         # xb, yb = xb.to(device), yb.to(device)
#         preds = torch.argmax(lightning_model(xb), dim=1)
#         correct += (preds == yb).sum().item()
#         total += yb.size(0)
# print(f"Lightning accuracy: {correct / total:.4f}")


GPU available: True (mps), used: True


TPU available: False, using: 0 TPU cores
💡 Tip: For seamless cloud logging and experiment tracking, try installing [litlogger](https://pypi.org/project/litlogger/) to enable LitLogger, which logs metrics and artifacts automatically to the Lightning Experiments platform.

  | Name    | Type             | Params | Mode  | FLOPs
-------------------------------------------------------------
0 | model   | Sequential       | 131    | train | 0    
1 | loss_fn | CrossEntropyLoss | 0      | train | 0    
-------------------------------------------------------------
131       Trainable params
0         Non-trainable params
131       Total params
0.001     Total estimated model params size (MB)
5         Modules in train mode
0         Modules in eval mode
0         Total Flops
/Users/andreas/ML-Frameworks/ML-Frameworks/.venv/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:434: The 'train_dataloader' does not have many workers which may be a bottleneck. Con

Epoch 29: 100%|██████████| 7/7 [00:00<00:00, 144.79it/s]

`Trainer.fit` stopped: `max_epochs=30` reached.


Epoch 29: 100%|██████████| 7/7 [00:00<00:00, 142.37it/s]
Lightning training complete.
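
To make concrete what the trainer takes over, here is a rough sketch of a loop that only calls `training_step` and `configure_optimizers` on a module. It is not Lightning's actual implementation, which also handles devices, logging, callbacks, and more. `TinyModule`, `tiny_fit`, and the toy data are hypothetical names used for illustration, built on plain PyTorch.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader


class TinyModule(nn.Module):
    """Plain nn.Module exposing the two hooks the loop below relies on."""

    def __init__(self):
        super().__init__()
        self.model = nn.Linear(2, 2)
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, x):
        return self.model(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return self.loss_fn(self(x), y)

    def configure_optimizers(self):
        return optim.Adam(self.parameters(), lr=0.1)


def tiny_fit(module, loader, max_epochs):
    """The boilerplate a trainer hides: train mode, zero_grad, backward, step."""
    optimizer = module.configure_optimizers()
    module.train()
    for _ in range(max_epochs):
        for batch_idx, batch in enumerate(loader):
            optimizer.zero_grad()
            loss = module.training_step(batch, batch_idx)
            loss.backward()
            optimizer.step()


torch.manual_seed(0)
# Toy, linearly separable data: the class is decided by the sign of feature 0
toy_x = torch.randn(64, 2)
toy_y = (toy_x[:, 0] > 0).long()
toy_loader = DataLoader(TensorDataset(toy_x, toy_y), batch_size=16, shuffle=True)

module = TinyModule()
with torch.no_grad():
    loss_before = module.loss_fn(module(toy_x), toy_y).item()
tiny_fit(module, toy_loader, max_epochs=20)
with torch.no_grad():
    loss_after = module.loss_fn(module(toy_x), toy_y).item()
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

The module only describes one step and which optimizer to use; everything that repeats across projects lives in `tiny_fit`. That split is the idea behind `IrisModule` and `pl.Trainer` above.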


## Task 3: Compare
Reflect on readability and debugging.

In [15]:
# TODO: Write 4-6 comment lines about:
# - what boilerplate disappeared
# - what became easier or harder to debug
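
One rough way to put a number on "code length" is to count lines that are neither blank nor pure comments. The helper below is a hypothetical sketch: the name `count_code_lines` and the two snippets are made up for illustration. It does not handle docstrings or multi-line strings, so treat the result as an approximation.

```python
def count_code_lines(source: str) -> int:
    """Count lines that are neither blank nor comment-only."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


manual_loop = """
for xb, yb in train_loader:
    # reset gradients
    optimizer.zero_grad()
    loss = criterion(model(xb), yb)
    loss.backward()
    optimizer.step()
"""

lightning_call = """
# the trainer runs the loop
trainer.fit(lightning_model, train_loader)
"""

print(count_code_lines(manual_loop))
print(count_code_lines(lightning_call))
```

A line count says nothing about readability or ease of debugging, so it complements the written reflection rather than replacing it.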

In [16]:
print("Done! You simplified training with higher-level tools.")

Done! You simplified training with higher-level tools.
