# Lesson 7 - Automatic differentiation and optimization

**Assignment: Gradients and optimizers**

Instructions:
1. Use PyTorch or JAX
2. Keep examples small and explain with comments

## Task 1: Autodiff basics
Use automatic differentiation and compare the result with an analytic derivative.

In [3]:
# TODO: Define f(x) = x**3 + 2*x

def f(x):
    return (x**3 + 2*x)

print(f(3))

33


In [4]:
# TODO: Use autodiff to compute df/dx at x=3
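import torch

# Create x as a tensor that tracks gradients.
x = torch.tensor(3.0, requires_grad=True)
# Evaluate f (defined in the previous cell: x**3 + 2*x) to build the graph.
g = f(x)
# Backpropagate; the derivative dg/dx is stored in x.grad.
g.backward()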
print("Autodiff df/dx at x=3:", x.grad.item())


Autodiff df/dx at x=3: 29.0
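
One detail of `backward()` that matters for the training loop in Task 2: PyTorch adds new gradients to whatever is already stored in `.grad` rather than overwriting it. The sketch below repeats the Task 1 computation to show this; the variable names `first`, `accumulated`, and `after_reset` are only illustrative.

```python
import torch


def f(x):
    return x**3 + 2 * x


x = torch.tensor(3.0, requires_grad=True)

f(x).backward()
first = x.grad.item()        # 29.0

# A second backward pass adds to the stored gradient instead of replacing it.
f(x).backward()
accumulated = x.grad.item()  # 58.0

# Resetting the gradient in place restores the expected value on the next pass.
x.grad.zero_()
f(x).backward()
after_reset = x.grad.item()  # 29.0

print(first, accumulated, after_reset)
```

This is why a training loop normally clears gradients once per step, for example with `optimizer.zero_grad()`.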


In [5]:
# TODO: Compare with the analytic derivative

def f_prim(x):
    return 3 * x**2 + 2

print(f_prim(3))



29
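
To give a feel for what an autodiff engine does internally, here is a minimal sketch of forward-mode differentiation with dual numbers. It is not how PyTorch works (PyTorch uses reverse mode and records a graph); the `Dual` class and `_lift` helper are hypothetical names made up for this illustration, and only `+`, `*`, and constant powers are supported.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to the input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __pow__(self, n):
        # Power rule for a constant exponent n: (u**n)' = n * u**(n - 1) * u'
        return Dual(self.val ** n, n * self.val ** (n - 1) * self.der)


def _lift(x):
    """Treat plain numbers as constants (derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def f(x):
    return x**3 + 2 * x


def f_prim(x):
    return 3 * x**2 + 2


# Seeding der=1.0 means "differentiate with respect to this input".
out = f(Dual(3.0, 1.0))
print(out.val, out.der, f_prim(3.0))  # 33.0 29.0 29.0
```

The derivative is carried along with the value through every operation, so it agrees with the analytic result up to floating-point rounding, with no step-size choice as in finite differences.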


## Task 2: Optimizer comparison
Train a small model and compare optimizers.

In [7]:
# TODO: Train a small model (e.g., logistic regression)
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

import torch
import torch.nn as nn
import torch.optim as optim

# Start by creating a small synthetic classification dataset
# with scikit-learn's make_classification.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0, random_state=42
)

# Caution: scaling should not happen before the train/test split.
# Why? Fitting the scaler on all rows lets test-set statistics seep into training.
# This is called data leakage: the model looks artificially good on the test set
# but generalizes worse. The line below fits on all of X, so it has this problem.
X = StandardScaler().fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# This is where the data would typically be scaled (fit the scaler on X_train only).

X_train_t = torch.tensor(X_train, dtype=torch.float32)
X_test_t = torch.tensor(X_test, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.float32).view(-1, 1)
y_test_t = torch.tensor(y_test, dtype=torch.float32).view(-1, 1)
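
A leak-free variant of the preprocessing might look like the sketch below: split first, fit the scaler on the training rows only, then reuse those statistics for the test rows. It assumes the same dataset parameters as the cell above; the metrics recorded later in this notebook come from the original cell, not from this variant.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0, random_state=42
)

# Split first, so the test rows never influence the scaling statistics.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)  # mean/std estimated from training rows only
X_test = scaler.transform(X_test)        # reuse the training statistics

print(X_train.mean(axis=0).round(6))  # approximately zero for every feature
print(X_test.mean(axis=0).round(3))   # near zero, but generally not exactly zero
```

The test-set means are not exactly zero, which is expected: the test rows are treated like unseen data and scaled with parameters they did not help estimate.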




In [None]:
# TODO: Compare SGD vs Adam for 20-50 epochs

def train_with_optimizer(optimizer_cls, lr=0.01, epochs=30):
    model = nn.Sequential(nn.Linear(X_train.shape[1], 1), nn.Sigmoid())
    loss_fn = nn.BCELoss()  # Binary cross-entropy, since there are only two classes; with more classes, use nn.CrossEntropyLoss
    optimizer = optimizer_cls(model.parameters(), lr=lr)
    
    # The loop below uses _ instead of a named variable (e.g. i).
    # This is a naming convention that signals the value is unused;
    # _ is an ordinary variable, so it does not save memory.
    for _ in range(epochs):
        model.train()
        preds = model(X_train_t)
        loss = loss_fn(preds, y_train_t)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    model.eval()
    with torch.no_grad():
        test_preds = (model(X_test_t) > 0.5).float()
        acc = (test_preds == y_test_t).float().mean().item()
        test_loss = loss_fn(model(X_test_t), y_test_t).item()
    return test_loss, acc

sgd_loss, sgd_acc = train_with_optimizer(optim.SGD, lr=0.1, epochs=40)
adam_loss, adam_acc = train_with_optimizer(optim.Adam, lr=0.01, epochs=40)
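
To see how the two update rules differ, the following sketch applies plain SGD and a textbook form of Adam to a one-parameter toy loss, L(w) = (w - 3)**2. The toy loss and the functions `grad`, `sgd`, and `adam` are assumptions made for illustration; they are not the model above and not PyTorch's implementation. The learning rates mirror the ones used in the cell above (0.1 for SGD, 0.01 for Adam).

```python
import math


def grad(w):
    """Gradient of the toy loss L(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)


def sgd(w, lr=0.1, steps=40):
    for _ in range(steps):
        # The step is proportional to the raw gradient.
        w -= lr * grad(w)
    return w


def adam(w, lr=0.01, steps=40, beta1=0.9, beta2=0.999, eps=1e-8):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)         # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        # The step is normalized by the gradient scale, so it is roughly lr in size.
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_sgd = sgd(0.0)
w_adam = adam(0.0)
print(f"SGD  ends at w = {w_sgd:.4f}")
print(f"Adam ends at w = {w_adam:.4f}")
```

On this toy problem SGD ends close to the minimum at w = 3, while Adam moves only about `lr` per step and is still far away after 40 steps. The takeaway is not that SGD is better, but that a comparison with different learning rates and few epochs mostly reflects those settings.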



In [None]:
# TODO: Record final loss and accuracy

# How do we interpret the results?
# Accuracy: what fraction of the predictions were correct? (a value between 0 and 1, i.e. 0-100%)
# Loss: how wrong was the model? (0 to inf, lower is better; useful for comparing models on the same problem)

print(f"SGD -> loss: {sgd_loss:.4f}, acc: {sgd_acc:.4f}")
print(f"Adam -> loss: {adam_loss:.4f}, acc: {adam_acc:.4f}")

SGD -> loss: 0.6577, acc: 0.6100
Adam -> loss: 0.6609, acc: 0.6200
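
As a reference point for reading these numbers, the sketch below computes accuracy and binary cross-entropy by hand on four made-up predictions. The values in `probs` and `labels` are hypothetical, and the helper names `bce` and `accuracy` are illustrative only.

```python
import math

probs = [0.9, 0.2, 0.6, 0.4]  # hypothetical predicted P(y = 1)
labels = [1, 0, 0, 1]         # hypothetical true labels


def bce(probs, labels):
    """Mean binary cross-entropy over all samples."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples where the thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == y for p, y in zip(probs, labels))
    return hits / len(labels)


print(f"accuracy: {accuracy(probs, labels):.4f}")                        # 0.5000
print(f"BCE loss: {bce(probs, labels):.4f}")                             # 0.5403
print(f"BCE when always predicting 0.5: {bce([0.5] * 4, labels):.4f}")  # 0.6931
```

A model that always outputs 0.5 has a loss of ln 2, about 0.6931. The losses recorded above (0.6577 and 0.6609) are only slightly below that, which suggests that after 40 epochs both models were still close to chance level.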


## Task 3: Reflection
Add short comments based on your results.

In [None]:
# TODO: Write 4-6 comment lines about when SGD can be preferable to Adam

In [None]:
print("Done! You computed gradients and compared optimizers.")