<a href="https://colab.research.google.com/github/els285/Aachen_Intro2NN/blob/main/Exercises/3_Regression_Penguins_PyTorch.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Regression in PyTorch

Repeating the linear regression procedure in PyTorch is instructive preparation for building neural networks in PyTorch later.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import torch
from torch import nn, optim

## Loading data

In [None]:
!wget -O penguins_downloaded.csv "https://cernbox.cern.ch/s/wh34GhKCOv0Umh7/download"
print("Download complete")

Load the penguin dataset (a CSV file) with the `pandas` module, then drop any rows with missing values.

In [None]:
input_penguins_df = pd.read_csv("penguins_downloaded.csv")
penguins_df = input_penguins_df.dropna(inplace=False)

PyTorch needs the data as tensors, which we are already familiar with. Each tensor must be a column of shape (N, 1), i.e. N samples with one feature each, hence the `reshape(-1, 1)`.

In [None]:
input_data = torch.tensor(penguins_df["flipper_length_mm"].values, dtype=torch.float32).reshape(-1,1)
target     = torch.tensor(penguins_df["body_mass_g"].values,       dtype=torch.float32).reshape(-1,1)
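As a small illustration of the reshaping, the sketch below uses made-up flipper lengths (not values taken from the dataset) to show how `reshape(-1, 1)` turns a 1-D tensor of N values into an (N, 1) column.

```python
import torch

# Hypothetical flipper lengths in mm, for illustration only
values = [181.0, 186.0, 195.0, 193.0]

flat = torch.tensor(values, dtype=torch.float32)
column = flat.reshape(-1, 1)  # -1 lets PyTorch infer the number of rows

print(flat.shape)    # torch.Size([4])
print(column.shape)  # torch.Size([4, 1])
```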

Our model here is a single linear layer, which for simple linear regression maps a single input value $X_i$ to a single output value $y_i$.

In [None]:
model = nn.Linear(1, 1)

We can examine the model through the following:

In [None]:
print(model)
print(list(model.parameters()))

This shows that we have defined a `Linear` model with one input, one output, and the bias turned on (for simple linear regression the bias is just the y-intercept). The model parameters, here one weight and one bias, are the values that change as we train; for now they are set to random values.
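To make the role of the two parameters concrete, here is a minimal sketch that reproduces the output of a linear layer by hand. The name `layer` and the input values are hypothetical, chosen so as not to interfere with `model` in the notebook.

```python
import torch
from torch import nn

torch.manual_seed(0)     # fixed seed so the random initial values are reproducible
layer = nn.Linear(1, 1)  # hypothetical stand-in for `model`

x = torch.tensor([[200.0], [210.0]])  # two made-up inputs, shape (2, 1)

# A linear layer computes x @ W^T + b
manual = x @ layer.weight.T + layer.bias

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(layer(x), manual))
```

For a single input and output, `weight` is the slope and `bias` is the intercept of the fitted line.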

## Training

In the sklearn case, an analytic solution for ordinary least squares was implicitly used to solve the regression problem.
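For comparison, a closed-form least-squares fit can also be sketched directly in PyTorch. This is only an illustration on made-up data using `torch.linalg.lstsq`; it is an assumption that this mirrors what sklearn does, and the routine sklearn uses internally may differ.

```python
import torch

# Made-up data lying exactly on y = 2x + 1, for illustration only
x = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * x + 1.0

# Design matrix with a column of ones so that the intercept is fitted too
A = torch.cat([x, torch.ones_like(x)], dim=1)

# Least-squares solution of A @ coeffs = y, obtained without any iteration
coeffs = torch.linalg.lstsq(A, y).solution
slope, intercept = coeffs[0].item(), coeffs[1].item()

print(slope, intercept)  # approximately 2.0 and 1.0
```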

In PyTorch, the same problem is solved iteratively by minimising a loss function and updating our two model parameters (the weight and the bias). We will use `MSELoss`, the mean squared error, which we saw in the lecture slides: it is a very common choice for regression problems.

In [None]:
loss_function = nn.MSELoss()
optimizer = optim.Rprop(model.parameters())
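As a quick check of what `MSELoss` computes, the sketch below compares it with the mean of the squared differences written out by hand. The prediction and target values are made up for illustration.

```python
import torch
from torch import nn

pred = torch.tensor([[1.0], [2.0], [4.0]])
true = torch.tensor([[1.0], [3.0], [2.0]])

mse_builtin = nn.MSELoss()(pred, true)
mse_manual = torch.mean((pred - true) ** 2)  # (0 + 1 + 4) / 3

print(mse_builtin.item(), mse_manual.item())
```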

Now the iteration itself, a `for` loop which:
* Makes a prediction based on the current model
* Computes the loss, a measure of the difference between the prediction and the true value
* Computes the gradients of the loss and updates the model parameters

This cell is slightly more verbose, so make sure you understand what each line is doing.

In [None]:
# keep track of the loss every epoch. This is only for visualisation
losses = []

N_epochs = 1000

for epoch in range(N_epochs):
    # reset the gradients left over from the previous iteration
    optimizer.zero_grad()

    # use the model as a prediction function: features → prediction
    predictions = model(input_data)

    # compute the loss (mean squared error) between these predictions and the intended targets
    loss = loss_function(predictions, target)

    # backpropagate to compute the gradients, then let the optimizer update the parameters
    loss.backward()
    optimizer.step()

    losses.append(loss.item())

    # Print the loss every 10 epochs
    if (epoch + 1) % 10 == 0:
        print(f'Epoch [{epoch + 1}/{N_epochs}], Loss: {loss.item():.4f}')
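The reason for calling `optimizer.zero_grad()` at the start of every iteration is that PyTorch adds new gradients to any already stored. The sketch below shows this on a single hypothetical parameter `w`, outside of any model or optimizer.

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

# Two backward passes without clearing: the gradients add up
(3.0 * w).backward()
(3.0 * w).backward()
accumulated = w.grad.item()  # 3 + 3

# Clearing first gives the gradient of a single pass
w.grad.zero_()
(3.0 * w).backward()
single = w.grad.item()

print(accumulated, single)  # 6.0 3.0
```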

# Task 1
* Try a small number of epochs, e.g. 30, and see what the loss curve looks like: has the model converged?
* Reset the model parameters and see how many epochs it takes for the model to converge. Beware of local minima!

We've printed the loss every 10 epochs to check that it is decreasing as we want.
This is good practice when training ML models, rather than waiting until all epochs have run and only then examining the result.
There are also many advanced tools for monitoring training: wandb.ai and TensorBoard are a couple of examples.

In [None]:
def plot_loss_curve(losses):
    plt.plot(losses)
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.yscale('log')
    plt.title('Loss Curve')
    plt.show()

plot_loss_curve(losses)

In [None]:
# USE THIS TO RESET THE MODEL PARAMETERS!!!
model.reset_parameters()
optimizer = optim.Rprop(model.parameters())

## Evaluating the model

Evaluating the model on the input data is as simple as passing the data as an argument. This returns a tensor that is still attached to the computation graph used for the derivatives, so we detach it before plotting or converting to NumPy.

In [None]:
y_out = model(input_data)
y_pred = y_out.detach()
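The effect of detaching can be inspected through the `requires_grad` flag. In this sketch `layer` and the inputs are hypothetical stand-ins; it also shows `torch.no_grad()`, an alternative that avoids tracking derivatives in the first place.

```python
import torch
from torch import nn

layer = nn.Linear(1, 1)  # hypothetical stand-in for the trained model
x = torch.tensor([[1.0], [2.0]])

out = layer(x)
print(out.requires_grad)           # True: still attached to the graph
print(out.detach().requires_grad)  # False: safe to plot or convert

# Alternative: do not track derivatives during evaluation at all
with torch.no_grad():
    out_no_grad = layer(x)
print(out_no_grad.requires_grad)   # False

# Conversion to NumPy only works on a tensor that does not require grad
as_numpy = out.detach().numpy()
print(as_numpy.shape)              # (2, 1)
```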

In [None]:
# Plot the original data and the linear regression line
plt.scatter(input_data, target, color='blue', label='Data Points')
plt.plot(input_data, y_pred, color='red', label='Linear Regression Line')
plt.xlabel('Input')
plt.ylabel('Target')
plt.title('Linear Regression Example')
plt.legend()
plt.show()

We can use the regression performance metrics from before to evaluate the performance of the model. Build a function which computes the R-squared score from its definition:
$R^2 = 1 - \frac{\sum (y_{\text{true}} - y_{\text{pred}})^2}{\sum (y_{\text{true}} - \bar{y}_{\text{true}})^2}$

In [None]:
# @title Solution to coefficient of determination {"display-mode":"form"}

# R-squared metric
def r_squared(y_true, y_pred):
    ss_res = torch.sum((y_true - y_pred) ** 2)
    ss_tot = torch.sum((y_true - torch.mean(y_true)) ** 2)
    return 1 - (ss_res / ss_tot)

r_squared_value = r_squared(target, y_pred)
print(r_squared_value)
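Two limiting cases help to interpret the score: a perfect prediction gives $R^2 = 1$, while always predicting the mean of the targets gives $R^2 = 0$. The sketch below checks both on made-up values, repeating the function so that it runs on its own.

```python
import torch

def r_squared(y_true, y_pred):
    ss_res = torch.sum((y_true - y_pred) ** 2)
    ss_tot = torch.sum((y_true - torch.mean(y_true)) ** 2)
    return 1 - (ss_res / ss_tot)

y = torch.tensor([[1.0], [2.0], [3.0], [4.0]])  # made-up targets

perfect = r_squared(y, y.clone())                             # prediction equals target
baseline = r_squared(y, torch.full_like(y, y.mean().item()))  # always predict the mean

print(perfect.item(), baseline.item())  # 1.0 0.0
```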

# Task 2: Multilinear Regression in PyTorch

Extend to multilinear regression. You will need to adapt the `Linear` model to take more than one input. How does the $R^2$ compare?

## Solution

In [None]:
input_data2D = torch.tensor(penguins_df[["flipper_length_mm" , "bill_depth_mm"]].values, dtype=torch.float32)
target     = torch.tensor(penguins_df["body_mass_g"].values,       dtype=torch.float32).reshape(-1,1)

In [None]:
model2 = nn.Linear(2, 1)

In [None]:
loss_function = nn.MSELoss()
optimizer = optim.Rprop(model2.parameters())

In [None]:
# keep track of the loss every epoch. This is only for visualisation
losses = []

N_epochs = 3000

for epoch in range(N_epochs):
    # reset the gradients left over from the previous iteration
    optimizer.zero_grad()

    # use the model as a prediction function: features → prediction
    predictions = model2(input_data2D)

    # compute the loss (mean squared error) between these predictions and the intended targets
    loss = loss_function(predictions, target)

    # backpropagate to compute the gradients, then let the optimizer update the parameters
    loss.backward()
    optimizer.step()

    losses.append(loss.item())

    # Print the loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch + 1}/{N_epochs}], Loss: {loss.item():.4f}')

In [None]:
plot_loss_curve(losses)

In [None]:
y_out2 = model2(input_data2D)
y_pred2 = y_out2.detach()

# compute R-squared to compare with the single-input model
print(r_squared(target, y_pred2))

# Neural Networks

Implementing the neural network part is straightforward. We simply replace our `model` with one defined by `nn.Sequential`; here a hidden layer of 50 units with a ReLU activation sits between the input and the output.

In [None]:
model_DNN = nn.Sequential(nn.Linear(1, 50),
                          nn.ReLU(),
                          nn.Linear(50, 1))
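To get a feel for the size of this network, the sketch below counts its trainable parameters and checks the output shape. The name `net` and the input values are hypothetical; the architecture is the same as above.

```python
import torch
from torch import nn

net = nn.Sequential(nn.Linear(1, 50),
                    nn.ReLU(),
                    nn.Linear(50, 1))

# First layer: 50 weights + 50 biases; second layer: 50 weights + 1 bias
n_params = sum(p.numel() for p in net.parameters())
print(n_params)  # 151

x = torch.tensor([[190.0], [200.0], [210.0]])  # three made-up inputs
print(net(x).shape)  # torch.Size([3, 1])
```

The simple linear model had only two parameters, so this network is far more flexible.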

Build a regression model which takes in whatever inputs you wish and regresses another continuous value. Use the `nn.Sequential` defined above, or something similar.

What other metrics can we use to quantify the performance of our regression model? Refer to [Scikit-Learn regression metrics](https://scikit-learn.org/stable/modules/model_evaluation.html#regression-metrics).
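As a starting point, two common alternatives are the mean absolute error (MAE) and the root mean squared error (RMSE). The sketch below writes them in plain PyTorch; the function names and the example values are hypothetical, and the linked Scikit-Learn page lists further options.

```python
import torch

def mean_absolute_error(y_true, y_pred):
    # average size of the errors, in the units of the target
    return torch.mean(torch.abs(y_true - y_pred))

def root_mean_squared_error(y_true, y_pred):
    # like MSE, but brought back to the units of the target
    return torch.sqrt(torch.mean((y_true - y_pred) ** 2))

y_true = torch.tensor([[3.0], [5.0], [7.0]])   # made-up targets
y_hat = torch.tensor([[2.0], [5.0], [10.0]])   # made-up predictions

mae = mean_absolute_error(y_true, y_hat)       # (1 + 0 + 3) / 3
rmse = root_mean_squared_error(y_true, y_hat)  # sqrt((1 + 0 + 9) / 3)

print(mae.item(), rmse.item())
```

RMSE penalises large individual errors more strongly than MAE, which is why it is the larger of the two here.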

## Solution

In [None]:
# Optimiser and loss function
loss_function = nn.MSELoss()
optimizer = optim.Rprop(model_DNN.parameters())

In [None]:
# Training loop
# keep track of the loss every epoch. This is only for visualisation
losses = []
N_epochs = 3000
for epoch in range(N_epochs):
    # reset the gradients left over from the previous iteration
    optimizer.zero_grad()

    # use the model as a prediction function: features → prediction
    predictions = model_DNN(input_data)

    # compute the loss (mean squared error) between these predictions and the intended targets
    loss = loss_function(predictions, target)

    # backpropagate to compute the gradients, then let the optimizer update the parameters
    loss.backward()
    optimizer.step()

    losses.append(loss.item())

    # Print the loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        print(f'Epoch [{epoch + 1}/{N_epochs}], Loss: {loss.item():.4f}')

In [None]:
plot_loss_curve(losses)

In [None]:
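# evaluate the neural network itself; y_pred still holds the linear model's predictions
y_pred_DNN = model_DNN(input_data).detach()
r_squared_value = r_squared(target, y_pred_DNN)
print(r_squared_value)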

In [None]:
input_data/input_data.mean()

In [None]:
# Perform all this again with normalized data
input_data_normalised = (input_data - input_data.mean())/ input_data.std()
target_normalised = (target - target.mean())/ target.std()
model_DNN_norm = nn.Sequential(nn.Linear(1, 50),
                              nn.ReLU(),
                              nn.Linear(50, 1))


loss_function = nn.MSELoss()
optimizer = optim.Rprop(model_DNN_norm.parameters())
# keep track of the loss every epoch. This is only for visualisation
losses = []
N_epochs = 200
for epoch in range(N_epochs):
    # reset the gradients left over from the previous iteration
    optimizer.zero_grad()

    # use the model as a prediction function: features → prediction
    predictions = model_DNN_norm(input_data_normalised)

    # compute the loss (mean squared error) between these predictions and the intended targets
    loss = loss_function(predictions, target_normalised)

    # backpropagate to compute the gradients, then let the optimizer update the parameters
    loss.backward()
    optimizer.step()

    losses.append(loss.item())

    # Print the loss every epoch
    print(f'Epoch [{epoch + 1}/{N_epochs}], Loss: {loss.item():.4f}')

In [None]:
plot_loss_curve(losses)

In [None]:
y_out = model_DNN_norm(input_data_normalised)
y_pred = y_out.detach()

In [None]:
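# sort by input so the line is drawn left to right; y_pred is in normalised units
order = input_data.flatten().argsort()
plt.plot(input_data[order], y_pred[order])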

In [None]:
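# Plot the original data and the neural network prediction
# de-normalise the prediction so that it is in the same units as the target
y_pred_original_units = y_pred * target.std() + target.mean()
order = input_data.flatten().argsort()  # sort so the line is drawn left to right
plt.scatter(input_data, target, color='blue', label='Data Points')
plt.plot(input_data[order], y_pred_original_units[order], color='red', label='Neural Network Prediction')
plt.xlabel('Input')
plt.ylabel('Target')
plt.title('Neural Network Regression Example')
plt.legend()
plt.show()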

In [None]:
target_normalised.detach().numpy()

In [None]:
# Plot the normalised data and the neural network prediction
order = input_data_normalised.flatten().argsort()  # sort so the line is drawn left to right
plt.scatter(input_data_normalised, target_normalised.detach().numpy(), color='blue', label='Data Points')
plt.plot(input_data_normalised[order], y_pred[order].detach().numpy(), color='red', label='Neural Network Prediction')
plt.xlabel('Normalised input')
plt.ylabel('Normalised target')
plt.title('Neural Network Regression Example (normalised)')
plt.legend()
plt.show()
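A model trained on normalised targets predicts in normalised units, so its output has to be scaled back before it is compared with the original body masses. The sketch below shows the round trip on made-up masses; the variable names are hypothetical.

```python
import torch

masses = torch.tensor([[3000.0], [4000.0], [5000.0]])  # made-up body masses in g

m_mean, m_std = masses.mean(), masses.std()

# Normalise: zero mean and unit standard deviation
masses_norm = (masses - m_mean) / m_std

# De-normalise: invert the transformation with the same mean and std
masses_back = masses_norm * m_std + m_mean

print(masses_norm.flatten())
print(torch.allclose(masses_back, masses))
```

The same mean and standard deviation must be reused for any new data passed to the model, since the model only ever saw inputs on the normalised scale.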

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import matplotlib.pyplot as plt

# ------------------------
# 1. Generate synthetic data
# ------------------------
torch.manual_seed(0)
N = 200
X = torch.linspace(-3, 3, N).unsqueeze(1)         # shape (N, 1)
y_true = torch.sin(X) + 0.1 * torch.randn_like(X)  # regression target

# ------------------------
# 2. Normalize input manually
# ------------------------
x_mean = X.mean(0, keepdim=True)
x_std = X.std(0, keepdim=True)
X_norm = (X - x_mean) / x_std

# ------------------------
# 3. Define model
# ------------------------
model = nn.Sequential(
    nn.Linear(1, 16),
    nn.ReLU(),
    nn.Linear(16, 1)
)

# ------------------------
# 4. Train the model
# ------------------------
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.01)

for epoch in range(500):
    optimizer.zero_grad()
    y_pred = model(X_norm)
    loss = criterion(y_pred, y_true)
    loss.backward()
    optimizer.step()

    if epoch % 50 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item():.4f}")

# ------------------------
# 5. Inference + Plot
# ------------------------
# Generate input points in original (unnormalized) space
X_test = torch.linspace(-3, 3, 300).unsqueeze(1)

# Normalize before inference
X_test_norm = (X_test - x_mean) / x_std

with torch.no_grad():
    y_test_pred = model(X_test_norm)

# Plot prediction vs. ground truth
plt.figure(figsize=(8, 5))
plt.scatter(X.numpy(), y_true.numpy(), color='gray', alpha=0.4, label="Training Data")
plt.plot(X_test.numpy(), y_test_pred.numpy(), color='red', label="Model Prediction")
plt.xlabel("x")
plt.ylabel("y")
plt.title("1D Regression with Normalized Input")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
