# Task 2
In this notebook you will benchmark a nonlinear model, a multilayer perceptron (MLP), against the linear regression (LR) baseline from HW1 on the same mini‑GEO gene‑expression task (predicting a single target gene from landmark genes).

We have provided skeleton code that reads the dataset file. Complete the missing code, and modify any of the provided code as needed to finish Task 2.

Reuse or adapt your HW1 linear regression code to record its test MSE and Pearson $r$, then design, train, and tune a PyTorch MLP (choose the hidden layer sizes, activation, learning rate, batch size, number of epochs, and optionally dropout). You will track train and validation losses to pick hyperparameters, and finally compare LR against the best MLP on the held‑out test set.

The TODO blocks below correspond exactly to those steps: fill them in, run the notebook end‑to‑end, and copy the resulting metrics/plots into your report. Keep runs short at first (few epochs, small model) to verify everything works before tuning.
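
As a reference for the two metrics mentioned above, here is a minimal sketch of computing MSE and Pearson $r$ with NumPy. It runs on synthetic data rather than the real CSV, and the helper names `mse` and `pearson_r`, the `_demo` variables, and the least-squares fit are illustrative assumptions, not the HW1 code.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

# Synthetic stand-in data (an assumption; the notebook itself uses the CSV).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y_demo = X_demo @ w_true + 0.1 * rng.normal(size=200)

# Ordinary least squares with an intercept column, fit on the first 150 rows.
A_train = np.hstack([X_demo[:150], np.ones((150, 1))])
A_test = np.hstack([X_demo[150:], np.ones((50, 1))])
coef, *_ = np.linalg.lstsq(A_train, y_demo[:150], rcond=None)
y_hat = A_test @ coef

demo_mse = mse(y_demo[150:], y_hat)
demo_r = pearson_r(y_demo[150:], y_hat)
print(f"demo test MSE: {demo_mse:.4f}, Pearson r: {demo_r:.4f}")
```

The same two helpers can be applied to the predictions of any model, which keeps the LR and MLP numbers directly comparable.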

## Linear Regression (LR)

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split

data = pd.read_csv('gene_expression_regression.csv')
data_array = data.iloc[:, 1:].values  # Skip the sample ID column

X = data_array[:, :943]       # First 943 genes = input
Y = data_array[:, 943:]       # Remaining genes = targets

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42)

X_test, X_val, Y_test, Y_val = train_test_split(
    X_test, Y_test, test_size=0.5, random_state=42)


# TODO-1: Reuse and adapt your HW1 code to train an LR model and record its test MSE and Pearson r
# === YOUR CODE HERE ===

# == END OF YOUR CODE ===

## Multilayer Perceptron (MLP)

TODO-2 – Hyperparameters: Set all the hyperparameters that control training in this block: batch_size (how many samples each gradient step sees) and epochs (how many full passes over the training data). You are free to add other hyperparameters here too, such as the learning rate, hidden-layer widths, and dropout rate (if used), so you can tweak them all from one place.
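
One possible way to keep these settings together is a single dictionary. The sketch below uses hypothetical starting values (not tuned recommendations) and a hypothetical helper, `steps_per_epoch`, to show how the batch size determines the number of gradient steps in one pass over the data; the sample count is a placeholder.

```python
import math

# Hypothetical starting values, not tuned recommendations.
hparams = {
    "batch_size": 64,
    "epochs": 5,
    "learning_rate": 1e-3,
    "hidden_sizes": (256, 64),
    "dropout": 0.1,
}

def steps_per_epoch(n_samples, batch_size):
    """Gradient steps in one full pass (the last batch may be smaller)."""
    return math.ceil(n_samples / batch_size)

n_train_demo = 1000  # placeholder sample count (an assumption)
total_steps = hparams["epochs"] * steps_per_epoch(n_train_demo, hparams["batch_size"])
print(f"total gradient steps: {total_steps}")
```

Halving the batch size roughly doubles the number of steps per epoch, which is worth keeping in mind when comparing runs with the same number of epochs.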

TODO-3 – Model definition: Decide on the model architecture: which layers to use, their sizes, the activation functions (e.g. nn.ReLU()), and optionally nn.Dropout. The final layer should output a single value (nn.Linear(..., 1)). See the example in `task-1.ipynb`.
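
For illustration only, a one-hidden-layer architecture of this shape might look like the sketch below. The hidden width and dropout rate are hypothetical choices, and `sketch_model` is a stand-in name; this is not the example from `task-1.ipynb`.

```python
import torch
import torch.nn as nn

n_features = 943  # matches the 943 input genes selected in the data cell
hidden = 128      # hypothetical width, chosen only for illustration
p_drop = 0.2      # hypothetical dropout rate

sketch_model = nn.Sequential(
    nn.Linear(n_features, hidden),  # input layer -> hidden layer
    nn.ReLU(),                      # nonlinearity
    nn.Dropout(p_drop),             # optional regularization
    nn.Linear(hidden, 1),           # single regression output
)

# Sanity check: a batch of 8 samples should map to 8 scalar predictions.
dummy_batch = torch.randn(8, n_features)
out = sketch_model(dummy_batch)
n_params = sum(p.numel() for p in sketch_model.parameters())
print(tuple(out.shape), n_params)
```

Checking the output shape on a dummy batch before training is a cheap way to catch a mismatch between the first layer and the number of input features.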

TODO-4 – Training loop: Implement a loop that trains the model using the predefined optimizer (Adam, an adaptive optimizer that builds on the basic ideas of gradient descent) and criterion (MSE). See the example in `task-1.ipynb`.
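
A typical PyTorch training step follows the pattern: zero the gradients, run the forward pass, compute the loss, backpropagate, and update the weights. The sketch below shows that pattern on synthetic data; the `toy_` names, the tiny model, and the learning rate are assumptions made for the demonstration and are not taken from `task-1.ipynb`.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic regression data standing in for the real tensors (an assumption).
X_toy = torch.randn(256, 10)
y_toy = X_toy.sum(dim=1, keepdim=True)
toy_loader = DataLoader(TensorDataset(X_toy, y_toy), batch_size=32, shuffle=True)

toy_model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
toy_criterion = nn.MSELoss()
toy_optimizer = optim.Adam(toy_model.parameters(), lr=1e-2)

epoch_losses = []
for epoch in range(20):
    toy_model.train()
    running = 0.0
    for batch_X, batch_y in toy_loader:
        toy_optimizer.zero_grad()            # clear gradients from the last step
        pred = toy_model(batch_X)            # forward pass
        loss = toy_criterion(pred, batch_y)  # batch-mean MSE
        loss.backward()                      # backpropagate
        toy_optimizer.step()                 # update the weights
        running += loss.item() * batch_X.size(0)
    # Average per-sample training loss for this epoch.
    epoch_losses.append(running / len(toy_loader.dataset))

print(f"first epoch loss: {epoch_losses[0]:.4f}, last: {epoch_losses[-1]:.4f}")
```

Weighting each batch loss by its size before averaging gives a per-sample epoch loss that stays correct even when the last batch is smaller than the rest.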

Once those TODOs are filled in, the evaluation block we provide will compute the test-set MSE so you can compare models.

TODO-5 – Any other plots required for the report, evaluations on the validation set, and any additional code you might need.
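
For the validation part, one common approach is a small helper that averages the loss over a DataLoader with gradients disabled. The sketch below is an assumption about how this could be organized: `evaluate`, `const_model`, and `demo_loader` are hypothetical names, and the constant-output model exists only so the result is easy to check by hand.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, criterion):
    """Average per-sample loss over a DataLoader, without gradient tracking."""
    model.eval()  # disables dropout during evaluation
    total, count = 0.0, 0
    with torch.no_grad():
        for batch_X, batch_y in loader:
            loss = criterion(model(batch_X), batch_y)
            total += loss.item() * batch_X.size(0)
            count += batch_X.size(0)
    return total / count

torch.manual_seed(0)
X_v = torch.randn(50, 4)
y_v = torch.zeros(50, 1)
demo_loader = DataLoader(TensorDataset(X_v, y_v), batch_size=16, shuffle=False)

# A model that always outputs 2.0, so its MSE against all-zero targets is 4.0.
const_model = nn.Linear(4, 1)
with torch.no_grad():
    const_model.weight.zero_()
    const_model.bias.fill_(2.0)

val_loss = evaluate(const_model, demo_loader, nn.MSELoss())
print(f"validation loss: {val_loss:.4f}")
```

Calling such a helper once per epoch on both the training and validation loaders, and appending the results to two lists, gives the loss curves needed for hyperparameter selection and for the report plots.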

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset, random_split
import numpy as np

# TODO-2: Define Hyperparameters 
# === YOUR CODE HERE ===
batch_size = ...
epochs = ...
# (more can be added as needed)
# == END OF YOUR CODE ===

# Create datasets and dataloaders
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(Y_train, dtype=torch.float32).view(-1, 1)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(Y_test, dtype=torch.float32).view(-1, 1)
X_val_tensor = torch.tensor(X_val, dtype=torch.float32)
y_val_tensor = torch.tensor(Y_val, dtype=torch.float32).view(-1, 1)

train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
test_dataset = TensorDataset(X_test_tensor, y_test_tensor)
val_dataset = TensorDataset(X_val_tensor, y_val_tensor)

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)

In [None]:
# Define the model
class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.model = nn.Sequential(
            # TODO-3: Implement your model
            # === YOUR CODE HERE ===

            # == END OF YOUR CODE ===
            ...
        )
    
    def forward(self, x):
        return self.model(x)

model = MLP()
print(model)

# Loss and optimizer
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters())

# Training loop
for epoch in range(epochs):
    model.train()

    for batch_X, batch_y in train_loader:
        # TODO-4: Implement the training step
        # === YOUR CODE HERE ===
        ...
        # == END OF YOUR CODE ===

## Evaluation

In [None]:
# TODO-5: Validation
# === YOUR CODE HERE ===

# == END OF YOUR CODE ===

In [None]:
# Final Evaluation
model.eval()
with torch.no_grad():
    y_pred = model(X_test_tensor).numpy().flatten()
    y_true = y_test_tensor.numpy().flatten()
    test_mse = np.mean((y_true - y_pred) ** 2)

print(f"Test MSE: {test_mse:.4f}")