<h1 align="center">Deep Learning - Master in Deep Learning of UPM</h1> 

**IMPORTANT**

Before starting, we need to install PyTorch Lightning. By default, this should be enough:

In [75]:
!pip install pytorch-lightning

Defaulting to user installation because normal site-packages is not writeable

[notice] A new release of pip is available: 24.0 -> 24.3.1
[notice] To update, run: pip install --upgrade pip
^C


Also, if you are running this code in Google Colab, it is best to mount your Drive so that you have access to the data:

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In this hands-on exercise we will apply what we have learned to tackle a regression problem with PyTorch Lightning.

In [None]:
import pandas as pd

DATA_PATH = 'data/exercise.csv' # Set your own path depending on where the file lives in your Drive

df = pd.read_csv(DATA_PATH)

df.head() # Print the first rows of the dataframe

| Unnamed: 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | target |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.567889 | 1.939935 | 2.151471 | -0.18695 | -0.651792 | 1.509136 | 1.248835 | 1.023286 | -2.2966 | -74.736449 |
| 1 | 0.296674 | -0.529911 | 0.273535 | -1.220958 | 0.949093 | -0.791798 | -1.228863 | 0.676973 | -0.280019 | -4.239277 |
| 2 | -0.379515 | 1.181062 | -0.292617 | -0.424034 | -0.108128 | 1.749699 | 0.377352 | -1.964881 | -0.844832 | -95.592664 |
| 3 | -0.12071 | 0.269624 | -0.009167 | -0.852415 | -0.121054 | -0.589381 | -0.321264 | -0.736134 | -0.88351 | -132.503258 |
| 4 | -0.270182 | -1.466287 | 0.335747 | -0.038218 | -1.206132 | -0.820438 | -1.082228 | -0.77405 | 0.330435 | -181.452417 |


# Dataset

In [None]:
import torch
import pandas as pd

class RegressionDataset(torch.utils.data.Dataset):
    def __init__(self, df):
        ...

    def __len__(self):
        ...

    def __getitem__(self, idx):
        ...
        return features, target
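
One possible way to fill in the skeleton above is sketched below. It is only an illustration, not the reference solution: the class name `RegressionDatasetSketch`, the `target_col` parameter and the toy dataframe are hypothetical, and it assumes that every column other than `target` is a numeric feature.

```python
import pandas as pd
import torch
from torch.utils.data import Dataset


class RegressionDatasetSketch(Dataset):
    """Hypothetical sketch: wraps a dataframe whose label column is 'target'."""

    def __init__(self, df: pd.DataFrame, target_col: str = "target"):
        # Assumption: every column except `target_col` is a numeric feature.
        self.features = torch.tensor(
            df.drop(columns=[target_col]).values, dtype=torch.float32
        )
        # Regression targets are continuous, so they are stored as floats.
        self.targets = torch.tensor(df[target_col].values, dtype=torch.float32)

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, idx):
        return self.features[idx], self.targets[idx]


# Tiny made-up frame, used only to show the shapes that come out.
toy_df = pd.DataFrame(
    {"0": [0.1, 0.2, 0.3], "1": [1.0, 2.0, 3.0], "target": [10.0, 20.0, 30.0]}
)
toy_ds = RegressionDatasetSketch(toy_df)
features, target = toy_ds[0]
print(len(toy_ds), features.shape, target.shape)
```

Converting the whole dataframe to tensors once in `__init__` keeps `__getitem__` cheap, which is reasonable for a table that fits in memory.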

# DataModule

In [None]:
from sklearn.model_selection import train_test_split

# DO NOT MODIFY
def split_train_val_test(df, val_size=0.2, test_size=0.2):
    eval_size = val_size + test_size # eval is an intermediate split that is later divided into val and test
    test_prop = test_size / eval_size # proportion of eval that goes to test

    train, eval_ = train_test_split(df, test_size=eval_size)
    val, test = train_test_split(eval_, test_size=test_prop)
    return train, val, test
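
The arithmetic behind the two-stage split can be checked by hand. The sketch below uses a hypothetical helper, `split_fractions`, that only reproduces the proportions computed in `split_train_val_test` and reports which fraction of the original dataframe ends up in each split. It does not touch any data.

```python
def split_fractions(val_size=0.2, test_size=0.2):
    """Hypothetical helper: final (train, val, test) fractions of the two-stage split."""
    eval_size = val_size + test_size      # fraction held out in the first split
    test_prop = test_size / eval_size     # share of the held-out part that becomes test
    train_frac = 1.0 - eval_size
    test_frac = eval_size * test_prop
    val_frac = eval_size * (1.0 - test_prop)
    return train_frac, val_frac, test_frac


train_frac, val_frac, test_frac = split_fractions()
print(train_frac, val_frac, test_frac)
```

With the default arguments the held-out 40% is split in half, so the result is roughly 60% train, 20% validation and 20% test.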

The DataModule must accept the following as configurable parameters:
- The batch_size, defaulting to 64
- The number of workers (num_workers), defaulting to 4
- The prefetch_factor, defaulting to 2
- The pin_memory flag, defaulting to True

With these it must initialize the dataloaders for training, validation and test.
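
The four settings listed above correspond to keyword arguments of `torch.utils.data.DataLoader`. A minimal sketch of how they could be collected once and reused for the three loaders follows. The helper `loader_kwargs` is hypothetical, and dropping `prefetch_factor` when `num_workers` is 0 reflects my understanding that recent PyTorch versions reject that combination.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def loader_kwargs(batch_size=64, num_workers=4, prefetch_factor=2, pin_memory=True):
    """Hypothetical helper: shared keyword arguments for the three dataloaders."""
    kwargs = {
        "batch_size": batch_size,
        "num_workers": num_workers,
        "pin_memory": pin_memory,
    }
    # prefetch_factor only applies when worker processes load the data.
    if num_workers > 0:
        kwargs["prefetch_factor"] = prefetch_factor
    return kwargs


# Demo on random data, single-process loading so it runs anywhere.
toy = TensorDataset(torch.randn(10, 9), torch.randn(10))
loader = DataLoader(
    toy, shuffle=True, **loader_kwargs(batch_size=4, num_workers=0, pin_memory=False)
)
x, y = next(iter(loader))
print(x.shape, y.shape)
```

Typically only the training loader is shuffled, while validation and test loaders keep the original order.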

In [None]:
import pytorch_lightning
import numpy as np
from torch.utils.data import DataLoader

class RegressionDataModule(pytorch_lightning.LightningDataModule):
    def __init__(self, df, ...):
        super().__init__()
        ...

    def setup(self, stage=None): # the trainer calls this hook before running fit, validate, test or predict
        ...

    def collate_fn(self, batch): # HINT: remember that this is regression...
        ...

    def train_dataloader(self):
        ...

    def val_dataloader(self):
        ...

    def test_dataloader(self):
        ...
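
The hint on `collate_fn` points at the target dtype and shape: in regression the targets are continuous values, not class indices. A minimal sketch of one way to collate `(features, target)` pairs is shown below. The function name `regression_collate` and the choice of returning targets with shape `(batch, 1)` are assumptions made for illustration.

```python
import torch


def regression_collate(batch):
    """Hypothetical collate function for a list of (features, target) pairs."""
    # Stack the per-sample feature vectors into a (batch, n_features) tensor.
    features = torch.stack(
        [torch.as_tensor(f, dtype=torch.float32) for f, _ in batch]
    )
    # Keep targets as float32 and add a trailing dimension so they line up
    # with a network whose last layer has a single output unit.
    targets = torch.tensor(
        [float(t) for _, t in batch], dtype=torch.float32
    ).unsqueeze(1)
    return features, targets


batch = [
    (torch.randn(9), torch.tensor(-74.7)),
    (torch.randn(9), torch.tensor(-4.2)),
]
features, targets = regression_collate(batch)
print(features.shape, targets.shape, targets.dtype)
```

Matching the target shape to the model output avoids silent broadcasting inside the loss function.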

# LightningModule

The network must be an MLP of your choice, and the optimizer is also up to you.

However, the loss function must be suited to regression...

The metric to log is R2 (available in [torchmetrics](https://lightning.ai/docs/torchmetrics/stable/regression/r2_score.html))
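
For intuition about what R2 measures, the usual definition is R2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared errors of the predictions and SS_tot is the sum of squared deviations of the targets from their mean. The sketch below computes it by hand with a hypothetical function `r2_by_hand`. It is meant as a sanity check only, and the exercise should still use the torchmetrics class.

```python
import torch


def r2_by_hand(preds, targets):
    """Hypothetical helper: coefficient of determination from its definition."""
    ss_res = torch.sum((targets - preds) ** 2)           # error of the model
    ss_tot = torch.sum((targets - targets.mean()) ** 2)  # error of predicting the mean
    return 1.0 - ss_res / ss_tot


targets = torch.tensor([1.0, 2.0, 3.0, 4.0])
perfect = r2_by_hand(targets, targets)
baseline = r2_by_hand(torch.full_like(targets, targets.mean().item()), targets)
print(perfect.item(), baseline.item())
```

Perfect predictions give 1, always predicting the mean gives 0, and a model worse than the mean gives a negative value, so higher is better.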

You may have noticed that the training and validation steps are quite similar. To save code, we will factor the shared part out into a separate function, compute_batch(), which receives the batch and the split the step is run for. This is the magic of Lightning: this function is not part of the LightningModule API, it is our own addition.

In [None]:
from torchmetrics import R2Score
import torch.nn as nn

class Regressor(pytorch_lightning.LightningModule):
    def __init__(self, input_shape):
        super().__init__()

        # Initialize the network layers
        ...

        # Loss function
        self.criterion = ...

        # Initialize the metrics
        self.r2 = R2Score()

    # Forward function, as in a PyTorch nn.Module
    def forward(self, x):
        ...
        return x
    
    
    def compute_batch(self, batch, split='train'):
        ...
        
        return loss
    
    def training_step(self, batch, batch_idx):
        return self.compute_batch(batch, 'train')
    
    def validation_step(self, batch, batch_idx):
        return self.compute_batch(batch, 'val')
    
    def test_step(self, batch, batch_idx):
        return self.compute_batch(batch, 'test')
    
    def configure_optimizers(self):
        return ...
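
As an illustration of the shared-step idea, here is a plain PyTorch sketch that leaves Lightning out. The layer names and widths are inferred from the parameter counts in the model summary printed in the training log further down (320, 2.1 K, 2.1 K and 33 parameters with 9 input features), so they are one consistent reading of that log rather than a confirmed architecture. `MLPSketch` and the free function `compute_batch` are hypothetical names.

```python
import torch
import torch.nn as nn


class MLPSketch(nn.Module):
    """Hypothetical MLP whose parameter counts match the printed summary."""

    def __init__(self, input_shape=9):
        super().__init__()
        self.l1 = nn.Linear(input_shape, 32)  # 9 * 32 + 32 = 320 parameters
        self.l2 = nn.Linear(32, 64)           # 32 * 64 + 64 = 2112
        self.l3 = nn.Linear(64, 32)           # 64 * 32 + 32 = 2080
        self.out = nn.Linear(32, 1)           # 32 * 1 + 1 = 33
        self.act = nn.GELU()

    def forward(self, x):
        x = self.act(self.l1(x))
        x = self.act(self.l2(x))
        x = self.act(self.l3(x))
        return self.out(x)  # no final activation: the regression output is unbounded


def compute_batch(model, criterion, batch, split="train"):
    """Shared logic of the train/val/test steps: forward pass plus loss."""
    features, targets = batch
    preds = model(features)
    loss = criterion(preds, targets.view_as(preds))
    # Inside a LightningModule, this is where self.log(f"{split}_loss", loss)
    # and the R2 metric under f"{split}_r2" would be logged.
    return loss


model_sketch = MLPSketch(input_shape=9)
n_params = sum(p.numel() for p in model_sketch.parameters())
loss = compute_batch(model_sketch, nn.MSELoss(), (torch.randn(8, 9), torch.randn(8)))
print(n_params, loss.item())
```

Logging under a name built from `split` is what makes a key such as `val_r2` available to the callbacks configured later.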

# Callbacks, Loggers and Trainer

**Extra exercise** - Implement a callback that acts as a timer. It must record the moment training starts and print the elapsed time when training ends.

In [None]:
import pytorch_lightning
import time

class Timer(pytorch_lightning.Callback):
    ...

In [None]:
import os
import datetime

SAVE_DIR = f'lightning_logs/exercise/{datetime.datetime.now().strftime("%Y-%m-%d_%H-%M-%S")}'


# DataModule (DO NOT MODIFY)
data = pd.read_csv(DATA_PATH)
data_module = RegressionDataModule(data, batch_size=64)

# LightningModule
input_shape = ... # Find a dynamic way to obtain input_shape. HINT: df.shape
model = Regressor(input_shape=input_shape)

# Callbacks
# Monitor the 'val_r2' metric and save only the best model.
# Higher R2 is better, so both callbacks need mode='max' (the default is 'min').
early_stopping_callback = pytorch_lightning.callbacks.EarlyStopping(
    ...
)
model_checkpoint_callback = pytorch_lightning.callbacks.ModelCheckpoint(
    ...
)

callbacks = [early_stopping_callback, model_checkpoint_callback]

# Uncomment the next two lines if the Timer callback has been implemented
# timer_callback = Timer()
# callbacks.append(timer_callback)

# Loggers (DO NOT MODIFY)
csv_logger = pytorch_lightning.loggers.CSVLogger(
    save_dir=SAVE_DIR,
    name='metrics',
    version=None
)

loggers = [csv_logger] # several loggers can be used (see the documentation)

# Trainer (DO NOT MODIFY)
trainer = pytorch_lightning.Trainer(max_epochs=50, accelerator='gpu', devices=[0], callbacks=callbacks, logger=loggers)

trainer.fit(model, data_module)
results = trainer.test(model, data_module)

GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
Missing logger folder: lightning_logs/exercise/2024-11-28_18-49-52/metrics
/home/adrian/.local/lib/python3.10/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:652: Checkpoint directory /home/adrian/workspace/deep-learning-dlmasterupm/assignments/pytorch_basics/session_5/lightning_logs/exercise/2024-11-28_18-49-52 exists and is not empty.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]

  | Name      | Type    | Params | Mode 
----------------------------------------------
0 | l1        | Linear  | 320    | train
1 | l2        | Linear  | 2.1 K  | train
2 | l3        | Linear  | 2.1 K  | train
3 | out       | Linear  | 33     | train
4 | act       | GELU    | 0      | train
5 | criterion | MSELoss | 0      | train
6 | r2        | R2Score | 0      | train
----------------------------------------------
4.5 K     Trainable params
0         Non-trainable params
4.5 

Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=50` reached.
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]


Training took 114.78146147727966 seconds!


Testing: |          | 0/? [00:00<?, ?it/s]

# Inference

In [69]:
test_sample = data_module.test_df.sample(10)

inputs = torch.tensor(test_sample.drop('target', axis=1).values, dtype=torch.float32)
targets = torch.tensor(test_sample['target'].values, dtype=torch.float32)

model.eval()

with torch.no_grad():
    outputs = model(inputs)
    preds = outputs.squeeze().numpy()

for i, (pred, target) in enumerate(zip(preds, targets)):
    print(f"Prediction {i}: {pred:.2f}, Actual value: {target:.2f}")

Prediction 0: -68.67, Actual value: -49.20
Prediction 1: -50.52, Actual value: -38.50
Prediction 2: -36.19, Actual value: -26.59
Prediction 3: -111.46, Actual value: -83.73
Prediction 4: 232.16, Actual value: 253.69
Prediction 5: -159.69, Actual value: -142.23
Prediction 6: -212.26, Actual value: -220.30
Prediction 7: 187.02, Actual value: 179.16
Prediction 8: -199.13, Actual value: -198.58
Prediction 9: 164.71, Actual value: 154.61
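
To summarize the ten printed samples with a single number, one option is the mean absolute error. The sketch below copies the values from the output above. The variable names are hypothetical, and ten samples are far too few to judge the model, so this only illustrates the computation.

```python
# Values copied from the printed output above.
sample_preds = [-68.67, -50.52, -36.19, -111.46, 232.16,
                -159.69, -212.26, 187.02, -199.13, 164.71]
sample_targets = [-49.20, -38.50, -26.59, -83.73, 253.69,
                  -142.23, -220.30, 179.16, -198.58, 154.61]

# Absolute error per sample, then the mean and the worst case.
abs_errors = [abs(p - t) for p, t in zip(sample_preds, sample_targets)]
mae = sum(abs_errors) / len(abs_errors)
worst = max(range(len(abs_errors)), key=lambda i: abs_errors[i])

print(f"MAE over the 10 samples: {mae:.2f}")
print(f"Largest error: sample {worst} ({abs_errors[worst]:.2f})")
```

For these values the mean absolute error is about 13.44, and the largest deviation is on sample 3.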
