# Neural Network Implementation

## 1. Introduction

In this stage, a neural network model is implemented with PyTorch to predict property prices. The model uses a multilayer perceptron (MLP) architecture, which is well suited to structured tabular data. It is then compared against the classical ML models trained in the earlier stage.

## 2. Importing libraries and loading the processed dataset

In [12]:
import sys
from pathlib import Path

import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Add the parent of the current working directory to sys.path
# so that the `src` package can be imported
PROJECT_ROOT = Path.cwd().parents[0]
sys.path.append(str(PROJECT_ROOT))

In [13]:
df = pd.read_csv('../data/processed/listings_processed.csv')


### 2.1 Separating features and target

In [14]:
y = df['price']  # Target variable
X = df.drop(columns=['price'])  # Features

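### 2.2 Train/test split

In [None]:
# train_test_split returns the splits in the order X_train, X_test, y_train, y_test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)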
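
As a minimal sketch on toy arrays, the following shows the order in which `train_test_split` returns its outputs and one way a validation set could be carved out of the training portion. The names `X_demo`, `y_demo`, `X_va`, `y_va` and the 60/20/20 proportions are illustrative assumptions, not part of the project.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data standing in for the real features and target (illustrative only).
# Row i of X_demo is [2*i, 2*i + 1] and its target is i, so alignment is easy to check.
X_demo = np.arange(200, dtype=float).reshape(100, 2)
y_demo = np.arange(100, dtype=float)

# The outputs are grouped by input array:
# (X_train, X_test, y_train, y_test), not (X_train, y_train, X_test, y_test).
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# A second split of the training portion yields a validation set
# (0.25 of the remaining 80% is 20% of the full data).
X_tr, X_va, y_tr, y_va = train_test_split(
    X_tr, y_tr, test_size=0.25, random_state=42
)

print(X_tr.shape, X_va.shape, X_te.shape)
```

Keeping a validation set separate from the test set lets the test metrics stay untouched by any decisions made during training.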

### 2.3 Scaling the data

Neural networks are sensitive to the scale of their inputs, so standardization is applied to both the predictor variables and the target variable.


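In [None]:
# Fit each scaler on the training split only, then apply it to the test split
scaler_X = StandardScaler()
X_train = scaler_X.fit_transform(X_train)
X_test  = scaler_X.transform(X_test)

# StandardScaler expects 2D input, so the target is reshaped to a single column
scaler_y = StandardScaler()
y_train = scaler_y.fit_transform(y_train.values.reshape(-1,1))
y_test  = scaler_y.transform(y_test.values.reshape(-1,1))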


### 2.4 Converting to PyTorch tensors

The data must be converted to tensors before the model can use it.


In [None]:
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)

y_train = torch.tensor(y_train, dtype=torch.float32)
y_test = torch.tensor(y_test, dtype=torch.float32)

In [19]:
from torch.utils.data import DataLoader, TensorDataset

train_dataset = TensorDataset(X_train, y_train)
# No separate validation split is created above, so the test split is reused here for validation
val_dataset   = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader   = DataLoader(val_dataset, batch_size=64)
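
To illustrate the conversion and batching steps, here is a small sketch on random toy data. The array sizes and the names `features_np`, `targets_np` and `demo_loader` are assumptions made for the example. It shows that NumPy's default `float64` arrays become `float32` tensors and that the last batch is smaller when the dataset size is not a multiple of `batch_size`.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy arrays standing in for the scaled features and target (illustrative only).
rng = np.random.default_rng(0)
features_np = rng.normal(size=(150, 4))   # float64 by default
targets_np = rng.normal(size=(150, 1))

# Convert to float32 tensors, the default dtype of PyTorch layer weights.
features = torch.tensor(features_np, dtype=torch.float32)
targets = torch.tensor(targets_np, dtype=torch.float32)

demo_loader = DataLoader(TensorDataset(features, targets), batch_size=64, shuffle=True)

# 150 samples with batch_size=64 give two full batches and one partial batch.
batch_sizes = [xb.shape[0] for xb, _ in demo_loader]
print(batch_sizes)
```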

## 3. Model definition

### 3.1 Importing modules from src

The MLP architecture defined in `src/deep_learning/model.py` and the training function defined in `src/deep_learning/train.py` are imported.

In [20]:
from src.deep_learning.model import MLPRegressor
from src.deep_learning.train import train_model


### 3.2 Model initialization

The input dimension is set to the number of predictor variables.

In [21]:
input_dim = X_train.shape[1]
model = MLPRegressor(input_dim)
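
The actual `MLPRegressor` lives in `src/deep_learning/model.py` and is not shown in this notebook. As a rough idea of what an MLP for tabular regression can look like, here is a hypothetical stand-in. The class name `SketchMLPRegressor`, the hidden layer sizes and the dropout rate are all assumptions, not the project's real architecture.

```python
import torch
from torch import nn


class SketchMLPRegressor(nn.Module):
    """Hypothetical MLP for regression; not the project's MLPRegressor."""

    def __init__(self, input_dim, hidden_dims=(64, 32), dropout=0.1):
        super().__init__()
        layers = []
        prev_dim = input_dim
        # Stack Linear -> ReLU -> Dropout blocks, one per hidden layer.
        for hidden_dim in hidden_dims:
            layers += [nn.Linear(prev_dim, hidden_dim), nn.ReLU(), nn.Dropout(dropout)]
            prev_dim = hidden_dim
        # Single output unit with no activation, as is usual for regression.
        layers.append(nn.Linear(prev_dim, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


# The output has shape (batch_size, 1), matching a target reshaped to one column.
sketch_model = SketchMLPRegressor(input_dim=10)
sketch_output = sketch_model(torch.randn(8, 10))
print(sketch_output.shape)
```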

## 4. Model training

The model is trained with the MSE loss function and the Adam optimizer.

In [None]:
model = train_model(model,
                    X_train,
                    y_train, 
                    epochs=100, 
                    lr=0.001)

Epoch 1 | Train Loss: 2651.1146 | Val Loss: 62.4456
Epoch 2 | Train Loss: 280.0082 | Val Loss: 51.6213
Epoch 3 | Train Loss: 249.3764 | Val Loss: 53.4672
Epoch 4 | Train Loss: 214.5918 | Val Loss: 34.7725
Epoch 5 | Train Loss: 205.6865 | Val Loss: 34.7145
Epoch 6 | Train Loss: 191.1895 | Val Loss: 34.0810
Epoch 7 | Train Loss: 182.7085 | Val Loss: 29.1673
Epoch 8 | Train Loss: 171.7605 | Val Loss: 24.8766
Epoch 9 | Train Loss: 161.6809 | Val Loss: 27.3297
Epoch 10 | Train Loss: 153.1351 | Val Loss: 21.7263
Epoch 11 | Train Loss: 140.9143 | Val Loss: 20.4821
Epoch 12 | Train Loss: 132.4370 | Val Loss: 19.5609
Epoch 13 | Train Loss: 122.2874 | Val Loss: 20.9994
Epoch 14 | Train Loss: 114.0790 | Val Loss: 22.5177
Epoch 15 | Train Loss: 104.6827 | Val Loss: 21.3492
Epoch 16 | Train Loss: 98.3945 | Val Loss: 17.8601
Epoch 17 | Train Loss: 91.9898 | Val Loss: 18.3084
Epoch 18 | Train Loss: 88.2635 | Val Loss: 19.9080
Epoch 19 | Train Loss: 81.3746 | Val Loss: 18.8221
Epoch 20 | Train Loss: 8
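
The body of `train_model` is defined in `src/deep_learning/train.py` and is not reproduced here. The sketch below shows one common way to write such a loop with `nn.MSELoss` and `torch.optim.Adam`, reporting a per-sample average loss for training and validation. The function name `sketch_train`, its signature and the toy linear data are assumptions for illustration only.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def sketch_train(model, train_loader, val_loader, epochs=10, lr=1e-3):
    """Hypothetical training loop; not the project's train_model."""
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    history = []
    for epoch in range(1, epochs + 1):
        model.train()
        train_loss = 0.0
        for xb, yb in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(xb), yb)
            loss.backward()
            optimizer.step()
            # Weight each batch by its size so the epoch loss is a per-sample mean.
            train_loss += loss.item() * xb.size(0)
        train_loss /= len(train_loader.dataset)

        model.eval()
        val_loss = 0.0
        with torch.no_grad():
            for xb, yb in val_loader:
                val_loss += criterion(model(xb), yb).item() * xb.size(0)
        val_loss /= len(val_loader.dataset)

        history.append((train_loss, val_loss))
        print(f"Epoch {epoch} | Train Loss: {train_loss:.4f} | Val Loss: {val_loss:.4f}")
    return model, history


# Toy linear regression problem (illustrative only).
torch.manual_seed(0)
X_toy = torch.randn(256, 3)
y_toy = X_toy @ torch.tensor([[1.0], [-2.0], [0.5]]) + 0.1 * torch.randn(256, 1)
toy_train = DataLoader(TensorDataset(X_toy[:200], y_toy[:200]), batch_size=64, shuffle=True)
toy_val = DataLoader(TensorDataset(X_toy[200:], y_toy[200:]), batch_size=64)

toy_model, history = sketch_train(nn.Linear(3, 1), toy_train, toy_val, epochs=10, lr=0.05)
```

Averaging by sample count rather than by batch count keeps the training and validation losses comparable even when the last batch is smaller.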

## 5. Model evaluation

### 5.1 Generating predictions

In [24]:
model.eval()

with torch.no_grad():
    predictions = model(X_test)

### 5.2 Reverting the scaling

The predictions and the true values are transformed back to their original scale so that they can be interpreted correctly.

In [25]:
predictions_mlp = predictions.detach().cpu().numpy()
y_test_mlp = y_test.detach().cpu().numpy()

predictions_original = scaler_y.inverse_transform(predictions_mlp)
y_test_original = scaler_y.inverse_transform(y_test_mlp)
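
As a quick check of what `inverse_transform` does, the following sketch standardizes a few made-up prices and recovers them again. The values and variable names are illustrative assumptions. Standardization computes z = (x - mean) / std, so the inverse is x = z * std + mean.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up prices as a single column (illustrative only).
prices = np.array([[1000.0], [2000.0], [3000.0], [6000.0]])

demo_scaler = StandardScaler()
scaled = demo_scaler.fit_transform(prices)

# inverse_transform undoes the scaling using the stored mean and standard deviation.
restored = demo_scaler.inverse_transform(scaled)

# The same result computed by hand from the fitted attributes.
manual = scaled * demo_scaler.scale_ + demo_scaler.mean_

print(np.allclose(restored, prices), np.allclose(manual, prices))
```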




### 5.3 Metrics

In [None]:
from sklearn.metrics import mean_absolute_error, root_mean_squared_error, r2_score

mae_mlp = mean_absolute_error(y_test_original, predictions_original)
rmse_mlp = root_mean_squared_error(y_test_original, predictions_original)
r2_mlp = r2_score(y_test_original, predictions_original)

print(f"MAE Multilayer Perceptron: {mae_mlp:.2f}")
print(f"RMSE Multilayer Perceptron: {rmse_mlp:.2f}")
print(f"R² Multilayer Perceptron: {r2_mlp:.4f}")

MAE Multilayer Perceptron: 1157.86
RMSE Multilayer Perceptron: 2091.30
R² Multilayer Perceptron: 0.4032
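
To make the three metrics concrete, here is a small worked example on made-up values (the numbers are illustrative and unrelated to the dataset). It computes MAE, RMSE and R² by hand with NumPy and checks them against scikit-learn.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Made-up true and predicted values (illustrative only).
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 330.0, 370.0])

errors = y_true - y_pred

# MAE: mean of the absolute errors.
mae = np.mean(np.abs(errors))

# RMSE: square root of the mean squared error; penalizes large errors more than MAE.
rmse = np.sqrt(np.mean(errors ** 2))

# R²: one minus the ratio of residual variation to total variation around the mean.
ss_res = np.sum(errors ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

print(f"MAE: {mae:.2f} | RMSE: {rmse:.2f} | R²: {r2:.4f}")

# The hand-computed values agree with scikit-learn.
assert np.isclose(mae, mean_absolute_error(y_true, y_pred))
assert np.isclose(rmse, np.sqrt(mean_squared_error(y_true, y_pred)))
assert np.isclose(r2, r2_score(y_true, y_pred))
```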


In [None]:
print("Actual mean:", y_test_original.mean().item())
print("Predicted mean:", predictions_original.mean().item())

Actual mean: 2871.589599609375
Predicted mean: 2840.170654296875


### 5.4 Comparison with XGBoost

In [None]:
from src.modeling.xboost import train_xgboost
import numpy as np

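In [None]:
xgb_model = train_xgboost(X_train, y_train)
preds_xgb = xgb_model.predict(X_test)

# Caution: expm1 inverts a log1p transform, whereas y_train and y_test were
# standardized with scaler_y in section 2.3; scaler_y.inverse_transform is the matching inverse.
preds_real = np.expm1(preds_xgb)
y_test_real = np.expm1(y_test.numpy())  # convert the tensor to a NumPy array first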


In [None]:
mae_xb = mean_absolute_error(y_test_real, preds_real)
rmse_xb = root_mean_squared_error(y_test_real, preds_real)
r2_xb = r2_score(y_test_real, preds_real)

print(f"MAE XGBoost: {mae_xb:.2f}")
print(f"RMSE XGBoost: {rmse_xb:.2f}")
print(f"R² XGBoost: {r2_xb:.4f}")

MAE XGBoost: 15.24
RMSE XGBoost: 360.01
R² XGBoost: 0.0556
