MLflow import and configuration:
All the libraries and tools needed to build the models are loaded.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

import mlflow
import mlflow.keras

print("TensorFlow:", tf.__version__)

# MLflow configuration (local, default tracking URI)
mlflow.set_experiment("airbnb_regresion_nn")

# Enable autolog for Keras (automatically records metrics, parameters, and the model)
mlflow.keras.autolog()

2025/12/01 13:13:20 INFO mlflow.tracking.fluent: Experiment with name 'airbnb_regresion_nn' does not exist. Creating a new experiment.


TensorFlow: 2.16.2


Loading the dataset

In [2]:

df = pd.read_csv("/Users/S340/Documents/Octavo/Analítica de datos /airbnb-proyecto2-20252/airbnb_limpio.csv")
df.head()

Unnamed: 0,id,source,name,host_id,host_name,host_since,host_is_superhost,host_listings_count,host_total_listings_count,host_verifications,...,has_free_street_parking,has_private_entrance,has_essentials,has_heating,has_wifi,has_pets_allowed,has_hot_water,has_self_check_in,has_freezer,has_exercise_equipment
0,18744501,city scrape,"""Artist´s Creative Residence"" 100m² im Zentrum",129635321,Sylvia,2017-05-09,f,1.0,1.0,"['email', 'phone']",...,1,1,1,1,1,0,1,0,0,0
1,23356842,city scrape,"""Bohemian Residency"" (Central & Quiet) * * * * *",150173398,Vincent,2017-09-11,t,2.0,3.0,"['email', 'phone']",...,1,1,1,1,1,0,1,1,0,0
2,819658084391291386,city scrape,"""Feel at Home"" Flat at the Lerchenauer See",29225873,Skandar,2015-03-12,f,1.0,1.0,"['email', 'phone']",...,0,0,1,1,1,1,1,0,0,1
3,34677963,city scrape,"""Little Star"" Schlafoase im Zentrum",28482431,Adriana,2015-02-28,f,5.0,5.0,"['email', 'phone']",...,0,1,1,1,1,0,1,1,0,0
4,34431776,city scrape,"""Moonlight"" Schlafoase mitten im Szenenviertel",28482431,Adriana,2015-02-28,f,5.0,5.0,"['email', 'phone']",...,0,1,1,1,1,0,1,1,0,0


Cleaning the target variable. The 'price' column arrives as text (a string) with currency symbols and thousands separators. It is cleaned and converted to a numeric type.

In [3]:
df["price"] = (
    df["price"]
    .str.replace("$", "", regex=False)
    .str.replace(",", "", regex=False)
    .astype(float)
)

df["price"].head(), df["price"].dtype

(0    221.0
 1    797.0
 2    106.0
 3    258.0
 4    249.0
 Name: price, dtype: float64,
 dtype('float64'))

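We apply a logarithmic transformation to the price because its distribution is highly skewed. This stabilizes the variance and tends to help the models. Because `np.log1p` computes log(1 + price), predictions are mapped back to the price scale later with `np.expm1`.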

In [4]:
df["price_log"] = np.log1p(df["price"])
df["price_log"].describe()

count    5562.000000
mean        5.243958
std         0.744214
min         2.772589
25%         4.727388
50%         5.198497
75%         5.707110
max         9.332912
Name: price_log, dtype: float64
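To make the effect of the transformation concrete, here is a minimal sketch using only the standard library. The three prices are illustrative values chosen for the example, not a sample drawn from the dataset. It shows that `expm1` undoes `log1p`, and that a fixed error on the log scale turns into an error that grows with the price once it is mapped back.

```python
import math

# Illustrative prices (assumed for this example only).
prices = [15.0, 221.0, 797.0]

# Forward transform, as applied to the target column.
logged = [math.log1p(p) for p in prices]

# Inverse transform, as applied to predictions at evaluation time.
restored = [math.expm1(v) for v in logged]

# A constant error of 0.5 on the log scale is a multiplicative error
# on the price scale, so the absolute error grows with the price.
log_error = 0.5
abs_errors = []
for p in prices:
    predicted = math.expm1(math.log1p(p) + log_error)
    abs_errors.append(predicted - p)
    print(f"true={p:8.2f}  predicted={predicted:8.2f}  abs_error={predicted - p:8.2f}")
```

This is one reason why a model whose loss looks reasonable on the log scale can still show a large RMSE on the real scale: expensive listings dominate the error.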

Feature selection.
We select the numeric variables and those previously transformed by the TEC team. Text and redundant columns are excluded. These variables are used to train the neural networks.

In [5]:
feature_cols = [
    'latitude','longitude','accommodates','bathrooms','bedrooms','beds',
    'minimum_nights','maximum_nights','minimum_minimum_nights','maximum_minimum_nights',
    'minimum_maximum_nights','maximum_maximum_nights','minimum_nights_avg_ntm',
    'maximum_nights_avg_ntm','availability_30','availability_60','availability_90',
    'availability_365','number_of_reviews','number_of_reviews_ltm','number_of_reviews_l30d',
    'availability_eoy','number_of_reviews_ly','estimated_occupancy_l365d',
    'estimated_revenue_l365d',
    'calculated_host_listings_count','calculated_host_listings_count_entire_homes',
    'calculated_host_listings_count_private_rooms','calculated_host_listings_count_shared_rooms',
    'is_Entire_home_apt','is_Hotel_room','is_Private_room','is_Shared_room',
    'accommodates_1_to_4','accommodates_5_to_10','accommodates_greater_than_10',
    'bathrooms_menor_o_igual_1_5','bathrooms_mas_de_1_5','is_shared_bathroom',
    'is_private_bathroom','bedrooms_le_2','bedrooms_3_to_5','bedrooms_gt_6',
    'beds_0_to_3','beds_4_to_8','beds_9_to_13','beds_gt_13',
    'is_instant_bookable_binary','has_free_street_parking','has_private_entrance',
    'has_essentials','has_heating','has_wifi','has_pets_allowed','has_hot_water',
    'has_self_check_in','has_freezer','has_exercise_equipment','has_binary'
]

X = df[feature_cols].copy()
y = df["price_log"].copy()

X.shape, y.shape

((5562, 59), (5562,))

Train/test split
The data are split into training and test sets (80/20), and the features are scaled with StandardScaler, fitted on the training set only.

In [6]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

X_train_scaled.shape, X_test_scaled.shape

((4449, 59), (1113, 59))
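The scaling step can be sketched by hand with NumPy to show why the scaler is fitted on the training split only. The arrays below are toy data generated for the example (the names `X_train_demo` and `X_test_demo` are hypothetical), and the sketch mirrors the standardization formula rather than the library internals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature matrices (assumed data, two features each).
X_train_demo = rng.normal(loc=50.0, scale=10.0, size=(8, 2))
X_test_demo = rng.normal(loc=50.0, scale=10.0, size=(3, 2))

# Statistics come from the training split only.
mu = X_train_demo.mean(axis=0)
sigma = X_train_demo.std(axis=0)  # population std (ddof=0)

# The same training statistics are applied to both splits,
# so no information from the test set leaks into preprocessing.
train_scaled = (X_train_demo - mu) / sigma
test_scaled = (X_test_demo - mu) / sigma

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
```

After scaling, the training columns have mean 0 and standard deviation 1 by construction; the test columns generally do not, which is expected.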

Helper function for evaluating models:
I define a helper function that evaluates each neural network on the real price scale.

In [7]:
def evaluate_on_real_scale(model, X_test_scaled, y_test, prefix=""):
    # Predictions on the log scale
    y_pred_log = model.predict(X_test_scaled).ravel()
    
    # Convert back to the real price scale
    y_pred = np.expm1(y_pred_log)
    y_true = np.expm1(y_test.values)
    
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    
    if prefix:
        print(f"{prefix} - MAE: {mae:.2f}, RMSE: {rmse:.2f}, R2: {r2:.4f}")
    else:
        print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}, R2: {r2:.4f}")
    
    return mae, rmse, r2
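
As a complement to the helper above, the following sketch computes the same three metrics by hand with NumPy on made-up values (the function name `regression_metrics` and the demo arrays are assumptions for illustration). It also shows how a single large miss on the real scale can push R2 below zero.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R2) computed from their definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mae = np.mean(np.abs(residuals))
    rmse = np.sqrt(np.mean(residuals ** 2))
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # variation around the mean
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2

# Made-up prices, not taken from the dataset.
y_true_demo = np.array([100.0, 200.0, 300.0])

good = regression_metrics(y_true_demo, [110.0, 190.0, 310.0])
bad = regression_metrics(y_true_demo, [100.0, 200.0, 900.0])  # one large miss

print("good:", good)
print("bad: ", bad)
```

In the second case two of the three predictions are exact, yet R2 is strongly negative because the squared error of the one miss exceeds the total variation of the targets.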

First regression model: NN1 (Neural Network 1)
This is the baseline model. I use a simple architecture (64 → 32 → 16) with ReLU activation in the hidden layers and a linear output for regression.
It serves as a reference point for judging whether the later architectures actually improve performance.
The model is logged in MLflow with its parameters and metrics.

In [8]:
def build_nn1(n_features: int):
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),  # explicit input layer
        layers.Dense(64, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(1)  # linear output
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn1 = rmse_nn1 = r2_nn1 = None

with mlflow.start_run(run_name="NN1_baseline"):
    mlflow.log_param("model_type", "NN1_baseline")
    mlflow.log_param("layers", "64 -> 32 -> 16 -> 1")
    mlflow.log_param("activation_hidden", "ReLU")
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs_max", 80)
    mlflow.log_param("validation_split", 0.2)
    
    nn1 = build_nn1(X_train_scaled.shape[1])
    
    history1 = nn1.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=80,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=10, restore_best_weights=True
            )
        ]
    )
    
    mae_nn1, rmse_nn1, r2_nn1 = evaluate_on_real_scale(nn1, X_test_scaled, y_test, prefix="NN1")
    
    # Log metrics on the real scale
    mlflow.log_metric("MAE_real", mae_nn1)
    mlflow.log_metric("RMSE_real", rmse_nn1)
    mlflow.log_metric("R2_real", r2_nn1)
    
    # Save the model as an explicit artifact (in addition to autolog)
    mlflow.keras.log_model(nn1, artifact_path="nn1_model")



Epoch 1/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 6s 23ms/step - loss: 21.2727 - mae: 4.3275 - val_loss: 18.1766 - val_mae: 3.4715
Epoch 2/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 12.5761 - mae: 2.7992 - val_loss: 33.4024 - val_mae: 2.5741
Epoch 3/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step - loss: 31.0130 - mae: 3.4021 - val_loss: 10.5358 - val_mae: 2.2349
Epoch 4/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 8.9795 - mae: 1.8026 - val_loss: 6.9098 - val_mae: 1.0973
Epoch 5/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step - loss: 29.9604 - mae: 2.7901 - val_loss: 15.4527 - val_mae: 2.9607
Epoch 6/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step - loss: 16.0347 - mae: 2.1879 - val_loss: 3.1117 - val_mae: 1.0414
Epoch 7/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step -

35/35 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step




NN1 - MAE: 143.58, RMSE: 365.63, R2: 0.0172
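
To get a feel for the size of the 64 → 32 → 16 baseline, the number of trainable parameters in a stack of dense layers can be counted by hand: each layer has one weight per input-output pair plus one bias per output. The helper name `dense_param_count` is hypothetical; the 59 input features come from the shape of `X` reported earlier.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_sizes:
        total += fan_in * units + units  # weight matrix + bias vector
        fan_in = units
    return total

# NN1: 59 inputs, hidden layers of 64, 32 and 16 units, one linear output.
nn1_params = dense_param_count(59, [64, 32, 16, 1])
print(nn1_params)  # 6465
```

For a compiled Keras model, `model.summary()` reports the same kind of count per layer, which is a quick way to cross-check the arithmetic.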




Second regression model: NN2
A second model with two hidden layers and one dropout layer. The aim is to increase capacity while controlling overfitting.

In [9]:
def build_nn2(n_features: int):
    inp = keras.Input(shape=(n_features,))
    x = layers.Dense(128, activation="relu")(inp)
    x = layers.Dropout(0.3)(x)
    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(1)(x)
    
    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn2 = rmse_nn2 = r2_nn2 = None

with mlflow.start_run(run_name="NN2_deeper_dropout"):
    mlflow.log_param("model_type", "NN2_deeper_dropout")
    mlflow.log_param("layers", "128 -> 64 -> 1")
    mlflow.log_param("activation_hidden", "ReLU")
    mlflow.log_param("dropout_first", 0.3)
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs_max", 80)
    mlflow.log_param("validation_split", 0.2)
    
    nn2 = build_nn2(X_train_scaled.shape[1])
    
    history2 = nn2.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=80,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=10, restore_best_weights=True
            )
        ]
    )
    
    mae_nn2, rmse_nn2, r2_nn2 = evaluate_on_real_scale(nn2, X_test_scaled, y_test, prefix="NN2")
    
    mlflow.log_metric("MAE_real", mae_nn2)
    mlflow.log_metric("RMSE_real", rmse_nn2)
    mlflow.log_metric("R2_real", r2_nn2)
    
    mlflow.keras.log_model(nn2, artifact_path="nn2_model")



Epoch 1/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 2s 12ms/step - loss: 22.2927 - mae: 4.0227 - val_loss: 11.3872 - val_mae: 2.9637
Epoch 2/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - loss: 7.0354 - mae: 2.0055 - val_loss: 4.1192 - val_mae: 1.0750
Epoch 3/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - loss: 3.1872 - mae: 1.2406 - val_loss: 1.0468 - val_mae: 0.6610
Epoch 4/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - loss: 1.9858 - mae: 0.9824 - val_loss: 0.7455 - val_mae: 0.5125
Epoch 5/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - loss: 1.4649 - mae: 0.8545 - val_loss: 0.4391 - val_mae: 0.4621
Epoch 6/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - loss: 1.2089 - mae: 0.7901 - val_loss: 0.4009 - val_mae: 0.4652
Epoch 7/80
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step

35/35 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step




NN2 - MAE: 113.28, RMSE: 343.43, R2: 0.1329
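
The dropout layer used in NN2 can be illustrated with a small NumPy sketch of "inverted" dropout, the training-time behavior in which surviving activations are rescaled so their expected value is unchanged. The function name `inverted_dropout` and the all-ones input are assumptions made for the example; this is a simplified picture, not the Keras implementation.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero a fraction `rate` of the activations and rescale the rest."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True = unit is kept
    return activations * mask / keep_prob

rng = np.random.default_rng(42)
acts = np.ones((10000, 8))  # toy activations

dropped = inverted_dropout(acts, rate=0.3, rng=rng)

print("fraction zeroed:", np.mean(dropped == 0.0))
print("mean activation:", dropped.mean())
```

Roughly 30% of the entries are zeroed and the mean stays close to 1, which is why no rescaling is needed at prediction time, when dropout is switched off.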




Third neural network model: NN3
A third model with greater depth, batch normalization, dropout regularization, and ReLU activation.

In [10]:
def build_nn3(n_features: int):
    inp = keras.Input(shape=(n_features,))
    
    x = layers.Dense(256, activation="relu")(inp)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.4)(x)
    
    x = layers.Dense(128, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    
    x = layers.Dense(64, activation="relu")(x)
    
    out = layers.Dense(1)(x)
    
    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=5e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn3 = rmse_nn3 = r2_nn3 = None

with mlflow.start_run(run_name="NN3_deep_bn_dropout"):
    mlflow.log_param("model_type", "NN3_deep_bn_dropout")
    mlflow.log_param("layers", "256 -> 128 -> 64 -> 1")
    mlflow.log_param("activation_hidden", "ReLU")
    mlflow.log_param("dropout", "0.4, 0.3")
    mlflow.log_param("batchnorm", True)
    mlflow.log_param("learning_rate", 5e-4)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs_max", 100)
    mlflow.log_param("validation_split", 0.2)
    
    nn3 = build_nn3(X_train_scaled.shape[1])
    
    history3 = nn3.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=100,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=12, restore_best_weights=True
            ),
            keras.callbacks.ReduceLROnPlateau(
                patience=5, factor=0.5
            )
        ]
    )
    
    mae_nn3, rmse_nn3, r2_nn3 = evaluate_on_real_scale(nn3, X_test_scaled, y_test, prefix="NN3")
    
    mlflow.log_metric("MAE_real", mae_nn3)
    mlflow.log_metric("RMSE_real", rmse_nn3)
    mlflow.log_metric("R2_real", r2_nn3)
    
    mlflow.keras.log_model(nn3, artifact_path="nn3_model")



Epoch 1/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 4s 24ms/step - loss: 24.1820 - mae: 4.6843 - val_loss: 16.6717 - val_mae: 4.0255 - learning_rate: 5.0000e-04
Epoch 2/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 2s 15ms/step - loss: 12.5020 - mae: 3.2250 - val_loss: 5.3232 - val_mae: 2.2217 - learning_rate: 5.0000e-04
Epoch 3/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 2s 14ms/step - loss: 5.5034 - mae: 1.8851 - val_loss: 1.2093 - val_mae: 0.9477 - learning_rate: 5.0000e-04
Epoch 4/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 2s 15ms/step - loss: 4.0452 - mae: 1.5191 - val_loss: 0.8211 - val_mae: 0.7467 - learning_rate: 5.0000e-04
Epoch 5/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 2s 14ms/step - loss: 3.2509 - mae: 1.3741 - val_loss: 0.8970 - val_mae: 0.7731 - learning_rate: 5.0000e-04
Epoch 6/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 2s 16ms/step

35/35 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step




NN3 - MAE: 100.09, RMSE: 322.98, R2: 0.2331
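
The batch normalization layers in NN3 can be sketched as a training-mode forward pass in NumPy: each feature is standardized with the statistics of the current mini-batch and then rescaled by learnable parameters. The function name `batch_norm_forward` and the toy batch are assumptions; the running averages used at inference time are deliberately left out of this simplified version.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-3):
    """Normalize each column of `x` over the batch, then scale and shift."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x_demo = rng.normal(loc=3.0, scale=5.0, size=(32, 4))  # toy batch of 32 rows

out = batch_norm_forward(x_demo, gamma=np.ones(4), beta=np.zeros(4))

print(out.mean(axis=0))
print(out.std(axis=0))
```

With `gamma` at 1 and `beta` at 0 the output columns have mean close to 0 and standard deviation close to 1, regardless of the scale of the incoming activations.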




Comparison of the neural models
I compare the results of the three neural networks.

In [11]:
results_nn = pd.DataFrame({
    "Model": ["NN1_baseline", "NN2_deeper_dropout", "NN3_deep_bn_dropout"],
    "MAE": [mae_nn1, mae_nn2, mae_nn3],
    "RMSE": [rmse_nn1, rmse_nn2, rmse_nn3],
    "R2": [r2_nn1, r2_nn2, r2_nn3]
})

results_nn

                 Model         MAE        RMSE        R2
0         NN1_baseline  143.581453  365.632837  0.017195
1   NN2_deeper_dropout  113.276728  343.430486  0.132929
2  NN3_deep_bn_dropout   100.08795   322.98227  0.233108


Selecting the best model:

In [12]:
best_row = results_nn.loc[results_nn["MAE"].idxmin()]  # idxmin returns an index label, so use .loc
best_row

Model    NN3_deep_bn_dropout
MAE                100.08795
RMSE               322.98227
R2                  0.233108
Name: 2, dtype: object

Expanding the hyperparameter search:
In this model I change two key hyperparameters relative to the previous ones:
- I use a larger batch size (64), which reduces the stochastic noise in each gradient step.
- I lower the learning rate to 0.0005 (NN1 and NN2 used 0.001).
The architecture is intermediate (128 → 64 → 32) with ReLU activation. The goal is to see whether a more conservative learning rate with larger batches improves or worsens performance.

In [13]:
def build_nn4(n_features: int):
    inp = keras.Input(shape=(n_features,))
    x = layers.Dense(128, activation="relu")(inp)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dense(32, activation="relu")(x)
    out = layers.Dense(1)(x)

    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=5e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn4 = rmse_nn4 = r2_nn4 = None

with mlflow.start_run(run_name="NN4_low_lr_bs64"):
    mlflow.log_param("model_type", "NN4_low_lr_bs64")
    mlflow.log_param("layers", "128 -> 64 -> 32 -> 1")
    mlflow.log_param("learning_rate", 5e-4)
    mlflow.log_param("batch_size", 64)
    mlflow.log_param("activation_hidden", "ReLU")

    nn4 = build_nn4(X_train_scaled.shape[1])

    history4 = nn4.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=80,
        batch_size=64,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=10,
                restore_best_weights=True
            )
        ]
    )

    mae_nn4, rmse_nn4, r2_nn4 = evaluate_on_real_scale(nn4, X_test_scaled, y_test, prefix="NN4")

    mlflow.log_metric("MAE_real", mae_nn4)
    mlflow.log_metric("RMSE_real", rmse_nn4)
    mlflow.log_metric("R2_real", r2_nn4)

    mlflow.keras.log_model(nn4, artifact_path="nn4_model")



Epoch 1/80
56/56 ━━━━━━━━━━━━━━━━━━━━ 2s 16ms/step - loss: 26.7012 - mae: 4.7684 - val_loss: 24.2697 - val_mae: 4.3940
Epoch 2/80
56/56 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step - loss: 18.4492 - mae: 3.8263 - val_loss: 15.1675 - val_mae: 3.4352
Epoch 3/80
56/56 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 9.8708 - mae: 2.8605 - val_loss: 7.5568 - val_mae: 2.4847
Epoch 4/80
56/56 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 5.6365 - mae: 2.0173 - val_loss: 4.5665 - val_mae: 1.6887
Epoch 5/80
56/56 ━━━━━━━━━━━━━━━━━━━━ 1s 13ms/step - loss: 3.1223 - mae: 1.3396 - val_loss: 3.8223 - val_mae: 1.5450
Epoch 6/80
56/56 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 6.0756 - mae: 1.5466 - val_loss: 6.4186 - val_mae: 1.8563
Epoch 7/80
56/56 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss:

35/35 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step




NN4 - MAE: 163.95, RMSE: 1172.44, R2: -9.1055
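
The step counts shown in the training logs follow from simple arithmetic on the split sizes, which is a useful check that the batch size in effect is the one intended. The sketch below assumes that `validation_split=0.2` holds out the last 20% of the 4449 training rows, rounding the fitted portion down; the variable names are hypothetical.

```python
import math

n_train = 4449           # rows in X_train_scaled
n_test = 1113            # rows in X_test_scaled
validation_split = 0.2

# Assumption: the fitted portion is floor(n * (1 - split)); the rest is validation.
n_fit = math.floor(n_train * (1 - validation_split))
n_val = n_train - n_fit

steps_bs32 = math.ceil(n_fit / 32)    # NN1, NN2, NN3, NN5, NN6
steps_bs64 = math.ceil(n_fit / 64)    # NN4
predict_steps = math.ceil(n_test / 32)

print(n_fit, n_val, steps_bs32, steps_bs64, predict_steps)
```

The results, 112 and 56 steps per epoch and 35 prediction batches, agree with the `112/112`, `56/56`, and `35/35` counters in the logs.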




### 13. NN5 model (256–128–64 architecture + L2 regularization + light dropout)
In NN5 I try a larger architecture than NN4, adding:
- L2 regularization (1e-4) to control overfitting.
- Light dropout (20%).
- A moderate learning rate (0.0008).
- Adam as the optimizer, with ReLU layers.

The goal is to assess whether a wider but regularized network improves accuracy.

In [14]:
from tensorflow.keras import regularizers

def build_nn5(n_features: int):
    l2_reg = regularizers.l2(1e-4)

    inp = keras.Input(shape=(n_features,))
    x = layers.Dense(256, activation="relu", kernel_regularizer=l2_reg)(inp)
    x = layers.Dropout(0.2)(x)
    x = layers.Dense(128, activation="relu", kernel_regularizer=l2_reg)(x)
    x = layers.Dense(64, activation="relu", kernel_regularizer=l2_reg)(x)
    out = layers.Dense(1)(x)

    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=8e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn5 = rmse_nn5 = r2_nn5 = None

with mlflow.start_run(run_name="NN5_l2_dropout_arch256"):
    mlflow.log_param("model_type", "NN5_l2_dropout_arch256")
    mlflow.log_param("layers", "256 -> 128 -> 64 -> 1")
    mlflow.log_param("learning_rate", 8e-4)
    mlflow.log_param("dropout", 0.2)
    mlflow.log_param("l2", 1e-4)
    mlflow.log_param("activation_hidden", "ReLU")

    nn5 = build_nn5(X_train_scaled.shape[1])

    history5 = nn5.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=100,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=12,
                restore_best_weights=True
            )
        ]
    )

    mae_nn5, rmse_nn5, r2_nn5 = evaluate_on_real_scale(nn5, X_test_scaled, y_test, prefix="NN5")

    mlflow.log_metric("MAE_real", mae_nn5)
    mlflow.log_metric("RMSE_real", rmse_nn5)
    mlflow.log_metric("R2_real", r2_nn5)

    mlflow.keras.log_model(nn5, artifact_path="nn5_model")



Epoch 1/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 2s 13ms/step - loss: 19.6851 - mae: 3.8777 - val_loss: 14.7703 - val_mae: 3.1327
Epoch 2/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 19.4446 - mae: 3.0186 - val_loss: 3.7744 - val_mae: 1.3287
Epoch 3/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 98.8901 - mae: 5.6334 - val_loss: 2.5455 - val_mae: 1.0304
Epoch 4/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 21.2697 - mae: 2.6303 - val_loss: 65.1935 - val_mae: 6.1662
Epoch 5/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 48.8840 - mae: 4.3906 - val_loss: 19.4720 - val_mae: 3.3596
Epoch 6/100
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 10ms/step - loss: 71.2242 - mae: 4.9000 - val_loss: 42.9895 - val_mae: 4.9868
Epoch 7/100
112/112 ━━━━━━━━━━━━━━━━━━━━

35/35 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step




NN5 - MAE: 180.16, RMSE: 718.12, R2: -2.7911
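
The L2 term used in NN5 adds a penalty proportional to the sum of squared kernel weights to the training loss. The sketch below computes that penalty by hand for a tiny made-up weight matrix; the function name `l2_penalty` and the example values are assumptions, and biases are left out because only kernels are regularized in the model above.

```python
import numpy as np

def l2_penalty(weight_matrices, lam):
    """Return lam * (sum of squared entries over all given kernels)."""
    return lam * sum(float(np.sum(w ** 2)) for w in weight_matrices)

# Tiny made-up kernel: 1 + 4 + 9 + 16 = 30.
w_demo = np.array([[1.0, 2.0],
                   [3.0, 4.0]])

penalty = l2_penalty([w_demo], lam=1e-4)
mse_demo = 0.40                      # hypothetical data-fit term
total_loss = mse_demo + penalty

print(penalty, total_loss)
```

Because the penalty is added to the reported `loss`, the training loss of a regularized model is not directly comparable with the plain MSE of the unregularized models; the `mae` metric is unaffected.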




In [15]:
def build_nn6(n_features: int):
    inp = keras.Input(shape=(n_features,))

    x = layers.Dense(256, activation="relu")(inp)
    x = layers.Dropout(0.5)(x)

    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)

    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.3)(x)

    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(1)(x)

    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=3e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn6 = rmse_nn6 = r2_nn6 = None

with mlflow.start_run(run_name="NN6_big_dropout_lowLR"):
    mlflow.log_param("model_type", "NN6_big_dropout_lowLR")
    mlflow.log_param("layers", "256 -> 256 -> 128 -> 64 -> 1")
    mlflow.log_param("learning_rate", 3e-4)
    mlflow.log_param("dropout", "0.5, 0.5, 0.3")
    mlflow.log_param("activation_hidden", "ReLU")

    nn6 = build_nn6(X_train_scaled.shape[1])

    history6 = nn6.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=120,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=15,
                restore_best_weights=True
            ),
            keras.callbacks.ReduceLROnPlateau(
                patience=7,
                factor=0.5
            )
        ]
    )

    mae_nn6, rmse_nn6, r2_nn6 = evaluate_on_real_scale(nn6, X_test_scaled, y_test, prefix="NN6")

    mlflow.log_metric("MAE_real", mae_nn6)
    mlflow.log_metric("RMSE_real", rmse_nn6)
    mlflow.log_metric("R2_real", r2_nn6)

    mlflow.keras.log_model(nn6, artifact_path="nn6_model")



Epoch 1/120
112/112 ━━━━━━━━━━━━━━━━━━━━ 3s 17ms/step - loss: 27.0296 - mae: 4.6256 - val_loss: 17.7216 - val_mae: 3.9829 - learning_rate: 3.0000e-04
Epoch 2/120
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 12ms/step - loss: 28.1952 - mae: 4.2024 - val_loss: 20.0545 - val_mae: 3.6387 - learning_rate: 3.0000e-04
Epoch 3/120
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 11ms/step - loss: 142.5268 - mae: 8.0186 - val_loss: 63.8619 - val_mae: 6.2858 - learning_rate: 3.0000e-04
Epoch 4/120
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 12ms/step - loss: 249.7858 - mae: 10.7887 - val_loss: 45.6508 - val_mae: 5.3642 - learning_rate: 3.0000e-04
Epoch 5/120
112/112 ━━━━━━━━━━━━━━━━━━━━ 1s 12ms/step - loss: 643.2095 - mae: 16.2020 - val_loss: 5221.7583 - val_mae: 58.2423 - learning_rate: 3.0000e-04
Epoch 6/120
112/112 ━━━━━━━━━━━━━━━━━━━━ 1

35/35 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step




NN6 - MAE: 226.08, RMSE: 431.51, R2: -0.3689
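
NN3 and NN6 both use a learning-rate reduction callback with a patience and a factor. The following is a simplified, pure-Python sketch of that idea: when the validation loss has not improved for `patience` consecutive epochs, the learning rate is multiplied by `factor`. The function name `simulate_reduce_lr` and the loss sequence are made up, and details of the real callback such as the minimum improvement threshold and cooldown are omitted.

```python
def simulate_reduce_lr(val_losses, initial_lr, patience, factor):
    """Return the learning rate in effect after each epoch."""
    lr = initial_lr
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best:
            best = loss        # improvement: reset the counter
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor   # plateau detected: shrink the step size
                wait = 0
        schedule.append(lr)
    return schedule

# Made-up validation losses: improvement stops after the second epoch.
schedule = simulate_reduce_lr(
    [1.0, 0.9, 0.95, 0.96, 0.97], initial_lr=3e-4, patience=2, factor=0.5
)
print(schedule)
```

In this toy run the rate is halved once, at the fourth epoch. With a patience of 7, as in NN6, a run that diverges within the first few epochs may never reach the point where the reduction triggers.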




In [16]:
results_nn_full = pd.DataFrame({
    "Model": [
        "NN1_baseline",
        "NN2_deeper_dropout",
        "NN3_deep_bn_dropout",
        "NN4_low_lr_bs64",
        "NN5_l2_dropout_arch256",
        "NN6_big_dropout_lowLR"
    ],
    "MAE": [mae_nn1, mae_nn2, mae_nn3, mae_nn4, mae_nn5, mae_nn6],
    "RMSE": [rmse_nn1, rmse_nn2, rmse_nn3, rmse_nn4, rmse_nn5, rmse_nn6],
    "R2": [r2_nn1, r2_nn2, r2_nn3, r2_nn4, r2_nn5, r2_nn6]
})

results_nn_full

                    Model         MAE         RMSE         R2
0            NN1_baseline  143.581453   365.632837   0.017195
1      NN2_deeper_dropout  113.276728   343.430486   0.132929
2     NN3_deep_bn_dropout   100.08795    322.98227   0.233108
3         NN4_low_lr_bs64  163.951803  1172.442084  -9.105549
4  NN5_l2_dropout_arch256  180.164854   718.117926   -2.79113
5   NN6_big_dropout_lowLR  226.083449    431.50944  -0.368856
