Importación y configuración de MLflow:
Se cargan todas la librerias y las herramientas para la creación de los modelos.

In [59]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

import mlflow
import mlflow.keras

print("TensorFlow:", tf.__version__)

# Configuración de MLflow (local, sin tracking_uri raro)
mlflow.set_experiment("airbnb_regresion_nn")

# Activo autolog para Keras (registra automáticamente métricas, parámetros y modelo)
mlflow.keras.autolog()

TensorFlow: 2.16.2


Carga del data set limpio y entregado por los estudiantes del TEC.

In [60]:

df = pd.read_csv("listings_clean_final.csv")
df.head()

Unnamed: 0,id,source,name,host_id,host_name,host_since,host_is_superhost,host_listings_count,host_total_listings_count,host_verifications,...,has_free_street_parking,has_private_entrance,has_essentials,has_heating,has_wifi,has_pets_allowed,has_hot_water,has_self_check_in,has_freezer,has_exercise_equipment
0,18744501,city scrape,"""Artist´s Creative Residence"" 100m² im Zentrum",129635321,Sylvia,2017-05-09,f,1.0,1.0,"['email', 'phone']",...,1,1,1,1,1,0,1,0,0,0
1,23356842,city scrape,"""Bohemian Residency"" (Central & Quiet) * * * * *",150173398,Vincent,2017-09-11,t,2.0,3.0,"['email', 'phone']",...,1,1,1,1,1,0,1,1,0,0
2,819658084391291386,city scrape,"""Feel at Home"" Flat at the Lerchenauer See",29225873,Skandar,2015-03-12,f,1.0,1.0,"['email', 'phone']",...,0,0,1,1,1,1,1,0,0,1
3,34677963,city scrape,"""Little Star"" Schlafoase im Zentrum",28482431,Adriana,2015-02-28,f,5.0,5.0,"['email', 'phone']",...,0,1,1,1,1,0,1,1,0,0
4,34431776,city scrape,"""Moonlight"" Schlafoase mitten im Szenenviertel",28482431,Adriana,2015-02-28,f,5.0,5.0,"['email', 'phone']",...,0,1,1,1,1,0,1,1,0,0


Limpieza de la variable predictoria. La variable 'price' viene como texto "string" con símbolos. Se limpia y se transforma en numérico.

In [61]:
df["price"] = (
    df["price"]
    .str.replace("$", "", regex=False)
    .str.replace(",", "", regex=False)
    .astype(float)
)

df["price"].head(), df["price"].dtype

(0    221.0
 1    797.0
 2    106.0
 3    258.0
 4    249.0
 Name: price, dtype: float64,
 dtype('float64'))

Hacemos una tranformación logarítmica del precio, ya que este tiene una distribución muy asimétrica. Esto se hace para estabilizar la varianza y mejorar los modelos 

In [62]:
df["price_log"] = np.log1p(df["price"])
df["price_log"].describe()

count    5562.000000
mean        5.243958
std         0.744214
min         2.772589
25%         4.727388
50%         5.198497
75%         5.707110
max         9.332912
Name: price_log, dtype: float64

Selección de la variables.
Seleccionamos las variables numéricas y transformadas anteriormente por el equipo del TEC. Se excluyen columnas textuales y redundantes. Estas variables se usarán para entrenar las redes neuronales.

In [63]:
feature_cols = [
    'latitude','longitude','accommodates','bathrooms','bedrooms','beds',
    'minimum_nights','maximum_nights','minimum_minimum_nights','maximum_minimum_nights',
    'minimum_maximum_nights','maximum_maximum_nights','minimum_nights_avg_ntm',
    'maximum_nights_avg_ntm','availability_30','availability_60','availability_90',
    'availability_365','number_of_reviews','number_of_reviews_ltm','number_of_reviews_l30d',
    'availability_eoy','number_of_reviews_ly','estimated_occupancy_l365d',
    'estimated_revenue_l365d',
    'calculated_host_listings_count','calculated_host_listings_count_entire_homes',
    'calculated_host_listings_count_private_rooms','calculated_host_listings_count_shared_rooms',
    'is_Entire_home_apt','is_Hotel_room','is_Private_room','is_Shared_room',
    'accommodates_1_to_4','accommodates_5_to_10','accommodates_greater_than_10',
    'bathrooms_menor_o_igual_1_5','bathrooms_mas_de_1_5','is_shared_bathroom',
    'is_private_bathroom','bedrooms_le_2','bedrooms_3_to_5','bedrooms_gt_6',
    'beds_0_to_3','beds_4_to_8','beds_9_to_13','beds_gt_13',
    'is_instant_bookable_binary','has_free_street_parking','has_private_entrance',
    'has_essentials','has_heating','has_wifi','has_pets_allowed','has_hot_water',
    'has_self_check_in','has_freezer','has_exercise_equipment','has_binary'
]

X = df[feature_cols].copy()
y = df["price_log"].copy()

X.shape, y.shape

((5562, 59), (5562,))

División Train/Test
Se dividen los datos en en entrenamiento y prueba y se escalan los features con StandarScaler 

In [64]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

X_train_scaled.shape, X_test_scaled.shape

((4449, 59), (1113, 59))

Función auxiliar para evaluar modelos: 
Defino una función auxiliar para evaluar cada red neuronal en la escala real del precio. 

In [65]:
def evaluate_on_real_scale(model, X_test_scaled, y_test, prefix=""):
    # Predicciones en escala log
    y_pred_log = model.predict(X_test_scaled).ravel()
    
    # Pasar a escala real
    y_pred = np.expm1(y_pred_log)
    y_true = np.expm1(y_test.values)
    
    mae = mean_absolute_error(y_true, y_pred)
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))
    r2 = r2_score(y_true, y_pred)
    
    if prefix:
        print(f"{prefix} - MAE: {mae:.2f}, RMSE: {rmse:.2f}, R2: {r2:.4f}")
    else:
        print(f"MAE: {mae:.2f}, RMSE: {rmse:.2f}, R2: {r2:.4f}")
    
    return mae, rmse, r2

Primer modelo de regresion: NN1 (Red Neuronal 1)
Este es el primer modelo de referencia. Uso una arquitectura sencilla (64, 32, 16) con activación ReLU en las capas ocultas y salida lineal para regresión.  
Sirve como punto de comparación para evaluar si las arquitecturas posteriores realmente mejoran el desempeño.  
El modelo se registra en MLflow con sus parámetros y métricas.

In [66]:
def build_nn1(n_features: int):
    model = keras.Sequential([
        layers.Dense(64, activation="relu", input_shape=(n_features,)),
        layers.Dense(32, activation="relu"),
        layers.Dense(16, activation="relu"),
        layers.Dense(1)  # salida lineal
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn1 = rmse_nn1 = r2_nn1 = None

with mlflow.start_run(run_name="NN1_baseline"):
    mlflow.log_param("model_type", "NN1_baseline")
    mlflow.log_param("layers", "64 -> 1")
    mlflow.log_param("activation_hidden", "ReLU")
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs_max", 80)
    mlflow.log_param("validation_split", 0.2)
    
    nn1 = build_nn1(X_train_scaled.shape[1])
    
    history1 = nn1.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=80,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=10, restore_best_weights=True
            )
        ]
    )
    
    mae_nn1, rmse_nn1, r2_nn1 = evaluate_on_real_scale(nn1, X_test_scaled, y_test, prefix="NN1")
    
    # Logueo métricas en escala real
    mlflow.log_metric("MAE_real", mae_nn1)
    mlflow.log_metric("RMSE_real", rmse_nn1)
    mlflow.log_metric("R2_real", r2_nn1)
    
    # Guardo el modelo como artefacto explícito (además del autolog)
    mlflow.keras.log_model(nn1, artifact_path="nn1_model")

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 16ms/step - loss: 20.6816 - mae: 4.0045 - val_loss: 17.8865 - val_mae: 2.8344
Epoch 2/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 8.3497 - mae: 2.0345 - val_loss: 17.7916 - val_mae: 2.3509
Epoch 3/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 4.6847 - mae: 1.3476 - val_loss: 3.2434 - val_mae: 1.3929
Epoch 4/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 2.3855 - mae: 1.0071 - val_loss: 4.4321 - val_mae: 1.5578
Epoch 5/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 2.4752 - mae: 1.0017 - val_loss: 1.4284 - val_mae: 0.8775
Epoch 6/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 4.3933 - mae: 1.2600 - val_loss: 1.3926 - val_mae: 0.8662
Epoch 7/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 12



[1m35/35[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step




NN1 - MAE: 110.79, RMSE: 339.58, R2: 0.1523




Segundo modelo de regresión: (NN2)
Segundo modelo con dos capas ocultas y un dropout del 30%. Es decir, durante los entrenamientos se apagan aleatoriamente el 30% de las neuronas. Esto ayuda a controlar el sobreajuste y a que prediga patrones más generales y no ruido del dataset.

In [67]:
def build_nn2(n_features: int):
    inp = keras.Input(shape=(n_features,))
    x = layers.Dense(128, activation="relu")(inp)
    x = layers.Dropout(0.3)(x)
    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(1)(x)
    
    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn2 = rmse_nn2 = r2_nn2 = None

with mlflow.start_run(run_name="NN2_deeper_dropout"):
    mlflow.log_param("model_type", "NN2_deeper_dropout")
    mlflow.log_param("layers", "128 -> 64 -> 1")
    mlflow.log_param("activation_hidden", "ReLU")
    mlflow.log_param("dropout_first", 0.3)
    mlflow.log_param("learning_rate", 1e-3)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs_max", 80)
    mlflow.log_param("validation_split", 0.2)
    
    nn2 = build_nn2(X_train_scaled.shape[1])
    
    history2 = nn2.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=80,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=10, restore_best_weights=True
            )
        ]
    )
    
    mae_nn2, rmse_nn2, r2_nn2 = evaluate_on_real_scale(nn2, X_test_scaled, y_test, prefix="NN2")
    
    mlflow.log_metric("MAE_real", mae_nn2)
    mlflow.log_metric("RMSE_real", rmse_nn2)
    mlflow.log_metric("R2_real", r2_nn2)
    
    mlflow.keras.log_model(nn2, artifact_path="nn2_model")



Epoch 1/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 14ms/step - loss: 20.9523 - mae: 4.1222 - val_loss: 11.8026 - val_mae: 2.9982
Epoch 2/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 7.6192 - mae: 2.0532 - val_loss: 2.8055 - val_mae: 1.1997
Epoch 3/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 3.9743 - mae: 1.3809 - val_loss: 3.1080 - val_mae: 0.9544
Epoch 4/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 2.7999 - mae: 1.0991 - val_loss: 2.3674 - val_mae: 0.9039
Epoch 5/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10ms/step - loss: 1.8927 - mae: 0.9455 - val_loss: 1.4955 - val_mae: 0.7385
Epoch 6/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 1.5301 - mae: 0.8660 - val_loss: 1.0671 - val_mae: 0.5612
Epoch 7/80
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 10m



[1m35/35[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step




NN2 - MAE: 605.20, RMSE: 10258.41, R2: -772.6373




Tercer modelo de redes neuronales: (NN3)
Se hace un tercer con modelo con mayor profundidad (deep), batch de normalización (bn), regularización y activación ReLU

In [68]:
def build_nn3(n_features: int):
    inp = keras.Input(shape=(n_features,))
    
    x = layers.Dense(256, activation="relu")(inp)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.4)(x)
    
    x = layers.Dense(128, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    
    x = layers.Dense(64, activation="relu")(x)
    
    out = layers.Dense(1)(x)
    
    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=5e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn3 = rmse_nn3 = r2_nn3 = None

with mlflow.start_run(run_name="NN3_deep_bn_dropout"):
    mlflow.log_param("model_type", "NN3_deep_bn_dropout")
    mlflow.log_param("layers", "256 -> 128 -> 64 -> 1")
    mlflow.log_param("activation_hidden", "ReLU")
    mlflow.log_param("dropout", "0.4, 0.3")
    mlflow.log_param("batchnorm", True)
    mlflow.log_param("learning_rate", 5e-4)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs_max", 100)
    mlflow.log_param("validation_split", 0.2)
    
    nn3 = build_nn3(X_train_scaled.shape[1])
    
    history3 = nn3.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=100,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=12, restore_best_weights=True
            ),
            keras.callbacks.ReduceLROnPlateau(
                patience=5, factor=0.5
            )
        ]
    )
    
    mae_nn3, rmse_nn3, r2_nn3 = evaluate_on_real_scale(nn3, X_test_scaled, y_test, prefix="NN3")
    
    mlflow.log_metric("MAE_real", mae_nn3)
    mlflow.log_metric("RMSE_real", rmse_nn3)
    mlflow.log_metric("R2_real", r2_nn3)
    
    mlflow.keras.log_model(nn3, artifact_path="nn3_model")



Epoch 1/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 30ms/step - loss: 23.7246 - mae: 4.6573 - val_loss: 16.7798 - val_mae: 4.0463 - learning_rate: 5.0000e-04
Epoch 2/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 18ms/step - loss: 12.2597 - mae: 3.1795 - val_loss: 5.2633 - val_mae: 2.1854 - learning_rate: 5.0000e-04
Epoch 3/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 19ms/step - loss: 5.3788 - mae: 1.8377 - val_loss: 1.1897 - val_mae: 0.9504 - learning_rate: 5.0000e-04
Epoch 4/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 20ms/step - loss: 3.7669 - mae: 1.4847 - val_loss: 0.8573 - val_mae: 0.7612 - learning_rate: 5.0000e-04
Epoch 5/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 17ms/step - loss: 3.3790 - mae: 1.3898 - val_loss: 0.7786 - val_mae: 0.7262 - learning_rate: 5.0000e-04
Epoch 6/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 17ms/step



[1m35/35[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step




NN3 - MAE: 98.88, RMSE: 318.99, R2: 0.2519




Construyo otra red neuronal eliminando una capa en comparacion con la red anterior a ver qué sucede

In [69]:
def build_nnJD(n_features: int):
    inp = keras.Input(shape=(n_features,))
    
    x = layers.Dense(128, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    
    x = layers.Dense(64, activation="relu")(x)
    
    out = layers.Dense(1)(x)
    
    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=5e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nnJD = rmse_nnJD = r2_nnJD = None

with mlflow.start_run(run_name="NNJD_128_bn_dropout"):
    mlflow.log_param("model_type", "NNJD_128_bn_dropout")
    mlflow.log_param("layers", "256 -> 128 -> 64 -> 1")
    mlflow.log_param("activation_hidden", "ReLU")
    mlflow.log_param("dropout", "0.4, 0.3")
    mlflow.log_param("batchnorm", True)
    mlflow.log_param("learning_rate", 5e-4)
    mlflow.log_param("batch_size", 32)
    mlflow.log_param("epochs_max", 100)
    mlflow.log_param("validation_split", 0.2)
    
    nnJD = build_nn3(X_train_scaled.shape[1])
    
    history3 = nnJD.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=100,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=12, restore_best_weights=True
            ),
            keras.callbacks.ReduceLROnPlateau(
                patience=5, factor=0.5
            )
        ]
    )
    
    mae_nnJD, rmse_nnJD, r2_nnJD = evaluate_on_real_scale(nnJD, X_test_scaled, y_test, prefix="NNJD")
    
    mlflow.log_metric("MAE_real", mae_nnJD)
    mlflow.log_metric("RMSE_real", rmse_nnJD)
    mlflow.log_metric("R2_real", r2_nnJD)
    
    mlflow.keras.log_model(nnJD, artifact_path="nnjd_model")



Epoch 1/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 24ms/step - loss: 23.6840 - mae: 4.6051 - val_loss: 15.6814 - val_mae: 3.9059 - learning_rate: 5.0000e-04
Epoch 2/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 18ms/step - loss: 11.8758 - mae: 3.0881 - val_loss: 4.9607 - val_mae: 2.1189 - learning_rate: 5.0000e-04
Epoch 3/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 19ms/step - loss: 5.7963 - mae: 1.9288 - val_loss: 3.1286 - val_mae: 1.4117 - learning_rate: 5.0000e-04
Epoch 4/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 17ms/step - loss: 4.3237 - mae: 1.6255 - val_loss: 0.9570 - val_mae: 0.8126 - learning_rate: 5.0000e-04
Epoch 5/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 18ms/step - loss: 3.6807 - mae: 1.4538 - val_loss: 1.3276 - val_mae: 0.9125 - learning_rate: 5.0000e-04
Epoch 6/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 17ms/step



[1m35/35[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 5ms/step
NNJD - MAE: 99.47, RMSE: 320.65, R2: 0.2441




Comparación de modelos neuronales
Se imprimen los resultados de las 3 redes neuronales y se puede evidenciar que la de menor MSE y mayor R2 es la NN3. Esto se debe a la construcción de una red mas profunda, con capa de normalización, y dropout. Las anteriores ayudan a entrenar mejor al modelo.

In [70]:
results_nn = pd.DataFrame({
    "Model": ["NN1_baseline", "NN2_deeper_dropout", "NN3_deep_bn_dropout","NNJD_128_bn_dropout"],
    "MAE": [mae_nn1, mae_nn2, mae_nn3, mae_nnJD],
    "RMSE": [rmse_nn1, rmse_nn2, rmse_nn3, rmse_nnJD],
    "R2": [r2_nn1, r2_nn2, r2_nn3, r2_nnJD]
})

results_nn

Unnamed: 0,Model,MAE,RMSE,R2
0,NN1_baseline,110.79283,339.576092,0.152282
1,NN2_deeper_dropout,605.196646,10258.41195,-772.63729
2,NN3_deep_bn_dropout,98.876599,318.99299,0.251935
3,NNJD_128_bn_dropout,99.472153,320.653136,0.244128


Selección del mejor modelo:

In [71]:
best_row = results_nn.iloc[results_nn["MAE"].idxmin()]
best_row

Model    NN3_deep_bn_dropout
MAE                98.876599
RMSE               318.99299
R2                  0.251935
Name: 2, dtype: object

Ampliación en la búsqueda de hiperparámetros:
- Uso un batch size más pequeño (16), lo que introduce más ruido estocástico.
- Aumento el learning rate a 0.01.
La arquitectura es intermedia (128, 64, 32) con activación ReLU. El objetivo es ver si una tasa de aprendizaje más agresiva mejora o empeora el desempeño.

In [72]:
def build_nn4(n_features: int):
    inp = keras.Input(shape=(n_features,))
    x = layers.Dense(128, activation="relu")(inp)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dense(32, activation="relu")(x)
    out = layers.Dense(1)(x)

    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=5e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn4 = rmse_nn4 = r2_nn4 = None

with mlflow.start_run(run_name="NN4_low_lr_bs64"):
    mlflow.log_param("model_type", "NN4_low_lr_bs64")
    mlflow.log_param("layers", "128 -> 64 -> 32 -> 1")
    mlflow.log_param("learning_rate", 5e-4)
    mlflow.log_param("batch_size", 64)
    mlflow.log_param("activation_hidden", "ReLU")

    nn4 = build_nn4(X_train_scaled.shape[1])

    history4 = nn4.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=80,
        batch_size=64,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=10,
                restore_best_weights=True
            )
        ]
    )

    mae_nn4, rmse_nn4, r2_nn4 = evaluate_on_real_scale(nn4, X_test_scaled, y_test, prefix="NN4")

    mlflow.log_metric("MAE_real", mae_nn4)
    mlflow.log_metric("RMSE_real", rmse_nn4)
    mlflow.log_metric("R2_real", r2_nn4)

    mlflow.keras.log_model(nn4, artifact_path="nn4_model")



Epoch 1/80
[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 17ms/step - loss: 26.1896 - mae: 4.8472 - val_loss: 22.9919 - val_mae: 4.4557
Epoch 2/80
[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 19.4585 - mae: 4.1178 - val_loss: 16.5327 - val_mae: 3.8087
Epoch 3/80
[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 13.2066 - mae: 3.4469 - val_loss: 12.1147 - val_mae: 3.1237
Epoch 4/80
[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 10.7984 - mae: 2.8563 - val_loss: 11.7621 - val_mae: 2.4385
Epoch 5/80
[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 12ms/step - loss: 7.1038 - mae: 2.1962 - val_loss: 7.3379 - val_mae: 1.8093
Epoch 6/80
[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 8.2375 - mae: 2.0134 - val_loss: 16.1394 - val_mae: 3.2076
Epoch 7/80
[1m56/56[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step 



[1m35/35[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step




NN4 - MAE: 117.50, RMSE: 351.99, R2: 0.0892




Modelo NN5 – Red ancha con L2 y dropout
En este modelo uso una arquitectura más grande que NN4 (256–128–64) e incluyo:
- Regularización L2 (1e-4) para reducir sobreajuste.
- Dropout del 20%.
- Learning rate de 0.0008.

El objetivo es evaluar si una red más ancha con regularización logra un mejor desempeño que las anteriores.

In [73]:
from tensorflow.keras import regularizers

def build_nn5(n_features: int):
    l2_reg = regularizers.l2(1e-4)

    inp = keras.Input(shape=(n_features,))
    x = layers.Dense(256, activation="relu", kernel_regularizer=l2_reg)(inp)
    x = layers.Dropout(0.2)(x)
    x = layers.Dense(128, activation="relu", kernel_regularizer=l2_reg)(x)
    x = layers.Dense(64, activation="relu", kernel_regularizer=l2_reg)(x)
    out = layers.Dense(1)(x)

    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=8e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn5 = rmse_nn5 = r2_nn5 = None

with mlflow.start_run(run_name="NN5_l2_dropout_arch256"):
    mlflow.log_param("model_type", "NN5_l2_dropout_arch256")
    mlflow.log_param("layers", "256 -> 128 -> 64 -> 1")
    mlflow.log_param("learning_rate", 8e-4)
    mlflow.log_param("dropout", 0.2)
    mlflow.log_param("l2", 1e-4)
    mlflow.log_param("activation_hidden", "ReLU")

    nn5 = build_nn5(X_train_scaled.shape[1])

    history5 = nn5.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=100,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=12,
                restore_best_weights=True
            )
        ]
    )

    mae_nn5, rmse_nn5, r2_nn5 = evaluate_on_real_scale(nn5, X_test_scaled, y_test, prefix="NN5")

    mlflow.log_metric("MAE_real", mae_nn5)
    mlflow.log_metric("RMSE_real", rmse_nn5)
    mlflow.log_metric("R2_real", r2_nn5)

    mlflow.keras.log_model(nn5, artifact_path="nn5_model")



Epoch 1/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 16ms/step - loss: 17.5958 - mae: 3.6192 - val_loss: 6.8638 - val_mae: 1.8706
Epoch 2/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step - loss: 23.3489 - mae: 2.9359 - val_loss: 7.3361 - val_mae: 1.8600
Epoch 3/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step - loss: 21.8408 - mae: 2.8174 - val_loss: 8.5958 - val_mae: 2.1144
Epoch 4/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step - loss: 162.3613 - mae: 6.1742 - val_loss: 7.6571 - val_mae: 2.1233
Epoch 5/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 12ms/step - loss: 11.1026 - mae: 1.9683 - val_loss: 36.5355 - val_mae: 4.3790
Epoch 6/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step - loss: 157.5822 - mae: 5.7675 - val_loss: 2.0374 - val_mae: 1.0426
Epoch 7/100
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m 



[1m35/35[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step




NN5 - MAE: 222.09, RMSE: 2095.75, R2: -31.2892




Modelo NN6 – Red grande con dropout fuerte
En este modelo uso una arquitectura grande (256, 256, 128, 64) con dropout alto (0.5, 0.5 y 0.3) para controlar sobreajuste.  
El learning rate es bajo (0.0003) para hacer el entrenamiento más estable.  
El objetivo es evaluar si una red de alta capacidad mejora el desempeño al comparar con los modelos anteriores.  
Todo el entrenamiento queda registrado en MLflow.

In [74]:
def build_nn6(n_features: int):
    inp = keras.Input(shape=(n_features,))

    x = layers.Dense(256, activation="relu")(inp)
    x = layers.Dropout(0.5)(x)

    x = layers.Dense(256, activation="relu")(x)
    x = layers.Dropout(0.5)(x)

    x = layers.Dense(128, activation="relu")(x)
    x = layers.Dropout(0.3)(x)

    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(1)(x)

    model = keras.Model(inputs=inp, outputs=out)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=3e-4),
        loss="mse",
        metrics=["mae"]
    )
    return model

mae_nn6 = rmse_nn6 = r2_nn6 = None

with mlflow.start_run(run_name="NN6_big_dropout_lowLR"):
    mlflow.log_param("model_type", "NN6_big_dropout_lowLR")
    mlflow.log_param("layers", "256 -> 256 -> 128 -> 64 -> 1")
    mlflow.log_param("learning_rate", 3e-4)
    mlflow.log_param("dropout", "0.5, 0.5, 0.3")
    mlflow.log_param("activation_hidden", "ReLU")

    nn6 = build_nn6(X_train_scaled.shape[1])

    history6 = nn6.fit(
        X_train_scaled, y_train,
        validation_split=0.2,
        epochs=120,
        batch_size=32,
        verbose=1,
        callbacks=[
            keras.callbacks.EarlyStopping(
                patience=15,
                restore_best_weights=True
            ),
            keras.callbacks.ReduceLROnPlateau(
                patience=7,
                factor=0.5
            )
        ]
    )

    mae_nn6, rmse_nn6, r2_nn6 = evaluate_on_real_scale(nn6, X_test_scaled, y_test, prefix="NN6")

    mlflow.log_metric("MAE_real", mae_nn6)
    mlflow.log_metric("RMSE_real", rmse_nn6)
    mlflow.log_metric("R2_real", r2_nn6)

    mlflow.keras.log_model(nn6, artifact_path="nn6_model")



Epoch 1/120
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 18ms/step - loss: 27.0783 - mae: 4.6435 - val_loss: 16.5253 - val_mae: 3.9772 - learning_rate: 3.0000e-04
Epoch 2/120
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 15ms/step - loss: 24.6869 - mae: 3.9700 - val_loss: 13.7711 - val_mae: 3.1682 - learning_rate: 3.0000e-04
Epoch 3/120
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 14ms/step - loss: 107.1667 - mae: 7.0019 - val_loss: 16.6336 - val_mae: 3.7418 - learning_rate: 3.0000e-04
Epoch 4/120
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 14ms/step - loss: 221.4662 - mae: 10.2091 - val_loss: 345.3941 - val_mae: 14.6128 - learning_rate: 3.0000e-04
Epoch 5/120
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 14ms/step - loss: 396.8261 - mae: 12.5767 - val_loss: 2233.7998 - val_mae: 37.7726 - learning_rate: 3.0000e-04
Epoch 6/120
[1m112/112[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1



[1m35/35[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 4ms/step




NN6 - MAE: 220.92, RMSE: 422.69, R2: -0.3135




In [75]:
results_nn_full = pd.DataFrame({
    "Model": [
        "NN1_baseline",
        "NN2_deeper_dropout",
        "NN3_deep_bn_dropout",
        "NNJD_128_bn_dropout",
        "NN4_low_lr_bs64",
        "NN5_l2_dropout_arch256",
        "NN6_big_dropout_lowLR"
    ],
    "MAE": [mae_nn1, mae_nn2, mae_nn3, mae_nnJD, mae_nn4, mae_nn5, mae_nn6],
    "RMSE": [rmse_nn1, rmse_nn2, rmse_nn3, rmse_nnJD, rmse_nn4, rmse_nn5, rmse_nn6],
    "R2": [r2_nn1, r2_nn2, r2_nn3, r2_nn4, r2_nnJD, r2_nn5, r2_nn6]
})

results_nn_full

Unnamed: 0,Model,MAE,RMSE,R2
0,NN1_baseline,110.79283,339.576092,0.152282
1,NN2_deeper_dropout,605.196646,10258.41195,-772.63729
2,NN3_deep_bn_dropout,98.876599,318.99299,0.251935
3,NNJD_128_bn_dropout,99.472153,320.653136,0.08915
4,NN4_low_lr_bs64,117.498517,351.993625,0.244128
5,NN5_l2_dropout_arch256,222.087464,2095.753154,-31.289207
6,NN6_big_dropout_lowLR,220.924395,422.69419,-0.313499
