# Model Training & Tuning with MLflow

Objectives:
- Train multiple models (Linear Regression, Random Forest, Gradient Boosting, SVR, XGBoost)
- Log hyperparameters and metrics to MLflow
- Implement a Champion/Challenger system
- Register the champion model in MLflow

In [1]:
import logging
import warnings
import time
import json
from datetime import datetime

import pandas as pd
import numpy as np
import joblib

warnings.filterwarnings('ignore')

from sklearn.model_selection import cross_val_score
from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error

from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from xgboost import XGBRegressor

import mlflow
import mlflow.sklearn
import mlflow.xgboost  # used in section 7 if XGBoost is selected as champion
from mlflow.models.signature import infer_signature


## 1. Logging Configuration

In [2]:
# Configure logging
logging.basicConfig(
    filename="ml_system.log", 
    encoding="utf-8", 
    filemode="a", 
    level=logging.INFO,
    format="{asctime} - {levelname} - {message}",
    style="{",
    datefmt="%Y-%m-%d %H:%M:%S"
)

logger = logging.getLogger(__name__)
logger.info("="*80)
logger.info("Starting model training process")
logger.info("="*80)

## 2. MLflow Configuration

In [3]:
# Configure the MLflow connection
mlflow.set_tracking_uri("http://127.0.0.1:8080")
experiment_name = "Sales_Forecasting_Model_Selection"
mlflow.set_experiment(experiment_name)

# Look up the experiment
experiment = mlflow.get_experiment_by_name(experiment_name)
experiment_id = experiment.experiment_id

print("✅ MLflow configured")
print(f"   Tracking URI: {mlflow.get_tracking_uri()}")
print(f"   Experiment: {experiment_name}")
print(f"   Experiment ID: {experiment_id}")

✅ MLflow configured
   Tracking URI: http://127.0.0.1:8080
   Experiment: Sales_Forecasting_Model_Selection
   Experiment ID: 507424209113317466


## 3. Model Configuration

We define five candidate models with their hyperparameters. Each `params` dictionary lists the constructor arguments together with the library defaults that matter for the model, so that they can be logged to MLflow.

In [4]:
# Model configurations with metadata
model_configurations = {
    'linear_regression': {
        'model': LinearRegression(),
        'params': {},
        'description': 'Basic linear regression model'
    },
    'random_forest': {
        'model': RandomForestRegressor(n_estimators=200, random_state=2026, n_jobs=-1),
        'params': {
            'n_estimators': 200,
            'random_state': 2026,
            'n_jobs': -1,
            'max_features': 1.0,
            'min_samples_split': 2,
            'min_samples_leaf': 1
        },
        'description': 'Random Forest with 200 trees'
    },
    'gradient_boosting': {
        'model': GradientBoostingRegressor(random_state=2024, n_estimators=100, learning_rate=0.1),
        'params': {
            'random_state': 2024,
            'n_estimators': 100,
            'learning_rate': 0.1,
            'max_depth': 3,
            'min_samples_split': 2,
            'min_samples_leaf': 1
        },
        'description': 'Gradient Boosting with learning rate 0.1'
    },
    'svr': {
        'model': SVR(kernel='rbf', C=10, epsilon=0.1),
        'params': {
            'kernel': 'rbf',
            'C': 10,
            'epsilon': 0.1,
            'gamma': 'scale'
        },
        'description': 'Support Vector Regressor with RBF kernel'
    },
    'xgboost': {
        'model': XGBRegressor(
            n_estimators=200,
            learning_rate=0.05,
            random_state=2026,
            n_jobs=-1
        ),
        'params': {
            'n_estimators': 200,
            'learning_rate': 0.05,
            'random_state': 2026,
            'max_depth': 6,
            'min_child_weight': 1,
            'subsample': 1.0,
            'colsample_bytree': 1.0
        },
        'description': 'XGBoost optimized for forecasting'
    }
}

print(f"✅ Configured {len(model_configurations)} models:")
for name, config in model_configurations.items():
    print(f"   - {name}: {config['description']}")

✅ Configured 5 models:
   - linear_regression: Basic linear regression model
   - random_forest: Random Forest with 200 trees
   - gradient_boosting: Gradient Boosting with learning rate 0.1
   - svr: Support Vector Regressor with RBF kernel
   - xgboost: XGBoost optimized for forecasting
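
The hand-written `params` dictionaries can drift from what the estimator actually uses. As a minimal sketch (the helper name `loggable_params` is hypothetical and not part of the notebook), the logged values could instead be read back from the estimator with scikit-learn's `get_params()`:

```python
from sklearn.ensemble import RandomForestRegressor


def loggable_params(model, keys):
    """Return the estimator's effective values for the parameter names in `keys`."""
    effective = model.get_params()
    return {key: effective[key] for key in keys}


model = RandomForestRegressor(n_estimators=200, random_state=2026, n_jobs=-1)
params = loggable_params(
    model,
    ["n_estimators", "random_state", "n_jobs", "min_samples_split", "min_samples_leaf"],
)
print(params)
```

Because the values come from the estimator itself, a change to the constructor call is picked up automatically the next time the parameters are logged.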


## 4. Data Loading and Preparation

In [5]:
# Load the dataset
dataset = pd.read_csv('../data/raw/stores_sales_forecasting_updated_v3.1.csv',
                      sep=';',
                      encoding='utf-8')

print(f"✅ Dataset loaded: {dataset.shape}")

# Select numeric features (excluding the target)
numeric_columns = dataset.select_dtypes(include=['int64', 'float64', 'int32', 'float32']).columns.tolist()
if 'Sales' in numeric_columns:
    numeric_columns.remove('Sales')

X = dataset[numeric_columns].copy()
y = dataset['Sales'].copy()

print(f"   Features: {X.shape[1]} columns")
print(f"   Rows: {len(X):,}")
print(f"   Target (Sales): min={y.min():.2f}, max={y.max():.2f}, mean={y.mean():.2f}")

✅ Dataset loaded: (2121, 22)
   Features: 5 columns
   Rows: 2,121
   Target (Sales): min=1.89, max=4416.17, mean=349.83
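
The feature selection above lists four dtypes explicitly. A slightly more general variant, sketched here on a toy frame whose column names are made up for illustration, lets pandas select every numeric dtype with `include='number'`:

```python
import pandas as pd

# Toy stand-in for the sales dataset; the column names are illustrative assumptions
toy = pd.DataFrame({
    "Region": ["West", "East", "South"],
    "Quantity": [2, 5, 1],
    "Discount": [0.0, 0.2, 0.1],
    "Sales": [261.96, 731.94, 14.62],
})

# 'number' matches every integer and float dtype, not only the four listed by hand
X_toy = toy.select_dtypes(include="number").drop(columns=["Sales"])
y_toy = toy["Sales"]
print(list(X_toy.columns))
```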


## 5. Model Training with MLflow Tracking

We evaluate each model with 10-fold cross-validation and log:
- **Hyperparameters**: all model parameters
- **Metrics**: RMSE, R², MAE, and the wall-clock time of the cross-validation
- **Tags**: model type, status (challenger), description
- **Artifacts**: additional model information (logged for the champion run in section 7)
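
For reference, the logged metrics follow the standard definitions used by scikit-learn's `neg_root_mean_squared_error`, `neg_mean_absolute_error` and `r2` scorers. The notation below is introduced here for illustration: $K = 10$ folds, $n_k$ validation rows in fold $k$, predictions $\hat{y}_i$, and $\bar{y}_k$ the mean target of the validation rows in fold $k$.

```latex
\mathrm{RMSE}_k = \sqrt{\frac{1}{n_k}\sum_{i \in k}\left(y_i - \hat{y}_i\right)^2},
\qquad
\mathrm{MAE}_k = \frac{1}{n_k}\sum_{i \in k}\left|y_i - \hat{y}_i\right|,
\qquad
R^2_k = 1 - \frac{\sum_{i \in k}\left(y_i - \hat{y}_i\right)^2}{\sum_{i \in k}\left(y_i - \bar{y}_k\right)^2}

\texttt{rmse\_mean} = \frac{1}{K}\sum_{k=1}^{K}\mathrm{RMSE}_k,
\qquad
\texttt{rmse\_std} = \sqrt{\frac{1}{K}\sum_{k=1}^{K}\left(\mathrm{RMSE}_k - \texttt{rmse\_mean}\right)^2}
```

Note that $R^2$ is negative whenever a model predicts worse than the constant $\bar{y}_k$, so a negative cross-validated score is possible and does not indicate a computation error.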

In [6]:
# Dictionaries to store results
results = {}
run_ids = {}

logger.info("Starting model training")
start_time = time.time()

print("\n" + "="*80)
print("TRAINING CANDIDATE MODELS (CHALLENGERS)")
print("="*80 + "\n")

for model_name, config in model_configurations.items():
    print(f"\n🔄 Training: {model_name}")
    print(f"   Description: {config['description']}")

    # Start one MLflow run per candidate model
    with mlflow.start_run(run_name=f"{model_name}_challenger") as run:
        run_id = run.info.run_id
        run_ids[model_name] = run_id

        # Start time
        model_start = time.time()

        # ==============================================
        # 1. LOG HYPERPARAMETERS
        # ==============================================
        print("   📝 Logging hyperparameters...")
        mlflow.log_params(config['params'])

        # Additional experiment parameters
        mlflow.log_param("cv_folds", 10)
        mlflow.log_param("random_state", config['params'].get('random_state', 'N/A'))
        mlflow.log_param("dataset_size", len(X))
        mlflow.log_param("n_features", X.shape[1])

        # ==============================================
        # 2. EVALUATE MODEL WITH CROSS-VALIDATION
        # ==============================================
        print("   🎯 Running cross-validation")
        cv_scores = cross_val_score(
            config['model'],
            X,
            y,
            scoring='neg_root_mean_squared_error',
            cv=10,
            n_jobs=-1
        )

        # Convert the negated scores to positive RMSE values
        rmse_scores = -cv_scores
        rmse_mean = rmse_scores.mean()
        rmse_std = rmse_scores.std()

        # Compute the other metrics (each call repeats the 10-fold fit)
        cv_r2 = cross_val_score(
            config['model'],
            X,
            y,
            scoring='r2',
            cv=10,
            n_jobs=-1
        ).mean()

        cv_mae = -cross_val_score(
            config['model'],
            X,
            y,
            scoring='neg_mean_absolute_error',
            cv=10,
            n_jobs=-1
        ).mean()

        # Elapsed time (covers the three cross-validation passes)
        training_time = time.time() - model_start

        # Store results
        results[model_name] = {
            'rmse_mean': rmse_mean,
            'rmse_std': rmse_std,
            'r2': cv_r2,
            'mae': cv_mae,
            'training_time': training_time,
            'run_id': run_id
        }

        # ==============================================
        # 3. LOG METRICS
        # ==============================================
        print("   📊 Logging metrics...")
        mlflow.log_metric("rmse_mean", rmse_mean)
        mlflow.log_metric("rmse_std", rmse_std)
        mlflow.log_metric("r2_score", cv_r2)
        mlflow.log_metric("mae", cv_mae)
        mlflow.log_metric("training_time_seconds", training_time)

        # Log the per-fold CV metrics
        for fold, score in enumerate(rmse_scores, 1):
            mlflow.log_metric(f"rmse_fold_{fold}", score)

        # ==============================================
        # 4. SET TAGS
        # ==============================================
        mlflow.set_tags({
            "model_type": model_name,
            "model_status": "challenger",
            "description": config['description'],
            "framework": "sklearn" if model_name != 'xgboost' else "xgboost",
            "training_date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            "cv_strategy": "10-fold"
        })

        # ==============================================
        # 5. LOGGING AND PRINT
        # ==============================================
        logger.info(f"{model_name} - RMSE: {rmse_mean:.2f} (+/- {rmse_std:.2f}), R²: {cv_r2:.4f}, MAE: {cv_mae:.2f}, Time: {training_time:.2f}s")

        print("   ✅ Completed:")
        print(f"      RMSE: {rmse_mean:.2f} (+/- {rmse_std:.2f})")
        print(f"      R² Score: {cv_r2:.4f}")
        print(f"      MAE: {cv_mae:.2f}")
        print(f"      Time: {training_time:.2f}s")
        print(f"      Run ID: {run_id[:8]}...")

total_time = time.time() - start_time
logger.info(f"Training completed in {total_time:.2f} seconds")

print(f"\n✅ All models trained in {total_time:.2f} seconds")


TRAINING CANDIDATE MODELS (CHALLENGERS)


🔄 Training: linear_regression
   Description: Basic linear regression model
   📝 Logging hyperparameters...
   🎯 Running cross-validation
   📊 Logging metrics...
   ✅ Completed:
      RMSE: 447.43 (+/- 87.98)
      R² Score: 0.1847
      MAE: 281.10
      Time: 6.65s
      Run ID: b4aabf1f...
🏃 View run linear_regression_challenger at: http://127.0.0.1:8080/#/experiments/507424209113317466/runs/b4aabf1f0a354fcf91988924c77b00c4
🧪 View experiment at: http://127.0.0.1:8080/#/experiments/507424209113317466

🔄 Training: random_forest
   Description: Random Forest with 200 trees
   📝 Logging hyperparameters...
   🎯 Running cross-validation
   📊 Logging metrics...
   ✅ Completed:
      RMSE: 263.03 (+/- 45.73)
      R² Score: 0.7157
      MAE: 122.88
      Time: 4.30s
      Run ID: 7629bf86...
🏃 View run random_forest_challenger at: http://127.0.0.1:8080/#/experiment
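
The training loop runs a separate 10-fold cross-validation for each of the three metrics, so every model is fitted thirty times. A possible alternative, sketched below on synthetic data (the `X_demo`/`y_demo` names and the `summary` dictionary are illustrative, not part of the notebook), is scikit-learn's `cross_validate`, which fits each fold once and scores it with several metrics:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# Synthetic stand-in for the notebook's X and y (assumption for illustration)
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

scoring = {
    "rmse": "neg_root_mean_squared_error",
    "mae": "neg_mean_absolute_error",
    "r2": "r2",
}

# One pass: each fold is fitted once and scored with all three metrics
cv_out = cross_validate(LinearRegression(), X_demo, y_demo, scoring=scoring, cv=10)

rmse_scores = -cv_out["test_rmse"]  # flip the sign: scorers follow "higher is better"
summary = {
    "rmse_mean": rmse_scores.mean(),
    "rmse_std": rmse_scores.std(),
    "mae": (-cv_out["test_mae"]).mean(),
    "r2": cv_out["test_r2"].mean(),
    "fit_time_seconds": cv_out["fit_time"].sum(),
}
print(summary)
```

`cross_validate` also returns per-fold `fit_time`, which would give a timing closer to actual training cost than a wall-clock measurement around three scoring passes.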

## 6. Results Comparison

In [7]:
# Build the results DataFrame
df_results = pd.DataFrame([
    {
        'model': name,
        'rmse_mean': results[name]['rmse_mean'],
        'rmse_std': results[name]['rmse_std'],
        'r2_score': results[name]['r2'],
        'mae': results[name]['mae'],
        'training_time': results[name]['training_time'],
        'run_id': results[name]['run_id']
    }
    for name in results
]).sort_values('rmse_mean')

print("\n" + "="*80)
print("MODEL COMPARISON TABLE (sorted by RMSE)")
print("="*80 + "\n")
print(df_results.to_string(index=False))

# Identify the champion model (lowest cross-validated RMSE)
champion_name = df_results.iloc[0]['model']
champion_rmse = df_results.iloc[0]['rmse_mean']
champion_r2 = df_results.iloc[0]['r2_score']
champion_run_id = df_results.iloc[0]['run_id']

print(f"\n🏆 CHAMPION MODEL: {champion_name.upper()}")
print(f"   RMSE: {champion_rmse:.2f}")
print(f"   R² Score: {champion_r2:.4f}")
print(f"   Run ID: {champion_run_id}")

logger.info(f"Champion model selected: {champion_name} with RMSE={champion_rmse:.2f}")


MODEL COMPARISON TABLE (sorted by RMSE)

            model  rmse_mean  rmse_std  r2_score        mae  training_time                           run_id
    random_forest 263.026876 45.730690  0.715651 122.876938       4.295941 7629bf86392944a59cf8bd6eeb6897a7
gradient_boosting 265.177486 49.736099  0.713161 133.351569       0.823964 98e283270ed641bca3d0c02908770757
          xgboost 281.507644 55.750759  0.675101 133.605623       0.536861 4644167d5911410aa226c832a4e9e811
linear_regression 447.434466 87.983759  0.184742 281.095742       6.654067 b4aabf1f0a354fcf91988924c77b00c4
              svr 524.659319 77.366631 -0.115038 287.190658       0.558163 8b7320fe20504769af044dc7da42cc5c

🏆 CHAMPION MODEL: RANDOM_FOREST
   RMSE: 263.03
   R² Score: 0.7157
   Run ID: 7629bf86392944a59cf8bd6eeb6897a7
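
The notebook selects the champion purely by lowest mean RMSE. Since the top two models differ by about 2 RMSE units while their fold-to-fold standard deviation is above 45, it can be useful to check which challengers are statistically close to the winner. The sketch below applies a "one standard error" heuristic to the numbers in the table; this rule is an addition for illustration and is not the selection criterion used in the notebook.

```python
import math

# (mean RMSE, std of RMSE across folds), copied from the comparison table
cv_results = {
    "random_forest": (263.026876, 45.730690),
    "gradient_boosting": (265.177486, 49.736099),
    "xgboost": (281.507644, 55.750759),
    "linear_regression": (447.434466, 87.983759),
    "svr": (524.659319, 77.366631),
}
N_FOLDS = 10


def within_one_standard_error(results, n_folds):
    """Return the best model, the threshold, and all models within one SE of the best."""
    best = min(results, key=lambda name: results[name][0])
    best_mean, best_std = results[best]
    threshold = best_mean + best_std / math.sqrt(n_folds)
    close = [name for name, (mean, _) in results.items() if mean <= threshold]
    return best, threshold, close


best, threshold, close = within_one_standard_error(cv_results, N_FOLDS)
print(best, round(threshold, 2), close)
# random_forest 277.49 ['random_forest', 'gradient_boosting']
```

Under this heuristic, gradient boosting is indistinguishable from the random forest on RMSE alone, so secondary criteria such as MAE or fit time could reasonably break the tie.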


## 7. Champion Model Registration

We retrain the champion model on the full dataset and register it in MLflow with the status "champion". The `rmse`, `r2_score` and `mae` metrics of this run are computed on the training data itself, so they are optimistic compared with the cross-validated values, which are logged alongside them as `cv_rmse_mean` and `cv_r2_score`.

In [8]:
print("\n" + "="*80)
print("CHAMPION MODEL REGISTRATION")
print("="*80 + "\n")

# Get the champion model's configuration
champion_config = model_configurations[champion_name]

# Create a new run for the champion model
with mlflow.start_run(run_name=f"{champion_name}_champion") as run:
    champion_final_run_id = run.info.run_id

    print(f"🏆 Training champion model: {champion_name}")

    # ==============================================
    # 1. TRAIN ON THE FULL DATASET
    # ==============================================
    champion_model = champion_config['model']
    champion_model.fit(X, y)

    # Predictions and metrics on the training set
    y_pred = champion_model.predict(X)
    final_rmse = np.sqrt(mean_squared_error(y, y_pred))
    final_r2 = r2_score(y, y_pred)
    final_mae = mean_absolute_error(y, y_pred)

    # ==============================================
    # 2. LOG HYPERPARAMETERS
    # ==============================================
    mlflow.log_params(champion_config['params'])
    mlflow.log_param("dataset_size", len(X))
    mlflow.log_param("n_features", X.shape[1])
    mlflow.log_param("training_type", "full_dataset")

    # ==============================================
    # 3. LOG METRICS
    # ==============================================
    mlflow.log_metric("rmse", final_rmse)
    mlflow.log_metric("r2_score", final_r2)
    mlflow.log_metric("mae", final_mae)
    mlflow.log_metric("cv_rmse_mean", champion_rmse)  # CV metric
    mlflow.log_metric("cv_r2_score", champion_r2)     # CV metric

    # ==============================================
    # 4. SET TAGS
    # ==============================================
    mlflow.set_tags({
        "model_type": champion_name,
        "model_status": "champion",
        "description": champion_config['description'],
        "framework": "sklearn" if champion_name != 'xgboost' else "xgboost",
        "training_date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
        "selection_criteria": "lowest_cv_rmse",
        "previous_run_id": champion_run_id
    })

    # ==============================================
    # 5. REGISTER THE MODEL
    # ==============================================
    print("   📦 Registering model in MLflow...")

    # Infer the model signature
    signature = infer_signature(X, y_pred)

    # Log and register the model with the matching flavor
    if champion_name == 'xgboost':
        mlflow.xgboost.log_model(
            champion_model,
            artifact_path="model",
            signature=signature,
            registered_model_name="sales_forecasting_champion"
        )
    else:
        mlflow.sklearn.log_model(
            champion_model,
            artifact_path="model",
            signature=signature,
            registered_model_name="sales_forecasting_champion"
        )

    # ==============================================
    # 6. LOG ADDITIONAL ARTIFACTS
    # ==============================================
    # Save a summary of the results
    results_summary = {
        'champion_model': champion_name,
        'cv_rmse': champion_rmse,
        'cv_r2': champion_r2,
        'final_rmse': final_rmse,
        'final_r2': final_r2,
        'final_mae': final_mae,
        'training_date': datetime.now().isoformat(),
        'hyperparameters': champion_config['params']
    }

    # json is already imported in the first cell
    with open('champion_summary.json', 'w') as f:
        json.dump(results_summary, f, indent=2)
    mlflow.log_artifact('champion_summary.json')

    # Save the comparison table
    df_results.to_csv('../results/models_comparison.csv', index=False)
    mlflow.log_artifact('../results/models_comparison.csv')

    print("\n✅ Champion model registered successfully")
    print(f"   Run ID: {champion_final_run_id}")
    print("   Registered Model: sales_forecasting_champion")
    print(f"   RMSE (training): {final_rmse:.2f}")
    print(f"   R² (training): {final_r2:.4f}")
    print(f"   MAE (training): {final_mae:.2f}")

    logger.info(f"Champion model registered - Run ID: {champion_final_run_id}")
    logger.info(f"Final metrics - RMSE: {final_rmse:.2f}, R²: {final_r2:.4f}")


CHAMPION MODEL REGISTRATION

🏆 Training champion model: random_forest

   📦 Registering model in MLflow...

Registered model 'sales_forecasting_champion' already exists. Creating a new version of this model...
2025/12/19 17:36:19 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: sales_forecasting_champion, version 4
Created version '4' of model 'sales_forecasting_champion'.

✅ Champion model registered successfully
   Run ID: 389473e28e4c4e35820e89d94ddba896
   Registered Model: sales_forecasting_champion
   RMSE (training): 100.45
   R² (training): 0.9601
   MAE (training): 45.49
🏃 View run random_forest_champion at: http://127.0.0.1:8080/#/experiments/507424209113317466/runs/389473e28e4c4e35820e89d94ddba896
🧪 View experiment at: http://127.0.0.1:8080/#/experiments/507424209113317466
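
Once a version is registered, it can be loaded back by name and version through a `models:/<name>/<version>` URI. The sketch below assumes the same tracking server is reachable; the helper names `registered_model_uri` and `load_champion` are hypothetical, and version 4 is simply the one reported in the output above.

```python
def registered_model_uri(name: str, version: int) -> str:
    """Build an MLflow Model Registry URI of the form models:/<name>/<version>."""
    return f"models:/{name}/{version}"


def load_champion(version: int, tracking_uri: str = "http://127.0.0.1:8080"):
    """Load a registered version as a generic pyfunc model (needs a running server)."""
    # Imported here so the URI helper above works even without MLflow installed
    import mlflow
    import mlflow.pyfunc

    mlflow.set_tracking_uri(tracking_uri)
    return mlflow.pyfunc.load_model(
        registered_model_uri("sales_forecasting_champion", version)
    )


uri = registered_model_uri("sales_forecasting_champion", 4)
print(uri)  # models:/sales_forecasting_champion/4
```

With the server running, `load_champion(4).predict(X.head())` should return predictions for the first rows, provided the input columns match the logged signature.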


## 8. Pipeline Integration

We append the champion model to the preprocessing pipeline, retrain the combined pipeline on the first 80% of the rows, and evaluate it on the remaining 20%.

In [9]:
print("\n" + "="*80)
print("INTEGRATION WITH PREPROCESSING PIPELINE")
print("="*80 + "\n")

# Load the preprocessing pipeline
pipeline = joblib.load('../models/stores_sales_forecasting_data_pre_proc.pkl')
print("✅ Preprocessing pipeline loaded")

# Append the champion model to the pipeline
pipeline.steps.append((champion_name, champion_model))
print(f"✅ Model {champion_name} added to the pipeline")

# Prepare data for retraining
data_train = pd.read_csv('../data/raw/stores_sales_forecasting_updated_v3.1.csv',
                         sep=';',
                         encoding='utf-8')

X_full = data_train.drop(['Sales'], axis=1)
y_full = data_train['Sales']

# Temporal split (80/20) WITHOUT shuffling, to preserve the row order
# (this assumes the rows are already in chronological order)
split_index = int(len(data_train) * 0.8)

X_train = X_full.iloc[:split_index].copy()
X_test = X_full.iloc[split_index:].copy()
y_train = y_full.iloc[:split_index].copy()
y_test = y_full.iloc[split_index:].copy()

print("\n📊 Data split:")
print(f"   Train: {len(X_train):,} rows ({len(X_train)/len(X_full)*100:.1f}%)")
print(f"   Test: {len(X_test):,} rows ({len(X_test)/len(X_full)*100:.1f}%)")

# Train the full pipeline
print("\n🔄 Training full pipeline...")
pipeline.fit(X_train, y_train)

# Evaluate on the test set
y_pred_test = pipeline.predict(X_test)
test_rmse = np.sqrt(mean_squared_error(y_test, y_pred_test))
test_r2 = r2_score(y_test, y_pred_test)
test_mae = mean_absolute_error(y_test, y_pred_test)

print("\n📊 Test set metrics:")
print(f"   RMSE: {test_rmse:.2f}")
print(f"   R² Score: {test_r2:.4f}")
print(f"   MAE: {test_mae:.2f}")

# Save the full pipeline
pipeline_path = '../models/stores_sales_forecasting_pipeline.pkl'
joblib.dump(pipeline, pipeline_path)
print(f"\n✅ Full pipeline saved to: {pipeline_path}")

logger.info(f"Full pipeline saved with champion model: {champion_name}")
logger.info(f"Test metrics - RMSE: {test_rmse:.2f}, R²: {test_r2:.4f}")


INTEGRATION WITH PREPROCESSING PIPELINE

✅ Preprocessing pipeline loaded
✅ Model random_forest added to the pipeline

📊 Data split:
   Train: 1,696 rows (80.0%)
   Test: 425 rows (20.0%)

🔄 Training full pipeline...

📊 Test set metrics:
   RMSE: 466.28
   R² Score: 0.3101
   MAE: 225.26

✅ Full pipeline saved to: ../models/stores_sales_forecasting_pipeline.pkl
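
The hold-out RMSE (466.28) is much higher than the cross-validated RMSE (263.03). One possible contributor is that `cv=10` uses an unshuffled `KFold` for regressors, so most folds train on rows that come after the validation rows, whereas the hold-out only trains on earlier rows; the two evaluations also feed the model different inputs, so this is only a partial explanation. A minimal sketch of a validation scheme that respects row order, using scikit-learn's `TimeSeriesSplit` on a stand-in index array (an assumption for illustration):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

n_rows = 2121  # dataset size reported earlier in the notebook
row_index = np.arange(n_rows).reshape(-1, 1)  # stand-in for chronologically ordered features

tscv = TimeSeriesSplit(n_splits=5)
folds = list(tscv.split(row_index))

for fold, (train_idx, test_idx) in enumerate(folds, 1):
    # Every training row precedes every validation row, as in the 80/20 hold-out
    print(f"fold {fold}: train rows 0-{train_idx[-1]}, validation rows {test_idx[0]}-{test_idx[-1]}")
```

Passing `cv=TimeSeriesSplit(n_splits=5)` to `cross_val_score` in place of `cv=10` would make the model comparison follow the same past-to-future logic as the final test.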


## 9. Final Summary

In [10]:
print("\n" + "="*100)
print(" " * 35 + "FINAL SUMMARY")
print("="*100 + "\n")

print("📊 MODELS EVALUATED:")
for _, row in df_results.iterrows():
    status = "🏆 CHAMPION" if row['model'] == champion_name else "🔵 Challenger"
    print(f"   {status} {row['model']:20s} - RMSE: {row['rmse_mean']:7.2f} | R²: {row['r2_score']:.4f}")

print(f"\n🏆 SELECTED MODEL: {champion_name.upper()}")
print(f"   Validation RMSE: {champion_rmse:.2f}")
print(f"   Test Set RMSE: {test_rmse:.2f}")
print(f"   Test Set R²: {test_r2:.4f}")

print("\n📦 ARTIFACTS GENERATED:")
print(f"   ✓ Full pipeline: {pipeline_path}")
print("   ✓ Model registered in MLflow: sales_forecasting_champion")
print("   ✓ Model comparison: models_comparison.csv")
print("   ✓ Champion summary: champion_summary.json")

print("\n🔗 MLFLOW TRACKING:")
print(f"   Tracking URI: {mlflow.get_tracking_uri()}")
print(f"   Experiment: {experiment_name}")
print(f"   Total runs: {len(results) + 1}")
print(f"   Champion Run ID: {champion_final_run_id}")

print("\n" + "="*100)
print("✅ PROCESS COMPLETED SUCCESSFULLY")
print("="*100 + "\n")

logger.info("="*80)
logger.info("Training process completed successfully")
logger.info(f"Champion model: {champion_name}")
logger.info(f"Test RMSE: {test_rmse:.2f}")
logger.info("="*80)


                                   FINAL SUMMARY

📊 MODELS EVALUATED:
   🏆 CHAMPION random_forest        - RMSE:  263.03 | R²: 0.7157
   🔵 Challenger gradient_boosting    - RMSE:  265.18 | R²: 0.7132
   🔵 Challenger xgboost              - RMSE:  281.51 | R²: 0.6751
   🔵 Challenger linear_regression    - RMSE:  447.43 | R²: 0.1847
   🔵 Challenger svr                  - RMSE:  524.66 | R²: -0.1150

🏆 SELECTED MODEL: RANDOM_FOREST
   Validation RMSE: 263.03
   Test Set RMSE: 466.28
   Test Set R²: 0.3101

📦 ARTIFACTS GENERATED:
   ✓ Full pipeline: ../models/stores_sales_forecasting_pipeline.pkl
   ✓ Model registered in MLflow: sales_forecasting_champion
   ✓ Model comparison: models_comparison.csv
   ✓ Champion summary: champion_summary.json

🔗 MLFLOW TRACKING:
   Tracking URI: http://127.0.0.1:8080
   Experiment: Sales_Forecasting_Model_Selection
   Total runs: 6
   Champion Run ID: 389473e28e4c4e35820e89d94ddba896

✅