# Predictive Models: Municipal Rentas Cedidas
# SARIMAX, Prophet, XGBoost, and Deep Learning

---

## 📋 Degree Project
**Title**: Predicting the Behavior of Rentas Cedidas (Ceded Revenues) in Financing the Subsidized Health Regime at the Municipal Level Using Machine Learning Models

**Authors**:
- Mauricio García Mojica
- Efrén Bohorquez Vargas
- Ernesto Sánchez García

**Objective**: Develop predictive models to estimate Rentas Cedidas revenue (2020-2024) and strengthen the territorial financial planning of ADRES.

### Implemented Models:
1. 📊 **SARIMAX**: Baseline econometric model (benchmark)
2. 🔮 **Prophet**: Trend and seasonality decomposition with regressors
3. 🌳 **XGBoost**: Gradient boosting with engineered features
4. 🧠 **LSTM**: Recurrent neural network (deep learning)
5. 🎯 **Ensemble**: Weighted combination of models

---

**Date**: February 2026  
**Version**: 2.0 (aligned with the degree registration)

## 1. Environment Setup

In [None]:
# General imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
from datetime import datetime

# Econometric models
from statsmodels.tsa.statespace.sarimax import SARIMAX
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

# ML models and Prophet
from prophet import Prophet
from neuralprophet import NeuralProphet
import xgboost as xgb

# Deep learning
import torch
import torch.nn as nn

# Optimization and validation
import optuna
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_error, mean_squared_error, mean_absolute_percentage_error

warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-darkgrid')
%matplotlib inline

print("✅ Environment configured for SARIMAX, Prophet, XGBoost, and LSTM models")

## 2. Data Loading (2020-2024)

In [None]:
# Load the prepared datasets
print("📁 Loading historical data (2020-2024)...")

# TODO: Load the real files
# train_df = pd.read_parquet('../data/processed/train_mensual.parquet')
# test_df = pd.read_parquet('../data/processed/test_mensual.parquet')

print("⏳ Pending execution with cleaned data")

---
# PREDICTIVE MODELING
---

## 3. Baseline Model: SARIMAX (Benchmark)

In [None]:
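def entrenar_sarimax(train_series, order=(1,1,1), seasonal_order=(1,1,1,12), exog=None):
    """
    Fit a SARIMAX model (Seasonal AutoRegressive Integrated Moving Average
    with eXogenous regressors). Returns the fitted results, or None on failure.
    """
    print("📊 Training SARIMAX...")
    
    try:
        model = SARIMAX(
            train_series,
            order=order,
            seasonal_order=seasonal_order,  # period 12 = monthly data with a yearly cycle
            exog=exog,
            enforce_stationarity=False,
            enforce_invertibility=False
        )
        
        results = model.fit(disp=False)
        # mle_retvals holds optimizer diagnostics; report non-convergence instead of assuming success
        converged = (getattr(results, "mle_retvals", None) or {}).get("converged", True)
        print("✅ SARIMAX converged" if converged else "⚠️ SARIMAX fitted, but the optimizer did not converge")
        print(results.summary().tables[1])
        return results
        
    except Exception as e:
        print(f"❌ SARIMAX error: {e}")
        return None

print("✅ SARIMAX function defined")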

## 4. Prophet Model (Meta)

In [None]:
def entrenar_prophet(train_df, features_exogenas):
    """
    Fit Prophet with extra regressors.
    train_df must contain the columns 'ds' (date), 'y' (target), and
    one column per name in features_exogenas.
    """
    print("🔮 Training Prophet...")
    
    model = Prophet(
        yearly_seasonality=True,
        weekly_seasonality=False,
        daily_seasonality=False,
        seasonality_mode='multiplicative'
    )
    
    # Regressors must be registered before fitting
    for regressor in features_exogenas:
        model.add_regressor(regressor)
    
    model.fit(train_df)
    print("✅ Prophet fitted")
    return model

print("✅ Prophet function defined")

## 5. XGBoost Model (Machine Learning)

In [None]:
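def entrenar_xgboost(X_train, y_train, X_val=None, y_val=None):
    """
    Fit an XGBoost regressor. Early stopping is enabled only when a
    validation set (X_val, y_val) is provided.
    """
    print("🌳 Training XGBoost...")
    
    has_val = X_val is not None and y_val is not None
    model = xgb.XGBRegressor(
        objective='reg:squarederror',
        n_estimators=500,
        learning_rate=0.05,
        max_depth=6,
        subsample=0.8,
        colsample_bytree=0.8,
        # Early stopping needs an eval_set; without one, fit() raises an error
        early_stopping_rounds=50 if has_val else None
    )
    
    eval_set = [(X_val, y_val)] if has_val else None
    model.fit(X_train, y_train, eval_set=eval_set, verbose=False)
    print("✅ XGBoost fitted")
    return model

print("✅ XGBoost function defined")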
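
The feature engineering mentioned for XGBoost is not spelled out in this notebook. One common option for monthly series is to use lagged values of the target as inputs; the sketch below assumes that approach. The helper name `build_lag_features` and the lag set (1, 2, 3, 12) are hypothetical, and the series is synthetic.

```python
import numpy as np

def build_lag_features(y, lags=(1, 2, 3, 12)):
    """Return (X, target): row i of X holds the values `lag` steps before target[i]."""
    y = np.asarray(y, dtype=float)
    max_lag = max(lags)
    if len(y) <= max_lag:
        raise ValueError("series is shorter than the largest lag")
    # For each lag k, take y[t - k] for t = max_lag .. len(y) - 1
    X = np.column_stack([y[max_lag - k : len(y) - k] for k in lags])
    target = y[max_lag:]
    return X, target

# Synthetic 24-month series, for illustration only (not project data)
serie = np.arange(100.0, 124.0)
X_demo, y_demo = build_lag_features(serie)
print(X_demo.shape, y_demo.shape)
```

The first 12 observations are consumed by the 12-month lag, so a 60-month history would leave 48 usable rows for `entrenar_xgboost`.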

## 6. LSTM Model (Deep Learning)

In [None]:
class LSTMNet(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim=1, num_layers=2):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.num_layers = num_layers
        
        # Inter-layer dropout only applies when there is more than one LSTM layer
        dropout = 0.2 if num_layers > 1 else 0.0
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True, dropout=dropout)
        self.fc = nn.Linear(hidden_dim, output_dim)
    
    def forward(self, x):
        # x: (batch, seq_len, input_dim); hidden and cell states start at zero
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim, device=x.device, dtype=x.dtype)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim, device=x.device, dtype=x.dtype)
        
        out, _ = self.lstm(x, (h0, c0))
        # Use the output of the last time step for the forecast
        out = self.fc(out[:, -1, :])
        return out

print("🧠 LSTM architecture defined")
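
`LSTMNet` uses `batch_first=True`, so it expects input shaped `(batch, seq_len, input_dim)`. The notebook does not show how the monthly series is turned into such sequences; the sketch below assumes a simple sliding window with one-step-ahead targets. The helper name `make_windows` and the 12-month window are hypothetical, and the series is synthetic.

```python
import numpy as np

def make_windows(y, seq_len=12):
    """Split a 1-D series into (n_samples, seq_len, 1) inputs and next-step targets."""
    y = np.asarray(y, dtype=np.float32)
    n = len(y) - seq_len
    if n <= 0:
        raise ValueError("series must be longer than seq_len")
    # Window i covers y[i : i + seq_len]; its target is y[i + seq_len]
    X = np.stack([y[i : i + seq_len] for i in range(n)])[..., np.newaxis]
    targets = y[seq_len:]
    return X, targets

# Synthetic 60-month series, for illustration only (not project data)
serie = np.arange(60, dtype=np.float32)
X_seq, y_seq = make_windows(serie, seq_len=12)
print(X_seq.shape, y_seq.shape)
```

The resulting array can be converted with `torch.from_numpy(X_seq)` and passed to `LSTMNet(input_dim=1, hidden_dim=...)`. In practice the series would usually be scaled before windowing.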

## 7. Comparison and Ensemble

In [None]:
def evaluar_modelos(predicciones_dict, y_true):
    """
    Compare metrics (RMSE, MAE, MAPE) across all models.
    Note: scikit-learn returns MAPE as a fraction (0.05 = 5%).
    """
    resultados = []
    
    for nombre, y_pred in predicciones_dict.items():
        mape = mean_absolute_percentage_error(y_true, y_pred)
        mae = mean_absolute_error(y_true, y_pred)
        rmse = np.sqrt(mean_squared_error(y_true, y_pred))
        
        resultados.append({
            'Modelo': nombre,
            'MAPE': mape,
            'MAE': mae,
            'RMSE': rmse
        })
    
    return pd.DataFrame(resultados).sort_values('MAPE')

print("✅ Evaluation function defined")
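
The weighted ensemble listed among the models has no implementation in this notebook, and the weighting scheme is not specified. A minimal sketch, assuming weights proportional to the inverse of each model's validation error (strictly positive), could look like this. The function name `ensemble_ponderado`, the argument `errores_val`, and the demo numbers are hypothetical.

```python
import numpy as np

def ensemble_ponderado(predicciones_dict, errores_val):
    """Weighted average of forecasts, with weights proportional to 1 / validation error."""
    nombres = list(predicciones_dict)
    inv = np.array([1.0 / errores_val[n] for n in nombres])
    pesos = inv / inv.sum()  # normalize so the weights sum to 1
    preds = np.vstack([np.asarray(predicciones_dict[n], dtype=float) for n in nombres])
    return pesos @ preds, dict(zip(nombres, pesos))

# Made-up forecasts and validation MAPEs, for illustration only
preds_demo = {"A": [100.0, 110.0], "B": [120.0, 130.0]}
mape_val_demo = {"A": 0.05, "B": 0.15}
combinado, pesos = ensemble_ponderado(preds_demo, mape_val_demo)
print(combinado, pesos)
```

Computing the weights on a validation window that is separate from the test set keeps the final comparison in `evaluar_modelos` free of leakage.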

## 8. Execution and Results

In [None]:
print("="*80)
print("PREDICTIVE PIPELINE EXECUTION")
print("="*80)

print("\n1. Train SARIMAX (benchmark)")
print("2. Train Prophet and XGBoost (ML)")
print("3. Train LSTM (deep learning)")
print("4. Build the weighted ensemble")
print("5. Compare MAPE and select the best model")

print("\n⚠️ IMPORTANT: Adjust the cut-off dates to the available data (2020-2024)")
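
The cut-off date that separates training from test data is left open above. As a rough sketch of a chronological split, assuming monthly data from 2020-01 through 2024-12 and a hypothetical cut-off of 2024-01 (the last 12 months held out), one option is shown below. The helper name `split_por_fecha` and the values are illustrative, not project data.

```python
import numpy as np

# Monthly timestamps from 2020-01 through 2024-12 (60 months)
meses = np.arange("2020-01", "2025-01", dtype="datetime64[M]")
valores = np.linspace(100.0, 159.0, meses.size)  # synthetic values, illustration only

def split_por_fecha(fechas, y, corte):
    """Chronological split: observations before `corte` train, the rest test."""
    mask = fechas < np.datetime64(corte, "M")
    return (fechas[mask], y[mask]), (fechas[~mask], y[~mask])

(train_f, train_y), (test_f, test_y) = split_por_fecha(meses, valores, "2024-01")
print(train_y.size, test_y.size)
```

Splitting by date rather than at random keeps future observations out of the training set, which matters for every model in this pipeline.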