<img src="https://iteso.mx/documents/27014/202031/Logo-ITESO-MinimoH.png"
align="right"
width="300"/>

# SARIMA Models


## **<font color= #0077b6> Practice Objectives </font>**
1.  Use data from the official MLB API.
2.  Build a daily series of injuries recorded across the league (2015-2025).
3.  Assess the stationarity of the series (Augmented Dickey-Fuller test).
4.  Apply the Box-Jenkins methodology (ACF/PACF) to select a SARIMAX model.
5.  Evaluate the forecast with error metrics (RMSE and MAE).
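
For reference, the model class named in objective 4 can be written compactly with the backshift operator $B$ (where $B y_t = y_{t-1}$). The notation is the usual textbook form, not something taken from this notebook: $x_t$ stands for an exogenous regressor, $\beta$ for its coefficient, and $u_t$ for the regression error that carries the SARIMA structure.

```latex
y_t = \beta x_t + u_t, \qquad
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,u_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
```

Here $\phi_p$ and $\theta_q$ are the non-seasonal AR and MA polynomials of orders $p$ and $q$, $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts, $d$ and $D$ are the regular and seasonal differencing orders, $s$ is the seasonal period, and $\varepsilon_t$ is white noise. For daily data with a weekly pattern, $s = 7$.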


In [1]:
# @title Library Installation and Setup

!pip install plotly statsmodels scikit-learn pandas numpy --quiet

import re
import json
import time
from urllib.parse import urlencode
from urllib.request import urlopen

import pandas as pd
import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from statsmodels.tsa.statespace.sarimax import SARIMAX
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error
from statsmodels.tsa.stattools import adfuller, acf, pacf


In [2]:
# (Cell reserved for notes / quick tests if needed)


## **<font color= #0077b6> Data Description </font>**
- The data come from the `transactions` endpoint of the official MLB API with `sportId=1`.
- The target variable is the number of **injuries recorded per day** in MLB.
- A transaction counts as a recorded injury when it indicates a placement on the **injured list** or the **disabled list**.
- Transfers between lists (e.g., 15-day to 60-day) are not counted as a new injury.
- Only transactions linked to MLB teams (30 franchises) are kept.
- The series is trimmed to **regular-season days** (the offseason is excluded) so that long runs of zeros do not bias the model.
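
The placement-versus-transfer rule in the list above can be tried out in isolation. The snippet below is a minimal sketch: the patterns are simplified stand-ins for the notebook's regular expressions, and the team and player names in the sample descriptions are hypothetical, written only to exercise each rule.

```python
import re

# Simplified patterns in the spirit of the rules above (illustrative only)
PLACEMENT_RE = re.compile(r"\bplaced\b.*?\b(?:injured|disabled) list\b", re.IGNORECASE)
TRANSFER_RE = re.compile(
    r"\btransferred\b.*?\b(?:injured|disabled) list\b.*?\bto the\b.*?\b(?:injured|disabled) list\b",
    re.IGNORECASE,
)


def is_new_injury(description):
    """Return 1 for an IL/DL placement that is not a list-to-list transfer, else 0."""
    text = (description or "").strip()
    return int(bool(PLACEMENT_RE.search(text)) and not TRANSFER_RE.search(text))


# Hypothetical descriptions, one per rule
samples = [
    "Team A placed RHP John Doe on the 15-day injured list. Right elbow strain.",
    "Team B placed C Jim Roe on the 10-day disabled list.",
    "Team A transferred RHP John Doe from the 15-day injured list to the 60-day injured list.",
    "Team C activated LF Sam Poe from the 10-day injured list.",
    None,  # missing description
]
flags = [is_new_injury(s) for s in samples]
print(flags)  # [1, 1, 0, 0, 0]
```

Only the two placements are counted; the transfer, the activation, and the missing description all map to 0.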


In [3]:
# Fetch MLB transaction data and build the daily injury series (2015-2025)

BASE_URL = "https://statsapi.mlb.com/api/v1/transactions"
SCHEDULE_URL = "https://statsapi.mlb.com/api/v1/schedule"
START_YEAR = 2015
END_YEAR = 2025

# Official IDs of the 30 MLB franchises
MLB_TEAM_IDS = {
    108,109,110,111,112,113,114,115,116,117,
    118,119,120,121,133,134,135,136,137,138,
    139,140,141,142,143,144,145,146,147,158
}

IL_PLACEMENT_RE = re.compile(r"\bplaced\b.*?\bon the\b.*?\b(?:injured|disabled) list\b", re.IGNORECASE)
IL_TRANSFER_RE = re.compile(r"\btransferred\b.*?\b(?:injured|disabled) list\b.*?\bto the\b.*?\b(?:injured|disabled) list\b", re.IGNORECASE)
IL_60DAY_RE = re.compile(r"\b60-day (?:injured|disabled) list\b", re.IGNORECASE)


def fetch_json(url):
    with urlopen(url, timeout=120) as resp:
        return json.load(resp)


def fetch_transactions_year(year, sport_id=1):
    params = {
        "startDate": f"{year}-01-01",
        "endDate": f"{year}-12-31",
        "sportId": sport_id,
    }
    url = f"{BASE_URL}?{urlencode(params)}"
    payload = fetch_json(url)
    time.sleep(0.2)
    return payload.get("transactions", [])


def fetch_regular_season_window(year):
    params = {
        "sportId": 1,
        "season": year,
        "gameType": "R",
    }
    url = f"{SCHEDULE_URL}?{urlencode(params)}"
    payload = fetch_json(url)
    dates = payload.get("dates", [])
    if not dates:
        return None, None
    season_days = pd.to_datetime([d["date"] for d in dates]).sort_values()
    return season_days.min(), season_days.max()


def is_new_injury_registration(description):
    text = (description or "").strip()
    is_placement = bool(IL_PLACEMENT_RE.search(text))
    is_transfer = bool(IL_TRANSFER_RE.search(text))
    return int(is_placement and not is_transfer)


def is_mlb_transaction(from_team_id, to_team_id):
    return int((from_team_id in MLB_TEAM_IDS) or (to_team_id in MLB_TEAM_IDS))


# 1) Download the transactions and compute the injury flag
rows = []
for year in range(START_YEAR, END_YEAR + 1):
    txs = fetch_transactions_year(year, sport_id=1)
    print(f"{year}: {len(txs)} transactions")
    for tx in txs:
        event_date = tx.get("date") or tx.get("effectiveDate") or tx.get("resolutionDate")
        desc = tx.get("description") or ""
        from_team_id = (tx.get("fromTeam") or {}).get("id")
        to_team_id = (tx.get("toTeam") or {}).get("id")
        rows.append({
            "transaction_id": tx.get("id"),
            "date": event_date,
            "type_code": tx.get("typeCode"),
            "type_desc": tx.get("typeDesc"),
            "description": desc,
            "person_id": (tx.get("person") or {}).get("id"),
            "person_name": (tx.get("person") or {}).get("fullName"),
            "from_team_id": from_team_id,
            "from_team_name": (tx.get("fromTeam") or {}).get("name"),
            "to_team_id": to_team_id,
            "to_team_name": (tx.get("toTeam") or {}).get("name"),
            "is_mlb_team_tx": is_mlb_transaction(from_team_id, to_team_id),
            "injury_registration": is_new_injury_registration(desc),
            "is_60day": int(bool(IL_60DAY_RE.search(desc)))
        })

# Full DataFrame
df_transactions = pd.DataFrame(rows)
df_transactions["date"] = pd.to_datetime(df_transactions["date"], errors="coerce")
df_transactions = df_transactions.dropna(subset=["date"]).sort_values(["date", "transaction_id"]).reset_index(drop=True)

# Keep only transactions linked to MLB teams
df_transactions = df_transactions[df_transactions["is_mlb_team_tx"] == 1].copy()

# Recorded injuries only (IL / DL placements, excluding transfers)
df_injuries = df_transactions[df_transactions["injury_registration"] == 1].copy()

# 2) Get the exact regular-season window for each year
season_windows = {}
season_days = []
for year in range(START_YEAR, END_YEAR + 1):
    start_season, end_season = fetch_regular_season_window(year)
    if start_season is None:
        # The schedule endpoint returned no dates; fail early with a clear message
        raise ValueError(f"No regular-season schedule found for {year}")
    season_windows[year] = (start_season, end_season)
    print(f"Regular season {year}: {start_season.date()} -> {end_season.date()}")
    season_days.extend(pd.date_range(start_season, end_season, freq="D"))

season_index = pd.DatetimeIndex(season_days, name="date")

# 3) Daily series trimmed to the regular season only
raw_daily_injuries = df_injuries.groupby("date")["injury_registration"].sum().sort_index()
ts_mlb = raw_daily_injuries.reindex(season_index, fill_value=0).astype(int)

# Exogenous variable: dummy for the first week of the season (absorbs the opening spike)
opening_week_dummy = pd.Series(0, index=season_index, dtype=int, name='opening_week_dummy')
for year, (start_season, end_season) in season_windows.items():
    first_window_end = min(start_season + pd.Timedelta(days=6), end_season)
    opening_week_dummy.loc[(opening_week_dummy.index >= start_season) & (opening_week_dummy.index <= first_window_end)] = 1

# 4) Quick validation of the season-opening spike
# Check whether the spike is explained by 60-day IL placements (common at the start of the season)
peak_rows = []
for year, (start_season, end_season) in season_windows.items():
    first_window_end = min(start_season + pd.Timedelta(days=6), end_season)
    m_first = (df_injuries["date"] >= start_season) & (df_injuries["date"] <= first_window_end)
    m_rest = (df_injuries["date"] > first_window_end) & (df_injuries["date"] <= end_season)

    first_count = int(df_injuries.loc[m_first, "injury_registration"].sum())
    rest_count = int(df_injuries.loc[m_rest, "injury_registration"].sum())
    first_60 = int(df_injuries.loc[m_first, "is_60day"].sum())

    peak_rows.append({
        "year": year,
        "season_start": start_season.date(),
        "season_end": end_season.date(),
        "injuries_first_7_days": first_count,
        "injuries_rest_of_season": rest_count,
        "share_60day_first_7_days": (first_60 / first_count) if first_count > 0 else np.nan,
    })

df_peak_check = pd.DataFrame(peak_rows)

print("\nSummary:")
print("MLB transactions (filtered to MLB teams):", len(df_transactions))
print("Recorded MLB injuries (total, 2015-2025):", int(df_injuries["injury_registration"].sum()))
print("Series length (regular season only):", len(ts_mlb), "days")
print("Average injuries/day in season:", round(ts_mlb.mean(), 3))
print("Days flagged with opening_week_dummy=1:", int(opening_week_dummy.sum()))

print("\nOpening spike check (first 7 days of the season):")
display(df_peak_check)
print("Mean share of 60-day placements in the first 7 days:", round(df_peak_check['share_60day_first_7_days'].dropna().mean(), 3))

print("\nSample transactions from the spike (most recent season opener):")
latest_year = END_YEAR
start_latest, _ = season_windows[latest_year]
m_latest = (df_injuries['date'] >= start_latest) & (df_injuries['date'] <= start_latest + pd.Timedelta(days=6))
display(df_injuries.loc[m_latest, ['date','person_name','description']].head(10))


2015: 11520 transactions
2016: 11963 transactions
2017: 12242 transactions
2018: 12692 transactions
2019: 13554 transactions
2020: 9741 transactions
2021: 17035 transactions
2022: 20413 transactions
2023: 18791 transactions
2024: 14866 transactions
2025: 17401 transactions
Regular season 2015: 2015-04-05 -> 2015-10-04
Regular season 2016: 2016-04-03 -> 2016-10-03
Regular season 2017: 2017-04-02 -> 2017-10-01
Regular season 2018: 2018-03-29 -> 2018-10-01
Regular season 2019: 2019-03-20 -> 2019-09-29
Regular season 2020: 2020-07-23 -> 2020-09-27
Regular season 2021: 2021-04-01 -> 2021-10-03
Regular season 2022: 2022-04-07 -> 2022-10-05
Regular season 2023: 2023-03-30 -> 2023-10-02
Regular season 2024: 2024-03-20 -> 2024-09-30
Regular season 2025: 2025-03-18 -> 2025-09-28

Summary:
MLB transactions (filtered to MLB teams): 159729
Recorded MLB injuries (total, 2015-2025): 8879
Series length (regular season only): 1943 days
Average injuries/day in season: [output truncated]

| | year | season_start | season_end | injuries_first_7_days | injuries_rest_of_season | share_60day_first_7_days |
|---|---|---|---|---|---|---|
| 0 | 2015 | 2015-04-05 | 2015-10-04 | 84 | 392 | 0.119048 |
| 1 | 2016 | 2016-04-03 | 2016-10-03 | 66 | 437 | 0.060606 |
| 2 | 2017 | 2017-04-02 | 2017-10-01 | 65 | 569 | 0.107692 |
| 3 | 2018 | 2018-03-29 | 2018-10-01 | 84 | 593 | 0.059524 |
| 4 | 2019 | 2019-03-20 | 2019-09-29 | 15 | 679 | 0.333333 |
| 5 | 2020 | 2020-07-23 | 2020-09-27 | 53 | 370 | 0.132075 |
| 6 | 2021 | 2021-04-01 | 2021-10-03 | 105 | 1322 | 0.0 |
| 7 | 2022 | 2022-04-07 | 2022-10-05 | 50 | 884 | 0.24 |
| 8 | 2023 | 2023-03-30 | 2023-10-02 | 122 | 722 | 0.204918 |
| 9 | 2024 | 2024-03-20 | 2024-09-30 | 17 | 765 | 0.117647 |


Mean share of 60-day placements in the first 7 days: 0.168

Sample transactions from the spike (most recent season opener):


| | date | person_name | description |
|---|---|---|---|
| 146953 | 2025-03-18 | Clayton Kershaw | Los Angeles Dodgers placed LHP Clayton Kershaw... |
| 146961 | 2025-03-18 | Jon Gray | Texas Rangers placed RHP Jon Gray on the 60-da... |
| 147054 | 2025-03-20 | Joe Musgrove | San Diego Padres placed RHP Joe Musgrove on th... |
| 147130 | 2025-03-21 | Kyle Bradish | Baltimore Orioles placed RHP Kyle Bradish on t... |
| 147170 | 2025-03-22 | Gerrit Cole | New York Yankees placed RHP Gerrit Cole on the... |
| 147228 | 2025-03-23 | Prelander Berroa | Chicago White Sox placed RHP Prelander Berroa ... |
| 147238 | 2025-03-23 | Blake Walston | Arizona Diamondbacks placed LHP Blake Walston ... |
| 147251 | 2025-03-23 | Tyler Wells | Baltimore Orioles placed RHP Tyler Wells on th... |
| 147280 | 2025-03-24 | Luis Gil | New York Yankees placed RHP Luis Gil on the 60... |
| 147296 | 2025-03-24 | Zack Thompson | St. Louis Cardinals placed LHP Zack Thompson o... |
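
The two calendar steps in the cell above (expanding the season windows into an in-season day index, zero-filling the daily counts onto it, and flagging the first 7 days of each season) can be sketched without pandas. This is an illustration using only the standard library; the helper names, windows, and event dates are made up and much shorter than a real season.

```python
from collections import Counter
from datetime import date, timedelta


def season_calendar(windows):
    """Expand {year: (start, end)} windows into a sorted list of in-season days."""
    days = []
    for start, end in windows.values():
        n_days = (end - start).days + 1
        days.extend(start + timedelta(days=i) for i in range(n_days))
    return sorted(days)


def daily_series(event_dates, season_days):
    """Count events per in-season day; empty days get 0, off-season events are dropped."""
    counts = Counter(event_dates)
    return [counts.get(day, 0) for day in season_days]


def opening_week_flags(windows, season_days, length=7):
    """1 for the first `length` days of each season window, 0 otherwise."""
    flagged = set()
    for start, end in windows.values():
        last = min(start + timedelta(days=length - 1), end)
        flagged.update(start + timedelta(days=i) for i in range((last - start).days + 1))
    return [int(day in flagged) for day in season_days]


# Toy windows and events (made-up dates)
windows = {
    2023: (date(2023, 3, 30), date(2023, 4, 12)),
    2024: (date(2024, 3, 20), date(2024, 4, 2)),
}
events = [date(2023, 3, 30), date(2023, 3, 30), date(2023, 4, 10), date(2024, 1, 15)]

season_days = season_calendar(windows)
series = daily_series(events, season_days)
dummy = opening_week_flags(windows, season_days)

print(len(season_days), sum(series), sum(dummy))  # 28 3 14
```

The January event falls outside both windows and is dropped, which is the same effect as reindexing onto the season index. Note that the resulting day list has gaps between seasons, so it carries no regular daily frequency.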


In [4]:
# @title Plot the original time series
fig = go.Figure()
fig.add_trace(go.Scatter(x=ts_mlb.index, y=ts_mlb.values, mode='lines', name='Daily Injuries'))

fig.update_layout(
    title='Daily Volume of Recorded Injuries in MLB (2015-2025)',
    xaxis_title='Date',
    yaxis_title='Total Recorded Injuries'
)
fig.show()


In [5]:
# @title Run stationarity tests

def check_stationarity(series, title="Original Series"):
    # Augmented Dickey-Fuller test; the null hypothesis is a unit root (non-stationary)
    result = adfuller(series.dropna())
    print(f'ADF Test: {title}')
    print(f'ADF statistic: {result[0]:.4f}')
    print(f'p-value: {result[1]:.4f}')
    is_stationary = result[1] < 0.05
    print(f"Stationary? {'YES' if is_stationary else 'NO'}\n")
    return is_stationary

# 1. Check the original series
check_stationarity(ts_mlb, "Original Level")

# 2. Apply the first difference (d=1)
ts_mlb_diff = ts_mlb.diff()

# 3. Check the differenced series
check_stationarity(ts_mlb_diff, "First Difference (d=1)")

# Comparison figure (same style as the template)
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=("Original Series", "Differenced Series (d=1)")
)

fig.add_trace(go.Scatter(x=ts_mlb.index, y=ts_mlb, name='Original'), row=1, col=1)
fig.add_trace(go.Scatter(x=ts_mlb_diff.index, y=ts_mlb_diff, name='Differenced'), row=1, col=2)

fig.update_layout(
    title_text="Comparison: Effect of Differencing",
    showlegend=False,
    height=500
)
fig.show()


ADF Test: Original Level
ADF statistic: -6.7210
p-value: 0.0000
Stationary? YES

ADF Test: First Difference (d=1)
ADF statistic: -15.2742
p-value: 0.0000
Stationary? YES
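
As a reading aid for the output above: the ADF statistic is the t-ratio on $\gamma$ in a regression of the following standard form. The symbols are chosen here for illustration and do not appear in the notebook. `adfuller` is called with its defaults, which, as far as I know, means a constant-only regression with the lag length $k$ selected by AIC.

```latex
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{k} \delta_i\, \Delta y_{t-i} + \varepsilon_t,
\qquad H_0:\ \gamma = 0 \quad \text{vs.} \quad H_1:\ \gamma < 0
```

Under $H_0$ the series has a unit root. Both runs above reject $H_0$ at the 5% level, so by this test the level series already looks stationary. The first difference used in the next cells is therefore a modeling choice carried forward from the template, not something the test result requires.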



In [6]:
# @title Plot ACF and PACF

# To see the weekly structure more clearly, we analyze the differenced series
ts_analysis = ts_mlb_diff.dropna()

# Parameters
lags = 35  # 5 weeks
alpha = 0.05

# Compute the ACF and PACF values (index 0 is lag 0, so it is dropped)
acf_vals = acf(ts_analysis, nlags=lags, alpha=alpha)[0][1:]
pacf_vals = pacf(ts_analysis, nlags=lags, alpha=alpha, method='ywm')[0][1:]

# Confidence band (approx.)
n = len(ts_analysis)
conf_interval = 1.96 / np.sqrt(n)

fig = make_subplots(rows=2, cols=1,
                    subplot_titles=("Autocorrelation Function (ACF) - Suggests MA(q)",
                                    "Partial Autocorrelation (PACF) - Suggests AR(p)"),
                    vertical_spacing=0.15)

# ACF
fig.add_trace(go.Bar(
    x=list(range(1, lags+1)), y=acf_vals,
    name='ACF', marker_color='rgb(31, 119, 180)', showlegend=False
), row=1, col=1)
fig.add_shape(type="rect",
    x0=0.5, y0=-conf_interval, x1=lags+0.5, y1=conf_interval,
    line=dict(color="rgba(0,0,0,0)"), fillcolor="rgba(0,0,0,0.1)",
    row=1, col=1
)
fig.add_hline(y=conf_interval, line_dash="dash", line_color="gray", row=1, col=1)
fig.add_hline(y=-conf_interval, line_dash="dash", line_color="gray", row=1, col=1)

# PACF
fig.add_trace(go.Bar(
    x=list(range(1, lags+1)), y=pacf_vals,
    name='PACF', marker_color='rgb(255, 127, 14)', showlegend=False
), row=2, col=1)
fig.add_shape(type="rect",
    x0=0.5, y0=-conf_interval, x1=lags+0.5, y1=conf_interval,
    line=dict(color="rgba(0,0,0,0)"), fillcolor="rgba(0,0,0,0.1)",
    row=2, col=1
)
fig.add_hline(y=conf_interval, line_dash="dash", line_color="gray", row=2, col=1)
fig.add_hline(y=-conf_interval, line_dash="dash", line_color="gray", row=2, col=1)

fig.update_layout(
    title='<b>Structure Diagnostics: ACF and PACF</b><br><sup>Differenced Series</sup>',
    template='plotly_white',
    height=700,
    bargap=0.8
)

# Highlight the weekly seasonal lags
for i in [7, 14, 21, 28, 35]:
    fig.add_vline(x=i, line_width=1, line_dash="dot", line_color="red", opacity=0.5)

fig.show()

# Quick hint (no large grid search): significant short-run and seasonal lags
sig_acf = [i+1 for i, v in enumerate(acf_vals) if abs(v) > conf_interval]
sig_pacf = [i+1 for i, v in enumerate(pacf_vals) if abs(v) > conf_interval]
print("Significant ACF lags:", sig_acf)
print("Significant PACF lags:", sig_pacf)
print("Initial suggestion: try d=1 and a weekly seasonal component s=7 (with D=1 if there is signal at lags 7, 14, ...)")


Significant ACF lags: [1, 2, 3, 6]
Significant PACF lags: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Initial suggestion: try d=1 and a weekly seasonal component s=7 (with D=1 if there is signal at lags 7, 14, ...)
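
To make the significance rule behind these lists concrete, the sketch below computes a sample ACF by hand and compares it against the same approximate band, 1.96/sqrt(n). It uses only the standard library and a made-up series with a strict weekly pattern; `sample_acf` is a hypothetical helper, not the statsmodels function.

```python
import math


def sample_acf(x, nlags):
    """Sample autocorrelation for lags 0..nlags, normalized by the lag-0 sum of squares."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom for k in range(nlags + 1)]


# Toy series: one event every 7th day for 20 weeks (made-up data)
x = [1, 0, 0, 0, 0, 0, 0] * 20
acf_vals = sample_acf(x, nlags=14)
band = 1.96 / math.sqrt(len(x))  # approximate 95% band under white noise

significant = [k for k in range(1, 15) if abs(acf_vals[k]) > band]
print(round(acf_vals[7], 3), round(acf_vals[14], 3))  # 0.95 0.9
```

On this toy series the seasonal lags 7 and 14 clear the band by a wide margin, which is the kind of signal the red dotted lines in the plot are meant to highlight.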


In [7]:
# @title Fit the model and plot
TEST_DAYS = 21

train = ts_mlb.iloc[:-TEST_DAYS]
test = ts_mlb.iloc[-TEST_DAYS:]

# Exogenous variable to absorb the season-opening spike (without dropping those observations)
exog = opening_week_dummy.reindex(ts_mlb.index).astype(int)
exog_train = exog.iloc[:-TEST_DAYS]
exog_test = exog.iloc[-TEST_DAYS:]

# Compact set of candidate models (guided by ACF/PACF + the weekly pattern)
# Kept simple and close to the template; only a few candidates are tried, for clarity.
candidates = [
    {'order': (1, 1, 1), 'seasonal_order': (1, 1, 1, 7)},
    {'order': (2, 1, 1), 'seasonal_order': (1, 1, 1, 7)},
    {'order': (1, 1, 2), 'seasonal_order': (1, 1, 1, 7)},
    {'order': (2, 1, 2), 'seasonal_order': (1, 1, 1, 7)},
    {'order': (2, 1, 1), 'seasonal_order': (0, 1, 1, 7)},
    {'order': (2, 1, 2), 'seasonal_order': (0, 1, 1, 7)},
]

model_results = []
for c in candidates:
    try:
        model = SARIMAX(
            train,
            exog=exog_train,
            order=c['order'],
            seasonal_order=c['seasonal_order'],
            enforce_stationarity=False,
            enforce_invertibility=False
        )
        fitted = model.fit(disp=False)
        pred = fitted.get_forecast(steps=len(test), exog=exog_test).predicted_mean
        forecast_vals = pd.Series(np.asarray(pred), index=test.index)

        rmse = np.sqrt(mean_squared_error(test, forecast_vals))
        mae = mean_absolute_error(test, forecast_vals)
        try:
            mape = mean_absolute_percentage_error(test, forecast_vals)
        except Exception:
            mape = np.nan

        model_results.append({
            'order': c['order'],
            'seasonal_order': c['seasonal_order'],
            'rmse': rmse,
            'mae': mae,
            'mape': mape,
            'aic': fitted.aic,
            'results': fitted
        })
        print(f"OK {c['order']} x {c['seasonal_order']} | RMSE={rmse:.3f} | MAE={mae:.3f} | AIC={fitted.aic:.1f}")
    except Exception as e:
        print(f"FAIL {c['order']} x {c['seasonal_order']}: {e}")

best = sorted(model_results, key=lambda x: (x['rmse'], x['mae'], x['aic']))[0]
results = best['results']

forecast_object = results.get_forecast(steps=len(test), exog=exog_test)
forecast_vals = pd.Series(np.asarray(forecast_object.predicted_mean), index=test.index)
conf_raw = forecast_object.conf_int(alpha=0.05)
conf_int = pd.DataFrame(np.asarray(conf_raw), index=test.index, columns=['lower', 'upper'])

rmse = np.sqrt(mean_squared_error(test, forecast_vals))
mae = mean_absolute_error(test, forecast_vals)
# MAPE divides by the actual values, so test days with zero injuries inflate it;
# it is reported for reference only and is not used to rank the candidates.
try:
    mape = mean_absolute_percentage_error(test, forecast_vals)
except Exception:
    mape = np.nan

print("\n--- Best model selected ---")
print(f"order={best['order']} | seasonal_order={best['seasonal_order']}")
print(f"RMSE: {rmse:.2f} injuries")
print(f"MAE: {mae:.2f} injuries")
print(f"MAPE: {mape:.2%}" if pd.notna(mape) else "MAPE: NaN")
print(f"AIC: {best['aic']:.2f}")
print("Exogenous variable used: opening_week_dummy (1 = first 7 days of each regular season)")

print(results.summary())

# Plot
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=train.index, y=train,
    mode='lines',
    name='Train',
    line=dict(color='rgba(100, 100, 100, 0.6)', width=1.5)
))

fig.add_trace(go.Scatter(
    x=test.index, y=test,
    name='Test',
    line=dict(color='#1f77b4', width=3),
    marker=dict(size=6)
))

fig.add_trace(go.Scatter(
    x=test.index, y=forecast_vals,
    name='SARIMAX',
    line=dict(color='#ff7f0e', width=3, dash='dot')
))

fig.add_trace(go.Scatter(
    x=conf_int.index, y=conf_int.iloc[:, 0],
    mode='lines', line=dict(width=0), showlegend=False, hoverinfo='skip'
))
fig.add_trace(go.Scatter(
    x=conf_int.index, y=conf_int.iloc[:, 1],
    mode='lines', line=dict(width=0), fill='tonexty',
    fillcolor='rgba(255, 127, 14, 0.2)',
    name='95% Confidence Int.', hoverinfo='skip'
))

fig.update_layout(
    title='<b>SARIMAX Model + Exogenous Variable: MLB Injuries (Regular Season Only)</b>',
    xaxis_title='Date',
    yaxis_title='Total Recorded Injuries',
    legend=dict(x=0, y=1, bgcolor='rgba(255,255,255,0.8)'),
    hovermode="x unified"
)

fig.show()



A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.



No supported index is available. Prediction results will be given with an integer index beginning at `start`.


No supported index is available. In the next version, calling this method in a model without a supported index will result in an exception.



OK (1, 1, 1) x (1, 1, 1, 7) | RMSE=2.676 | MAE=2.230 | AIC=11840.5



OK (2, 1, 1) x (1, 1, 1, 7) | RMSE=2.670 | MAE=2.227 | AIC=11841.4



Maximum Likelihood optimization failed to converge. Check mle_retvals



OK (1, 1, 2) x (1, 1, 1, 7) | RMSE=2.686 | MAE=2.234 | AIC=11995.8



Maximum Likelihood optimization failed to converge. Check mle_retvals



OK (2, 1, 2) x (1, 1, 1, 7) | RMSE=2.685 | MAE=2.245 | AIC=11833.5



OK (2, 1, 1) x (0, 1, 1, 7) | RMSE=2.666 | MAE=2.225 | AIC=11840.4
OK (2, 1, 2) x (0, 1, 1, 7) | RMSE=2.679 | MAE=2.256 | AIC=11953.2

--- Best model selected ---
order=(2, 1, 1) | seasonal_order=(0, 1, 1, 7)
RMSE: 2.67 injuries
MAE: 2.23 injuries
MAPE: 76371449885202176.00%
AIC: 11840.43
Exogenous variable used: opening_week_dummy (1 = first 7 days of each regular season)
                                     SARIMAX Results                                     
Dep. Variable:               injury_registration   No. Observations:                 1922
Model:             SARIMAX(2, 1, 1)x(0, 1, 1, 7)   Log Likelihood               -5914.214
Date:                           Mon, 23 Feb 2026   AIC                          11840.429
Time:                                   11:22:38   BIC                          11873.742
Sample:                                        0   HQIC                         11852.691
                                          - 1922                                  


Maximum Likelihood optimization failed to converge. Check mle_retvals
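
The enormous MAPE in the output above is what one would expect when some test days have zero recorded injuries. As I understand scikit-learn's implementation, the denominator is floored at machine epsilon rather than raising an error, so a single zero-count day dominates the average. The sketch below reproduces that behavior with plain Python on made-up numbers; `wape` is an alternative shown only for comparison and is not used anywhere in the notebook.

```python
import math
import sys


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted, eps=sys.float_info.epsilon):
    """MAPE with the denominator floored at eps, so zero actuals do not raise."""
    return sum(abs(a - p) / max(abs(a), eps) for a, p in zip(actual, predicted)) / len(actual)


def wape(actual, predicted):
    """Total absolute error over total actuals; stays finite when some days are zero."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / sum(abs(a) for a in actual)


# Made-up daily counts with one zero-injury day
actual = [4, 0, 2, 6]
predicted = [3.0, 1.0, 2.0, 4.0]

print(f"RMSE={rmse(actual, predicted):.3f}  MAE={mae(actual, predicted):.3f}")
print(f"MAPE={mape(actual, predicted):.3e}  WAPE={wape(actual, predicted):.3f}")
```

RMSE and MAE stay on the scale of the counts, which is why they are the metrics listed in the objectives; the MAPE figure is best read as a symptom of zero-count days rather than as a measure of forecast quality.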

