<style>
body {
    max-width: 900px;
    margin: 40px auto;
    padding: 0 20px;
    font-family: "Georgia", serif;
    line-height: 1.6;
}
</style>

<div style="text-align: center; padding: 60px 60px">
  <h1 style="font-weight: bold; font-size: 3.1em">
    FINDING THE BEST ARIMA FOR THE DIFFERENCED SERIES WITH PMDARIMA
  </h1>
</div>

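The `pmdarima` library is used to try different combinations of model orders and compare their $\text{AIC}$ values. The AIC values listed below are those of the fits with an intercept, because the deduplication step keeps the first occurrence of each specification in the search trace.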

**RESULTS**:
<pre>
(1) ARIMA(1,0,0)(0,1,1)[12], AIC=1430.397
(2) ARIMA(0,0,1)(0,1,1)[12], AIC=1431.266
(3) ARIMA(1,0,1)(0,1,1)[12], AIC=1432.133
(4) ARIMA(2,0,0)(0,1,1)[12], AIC=1432.157
(5) ARIMA(1,0,0)(0,1,2)[12], AIC=1432.368
(6) ARIMA(1,0,0)(1,1,1)[12], AIC=1432.373
(7) ARIMA(0,0,2)(0,1,1)[12], AIC=1432.664
(8) ARIMA(0,0,1)(0,1,2)[12], AIC=1433.240
(9) ARIMA(0,0,1)(1,1,1)[12], AIC=1433.245
(10) ARIMA(2,0,1)(0,1,1)[12], AIC=1434.127
(11) ARIMA(0,0,0)(0,1,1)[12], AIC=1445.334
(12) ARIMA(1,0,0)(1,1,0)[12], AIC=1523.141
(13) ARIMA(0,0,1)(1,1,0)[12], AIC=1523.219
(14) ARIMA(0,0,1)(0,1,0)[12], AIC=1652.849
(15) ARIMA(1,0,0)(0,1,0)[12], AIC=1652.984
(16) ARIMA(0,0,0)(0,1,0)[12], AIC=1655.671
</pre>

# **LOAD THE DATA**

In [1]:
import pandas as pd
import numpy as np

data=pd.read_csv('MXN00021035.csv')

pre=data.iloc[:,6]  # Precipitation: column at zero-based position 6
date=data.iloc[:,5] # Date: column at zero-based position 5
date = date.astype(str).str.replace(r'(\d{4})(\d{2})', r'\1/\2', regex=True)    # Dates are stored as 195210; rewrite them as 1952/10
date = pd.to_datetime(date, format='%Y/%m')                                     # Convert to datetime
pre = pd.Series(pre.values, index=date)                                         # Build a Series indexed by date

# Split the series into train and test
pre_total = pre.copy()          # Copy of the full original series

# Everything except the last 12 months
pre = pre_total[:-12]           # Training set: all but the last 12 months

from scipy.stats import zscore

# Standardize the training series
zpre = zscore(pre)
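
As a rough illustration of what the hold-out split and the standardization above do, the sketch below reproduces both steps on a short made-up list using only the standard library. The data and the names `series`, `train`, `test` and `zscore_simple` are hypothetical. `scipy.stats.zscore` uses the population standard deviation (`ddof=0`) by default, which is what `statistics.pstdev` computes.

```python
import statistics

def zscore_simple(values):
    """Standardize values to zero mean and unit population std (ddof=0)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical monthly values (24 months); not the real precipitation data
series = [float(v) for v in range(1, 25)]

# Hold out the last 12 observations as the test set
train, test = series[:-12], series[-12:]

# Standardize the training part only, as in the cell above
ztrain = zscore_simple(train)
print(len(train), len(test))
print(round(statistics.fmean(ztrain), 6), round(statistics.pstdev(ztrain), 6))
```

Standardizing only rescales the series, so it changes the absolute AIC values but not the shape of the data the models are fitted to.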

# **FIND THE BEST ARIMA BASED ON THE $\text{AIC}$**

In [2]:
import pmdarima as pm

In [9]:
# Find the best SARIMA based on the AIC
auto_sarima_model = pm.auto_arima(zpre, 
                                  seasonal=True, 
                                  m=12,
                                  stepwise=True, 
                                  suppress_warnings=True, 
                                  trace=True, 
                                  D=1,
                                  max_order=None)

Performing stepwise search to minimize aic
 ARIMA(2,0,2)(1,1,1)[12] intercept   : AIC=inf, Time=3.40 sec
 ARIMA(0,0,0)(0,1,0)[12] intercept   : AIC=1655.671, Time=0.08 sec
 ARIMA(1,0,0)(1,1,0)[12] intercept   : AIC=1523.141, Time=0.64 sec
 ARIMA(0,0,1)(0,1,1)[12] intercept   : AIC=1431.266, Time=0.48 sec
 ARIMA(0,0,0)(0,1,0)[12]             : AIC=1653.671, Time=0.03 sec
 ARIMA(0,0,1)(0,1,0)[12] intercept   : AIC=1652.849, Time=0.09 sec
 ARIMA(0,0,1)(1,1,1)[12] intercept   : AIC=1433.245, Time=0.63 sec
 ARIMA(0,0,1)(0,1,2)[12] intercept   : AIC=1433.240, Time=1.95 sec
 ARIMA(0,0,1)(1,1,0)[12] intercept   : AIC=1523.219, Time=0.36 sec
 ARIMA(0,0,1)(1,1,2)[12] intercept   : AIC=inf, Time=4.49 sec
 ARIMA(0,0,0)(0,1,1)[12] intercept   : AIC=1445.334, Time=0.40 sec
 ARIMA(1,0,1)(0,1,1)[12] intercept   : AIC=1432.133, Time=1.22 sec
 ARIMA(0,0,2)(0,1,1)[12] intercept   : AIC=1432.664, Time=0.56 sec
 ARIMA(1,0,0)(0,1,1)[12] intercept   : AIC=1430.397, Time=0.67 sec
 ARIMA(1,0,0)(0,1,0)[12] inte
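
For reference, the notation in the trace can be written out explicitly. The block below is a sketch using the standard backshift-operator form of a seasonal ARIMA model and the usual definition of the AIC. The symbols $\phi_1$, $\Theta_1$, $c$, $B$, $\varepsilon_t$, $k$ and $\hat{L}$ are conventional textbook notation, not something printed by `pmdarima`, and the sign convention for the moving-average term varies between references.

$$
\text{ARIMA}(1,0,0)(0,1,1)_{12}:\quad (1-\phi_1 B)\,(1-B^{12})\,z_t = c + (1+\Theta_1 B^{12})\,\varepsilon_t
$$

$$
\text{AIC} = 2k - 2\ln\hat{L}
$$

Here $z_t$ is the standardized series, $B z_t = z_{t-1}$, and $c$ is a constant that appears only in the fits labelled `intercept`. With `D=1` and `m=12` the seasonal difference is $(1-B^{12})z_t = z_t - z_{t-12}$. Since $k$ counts the estimated parameters, adding an intercept raises the penalty term by 2, which is consistent with the trace: for $\text{ARIMA}(1,0,0)(0,1,1)_{12}$ the two fits differ by $1430.397 - 1428.625 = 1.772$.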

# **TIDY UP THE OUTPUT**

In [4]:
import re

In [10]:
texto = """
Performing stepwise search to minimize aic
 ARIMA(2,0,2)(1,1,1)[12] intercept   : AIC=inf, Time=3.40 sec
 ARIMA(0,0,0)(0,1,0)[12] intercept   : AIC=1655.671, Time=0.08 sec
 ARIMA(1,0,0)(1,1,0)[12] intercept   : AIC=1523.141, Time=0.64 sec
 ARIMA(0,0,1)(0,1,1)[12] intercept   : AIC=1431.266, Time=0.48 sec
 ARIMA(0,0,0)(0,1,0)[12]             : AIC=1653.671, Time=0.03 sec
 ARIMA(0,0,1)(0,1,0)[12] intercept   : AIC=1652.849, Time=0.09 sec
 ARIMA(0,0,1)(1,1,1)[12] intercept   : AIC=1433.245, Time=0.63 sec
 ARIMA(0,0,1)(0,1,2)[12] intercept   : AIC=1433.240, Time=1.95 sec
 ARIMA(0,0,1)(1,1,0)[12] intercept   : AIC=1523.219, Time=0.36 sec
 ARIMA(0,0,1)(1,1,2)[12] intercept   : AIC=inf, Time=4.49 sec
 ARIMA(0,0,0)(0,1,1)[12] intercept   : AIC=1445.334, Time=0.40 sec
 ARIMA(1,0,1)(0,1,1)[12] intercept   : AIC=1432.133, Time=1.22 sec
 ARIMA(0,0,2)(0,1,1)[12] intercept   : AIC=1432.664, Time=0.56 sec
 ARIMA(1,0,0)(0,1,1)[12] intercept   : AIC=1430.397, Time=0.67 sec
 ARIMA(1,0,0)(0,1,0)[12] intercept   : AIC=1652.984, Time=0.05 sec
 ARIMA(1,0,0)(1,1,1)[12] intercept   : AIC=1432.373, Time=0.59 sec
 ARIMA(1,0,0)(0,1,2)[12] intercept   : AIC=1432.368, Time=1.91 sec
 ARIMA(1,0,0)(1,1,2)[12] intercept   : AIC=inf, Time=4.56 sec
 ARIMA(2,0,0)(0,1,1)[12] intercept   : AIC=1432.157, Time=0.79 sec
 ARIMA(2,0,1)(0,1,1)[12] intercept   : AIC=1434.127, Time=1.49 sec
 ARIMA(1,0,0)(0,1,1)[12]             : AIC=1428.625, Time=0.26 sec
 ARIMA(1,0,0)(0,1,0)[12]             : AIC=1650.985, Time=0.02 sec
 ARIMA(1,0,0)(1,1,1)[12]             : AIC=1430.599, Time=0.44 sec
 ARIMA(1,0,0)(0,1,2)[12]             : AIC=1430.594, Time=1.12 sec
 ARIMA(1,0,0)(1,1,0)[12]             : AIC=1521.170, Time=0.15 sec
 ARIMA(1,0,0)(1,1,2)[12]             : AIC=inf, Time=3.50 sec
 ARIMA(0,0,0)(0,1,1)[12]             : AIC=1443.593, Time=0.13 sec
 ARIMA(2,0,0)(0,1,1)[12]             : AIC=1430.380, Time=0.32 sec
 ARIMA(1,0,1)(0,1,1)[12]             : AIC=1430.355, Time=0.58 sec
 ARIMA(0,0,1)(0,1,1)[12]             : AIC=1429.501, Time=0.22 sec
 ARIMA(2,0,1)(0,1,1)[12]             : AIC=1432.349, Time=0.71 sec

Best model:  ARIMA(1,0,0)(0,1,1)[12]          
Total fit time: 31.867 seconds
"""

In [11]:
def limpiar_linea(linea):
    # Remove the word "intercept" together with the surrounding whitespace
    linea = re.sub(r'\s*intercept\s*', ' ', linea)
    # Remove the ", Time=x.xx sec" part
    linea = re.sub(r',\s*Time=\d+\.\d+\s*sec', '', linea)
    # Remove the run of padding spaces left on lines without "intercept"
    linea = re.sub(r'            ', '', linea)
    # Replace the "] :" separator with "],"
    linea = re.sub(r'] :', '],', linea)
    # Drop lines that contain "inf" (fits with no finite AIC)
    if 'inf' in linea:
        return ''
    return linea.strip()

# Process every line
lineas = texto.strip().split('\n')
lineas_limpias = [limpiar_linea(linea) for linea in lineas]

resultado = "\n".join(lineas_limpias)
print(resultado)


Performing stepwise search to minimize aic

ARIMA(0,0,0)(0,1,0)[12], AIC=1655.671
ARIMA(1,0,0)(1,1,0)[12], AIC=1523.141
ARIMA(0,0,1)(0,1,1)[12], AIC=1431.266
ARIMA(0,0,0)(0,1,0)[12], AIC=1653.671
ARIMA(0,0,1)(0,1,0)[12], AIC=1652.849
ARIMA(0,0,1)(1,1,1)[12], AIC=1433.245
ARIMA(0,0,1)(0,1,2)[12], AIC=1433.240
ARIMA(0,0,1)(1,1,0)[12], AIC=1523.219

ARIMA(0,0,0)(0,1,1)[12], AIC=1445.334
ARIMA(1,0,1)(0,1,1)[12], AIC=1432.133
ARIMA(0,0,2)(0,1,1)[12], AIC=1432.664
ARIMA(1,0,0)(0,1,1)[12], AIC=1430.397
ARIMA(1,0,0)(0,1,0)[12], AIC=1652.984
ARIMA(1,0,0)(1,1,1)[12], AIC=1432.373
ARIMA(1,0,0)(0,1,2)[12], AIC=1432.368

ARIMA(2,0,0)(0,1,1)[12], AIC=1432.157
ARIMA(2,0,1)(0,1,1)[12], AIC=1434.127
ARIMA(1,0,0)(0,1,1)[12], AIC=1428.625
ARIMA(1,0,0)(0,1,0)[12], AIC=1650.985
ARIMA(1,0,0)(1,1,1)[12], AIC=1430.599
ARIMA(1,0,0)(0,1,2)[12], AIC=1430.594
ARIMA(1,0,0)(1,1,0)[12], AIC=1521.170

ARIMA(0,0,0)(0,1,1)[12], AIC=1443.593
ARIMA(2,0,0)(0,1,1)[12], AIC=1430.380
ARIMA(1,0,1)(0,1,1)[12], AIC=1430.355
ARI
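
The chain of substitutions above can also be expressed as one pattern that reads the orders, the optional `intercept` flag and the AIC directly from each trace line. The following is a minimal sketch under that assumption; `PATTERN`, `parse_trace` and `sample` are hypothetical names, and the three sample lines are copied from the trace above.

```python
import re

# One pattern for a whole trace line: orders, optional "intercept", and the AIC
PATTERN = re.compile(
    r"ARIMA\((\d+),(\d+),(\d+)\)\((\d+),(\d+),(\d+)\)\[(\d+)\]"
    r"\s*(intercept)?\s*:\s*AIC=(inf|\d+(?:\.\d+)?)"
)

def parse_trace(text):
    """Return one dict per fitted model, skipping fits whose AIC is inf."""
    rows = []
    for line in text.splitlines():
        m = PATTERN.search(line)
        if m is None or m.group(9) == "inf":
            continue  # header, summary line, or failed fit
        p, d, q, P, D, Q, s = (int(g) for g in m.groups()[:7])
        rows.append({
            "order": (p, d, q),
            "seasonal_order": (P, D, Q, s),
            "intercept": m.group(8) is not None,
            "aic": float(m.group(9)),
        })
    return rows

# Three lines copied from the trace above
sample = """
 ARIMA(2,0,2)(1,1,1)[12] intercept   : AIC=inf, Time=3.40 sec
 ARIMA(1,0,0)(0,1,1)[12] intercept   : AIC=1430.397, Time=0.67 sec
 ARIMA(1,0,0)(0,1,1)[12]             : AIC=1428.625, Time=0.26 sec
"""
for row in parse_trace(sample):
    print(row)
```

Keeping the `intercept` flag as its own field means the two fits of the same specification stay distinguishable later on.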

In [12]:
texto = """
ARIMA(0,0,0)(0,1,0)[12], AIC=1655.671
ARIMA(1,0,0)(1,1,0)[12], AIC=1523.141
ARIMA(0,0,1)(0,1,1)[12], AIC=1431.266
ARIMA(0,0,0)(0,1,0)[12], AIC=1653.671
ARIMA(0,0,1)(0,1,0)[12], AIC=1652.849
ARIMA(0,0,1)(1,1,1)[12], AIC=1433.245
ARIMA(0,0,1)(0,1,2)[12], AIC=1433.240
ARIMA(0,0,1)(1,1,0)[12], AIC=1523.219
ARIMA(0,0,0)(0,1,1)[12], AIC=1445.334
ARIMA(1,0,1)(0,1,1)[12], AIC=1432.133
ARIMA(0,0,2)(0,1,1)[12], AIC=1432.664
ARIMA(1,0,0)(0,1,1)[12], AIC=1430.397
ARIMA(1,0,0)(0,1,0)[12], AIC=1652.984
ARIMA(1,0,0)(1,1,1)[12], AIC=1432.373
ARIMA(1,0,0)(0,1,2)[12], AIC=1432.368
ARIMA(2,0,0)(0,1,1)[12], AIC=1432.157
ARIMA(2,0,1)(0,1,1)[12], AIC=1434.127
ARIMA(1,0,0)(0,1,1)[12], AIC=1428.625
ARIMA(1,0,0)(0,1,0)[12], AIC=1650.985
ARIMA(1,0,0)(1,1,1)[12], AIC=1430.599
ARIMA(1,0,0)(0,1,2)[12], AIC=1430.594
ARIMA(1,0,0)(1,1,0)[12], AIC=1521.170
ARIMA(0,0,0)(0,1,1)[12], AIC=1443.593
ARIMA(2,0,0)(0,1,1)[12], AIC=1430.380
ARIMA(1,0,1)(0,1,1)[12], AIC=1430.355
ARIMA(0,0,1)(0,1,1)[12], AIC=1429.501
ARIMA(2,0,1)(0,1,1)[12], AIC=1432.349
"""

In [13]:
def extraer_aic(linea):
    match = re.search(r'AIC=([0-9]+\.[0-9]+)', linea)
    return float(match.group(1)) if match else float('inf')

def extraer_modelo(linea):
    match = re.match(r'(ARIMA\([^)]+\)\([^)]+\)\[\d+\])', linea)
    return match.group(1) if match else None

# Drop empty lines
lineas = [linea.strip() for linea in texto.strip().split('\n') if linea.strip()]

# Dictionary used to drop duplicates (keeps the first occurrence of each model)
modelos_unicos = {}
for linea in lineas:
    modelo = extraer_modelo(linea)
    if modelo and modelo not in modelos_unicos:
        modelos_unicos[modelo] = linea

# Extract the unique lines
lineas_unicas = list(modelos_unicos.values())

# Sort by AIC
lineas_ordenadas = sorted(lineas_unicas, key=extraer_aic)

# Add an index
lineas_indexadas = [f"({i+1}) {linea}" for i, linea in enumerate(lineas_ordenadas)]

# Final result
resultado = "\n".join(lineas_indexadas)
print(resultado)


(1) ARIMA(1,0,0)(0,1,1)[12], AIC=1430.397
(2) ARIMA(0,0,1)(0,1,1)[12], AIC=1431.266
(3) ARIMA(1,0,1)(0,1,1)[12], AIC=1432.133
(4) ARIMA(2,0,0)(0,1,1)[12], AIC=1432.157
(5) ARIMA(1,0,0)(0,1,2)[12], AIC=1432.368
(6) ARIMA(1,0,0)(1,1,1)[12], AIC=1432.373
(7) ARIMA(0,0,2)(0,1,1)[12], AIC=1432.664
(8) ARIMA(0,0,1)(0,1,2)[12], AIC=1433.240
(9) ARIMA(0,0,1)(1,1,1)[12], AIC=1433.245
(10) ARIMA(2,0,1)(0,1,1)[12], AIC=1434.127
(11) ARIMA(0,0,0)(0,1,1)[12], AIC=1445.334
(12) ARIMA(1,0,0)(1,1,0)[12], AIC=1523.141
(13) ARIMA(0,0,1)(1,1,0)[12], AIC=1523.219
(14) ARIMA(0,0,1)(0,1,0)[12], AIC=1652.849
(15) ARIMA(1,0,0)(0,1,0)[12], AIC=1652.984
(16) ARIMA(0,0,0)(0,1,0)[12], AIC=1655.671
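
The deduplication above keeps the first line seen for each specification, which in this trace is always the fit with an intercept. If the lowest AIC per specification is wanted instead, one possible variant is sketched below. `best_per_model`, `sample` and `ranking` are hypothetical names, and the four sample lines are copied from the `texto` string in cell `In [12]`.

```python
import re

def best_per_model(lines):
    """Map each model specification to the lowest AIC seen for it."""
    best = {}
    for line in lines:
        m = re.match(r"(ARIMA\([^)]+\)\([^)]+\)\[\d+\]), AIC=([0-9]+\.[0-9]+)", line)
        if m is None:
            continue  # skip anything that is not a cleaned model line
        model, aic = m.group(1), float(m.group(2))
        if aic < best.get(model, float("inf")):
            best[model] = aic
    return best

# Four cleaned lines copied from the texto string in cell In [12]
sample = [
    "ARIMA(1,0,0)(0,1,1)[12], AIC=1430.397",
    "ARIMA(0,0,1)(0,1,1)[12], AIC=1431.266",
    "ARIMA(1,0,0)(0,1,1)[12], AIC=1428.625",
    "ARIMA(0,0,1)(0,1,1)[12], AIC=1429.501",
]

# Rank the surviving specifications by AIC, lowest first
ranking = sorted(best_per_model(sample).items(), key=lambda item: item[1])
for i, (model, aic) in enumerate(ranking, start=1):
    print(f"({i}) {model}, AIC={aic:.3f}")
```

With this rule the four sample lines collapse to two entries, each carrying the AIC of its no-intercept fit.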


# **RESULTS**

<pre>
(1) ARIMA(1,0,0)(0,1,1)[12], AIC=1430.397
(2) ARIMA(0,0,1)(0,1,1)[12], AIC=1431.266
(3) ARIMA(1,0,1)(0,1,1)[12], AIC=1432.133
(4) ARIMA(2,0,0)(0,1,1)[12], AIC=1432.157
(5) ARIMA(1,0,0)(0,1,2)[12], AIC=1432.368
(6) ARIMA(1,0,0)(1,1,1)[12], AIC=1432.373
(7) ARIMA(0,0,2)(0,1,1)[12], AIC=1432.664
(8) ARIMA(0,0,1)(0,1,2)[12], AIC=1433.240
(9) ARIMA(0,0,1)(1,1,1)[12], AIC=1433.245
(10) ARIMA(2,0,1)(0,1,1)[12], AIC=1434.127
(11) ARIMA(0,0,0)(0,1,1)[12], AIC=1445.334
(12) ARIMA(1,0,0)(1,1,0)[12], AIC=1523.141
(13) ARIMA(0,0,1)(1,1,0)[12], AIC=1523.219
(14) ARIMA(0,0,1)(0,1,0)[12], AIC=1652.849
(15) ARIMA(1,0,0)(0,1,0)[12], AIC=1652.984
(16) ARIMA(0,0,0)(0,1,0)[12], AIC=1655.671
</pre>
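
One common way to read a ranking like this is through the difference between each AIC and the smallest one. A frequently quoted rule of thumb treats models within about 2 units of the best as having comparable support, although that threshold is a convention, not a test. The sketch below computes those differences for the first five rows of the table. `results` and `deltas` are hypothetical names and the values are copied from the table.

```python
# (model, AIC) pairs copied from the first five rows of the table above
results = [
    ("ARIMA(1,0,0)(0,1,1)[12]", 1430.397),
    ("ARIMA(0,0,1)(0,1,1)[12]", 1431.266),
    ("ARIMA(1,0,1)(0,1,1)[12]", 1432.133),
    ("ARIMA(2,0,0)(0,1,1)[12]", 1432.157),
    ("ARIMA(1,0,0)(0,1,2)[12]", 1432.368),
]

# Difference between each AIC and the smallest AIC in the list
best_aic = min(aic for _, aic in results)
deltas = [(model, round(aic - best_aic, 3)) for model, aic in results]

for model, delta in deltas:
    print(f"{model}  delta_AIC={delta:.3f}")
```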