# Determine SARIMAX order

Use pmdarima's auto_arima to search for the order (p, d, q) and seasonal order (P, D, Q)[m] of the SARIMAX model.

Based on: https://github.com/MKB-Datalab/time-series-analysis-with-SARIMAX-and-Prophet/blob/master/notebooks/02-Forecasting_with_SARIMAX.ipynb

## Conclusion
### residential WITH_PV

Performing stepwise search to minimize aic
 - ARIMA(0,1,0)(0,0,0)[96] intercept   : AIC=6918.316, Time=0.17 sec
 - ARIMA(1,1,0)(1,0,0)[96] intercept   : AIC=6906.702, Time=6.23 sec
 - ARIMA(0,1,1)(0,0,1)[96] intercept   : AIC=6904.724, Time=5.30 sec
 - ARIMA(0,1,0)(0,0,0)[96]             : AIC=6916.322, Time=0.02 sec
 - ARIMA(0,1,1)(0,0,0)[96] intercept   : AIC=6903.313, Time=0.11 sec
 - ARIMA(0,1,1)(1,0,0)[96] intercept   : AIC=6904.583, Time=4.28 sec
 - ARIMA(0,1,1)(1,0,1)[96] intercept   : AIC=6906.083, Time=5.30 sec
 - ARIMA(1,1,1)(0,0,0)[96] intercept   : AIC=6903.978, Time=0.19 sec
 - ARIMA(0,1,2)(0,0,0)[96] intercept   : AIC=6903.936, Time=0.14 sec
 - ARIMA(1,1,0)(0,0,0)[96] intercept   : AIC=6905.635, Time=0.08 sec
 - ARIMA(1,1,2)(0,0,0)[96] intercept   : AIC=6905.923, Time=0.20 sec
 - ARIMA(0,1,1)(0,0,0)[96]             : AIC=6901.313, Time=0.05 sec
 - ARIMA(0,1,1)(1,0,0)[96]             : AIC=6902.583, Time=2.60 sec
 - ARIMA(0,1,1)(0,0,1)[96]             : AIC=6902.724, Time=3.55 sec
 - ARIMA(0,1,1)(1,0,1)[96]             : AIC=6904.083, Time=5.22 sec
 - ARIMA(1,1,1)(0,0,0)[96]             : AIC=6901.978, Time=0.07 sec
 - ARIMA(0,1,2)(0,0,0)[96]             : AIC=6901.935, Time=0.07 sec
 - ARIMA(1,1,0)(0,0,0)[96]             : AIC=6903.637, Time=0.04 sec
 - ARIMA(1,1,2)(0,0,0)[96]             : AIC=6903.921, Time=0.11 sec

Best model:  ARIMA(0,1,1)(0,0,0)[96]          
Total fit time: 33.920 seconds

### residential NO_PV

Performing stepwise search to minimize aic
 - ARIMA(0,1,0)(0,0,0)[96] intercept   : AIC=7206.921, Time=0.04 sec
 - ARIMA(1,1,0)(1,0,0)[96] intercept   : AIC=7186.202, Time=4.83 sec
 - ARIMA(0,1,1)(0,0,1)[96] intercept   : AIC=7151.917, Time=4.21 sec
 - ARIMA(0,1,0)(0,0,0)[96]             : AIC=7204.933, Time=0.01 sec
 - ARIMA(0,1,1)(0,0,0)[96] intercept   : AIC=7158.161, Time=0.08 sec
 - ARIMA(0,1,1)(1,0,1)[96] intercept   : AIC=7152.758, Time=6.01 sec
 - ARIMA(0,1,1)(0,0,2)[96] intercept   : AIC=7151.222, Time=23.45 sec
 - ARIMA(0,1,1)(1,0,2)[96] intercept   : AIC=7153.667, Time=34.29 sec
 - ARIMA(0,1,0)(0,0,2)[96] intercept   : AIC=7205.025, Time=33.94 sec
 - ARIMA(1,1,1)(0,0,2)[96] intercept   : AIC=inf, Time=33.62 sec
 - ARIMA(0,1,2)(0,0,2)[96] intercept   : AIC=7100.731, Time=32.16 sec
 - ARIMA(0,1,2)(0,0,1)[96] intercept   : AIC=7101.501, Time=5.24 sec
 - ARIMA(0,1,2)(1,0,2)[96] intercept   : AIC=7103.678, Time=43.70 sec
 - ARIMA(0,1,2)(1,0,1)[96] intercept   : AIC=7102.781, Time=8.74 sec
 - ARIMA(1,1,2)(0,0,2)[96] intercept   : AIC=7102.671, Time=37.17 sec
 - ARIMA(0,1,3)(0,0,2)[96] intercept   : AIC=7097.000, Time=35.75 sec
 - ARIMA(0,1,3)(0,0,1)[96] intercept   : AIC=7097.446, Time=6.93 sec
 - ARIMA(0,1,3)(1,0,2)[96] intercept   : AIC=7100.045, Time=43.99 sec
 - ARIMA(0,1,3)(1,0,1)[96] intercept   : AIC=7100.475, Time=7.76 sec
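The captured output for this search stops before a "Best model" line, so the winner has to be read off the trace. A minimal sketch of doing that programmatically, assuming the trace format shown above; the helper name `best_from_trace` and the regular expression are illustrative and not part of pmdarima:

```python
import math
import re

# Matches lines such as
# " - ARIMA(0,1,3)(0,0,2)[96] intercept   : AIC=7097.000, Time=35.75 sec"
TRACE_LINE = re.compile(
    r"(ARIMA\(\d+,\d+,\d+\)\(\d+,\d+,\d+\)\[\d+\])\s*(intercept)?\s*:\s*AIC=(inf|[\d.]+)"
)


def best_from_trace(trace_text):
    """Return (model, has_intercept, aic) for the lowest finite AIC in a trace."""
    best = None
    for match in TRACE_LINE.finditer(trace_text):
        model, intercept, aic_text = match.groups()
        aic = float(aic_text)  # "inf" (a failed fit) parses to infinity
        if math.isinf(aic):
            continue
        if best is None or aic < best[2]:
            best = (model, intercept is not None, aic)
    return best


# A few lines copied from the NO_PV trace above.
sample = """
 - ARIMA(0,1,1)(0,0,2)[96] intercept   : AIC=7151.222, Time=23.45 sec
 - ARIMA(1,1,1)(0,0,2)[96] intercept   : AIC=inf, Time=33.62 sec
 - ARIMA(0,1,3)(0,0,2)[96] intercept   : AIC=7097.000, Time=35.75 sec
 - ARIMA(0,1,3)(0,0,1)[96] intercept   : AIC=7097.446, Time=6.93 sec
"""

print(best_from_trace(sample))
```

Among the candidates listed above, the lowest finite AIC is 7097.000 for ARIMA(0,1,3)(0,0,2)[96] with intercept. The log may be incomplete, so this is only the best of the models that were recorded.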

In [1]:
from pathlib import Path
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

import statsmodels.api as sm
from pmdarima import auto_arima

In [2]:
usable_data_folder = Path(r"C:\Users\Flin\OneDrive - TU Eindhoven\Flin\Flin\01 - Uni\00_Internship\Nokia\00_Programming\forecasting\datasets\train")
TINY_TEST = True

In [3]:
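def lazy_load_data(path):
    """Load one training CSV and return the target y and the exogenous matrix X."""
    df = pd.read_csv(path)

    if TINY_TEST:
        # Keep only the first 5 days (4 samples/hour * 24 hours = 96 rows per day).
        df = df.iloc[:4*24*5]

    y = df["y"].values

    # Every column except the timestamp and the target is an exogenous regressor.
    X = df.drop(columns=["datetimes", "y"]).to_numpy()

    return y, X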

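def lazy_auto_arima(path):
    """Run a stepwise auto_arima search on the series stored at path."""
    y, X = lazy_load_data(path)

    model = auto_arima(
        y=y,
        # X=X,  # exogenous regressors are currently disabled
        seasonal=True,
        m=4*24,  # seasonal period: one day of 96 samples
        start_p=0,
        d=1,  # fixed differencing order; see "determine how stationary"
        start_q=0,
        start_D=0,  # NOTE: does not appear to be an auto_arima parameter; likely ignored
        max_D=1,  # D itself is not fixed, so auto_arima selects it (up to max_D)
        max_p=4,
        max_q=4,
        start_P=0,
        start_Q=0,
        max_P=2,
        max_Q=2,
        information_criterion="aic",
        trace=True,
        error_action="ignore",
        stepwise=True,
        maxiter=3,  # caps optimizer iterations per fit; fast, but fits may not converge
    )

    return model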
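The constants in the two functions are tied together: m=4*24 is the number of samples in one day (assuming 15-minute data, which the factor 4 suggests), and the TINY_TEST slice keeps 4*24*5 rows. A small worked calculation of what that leaves the seasonal search to work with; the variable names are illustrative:

```python
SAMPLES_PER_HOUR = 4   # assumption: 15-minute resolution, implied by m=4*24
HOURS_PER_DAY = 24
TINY_TEST_DAYS = 5     # from df.iloc[:4*24*5]

m = SAMPLES_PER_HOUR * HOURS_PER_DAY   # seasonal period
n_rows = m * TINY_TEST_DAYS            # rows kept when TINY_TEST is True
n_after_diff = n_rows - 1              # d=1 consumes one observation
full_cycles = n_rows // m              # complete seasonal cycles available

print(m, n_rows, n_after_diff, full_cycles)   # 96 480 479 5
```

With only five daily cycles, seasonal terms at lag 96 are estimated from very few repetitions, so the seasonal orders found under TINY_TEST should be treated as provisional.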





## RESIDENTIAL WITH PV

In [4]:
fn = r"residential_with_pv\h=2_residential_2018_WITH_PV_SFH13_2018.csv" # r"industrial\h=2_industrial_2016_LG_1.csv"
path = usable_data_folder / fn

rwp_model = lazy_auto_arima(path)

Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,0,0)[96] intercept   : AIC=6918.316, Time=0.17 sec
 ARIMA(1,1,0)(1,0,0)[96] intercept   : AIC=6906.702, Time=6.23 sec
 ARIMA(0,1,1)(0,0,1)[96] intercept   : AIC=6904.724, Time=5.30 sec
 ARIMA(0,1,0)(0,0,0)[96]             : AIC=6916.322, Time=0.02 sec
 ARIMA(0,1,1)(0,0,0)[96] intercept   : AIC=6903.313, Time=0.11 sec
 ARIMA(0,1,1)(1,0,0)[96] intercept   : AIC=6904.583, Time=4.28 sec
 ARIMA(0,1,1)(1,0,1)[96] intercept   : AIC=6906.083, Time=5.30 sec
 ARIMA(1,1,1)(0,0,0)[96] intercept   : AIC=6903.978, Time=0.19 sec
 ARIMA(0,1,2)(0,0,0)[96] intercept   : AIC=6903.936, Time=0.14 sec
 ARIMA(1,1,0)(0,0,0)[96] intercept   : AIC=6905.635, Time=0.08 sec
 ARIMA(1,1,2)(0,0,0)[96] intercept   : AIC=6905.923, Time=0.20 sec
 ARIMA(0,1,1)(0,0,0)[96]             : AIC=6901.313, Time=0.05 sec
 ARIMA(0,1,1)(1,0,0)[96]             : AIC=6902.583, Time=2.60 sec
 ARIMA(0,1,1)(0,0,1)[96]             : AIC=6902.724, Time=3.55 sec
 ARIMA(0,1,1)(1,0,1

## RESIDENTIAL NO PV

In [5]:
fn = r"residential_no_pv\h=2_residential_2018_NO_PV_SFH18_2018.csv" # r"industrial\h=2_industrial_2016_LG_1.csv"
path = usable_data_folder / fn

rnp_model = lazy_auto_arima(path)

Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,0,0)[96] intercept   : AIC=7206.921, Time=0.04 sec
 ARIMA(1,1,0)(1,0,0)[96] intercept   : AIC=7186.202, Time=4.83 sec
 ARIMA(0,1,1)(0,0,1)[96] intercept   : AIC=7151.917, Time=4.21 sec
 ARIMA(0,1,0)(0,0,0)[96]             : AIC=7204.933, Time=0.01 sec
 ARIMA(0,1,1)(0,0,0)[96] intercept   : AIC=7158.161, Time=0.08 sec
 ARIMA(0,1,1)(1,0,1)[96] intercept   : AIC=7152.758, Time=6.01 sec
 ARIMA(0,1,1)(0,0,2)[96] intercept   : AIC=7151.222, Time=23.45 sec
 ARIMA(0,1,1)(1,0,2)[96] intercept   : AIC=7153.667, Time=34.29 sec
 ARIMA(0,1,0)(0,0,2)[96] intercept   : AIC=7205.025, Time=33.94 sec
 ARIMA(1,1,1)(0,0,2)[96] intercept   : AIC=inf, Time=33.62 sec
 ARIMA(0,1,2)(0,0,2)[96] intercept   : AIC=7100.731, Time=32.16 sec
 ARIMA(0,1,2)(0,0,1)[96] intercept   : AIC=7101.501, Time=5.24 sec
 ARIMA(0,1,2)(1,0,2)[96] intercept   : AIC=7103.678, Time=43.70 sec
 ARIMA(0,1,2)(1,0,1)[96] intercept   : AIC=7102.781, Time=8.74 sec
 ARIMA(1,1,2)(0,0,

## INDUSTRIAL

In [4]:
fn = r"industrial\h=2_industrial_2016_LG_9.csv"
path = usable_data_folder / fn

model = lazy_auto_arima(path)

Performing stepwise search to minimize aic
 ARIMA(0,1,0)(0,0,0)[96] intercept   : AIC=3589.079, Time=0.17 sec
 ARIMA(1,1,0)(1,0,0)[96] intercept   : AIC=3540.084, Time=6.22 sec
 ARIMA(0,1,1)(0,0,1)[96] intercept   : AIC=3548.816, Time=6.93 sec
 ARIMA(0,1,0)(0,0,0)[96]             : AIC=3587.082, Time=0.07 sec
 ARIMA(1,1,0)(0,0,0)[96] intercept   : AIC=3548.835, Time=0.15 sec
 ARIMA(1,1,0)(2,0,0)[96] intercept   : AIC=inf, Time=47.94 sec
 ARIMA(1,1,0)(1,0,1)[96] intercept   : AIC=3540.728, Time=7.95 sec
 ARIMA(1,1,0)(0,0,1)[96] intercept   : AIC=3541.607, Time=6.31 sec
 ARIMA(1,1,0)(2,0,1)[96] intercept   : AIC=inf, Time=45.29 sec
 ARIMA(0,1,0)(1,0,0)[96] intercept   : AIC=3561.859, Time=5.08 sec
 ARIMA(2,1,0)(1,0,0)[96] intercept   : AIC=3529.899, Time=6.08 sec
 ARIMA(2,1,0)(0,0,0)[96] intercept   : AIC=3535.782, Time=0.62 sec
 ARIMA(2,1,0)(2,0,0)[96] intercept   : AIC=inf, Time=40.23 sec
 ARIMA(2,1,0)(1,0,1)[96] intercept   : AIC=3530.308, Time=8.56 sec
 ARIMA(2,1,0)(0,0,1)[96] interc

MemoryError: Unable to allocate 142. MiB for an array with shape (197, 197, 481) and data type float64
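The size in the error message can be checked by hand: a float64 array of shape (197, 197, 481) needs 197 × 197 × 481 × 8 bytes. A short sketch of that arithmetic; the function name `array_mib` is hypothetical:

```python
from math import prod


def array_mib(shape, itemsize=8):
    """Size in MiB of a dense array; itemsize=8 corresponds to float64."""
    return prod(shape) * itemsize / 2**20


size = array_mib((197, 197, 481))
print(f"{size:.1f} MiB")   # 142.4 MiB
```

The last dimension, 481, is one more than the 480 rows kept by TINY_TEST, and the first two are presumably the state-space dimension of the candidate being fitted, so the array grows linearly with series length and quadratically with model size. A single allocation of this size is modest, so the failure probably reflects overall memory pressure at that point rather than this array alone.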
