<a href="https://colab.research.google.com/github/damiancyrana/colab-notebooks/blob/main/ARIMA.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install pandas statsmodels matplotlib pmdarima
import pandas as pd
import plotly.graph_objects as go
from pmdarima import auto_arima
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.statespace.sarimax import SARIMAX



Collecting pmdarima
  Downloading pmdarima-2.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_28_x86_64.whl (2.1 MB)
Installing collected packages: pmdarima
Successfully installed pmdarima-2.0.4


In [None]:
# Load the data
url = '/content/drive/MyDrive/Colab Notebooks/US100_M5.csv'
data = pd.read_csv(url)

# Convert the 'ctmString' column to datetime and set it as the index
data['ctmString'] = pd.to_datetime(data['ctmString'])
data.set_index('ctmString', inplace=True)

# Add a day-of-week column
data['day_of_week'] = data.index.dayofweek

# Drop weekends (Saturday = 5, Sunday = 6) and unneeded columns
data = data[~data['day_of_week'].isin([5, 6])]
data = data.drop(columns=['Unnamed: 0', 'day_of_week'])

# Sort chronologically and drop rows with missing OHLCV values
data = data.sort_index()
data = data.dropna(subset=['open', 'close', 'high', 'low', 'vol'])
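
Dropping weekend rows leaves gaps in the 5-minute grid, so the resulting index no longer has a single fixed step. The sketch below illustrates this with synthetic timestamps (hypothetical, not read from `US100_M5.csv`) and the standard library only; the helper name `five_minute_grid` is an assumption made for this example.

```python
from datetime import datetime, timedelta

def five_minute_grid(start, end):
    """Yield timestamps from start to end (inclusive) in 5-minute steps."""
    current = start
    while current <= end:
        yield current
        current += timedelta(minutes=5)

# Synthetic timestamps (hypothetical, not taken from the CSV):
# Friday 2024-01-05 23:50 through Monday 2024-01-08 00:10.
stamps = list(five_minute_grid(datetime(2024, 1, 5, 23, 50),
                               datetime(2024, 1, 8, 0, 10)))

# Same rule as the cleaning step: keep Monday..Friday (weekday() < 5).
weekdays_only = [t for t in stamps if t.weekday() < 5]

# Collect the distinct gaps between consecutive timestamps.
steps = {later - earlier
         for earlier, later in zip(weekdays_only, weekdays_only[1:])}
is_regular = len(steps) == 1

print(sorted(steps))
print("regular 5-minute grid:", is_regular)
```

With more than one distinct step, a time-series library has no single frequency to attach to the index, which is one plausible reason forecasts come back with an integer index instead of timestamps.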

**Forecasting function with automatic ARIMA parameter selection**

In [None]:
def auto_arima_forecast(feature_name):
    """
    Forecast the next value of a column, with ARIMA orders chosen by auto_arima.
    """
    auto_model = auto_arima(data[feature_name], seasonal=False, m=0,
                            stepwise=True, suppress_warnings=True,
                            error_action='ignore', trace=True,
                            max_order=None)

    # Refit a statsmodels ARIMA with the selected (p, d, q) order.
    # Note: with d > 0, statsmodels' ARIMA defaults to no trend term, so an
    # intercept chosen by auto_arima is not carried over into this refit.
    model = ARIMA(data[feature_name], order=auto_model.order)
    model_fit = model.fit()

    # Forecast the next value
    forecast = model_fit.forecast(steps=1)
    return forecast.iloc[0]

# Run the forecast for each feature
forecast_open = round(auto_arima_forecast('open'), 2)
forecast_close = round(auto_arima_forecast('close'), 2)
forecast_high = round(auto_arima_forecast('high'), 2)
forecast_low = round(auto_arima_forecast('low'), 2)

# Print the rounded forecasts
print("Forecast values for the next M5 candle (rounded):")
print(f"Open: {forecast_open}")
print(f"Close: {forecast_close}")
print(f"High: {forecast_high}")
print(f"Low: {forecast_low}")


Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=40868.888, Time=6.75 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=40868.066, Time=0.24 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=40864.694, Time=0.55 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=40864.863, Time=1.99 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=40870.630, Time=0.21 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=40865.172, Time=1.23 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=40865.736, Time=4.59 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=40867.148, Time=3.69 sec
 ARIMA(1,1,0)(0,0,0)[0]             : AIC=40867.539, Time=0.22 sec

Best model:  ARIMA(1,1,0)(0,0,0)[0] intercept
Total fit time: 19.498 seconds



A date index has been provided, but it has no associated frequency information and so will be ignored when e.g. forecasting.




No supported index is available. Prediction results will be given with an integer index beginning at `start`.


No supported index is available. In the next version, calling this method in a model without a supported index will result in an exception.



Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=40886.093, Time=13.31 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=40886.186, Time=0.15 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=40881.984, Time=0.29 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=40882.169, Time=0.95 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=40888.693, Time=0.14 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=40882.636, Time=0.55 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=40883.071, Time=2.97 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=40884.606, Time=12.82 sec
 ARIMA(1,1,0)(0,0,0)[0]             : AIC=40884.789, Time=0.23 sec

Best model:  ARIMA(1,1,0)(0,0,0)[0] intercept
Total fit time: 31.434 seconds






Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=39748.048, Time=8.14 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=39796.348, Time=0.16 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=39758.931, Time=0.38 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=39760.178, Time=1.99 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=39799.772, Time=0.19 sec
 ARIMA(1,1,2)(0,0,0)[0] intercept   : AIC=39755.270, Time=14.24 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=39754.880, Time=8.73 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept   : AIC=39755.573, Time=9.69 sec
 ARIMA(2,1,3)(0,0,0)[0] intercept   : AIC=39757.759, Time=14.79 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=39760.428, Time=1.39 sec
 ARIMA(1,1,3)(0,0,0)[0] intercept   : AIC=39757.533, Time=7.43 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept   : AIC=39755.965, Time=7.27 sec
 ARIMA(3,1,3)(0,0,0)[0] intercept   : AIC=inf, Time=19.69 sec
 ARIMA(2,1,2)(0,0,0)[0]             : AIC=39751.755, Time=2.52 sec

Best model:  ARIMA(2,





Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=40221.534, Time=10.65 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=40264.029, Time=0.15 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=40228.656, Time=0.30 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=40226.347, Time=1.40 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=40267.078, Time=0.13 sec
 ARIMA(1,1,2)(0,0,0)[0] intercept   : AIC=40230.289, Time=3.73 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=40219.603, Time=2.77 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=40226.177, Time=3.15 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=40223.716, Time=1.22 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept   : AIC=40221.552, Time=7.89 sec
 ARIMA(3,1,0)(0,0,0)[0] intercept   : AIC=40223.226, Time=1.25 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept   : AIC=40223.568, Time=6.18 sec
 ARIMA(2,1,1)(0,0,0)[0]             : AIC=40222.841, Time=1.97 sec

Best model:  ARIMA(2,1,1)(0,0,0)[0] intercept
Total fit time: 40.805 seconds






Forecast values for the next M5 candle (rounded):
Open: 16018.25
Close: 16018.11
High: 16020.97
Low: 16016.56
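
For intuition about what a one-step forecast from ARIMA(1,1,0), the order reported for `open` and `close` in the traces above, amounts to: without a drift term it is the last price plus `phi` times the last price change. The sketch below is a simplified illustration, not the estimator statsmodels uses: it fits `phi` by least squares on the differenced series instead of maximum likelihood, and the toy prices and the function name `ar1_on_differences_forecast` are hypothetical.

```python
def ar1_on_differences_forecast(series):
    """One-step ARIMA(1,1,0)-style forecast without drift.

    phi is fitted by least squares on the first differences
    (an assumption of this sketch; statsmodels uses maximum likelihood).
    """
    # d = 1: work on first differences of the series.
    diffs = [b - a for a, b in zip(series, series[1:])]

    # Regress each difference on the previous one (no constant term).
    previous, current = diffs[:-1], diffs[1:]
    denominator = sum(x * x for x in previous)
    phi = (sum(x * y for x, y in zip(previous, current)) / denominator
           if denominator else 0.0)

    # Next difference is phi * last difference; add it back to the last level.
    return series[-1] + phi * diffs[-1], phi

# Toy prices (hypothetical): each change is half of the previous one.
prices = [100.0, 101.0, 101.5, 101.75, 101.875]
forecast, phi = ar1_on_differences_forecast(prices)
print(f"phi = {phi}, next value = {forecast}")
```

When `phi` is close to zero, the forecast is essentially the last observed price, which is worth keeping in mind when reading forecasts that sit very near the previous candle.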






In [None]:
def sarimax_forecast(feature_name):
    """
    Forecast the next value of a column with SARIMAX, using 'vol' as an
    exogenous regressor and orders chosen by auto_arima.
    """
    # Use auto_arima to find the best set of orders.
    # pmdarima 2.x takes exogenous regressors as `X` (formerly `exogenous`).
    auto_model = auto_arima(data[feature_name], X=data[['vol']],
                            seasonal=False, m=0,  # m is the number of periods per season
                            stepwise=True, suppress_warnings=True,
                            error_action='ignore', trace=True)

    # Build the SARIMAX model with the selected orders
    model = SARIMAX(data[feature_name], exog=data['vol'],
                    order=auto_model.order, seasonal_order=auto_model.seasonal_order)
    model_fit = model.fit(disp=False)

    # Forecast the next value. The next candle's volume is unknown,
    # so the last observed volume is used as a stand-in.
    forecast = model_fit.forecast(steps=1, exog=data['vol'].iloc[-1])
    return forecast.iloc[0]

# Run the forecast for each feature
forecast_open = round(sarimax_forecast('open'), 2)
forecast_close = round(sarimax_forecast('close'), 2)
forecast_high = round(sarimax_forecast('high'), 2)
forecast_low = round(sarimax_forecast('low'), 2)

# Print the rounded forecasts
print("Forecast values for the next M5 candle (rounded):")
print(f"Open: {forecast_open}")
print(f"Close: {forecast_close}")
print(f"High: {forecast_high}")
print(f"Low: {forecast_low}")

Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=40868.888, Time=14.93 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=40868.066, Time=0.14 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=40864.694, Time=0.29 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=40864.863, Time=0.94 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=40870.630, Time=0.12 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=40865.172, Time=0.61 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=40865.736, Time=3.12 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=40867.148, Time=4.61 sec
 ARIMA(1,1,0)(0,0,0)[0]             : AIC=40867.539, Time=0.25 sec

Best model:  ARIMA(1,1,0)(0,0,0)[0] intercept
Total fit time: 25.032 seconds





Maximum Likelihood optimization failed to converge. Check mle_retvals





Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=40886.093, Time=7.57 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=40886.186, Time=0.14 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=40881.984, Time=0.30 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=40882.169, Time=1.00 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=40888.693, Time=0.13 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=40882.636, Time=0.64 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=40883.071, Time=2.93 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=40884.606, Time=10.83 sec
 ARIMA(1,1,0)(0,0,0)[0]             : AIC=40884.789, Time=0.15 sec

Best model:  ARIMA(1,1,0)(0,0,0)[0] intercept
Total fit time: 23.707 seconds






Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=39748.048, Time=6.08 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=39796.348, Time=0.16 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=39758.931, Time=0.30 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=39760.178, Time=1.73 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=39799.772, Time=0.20 sec
 ARIMA(1,1,2)(0,0,0)[0] intercept   : AIC=39755.270, Time=9.22 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=39754.880, Time=4.70 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept   : AIC=39755.573, Time=13.12 sec
 ARIMA(2,1,3)(0,0,0)[0] intercept   : AIC=39757.759, Time=13.62 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=39760.428, Time=2.78 sec
 ARIMA(1,1,3)(0,0,0)[0] intercept   : AIC=39757.533, Time=5.35 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept   : AIC=39755.965, Time=6.02 sec
 ARIMA(3,1,3)(0,0,0)[0] intercept   : AIC=inf, Time=20.90 sec
 ARIMA(2,1,2)(0,0,0)[0]             : AIC=39751.755, Time=4.72 sec

Best model:  ARIMA(2,




Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.


Non-invertible starting MA parameters found. Using zeros as starting parameters.





Performing stepwise search to minimize aic
 ARIMA(2,1,2)(0,0,0)[0] intercept   : AIC=40221.534, Time=7.28 sec
 ARIMA(0,1,0)(0,0,0)[0] intercept   : AIC=40264.029, Time=0.23 sec
 ARIMA(1,1,0)(0,0,0)[0] intercept   : AIC=40228.656, Time=0.52 sec
 ARIMA(0,1,1)(0,0,0)[0] intercept   : AIC=40226.347, Time=2.68 sec
 ARIMA(0,1,0)(0,0,0)[0]             : AIC=40267.078, Time=0.20 sec
 ARIMA(1,1,2)(0,0,0)[0] intercept   : AIC=40230.289, Time=5.50 sec
 ARIMA(2,1,1)(0,0,0)[0] intercept   : AIC=40219.603, Time=2.68 sec
 ARIMA(1,1,1)(0,0,0)[0] intercept   : AIC=40226.177, Time=1.59 sec
 ARIMA(2,1,0)(0,0,0)[0] intercept   : AIC=40223.716, Time=0.65 sec
 ARIMA(3,1,1)(0,0,0)[0] intercept   : AIC=40221.552, Time=7.98 sec
 ARIMA(3,1,0)(0,0,0)[0] intercept   : AIC=40223.226, Time=1.64 sec
 ARIMA(3,1,2)(0,0,0)[0] intercept   : AIC=40223.568, Time=6.58 sec
 ARIMA(2,1,1)(0,0,0)[0]             : AIC=40222.841, Time=1.13 sec

Best model:  ARIMA(2,1,1)(0,0,0)[0] intercept
Total fit time: 38.707 seconds






Forecast values for the next M5 candle (rounded):
Open: 16018.22
Close: 16018.11
High: 16021.59
Low: 16016.96
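
The stepwise trace recorded for `open` in this SARIMAX run lists the same AIC values as the ARIMA-only search earlier in the notebook, which suggests that `vol` did not influence the order search in that run. A likely cause, offered as an assumption and not a confirmed diagnosis, is the keyword: pmdarima 2.x takes regressors as `X`, and an `exogenous` keyword would not be applied to the search. The sketch below copies the AIC values from the two `open` traces, compares them, and picks the minimum-AIC candidate the way the stepwise search reports it; the dictionary layout is a choice made for this example.

```python
# Keys are ((p, d, q), with_intercept); values are AICs copied from the traces.
arima_open_trace = {
    ((2, 1, 2), True): 40868.888,
    ((0, 1, 0), True): 40868.066,
    ((1, 1, 0), True): 40864.694,
    ((0, 1, 1), True): 40864.863,
    ((0, 1, 0), False): 40870.630,
    ((2, 1, 0), True): 40865.172,
    ((1, 1, 1), True): 40865.736,
    ((2, 1, 1), True): 40867.148,
    ((1, 1, 0), False): 40867.539,
}

# Values from the search that was meant to include 'vol' as a regressor.
sarimax_open_trace = {
    ((2, 1, 2), True): 40868.888,
    ((0, 1, 0), True): 40868.066,
    ((1, 1, 0), True): 40864.694,
    ((0, 1, 1), True): 40864.863,
    ((0, 1, 0), False): 40870.630,
    ((2, 1, 0), True): 40865.172,
    ((1, 1, 1), True): 40865.736,
    ((2, 1, 1), True): 40867.148,
    ((1, 1, 0), False): 40867.539,
}

# Identical AICs mean the two searches scored the candidates identically.
same_scores = arima_open_trace == sarimax_open_trace

# The reported "best model" is the candidate with the lowest AIC.
best = min(arima_open_trace, key=arima_open_trace.get)

print("traces identical:", same_scores)
print("lowest-AIC candidate:", best)
```

If a regressor had entered the likelihood, at least some AIC values would normally differ between the two searches, so this comparison is a cheap sanity check after any change to the `auto_arima` call.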






SARIMAX  Model accuracy: 99.99655093973188 %

Open: 16018.22
Close: 16018.11
High: 16020.31
Low: 16017.02

In [None]:
# Forecast values
predicted = {
    "Open": forecast_open,
    "Close": forecast_close,
    "High": forecast_high,
    "Low": forecast_low
}

# Actual values (entered by hand)
actual = {
    "Open": 16017.37,
    "Close": 16017.87,
    "High": 16021.18,
    "Low": 16016.77
}

# Compute the absolute percentage error for each feature
percentage_errors = {}
for key in predicted.keys():
    error = abs(predicted[key] - actual[key]) / actual[key]
    percentage_errors[key] = error * 100

# Mean absolute percentage error, reported as 100 minus the error
average_error = sum(percentage_errors.values()) / len(percentage_errors)
model_accuracy = 100 - average_error
print(f"Model accuracy: {model_accuracy} %")


Model accuracy: 99.99736239138554 %
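
The figure above is 100 minus the mean absolute percentage error over the four prices of a single candle. Because each error is divided by a price level near 16,000, even a forecast that misses by several points scores above 99.9 %. The sketch below reproduces the calculation from the SARIMAX forecasts and the hand-entered actual values, then adds two comparisons that are not part of the original notebook: the mean absolute error relative to the candle's high-low range, and the score of a deliberately shifted forecast. The function names are hypothetical.

```python
# SARIMAX forecasts and actual values, copied from the cells above.
predicted = {"Open": 16018.22, "Close": 16018.11, "High": 16021.59, "Low": 16016.96}
actual = {"Open": 16017.37, "Close": 16017.87, "High": 16021.18, "Low": 16016.77}

def accuracy_from_mape(pred, act):
    """Return 100 minus the mean absolute percentage error."""
    errors = [abs(pred[k] - act[k]) / act[k] * 100 for k in act]
    return 100 - sum(errors) / len(errors)

def mean_abs_error(pred, act):
    """Return the mean absolute error in price points."""
    return sum(abs(pred[k] - act[k]) for k in act) / len(act)

accuracy = accuracy_from_mape(predicted, actual)

# Scale the error by the size of the actual candle instead of the price level.
candle_range = actual["High"] - actual["Low"]
mae_vs_range = mean_abs_error(predicted, actual) / candle_range

# A deliberately poor forecast: every price is off by 10 points.
shifted = {k: v + 10 for k, v in actual.items()}
shifted_accuracy = accuracy_from_mape(shifted, actual)

print(f"accuracy (100 - MAPE): {accuracy:.5f} %")
print(f"mean abs error / candle range: {mae_vs_range:.1%}")
print(f"accuracy of a forecast off by 10 points: {shifted_accuracy:.5f} %")
```

Relating the error to the candle range, or to a naive forecast such as the previous close, gives a scale on which a good and a poor forecast are easier to tell apart.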


In [None]:
# Build a DataFrame holding a single candle
df_predicted = pd.DataFrame([predicted])

fig = go.Figure(data=[go.Candlestick(
                open=df_predicted['Open'],
                high=df_predicted['High'],
                low=df_predicted['Low'],
                close=df_predicted['Close'])])

fig.update_layout(
    title='Forecast M5 Candle for US100',
    xaxis_title='M5 candle',
    yaxis_title='Price',
    xaxis_rangeslider_visible=False,
    yaxis=dict(
        tickformat=".2f"
    ),
    xaxis=dict(
        type='category'
    )
)

fig.show()
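
Each of the four prices comes from a separately fitted model, so nothing forces the forecasts to form a valid candle, that is, a high at or above both open and close and a low at or below both. A minimal check that could run before plotting is sketched below; the function names `is_valid_candle` and `repair_candle` are hypothetical, and widening the high/low is only one possible way to repair an inconsistent forecast.

```python
def is_valid_candle(open_, high, low, close):
    """True when high and low bracket both open and close."""
    return high >= max(open_, close) and low <= min(open_, close)

def repair_candle(open_, high, low, close):
    """Widen high/low just enough to bracket open and close."""
    return open_, max(high, open_, close), min(low, open_, close), close

# SARIMAX forecasts printed earlier in the notebook.
sarimax_candle = (16018.22, 16021.59, 16016.96, 16018.11)
print("forecast candle valid:", is_valid_candle(*sarimax_candle))

# A hypothetical inconsistent forecast: the high sits below the open.
broken = (10.0, 9.0, 8.0, 9.5)
print("broken candle valid:", is_valid_candle(*broken))
print("repaired:", repair_candle(*broken))
```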
