# Time series decomposition
- Time series data can often be decomposed into trend (long-term), seasonal (periodic), and residual (random noise) components.

In [1]:
import pandas as pd
import numpy as np

In [13]:
dates = pd.date_range(start='2022-01-01' , end='2023-03-31' )
values = np.random.randint(low=1, high=100, size=len(dates))

df = pd.DataFrame({'date':dates, 'value':values})

df = df.set_index('date')
df.head()

date,value
2022-01-01,53
2022-01-02,73
2022-01-03,42
2022-01-04,19
2022-01-05,90


In [14]:
# !pip install statsmodels

In [18]:
from statsmodels.tsa.seasonal import seasonal_decompose

# Split the series into trend, a weekly seasonal pattern (period=7), and residual
decomposition = seasonal_decompose(df['value'], period=7)

df['trend'] = decomposition.trend
df['seasonal'] = decomposition.seasonal
df['residual'] = decomposition.resid

df

date,value,trend,seasonal,residual
2022-01-01,53,,-1.783830,
2022-01-02,73,,-3.185616,
2022-01-03,42,,3.363491,
2022-01-04,19,48.285714,3.513251,-32.798965
2022-01-05,90,42.000000,-4.658830,52.658830
...,...,...,...,...
2023-03-27,51,41.571429,3.363491,6.065080
2023-03-28,58,42.142857,3.513251,12.343892
2023-03-29,4,,-4.658830,
2023-03-30,18,,0.448312,
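
The NaN values at both ends of the `trend` and `residual` columns come from the centered moving average used to estimate the trend. The following is a minimal NumPy-only sketch of a classical additive decomposition for an odd period. It approximates what `seasonal_decompose` does by default and is not its actual implementation. The helper name `additive_decompose` and the seeded random data are assumptions made for illustration.

```python
import numpy as np

def additive_decompose(values, period):
    """Classical additive decomposition for an odd period (illustrative helper)."""
    x = np.asarray(values, dtype=float)
    half = period // 2

    # Trend: centered moving average; the first and last `half` points stay NaN.
    trend = np.full(len(x), np.nan)
    trend[half:len(x) - half] = np.convolve(x, np.ones(period) / period, mode="valid")

    # Seasonal: mean detrended value at each position in the cycle, centered on zero.
    detrended = x - trend
    cycle = np.array([np.nanmean(detrended[i::period]) for i in range(period)])
    cycle -= cycle.mean()
    seasonal = np.tile(cycle, len(x) // period + 1)[:len(x)]

    # Residual: whatever trend and seasonality leave unexplained.
    residual = x - trend - seasonal
    return trend, seasonal, residual

# Random integers similar to the notebook's data (455 daily values), seeded here
rng = np.random.default_rng(0)
values = rng.integers(1, 100, size=455)
trend, seasonal, residual = additive_decompose(values, period=7)
```

With `period=7` the window needs three neighbours on each side, so the first three and last three rows have no trend estimate and therefore no residual. Wherever the trend is defined, the three components add back up to the original value.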


# Time series forecasting with ARIMA and SARIMA models
- ARIMA stands for AutoRegressive Integrated Moving Average; SARIMA is its seasonal extension (Seasonal ARIMA).
- ARIMA models are used for time series data that does not have a seasonal component.
- SARIMA models are used for time series data that does have a seasonal component.
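
One rough way to judge whether a series has a seasonal component is to look at its autocorrelation at the suspected seasonal lag. The sketch below is only a heuristic, not a formal test. The helper `autocorr`, the seeded data, and the artificial weekend bump are all assumptions made for illustration.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

rng = np.random.default_rng(0)

# Pure noise, like the random integers used in this notebook (455 daily values)
noise = rng.integers(1, 100, size=455)

# The same noise plus an artificial bump on two days of every 7-day cycle
weekly = noise + 80 * np.tile([0, 0, 0, 0, 0, 1, 1], 65)

print("lag-7 autocorrelation, noise :", round(autocorr(noise, 7), 3))
print("lag-7 autocorrelation, weekly:", round(autocorr(weekly, 7), 3))
```

A value near zero at lag 7 suggests no weekly pattern, so plain ARIMA is a reasonable starting point. A clearly positive value suggests a weekly cycle that SARIMA can model.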

In [19]:
dates = pd.date_range(start = '2022-01-01', end = '2023-03-31')
values = np.random.randint(low=1, high=100, size=len(dates))

df = pd.DataFrame({'date':dates, 'value':values})
df = df.set_index('date')
df.head()

Unnamed: 0_level_0,value
date,Unnamed: 1_level_1
2022-01-01,29
2022-01-02,90
2022-01-03,99
2022-01-04,52
2022-01-05,40


In [21]:
from statsmodels.tsa.arima.model import ARIMA

# Fit an ARIMA(1,1,1) model to the time series data
model = ARIMA(df['value'], order=(1,1,1))
model_fit = model.fit()

# Making a 7-day forecast
forecast = model_fit.forecast(steps=7)
forecast



2023-04-01    50.377216
2023-04-02    49.878912
2023-04-03    49.860685
2023-04-04    49.860018
2023-04-05    49.859994
2023-04-06    49.859993
2023-04-07    49.859993
Freq: D, Name: predicted_mean, dtype: float64

- order=(1,1,1) is (p, d, q): the autoregressive order, the degree of differencing, and the moving-average order.

- Autoregressive (p): predicts future values from past values; with p=1 it uses the most recent value of the series.
- Integrated (d): the number of times the series is differenced, which removes trend so that the series is closer to stationary.
- Moving average (q): uses the errors (residuals) from past predictions to predict future values.
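
To make the three parts of order=(1,1,1) concrete, here is a minimal sketch of how a one-step forecast is assembled once the coefficients are known. It assumes no constant term and zero starting values, and the coefficients `phi` (autoregressive) and `theta` (moving average) are passed in by hand. `ARIMA(...).fit()` estimates them from the data instead, so this is not how statsmodels computes its forecast. The function name is hypothetical.

```python
import numpy as np

def arima_111_next_value(y, phi, theta):
    """One-step forecast from an ARIMA(1,1,1) with known coefficients (illustrative)."""
    y = np.asarray(y, dtype=float)

    # Integrated part (d=1): work on day-to-day changes instead of raw values.
    w = np.diff(y)

    prev_w, prev_err = 0.0, 0.0  # assumed starting values
    for w_t in w:
        # AR part uses the previous change; MA part uses the previous prediction error.
        pred = phi * prev_w + theta * prev_err
        prev_err = w_t - pred
        prev_w = w_t

    # Predict the next change, then undo the differencing by adding the last level.
    next_w = phi * prev_w + theta * prev_err
    return y[-1] + next_w

print(arima_111_next_value([1, 2, 3, 4], phi=1.0, theta=0.0))  # 5.0
print(arima_111_next_value([3, 7, 2], phi=0.0, theta=0.0))     # 2.0
```

With both coefficients at zero, the forecast is just the last observed value (a random walk). With `phi=1.0`, the last change is carried forward.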

# SARIMA model

In [22]:
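# Fit a SARIMA model: order=(p, d, q) plus seasonal_order=(P, D, Q, s),
# where s=7 is the length of the seasonal cycle (weekly for daily data)
from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(df['value'], order=(1,1,1), seasonal_order=(1,1,1,7))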
model_fit = model.fit()

# Make a 7-day forecast
forecast = model_fit.forecast(steps=7)
forecast



2023-04-01    51.638986
2023-04-02    59.200647
2023-04-03    54.148018
2023-04-04    52.709756
2023-04-05    47.397327
2023-04-06    55.394511
2023-04-07    53.249178
Freq: D, Name: predicted_mean, dtype: float64
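
In seasonal_order=(1,1,1,7), the second entry is a seasonal difference: each value has the value from 7 days earlier subtracted from it. The sketch below shows the effect on a made-up series (a linear trend plus a fixed weekly pattern, both assumptions for illustration) using NumPy only.

```python
import numpy as np

days = np.arange(28)
weekly_pattern = np.tile([5, 0, 0, 0, 0, 0, 10], 4)  # repeats every 7 days
y = 0.5 * days + weekly_pattern                      # trend + weekly seasonality

# Seasonal difference (D=1, s=7): subtract the value from one week earlier.
seasonal_diff = y[7:] - y[:-7]

# Regular difference (d=1) applied on top of the seasonal difference.
both = np.diff(seasonal_diff)

print(seasonal_diff[:5])  # the weekly pattern cancels, leaving the weekly trend step
print(both[:5])           # the remaining constant is removed as well
```

The seasonal difference removes the repeating weekly pattern, and the regular difference removes what is left of the trend. The autoregressive and moving-average terms are then fitted to the remaining series.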