### ARIMA (AutoRegressive Integrated Moving Average):

#### It has three components
1-	Autoregression (AR): forecasts the series using a linear combination of its own past values
2-	Integrated (I): how many times the data is differenced to make it stationary
3-	Moving average (MA): models the current value using a linear combination of past forecast errors
#### It also has three parameters:
p: the number of lag observations included in the model (AR order)

d: the number of times the raw observations are differenced

q: the size of the moving average window (MA order)
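
To make the role of d concrete, here is a minimal sketch on made-up series (the values are assumptions, not the inventory data). Differencing once turns a linear trend into a constant, and differencing twice does the same for a quadratic trend.

```python
import numpy as np

# Hypothetical toy series, not the inventory data
t = np.arange(10)
linear = 5 + 3 * t           # linear trend
quadratic = 2 + t ** 2       # quadratic trend

d1 = np.diff(linear, n=1)     # d = 1
d2 = np.diff(quadratic, n=2)  # d = 2

print(d1)  # every value equals the slope, 3
print(d2)  # every value equals 2
```

Each round of differencing shortens the series by one observation, which is why the differenced column later in this notebook starts with a missing value.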

Given time series data Xt, where t is an integer index and the Xt are real numbers, an ARMA model is given by

![](https://wikimedia.org/api/rest_v1/media/math/render/svg/433c765f004fe1138737630568375a41f4e4d659)

The left-hand side is the AR(p) component and the right-hand side is the MA(q) component.


or equivalently by

![](https://wikimedia.org/api/rest_v1/media/math/render/svg/1d1c4e1959722b56ce5e5841b9dc3167ec9072b3)

An ARIMA(p,d,q) is given by:

![](https://wikimedia.org/api/rest_v1/media/math/render/svg/b6ebbe31d07e994b209c391e3d6f8f5d88e267c3)
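
As an illustration of these equations, the sketch below simulates an ARIMA(1,1,1) series with plain NumPy. The coefficients `alpha` and `theta`, the seed and the series length are arbitrary assumptions chosen for the example. The point is only that an ARIMA(p,1,q) series is the cumulative sum of an ARMA(p,q) series.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, arbitrary
n = 200
alpha, theta = 0.6, 0.3          # hypothetical AR(1) and MA(1) coefficients
eps = rng.normal(size=n)         # white-noise errors

# ARMA(1,1): w_t = alpha * w_{t-1} + eps_t + theta * eps_{t-1}
w = np.zeros(n)
w[0] = eps[0]
for i in range(1, n):
    w[i] = alpha * w[i - 1] + eps[i] + theta * eps[i - 1]

# Integrating once (d = 1) gives the ARIMA(1,1,1) series
x = np.cumsum(w)

# Differencing x once recovers the ARMA series (apart from its first value)
print(np.allclose(np.diff(x), w[1:]))
```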


Note:
- SARIMA is ARIMA with an added seasonal component
- I will use the Pyramid ARIMA library (pmdarima) to determine the order (p, d, q) rather than reading it from ACF & PACF plots
- pmdarima chooses the best order automatically by minimizing an information criterion such as AIC or BIC

![](https://wikimedia.org/api/rest_v1/media/math/render/svg/fe67d436d9064a370cbe800b24b05ee8a68d491b)

Where k is the number of estimated parameters and L is the maximum value of the likelihood function for the model.
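
For reference, both criteria can be computed by hand from a model's maximized log-likelihood. The sketch below uses the standard definitions AIC = 2k - 2 ln(L) and BIC = k ln(n) - 2 ln(L), where n is the number of observations. The log-likelihoods and parameter counts are made up for illustration.

```python
import math

def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical fits of two candidate models on n = 100 observations
n = 100
candidates = {
    "simple": {"loglik": -250.0, "k": 2},
    "complex": {"loglik": -249.0, "k": 5},
}

for name, m in candidates.items():
    print(name, aic(m["loglik"], m["k"]), round(bic(m["loglik"], m["k"], n), 2))
# The lower score wins: here the small gain in likelihood
# does not pay for the three extra parameters
```

BIC penalizes extra parameters more heavily than AIC once n is larger than about 7, so it tends to select smaller orders.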

In [None]:
pip install pmdarima

### 1- Import libraries

In [None]:
import numpy as np
import pandas as pd 
from statsmodels.tsa.arima_model import ARIMA  # legacy class, removed in statsmodels 0.13
from pmdarima import auto_arima  # Pyramid ARIMA library

### 2- Load data

In [None]:
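# Parse the DATE column as the index so the series has a datetime index
df = pd.read_csv('../input/real-manufacturing-and-trade-inventories-2020/INVCMRMT.csv', index_col='DATE', parse_dates=True)
df['INVCMRMT'] = df['INVCMRMT'].astype(int)
df.index.freq = 'MS'  # monthly data, stamped at the start of each month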


In [None]:

df['INVCMRMT'].plot(figsize=(12,6)) # Looks like an additive model
# For more information about additive and multiplicative models, check the sources

### 3- Check seasonal component:


In [None]:
from statsmodels.tsa.seasonal import seasonal_decompose
res= seasonal_decompose(df['INVCMRMT'], model='add')
res.plot();
# There seems to be a seasonal component, but it is very small: thousands against a series in the millions


In [None]:
res.seasonal.plot(figsize=(12,8)); # The seasonal component is small, so we ignore it

In [None]:
auto_arima(df['INVCMRMT'], seasonal=False).summary()

### 4- Dickey-Fuller test on the differenced data to make sure it is stationary
#### To check whether the data is stationary or not
For more information, check the sources.

In [None]:
from statsmodels.tsa.statespace.tools import diff
df['diff_1']= diff(df['INVCMRMT'],k_diff=1)

In [None]:
from statsmodels.tsa.stattools import adfuller
def adf_test(series, title=''):
    # Pass a time series and an optional title, print an ADF report
    print(f'Augmented Dickey-Fuller Test : {title}')
    result = adfuller(series.dropna(), autolag='AIC')  # drop NaN values left by differencing
    labels =['ADF Test statistic','P-value','# lags used', '#observations']
    out= pd.Series(result[0:4],index=labels)
    
    for key,val in result[4].items():
        out[f'critical value({key})']=val
    print(out.to_string())
    if result[1]<= 0.05:
        print('Strong evidence against the null hypothesis')
        print('Reject the null hypothesis')
        print('Data has no unit root & is stationary')
    else:
        print('Weak evidence against the null hypothesis')
        print('Fail to reject the null hypothesis')
        print('Data has a unit root & is non-stationary')

In [None]:
adf_test(df['diff_1'])

### 5- Creating model

In [None]:
len(df) # We will grab the last year to test the forecast

In [None]:
train= df.iloc[:271]
test = df.iloc[271:]

model = ARIMA(train['INVCMRMT'], order=(1,1,1))
results= model.fit()
results.summary()
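
The split above hard-codes row 271. A sketch of a split that counts back from the end instead, so that it does not depend on the length of the data, is shown below on a hypothetical monthly frame. The dates, values and the names `demo`, `train_demo`, `test_demo` and `horizon` are assumptions for illustration; only the slicing pattern matters.

```python
import numpy as np
import pandas as pd

# Hypothetical monthly data standing in for df
idx = pd.date_range('2000-01-01', periods=36, freq='MS')
demo = pd.DataFrame({'value': np.arange(36)}, index=idx)

horizon = 12                       # months held out for testing
train_demo = demo.iloc[:-horizon]  # everything except the last year
test_demo = demo.iloc[-horizon:]   # the last 12 months

print(len(train_demo), len(test_demo))
print(test_demo.index[0].date())
```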

### 5- Predictions:


In [None]:
start = len(train)
end = len(train) + len(test) - 1
predictions = results.predict(start, end, typ='levels').rename('ARIMA Predictions') # typ='levels' returns predictions on the original scale instead of the differenced one

In [None]:
test['INVCMRMT'].plot(figsize=(12,8), legend=True)
predictions.plot(legend=True)

## The model can only predict the general trend!

In [None]:
from statsmodels.tools.eval_measures import rmse
error= rmse(test['INVCMRMT'],predictions)
error


In [None]:
test['INVCMRMT'].mean()
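
To judge the size of the error, it helps to express the RMSE relative to the mean of the test data. The sketch below does this by hand with NumPy on made-up numbers (the arrays are assumptions, not this notebook's results).

```python
import numpy as np

# Hypothetical actual and predicted values
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([98.0, 113.0, 118.0, 133.0])

# RMSE: square root of the mean squared difference
rmse_value = np.sqrt(np.mean((actual - predicted) ** 2))

# Error as a fraction of the mean level of the series
relative_error = rmse_value / actual.mean()

print(round(rmse_value, 3), f'{relative_error:.2%}')
```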

### The error is quite large because of the seasonal component.

### 6- Forecasting:

In [None]:
len(df)

In [None]:
model= ARIMA(df['INVCMRMT'],order=(1,1,1))
result= model.fit()
forecast= result.predict(len(df),len(df)+11,typ='levels').rename('ARIMA FORECAST')

df['INVCMRMT'].plot(figsize=(12,8), legend=True)
forecast.plot(legend=True)

#### Source1 : [ACF&PACF](https://www.kaggle.com/taghredsalah199/time-series-correlation-acf-pacf)
#### Source2 : [Descriptive statistical tests](https://www.kaggle.com/taghredsalah199/time-series-descriptive-statistical-tests)