# Functions to check stationarity

In this notebook, we check whether a time series is stationary using the Augmented Dickey-Fuller (ADF) test. <br>

Augmented Dickey-Fuller (ADF) test: it is used to check the stationarity of the data. If the $p$-value is less than $0.05$, we reject the null hypothesis of non-stationarity at the $5\%$ significance level and treat the time series as stationary.<br>
A simple example of a $p$-value:<br>
We have two groups of people. The first group takes special vitamin tablets and the second group does not. Question: is the average weight of the two groups equal? <br>
Null hypothesis: the average weight of the two groups is equal. We then calculate the $p$-value, which is the probability of observing a difference at least as large as the one measured if the null hypothesis were true. If the $p$-value is less than $0.05$, we reject the null hypothesis and conclude that the mean weights are not equal.<br>
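The two-group comparison above can be sketched with a permutation test, which estimates the $p$-value directly by reshuffling the group labels. This is only an illustration: the function name `permutation_p_value` and the weights are invented for this sketch, and a t-test would be an equally reasonable choice.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the labels are exchangeable, so shuffle them
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_permutations + 1)

# Hypothetical weights in kg, invented for this illustration
vitamin_group = [72.1, 75.4, 78.0, 74.3, 77.2, 76.5, 79.1, 73.8]
control_group = [70.2, 71.5, 69.8, 72.0, 70.9, 68.7, 71.1, 70.4]

p = permutation_p_value(vitamin_group, control_group)
print("p-value:", p)
print("Reject equal means at the 5% level:", p < 0.05)
```

A small value here means that a difference this large would rarely arise from random labelling alone, which is the same logic the ADF test applies to its own null hypothesis.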
In the ADF test, the null hypothesis is that the time series has a unit root, i.e. it is non-stationary.
 So, if the $p$-value is less than the significance level (which is usually $0.05$), we can reject the null hypothesis and conclude that the time series is stationary.<br>
The three functions below run the same test and differ only in what they print or return. Use whichever one matches the output type you need.

In [1]:
import numpy as np
import statsmodels.api as sm
import pandas as pd
from statsmodels.tsa.stattools import adfuller, acf
import matplotlib.pyplot as plt


In [2]:
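def First_Test(time_series):
    """Run the ADF test, print the p-value and verdict, and return True if stationary."""
    adf_result = adfuller(time_series)
    p_value = adf_result[1]          # index 1 of the result tuple is the p-value
    is_stationary = p_value < 0.05   # reject the unit-root null at the 5% level
    print("p_value : ", p_value)
    print(f"Second-order stationarity: {'Yes' if is_stationary else 'No'}")
    return is_stationary

# Alias spelled with U+0640 (Arabic tatweel) in place of the underscore,
# so calls written with that character still resolve.
FirstـTest = First_Test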

In [3]:
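def Second_Test(timeseries):
    """Run the ADF test and print the statistic, p-value, critical values and verdict."""
    result = adfuller(timeseries, autolag='AIC')  # lag order selected by AIC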
    print('ADF Statistic:', result[0])
    print('p-value:', result[1])
    print('Critical Values:')
    for key, value in result[4].items():
        print(f'    {key}: {value}')

    if result[1] < 0.05:
        print("The time series is stationary (reject the null hypothesis).")
    else:
        print("The time series is not stationary (fail to reject the null hypothesis).")

In [4]:
def Third_Test(timeseries):
    """Run the ADF test and return only the p-value."""
    dftest = adfuller(timeseries, autolag='AIC')
    p_value = dftest[1]
    return p_value

In [5]:
np.random.seed(0)
#data = np.random.randn(100)
#data = np.random.uniform(1,100,10000)
data = np.random.normal(loc=0, scale=1, size=10000)
#data = [-3,-2,-1,0,1,2,3,4,5,6]
#data =  [3, 5, 7, 9, 11,12,15,23,54,23,12,43]
# Load the dataset
# df = pd.read_csv('your data')                               # 'QUOTAS_3only.csv'
# df = df[(df != 0).all(axis=1) & df.notna().all(axis=1)]     # Remove rows that contain zeros or are empty
# print(df.columns)
#data = df['The name of the column in your data'].values      #'log_lbol' , 'log_bh'
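The commented-out loading steps above can be tried on a small in-memory frame before pointing them at a real file. The column names are taken from the comments, but the values are invented for this sketch. The row filter keeps only rows in which every column is non-zero and non-missing, which matters because `adfuller` expects a 1-D series without NaN values.

```python
import numpy as np
import pandas as pd

# Invented values; only the column names come from the notebook's comments
df = pd.DataFrame({
    "log_lbol": [1.0, 0.0, 2.0, np.nan, 3.0],
    "log_bh":   [0.5, 1.5, 2.5, 3.5, 0.0],
})

# NaN != 0 evaluates to True, so the notna() check is needed as well
keep = (df != 0).all(axis=1) & df.notna().all(axis=1)
clean = df[keep]

data_example = clean["log_lbol"].values   # 1-D NumPy array for the ADF functions
print(clean)
print(data_example)
```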

In [6]:
#data = pd.Series(data)
FirstـTest(data)

p_value :  0.0
Second-order stationarity: Yes


True

In [7]:
Second_Test(data)

ADF Statistic: -99.49026786274287
p-value: 0.0
Critical Values:
    1%: -3.4310041633725734
    5%: -2.861829101294412
    10%: -2.566923883481157
The time series is stationary (reject the null hypothesis).


In [8]:
p_value = Third_Test(data)
if p_value < 0.05:
    print("The dataset is stationary (p-value =", p_value, ")")
else:
    print("The dataset is not stationary (p-value =", p_value, ")")

Third_Test(data)

The dataset is stationary (p-value = 0.0 )


0.0