## Methods to Check Stationarity
### `KPSS Test` : 

The KPSS (Kwiatkowski-Phillips-Schmidt-Shin) test checks whether a time series is stationary. It is slightly less popular than the Dickey-Fuller test. Its null and alternate hypotheses are the opposite of those of the ADF test, which often causes confusion.

The authors of the KPSS test defined the null hypothesis as the process being trend stationary, against the alternate hypothesis that the series has a unit root. We will cover trend stationarity in detail in the next section. For now, let's focus on the implementation and the results of the KPSS test.

Null Hypothesis: The process is trend stationary.

Alternate Hypothesis: The series has a unit root (series is not stationary).



### `ADF Test` :
The Augmented Dickey-Fuller (ADF) test is one of the most popular statistical tests for stationarity. It determines whether the series has a unit root, and hence helps us understand whether the series is stationary. The null and alternate hypotheses of this test are:

Null Hypothesis: The series has a unit root (the autoregressive coefficient a = 1)

Alternate Hypothesis: The series has no unit root.
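
The coefficient $a$ in the null hypothesis can be read from a first-order autoregression. The form below is a simplified sketch: it leaves out the constant, trend, and lagged-difference terms that the augmented test may include.

$$
y_t = a\,y_{t-1} + \varepsilon_t
\quad\Longleftrightarrow\quad
\Delta y_t = (a - 1)\,y_{t-1} + \varepsilon_t
$$

With $a = 1$ the series is a random walk, so it has a unit root. With $|a| < 1$ the effect of each shock $\varepsilon_t$ decays and the series is stationary. The test therefore asks whether the coefficient $(a - 1)$ on $y_{t-1}$ is zero.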

If we fail to reject the null hypothesis, we treat the series as non-stationary. The series may then be linear or difference stationary (we will cover difference stationarity in the next section).
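
As a small illustration of what "difference stationary" means, the sketch below builds a synthetic random walk (not the Amazon data) and shows that first differencing recovers the stationary shocks that generated it. The names `noise`, `walk`, and `diffed` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
noise = rng.normal(size=200)     # stationary white-noise shocks e_t
walk = np.cumsum(noise)          # random walk: y_t = y_{t-1} + e_t (unit root)

diffed = np.diff(walk)           # first difference: y_t - y_{t-1}

# Differencing removes the unit root and leaves the original shocks e_2..e_n
print(np.allclose(diffed, noise[1:]))
```

A series like `walk` is non-stationary, but its first difference is stationary, which is the property that the "I" in ARIMA relies on.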

Example: <br>
`ARMA` - Stationary Time Series Data <br>
`ARIMA` - Non-Stationary Time Series Data

In [1]:
##
import os
from datetime import datetime

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px

import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (10, 8)
mpl.rcParams['axes.grid'] =  False

In [2]:
##
df = pd.read_csv('https://raw.githubusercontent.com/srivatsan88/YouTubeLI/master/dataset/amazon_revenue_profit.csv')
df.head()

|   | Quarter | Revenue | Net Income |
|---|---------|---------|------------|
| 0 | 3/31/2020 | 75452 | 2535 |
| 1 | 12/31/2019 | 87437 | 3268 |
| 2 | 9/30/2019 | 69981 | 2134 |
| 3 | 6/30/2019 | 63404 | 2625 |
| 4 | 3/31/2019 | 59700 | 3561 |


In [3]:
##
df['Quarter'] = pd.to_datetime(df['Quarter'])

df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 61 entries, 0 to 60
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   Quarter     61 non-null     datetime64[ns]
 1   Revenue     61 non-null     int64         
 2   Net Income  61 non-null     int64         
dtypes: datetime64[ns](1), int64(2)
memory usage: 1.6 KB


In [4]:
# check
df.head()

|   | Quarter | Revenue | Net Income |
|---|---------|---------|------------|
| 0 | 2020-03-31 | 75452 | 2535 |
| 1 | 2019-12-31 | 87437 | 3268 |
| 2 | 2019-09-30 | 69981 | 2134 |
| 3 | 2019-06-30 | 63404 | 2625 |
| 4 | 2019-03-31 | 59700 | 3561 |
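
The preview lists the most recent quarter first. KPSS and ADF both treat the values as an ordered sequence, so it is usually safer to sort oldest-first before testing. The sketch below shows one way to do that on the five rows from the preview; `sample` and `ordered` are illustrative names, and the outputs reported in this section were produced without this step.

```python
import pandas as pd

# The five rows shown in the preview above (newest quarter first)
sample = pd.DataFrame({
    'Quarter': ['3/31/2020', '12/31/2019', '9/30/2019', '6/30/2019', '3/31/2019'],
    'Revenue': [75452, 87437, 69981, 63404, 59700],
})
sample['Quarter'] = pd.to_datetime(sample['Quarter'])

# Sort oldest-first and index by date so the values form a chronological series
ordered = sample.sort_values('Quarter').set_index('Quarter')['Revenue']
print(ordered.index.is_monotonic_increasing)
```

The same `sort_values('Quarter')` call could be applied to the full `df` before passing `df['Revenue']` to a test.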


In [5]:
##
fig = px.line(df, x = 'Quarter', y = 'Revenue', title = 'Amazon Quarterly Revenue')
fig.update_xaxes(
    rangeslider_visible = True,
)

fig.show()

#### `KPSS Test`  :
`Null hypothesis` - Series is stationary around a deterministic trend <br>
`Alternate hypothesis` - Series has a unit root and hence is not stationary

In [6]:
from statsmodels.tsa.stattools import kpss





In [7]:
# regression='ct' tests stationarity around a constant plus a linear trend
stats, p, lags, critical_values = kpss(df['Revenue'], regression='ct')





In [8]:
print(f'Test Statistics: {stats}')
print(f'p-value: {p}')
print(f'Critical Values: {critical_values}')

if p < 0.05:
    print("Series is not stationary")
else:
    print("Series is stationary")

Test Statistics: 0.170051682108309
p-value: 0.029956931576409152
Critical Values: {'10%': 0.119, '5%': 0.146, '2.5%': 0.176, '1%': 0.216}
Series is not stationary
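
The p-value is only one way to read this result. The sketch below compares the reported statistic with each reported critical value, using the numbers copied from the output above. For KPSS, the null hypothesis is rejected when the statistic exceeds the critical value.

```python
# Values copied from the KPSS output above
stat = 0.170051682108309
critical_values = {'10%': 0.119, '5%': 0.146, '2.5%': 0.176, '1%': 0.216}

# KPSS rejects the null (trend stationarity) when the statistic exceeds the critical value
rejected = {level: stat > cv for level, cv in critical_values.items()}
print(rejected)
# {'10%': True, '5%': True, '2.5%': False, '1%': False}
```

The null is rejected at the 10% and 5% levels but not at 2.5% or 1%, which agrees with a p-value of about 0.03.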


#### `ADF Test` :
`Null hypothesis` - Series possesses a unit root and hence is not stationary <br>
`Alternate hypothesis` - Series is stationary

In [9]:
from statsmodels.tsa.stattools import adfuller

In [10]:
result =  adfuller(df['Revenue'])

In [11]:
print(f'Test Statistics: {result[0]}')
print(f'p-value: {result[1]}')
print(f'Critical Values: {result[4]}')

if result[1] > 0.05:
    print("Series is not stationary")
else:
    print("Series is stationary")

Test Statistics: -2.444836038197234
p-value: 0.1294794312183875
Critical Values: {'1%': -3.568485864, '5%': -2.92135992, '10%': -2.5986616}
Series is not stationary
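
Because the two tests have opposite null hypotheses, it can help to read them together. The sketch below encodes a common rule of thumb for combining the two outcomes. `combine_verdicts` is a hypothetical helper, not part of statsmodels, and the 0.05 threshold is the one used in the cells above.

```python
def combine_verdicts(kpss_p, adf_p, alpha=0.05):
    """Hypothetical helper: combine KPSS and ADF p-values into one reading."""
    kpss_stationary = kpss_p >= alpha   # KPSS: failing to reject means stationary
    adf_stationary = adf_p < alpha      # ADF: rejecting the unit root means stationary
    if kpss_stationary and adf_stationary:
        return 'stationary'
    if not kpss_stationary and not adf_stationary:
        return 'non-stationary'
    if kpss_stationary:
        return 'trend stationary: consider detrending'
    return 'difference stationary: consider differencing'

# p-values copied from the KPSS and ADF outputs above
print(combine_verdicts(kpss_p=0.029956931576409152, adf_p=0.1294794312183875))
# non-stationary
```

For the revenue series both tests point the same way, so the two results reinforce each other.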
