## Stationarity Test

It is important to ensure that a time series is stationary before fitting forecasting models to it. If the series is non-stationary (i.e., if its mean, variance, or autocovariance changes with time), models that assume stationarity cannot be relied on to predict future data points.
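
To make the definition concrete, the following sketch simulates two hypothetical processes using only the Python standard library: white noise, which is stationary, and a random walk, which is not. It is illustrative only and does not use the stock data; the names `white_noise`, `random_walk`, and `variance_at`, as well as the path count and length, are assumptions made for this example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

N_PATHS = 500   # number of simulated series (assumption for illustration)
N_STEPS = 200   # length of each series (assumption for illustration)


def white_noise(n):
    """Independent N(0, 1) draws: a stationary series."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(n):
    """Cumulative sum of N(0, 1) shocks: a non-stationary series."""
    path, level = [], 0.0
    for _ in range(n):
        level += random.gauss(0.0, 1.0)
        path.append(level)
    return path


def variance_at(paths, t):
    """Variance across all simulated paths at time index t."""
    return statistics.variance([path[t] for path in paths])


noise_paths = [white_noise(N_STEPS) for _ in range(N_PATHS)]
walk_paths = [random_walk(N_STEPS) for _ in range(N_PATHS)]

early, late = 9, 199  # the 10th and 200th observations (zero-based indices)
noise_ratio = variance_at(noise_paths, late) / variance_at(noise_paths, early)
walk_ratio = variance_at(walk_paths, late) / variance_at(walk_paths, early)

print(f"white noise variance ratio (late/early): {noise_ratio:.2f}")
print(f"random walk variance ratio (late/early): {walk_ratio:.2f}")
```

In theory the white-noise ratio is 1, because its variance does not depend on time, while the random-walk ratio is 200/10 = 20, because its variance grows in proportion to the number of steps. The simulated ratios should land close to those values.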

In [None]:
import import_ipynb
import data_analysis
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import adfuller

In [2]:
ds = data_analysis.dataset

In [3]:
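# Function adapted from the statsmodels docs.
# Runs the Augmented Dickey-Fuller test and prints the test statistic, p-value,
# number of lags, number of observations, and critical values.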
def adf_test(timeseries):
    print("Results of Dickey-Fuller Test:")
    dftest = adfuller(timeseries, autolag="AIC")
    dfoutput = pd.Series(
        dftest[0:4],
        index=[
            "Test Statistic",
            "p-value",
            "#Lags Used",
            "Number of Observations Used",
        ],
    )
    for key, value in dftest[4].items():
        dfoutput["Critical Value (%s)" % key] = value
    print(dfoutput)
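
For reference, with the default `regression="c"` setting of `adfuller` (a constant and no trend), the test is based on a regression of the following form. Here $y_t$ is the series and $p$ is the number of lagged differences (selected by AIC in the function above); the symbols $\alpha$, $\gamma$, $\delta_i$, and $\varepsilon_t$ are notation introduced for this sketch.

$$
\Delta y_t = \alpha + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t
$$

The null hypothesis is $\gamma = 0$ (a unit root, so the series is non-stationary) and the alternative is $\gamma < 0$. The reported test statistic is the t-ratio of the estimated $\gamma$, so more negative values are stronger evidence against the null.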

### Verizon

In [4]:
adf_test(ds['VZ'])

Results of Dickey-Fuller Test:
Test Statistic                  -3.088076
p-value                          0.027448
#Lags Used                      13.000000
Number of Observations Used    494.000000
Critical Value (1%)             -3.443657
Critical Value (5%)             -2.867408
Critical Value (10%)            -2.569896
dtype: float64


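The Augmented Dickey-Fuller test for Verizon stock prices gives a p-value of 0.027, which is less than 0.05, so we can reject the null hypothesis that the series has a unit root (i.e., behaves like a random walk) at the 5% significance level. The test statistic (-3.088) is below the 5% critical value (-2.867) but not below the 1% critical value (-3.444), so the series is treated as stationary at the 5% level but not at the 1% level.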
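
The same conclusion can be reached by comparing the test statistic with the critical values instead of using the p-value. The following is a minimal sketch of that decision rule; `adf_decision` is a hypothetical helper written for this illustration, not part of statsmodels, and the numbers are copied from the Verizon output above.

```python
def adf_decision(test_statistic, critical_values):
    """Return, per significance level, whether the unit-root null is rejected.

    The ADF test is one-sided: the null is rejected when the test statistic
    is more negative than (i.e. less than) the critical value.
    """
    return {
        level: test_statistic < critical
        for level, critical in critical_values.items()
    }


# Values copied from the Verizon test output.
vz_statistic = -3.088076
vz_critical = {"1%": -3.443657, "5%": -2.867408, "10%": -2.569896}

vz_decision = adf_decision(vz_statistic, vz_critical)
for level, rejected in vz_decision.items():
    verdict = "reject" if rejected else "fail to reject"
    print(f"{level}: {verdict} the unit-root null")
```

For Verizon this prints "fail to reject" at the 1% level and "reject" at the 5% and 10% levels, which is consistent with a p-value between 0.01 and 0.05.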

### AT&T

In [5]:
adf_test(ds['T'])

Results of Dickey-Fuller Test:
Test Statistic                  -1.252554
p-value                          0.650590
#Lags Used                      13.000000
Number of Observations Used    494.000000
Critical Value (1%)             -3.443657
Critical Value (5%)             -2.867408
Critical Value (10%)            -2.569896
dtype: float64


The Augmented Dickey-Fuller test for AT&T stock prices gives a p-value of 0.651, which is greater than 0.05, so we fail to reject the null hypothesis that the series has a unit root. A unit-root process such as a random walk is non-stationary. In this case, the non-stationarity is removed by calculating the returns (or taking the first difference).

In [6]:
diff_T = ds['T'].diff()
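
The cell above uses the first difference. As a small illustration of how differences and returns relate, the sketch below computes the first difference, simple returns, and log returns for a short list of hypothetical prices. The prices are invented for this example and are not taken from the dataset, and plain Python lists are used here instead of a pandas Series.

```python
import math

# Hypothetical closing prices, invented for illustration (not from the dataset).
prices = [100.0, 102.0, 101.0, 105.0]

# Consecutive (previous, current) price pairs.
pairs = list(zip(prices, prices[1:]))

# First difference: p[t] - p[t-1]
differences = [curr - prev for prev, curr in pairs]

# Simple return: (p[t] - p[t-1]) / p[t-1]
simple_returns = [(curr - prev) / prev for prev, curr in pairs]

# Log return: ln(p[t] / p[t-1]), i.e. the first difference of log prices
log_returns = [math.log(curr / prev) for prev, curr in pairs]

print(differences)  # [2.0, -1.0, 4.0]
print([round(r, 4) for r in simple_returns])
print([round(r, 4) for r in log_returns])
```

Each transformed series is one observation shorter than the original. With pandas, `Series.diff()` gives the first difference and `Series.pct_change()` gives simple returns, each with a leading `NaN` in place of the missing first value, which is why `dropna()` is applied before running the test.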

Re-running the test on the differenced AT&T series shows that the non-stationarity has been removed: the p-value (about 2.8e-09) is far below 0.05, and the test statistic (-6.76) is below the 1% critical value (-3.44).

In [7]:
adf_test(diff_T.dropna())

Results of Dickey-Fuller Test:
Test Statistic                -6.760867e+00
p-value                        2.794516e-09
#Lags Used                     1.200000e+01
Number of Observations Used    4.940000e+02
Critical Value (1%)           -3.443657e+00
Critical Value (5%)           -2.867408e+00
Critical Value (10%)          -2.569896e+00
dtype: float64
