<a href="https://colab.research.google.com/github/cagBRT/timeSeries/blob/main/3b_TestingForStationarity.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# Clone the entire repo.
!git clone -s https://github.com/cagBRT/timeSeries.git cloned-repo
%cd cloned-repo

In [None]:
import pandas as pd
import numpy as np


# **Methods for testing for stationarity**<br>
The stationarity of a series can be assessed informally by looking at a plot of the series.<br>
Another quick check is to split the series into two or more contiguous parts and compute summary statistics such as the mean, variance, and autocorrelation for each part. If the statistics differ substantially between parts, the series is unlikely to be stationary.<br>
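
The split-and-compare check can be sketched as follows. This is a minimal illustration, not part of the original notebook: the helper name `chunk_stats` and the synthetic trending series are assumptions made for the example.

```python
import numpy as np
import pandas as pd

def chunk_stats(series, n_parts=2):
    """Split a series into contiguous parts and summarise each part.

    Hypothetical helper for illustration only.
    """
    values = np.asarray(series, dtype=float)
    parts = np.array_split(values, n_parts)
    rows = []
    for i, part in enumerate(parts):
        s = pd.Series(part)
        rows.append({
            'part': i,
            'mean': s.mean(),
            'variance': s.var(),
            'lag1_autocorr': s.autocorr(lag=1),
        })
    return pd.DataFrame(rows).set_index('part')

# Synthetic data (an assumption): a linear trend plus Gaussian noise.
rng = np.random.default_rng(0)
trending = 0.5 * np.arange(200) + rng.normal(size=200)

stats = chunk_stats(trending, n_parts=2)
print(stats)
```

For a trending series like this one, the mean of the second part is far above the mean of the first, which is the kind of difference that suggests non-stationarity.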


Stationarity can also be determined quantitatively, using statistical tests called 'Unit Root Tests'. There are several variations; each checks whether a time series is non-stationary and possesses a unit root.
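
To make the idea of a unit root concrete, here is a small simulation sketch. It assumes the simplest case, a first-order autoregression y_t = phi * y_{t-1} + e_t, where phi = 1 corresponds to a unit root (a random walk). The function name `simulate_ar1` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np

def simulate_ar1(phi, n_steps, n_paths, rng):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal shocks e_t."""
    shocks = rng.normal(size=(n_paths, n_steps))
    y = np.zeros((n_paths, n_steps))
    y[:, 0] = shocks[:, 0]
    for t in range(1, n_steps):
        y[:, t] = phi * y[:, t - 1] + shocks[:, t]
    return y

rng = np.random.default_rng(42)
unit_root = simulate_ar1(phi=1.0, n_steps=200, n_paths=500, rng=rng)
stationary = simulate_ar1(phi=0.5, n_steps=200, n_paths=500, rng=rng)

# Compare the variance across simulated paths at an early and a late time step.
for name, paths in [('phi=1.0', unit_root), ('phi=0.5', stationary)]:
    early, late = paths[:, 49].var(), paths[:, 199].var()
    print(f'{name}: variance at t=50 is {early:.2f}, at t=200 is {late:.2f}')
```

With phi = 1 the variance keeps growing over time, so the series is non-stationary; with phi = 0.5 it settles at a roughly constant level.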

Several such tests are available. <br>
We will look at two:<br>

- Augmented Dickey-Fuller test (ADF test)
- Kwiatkowski-Phillips-Schmidt-Shin test (KPSS test), which tests for level or trend stationarity


The most commonly used is the ADF test, where the null hypothesis is that the time series possesses a unit root and is non-stationary. So, if the p-value of the ADF test is less than the significance level (for example, 0.05), you reject the null hypothesis and conclude that the series is stationary.
<br><BR>


In [None]:
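from statsmodels.tsa.stattools import adfuller, kpss

# Load the example series; the 'date' column is parsed as datetimes.
df = pd.read_csv('timeSeriesExample.csv', parse_dates=['date'])

# ADF test: the null hypothesis is that the series has a unit root (non-stationary).
# autolag='AIC' picks the lag length that minimizes the AIC.
result = adfuller(df.value.values, autolag='AIC')
print(f'ADF Statistic: {result[0]}')
print(f'p-value: {result[1]}')
print('Critical Values:')  # print the header once, not once per critical value
for key, value in result[4].items():
    print(f'   {key}, {value}')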

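The KPSS test, on the other hand, takes stationarity as its null hypothesis: the series is stationary around a constant level (`regression='c'`) or around a deterministic trend (`regression='ct'`). The null hypothesis and the p-value interpretation are therefore the opposite of the ADF test: a p-value below the significance level indicates that the series is not stationary.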
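
Because the two tests have opposite null hypotheses, their p-values are often read together. The sketch below shows one simple way to combine them. The function name `interpret_tests`, the labels it returns, and the default `alpha=0.05` are assumptions for illustration, not part of statsmodels.

```python
def interpret_tests(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF and KPSS p-values into a rough verdict.

    Hypothetical helper: ADF's null is a unit root (non-stationary),
    while KPSS's null is stationarity.
    """
    adf_rejects_unit_root = adf_pvalue < alpha
    kpss_rejects_stationarity = kpss_pvalue < alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return 'stationary'
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return 'non-stationary'
    # The two tests disagree; further inspection (for example, detrending
    # or differencing the series) is needed before drawing a conclusion.
    return 'inconclusive'

print(interpret_tests(adf_pvalue=0.01, kpss_pvalue=0.10))
print(interpret_tests(adf_pvalue=0.60, kpss_pvalue=0.01))
print(interpret_tests(adf_pvalue=0.01, kpss_pvalue=0.01))
```

In practice the two p-values would come from `result[1]` of the `adfuller` and `kpss` calls.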

In [None]:
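# KPSS test: the null hypothesis is that the series is stationary.
# regression='c' tests stationarity around a constant level;
# use regression='ct' to test stationarity around a linear trend.
result = kpss(df.value.values, regression='c')
print('\nKPSS Statistic: %f' % result[0])
print('p-value: %f' % result[1])
print('Critical Values:')  # print the header once, not once per critical value
for key, value in result[3].items():
    print(f'   {key}, {value}')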

# **White Noise**

Like a stationary series, white noise is not a function of time: its mean and variance do not change over time. The difference is that **white noise is completely random with a mean of 0**.

In white noise there is no pattern whatsoever. If you consider the sound signals in an FM radio as a time series, the blank sound you hear between the channels is white noise.

Mathematically, white noise is a sequence of uncorrelated random numbers with mean zero and constant variance.
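
These properties can be checked numerically. The sketch below is illustrative only: the helper `lag_autocorr`, the seed, and the sample size are assumptions. It draws Gaussian noise and compares the sample mean and the first few sample autocorrelations with the approximate 95% bound of 1.96 / sqrt(n) that applies to white noise.

```python
import numpy as np

def lag_autocorr(x, lag):
    """Sample autocorrelation of x at a positive lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

rng = np.random.default_rng(7)
noise = rng.normal(loc=0.0, scale=1.0, size=1000)

sample_mean = noise.mean()
bound = 1.96 / np.sqrt(len(noise))  # approximate 95% band for white noise
acf = {lag: lag_autocorr(noise, lag) for lag in (1, 2, 3)}

print(f'sample mean: {sample_mean:.3f}')
print(f'approximate 95% bound: +/-{bound:.3f}')
for lag, value in acf.items():
    print(f'autocorrelation at lag {lag}: {value:.3f}')
```

For white noise, the mean should be close to zero and most autocorrelations should fall inside the band; a series with a trend or other pattern would show values well outside it.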



In [None]:
randvals = np.random.randn(1000)
pd.Series(randvals).plot(title='Random White Noise', color='k')