* Mathematically, such a (continuous) time series is referred to as an __Ornstein-Uhlenbeck__ process.
* Does a series behave like an __Ornstein-Uhlenbeck__ process?
* Use __statsmodels__ to test for such characteristics of the time series
* Test for the __stationarity__ of a time series

* A continuous mean-reverting time series can be represented by an Ornstein-Uhlenbeck stochastic differential equation
* $dx_t = θ(μ − x_t )dt + σdW_t$
* $θ$ is the rate of reversion to the mean, $μ$ is the mean, $σ$ is the volatility (the scale of the random shocks, not the variance) and $W_t$ is a Wiener process, also called Brownian motion.
* The change in the price series over the next infinitesimal time period is proportional to the difference between the mean price and the current price, plus Gaussian noise.
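
To make the equation concrete, here is a minimal sketch that simulates an Ornstein-Uhlenbeck path with an Euler-Maruyama discretisation, using only the standard library. The function name `simulate_ou` and all parameter values are illustrative assumptions, not estimates from any real price series.

```python
import math
import random


def simulate_ou(theta, mu, sigma, x0, dt, n_steps, seed=0):
    """Euler-Maruyama discretisation of dx = theta*(mu - x)*dt + sigma*dW."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        # Wiener increment: Gaussian with mean 0 and variance dt
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(x + theta * (mu - x) * dt + sigma * dw)
    return path


# Illustrative parameters (assumptions, not fitted to data)
path = simulate_ou(theta=2.0, mu=100.0, sigma=1.0, x0=90.0, dt=0.01, n_steps=5000)
tail = path[1000:]  # drop the initial pull from x0 towards mu
print("start:", path[0])
print("long-run average:", sum(tail) / len(tail))
```

Starting below the mean, the path is pulled up towards $μ$ and then fluctuates around it; a larger $θ$ makes the pull faster and a larger $σ$ makes the fluctuations wider.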

* Augmented Dickey-Fuller Test (ADF Test)
* Linear lag model of order p
* $\Delta y_t = \alpha + \beta t + \gamma y_{t-1} + \delta_1 \Delta y_{t-1} + \cdots + \delta_{p-1} \Delta y_{t-p+1} + \epsilon_t$
* The null hypothesis is that $γ$ is zero, i.e. the series has a unit root
* If the null is rejected, the change in the price series depends on the current price level, so the series is unlikely to be a random walk.
* You usually reject the null when the p-value is less than or equal to a chosen significance level (for example 0.05)
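
The regression behind the test can be sketched by hand for the simplest case: no trend term ($β = 0$) and no lagged differences, which is the plain Dickey-Fuller regression $\Delta y_t = α + γ y_{t-1} + \epsilon_t$. The helper `dickey_fuller_stat` and the synthetic series below are assumptions made for illustration; this is not how `statsmodels` is implemented, and the resulting statistic must be compared against Dickey-Fuller critical values rather than ordinary Student-t tables.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic of gamma in  dy_t = alpha + gamma * y_{t-1} + eps_t."""
    x = y[:-1]                                  # lagged levels y_{t-1}
    d = [b - a for a, b in zip(y[:-1], y[1:])]  # first differences dy_t
    n = len(x)
    x_bar = sum(x) / n
    d_bar = sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    gamma = sxy / sxx                           # OLS slope
    alpha = d_bar - gamma * x_bar               # OLS intercept
    ssr = sum((di - alpha - gamma * xi) ** 2 for xi, di in zip(x, d))
    se_gamma = math.sqrt(ssr / (n - 2) / sxx)   # standard error of the slope
    return gamma / se_gamma


rng = random.Random(42)
shocks = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Random walk: y_t = y_{t-1} + eps_t  (unit root, gamma = 0)
walk = [0.0]
for e in shocks:
    walk.append(walk[-1] + e)

# Mean-reverting AR(1): y_t = 0.5 * y_{t-1} + eps_t  (gamma = -0.5)
ar1 = [0.0]
for e in shocks:
    ar1.append(0.5 * ar1[-1] + e)

print("random walk t-stat:", dickey_fuller_stat(walk))
print("AR(1) t-stat:      ", dickey_fuller_stat(ar1))
```

The mean-reverting series should give a strongly negative statistic, while the random walk should give a much less negative one.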

In [2]:
from __future__ import print_function

import statsmodels.tsa.stattools as ts
from pandas_datareader import data


In [4]:
sym_data = data.DataReader("ESNC", 'yahoo')
ts.adfuller(sym_data['Adj Close'], 1)

(-2.5949259690789424,
 0.094050507801525685,
 1,
 1948,
 {'1%': -3.4337113644639197,
  '10%': -2.5675604733017385,
  '5%': -2.8630248480815408},
 -1682.5640218905046)

* First value is the calculated test-statistic
* Second value is the p-value.
* The third value is the number of lags used in the regression.
* The fourth is the number of observations used in the regression.
* The fifth value, the dictionary, contains the critical values of the test-statistic at the 1, 5 and 10 percent levels respectively.
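* The test statistic (-2.59) is more negative than the 10% critical value (-2.57) but not the 5% critical value (-2.86), so you can reject the null of a unit root at the 10% level but not at the 5% or 1% level.
* The p-value of 0.094 says the same thing: a statistic at least this negative would occur about 9.4% of the time if the series really had a unit root.
* The p-value is not the probability that the process is non-stationary; it only measures how surprising the data are under the null.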
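
The comparison against the critical values can be written as a small helper. `rejected_levels` is a hypothetical name introduced here for illustration; it only assumes the tuple layout shown in the output above (statistic first, critical-value dictionary fifth).

```python
def rejected_levels(adf_result):
    """Return the significance levels at which the unit-root null is rejected."""
    test_stat = adf_result[0]
    critical_values = adf_result[4]
    # Reject when the statistic is more negative than the critical value
    return sorted(
        (level for level, crit in critical_values.items() if test_stat < crit),
        key=lambda level: float(level.rstrip("%")),
    )


# Result tuple copied from the ESNC output above
esnc_result = (
    -2.5949259690789424,
    0.094050507801525685,
    1,
    1948,
    {"1%": -3.4337113644639197,
     "10%": -2.5675604733017385,
     "5%": -2.8630248480815408},
    -1682.5640218905046,
)
print(rejected_levels(esnc_result))  # ['10%']
```

An empty list means the null cannot be rejected at any of the reported levels.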

In [5]:
sym_data = data.DataReader("AMZN", 'yahoo')
ts.adfuller(sym_data['Adj Close'], 1)

(0.86448272705383988,
 0.9926088618372102,
 0,
 1949,
 {'1%': -3.4337096375233331,
  '10%': -2.567560067343214,
  '5%': -2.863024085652552},
 13391.622280081636)

* Since the calculated value of the test statistic is larger than any of the critical values at the 1, 5 or 10 percent levels, we cannot reject the null hypothesis of γ=0 and thus we are unlikely to have found a mean reverting time series.

* A time series (or stochastic process) is defined to be strongly stationary if its joint probability distribution is invariant under translations in time or space. In particular, and of key importance for traders, the mean and variance of the process do not change over time or space, and neither follows a trend.
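
As a rough illustration of the "mean and variance do not change" property, the sketch below compares the two halves of a series. This is only an informal check of the first two moments, not a test of strong stationarity, and `half_moments` is a hypothetical helper introduced for this example.

```python
import random
import statistics


def half_moments(series):
    """Return (mean, variance) for the first and second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (
        (statistics.fmean(first), statistics.pvariance(first)),
        (statistics.fmean(second), statistics.pvariance(second)),
    )


rng = random.Random(7)
noise = [rng.gauss(0.0, 1.0) for _ in range(10000)]  # stationary white noise

walk = []    # random walk: cumulative sum of the same noise
level = 0.0
for e in noise:
    level += e
    walk.append(level)

print("white noise:", half_moments(noise))
print("random walk:", half_moments(walk))
```

For white noise the two halves should have nearly the same mean and variance; for the random walk they generally differ, because the process wanders instead of returning to a fixed level.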

In [12]:
from __future__ import print_function
from numpy import asarray, cumsum, log, polyfit, sqrt, std, subtract
from numpy.random import randn

def hurst(series):
    """Returns the Hurst Exponent of the time series vector series"""
    # Work on a plain array so slices are subtracted by position,
    # rather than aligned on index labels when a pandas Series is passed
    series = asarray(series)
    # Create the range of lag values
    lags = range(2, 100)
    # Calculate the square root of the standard deviation of the lagged differences
    tau = [sqrt(std(subtract(series[lag:], series[:-lag]))) for lag in lags]
    # Use a linear fit to estimate the Hurst Exponent
    poly = polyfit(log(lags), log(tau), 1)
    # tau scales as lag**(H/2), so the fitted slope is doubled to give H
    return poly[0]*2.0

# Create a Geometric Brownian Motion, Mean-Reverting and Trending Series
gbm = log(cumsum(randn(100000))+1000)
mr = log(randn(100000)+1000)
tr = log(cumsum(randn(100000)+1)+1000)

# Output the Hurst Exponent for each of the above series
# and the price of Amazon (the Adjusted Close price) for
# the ADF test given above in the article
print("Hurst(GBM):%s" % hurst(gbm))
print("Hurst(MR):%s" % hurst(mr))
print("Hurst(TR):%s" % hurst(tr))

cape = data.DataReader("CAPE", 'yahoo')
esnc = data.DataReader("ESNC", 'yahoo')
amzn = data.DataReader("AMZN", 'yahoo')

# Hurst exponents of the Adjusted Close prices loaded above
print("Hurst(CAPE): %s" % hurst(cape['Adj Close']))
print("Hurst(ESNC): %s" % hurst(esnc['Adj Close']))
print("Hurst(AMZN): %s" % hurst(amzn['Adj Close']))

Hurst(GBM):0.49785803683
Hurst(MR):-8.99982688747e-05
Hurst(TR):0.951720909908
Hurst(CAPE): 0.330720994267
Hurst(ESNC): 0.448092257917
Hurst(AMZN): 0.474886254127
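
A common reading of the Hurst exponent is that values below 0.5 suggest mean reversion, values near 0.5 suggest a random walk, and values above 0.5 suggest trending behaviour. The sketch below applies that reading to the numbers printed above; `describe_hurst` and its tolerance band `tol` are arbitrary assumptions for illustration, not a statistical test.

```python
def describe_hurst(h, tol=0.05):
    """Label a Hurst exponent; `tol` is an arbitrary band around 0.5."""
    if h < 0.5 - tol:
        return "mean reverting"
    if h > 0.5 + tol:
        return "trending"
    return "close to a random walk"


# Values copied from the output above
results = {
    "GBM": 0.49785803683,
    "MR": -8.99982688747e-05,
    "TR": 0.951720909908,
    "CAPE": 0.330720994267,
    "ESNC": 0.448092257917,
    "AMZN": 0.474886254127,
}
for name, h in results.items():
    print("%s: %s" % (name, describe_hurst(h)))
```

Note that ESNC sits right at the edge of the band, so its label changes with a slightly different `tol`; estimates this close to 0.5 should be treated as inconclusive.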
