# Practice Notebook

### A personal notebook for practicing the strategies and concepts I am learning from books and blog articles

In [2]:
import numpy as np
import pandas as pd
import os, os.path
import statsmodels.api as sm
import matplotlib.pyplot as plt
import pandas_datareader as pdr
import fix_yahoo_finance as yf
import datetime


yf.pdr_override()
%matplotlib inline

    Auto-overriding of pandas_datareader's get_data_yahoo() is deprecated and will be removed in future versions.
    Use pdr_override() to explicitly override it.


## Augmented Dickey-Fuller Test (ADF test)

Purpose: To test whether a price series is mean reverting. If it is, the current price level tells us something about the next move: when the price level > mean, the next move is more likely to be downward, and vice versa when the price level < mean.

Hypothesis Test: The null hypothesis is that the proportionality constant lambda == 0, where lambda relates the next change in price to the current price level. If the null can be rejected, the next move in the price series depends on the current level, and the series is NOT a random walk.

Test statistic: Lambda / Standard Error(Lambda). This HAS to be negative for a mean-reverting series, and more negative than the critical value for the null hypothesis to be rejected. A positive value points to a trending series rather than a mean-reverting one. We judge statistical significance by comparing the statistic against the critical values, or equivalently by looking at the p-value.

Other info: We will assume drift ~ 0, since in practice a constant drift in price is much smaller than the daily fluctuations in price. This also keeps the calculation simple.
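
For reference, the quantities above come from the linear model that the ADF test fits to the price changes. The following is a sketch in standard notation, where the symbols $\mu$ (constant offset), $\beta t$ (time trend), $\alpha_i$ (coefficients on lagged changes), and $k$ (number of lags) are names introduced here for illustration and do not appear elsewhere in these notes:

```latex
\Delta y(t) = \lambda\, y(t-1) + \mu + \beta t
            + \alpha_1 \Delta y(t-1) + \cdots + \alpha_k \Delta y(t-k) + \epsilon_t
```

With $\lambda = 0$ the change $\Delta y(t)$ does not depend on the level $y(t-1)$, which is the random-walk case. The test statistic is $\lambda / \mathrm{SE}(\lambda)$.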

We will run the ADF test on the USD.JPY currency exchange rate price series to find out whether it is mean reverting.

In [3]:
#Pull data from yahoo finance
symbol = 'JPY=X'
start = datetime.datetime(2010, 9, 1)
end = datetime.datetime(2017, 9, 1)
jpy_exchange_rate = pdr.get_data_yahoo(symbol, start, end)
jpy_exchange_rate.head()

| Date | Open | High | Low | Close | Adj Close | Volume |
|---|---|---|---|---|---|---|
| 2010-08-31 | 84.139999 | 84.658997 | 83.732002 | 84.139999 | 84.139999 | 0.0 |
| 2010-09-01 | 84.370003 | 84.550003 | 84.022003 | 84.406998 | 84.406998 | 0.0 |
| 2010-09-02 | 84.269997 | 85.192001 | 84.169998 | 84.25 | 84.25 | 0.0 |
| 2010-09-05 | 84.419998 | 84.482002 | 84.029999 | 84.43 | 84.43 | 0.0 |
| 2010-09-06 | 84.129997 | 84.251999 | 83.519997 | 84.120003 | 84.120003 | 0.0 |


In [10]:
from statsmodels.tsa.stattools import adfuller

#drop NaN values to avoid messing up the calculations in the test
jpy_exchange_rate.dropna(inplace=True)
result = adfuller(jpy_exchange_rate['Adj Close'])
result

(-1.00708018379616,
 0.7506732152030301,
 6,
 1817,
 {'1%': -3.4339540519343137,
  '10%': -2.5676175211639354,
  '5%': -2.8631319880806281},
 3382.587173599476)

The test statistic of about -1.0 is well above the 10% critical value of about -2.57 (the p-value is about 0.75), so we cannot reject the null hypothesis (that lambda == 0) even at 90% confidence. Consequently, the ADF test gives us no evidence that the exchange rate is NOT a random walk. However, we can at least say that the series does not look trending, since lambda is not positive.
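
The comparison described above can be sketched in a few lines. The numbers are copied from the `adfuller` output shown earlier; the helper name `rejected_levels` is hypothetical and only meant to illustrate reading the statistic against the critical values:

```python
# Values copied from the adfuller output shown above
adf_stat = -1.00708018379616
p_value = 0.7506732152030301
critical_values = {
    '1%': -3.4339540519343137,
    '5%': -2.8631319880806281,
    '10%': -2.5676175211639354,
}


def rejected_levels(stat, crit):
    """Return the significance levels at which the null (lambda == 0) is rejected.

    The null is rejected at a level when the statistic is more negative
    than that level's critical value. (Hypothetical helper for illustration.)
    """
    ordered = sorted(crit.items(), key=lambda kv: kv[1])  # most negative first
    return [level for level, value in ordered if stat < value]


levels = rejected_levels(adf_stat, critical_values)
if levels:
    print("Null rejected at:", ", ".join(levels))
else:
    print("Cannot reject the null at any listed level")
```

For the USD.JPY statistic the list is empty, which matches the conclusion above. A statistic of, say, -3.0 would be rejected at the 5% and 10% levels but not at 1%.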

## Calculating the Hurst Exponent

By calculating the Hurst Exponent from the formula for the speed of diffusion, we can gauge whether a series is stationary or non-stationary. The Hurst Exponent H lies between 0 and 1, and the null hypothesis is H == 0.5, which indicates a Geometric Brownian Motion. If H < 0.5, the series is more mean-reverting and more likely to be stationary. On the other hand, if H > 0.5, then the series is more trending and more likely to be non-stationary. The closer H is to 1, the more strongly trending the series is; likewise, the closer H is to 0, the more strongly mean-reverting it is.
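
The speed-of-diffusion formula can be written out as a sketch. Here $z$ is the log price and $\tau$ is an arbitrary time lag; both symbols are introduced for illustration:

```latex
\mathrm{Var}(\tau) = \left\langle \left| z(t+\tau) - z(t) \right|^{2} \right\rangle \sim \tau^{2H}
```

Taking square roots, the standard deviation of the lagged differences scales as $\tau^{H}$, so H is the slope of a log-log fit of that standard deviation against the lag. If the fit is instead run on the square root of the standard deviation, which scales as $\tau^{H/2}$, the slope has to be doubled to recover H. For a random walk the variance grows linearly in $\tau$, which gives H = 0.5.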

Autocorrelation: The correlation between a time series and a lagged copy of itself.
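
As a minimal sketch of that definition, the snippet below correlates a series with a copy of itself shifted by `lag` steps. The function name `autocorr`, the seed, and the synthetic series are assumptions made for illustration, not part of the analysis in this notebook:

```python
import numpy as np


def autocorr(x, lag=1):
    """Correlation between a series and a copy of itself shifted by `lag` steps."""
    x = np.asarray(x, dtype=float)
    # Pair each observation with the one `lag` steps later
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]


rng = np.random.default_rng(0)       # fixed seed so the run is repeatable
noise = rng.standard_normal(5000)    # independent draws: no memory
walk = np.cumsum(noise)              # random walk: each value builds on the last

print("white noise, lag 1:", autocorr(noise))
print("random walk, lag 1:", autocorr(walk))
```

The white-noise series should show a lag-1 autocorrelation near 0, while the random walk should show one near 1, because each value differs from the previous one by only a single step.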

In [21]:
from numpy import cumsum, log, polyfit, sqrt, std, subtract
from numpy.random import randn

#source code: https://www.quantstart.com/articles/Basics-of-Statistical-Mean-Reversion-Testing
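def hurst(ts):
    """Returns the Hurst Exponent of the time series vector ts"""
    # Work on a plain NumPy array so the slices below are subtracted by
    # position; recent pandas aligns Series on their index labels instead
    ts = np.asarray(ts, dtype=float)

    # Create the range of lag values
    lags = range(2, 100)

    # Calculate the square root of the standard deviation of the
    # lagged differences, which scales as lag**(H/2)
    tau = [sqrt(std(subtract(ts[lag:], ts[:-lag]))) for lag in lags]

    # Use a linear fit to estimate the Hurst Exponent
    poly = polyfit(log(lags), log(tau), 1)

    # The fitted slope is H/2, so double it to get the Hurst exponent
    return poly[0]*2.0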

# Create a Geometric Brownian Motion, Mean-Reverting and Trending Series
gbm = log(cumsum(randn(100000))+1000)
mr = log(randn(100000)+1000)
tr = log(cumsum(randn(100000)+1)+1000)

# Output the Hurst Exponent for each of the above series
# (to test the hurst function), followed by the USD.JPY exchange
# rate (the Adjusted Close price) used in the ADF test above
print(hurst(gbm))
print(hurst(mr))
print(hurst(tr))
# Assuming you have run the above code to obtain 'jpy_exchange_rate'!
print(hurst(jpy_exchange_rate['Adj Close']))

0.505024502023
-0.000154920578908
0.954821520778
0.546462512592


Here, we can see that the Hurst Exponent obtained for the USD.JPY currency exchange rate is approximately 0.55. That is slightly above 0.5, so the Hurst Exponent is telling us that the USD.JPY series is slightly trending over this period, although it remains close to the random-walk value.