<img src="http://hilpisch.com/tpq_logo.png" alt="The Python Quants" width="35%" align="right" border="0"><br>

# Python for Algorithmic Trading

**Chapter 04 &mdash; Vectorized Backtesting**

In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
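# newer matplotlib releases renamed the style to 'seaborn-v0_8'; fall back to the old name otherwise
plt.style.use('seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn')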
import matplotlib as mpl
mpl.rcParams['font.family'] = 'serif'

## Making Use of Vectorization

### Vectorization with NumPy 

In [2]:
v = [1, 2, 3, 4, 5]

In [3]:
sm = [2 * i for i in v]

In [4]:
sm

[2, 4, 6, 8, 10]

In [5]:
2 * v

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

In [6]:
import numpy as np

In [7]:
a = np.array(v)

In [8]:
a

array([1, 2, 3, 4, 5])

In [9]:
type(a)

numpy.ndarray

In [10]:
2 * a 

array([ 2,  4,  6,  8, 10])

In [11]:
0.5 * a + 2

array([ 2.5,  3. ,  3.5,  4. ,  4.5])

In [12]:
a = np.arange(12).reshape((4, 3))

In [13]:
a

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

In [14]:
2 * a

array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16],
       [18, 20, 22]])

In [15]:
a ** 2

array([[  0,   1,   4],
       [  9,  16,  25],
       [ 36,  49,  64],
       [ 81, 100, 121]])

In [16]:
a.mean()

5.5

In [17]:
np.mean(a)

5.5

In [18]:
a.mean(axis=0)

array([ 4.5,  5.5,  6.5])

In [19]:
np.mean(a, axis=1)

array([  1.,   4.,   7.,  10.])
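The `axis` argument in the cells above decides which dimension is collapsed. As a minimal sketch (the names `loop_means` and `vec_means` are illustrative, not from the original), the column means can be computed once with explicit Python loops and once with the vectorized call to confirm that both agree:

```python
import numpy as np

a = np.arange(12).reshape((4, 3))

# Column means with explicit Python loops: iterate over columns, then over rows.
n_rows, n_cols = a.shape
loop_means = [sum(a[i, j] for i in range(n_rows)) / n_rows
              for j in range(n_cols)]

# The same result in one vectorized call: axis=0 collapses the rows.
vec_means = a.mean(axis=0)

print(loop_means)
print(vec_means)
```

The vectorized version states the intent (mean per column) directly and leaves the looping to NumPy.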

### Vectorization with pandas

In [20]:
import numpy as np

In [21]:
import pandas as pd

In [22]:
a = np.arange(15).reshape(5, 3)

In [23]:
a

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11],
       [12, 13, 14]])

In [24]:
columns = list('abc')

In [25]:
columns

['a', 'b', 'c']

In [26]:
index = pd.date_range('2017-7-1', periods=5, freq='B')

In [27]:
index

DatetimeIndex(['2017-07-03', '2017-07-04', '2017-07-05', '2017-07-06',
               '2017-07-07'],
              dtype='datetime64[ns]', freq='B')

In [28]:
df = pd.DataFrame(a, columns=columns, index=index)

In [29]:
df

             a   b   c
2017-07-03   0   1   2
2017-07-04   3   4   5
2017-07-05   6   7   8
2017-07-06   9  10  11
2017-07-07  12  13  14


In [30]:
2 * df

             a   b   c
2017-07-03   0   2   4
2017-07-04   6   8  10
2017-07-05  12  14  16
2017-07-06  18  20  22
2017-07-07  24  26  28


In [31]:
df.sum()

a    30
b    35
c    40
dtype: int64

In [32]:
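np.mean(df, axis=0)  # column-wise means; axis is explicit because newer pandas versions may reduce over both axes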

a    6.0
b    7.0
c    8.0
dtype: float64

In [33]:
df['a'] + df['c']

2017-07-03     2
2017-07-04     8
2017-07-05    14
2017-07-06    20
2017-07-07    26
Freq: B, dtype: int32

In [34]:
0.5 * df.a + 2 * df.b - df.c

2017-07-03     0.0
2017-07-04     4.5
2017-07-05     9.0
2017-07-06    13.5
2017-07-07    18.0
Freq: B, dtype: float64

In [35]:
df['a'] > 5

2017-07-03    False
2017-07-04    False
2017-07-05     True
2017-07-06     True
2017-07-07     True
Freq: B, Name: a, dtype: bool

In [36]:
df[df['a'] > 5]

             a   b   c
2017-07-05   6   7   8
2017-07-06   9  10  11
2017-07-07  12  13  14


In [37]:
df['c'] > df['b']

2017-07-03    True
2017-07-04    True
2017-07-05    True
2017-07-06    True
2017-07-07    True
Freq: B, dtype: bool

In [38]:
0.15 * df.a + df.b > df.c

2017-07-03    False
2017-07-04    False
2017-07-05    False
2017-07-06     True
2017-07-07     True
Freq: B, dtype: bool
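A boolean `Series` like the one above can be mapped to numeric values in a single vectorized step. The following is a small sketch that rebuilds the same `DataFrame` and turns the last condition into +1/-1 values with `np.where`; the names `condition` and `signal` are illustrative assumptions:

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame used in the cells above.
df = pd.DataFrame(np.arange(15).reshape(5, 3), columns=list('abc'),
                  index=pd.date_range('2017-7-1', periods=5, freq='B'))

# Element-wise comparison yields one boolean per row.
condition = 0.15 * df['a'] + df['b'] > df['c']

# Map True to +1 and False to -1 without an explicit loop.
signal = pd.Series(np.where(condition, 1, -1), index=df.index)
print(signal)
```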

## Strategies based on Simple Moving Averages

### Getting into the Basics 

In [1]:
%matplotlib inline
import pandas as pd
import datetime

In [2]:
from pandas_datareader import data as web

In [3]:
start = datetime.datetime(2010, 1, 1)
end = datetime.datetime(2016, 10, 31)
data = web.DataReader('AAPL', "google", start, end)['Close']

The Google Finance API has not been stable since late 2017. Requests seem
to fail at random. Failure is especially common when bulk downloading.



RemoteDataError: Unable to read URL: https://finance.google.com/finance/historical?q=AAPL&startdate=Jan+01%2C+2010&enddate=Oct+31%2C+2016&output=csv
Response Text:
We're sorry... but your computer or network may be sending automated queries. To protect our users, we can't process your request right now. See Google Help (https://support.google.com/websearch/answer/86640) for more information.
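Since the request above fails, the remaining cells have no `data` object to work with. One possible workaround, sketched here under loud assumptions, is a synthetic stand-in: a random-walk price series on business days. The seed, drift, volatility, and starting level are arbitrary choices, and the series is not AAPL data, so any results obtained with it are purely illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the failed download (NOT real AAPL prices).
rng = np.random.default_rng(100)                      # arbitrary seed
index = pd.bdate_range('2010-01-01', '2016-10-31')    # business days only

# Assumed daily log returns: small positive drift, 1.5% daily volatility.
log_returns = rng.normal(loc=0.0005, scale=0.015, size=len(index))

# Cumulate the log returns into a price path starting near 100.
price = 100 * np.exp(np.cumsum(log_returns))
data = pd.DataFrame({'price': price}, index=index)
print(data.head())
```

Because the frame already has a `price` column, the `pd.DataFrame(data)` and `rename` cells that follow should leave it unchanged.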

In [None]:
data

In [None]:
data = pd.DataFrame(data)

In [None]:
data.rename(columns={'Close': 'price'}, inplace=True)

In [None]:
data.info()

In [None]:
data['SMA1'] = data['price'].rolling(42).mean()

In [None]:
data['SMA2'] = data['price'].rolling(252).mean()

In [None]:
data.tail()

In [None]:
data.plot(title='AAPL stock price | 42 & 252 days SMAs', figsize=(10, 6))
# plt.savefig('../../images/ch04/sma_plot_1.png')

In [None]:
import numpy as np

In [None]:
data['position'] = np.where(data['SMA1'] > data['SMA2'], 1, -1)

In [None]:
data.dropna(inplace=True)

In [None]:
data

In [None]:
data['position'].plot(ylim=[-1.1, 1.1], title='Market Positioning')
# plt.savefig('../../images/ch04/sma_plot_2.png')

In [None]:
data['returns'] = np.log(data['price'] / data['price'].shift(1))

In [None]:
data['test_ret'] = data['price'] / data['price'].shift(1)  # gross returns (price ratios); subtract 1 for net returns

In [None]:
data['returns'].hist(bins=35)
# plt.savefig('../../images/ch04/sma_plot_3.png')

In [None]:
data['test_ret'].hist(bins=35)

In [None]:
data['strategy'] = data['position'].shift(1) * data['returns']

In [None]:
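# test_ret holds gross returns (price ratios): apply the position to the net return, then add 1 back
data['test_strat'] = 1 + data['position'].shift(1) * (data['test_ret'] - 1)
data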

In [None]:
data[['returns', 'strategy']].sum()

In [None]:
data[['returns', 'strategy']].cumsum().plot(figsize=(10, 6))

In [None]:
#data[['test_ret', 'test_strat']].cumsum().plot(figsize=(10, 6))

In [None]:
data[['test_ret', 'test_strat']].cumprod().plot(figsize=(10, 6))

In [None]:
data[['returns', 'strategy']].cumsum().apply(np.exp).plot(figsize=(10, 6))
# plt.savefig('../../images/ch04/sma_plot_4.png')
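The `cumsum().apply(np.exp)` pattern above relies on log returns being additive over time. A minimal check with made-up prices (the numbers are illustrative only) shows that exponentiating the cumulative sum of log returns recovers the gross performance relative to the first price:

```python
import numpy as np
import pandas as pd

price = pd.Series([100.0, 102.0, 101.0, 105.0])  # illustrative prices

# Daily log returns; the first value is NaN and is dropped.
log_ret = np.log(price / price.shift(1)).dropna()

# Summing log returns and exponentiating gives the gross performance ...
gross_from_logs = np.exp(log_ret.cumsum())

# ... which matches each price divided by the initial price.
gross_direct = (price / price.iloc[0]).iloc[1:]

print(gross_from_logs)
print(gross_direct)
```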

In [None]:
data[['returns', 'strategy']].mean() * 252

In [None]:
data[['returns', 'strategy']].std() * 252 ** 0.5

In [None]:
data['cumret'] = data['strategy'].cumsum().apply(np.exp)

In [None]:
data['cummax'] = data['cumret'].cummax()

In [None]:
data[['cumret', 'cummax']].plot(figsize=(10, 6))
# plt.savefig('../../images/ch04/sma_plot_5.png')

In [None]:
drawdown = (data['cummax'] - data['cumret'])
drawdown

In [None]:
drawdown.max()

In [None]:
temp = drawdown[drawdown == 0]
temp

In [None]:
periods = (temp.index[1:].to_pydatetime() - temp.index[:-1].to_pydatetime())
periods

In [None]:
periods[12:15]

In [None]:
periods.max()
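The drawdown steps above can be traced on a toy series. This sketch uses invented cumulative returns on a daily index (an assumption for illustration) and subtracts the index slices directly, which yields a `TimedeltaIndex` without the `to_pydatetime()` round trip:

```python
import pandas as pd

index = pd.date_range('2016-01-04', periods=8, freq='D')
cumret = pd.Series([1.00, 1.05, 1.02, 1.01, 1.06, 1.03, 1.04, 1.08],
                   index=index)  # illustrative gross performance

cummax = cumret.cummax()        # running maximum
drawdown = cummax - cumret      # distance below the running maximum
max_drawdown = drawdown.max()

# Dates on which a new high is reached (drawdown is zero).
highs = drawdown[drawdown == 0].index

# Time between consecutive highs; the maximum is the longest drawdown period.
periods = highs[1:] - highs[:-1]
longest = periods.max()

print(max_drawdown, longest)
```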

### Generalizing the Approach

In [None]:
import SMAVectorBacktester as SMA

In [None]:
#from importlib import reload
#reload(SMA)

In [None]:
smabt = SMA.SMAVectorBacktester('AAPL', 42, 252, '2010-1-1', '2016-10-31')

In [None]:
smabt.run_strategy()

In [None]:
%time smabt.optimize_parameters((30, 50, 2), (200, 300, 2))

In [None]:
smabt.plot_results()
# plt.savefig('../../images/ch04/sma_plot_6.png')
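The source of `SMAVectorBacktester` is not shown in this notebook. As a rough sketch of what a brute-force search over the two SMA windows could look like, the following uses a hypothetical helper `sma_performance` and synthetic random-walk prices. It is not the module's implementation, and the grid is coarser than the `(30, 50, 2)` and `(200, 300, 2)` ranges above to keep the run short.

```python
from itertools import product

import numpy as np
import pandas as pd


def sma_performance(price, sma1, sma2):
    """Gross performance of a long/short SMA crossover (hypothetical helper)."""
    df = pd.DataFrame({'price': price})
    df['returns'] = np.log(df['price'] / df['price'].shift(1))
    df['SMA1'] = df['price'].rolling(sma1).mean()
    df['SMA2'] = df['price'].rolling(sma2).mean()
    df = df.dropna()
    position = pd.Series(np.where(df['SMA1'] > df['SMA2'], 1, -1),
                         index=df.index)
    # Yesterday's position earns today's return (avoids foresight bias).
    strategy = position.shift(1) * df['returns']
    return float(np.exp(strategy.sum()))


# Synthetic prices (assumption): 1,000 steps of a driftless random walk.
rng = np.random.default_rng(0)
price = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000))))

# Evaluate every window combination and keep the best one.
results = {(s1, s2): sma_performance(price, s1, s2)
           for s1, s2 in product(range(30, 50, 4), range(200, 300, 20))}
best = max(results, key=results.get)
print(best, results[best])
```

A search like this picks the best parameters in-sample, so the reported performance is likely to overstate what the same parameters would achieve on unseen data.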

## Strategies based on Momentum

### Getting into the Basics

In [None]:
%matplotlib inline
import numpy as np

In [None]:
import pandas as pd

In [None]:
from pandas_datareader import data as web

In [None]:
data = web.DataReader('AAPL', data_source='google', end='2016-10-31')['Close']

In [None]:
data = pd.DataFrame(data)

In [None]:
data.rename(columns={'Close': 'price'}, inplace=True)

In [None]:
data['returns'] = np.log(data['price'] / data['price'].shift(1))

In [None]:
data['position'] = np.sign(data['returns'])

In [None]:
data['strategy'] = data['position'].shift(1) * data['returns']

In [None]:
data[['returns', 'strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 6))
# plt.savefig('../../images/ch04/mom_plot_1.png')

In [None]:
data['position'] = np.sign(data['returns'].rolling(2).mean()) 

In [None]:
data['strategy'] = data['position'].shift(1) * data['returns']

In [None]:
data[['returns', 'strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 6))
# plt.savefig('../../images/ch04/mom_plot_2.png')

In [None]:
data['position'] = np.sign(data['returns'].rolling(6).mean()) 

In [None]:
data['strategy'] = data['position'].shift(1) * data['returns']

In [None]:
data[['returns', 'strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 6))
# plt.savefig('../../images/ch04/mom_plot_3.png')

In [None]:
data['position'] = 0 - np.sign(data['returns'].rolling(6).mean()) 

In [None]:
data['strategy'] = data['position'].shift(1) * data['returns']

In [None]:
data[['returns', 'strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 6))

In [None]:
h5 = pd.HDFStore('../data/AAPL_1min_11112016.h5')

In [None]:
data = h5['AAPL']

In [None]:
h5.close()

In [None]:
data['returns'] = np.log(data['close'] / data['close'].shift(1))

In [None]:
to_plot = ['returns']

In [None]:
for m in [1, 3, 5]:
    data['position_%d' % m] = np.sign(data['returns'].rolling(m).mean())
    data['strategy_%d' % m] = data['position_%d' % m].shift(1) * data['returns']
    to_plot.append('strategy_%d' % m)

In [None]:
data[to_plot].dropna().cumsum().apply(np.exp).plot(
    title='AAPL intraday 11. November 2016',
    figsize=(10, 6), style=['-', '--', '--', '--'])
# plt.savefig('../../images/ch04/mom_plot_4.png')

In [None]:
h5 = pd.HDFStore('../data/SP500_1min_11112016.h5')

In [None]:
data = h5['GSPC']

In [None]:
h5.close()

In [None]:
data['returns'] = np.log(data['close'] / data['close'].shift(1))

In [None]:
to_plot = ['returns']

In [None]:
for m in [1, 3, 5]:
    data['position_%d' % m] = np.sign(data['returns'].rolling(m).mean())
    data['strategy_%d' % m] = data['position_%d' % m].shift(1) * data['returns']
    to_plot.append('strategy_%d' % m)

In [None]:
data[to_plot].dropna().cumsum().apply(np.exp).plot(
    title='S&P 500 (^GSPC) intraday 11. November 2016',
    figsize=(10, 6), style=['-', '--', '--', '--'])
# plt.savefig('../../images/ch04/mom_plot_5.png')

### Generalizing the Approach

In [None]:
import MomVectorBacktester as Mom

In [None]:
from importlib import reload  # reload is not a builtin in Python 3
reload(Mom)

In [None]:
mombt = Mom.MomVectorBacktester('AAPL', '2010-1-1', '2016-10-31', 10000, 0.0)

In [None]:
mombt.run_strategy(momentum=2)

In [None]:
mombt.plot_results()
# plt.savefig('../../images/ch04/mom_plot_6.png')

In [None]:
mombt = Mom.MomVectorBacktester('AAPL', '2010-1-1', '2016-10-31', 10000, 0.001)

In [None]:
mombt.run_strategy(momentum=2)

In [None]:
mombt.plot_results()
# plt.savefig('../../images/ch04/mom_plot_7.png')
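The two runs above differ only in the last constructor argument (`0.0` versus `0.001`), which appears to be a proportional transaction cost. The module's code is not shown here, so the following is only a sketch of one common way to account for such costs: subtract a fixed amount from the strategy's log return on every bar where the position changes. The prices and the names `tc`, `trades`, and `strategy_net` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

price = pd.Series([100.0, 101.0, 102.0, 100.0, 99.0, 101.0, 103.0])
tc = 0.001  # assumed proportional cost per trade

returns = np.log(price / price.shift(1))
position = np.sign(returns.rolling(2).mean())   # momentum over 2 bars
strategy = position.shift(1) * returns          # gross strategy returns

# A trade takes place wherever the position differs from the previous bar.
trades = position.diff().fillna(0) != 0

# Deduct the cost on trade bars only.
strategy_net = strategy.copy()
strategy_net.loc[trades] -= tc

print(int(trades.sum()), strategy.sum(), strategy_net.sum())
```

In this sketch the initial entry into the market is not counted as a trade, because `diff()` is undefined for the first position.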

## Strategies based on Mean-Reversion

### Getting into the Basics

In [None]:
import numpy as np

In [None]:
import pandas as pd

In [None]:
from pandas_datareader import data as web

In [None]:
data = web.DataReader('GLD', data_source='google', end='2016-10-31')['Close']

In [None]:
data = pd.DataFrame(data)

In [None]:
data.rename(columns={'Close': 'price'}, inplace=True)

In [None]:
data['returns'] = np.log(data['price'] / data['price'].shift(1))

In [None]:
SMA = 50

In [None]:
data['SMA'] = data['price'].rolling(SMA).mean()

In [None]:
threshold = 10

In [None]:
data['distance'] = data['price'] - data['SMA']

In [None]:
import matplotlib.pyplot as plt

In [None]:
data['distance'].dropna().plot(figsize=(10, 6), legend=True)
plt.axhline(threshold, color='r')
plt.axhline(-threshold, color='r')
plt.axhline(0, color='r')
# plt.savefig('../../images/ch04/mr_plot_1.png')

In [None]:
data['position'] = np.where(data['distance'] > threshold, -1, np.nan)

In [None]:
data['position'] = np.where(data['distance'] < -threshold, 1, data['position'])

In [None]:
data['position'] = np.where(data['distance'] *
            data['distance'].shift(1) < 0, 0, data['position'])

In [None]:
data['position'] = data['position'].ffill().fillna(0)

In [None]:
data['position'].iloc[SMA:].plot(ylim=[-1.1, 1.1], figsize=(10, 6))  # .ix was removed from pandas; .iloc slices by position
# plt.savefig('../../images/ch04/mr_plot_2.png')

In [None]:
data['strategy'] = data['position'].shift(1) * data['returns']

In [None]:
data[['returns', 'strategy']].dropna().cumsum().apply(np.exp).plot(figsize=(10, 6))
# plt.savefig('../../images/ch04/mr_plot_3.png')
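The position logic above is built up in four steps, which is easier to follow on a handful of values. The sketch below applies the same steps to an invented `distance` series (the numbers are assumptions chosen to trigger each rule once):

```python
import numpy as np
import pandas as pd

threshold = 10
distance = pd.Series([2.0, 12.0, 11.0, 4.0, -3.0, -12.0, -6.0, 1.0])

# Step 1: go short when the price is far above the SMA.
position = np.where(distance > threshold, -1, np.nan)
# Step 2: go long when the price is far below the SMA.
position = np.where(distance < -threshold, 1, position)
# Step 3: go flat when the distance changes sign (price crosses the SMA).
position = np.where(distance * distance.shift(1) < 0, 0, position)
# Step 4: hold the last signal until a new one arrives; start out flat.
position = pd.Series(position, index=distance.index).ffill().fillna(0)

print(position.tolist())
```

The forward fill is what keeps a position open between the entry signal and the crossing of the SMA.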

### Generalizing the Approach 

In [None]:
import MRVectorBacktester as MR

In [None]:
mrbt = MR.MRVectorBacktester('GDX', '2010-1-1', '2016-10-31', 10000, 0.0025)

In [None]:
mrbt.run_strategy(SMA=50, threshold=5)

In [None]:
mrbt.plot_results()
# plt.savefig('../../images/ch04/mr_plot_4.png')


<a href="http://tpq.io" target="_blank">http://tpq.io</a> | <a href="http://twitter.com/dyjh" target="_blank">@dyjh</a> | <a href="mailto:training@tpq.io">training@tpq.io</a>

**Python Quant Platform** |
<a href="http://quant-platform.com">http://quant-platform.com</a>

**Python for Finance** |
<a href="http://python-for-finance.com" target="_blank">Python for Finance @ O'Reilly</a>

**Derivatives Analytics with Python** |
<a href="http://derivatives-analytics-with-python.com" target="_blank">Derivatives Analytics @ Wiley Finance</a>

**Listed Volatility and Variance Derivatives** |
<a href="http://lvvd.tpq.io" target="_blank">Listed VV Derivatives @ Wiley Finance</a>

**Python Training** |
<a href="http://training.tpq.io" target="_blank">Python for Finance University Certificate</a>