# 3.X Entry Signals - Checks

Check the new data against the old data. The new summaries appear to show significantly higher returns than the old summaries.

The differences are likely due to the new data covering more of the strongly performing years 2014 and 2015, and to changes in Yahoo's historical adjusted prices (see CBA). The new adjusted prices from Yahoo appear to take franking credits on dividends into account.

In [1]:
import pandas as pd
import numpy as np
import datetime as dt
from itertools import product
lib_new = '/shares/models/phase_2/data/3_entry'
lib_old = '/shares/models/phase_1/analysis/entry/DonchianHighPullback'

In [2]:
#Read the old and new datasets
new_data = pd.read_pickle(lib_new+'/DonchianHigh_300_3_30.p')
old_data = pd.read_pickle(lib_old+'/Inv_Donchian_pullback_300_55_30_3.p')

In [3]:
#Common processing
excludeList = ['Energy',
               'Materials',
               'Pharmaceuticals & Biotechnology',
               'Semiconductors & Semiconductor Equipment',
               'Utilities']

new_data = new_data[new_data['z_vol_avg30'] > 100000]
new_data = new_data[new_data['GICS'].map(lambda x: str(x) not in excludeList)]

#Loop across intervals
temp = new_data.copy()
temp.sort(['symbol','entry_date'],inplace=True)
temp.index = range(temp.shape[0])
#Note: temp2 is rebuilt on every pass, so only the flag for the last interval
#(p_include_250) and the rows with a non-null p_MAE_250 are carried into new_data
for j in [200,250]:
    #Create temp2 dataframe with non-overlapping trades (rows with a missing p_MAE_j are dropped)
    temp2 = temp[temp['p_MAE_'+str(j)].map(lambda x: pd.isnull(x)==False)]
    temp2['p_include_'+str(j)] = False
    for i in temp2.index:
        if i==temp2.index[0]:
            #First row: always include it and start the j-calendar-day window
            temp2.ix[i,'p_include_'+str(j)] = True
            ret_symbol = temp.ix[i,'symbol']
            ret_date = temp.ix[i,'entry_date'] + dt.timedelta(days=j)
            continue
        if temp2.ix[i,'symbol'] != ret_symbol:
            #New symbol: include its first trade and restart the window
            temp2.ix[i,'p_include_'+str(j)] = True
            ret_symbol = temp.ix[i,'symbol']
            ret_date = temp.ix[i,'entry_date'] + dt.timedelta(days=j)
            continue
        if ret_date > temp.ix[i,'entry_date']:
            continue #Same symbol and overlapping trade intervals
        #Same symbol, window has elapsed: include the trade and restart the window
        temp2.ix[i,'p_include_'+str(j)] = True
        ret_date = temp.ix[i,'entry_date'] + dt.timedelta(days=j)
new_data = temp2.copy()

old_data = old_data[old_data['GICS'].map(lambda x: str(x) not in excludeList)]

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s
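The warning above is raised because `temp2` is a slice of `temp` that is then assigned to. A minimal sketch of the same greedy non-overlap rule, written so that every assignment lands on an explicit new frame, might look like the following. It assumes Python 3 and a recent pandas (the notebook itself runs on Python 2 and an older pandas API); the function name `flag_non_overlapping` and the output column `include` are hypothetical and not part of the original analysis.

```python
import pandas as pd

def flag_non_overlapping(trades, horizon_days):
    """Flag trades whose entry is on or after the previous kept trade's
    entry_date + horizon_days (calendar days), separately for each symbol.

    Hypothetical helper: expects 'symbol' and 'entry_date' (datetime) columns.
    """
    # sort_values/reset_index return a new frame, so the assignment below
    # does not write into a slice of the caller's data
    out = trades.sort_values(['symbol', 'entry_date']).reset_index(drop=True)
    include = []
    last_symbol, release_date = None, None
    for symbol, entry_date in zip(out['symbol'], out['entry_date']):
        # Keep the first trade of each symbol, and any later trade of the
        # same symbol that starts once the previous window has elapsed
        keep = symbol != last_symbol or entry_date >= release_date
        if keep:
            last_symbol = symbol
            release_date = entry_date + pd.Timedelta(days=horizon_days)
        include.append(keep)
    out['include'] = include
    return out

# Toy data, not taken from the real datasets
toy = pd.DataFrame({
    'symbol': ['A', 'A', 'A', 'B'],
    'entry_date': pd.to_datetime(['2010-01-01', '2010-03-01',
                                  '2010-09-01', '2010-01-15']),
})
flagged = flag_non_overlapping(toy, 200)
print(flagged)
```

Calling the helper once per interval and storing each result under its own column name would also keep the flags for both 200 and 250 days, rather than only the last one.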


In [5]:
#Print number of observations
print 'new_data shape: ' + str(new_data.shape)
print 'old_data shape: ' + str(old_data.shape)

new_data shape: (11896, 42)
old_data shape: (11314, 34)


In [10]:
#Check number of common triggers
common_triggers = pd.merge(old_data,new_data,on=['entry_date','symbol'])
print common_triggers.shape

(7597, 74)
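The merge above is an inner join, so it only reports the triggers that both datasets share. As a rough sketch, an outer merge with `indicator=True` would also show how many triggers appear in only one of the datasets. This assumes Python 3 and a recent pandas; the helper name `trigger_overlap` is hypothetical, and duplicate keys are dropped first, which is an assumption about how repeated triggers should be counted.

```python
import pandas as pd

def trigger_overlap(old, new, keys=('entry_date', 'symbol')):
    """Count triggers found only in old, only in new, or in both frames."""
    keys = list(keys)
    merged = pd.merge(old[keys].drop_duplicates(),
                      new[keys].drop_duplicates(),
                      on=keys, how='outer', indicator=True)
    # indicator=True adds a '_merge' column with the values
    # 'left_only', 'right_only' and 'both'
    return {
        'old_only': int((merged['_merge'] == 'left_only').sum()),
        'common': int((merged['_merge'] == 'both').sum()),
        'new_only': int((merged['_merge'] == 'right_only').sum()),
    }

# Toy frames, not taken from the real datasets
old_toy = pd.DataFrame({
    'entry_date': pd.to_datetime(['2010-01-04', '2010-02-01', '2010-03-01']),
    'symbol': ['CBA', 'CBA', 'BHP'],
})
new_toy = pd.DataFrame({
    'entry_date': pd.to_datetime(['2010-02-01', '2010-03-01',
                                  '2010-04-01', '2010-05-03']),
    'symbol': ['CBA', 'BHP', 'ANZ', 'CBA'],
})
overlap = trigger_overlap(old_toy, new_toy)
print(overlap)
```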


Note that the old DER figures below do not include transaction costs.

In [8]:
#Check DER for common triggers (old non-overlap filter: include_233)
common_triggers2 = common_triggers[common_triggers['include_233']=='Y']
print 'old DER 233: ' + str(np.mean(common_triggers2['DER_233']))
print 'new DER 200: ' + str(np.mean(common_triggers2['p_DER_200']))
print 'new DER 250: ' + str(np.mean(common_triggers2['p_DER_250']))

old DER 233: 0.110440754081
new DER 200: 0.122308419246
new DER 250: 0.0969519002412


In [16]:
#Check DER for common triggers (new non-overlap filter: p_include_250)
common_triggers2 = common_triggers[common_triggers['p_include_250']]
print 'old DER 233: ' + str(np.mean(common_triggers2['DER_233']))
print 'new DER 200: ' + str(np.mean(common_triggers2['p_DER_200']))
print 'new DER 250: ' + str(np.mean(common_triggers2['p_DER_250']))

old DER 233: 0.122006486343
new DER 200: 0.129159550401
new DER 250: 0.103118256832
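The two cells above repeat the same three prints under different filters. One possible way to keep such comparisons side by side is a small helper that takes the filter as a boolean mask. This is only a sketch, assuming Python 3 and a recent pandas; `mean_by_filter` is a hypothetical name and the numbers in the toy frame are made up.

```python
import pandas as pd

def mean_by_filter(frame, mask, columns):
    """Return the mean of each column over the rows selected by mask."""
    return frame.loc[mask, columns].mean().to_dict()

# Toy frame with made-up values, using the column names from the notebook
toy = pd.DataFrame({
    'include_233': ['Y', 'N', 'Y'],
    'p_include_250': [True, True, False],
    'DER_233': [0.10, 0.50, 0.30],
    'p_DER_250': [0.20, 0.40, 0.90],
})
cols = ['DER_233', 'p_DER_250']
by_old_filter = mean_by_filter(toy, toy['include_233'] == 'Y', cols)
by_new_filter = mean_by_filter(toy, toy['p_include_250'], cols)
print(by_old_filter)
print(by_new_filter)
```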


In [17]:
#Check DER overall
print 'old DER 233: ' + str(np.mean(old_data[old_data['include_233']=='Y']['DER_233']))
print 'new DER 250: ' + str(np.mean(new_data[new_data['p_include_250']]['p_DER_250']))

old DER 233: 0.0971553446859
new DER 250: 0.116559186543


In [19]:
#Check CBA trades
test = new_data[(new_data['p_include_250']) & (new_data['symbol']=='CBA')]
print np.mean(test['p_DER_250'])

0.148293442969


In [22]:
test[['date','adjClose','close','entry_date','entry_price','p_DER_250']]

| | date | adjClose | close | entry_date | entry_price | p_DER_250 |
|---|---|---|---|---|---|---|
| 2742 | 2000-05-22 | 8.61799 | 27.95 | 2000-06-09 | 8.23257 | 0.191382 |
| 2750 | 2001-03-05 | 10.43962 | 31.65 | 2001-03-08 | 10.17574 | 0.027219 |
| 2757 | 2002-02-04 | 11.41282 | 33.381 | 2002-02-06 | 10.92699 | -0.064612 |
| 2765 | 2004-01-14 | 12.27247 | 30.94 | 2004-01-28 | 12.01861 | 0.13496 |
| 2770 | 2004-11-18 | 13.90222 | 32.19 | 2004-11-26 | 13.68196 | 0.328207 |
| 2780 | 2005-08-08 | 17.59106 | 39.38 | 2005-08-10 | 17.26497 | 0.334483 |
| 2795 | 2006-04-19 | 22.36022 | 46.5 | 2006-05-04 | 22.02361 | 0.161318 |
| 2807 | 2007-02-19 | 26.29671 | 50.9 | 2007-02-28 | 25.81107 | 0.318254 |
| 2816 | 2007-11-01 | 33.19645 | 61.65 | 2007-11-05 | 32.35106 | -0.552097 |
| 2820 | 2009-08-13 | 29.42505 | 47.53 | 2009-08-21 | 27.93721 | 0.4498 |
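To see how strongly the adjusted prices differ from the raw closes, the ratio `adjClose / close` can be computed for the CBA rows listed above. The sketch below uses plain Python 3 with the values copied from that output; the variable names are illustrative only.

```python
# (date, adjClose, close) copied from the CBA rows shown above
rows = [
    ("2000-05-22", 8.61799, 27.95),
    ("2001-03-05", 10.43962, 31.65),
    ("2002-02-04", 11.41282, 33.381),
    ("2004-01-14", 12.27247, 30.94),
    ("2004-11-18", 13.90222, 32.19),
    ("2005-08-08", 17.59106, 39.38),
    ("2006-04-19", 22.36022, 46.5),
    ("2007-02-19", 26.29671, 50.9),
    ("2007-11-01", 33.19645, 61.65),
    ("2009-08-13", 29.42505, 47.53),
]

# Adjustment factor implied by each row: adjusted close over raw close
factors = [(date, adj / close) for date, adj, close in rows]
for date, factor in factors:
    print(date, round(factor, 4))
```

The factor rises steadily from roughly 0.31 in 2000 to roughly 0.62 in 2009, which is the pattern expected when historical prices are back-adjusted for dividends. On its own this does not show whether franking credits are included; that would need a comparison against the previously downloaded adjusted prices.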
