# ep4-1 prioritization of buy signals

In this episode, we will compute the average gain % produced by all historical buy signals for every stock in the S&P 500. To do so, we first load the results of the optimization performed in ep4.

In [1]:
import pandas as pd
df_SP500 = pd.read_csv('opti.csv')
df_SP500.set_index(['Symbol'],inplace=True)
df_SP500.head(10)
df_SP500[['Benchmark Gain %','Max. Strategy Gain %']].apply(['mean','std'],axis=0)

,Benchmark Gain %,Max. Strategy Gain %
mean,11.525896,29.708911
std,35.790061,24.767419
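
The mean and standard deviation alone do not show whether the gap between the two columns is statistically significant. A minimal sketch of one way to check, assuming a paired comparison per stock is appropriate, is a paired t-statistic on the per-row difference. The helper name `paired_t_stat` and the toy numbers are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd

def paired_t_stat(a: pd.Series, b: pd.Series) -> float:
    """t-statistic of the per-row difference b - a (rows with NaN are dropped)."""
    diff = (b - a).dropna()
    standard_error = diff.std(ddof=1) / np.sqrt(len(diff))
    return diff.mean() / standard_error

# Toy numbers for illustration only (NOT taken from opti.csv)
toy = pd.DataFrame({
    'Benchmark Gain %':     [2.0, -10.0, 5.0, 20.0],
    'Max. Strategy Gain %': [15.0, 4.0, 12.0, 31.0],
})
t_toy = paired_t_stat(toy['Benchmark Gain %'], toy['Max. Strategy Gain %'])
print(round(t_toy, 2))

# On the real table the call would be:
# paired_t_stat(df_SP500['Benchmark Gain %'], df_SP500['Max. Strategy Gain %'])
```

A large positive value indicates that the strategy column exceeds the benchmark column by much more than the row-to-row noise. If SciPy is available, `scipy.stats.ttest_rel` returns the same kind of statistic together with a p-value.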


As the table above shows, the mean max. strategy gain % (29.7) is well above the mean benchmark gain % (11.5), although both columns have large standard deviations. This is not unexpected, as the max. strategy gain % comes from optimized trading strategies. The following code block inspects the information contained in one row of the `df_SP500` dataframe.

In [9]:
df_SP500.loc['MMM']  # inspect the properties of one row

Security                              3M Company
GICSSector                           Industrials
GICS Sub-Industry       Industrial Conglomerates
Benchmark Gain %                               2
d                                              5
r                                              3
Max. Strategy Gain %                          15
Name: MMM, dtype: object

As the statistics above suggest, trading with stock-specific strategy parameters produces a higher average gain than the benchmark. The trading algorithm may, however, generate multiple buy signals on a given day, and the trader must then decide which one to follow. We therefore need a rule for prioritizing buy signals. Intuitively, the buy signal expected to realize the most profit should come first. To that end, we compute, for each stock, the average profit realized by its historical buy signals.

In [2]:
from my_stock import read_stock,buy_sig,sell_sig,proc_stock
import numpy as np
import matplotlib.pyplot as plt  # needed by the show_plot branch below

# we modify the paper_trade function developed previously to have it return the statistics of the profits
# generated by following buy signals

def paper_trade_buy_stat(df,cash,sg,show_steps,show_plot):        
    try:
        n_stock = 0
        pos = 0
        net_val = []
        benchmark_index = df[df['Buy']==True].iloc[0].name
        ls_gain = []

        for i in range(0,len(df)):
            if (df.iloc[i]['Buy']==True) and cash > 0:
                n_stock = cash / df.iloc[i]['Close']
                pos = df.iloc[i]['Close']
                cash = 0
                if show_steps:
                    print('executed buy at','{:.2f}'.format(df.iloc[i]['Close']),'on',df.iloc[i]['Date'])
            if ((df.iloc[i]['Sell']==True) and n_stock > 0) or (df.iloc[i]['Close']<=(1-sg/100)*pos):
                ls_gain.append((df.iloc[i]['Close']-pos)/pos*100)
                cash = n_stock * df.iloc[i]['Close']
                n_stock = 0
                pos = 0                
                if show_steps:
                    print('executed sell at','{:.2f}'.format(df.iloc[i]['Close']),'on',df.iloc[i]['Date'])
            net_val.append(cash + n_stock * df.iloc[i]['Close'])

        df['Net Value']=net_val

        # measure the benchmark from the first buy signal, the same reference point as the strategy gain
        df['Benchmark Gain %'] = (df['Close'] - df.iloc[benchmark_index]['Close']) / df.iloc[benchmark_index]['Close']*100
        df['Strategy Gain %'] = (df['Net Value'] - df.iloc[benchmark_index]['Net Value']) / df.iloc[benchmark_index]['Net Value'] * 100

        benchmark_gain = int(df.iloc[-1]['Benchmark Gain %'])
        strategy_gain = int(df.iloc[-1]['Strategy Gain %'])

        if show_plot:            
            plt.plot(df.iloc[benchmark_index:]['Benchmark Gain %'],label='benchmark, '+str(benchmark_gain)+'%')
            plt.plot(df.iloc[benchmark_index:]['Strategy Gain %'],label='strategy, '+str(strategy_gain)+'%')
            plt.legend()
            plt.xlabel('trading day')
            plt.ylabel('% gain')        
            plt.show()
        if len(np.array(ls_gain))!=0:
            return np.array(ls_gain).mean()
        else:
            return np.nan
        # return [benchmark_gain,strategy_gain]
        
    except Exception:  # e.g. no buy signal at all; avoid swallowing KeyboardInterrupt
        return np.nan
        # return [np.nan,np.nan]

def trade_single(row):        
    df = read_stock(row.name)
    df = proc_stock(df,20,row['d'],row['r'])
    return paper_trade_buy_stat(df,1000000,5,False,False)
          
import time
sim_start = time.time()
df_avg_gain_per_signal = df_SP500.apply(trade_single,axis=1)
sim_end = time.time()
print('simulation took',sim_end-sim_start,'seconds')

simulation took 653.4723653793335 seconds
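
The bookkeeping behind the per-signal average can be seen in isolation on a toy price series. The sketch below is a simplified stand-in for `paper_trade_buy_stat`, not a replacement. It tracks only the entry price and assumes a position is never opened and closed on the same day. The helper name `per_signal_gains` and the toy prices are illustrative assumptions.

```python
import pandas as pd

def per_signal_gains(df: pd.DataFrame, sg: float) -> list:
    """Return the % gain of each completed round trip (buy -> sell or stop-loss)."""
    gains = []
    pos = 0.0  # entry price; 0 means no open position
    for close, buy, sell in zip(df['Close'], df['Buy'], df['Sell']):
        if buy and pos == 0:
            pos = close  # follow the buy signal at today's close
        elif pos > 0 and (sell or close <= (1 - sg / 100) * pos):
            # exit on a sell signal, or when the close falls sg % below the entry price
            gains.append((close - pos) / pos * 100)
            pos = 0.0
    return gains

# Toy data for illustration only
toy = pd.DataFrame({
    'Close': [100, 110, 105, 100, 94, 98],
    'Buy':   [True, False, True, False, False, False],
    'Sell':  [False, True, False, False, False, False],
})
gains = per_signal_gains(toy, sg=5)
print(gains)
```

The first round trip exits on the sell signal at 110. The second is closed by the 5 % stop-loss once the close drops to 94, which is below 95 % of the 105 entry price. Averaging the returned list gives the quantity stored in `Signal Avg. Gain %`.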


In [3]:
df_SP500['Signal Avg. Gain %'] = df_avg_gain_per_signal
df_SP500.to_csv('strategy_table.csv')

In [4]:
df_SP500 = pd.read_csv('strategy_table.csv')
df_SP500.set_index(['Symbol'],inplace=True)
df_SP500.sort_values(by=['Signal Avg. Gain %'],ascending=False)

Symbol,Security,GICSSector,GICS Sub-Industry,Benchmark Gain %,d,r,Max. Strategy Gain %,Signal Avg. Gain %
FANG,Diamondback Energy,Energy,Oil & Gas Exploration & Production,-60.0,9,9,52,23.654608
FTI,TechnipFMC,Energy,Oil & Gas Equipment & Services,-66.0,8,8,33,15.786360
PHM,PulteGroup,Consumer Discretionary,Homebuilding,23.0,10,9,30,14.325438
COP,ConocoPhillips,Energy,Oil & Gas Exploration & Production,-35.0,8,10,37,8.308708
UNM,Unum Group,Financials,Life & Health Insurance,-27.0,6,10,31,7.341985
...,...,...,...,...,...,...,...,...
FE,FirstEnergy Corp,Utilities,Electric Utilities,-41.0,7,2,-3,-0.477674
WRB,W. R. Berkley Corporation,Financials,Property & Casualty Insurance,-9.0,8,9,-4,-0.958520
DLR,Digital Realty Trust Inc,Real Estate,Specialized REITs,11.0,9,2,-1,-1.465738
MCK,McKesson Corp.,Health Care,Health Care Distributors,18.0,10,9,-7,-2.409180


Now let's develop a function that determines whether the most recent price point of a particular stock generates a buy signal.

In [5]:
def check_buy(row):
    df = read_stock(row.name)
    df = proc_stock(df,20,row['d'],row['r'])
    if len(df) == 0:
        return False
    else:
        return df.iloc[-1]['Buy']

check_buy_start = time.time()

df_check_buy = df_SP500.apply(check_buy,axis=1)

check_buy_end = time.time()
print('checking buy signal took:',check_buy_end - check_buy_start,'seconds')

checking buy signal took: 357.7061982154846 seconds
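
Both `trade_single` and `check_buy` rely on the same pandas pattern: with `axis=1`, `apply` passes each row as a Series whose `.name` is the index label, here the ticker symbol. A minimal sketch of that pattern follows, using the `d` and `r` values shown for `MMM` and `FANG` in the outputs above. The function `describe` is a hypothetical stand-in for the real per-stock work.

```python
import pandas as pd

# Small parameter table indexed by ticker symbol, like df_SP500
params = pd.DataFrame(
    {'d': [5, 9], 'r': [3, 9]},
    index=pd.Index(['MMM', 'FANG'], name='Symbol'),
)

def describe(row):
    # row.name is the index label (the symbol); row['d'] and row['r'] are that stock's parameters
    return f"{row.name}: d={row['d']}, r={row['r']}"

labels = params.apply(describe, axis=1)
print(labels.tolist())
```

Because each call reads and processes one stock independently, the total run time grows roughly linearly with the number of symbols in the table.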


In [6]:
df_SP500[df_check_buy].sort_values(by=['Signal Avg. Gain %'],ascending=False).iloc[0].name

'LB'

This analysis suggests that today (when this program was last executed), among the stocks showing a buy signal, `LB` has the highest historical average gain per signal, making it the first candidate for purchasing shares.
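
The one-line selection in `In [6]` raises an `IndexError` on a day when no stock shows a buy signal, because `.iloc[0]` is applied to an empty frame. A slightly more defensive sketch is given below. The helper `top_candidates`, its `k` and `min_gain` parameters, and the buy flags are illustrative assumptions. The three gain values are copied from the sorted table above.

```python
import pandas as pd

table = pd.DataFrame(
    {'Signal Avg. Gain %': [23.654608, 14.325438, -2.409180]},
    index=pd.Index(['FANG', 'PHM', 'MCK'], name='Symbol'),
)
# Hypothetical buy flags for illustration only
has_buy = pd.Series([False, True, True], index=table.index)

def top_candidates(table, has_buy, k=1, min_gain=0.0):
    """Return up to k symbols with a buy signal, ranked by historical average gain."""
    ranked = table[has_buy].sort_values(by='Signal Avg. Gain %', ascending=False)
    # drop stocks whose past signals lost money on average (an added assumption)
    ranked = ranked[ranked['Signal Avg. Gain %'] > min_gain]
    return ranked.index[:k].tolist()

picks = top_candidates(table, has_buy, k=2)
print(picks)
```

An empty list then simply means there is nothing to buy today, rather than an exception.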