# Build a Long/Short Pairs Portfolio to maximize the PnL

1.   Use **stocksInfo** and **researchData** to **identify Pairs** and calculate **trading parameters**.
2.   Use **testData** to **backtest** the Pairs portfolio with **signals** and **dollarValue**.
3.   Calculate the **PnL** of the backtested Pairs portfolio.



# Rules
 

*   **No lookahead bias**: The testData must not be used for Pairs identification or for calculating the trading parameters.
*   **No overfitting**: The Pairs cannot be hand-picked; they must be selected by rules. The same applies to the trading parameters and dollarValue: apart from obvious round numbers (e.g. 0.05, 0.1, 0.5, 1, 100, 1000), they must also be derived from rules.
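
One way to guard against lookahead bias is to check that the research window ends strictly before the test window starts, and to derive every parameter from the research frame alone. The following is a minimal sketch, assuming both frames are indexed by date; the helper name `assert_no_lookahead` and the toy prices are illustrative and not part of the notebook.

```python
import pandas as pd

def assert_no_lookahead(research: pd.DataFrame, test: pd.DataFrame) -> None:
    """Raise if the research window overlaps or follows the test window."""
    research_end = pd.to_datetime(research.index).max()
    test_start = pd.to_datetime(test.index).min()
    if research_end >= test_start:
        raise ValueError(f"research data ends {research_end} but test data starts {test_start}")

# Toy data standing in for researchData / testData (hypothetical values)
research = pd.DataFrame({"A": [10.0, 10.5, 11.0], "B": [5.0, 5.1, 5.4]},
                        index=["2020-01-02", "2020-01-03", "2020-01-06"])
test = pd.DataFrame({"A": [11.2, 11.1], "B": [5.5, 5.6]},
                    index=["2021-01-04", "2021-01-05"])

assert_no_lookahead(research, test)

# Parameters are computed from the research frame only
avg_ratio = (research["A"] / research["B"]).mean()
print(avg_ratio)
```

Running the check once, right after loading the data, makes an accidental swap of the two files fail loudly instead of silently inflating the backtest.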

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%load_ext google.colab.data_table 
%matplotlib inline

# Download and import pairslib for calculating PnL
!wget https://github.com/kenwkliu/ideas/raw/master/colab/pairslib.py
import pairslib

# Load the stockInfo, researchData and testData
stocksInfo = pd.read_excel('https://raw.githubusercontent.com/kenwkliu/ideas/master/colab/data/hkStocksQuotes.xlsx')
researchData = pd.read_csv('https://raw.githubusercontent.com/kenwkliu/ideas/master/colab/data/researchHKStocksAdjClosePx.csv', index_col=0)
testData = pd.read_csv('https://raw.githubusercontent.com/kenwkliu/ideas/master/colab/data/testHKStocksAdjClosePx.csv', index_col=0)





# **0. Strategies**
Problems with the original strategy:

1.   Losing big money when both stocks are in an uptrend, because we are shorting stocks in a rising market.
2.   Stocks that are correlated now might not stay correlated in the future. Only trade stocks in the same industry.

Improvements:

1.   Check whether two stocks are related

> Simple variance (filtered by industry)

> Co-integration

2.   Check whether the two stocks are both in an uptrend or a downtrend, and only trade when they are in a trading range

> Using exponential moving average

> Using simple moving average

> Drawing trend lines or Bollinger channels

3.   Dollar Value

> Bet more when winning money

4.   Also work with negatively correlated pairs (optional)

> When both move in the same direction, short/long the lower-volatility stock
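
The second improvement (trade only when neither stock is trending) could be sketched with two exponential moving averages: when the fast and slow EMAs sit close together, the price is treated as range-bound. This is only an illustration; the function name `is_range_bound`, the spans 5 and 20, and the 2% tolerance are assumptions, not values taken from the notebook.

```python
import numpy as np
import pandas as pd

def is_range_bound(prices: pd.Series, fast: int = 5, slow: int = 20, tol: float = 0.02) -> pd.Series:
    """True where the fast and slow EMAs are within `tol` of each other (no clear trend)."""
    ema_fast = prices.ewm(span=fast, adjust=False).mean()
    ema_slow = prices.ewm(span=slow, adjust=False).mean()
    gap = (ema_fast - ema_slow).abs() / ema_slow
    return gap <= tol

flat = pd.Series(100 + np.sin(np.arange(60)))        # oscillates around 100
rising = pd.Series(100 * 1.02 ** np.arange(60))      # steady 2% uptrend per step

print(is_range_bound(flat).iloc[-1])     # range-bound series
print(is_range_bound(rising).iloc[-1])   # trending series
```

A pair would then be tradable on a given day only if the filter is true for both legs, for example `is_range_bound(a) & is_range_bound(b)`.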



# 0. Useful functions


In [None]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
warnings.filterwarnings("ignore", category=RuntimeWarning)

from datetime import datetime
import numpy as np
import pandas as pd
import pandas_datareader.data as web

# Download the font to display Chinese
!wget https://github.com/kenwkliu/ideas/raw/master/colab/data/simhei.ttf
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
CNFont = FontProperties(fname='/content/simhei.ttf')

# Yahoo Finance
!pip install yfinance
import yfinance as yf

# Google colab interactive table
%load_ext google.colab.data_table 
%matplotlib inline


### Helper functions
# Display the stock info in Chinese or not
def useChinese(use=True):
  # return STOCK_INFO_FILE, RESEARCH_AJD_CLOSE_FILE, TEST_AJD_CLOSE_FILE
  if use:
    return 'hkStocksQuotesChi.xlsx', 'researchHKStocksAdjClosePxChi.csv', 'testHKStocksAdjClosePxChi.csv'

  else:
    return 'hkStocksQuotes.xlsx', 'researchHKStocksAdjClosePx.csv', 'testHKStocksAdjClosePx.csv'


STOCK_INFO_FILE, RESEARCH_AJD_CLOSE_FILE, TEST_AJD_CLOSE_FILE = useChinese(False)

CHART_SIZE_X, CHART_SIZE_Y = 12, 8
SMALL_CHART_SIZE_X, SMALL_CHART_SIZE_Y = 8, 6


# Plot stock pair chart
def plotPair(df, stockA, stockB, sizeX, sizeY):
  ax1 = df[stockA].plot(label=stockA, legend=True, figsize = (sizeX, sizeY))
  ax1.set_ylim(df[stockA].min(), df[stockA].max())

  ax2 = df[stockB].plot(secondary_y=True, label=stockB, legend=True, figsize = (sizeX, sizeY))
  ax2.set_ylim(df[stockB].min(), df[stockB].max())

  ax1.legend(prop=CNFont, loc=2)
  ax2.legend(prop=CNFont, loc=1)

  plt.show()


# Plot many Pairs at the same time
def plotManyPair(pairsDf):
  for index, row in pairsDf.iterrows():
      print('\n', index, ':', row['stockA'], 'vs', row['stockB'], '(', row['corr'], ')')
      plotPair(stocks, row['stockA'], row['stockB'], SMALL_CHART_SIZE_X, SMALL_CHART_SIZE_Y)


# Filter the correlated stock pairs with the THRESHOLD
def getCorrelatedPairs(stocksCorr, THRESHOLD=0.95):
  # filter the pairs with correlation values above the THRESHOLD
  highCorr = stocksCorr[((stocksCorr >= THRESHOLD) & (stocksCorr < 1))]
  highCorr = highCorr.unstack().sort_values(ascending=False).drop_duplicates()
  highCorr.dropna(inplace=True)
  highCorrDf = highCorr.to_frame().reset_index()
  highCorrDf.rename(columns = {'level_0':'stockA', 'level_1':'stockB', 0:'corr'}, inplace=True)

  # looks up the sectors for the stocksA and stockB
  cols = ['stockA', 'stockB', 'corr', 'sector_A', 'sector_B']
  pairsDf = highCorrDf.merge(stocksFilteredInfo[['shortName', 'sector']], how='left', left_on='stockA', right_on='shortName').merge(stocksFilteredInfo[['shortName', 'sector']], how='left', left_on='stockB', right_on='shortName', suffixes=('_A', '_B'))[cols]
  pairsDf['sameSector'] = (pairsDf['sector_A'] == pairsDf['sector_B'])
  
  return pairsDf


### back test related functions

# Use the research data to determine the trading params (Enter/Exit Points)
def researchTradingParams(researchData, stockA, stockB, threshold=0.05, dollarValue=10000):
  cols = [stockA, stockB]
  research_df = researchData[cols].copy()
  research_df.dropna(inplace = True)

  tradingParams = {}
  tradingParams['dollarValue'] = dollarValue

  # Calculate avgPxRatio for Exit (convergence)
  research_df['ratio'] = research_df[stockA] / research_df[stockB]
  avgPxRatio = research_df['ratio'].mean()
  tradingParams['avgPxRatio'] = avgPxRatio

  # Calculate shortA_longB_ratio for Entry (Divergence)
  shortA_longB_ratio = avgPxRatio * (1 + threshold)
  tradingParams['shortA_longB_ratio'] = shortA_longB_ratio

  # Calculate longA_shortB_ratio for Entry (Divergence)
  longA_shortB_ratio = avgPxRatio * (1 - threshold)
  tradingParams['longA_shortB_ratio'] = longA_shortB_ratio

  return tradingParams


# Determine the signal and dollarValue in the test data
# signal == -1: Long stockA Short stockB
# signal == 1: Short stockA Long stockB
# signal == 0: flat position
def backTest(testData, tradingParams, stockA, stockB):
  cols = [stockA, stockB]
  backTest_df = testData[cols].copy()
  backTest_df.dropna(inplace = True)

  # Get the tradingParams
  dollarValue = tradingParams['dollarValue']
  avgPxRatio = tradingParams['avgPxRatio']
  shortA_longB_ratio = tradingParams['shortA_longB_ratio']
  longA_shortB_ratio = tradingParams['longA_shortB_ratio']

  # Calculate the Price ratio in backTest_df
  backTest_df['pxRatio'] = backTest_df[stockA] / backTest_df[stockB]
  backTest_df['dollarValue'] = dollarValue
  
  # initialize the signal to 0
  backTest_df['signal'] = 0
  signal = 0

  # Determine the signal in each row of the backTest_df
  for index, row in backTest_df.iterrows():
    pxRatio = row['pxRatio']

    # mark signal = 1 if pxRatio > shortA_longB_ratio (Diverge outside the upper band)
    if pxRatio > shortA_longB_ratio:
      signal = 1

    # mark signal = -1 if pxRatio < longA_shortB_ratio (Diverge outside the lower band)
    elif pxRatio < longA_shortB_ratio:
      signal = -1

    else:
      # continue to mark signal = 1 if previous signal == 1 and pxRatio > avgPxRatio (Trade entered but not converge back yet)
      if signal == 1 and pxRatio > avgPxRatio:
        signal = 1

      # continue to mark signal = -1 if previous signal == -1 and pxRatio < avgPxRatio (Trade entered but not converge back yet)
      elif signal == -1 and pxRatio < avgPxRatio:
        signal = -1

      else:
        signal = 0

    backTest_df.loc[index, 'signal'] = signal

  return backTest_df


# determine pSignal and nSignal for up/down markers in plot
# pSignal and nSignal is for displaying the up/down markers in plotting chart only, they're not required for backtest calculation
def addSignalMarker(backTest_df):
  backTest_df['pSignal'] = np.where(backTest_df['signal'] == 1, backTest_df['pxRatio'], np.nan)
  backTest_df['nSignal'] = np.where(backTest_df['signal'] == -1, backTest_df['pxRatio'], np.nan)

  return backTest_df


# Combine the research and backtest for a Portfolio of Pairs
def researchAndBackTestPortfolio(pairsDf, researchData, testData, printOutput=True):
  pairsPortfolioBackTest = []

  for index, row in pairsDf.iterrows():
    stockA, stockB = row['stockA'], row['stockB']
    if printOutput: print(stockA, 'vs', stockB)
    tradingParams = researchTradingParams(researchData, stockA, stockB)
    pairsPortfolioBackTest.append(backTest(testData, tradingParams, stockA, stockB)[[stockA, stockB, 'signal', 'dollarValue']])

  return pairsPortfolioBackTest


# Download and import pairslib for calculating PnL
!wget https://github.com/kenwkliu/ideas/raw/master/colab/pairslib.py
import pairslib

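
The entry and exit rule used by `researchTradingParams` and `backTest` can be followed by hand on a short ratio series. The sketch below replays the same band logic on plain Python lists; the function name `signals_from_ratio`, the average ratio of 2.0, and the ratio values are made up for illustration.

```python
def signals_from_ratio(ratios, avg, threshold=0.05):
    """Replay the band rule: enter outside avg*(1 +/- threshold), exit on crossing back through avg."""
    upper, lower = avg * (1 + threshold), avg * (1 - threshold)
    signal, out = 0, []
    for r in ratios:
        if r > upper:
            signal = 1            # short A, long B
        elif r < lower:
            signal = -1           # long A, short B
        elif not ((signal == 1 and r > avg) or (signal == -1 and r < avg)):
            signal = 0            # converged (or never entered): stay flat
        out.append(signal)
    return out

# avg = 2.0, so the bands are 2.1 (upper) and 1.9 (lower)
ratios = [2.00, 2.12, 2.05, 1.99, 1.88, 1.95, 2.01]
print(signals_from_ratio(ratios, avg=2.0))   # [0, 1, 1, 0, -1, -1, 0]
```

The position opened at 2.12 is held at 2.05 because the ratio is still above the average, and is closed at 1.99 once it has crossed back; the short-ratio side behaves symmetrically.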

# 1. Variable initialization
stocksInfo: all the background information needed

researchData: date and prices

industry_name: array of industries

sector_name: array of sectors



In [None]:
industry_name = stocksInfo['industry'].unique()
len(industry_name)

23

In [None]:
sector_name = stocksInfo['sector'].unique()
len(sector_name)

123

In [None]:
researchData.shape

(249, 173)

# 2. Pairs Selection

> Cointegration between stocks



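cointegration function


> Input: DataFrame of shape (n, 2) stockA_price_stockB_price; String A_name; String B_name; plus the precomputed mean_A, mean_B, sum_var_A and sum_cov_A_B

> Output: the t-statistic of the residual regression. The lower (more negative) the value, the more significant the relationship between the two stocks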


Unused code

In [None]:
#for index, row in researchData[["BABA-SW"]].iterrows():
#    print(index, row[0])
import statsmodels.api as sm

def cointegration(stockA_price_stockB_price, A_name, B_name, mean_A, mean_B, sum_var_A, sum_cov_A_B):
  slope = sum_cov_A_B / sum_var_A
# print(slope)
  interception = -slope * mean_A + mean_B
# print(interception)
  expected_y = []
  for i in range(stockA_price_stockB_price.shape[0]):
    if (np.isnan(stockA_price_stockB_price.iloc[i, 0])):
      expected_y = np.append(expected_y, np.nan)
    else:
      expected_y = np.append(expected_y, slope * stockA_price_stockB_price.iloc[i, 0] + interception)
# print(expected_y)

  diff_expecti_real_y = []
  for i in range(stockA_price_stockB_price.shape[0]):
    if (np.isnan(stockA_price_stockB_price.iloc[i, 1])):
      diff_expecti_real_y = np.append(diff_expecti_real_y, np.nan)
    else:
      diff_expecti_real_y = np.append(diff_expecti_real_y, expected_y[i] - stockA_price_stockB_price.iloc[i, 1])
# print(diff_expecti_real_y)
  diff_of_diff = diff_expecti_real_y[1:] - diff_expecti_real_y[:-1]
# print(diff_of_diff)

  compress_diff_of_diff, compress_diff_expecti_real_y = [], []
  for i in range(len(diff_of_diff)):
    if np.isnan(diff_of_diff[i]):
      pass
    else:
      compress_diff_of_diff = np.append(compress_diff_of_diff, diff_of_diff[i])
      compress_diff_expecti_real_y = np.append(compress_diff_expecti_real_y, diff_expecti_real_y[i])
  reg = sm.OLS(compress_diff_of_diff, compress_diff_expecti_real_y)
  res = reg.fit()
  t_stat = res.params[0]/ res.bse[0]
  return t_stat

# cointegration(researchData[["BABA-SW", "TENCENT"]], "BABA-SW", "TENCENT")

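
The function above follows the two steps of an Engle-Granger-style check: fit stock B on stock A, then regress the change in the residual on its lagged value and report the t-statistic. A vectorized sketch of the same idea with NumPy is shown below. The name `eg_tstat` and the synthetic series are assumptions for illustration, and the sketch drops rows where either price is missing instead of carrying NaNs through. The resulting statistic is meant to be compared against cointegration critical values rather than the ordinary t-table.

```python
import numpy as np
import pandas as pd

def eg_tstat(prices: pd.DataFrame, a: str, b: str) -> float:
    """Dickey-Fuller-style t-statistic on the residual of regressing b on a."""
    pair = prices[[a, b]].dropna()
    x, y = pair[a].to_numpy(), pair[b].to_numpy()
    slope, intercept = np.polyfit(x, y, 1)          # step 1: y ~ slope * x + intercept
    resid = y - (slope * x + intercept)
    lagged, delta = resid[:-1], np.diff(resid)      # step 2: change in residual on lagged residual
    gamma = (lagged @ delta) / (lagged @ lagged)    # OLS without a constant
    err = delta - gamma * lagged
    se = np.sqrt((err @ err) / (len(delta) - 1) / (lagged @ lagged))
    return gamma / se

rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=500)) + 50
demo = pd.DataFrame({
    "A": walk,
    "B": 2 * walk + rng.normal(size=500),           # tied to A: residual is stationary noise
    "C": np.cumsum(rng.normal(size=500)) + 50,      # independent random walk
})
print(eg_tstat(demo, "A", "B"), eg_tstat(demo, "A", "C"))
```

The pair that shares a common random walk should produce a far more negative statistic than the pair of independent walks.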


Generating the heatmap takes 22m48s.

In [None]:
import statsmodels.tsa.stattools as ts


df = pd.DataFrame(index=researchData.columns)  
for i in researchData.columns:
  tmp = []
  for j in researchData.columns:
    if i == j:
      tmp = np.append(tmp, 0)
    else:
      compressed_A = []
      compressed_B = []
      for k in range(researchData.shape[0]):
        if np.isnan(researchData[i].iloc[k]) or np.isnan(researchData[j].iloc[k]):
          pass
        else:
          compressed_A = np.append(compressed_A, researchData[i].iloc[k])
          compressed_B = np.append(compressed_B, researchData[j].iloc[k])
      coint_tfloat, pvalue, _ = ts.coint(compressed_A, compressed_B) 
      tmp = np.append(tmp, pvalue)
# print(tmp)
  df[i] = tmp

stocksCorr = df
stocksCorr.style.background_gradient(cmap='coolwarm', axis=None)

Unnamed: 0,BABA-SW,TENCENT,CCB,MEITUAN-W,CHINA MOBILE,AIA,HSBC HOLDINGS,PING AN,HKEX,ICBC,CNOOC,XIAOMI-W,BUD APAC,EVERGRANDE,EVERG HEALTH,ALI HEALTH,SHK PPT,CHINA OVERSEAS,HANG SENG BANK,SANDS CHINA LTD,MTR CORPORATION,CHINA RES LAND,BOC HONG KONG,LONGFOR GROUP,GALAXY ENT,BANK OF CHINA,COUNTRY GARDEN,SMIC,WUXI BIO,HANSOH PHARMA,ANTA SPORTS,HK & CHINA GAS,CKH HOLDINGS,SINO BIOPHARM,HAIDILAO,CLP HOLDINGS,CHINA RES BEER,CM BANK,SUNAC,GEELY AUTO,SUNNY OPTICAL,CK ASSET,TECHTRONIC IND,MENGNIU DAIRY,SHENZHOU INTL,CHINA LIFE,CHINA FEIHE,CHINA UNICOM,PA GOODDOCTOR,CG SERVICES,CHINA GAS HOLD,SHIMAO GROUP,CSPC PHARMA,ZHONGSHENG HLDG,ENN ENERGY,SUNART RETAIL,WH GROUP,YIHAI INTL,NEWWORLDDEV-NEW,WEIGAO GROUP,CHINA RES GAS,SINOPEC CORP,ABC,WHARF REIC,GUANGDONG INV,HENGAN INT'L,CONCH CEMENT,MICROPORT,AAC TECH,CHINARES CEMENT,BYD COMPANY,KINGDEE INT'L,XINYI SOLAR,WYNN MACAU,INNOVENT BIO,CHINA TOWER,CHINA JINMAO,LI NING,CPIC,CONCH VENTURE,A-LIVING,BYD ELECTRONIC,PETROCHINA,ESR,LENOVO GROUP,KUNLUN ENERGY,CIFI HOLD GP,KINGSOFT,CHINA LIT,CITIC BANK,CHINA TAIPING,SJM HOLDINGS,CHINA VANKE,XINYI GLASS,TSINGTAO BREW,CHINA SHENHUA,HUA HONG SEMI,PICC P&C,CNBM,CITIC SEC,MINSHENG BANK,KWG GROUP,WHARF HOLDINGS,BRILLIANCE CHI,ASM PACIFIC,ND PAPER,CICC,GREENTOWN SER,BEIJING ENT,WEICHAI POWER,CANSINOBIO-B,MAN WAH HLDGS,KOOLEARN,CHINA TELECOM,NCI,GENSCRIPT BIO,AVICHINA,AK MEDICAL,CHINA YOUZAN,ZIJIN MINING,MEIDONG AUTO,WUXI APPTEC,HUABAO INTL,JXR,FIT HON TENG,GREATWALL MOTOR,SINOPHARM,HAITONG SEC,3SBIO,GOME RETAIL,WEIMOB INC,EVERSUNSHINE LS,HTSC,PICC GROUP,AIR CHINA,GAC GROUP,ZA ONLINE,TIANNENG POWER,FOSUN PHARMA,XD INC,HENGTEN NET,CGS,ZTE,HOPE EDU,VIVA BIOTECH,SD GOLD,YEAHKA,YONGDA AUTO,DONGFENG GROUP,CHINASOFT INT'L,CSC,CRRC TIMES ELEC,UNITED LAB,CC NEW LIFE,POLY PPT DEV,COFCO MEAT,CMOC,PHARMARON,FUYAO GLASS,ZOOMLION,MEILAN AIRPORT,ZHAOJIN MINING,GANFENGLITHIUM,HKTV,COMEC,EB SECURITIES,CHINA DONGXIANG,FLAT GLASS,HEC PHARM,BROAD HOMES,FI2 CSOP HSI,FL2 CSOP HSI,LINK REIT
BABA-SW,0.0,0.842813,0.024998,0.834846,0.579582,0.893794,0.359878,0.632617,0.962019,0.024852,0.461094,0.993807,0.117632,0.489726,0.636984,0.529521,0.240059,0.456534,0.36085,0.27999,0.224963,0.01148,0.137604,0.195556,0.260894,0.040723,0.046973,0.568843,1.0,0.419244,0.989509,0.285696,0.270911,0.748193,0.851409,0.001971,0.982351,0.901885,0.098739,0.98779,0.863032,0.060427,0.720204,0.984564,0.949641,0.139446,0.337101,0.220543,0.388153,0.653434,0.618867,0.687217,0.559854,0.650507,0.93152,0.523243,0.035058,0.691657,0.043133,0.654364,0.005816,0.091963,0.038274,0.470269,0.019441,0.233162,0.304314,0.834526,0.21488,0.130559,0.982143,0.98853,1.0,0.135681,0.979333,0.365733,0.563556,0.992299,0.621407,0.1365,0.320428,0.654618,0.181384,0.900076,0.993738,0.596875,0.148092,0.586871,0.462796,0.16842,0.149415,0.338627,0.079941,1.0,0.928957,0.382302,0.966371,0.15582,0.031039,0.099455,0.118716,0.027397,0.891098,0.072765,0.433861,0.869904,0.684601,0.152675,0.400885,0.030623,0.544304,0.986032,0.040318,0.368742,0.072115,0.525356,0.772502,0.59656,0.931065,0.993613,0.708002,0.982511,0.831215,0.985873,0.305925,1.0,0.248736,0.160325,0.619589,0.550414,0.896045,0.552977,0.276133,0.075271,0.136595,0.405676,0.354147,0.761652,0.456493,0.621296,0.01877,0.152455,0.107987,0.442914,0.57117,0.237397,0.60584,0.946969,0.91149,0.992126,0.361875,0.301833,0.733382,0.93981,0.3633,0.521894,0.99272,0.947764,0.978454,0.846619,0.46631,0.002418,0.989877,0.356684,0.096284,0.257395,0.119518,0.993435,0.431606,0.164461,0.543581,0.239703,0.395394
TENCENT,0.752371,0.0,0.068483,0.123036,0.228912,0.685719,0.533542,0.455944,0.521658,0.096217,0.428636,0.959324,0.080594,0.491041,0.187082,0.293174,0.251239,0.176244,0.475143,0.320665,0.259099,0.048671,0.146947,0.202042,0.156355,0.075898,0.026168,0.687391,0.99144,0.251032,0.96438,0.100122,0.339809,0.814779,0.731176,0.006951,0.817257,0.471492,0.058038,0.936019,0.528081,0.068693,0.243011,0.64339,0.790401,0.272328,0.116729,0.207144,0.6352,0.423582,0.78509,0.763362,0.597231,0.000441,0.673147,0.531203,0.023009,0.58039,0.191232,0.578583,0.081614,0.115041,0.129297,0.474705,0.077315,0.251813,0.290483,0.404396,0.246142,0.607922,0.98228,0.691802,0.9664,0.158664,0.442227,0.013189,0.341106,0.956516,0.414066,0.12642,0.390091,0.455764,0.194567,0.714099,0.983415,0.367082,0.057615,0.095641,0.064623,0.159231,0.194162,0.471593,0.089822,0.978191,0.387458,0.012016,0.495923,0.140683,0.078063,0.258133,0.155062,0.052742,0.863989,0.067437,0.430478,0.439096,0.215358,0.164343,0.367215,0.010714,0.518061,0.899049,0.051723,0.257496,0.089769,0.406665,0.311604,0.691825,0.419069,0.987845,0.422279,0.547531,0.309352,0.955306,0.352947,0.993772,0.327867,0.158746,0.550098,0.627955,0.374562,0.484219,0.303024,0.080278,0.185886,0.353643,0.52286,0.232389,0.109131,0.273357,0.025217,0.029394,0.068295,0.511702,0.479557,0.363237,0.811036,0.60073,0.865031,0.932047,0.516297,0.222079,0.864498,0.795926,0.436731,0.595714,0.970064,0.260478,0.95926,0.274788,0.549299,0.001483,0.989778,0.365196,0.224733,0.403493,0.393948,0.946788,0.132341,0.809343,0.229464,0.235379,0.532963
CCB,0.533394,0.797067,0.0,0.908283,0.545442,0.94477,0.467455,0.886969,0.972992,0.141144,0.238771,0.976887,0.349753,0.576805,0.870885,0.557131,0.156491,0.958979,0.325348,0.390601,0.405248,0.107854,0.161695,0.656416,0.558397,0.328642,0.247328,0.690372,0.992166,0.605005,0.991807,0.466377,0.233198,0.707629,0.903409,0.343431,0.993842,0.98224,0.389307,0.990944,0.891507,0.142655,0.920067,0.986163,0.912192,0.57178,0.643829,0.389104,0.385201,0.643175,0.62487,0.620048,0.525077,0.906659,0.92959,0.796236,0.15447,0.764478,0.460014,0.802567,0.238689,0.224936,0.104357,0.669435,0.298918,0.288768,0.389515,0.907795,0.460074,0.280825,0.992985,0.986524,1.0,0.272577,0.958804,0.964742,0.969009,0.987085,0.957332,0.139183,0.491363,0.791589,0.039718,0.893141,0.993617,0.823767,0.359068,0.838157,0.834933,0.347072,0.17012,0.557071,0.132242,0.991979,0.956137,0.822449,0.959684,0.334313,0.299983,0.578428,0.943449,0.561052,0.93808,0.189433,0.575178,0.851436,0.889872,0.155941,0.657505,0.346199,0.62035,0.9918,0.041422,0.597124,0.652252,0.602732,0.880552,0.622778,0.949291,0.993286,0.900954,0.944453,0.953578,0.986413,0.67784,1.0,0.300991,0.317855,0.618168,0.664591,0.873479,0.655877,0.46116,0.193191,0.300145,0.551648,0.613632,0.906353,0.707108,0.840565,0.598118,0.874946,0.273846,0.552901,0.671688,0.721908,0.189136,0.968017,0.984084,0.987385,0.702159,0.441101,0.37891,0.921783,0.492856,0.606951,0.992434,0.957576,0.993939,0.719958,0.663373,0.051955,1.0,0.798643,0.486749,0.57524,0.46399,0.989197,0.820538,0.569945,0.921557,0.559289,0.470774
MEITUAN-W,0.595504,0.072221,0.057347,0.0,0.251813,0.507064,0.619968,0.413479,0.393377,0.103453,0.146441,0.931529,0.063116,0.523033,0.134379,0.310029,0.276605,0.097035,0.581179,0.337702,0.272995,0.040678,0.157697,0.06499,0.109249,0.077044,0.031773,0.667543,0.990094,0.310203,0.92344,0.107672,0.34067,0.791407,0.431937,0.006938,0.892843,0.406234,0.048712,0.854762,0.467693,0.050763,0.053719,0.399234,0.511635,0.249026,0.108588,0.202804,0.50196,0.054119,0.802384,0.758827,0.584438,0.041207,0.634106,0.595819,0.018697,0.464633,0.189667,0.824228,0.118812,0.138935,0.121345,0.436966,0.037887,0.190354,0.265092,0.627669,0.250962,0.705768,0.951454,0.352282,0.988585,0.160229,0.263717,0.013501,0.278699,0.917359,0.375195,0.106457,0.333836,0.144163,0.198643,0.227748,0.975917,0.411817,0.082545,0.177865,0.257177,0.043991,0.197541,0.452584,0.08803,0.982177,0.315648,0.009137,0.55418,0.146484,0.310423,0.307053,0.087988,0.068057,0.83055,0.093838,0.451786,0.321561,0.242348,0.143747,0.432253,0.034892,0.478151,0.964329,0.03913,0.320084,0.135469,0.401181,0.27176,0.600695,0.242015,0.86484,0.003937,0.522503,0.171564,0.934203,0.493075,1.0,0.283256,0.162973,0.498036,0.687207,0.440854,0.299012,0.305982,0.081595,0.162773,0.327834,0.649844,0.558576,0.074808,0.261691,0.031085,0.051721,0.049586,0.60984,0.688203,0.574336,0.76829,0.622188,0.846854,0.774277,0.503365,0.223074,0.889565,0.724989,0.349249,0.693884,0.96278,0.157547,0.882462,0.157549,0.23496,0.005782,0.970242,0.256922,0.093736,0.264278,0.393478,0.474417,0.186291,0.918894,0.237576,0.237893,0.563033
CHINA MOBILE,0.593416,0.304808,0.311848,0.495394,0.0,0.693519,0.736066,0.613252,0.666209,0.43282,0.344301,0.471112,0.218086,0.534022,0.732279,0.085912,0.435958,0.216801,0.792216,0.583083,0.519779,0.136086,0.324763,0.17892,0.393101,0.323723,0.126083,0.622346,0.345998,0.251936,0.731006,0.25245,0.598524,0.722521,0.684919,0.003282,0.614602,0.373179,0.025803,0.730081,0.702503,0.26162,0.683279,0.487094,0.700884,0.53701,0.209351,0.514837,0.434483,0.537701,0.894276,0.714377,0.493505,0.420694,0.549173,0.763762,0.036084,0.67952,0.399846,0.713307,0.587932,0.253092,0.378365,0.694932,0.586379,0.157303,0.277386,0.368053,0.332339,0.685127,0.474736,0.370159,0.769587,0.226923,0.464747,0.204173,0.233298,0.668557,0.613522,0.12456,0.328764,0.536639,0.285255,0.552935,0.967588,0.792253,0.210001,0.398598,0.410515,0.117424,0.444013,0.610277,0.18756,0.77897,0.676172,0.129449,0.463603,0.21776,0.32058,0.515773,0.530552,0.511626,0.947132,0.167593,0.701804,0.464464,0.59427,0.150794,0.627811,0.20792,0.460259,0.601445,0.035676,0.31634,0.49045,0.177749,0.612564,0.64829,0.361717,0.677174,0.443718,0.51502,0.389313,0.826175,0.632436,0.871286,0.593322,0.296381,0.304022,0.660845,0.419532,0.284973,0.350251,0.08105,0.265965,0.549692,0.613631,0.551795,0.175579,0.481373,0.392545,0.394064,0.177806,0.57436,0.681364,0.769857,0.459246,0.453784,0.79722,0.701611,0.558143,0.48407,0.805962,0.006501,0.446314,0.541827,0.960319,0.38601,0.857488,0.167792,0.702804,0.023833,0.804574,0.538705,0.416258,0.507957,0.673639,0.79149,0.373585,0.900712,0.597029,0.685179,0.667953
AIA,0.710206,0.493363,0.181275,0.518401,0.400263,0.0,0.369615,0.04339,0.58289,0.173865,0.691349,0.149405,0.155114,0.610957,0.609805,0.278971,0.568065,0.128978,0.676426,0.444551,0.189799,0.209592,0.679445,0.38494,0.224933,0.191566,0.012955,0.662196,0.306139,0.442578,0.422428,0.432923,0.596443,0.586333,0.508144,0.166865,0.678025,0.150634,0.10279,0.191501,0.292856,0.233063,0.394638,0.491678,0.396696,0.532494,0.276991,0.473895,0.417828,0.401803,0.679228,0.567904,0.604308,0.608322,0.239208,0.083612,0.149459,0.614709,0.609476,0.688239,0.218565,0.229619,0.200532,0.215367,0.21667,0.072686,0.07548,0.641166,0.349233,0.834256,0.208462,0.387841,0.403784,0.304372,0.683533,0.457711,0.190867,0.465925,0.052269,0.026798,0.164792,0.502,0.191514,0.125832,0.012616,0.305206,0.206281,0.316901,0.542078,0.314949,0.198269,0.74555,0.119423,0.219838,0.632354,0.430559,0.187122,0.329207,0.44686,0.521285,0.134543,0.539562,0.377378,0.165725,0.273487,0.223353,0.540744,0.091231,0.55708,0.281995,0.616075,0.240831,0.014637,0.545673,0.428493,0.436567,0.697958,0.2081,0.344852,0.358742,0.482556,0.269669,0.396118,0.41142,0.740282,0.79793,0.417444,0.205671,0.523958,0.648952,0.468716,0.360946,0.357964,0.173471,0.101065,0.265462,0.673387,0.720094,0.461939,0.375768,0.534665,0.07853,0.35159,0.558219,0.772624,0.632491,0.592204,0.155544,0.105991,0.041794,0.710915,0.093264,0.713716,0.401469,0.276942,0.764901,0.275487,0.285824,0.211661,0.335747,0.688809,0.026876,0.367811,0.306305,0.39367,0.54542,0.509095,0.409448,0.265306,0.846167,0.57693,0.3061,0.515606
HSBC HOLDINGS,0.552287,0.771331,0.033438,0.964742,0.838142,0.948066,0.0,0.796939,0.98226,0.013075,0.346619,1.0,0.319792,0.530782,0.813385,0.482107,0.157098,0.960973,0.005249,0.285543,0.148263,0.111496,0.130363,0.720752,0.487414,0.03508,0.161479,0.607508,1.0,0.482799,0.991603,0.025456,0.002639,0.493133,0.98589,0.010689,0.993878,0.945211,0.300043,0.989377,0.918585,0.153927,0.956185,0.990292,0.974798,0.502563,0.616847,0.235683,0.266099,0.851152,0.343389,0.420232,0.590772,0.924652,0.925124,0.944648,0.007389,0.814526,0.365882,0.701564,0.115485,0.025361,0.161271,0.761696,0.089337,0.315381,0.215401,0.906008,0.508829,0.321024,0.989799,0.993311,1.0,0.37104,0.990045,0.900305,0.910146,1.0,0.834822,0.092477,0.22775,0.94657,0.024721,0.973086,0.992949,0.824079,0.181272,0.879622,0.850859,0.027657,0.139076,0.564872,0.01688,0.993899,0.960467,0.719784,0.959325,0.149841,0.234377,0.527108,0.806173,0.516943,0.968692,0.082519,0.55305,0.928373,0.878795,0.165648,0.330971,0.224702,0.273211,1.0,0.041229,0.266546,0.525885,0.734343,0.876052,0.652179,0.977076,1.0,0.959221,0.991452,0.863175,0.98482,0.526006,1.0,0.031637,0.080276,0.698612,0.464692,0.900294,0.750834,0.304028,0.07779,0.260058,0.494712,0.4868,0.873961,0.769588,0.89308,0.840561,0.51737,0.069809,0.522061,0.725391,0.348394,0.430649,0.97399,0.989366,0.989437,0.565359,0.442432,0.338378,0.710099,0.630038,0.424565,0.990669,0.987141,1.0,0.762202,0.887984,0.03057,1.0,0.94809,0.28217,0.373772,0.649456,1.0,0.815987,0.778743,0.859589,0.634253,0.034332
PING AN,0.711792,0.648486,0.292698,0.801884,0.720346,0.134184,0.756395,0.0,0.803745,0.267831,0.772879,0.909073,0.105445,0.672926,0.749177,0.399959,0.401126,0.583037,0.718816,0.374656,0.092245,0.19194,0.245455,0.700486,0.211989,0.471054,0.00904,0.643456,0.916844,0.598484,0.811692,0.774641,0.683305,0.620448,0.836661,0.309311,0.934681,0.738641,0.317126,0.728549,0.11776,0.497395,0.794415,0.868634,0.853957,0.795921,0.571475,0.61892,0.409687,0.756241,0.591059,0.520171,0.665129,0.802414,0.298821,0.380326,0.434395,0.82822,0.341392,0.809343,0.234342,0.272658,0.361838,0.099125,0.500883,0.167598,0.135122,0.812708,0.576034,0.767875,0.775636,0.878181,0.9264,0.505455,0.943389,0.728387,0.679652,0.920669,0.203649,0.03502,0.138717,0.788649,0.547331,0.746106,0.630571,0.228189,0.204616,0.632275,0.710538,0.432957,0.365171,0.65814,0.175391,0.915572,0.854325,0.710809,0.456765,0.677304,0.452424,0.60494,0.45191,0.394259,0.035028,0.181128,0.039571,0.707531,0.783444,0.144145,0.830007,0.334665,0.635074,0.702251,0.020477,0.681193,0.779275,0.765538,0.847665,0.265645,0.697564,0.873413,0.794306,0.800852,0.679179,0.249046,0.720349,0.988016,0.625256,0.508432,0.762004,0.653789,0.75981,0.682415,0.487284,0.267327,0.09101,0.037088,0.67294,0.8385,0.664202,0.575578,0.788739,0.11868,0.642933,0.565275,0.836671,0.691622,0.540003,0.551684,0.538203,0.525651,0.728518,0.059862,0.603301,0.519714,0.317096,0.781264,0.269407,0.816167,0.626942,0.732098,0.81959,0.030501,0.883762,0.667112,0.515012,0.682577,0.698669,0.819716,0.883471,0.870061,0.531697,0.16233,0.479381
HKEX,0.813377,0.404006,0.064169,0.331234,0.211003,0.501728,0.654603,0.352124,0.0,0.120633,0.428103,0.926488,0.038795,0.54275,0.096499,0.293823,0.257473,0.156635,0.57238,0.3246,0.265218,0.046076,0.155045,0.33628,0.100637,0.082836,0.024594,0.711226,0.954704,0.275125,0.884792,0.086165,0.34742,0.849787,0.652825,0.006346,0.811652,0.536146,0.114411,0.858802,0.340035,0.071126,0.198606,0.233014,0.658545,0.267285,0.222522,0.204451,0.580698,0.168649,0.842501,0.812449,0.627594,0.722485,0.708376,0.594367,0.033836,0.574146,0.189822,0.879482,0.149864,0.139883,0.110658,0.428479,0.035006,0.252797,0.295922,0.503354,0.219262,0.794793,0.939859,0.522683,0.987162,0.158329,0.385579,0.138627,0.350485,0.948968,0.38109,0.11908,0.377997,0.52059,0.206636,0.619059,0.948233,0.228896,0.039934,0.007139,0.356473,0.160075,0.172893,0.386,0.084303,0.950133,0.028354,0.014925,0.262408,0.185668,0.347909,0.118706,0.324408,0.037498,0.822818,0.06647,0.374225,0.439078,0.239091,0.165999,0.413562,0.016875,0.607125,0.82003,0.049276,0.292521,0.102129,0.486343,0.158855,0.647561,0.064669,0.886708,0.694993,0.188702,0.127448,0.893101,0.554344,0.993885,0.283554,0.110743,0.578545,0.682584,0.151312,0.159875,0.320799,0.076787,0.158952,0.346837,0.623624,0.180484,0.071508,0.031936,0.055592,0.004634,0.084565,0.652269,0.734528,0.582323,0.946231,0.665555,0.754366,0.803286,0.665565,0.128518,0.911655,0.859717,0.421398,0.667342,0.926798,0.291968,0.94307,0.01837,0.466597,0.007485,0.992036,0.577868,0.186469,0.433062,0.29473,0.949172,0.263197,0.958371,0.143665,0.196817,0.442176
ICBC,0.259002,0.696427,0.08226,0.921284,0.603167,0.933414,0.207462,0.8285,0.913809,0.0,0.447566,0.992173,0.297781,0.588976,0.784169,0.591732,0.123572,0.987747,0.049036,0.34932,0.274634,0.081597,0.120638,0.507444,0.478526,0.076793,0.265842,0.694045,0.993082,0.460664,0.988807,0.218195,0.112923,0.648026,0.934633,0.024328,0.992772,0.939318,0.578361,0.992325,0.919033,0.141855,0.833762,0.982608,0.930202,0.585177,0.265237,0.547001,0.278522,0.645899,0.33113,0.41714,0.521065,0.867451,0.927172,0.922006,0.045367,0.614802,0.438921,0.695326,0.077132,0.133582,0.37885,0.672325,0.060071,0.300239,0.301977,0.870173,0.6332,0.078809,0.993527,0.98765,1.0,0.318316,0.966054,0.945904,0.970201,0.99008,0.925229,0.138346,0.452956,0.804724,0.062293,0.923251,0.993765,0.820276,0.275162,0.808276,0.734761,0.806697,0.535964,0.491259,0.23028,0.993628,0.919203,0.658327,0.954575,0.209576,0.10592,0.464923,0.96569,0.436561,0.923194,0.137692,0.548641,0.832881,0.78843,0.160408,0.700496,0.223816,0.544188,0.991736,0.040322,0.72769,0.528787,0.651575,0.809054,0.615127,0.950643,0.992272,0.852647,0.972743,0.931203,0.984381,0.51906,1.0,0.251446,0.630148,0.690959,0.538702,0.835567,0.427968,0.403668,0.234171,0.417018,0.343598,0.373514,0.834107,0.631268,0.844148,0.362253,0.578295,0.281687,0.511837,0.52549,0.534185,0.131457,0.938303,0.974854,0.986744,0.423988,0.430706,0.163517,0.906263,0.5703,0.353187,0.990109,0.964155,0.993805,0.58473,0.750775,0.032189,1.0,0.791556,0.366299,0.385939,0.379918,0.99283,0.731564,0.292625,0.868955,0.676904,0.229291
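
The row-by-row NaN filtering in the cell above can also be expressed with `dropna` on the two columns being compared. The sketch below keeps the same layout (column i, row j holds the statistic for the pair i, j) but takes the statistic as a parameter. The helper `pairwise_stat` and the placeholder `demo_stat` are hypothetical; in the notebook the statistic would be the p-value returned by `ts.coint`.

```python
import numpy as np
import pandas as pd

def pairwise_stat(prices: pd.DataFrame, stat_fn, diagonal=0.0) -> pd.DataFrame:
    """Column i, row j holds stat_fn(prices[i], prices[j]) on the dates where both are present."""
    names = prices.columns
    out = pd.DataFrame(diagonal, index=names, columns=names, dtype=float)
    for i in names:
        for j in names:
            if i == j:
                continue
            pair = prices[[i, j]].dropna()          # replaces the manual NaN-skipping loop
            out.loc[j, i] = stat_fn(pair[i].to_numpy(), pair[j].to_numpy())
    return out

# Placeholder statistic so the sketch runs without statsmodels (an assumption for illustration);
# with statsmodels imported as ts one could pass: lambda a, b: ts.coint(a, b)[1]
def demo_stat(a, b):
    return float(np.corrcoef(a, b)[0, 1])

toy = pd.DataFrame({"X": [1.0, 2.0, np.nan, 4.0, 5.0],
                    "Y": [2.0, 4.1, 6.0, 8.2, np.nan],
                    "Z": [5.0, 4.0, 3.0, 2.0, 1.0]})
matrix = pairwise_stat(toy, demo_stat)
print(matrix.round(3))
```

Aligning each pair once with `dropna` avoids growing arrays element by element, which is likely where part of the long runtime of the original loop comes from.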


In [None]:
pairs = []
p_values = []
for i in df.index:
  p_value = 1
  related_stock = ""
  for k in df.index:
    if i != k and df[i].loc[k] < p_value:
      p_value = df[i].loc[k]
      related_stock = k
  pairs = np.append(pairs, related_stock)
  p_values = np.append(p_values, p_value)

pair_stocks = pd.DataFrame(index=df.index)
pair_stocks["Pairs"] = pairs
pair_stocks["P_value"] = p_values
pair_stocks.sort_values(by=["P_value"], ascending=True)

|  | Pairs | P_value |
|---|---|---|
| SUNAC | ZTE | 0.000014 |
| CHINA DONGXIANG | YEAHKA | 0.000072 |
| HENGAN INT'L | CONCH CEMENT | 0.000165 |
| YONGDA AUTO | ANTA SPORTS | 0.000170 |
| CONCH CEMENT | HENGAN INT'L | 0.000198 |
| ... | ... | ... |
| BROAD HOMES | GUANGDONG INV | 0.061154 |
| INNOVENT BIO | CHINA TOWER | 0.062459 |
| SJM HOLDINGS | CGS | 0.068593 |
| LI NING | CHINA JINMAO | 0.069946 |
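
The nested loop above picks, for each stock, the partner with the lowest p-value. The same selection can be sketched with `idxmin` after hiding the diagonal. The small matrix below is a made-up stand-in for `df`, laid out the same way (column i, row k holds the p-value for the pair i, k).

```python
import numpy as np
import pandas as pd

names = ["A", "B", "C"]
# Hypothetical p-value matrix laid out like `df`
pvals = pd.DataFrame([[0.00, 0.20, 0.03],
                      [0.10, 0.00, 0.40],
                      [0.05, 0.01, 0.00]], index=names, columns=names)

# Hide the diagonal so a stock cannot be paired with itself
masked = pvals.mask(np.eye(len(names), dtype=bool))

best = pd.DataFrame({"Pairs": masked.idxmin(axis=0),
                     "P_value": masked.min(axis=0)}).sort_values("P_value")
print(best)
```

One difference from the loop: the loop leaves the partner empty when no p-value is below 1, whereas `idxmin` always returns the smallest entry in the column.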


Unused code

In [None]:
"""#  mean_A = stockA_price_stockB_price[A_name].mean()
#  mean_B = stockA_price_stockB_price[B_name].mean()
#  sum_cov_A_B = 0 
#  for i in range(stockA_price_stockB_price.shape[0]):
#    if np.isnan(stockA_price_stockB_price.iloc[i][0]) or np.isnan(stockA_price_stockB_price.iloc[i][1]):
#      pass
#    else:
#      sum_cov_A_B += (stockA_price_stockB_price.iloc[i][0] - mean_A) * (stockA_price_stockB_price.iloc[i][1] - mean_B)
# print(sum_cov_A_B)
# print(i)

#  sum_var_A = 0
#  for i in range(stockA_price_stockB_price.shape[0]):
#    if (np.isnan(stockA_price_stockB_price.iloc[i][0])):
#      pass
#    else:
#      sum_var_A += (stockA_price_stockB_price.iloc[i][0] - mean_A) * (stockA_price_stockB_price.iloc[i][0] - mean_A)
# print(sum_var_A)

means = pd.DataFrame()
for i in researchData.columns:
  means[i] = [researchData[i].mean()]
# print(researchData['BABA-SW'].mean()) 
# print(means['BABA-SW'])
# print("-------------------------")


vars = pd.DataFrame()
for i in researchData.columns:
  var = 0
  for k in range(researchData[i].shape[0]):
    if  np.isnan(researchData[i].iloc[k]):
      pass
    else:
      var += (researchData[i].iloc[k] - means[i]) * (researchData[i].iloc[k] - means[i])
  vars[i] = [var]

covs = pd.DataFrame()
for i in researchData.columns:
  tmp = []
  for j in researchData.columns:
    sum_cov_A_B = 0
    for k in range(researchData[i].shape[0]):
      if np.isnan(researchData[i].iloc[k]) or np.isnan(researchData[j].iloc[k]):
        pass
      else:
        sum_cov_A_B += (researchData[i].iloc[k] - means[i]) * (researchData[j].iloc[k] - means[j])
    # print(sum_cov_A_B)
    tmp = np.append(tmp, sum_cov_A_B)
  # print(tmp)
  covs[i] = tmp

df = pd.DataFrame(index=researchData.columns)  
for i in researchData.columns:
  tmp = []
  for j in researchData.columns:
    if i == j:
      tmp = np.append(tmp, 0)
    else:
      tmp = np.append(tmp, cointegration(researchData[[i, j]], i, j, means[i], means[j], vars[i], covs[i].iloc[j]))
  df[i] = tmp

stocksCorr = df
stocksCorr.style.background_gradient(cmap='coolwarm', axis=None)"""


# 3a. Algo (arbitrage)

> Constant geometric mean of the two stocks

Reason:
1.   Even when two stocks are correlated, meaning their prices rise and fall together most of the time, the price ratio can break down when the market shares of the two companies change. We therefore focus only on a 90-day geometric mean of the two stocks, which roughly matches the interval between quarterly results releases.

     [Improvement: switch to a moving geometric mean, or use cross-entropy to signal when the geometric mean should be recalculated.]
2.   The geometric mean is a better choice than the arithmetic mean for a price time series.

> Input: price data for a pair of stocks, and the number of standard deviations (st_score) used as the threshold

> Algo: generate trading signals from threshold bands around the pair's price ratio

> Output: a new DataFrame containing the sequence of trading signals
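
The "moving geometric mean" improvement mentioned above could look roughly like this. It is only a sketch: `rolling_gmean` is a hypothetical helper, the window of 2 is for the toy data only, and it relies on the identity that a geometric mean is the exponential of the mean of the logs.

```python
import numpy as np
import pandas as pd

def rolling_gmean(prices: pd.Series, window: int = 90) -> pd.Series:
    """Rolling geometric mean: exp of the rolling mean of log prices."""
    return np.exp(np.log(prices).rolling(window, min_periods=window).mean())

# Toy prices (hypothetical), with a short window so the result is easy to check
a = pd.Series([1.0, 2.0, 4.0, 8.0])
b = pd.Series([1.0, 1.0, 1.0, 1.0])

# Moving centre of the A/B ratio, replacing the single constant geo_ratio
ratio_center = rolling_gmean(a, window=2) / rolling_gmean(b, window=2)
print(ratio_center)
```

If the rolling centre is used to trade on the same day, shifting it by one row (`ratio_center.shift(1)`) keeps each day's band based on past prices only.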


In [None]:
from scipy.stats.mstats import gmean

def signal_pre_generate(stockA_price_stockB_price, A_name, B_name, st_score):
  # Geometric mean of each price series (NaNs dropped) and the ratio of the two
  geo_mean_A = float(gmean(stockA_price_stockB_price[A_name].dropna()))
  geo_mean_B = float(gmean(stockA_price_stockB_price[B_name].dropna()))
  geo_ratio = geo_mean_A / geo_mean_B

  A_v_B = pd.DataFrame(index=stockA_price_stockB_price.index)
  A_v_B["pxRatio"] = stockA_price_stockB_price[A_name] / stockA_price_stockB_price[B_name]
  A_v_B["Stock A Prices"] = stockA_price_stockB_price[A_name]
  A_v_B["Stock B Prices"] = stockA_price_stockB_price[B_name]

  # Entry bands: geo_ratio +/- st_score standard deviations of the price ratio
  sd_A_to_B = A_v_B["pxRatio"].std()
  shortA_longB_ratio = geo_ratio + sd_A_to_B * st_score
  longA_shortB_ratio = geo_ratio - sd_A_to_B * st_score

  # Determine the signal for each date
  # signal == -1: Long stockA Short stockB
  # signal == 1: Short stockA Long stockB
  # signal == 0: flat position
  signal = 0
  signals = []
  for pxRatio in A_v_B["pxRatio"]:
    if pxRatio > shortA_longB_ratio:
      signal = 1
    elif pxRatio < longA_shortB_ratio:
      signal = -1
    elif signal == 1 and pxRatio > geo_ratio:
      signal = 1   # keep the position until the ratio reverts to geo_ratio
    elif signal == -1 and pxRatio < geo_ratio:
      signal = -1  # keep the position until the ratio reverts to geo_ratio
    else:
      signal = 0
    signals.append(signal)
  A_v_B["Signals"] = np.array(signals, dtype=float)
  return A_v_B

signal_pre_generate(researchData[["GALAXY ENT", "BUD APAC"]], "GALAXY ENT", "BUD APAC", 1.5)

Date,pxRatio,Stock A Prices,Stock B Prices,Signals
2020-01-02,2.306269,60.077793,26.049772,0.0
2020-01-03,2.212828,58.296242,26.344673,0.0
2020-01-06,2.226185,57.553932,25.853168,0.0
2020-01-07,2.236617,58.593166,26.197224,0.0
2020-01-08,2.265903,58.692139,25.902319,0.0
...,...,...,...,...
2020-12-24,2.354823,58.750000,24.948799,0.0
2020-12-28,2.341968,57.500000,24.552000,0.0
2020-12-29,2.403688,58.299999,24.254400,0.0
2020-12-30,2.374022,59.700001,25.147200,0.0


# 3b. Algo (Trend Checking)

1.   Long/short MA

> Long MA > Short MA for three consecutive days --> downtrend

> Use exponential MAs (EMA)

2.   Choosing parameters

> Arbitrarily assign (90, 5). 90: the period between quarterly results releases; 5: the reaction time of the stock price.

> Select by heatmap

3.   Return: 1 for an uptrend; -1 for a downtrend; 0 for a trading range.

Usage:

      not yet in the market + both stocks in the same trend --> do not enter

      in the market + both stocks in the same trend --> if winning --> continue

      in the market + both stocks in the same trend --> if losing --> terminate the trade

      *current approach: in the market + both stocks in the same trend --> continue*
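
The "three consecutive days" rule in point 1 can be sketched as below. This is an illustration only: `ema_trend` and its `days` parameter are hypothetical names, and the rule differs from the `trend_calculation` cell that follows, which compares the short EMA against a standard-deviation band around the long EMA instead.

```python
import numpy as np
import pandas as pd

def ema_trend(prices: pd.Series, long_span: int = 90, short_span: int = 5,
              days: int = 3) -> pd.Series:
    """Return 1 (uptrend), -1 (downtrend) or 0 (trading range) per date."""
    long_ema = prices.ewm(span=long_span, min_periods=long_span).mean()
    short_ema = prices.ewm(span=short_span, min_periods=short_span).mean()

    # True only when the condition held on each of the last `days` rows;
    # comparisons against a NaN EMA are False, so warm-up rows stay at 0
    up = (short_ema > long_ema).astype(int).rolling(days).sum() == days
    down = (short_ema < long_ema).astype(int).rolling(days).sum() == days

    trend = pd.Series(0, index=prices.index)
    trend[up] = 1
    trend[down] = -1
    return trend

# Toy example with short spans: a steadily rising, then a steadily falling series
prices = pd.Series(np.arange(1.0, 21.0))
trend = ema_trend(prices, long_span=5, short_span=2)
falling_trend = ema_trend(100 - prices, long_span=5, short_span=2)
```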

In [None]:
import math

# Note: st_score and alpha are accepted but not used in the calculation below;
# the trend band is fixed at one standard deviation of the long EMA.
def trend_calculation(stockA, stockA_name, st_score, long_trend=90, short_trend=5, alpha=1-(math.e/ math.pi)):
  long_ema = stockA[stockA_name].ewm(min_periods=long_trend, span=long_trend).mean()
  short_ema = stockA[stockA_name].ewm(min_periods=short_trend, span=short_trend).mean()
  long_ema_sd = long_ema.std()

  # 1: short EMA above the upper band (uptrend)
  # -1: short EMA below the lower band (downtrend)
  # 0: inside the band, or long EMA not available yet (comparisons with NaN are False)
  signals = np.select(
      [short_ema > long_ema + long_ema_sd, short_ema < long_ema - long_ema_sd],
      [1.0, -1.0],
      default=0.0)

  df = pd.DataFrame(index=stockA.index)
  df[stockA_name] = stockA[stockA_name]
  df["Long EMA"] = long_ema
  df["Short EMA"] = short_ema
  df["Signals"] = signals
  return df

trend_calculation(researchData[["SUNAC"]], 'SUNAC', 0.5)

Date,SUNAC,Long EMA,Short EMA,Signals
2020-01-02,42.783207,,,0.0
2020-01-03,41.864548,,,0.0
2020-01-06,40.420944,,,0.0
2020-01-07,40.508434,,,0.0
2020-01-08,40.027233,,40.635525,0.0
...,...,...,...,...
2020-12-24,24.977028,27.592253,24.977058,-1.0
2020-12-28,24.567568,27.525027,24.840561,-1.0
2020-12-29,25.250000,27.474475,24.977041,-1.0
2020-12-30,25.250000,27.425059,25.068027,-1.0


# 4. Algo (merging & backtest)

signal_pre_generate: returns df["pxRatio", "Stock A Prices", "Stock B Prices", "Signals"]

trend_calculation: returns df["Stock", "Long EMA", "Short EMA", "Signals"]

Finally: return (Date, stockA, stockB, signal, dollarValue)

In [None]:
def backTest(pair_stocks_df):
  st_score_for_ab = 1.5
  st_score_for_trend = 0.5
  pairsPortfolioBackTest = []
  for stockA in pair_stocks_df.sort_values(by=["P_value"], ascending=True).index:
    stockB = pair_stocks_df["Pairs"].loc[stockA]
    p_value = pair_stocks_df["P_value"].loc[stockA]
    cointe = signal_pre_generate(researchData[[stockA, stockB]], stockA, stockB, st_score_for_ab)
    trend_A = trend_calculation(researchData[[stockA]], stockA, st_score_for_trend)["Signals"].to_numpy()
    trend_B = trend_calculation(researchData[[stockB]], stockB, st_score_for_trend)["Signals"].to_numpy()

    # If both stocks were in the same (non-zero) trend yesterday and the pair was flat,
    # stay flat today instead of opening a new position
    signals = cointe["Signals"].to_numpy().copy()
    for j in range(1, len(signals)):
      same_trend = trend_A[j - 1] == trend_B[j - 1] and trend_A[j - 1] != 0
      if same_trend and signals[j - 1] == 0:
        signals[j] = 0

    df = pd.DataFrame(index=researchData.index)
    df[stockA] = researchData[stockA]
    df[stockB] = researchData[stockB]
    df["signal"] = signals.astype(int)

    # Position size by p-value bucket; pairs below 0.0005 or at/above 0.03 are not traded
    if 0.0005 <= p_value < 0.005:
      dollarValue = 10000
    elif 0.005 <= p_value < 0.01:
      dollarValue = 20000
    elif 0.01 <= p_value < 0.03:
      dollarValue = 10000
    else:
      dollarValue = 0
    df["dollarValue"] = dollarValue
    pairsPortfolioBackTest.append(df)
  return pairsPortfolioBackTest

pairsPortfolioBackTest = backTest(pair_stocks)
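
The backTest cell above passes researchData to signal_pre_generate, so the bands and the signals come from the same prices. To backtest on testData without lookahead bias, the bands would need to be estimated on research prices and then frozen. A minimal sketch of that split follows; `fit_bands` and `apply_bands` are hypothetical helpers, and the toy prices are made up.

```python
import numpy as np
import pandas as pd

def fit_bands(research: pd.DataFrame, a: str, b: str, st_score: float):
    """Estimate the ratio centre and entry bands from research prices only."""
    center = np.exp(np.log(research[a]).mean()) / np.exp(np.log(research[b]).mean())
    sd = (research[a] / research[b]).std()
    return center, center - st_score * sd, center + st_score * sd

def apply_bands(test: pd.DataFrame, a: str, b: str, center, lower, upper) -> pd.Series:
    """Replay the entry/exit rule on unseen prices with the frozen bands."""
    signal, out = 0, []
    for ratio in test[a] / test[b]:
        if ratio > upper:
            signal = 1       # short A, long B
        elif ratio < lower:
            signal = -1      # long A, short B
        elif not ((signal == 1 and ratio > center) or (signal == -1 and ratio < center)):
            signal = 0       # ratio has reverted to the centre: go flat
        out.append(signal)
    return pd.Series(out, index=test.index, name="signal")

# Toy data (hypothetical): B is constant, so the ratio simply follows A
research = pd.DataFrame({"A": [2.0, 2.2, 1.8, 2.0], "B": [1.0, 1.0, 1.0, 1.0]})
test = pd.DataFrame({"A": [2.0, 2.3, 2.1, 1.9, 1.7], "B": [1.0] * 5})

center, lower, upper = fit_bands(research, "A", "B", st_score=1.0)
sig = apply_bands(test, "A", "B", center, lower, upper)
print(sig.tolist())
```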

In [None]:
for index,row in pairsPortfolioBackTest[0].iterrows():
  print(row['dollarValue'])

# 5. Calculating overall results
Each entry of pairsPortfolioBackTest is a [201, 4] DataFrame = [Stock A's price, Stock B's price, signal, dollar value]

signal = 1 --> short A long B

signal = -1 --> long A short B
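
To make the sign convention concrete, here is a rough sketch of how signal and dollarValue could translate into PnL for one pair. It is not the pairslib implementation, which is not shown in this notebook and may differ; `pair_pnl_sketch` is a hypothetical helper that assumes equal dollar amounts on both legs and that yesterday's signal determines today's exposure.

```python
import pandas as pd

def pair_pnl_sketch(df: pd.DataFrame, a: str, b: str) -> float:
    """Approximate PnL of one backtested pair (hypothetical, simplified)."""
    ret_a = df[a].pct_change()
    ret_b = df[b].pct_change()
    # Yesterday's signal and size decide today's exposure
    position = df["signal"].shift(1).fillna(0)
    notional = df["dollarValue"].shift(1).fillna(0)
    # signal == 1: short A, long B -> earns (ret_b - ret_a); signal == -1 is the mirror
    daily = notional * position * (ret_b - ret_a)
    return float(daily.sum())

# Toy pair (made-up prices): short A / long B for two days
toy = pd.DataFrame({
    "A": [10.0, 11.0, 11.0],
    "B": [20.0, 20.0, 21.0],
    "signal": [1, 1, 0],
    "dollarValue": [1000, 1000, 1000],
})
pnl_toy = pair_pnl_sketch(toy, "A", "B")
print(round(pnl_toy, 2))
```

Under this convention a pair with dollarValue 0 contributes nothing, which is consistent with the `$ 0.0` lines printed for some pairs in the output below.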


In [None]:
# Implement your logic to construct "pairsPortfolioBackTest"
# pairsPortfolioBackTest needs to be same format as in https://colab.research.google.com/github/kenwkliu/ideas/blob/master/colab/HKStocksCorrelation.ipynb
# It is a list of backtested Pairs
# Each backtested Pairs is a dataframe with at least 5 columns (Date, stockA, stockB, signal, dollarValue)

# signal is -1, 0, 1
# signal == -1: Long stockA Short stockB
# signal == 1: Short stockA Long stockB
# signal == 0: flat position

# Calculate the PnL of the Pairs portfolio
pnl, pnlDf = pairslib.calcPortfolio(pairsPortfolioBackTest)
pnlDf

SUNAC vs ZTE ---> $ 0.0
CHINA DONGXIANG vs YEAHKA ---> $ 0.0
HENGAN INT'L vs CONCH CEMENT ---> $ 0.0
YONGDA AUTO vs ANTA SPORTS ---> $ 0.0
CONCH CEMENT vs HENGAN INT'L ---> $ 0.0
MENGNIU DAIRY vs ZIJIN MINING ---> $ 0.0
WH GROUP vs CHINA FEIHE ---> $ 0.0
CLP HOLDINGS vs KINGSOFT ---> $ 0.0
KOOLEARN vs CONCH CEMENT ---> $ 0.0
TENCENT vs ZHONGSHENG HLDG ---> $ 0.0
XIAOMI-W vs FLAT GLASS ---> $ 0.0
ZIJIN MINING vs MENGNIU DAIRY ---> $ 0.0
PICC GROUP vs ASM PACIFIC ---> $ 0.0
GUANGDONG INV vs YEAHKA ---> $ 0.0
ZHONGSHENG HLDG vs TENCENT ---> $ 0.0
ZHAOJIN MINING vs HENGTEN NET ---> $ 756.533878833769
WYNN MACAU vs SINOPHARM ---> $ 506.4670549462719
NEWWORLDDEV-NEW vs CHINA LIFE ---> $ 219.49778474473203
WEICHAI POWER vs COMEC ---> $ 190.107677831483
COUNTRY GARDEN vs ENN ENERGY ---> $ 360.31045803009295
CGS vs EVERG HEALTH ---> $ 668.0166933859135
FLAT GLASS vs XIAOMI-W ---> $ -109.79971526439294
CHINASOFT INT'L vs GEELY AUTO ---> $ 466.1920063410929
SHK PPT vs A-LIVING ---> $ 433.63820722

Unnamed: 0,stockA,stocksB,Pnl
0,SUNAC,ZTE,0.0
1,CHINA DONGXIANG,YEAHKA,0.0
2,HENGAN INT'L,CONCH CEMENT,0.0
3,YONGDA AUTO,ANTA SPORTS,0.0
4,CONCH CEMENT,HENGAN INT'L,0.0
...,...,...,...
168,BROAD HOMES,GUANGDONG INV,0.0
169,INNOVENT BIO,CHINA TOWER,0.0
170,SJM HOLDINGS,CGS,0.0
171,LI NING,CHINA JINMAO,0.0
