# 14 SKEWNESS

##### Content:
* approaches to measuring total skewness, co-skewness (aka systematic skewness), and idiosyncratic skewness
* the ability of these variables to predict future stock returns

##### Motivation:
* The empirical failures of the CAPM prompted researchers to search for other models to describe expected security returns.

##### Total skewness:
* Arditti (1967, 1971) shows theoretically and empirically that investors demand a higher (lower) rate of return on investments whose return distributions are negatively (positively) skewed.
* Scott and Horvath (1980) extend this analysis to include not just the third moment, but all higher moments of the distribution of returns.

#####  Systematic skewness: 
* Kraus and Litzenberger (1976) show that expected security returns are determined not only by the amount of systematic (undiversifiable) variance associated with the security but also by the security's systematic skewness.
* Harvey and Siddique (2000) find that systematic skewness commands a risk premium.

##### Idiosyncratic skewness: 
* Kane (1982), Beedles (1978), and Conine and Tamarkin (1981) suggest that idiosyncratic skewness may be relevant to the pricing of securities.
* Boyer, Mitton, and Vorkink (2010) and Bali, Cakici, and Whitelaw (2011) find a strong negative cross-sectional relation between idiosyncratic skewness and future stock returns.

## 14.1 MEASURING SKEWNESS

![skew12.png](https://i.loli.net/2020/05/01/JYoNIOdj5b1KaA7.png)

![skew3.png](https://i.loli.net/2020/05/01/lzTw6YqH71FAyei.png)

##### Several different measures (varying in the length of the measurement period and the frequency of the data used):
* calculate each variable using one, three, six, and 12 months' worth of daily return data, requiring a minimum of 15, 50, 100, and 200 days of valid returns during the measurement period, respectively (the code below uses a minimum of 10 days for the one-month window)
* calculate each variable using one, two, three, and five years' worth of monthly return data, requiring a minimum of 10, 20, 24, and 24 months of valid returns during the measurement period, respectively
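
As a rough illustration of what the three variables measure, the sketch below computes them for a single synthetic stock. It assumes, in line with the code cells in this notebook, that total skewness is the sample skewness of excess returns, co-skewness is the slope on the squared market excess return in a regression on the market and its square, and idiosyncratic skewness is the skewness of three-factor residuals. The helper names (`sample_skew`, `ols`) and all simulated numbers are hypothetical.

```python
import numpy as np

def sample_skew(x):
    """Bias-adjusted sample skewness (the estimator used by pandas' Series.skew)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d**2)
    m3 = np.mean(d**3)
    return np.sqrt(n * (n - 1)) / (n - 2) * m3 / m2**1.5

def ols(y, X):
    """OLS with an intercept; returns (coefficients, residuals)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, y - A @ beta

rng = np.random.default_rng(0)
n = 250                                   # roughly 12 months of daily data
mkt = rng.normal(0.03, 1.0, n)            # synthetic market excess return (%)
smb = rng.normal(0.0, 0.5, n)             # synthetic size factor
hml = rng.normal(0.0, 0.5, n)             # synthetic value factor
eps = rng.exponential(1.0, n) - 1.0       # right-skewed idiosyncratic shock
r = 0.9 * mkt + 0.3 * smb - 0.2 * hml + 0.5 * mkt**2 + eps

# Total skewness: skewness of the excess return itself
skew = sample_skew(r)

# Co-skewness: coefficient on mkt^2 in r = a + b*mkt + c*mkt^2 + e
beta_cs, _ = ols(r, np.column_stack([mkt, mkt**2]))
coskew = beta_cs[2]

# Idiosyncratic skewness: skewness of the three-factor residuals
_, resid = ols(r, np.column_stack([mkt, smb, hml]))
idioskew = sample_skew(resid)

print(round(skew, 2), round(coskew, 2), round(idioskew, 2))
```

In the notebook the same calculation is repeated for every stock and every rolling window, with the result set to missing when the window has too few valid returns.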

In [2]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from scipy.stats.mstats import winsorize
import openpyxl  # used to write data to Excel

In [None]:
############################################# Concatenate and pre-process the return data
############ Daily data
# Read the 13 daily files (1.csv ... 13.csv) and stack them into one frame
daily_data = pd.concat([pd.read_csv(f'{i}.csv') for i in range(1,14)])
daily_data['date'] =  pd.to_datetime(daily_data['date'])
daily_data['rt'] = daily_data['rt']*100
daily_data['year'] = daily_data['date'].dt.year
daily_data['month'] = daily_data['date'].dt.month
daily_data = daily_data[daily_data['year']>=1995]
daily_data = daily_data[daily_data['year']<=2019]

Acode = pd.read_csv('Acode.csv')
daily_data = pd.merge(daily_data,Acode,on='code')
daily_data = daily_data[(daily_data['exchcd']!=2)&(daily_data['exchcd']!=8)]# Sample selection

factor_daily = pd.read_csv('fivefactor_daily.csv')
factor_daily['date'] =  pd.to_datetime(factor_daily['date'])
factor_daily.iloc[:,1:] = factor_daily.iloc[:,1:]*100
factor_daily = factor_daily[factor_daily['date'].dt.year>=1995]
factor_daily = factor_daily[factor_daily['date'].dt.year<=2019]
rf_daily = factor_daily[['date','rf']]
daily_data = pd.merge(daily_data,rf_daily,on='date')
daily_data['rt'] = daily_data['rt']-daily_data['rf']

daily_exrt = pd.pivot_table(daily_data,index='date',columns='code',values='rt')
daily_exrt['month_num'] = (daily_exrt.index.year-1995)*12 + daily_exrt.index.month

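mktrf_daily = factor_daily[['mkt_rf']].copy()  # copy so the new column is not written to a slice
mktrf_daily['mkt_rf2'] = mktrf_daily['mkt_rf']**2  # squared market excess return, the co-skewness regressor
mktrf_daily.index = factor_daily['date']

threefactor_daily = factor_daily[['mkt_rf','smb','hml']].copy()  # Fama-French three factors
threefactor_daily.index = factor_daily['date']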

In [3]:
############## Monthly data
monthly = pd.read_csv('monthly.csv')
monthly['mktcap'] = monthly['mktcap']*1000
monthly = monthly[(monthly['type']!=2)&(monthly['type']!=8)]
monthly['date'] = pd.to_datetime(monthly['date'])
monthly['month'] = monthly['date'].dt.month
monthly['year'] = monthly['date'].dt.year
monthly = monthly[monthly['year']>=1995]
monthly = monthly[monthly['year']<=2019]
monthly['rt'] = monthly['rt']*100

factor_monthly = pd.read_csv('fivefactor_monthly.csv')
factor_monthly['year'] = factor_monthly['date']//100
factor_monthly['month'] = factor_monthly['date']%100
factor_monthly = factor_monthly[(factor_monthly['year']>=1995)&(factor_monthly['year']<=2019)]
del factor_monthly['date']
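rf_monthly = factor_monthly[['year','month','rf']]  # taken before the *100 scaling below, so rf stays in the file's raw units
factor_monthly = factor_monthly[(factor_monthly['year']>=2000)&(factor_monthly['year']<=2019)].copy()
# month_num counts from January 2000 here, whereas the return tables count from January 1995
factor_monthly['month_num'] = (factor_monthly['year']-2000)*12+factor_monthly['month']
factor_monthly[['mkt_rf','smb','hml','mom','rf']] = factor_monthly[['mkt_rf','smb','hml','mom','rf']]*100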

monthly = pd.merge(monthly,rf_monthly,on=['year','month'])
monthly['rt'] = monthly['rt'] - monthly['rf']
monthly['month_num'] = (monthly['year']-1995)*12+monthly['month']
monthly_exrt = pd.pivot_table(monthly,index='month_num',columns='code',values='rt')
monthly_exrt['month_num'] = monthly_exrt.index

mktrf_monthly = factor_monthly[['mkt_rf']].copy()  # copy so the new column is not written to a slice
mktrf_monthly['mkt_rf2'] = mktrf_monthly['mkt_rf']**2
mktrf_monthly.index = factor_monthly['month_num']

threefactor_monthly = factor_monthly[['mkt_rf','smb','hml']].copy()
threefactor_monthly.index= range(1,241)
FFCPSfactor_monthly = factor_monthly[['mkt_rf','smb','hml','mom']].copy()
FFCPSfactor_monthly.index = range(1,241)

psl = pd.read_csv('PSL.csv',index_col=0)
psl.index = range(1,241)
psl.columns = ['pls']
FFCPSfactor_monthly = pd.concat([FFCPSfactor_monthly,psl],axis=1)


In [4]:
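def cut2000(df):
    # Keep month numbers 61 onward (January 2000 onward, counting from January 1995) and relabel them 1-240
    x = df.loc[:,61:]
    x.columns = range(1,241)
    return x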

In [None]:
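###################################################### Compute the three skewness measures
############# skew
def skew_calculator(data,span,low_limit):
    '''
    Compute total skewness for every stock over rolling windows and return the results as a table.
    
    Parameters
    ----------
    data: DataFrame indexed by date with one column per stock code (the last column is the month_num of each date); values are rt
    span: length of each measurement window in months (12 for one year)
    low_limit: minimum number of valid observations (days or months) in the window, e.g. 10 for one month or 50 for three months of daily data, 10 for one year or 20 for two years of monthly data
    
    Returns
    -------
    DataFrame with stock codes as index and month numbers as columns; values are the skewness estimated over the corresponding window
    '''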
    X = pd.DataFrame()
    for i in range(max(data['month_num'])-span+1):
        same_time_data = data[(data['month_num']>i)&(data['month_num']<=i+span)]
        same_time = []
        code_list = list(data.columns[:-1])
        for code in code_list:
            temp_data = same_time_data[code]
            if temp_data.notna().sum() >= low_limit:
                skew = temp_data.skew()
            else:
                skew = np.nan
            same_time.append(skew)
        same_time = pd.Series(same_time,index = code_list,name = i+span)
        X = pd.concat([X,same_time],axis=1,sort=False)
    return X    

skew_1m = skew_calculator(daily_exrt,1,10)
skew_3m = skew_calculator(daily_exrt,3,50)
skew_6m = skew_calculator(daily_exrt,6,100)
skew_12m = skew_calculator(daily_exrt,12,200)
skew_1y = skew_calculator(monthly_exrt,12,10)
skew_2y = skew_calculator(monthly_exrt,24,20)
skew_3y = skew_calculator(monthly_exrt,36,24)
skew_5y = skew_calculator(monthly_exrt,60,24)

skew_1m = cut2000(skew_1m)
skew_3m = cut2000(skew_3m)
skew_6m = cut2000(skew_6m)
skew_12m = cut2000(skew_12m)
skew_1y = cut2000(skew_1y)
skew_2y = cut2000(skew_2y)
skew_3y = cut2000(skew_3y)
skew_5y = cut2000(skew_5y)

skew_1m.to_csv('skew_1m.csv')
skew_3m.to_csv('skew_3m.csv')
skew_6m.to_csv('skew_6m.csv')
skew_12m.to_csv('skew_12m.csv')
skew_1y.to_csv('skew_1y.csv')
skew_2y.to_csv('skew_2y.csv')
skew_3y.to_csv('skew_3y.csv')
skew_5y.to_csv('skew_5y.csv')

In [5]:
skew_1m = pd.read_csv('skew_1m.csv',index_col=0)
skew_1m.columns = range(1,241)
skew_3m = pd.read_csv('skew_3m.csv',index_col=0)
skew_3m.columns = range(1,241)
skew_6m = pd.read_csv('skew_6m.csv',index_col=0)
skew_6m.columns = range(1,241)
skew_12m = pd.read_csv('skew_12m.csv',index_col=0)
skew_12m.columns = range(1,241)
skew_1y = pd.read_csv('skew_1y.csv',index_col=0)
skew_1y.columns = range(1,241)
skew_2y = pd.read_csv('skew_2y.csv',index_col=0)
skew_2y.columns = range(1,241)
skew_3y = pd.read_csv('skew_3y.csv',index_col=0)
skew_3y.columns = range(1,241)
skew_5y = pd.read_csv('skew_5y.csv',index_col=0)
skew_5y.columns = range(1,241)

In [None]:
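############# coskew
def coskew_calculator(data,factor,span,low_limit):
    '''
    Compute co-skewness for every stock over rolling windows and return the results as a table.
    
    Parameters
    ----------
    data: DataFrame indexed by date with one column per stock code (the last column is the month_num of each date); values are rt
    factor: market factor data (mkt_rf and mkt_rf2), indexed by date or month number
    span: length of each regression window in months (12 for one year)
    low_limit: minimum number of valid observations (days or months) in the window, e.g. 10 for one month or 50 for three months of daily data, 10 for one year or 20 for two years of monthly data
    
    Returns
    -------
    DataFrame with stock codes as index and month numbers as columns; values are the co-skewness estimated over the corresponding window
    '''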
    X = pd.DataFrame()
    for i in range(max(data['month_num'])-span+1):
        same_time_data = data[(data['month_num']>i)&(data['month_num']<=i+span)]
        same_time = []
        code_list = list(data.columns[:-1])
        for code in code_list:
            temp_data = same_time_data[code].rename('rt')
            reg_data = pd.concat([temp_data,factor],axis=1,sort=False,join='inner')
            if reg_data['rt'].notna().sum() >= low_limit:
                model = smf.ols('rt~mkt_rf+mkt_rf2',reg_data,missing='drop').fit()
                coskew = model.params['mkt_rf2']  # slope on the squared market excess return
            else:
                coskew = np.nan
            same_time.append(coskew)
        same_time = pd.Series(same_time,index = code_list,name = i+span)
        X = pd.concat([X,same_time],sort=False,axis=1)
        print(i)
    return X

coskew_1m = coskew_calculator(daily_exrt,mktrf_daily,1,10)
coskew_3m = coskew_calculator(daily_exrt,mktrf_daily,3,50)
coskew_6m = coskew_calculator(daily_exrt,mktrf_daily,6,100)
coskew_12m = coskew_calculator(daily_exrt,mktrf_daily,12,200)
coskew_1y = coskew_calculator(monthly_exrt,mktrf_monthly,12,10)
coskew_2y = coskew_calculator(monthly_exrt,mktrf_monthly,24,20)
coskew_3y = coskew_calculator(monthly_exrt,mktrf_monthly,36,24)
coskew_5y = coskew_calculator(monthly_exrt,mktrf_monthly,60,24)

coskew_1m = cut2000(coskew_1m)
coskew_3m = cut2000(coskew_3m)
coskew_6m = cut2000(coskew_6m)
coskew_12m = cut2000(coskew_12m)
coskew_1y = cut2000(coskew_1y)
coskew_2y = cut2000(coskew_2y)
coskew_3y = cut2000(coskew_3y)
coskew_5y = cut2000(coskew_5y)

coskew_1m.to_csv('coskew_1m.csv')
coskew_3m.to_csv('coskew_3m.csv')
coskew_6m.to_csv('coskew_6m.csv')
coskew_12m.to_csv('coskew_12m.csv')
coskew_1y.to_csv('coskew_1y.csv')
coskew_2y.to_csv('coskew_2y.csv')
coskew_3y.to_csv('coskew_3y.csv')
coskew_5y.to_csv('coskew_5y.csv')

In [6]:
coskew_1m = pd.read_csv('coskew_1m.csv',index_col=0)
coskew_1m.columns = range(1,241)
coskew_3m = pd.read_csv('coskew_3m.csv',index_col=0)
coskew_3m.columns = range(1,241)
coskew_6m = pd.read_csv('coskew_6m.csv',index_col=0)
coskew_6m.columns = range(1,241)
coskew_12m = pd.read_csv('coskew_12m.csv',index_col=0)
coskew_12m.columns = range(1,241)
coskew_1y = pd.read_csv('coskew_1y.csv',index_col=0)
coskew_1y.columns = range(1,241)
coskew_2y = pd.read_csv('coskew_2y.csv',index_col=0)
coskew_2y.columns = range(1,241)
coskew_3y = pd.read_csv('coskew_3y.csv',index_col=0)
coskew_3y.columns = range(1,241)
coskew_5y = pd.read_csv('coskew_5y.csv',index_col=0)
coskew_5y.columns = range(1,241)

In [None]:
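#################### idioskew
def idioskew_calculator(data,factor,span,low_limit):
    '''
    Compute idiosyncratic skewness for every stock over rolling windows and return the results as a table.
    
    Parameters
    ----------
    data: DataFrame indexed by date with one column per stock code (the last column is the month_num of each date); values are rt
    factor: three-factor data (mkt_rf, smb, hml), indexed by date or month number
    span: length of each regression window in months (12 for one year)
    low_limit: minimum number of valid observations (days or months) in the window, e.g. 10 for one month or 50 for three months of daily data, 10 for one year or 20 for two years of monthly data
    
    Returns
    -------
    DataFrame with stock codes as index and month numbers as columns; values are the idiosyncratic skewness estimated over the corresponding window
    '''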
    X = pd.DataFrame()
    for i in range(max(data['month_num'])-span+1):
        same_time_data = data[(data['month_num']>i)&(data['month_num']<=i+span)]
        same_time = []
        code_list = list(data.columns[:-1])
        for code in code_list:
            temp_data = same_time_data[code].rename('rt')
            reg_data = pd.concat([temp_data,factor],axis=1,sort=False,join='inner')
            if reg_data['rt'].notna().sum() >= low_limit:
                model = smf.ols('rt~mkt_rf+smb+hml',reg_data,missing='drop').fit()
                temp = model.resid
                idioskew = temp.skew()
            else:
                idioskew = np.nan
            same_time.append(idioskew)
        same_time = pd.Series(same_time,index = code_list,name = i+span)
        X = pd.concat([X,same_time],sort=False,axis=1)
        print(i)
    return X

idioskew_1m = idioskew_calculator(daily_exrt,threefactor_daily,1,10)
idioskew_3m = idioskew_calculator(daily_exrt,threefactor_daily,3,50)
idioskew_6m = idioskew_calculator(daily_exrt,threefactor_daily,6,100)
idioskew_12m = idioskew_calculator(daily_exrt,threefactor_daily,12,200)
idioskew_1y = idioskew_calculator(monthly_exrt,threefactor_monthly,12,10)
idioskew_2y = idioskew_calculator(monthly_exrt,threefactor_monthly,24,20)
idioskew_3y = idioskew_calculator(monthly_exrt,threefactor_monthly,36,24)
idioskew_5y = idioskew_calculator(monthly_exrt,threefactor_monthly,60,24)

idioskew_1m = cut2000(idioskew_1m)
idioskew_3m = cut2000(idioskew_3m)
idioskew_6m = cut2000(idioskew_6m)
idioskew_12m = cut2000(idioskew_12m)
idioskew_1y = cut2000(idioskew_1y)
idioskew_2y = cut2000(idioskew_2y)
idioskew_3y = cut2000(idioskew_3y)
idioskew_5y = cut2000(idioskew_5y)

idioskew_1m.to_csv('idioskew_1m.csv')
idioskew_3m.to_csv('idioskew_3m.csv')
idioskew_6m.to_csv('idioskew_6m.csv')
idioskew_12m.to_csv('idioskew_12m.csv')
idioskew_1y.to_csv('idioskew_1y.csv')
idioskew_2y.to_csv('idioskew_2y.csv')
idioskew_3y.to_csv('idioskew_3y.csv')
idioskew_5y.to_csv('idioskew_5y.csv')

In [7]:
idioskew_1m = pd.read_csv('idioskew_1m.csv',index_col=0)
idioskew_1m.columns = range(1,241)
idioskew_3m = pd.read_csv('idioskew_3m.csv',index_col=0)
idioskew_3m.columns = range(1,241)
idioskew_6m = pd.read_csv('idioskew_6m.csv',index_col=0)
idioskew_6m.columns = range(1,241)
idioskew_12m = pd.read_csv('idioskew_12m.csv',index_col=0)
idioskew_12m.columns = range(1,241)
idioskew_1y = pd.read_csv('idioskew_1y.csv',index_col=0)
idioskew_1y.columns = range(1,241)
idioskew_2y = pd.read_csv('idioskew_2y.csv',index_col=0)
idioskew_2y.columns = range(1,241)
idioskew_3y = pd.read_csv('idioskew_3y.csv',index_col=0)
idioskew_3y.columns = range(1,241)
idioskew_5y = pd.read_csv('idioskew_5y.csv',index_col=0)
idioskew_5y.columns = range(1,241)

In [8]:
######################################################## Sample screening
codelist = pd.read_csv('codelist.csv',index_col=0)
codelist.columns = range(1,241)
codelist = codelist.astype(bool)

# Keep only the stock-months flagged True in codelist (all other cells become NaN)
skew_1m,skew_3m,skew_6m,skew_12m,skew_1y,skew_2y,skew_3y,skew_5y = [
    df[codelist] for df in (skew_1m,skew_3m,skew_6m,skew_12m,skew_1y,skew_2y,skew_3y,skew_5y)]
coskew_1m,coskew_3m,coskew_6m,coskew_12m,coskew_1y,coskew_2y,coskew_3y,coskew_5y = [
    df[codelist] for df in (coskew_1m,coskew_3m,coskew_6m,coskew_12m,coskew_1y,coskew_2y,coskew_3y,coskew_5y)]
idioskew_1m,idioskew_3m,idioskew_6m,idioskew_12m,idioskew_1y,idioskew_2y,idioskew_3y,idioskew_5y = [
    df[codelist] for df in (idioskew_1m,idioskew_3m,idioskew_6m,idioskew_12m,idioskew_1y,idioskew_2y,idioskew_3y,idioskew_5y)]

In [9]:
######################################################## Descriptive statistics: Table 1
def data_statistic(list_of_data):
    X = pd.DataFrame()
    name_of_data = ['1M','3M','6M','12M','1Y','2Y','3Y','5Y']
    for i in range(len(list_of_data)):
        x = list_of_data[i]
        new = pd.Series([x.mean().mean(),x.std().mean(),x.skew().mean(),x.kurt().mean(),x.min().mean(),x.quantile(.05).mean(),x.quantile(.25).mean(),x.median().mean(),x.quantile(.75).mean(),x.quantile(.95).mean(),x.max().mean(),x.count().mean()],index = ['Mean','SD','Skew','Kurt','Min','5%','25%','Median','75%','95%','Max','n'],name = name_of_data[i])
        X = pd.concat([X,new],axis=1)
        cols = ['Mean','SD','Skew','Kurt','Min','5%','25%','Median','75%','95%','Max','n']
        X = X.loc[cols,:]
    X = X.T
    X = X.applymap(lambda x:round(x, 2))
    return X

skew_list = [skew_1m,skew_3m,skew_6m,skew_12m,skew_1y,skew_2y,skew_3y,skew_5y]
coskew_list = [coskew_1m,coskew_3m,coskew_6m,coskew_12m,coskew_1y,coskew_2y,coskew_3y,coskew_5y]
idioskew_list = [idioskew_1m,idioskew_3m,idioskew_6m,idioskew_12m,idioskew_1y,idioskew_2y,idioskew_3y,idioskew_5y]

table1A = data_statistic(skew_list)
table1B = data_statistic(coskew_list)
table1C = data_statistic(idioskew_list)
table1 = pd.concat([table1A,table1B,table1C],keys=['Skew','CoSkew','IdioSkew'])
table1.loc[:,'n'] = table1.loc[:,'n'].astype('int')

![Table1.png](https://i.loli.net/2020/05/01/N4bSYBWz6LJreDc.png)

* The summary statistics show that measured return skewness increases as the measurement period lengthens: the mean and every percentile of the cross-sectional distribution, except the minimum, rise when the measurement period is extended.
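
Each entry in the table is a time-series average of a monthly cross-sectional statistic. The sketch below shows that two-step calculation on a synthetic panel; the function name `summarize`, the reduced set of statistics, and the simulated data are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

def summarize(x):
    """Time-series averages of monthly cross-sectional statistics.

    x: DataFrame with stocks as rows and months as columns
    (the same layout as the skewness tables in this notebook).
    """
    stats = {
        'Mean': x.mean(),            # one cross-sectional value per month
        'SD': x.std(),
        '5%': x.quantile(0.05),
        'Median': x.median(),
        '95%': x.quantile(0.95),
        'n': x.count(),              # stocks with a valid value in each month
    }
    # Average each monthly series over time
    return pd.Series({name: series.mean() for name, series in stats.items()})

rng = np.random.default_rng(1)
panel = pd.DataFrame(rng.normal(size=(500, 24)), columns=range(1, 25))
panel.iloc[:50, 0] = np.nan          # 50 stocks lack a value in the first month
summary = summarize(panel)
print(summary.round(2))
```

Averaging month by month gives every month equal weight, regardless of how many stocks are in the sample that month.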

In [10]:
####################################################### Correlations: Tables 2, 3, 4 & 5
##### Correlation matrix across measurement windows of the same measure
def personcorr_calculator(dataname1,dataname2):
    X = []
    if len(dataname1.columns)>=len(dataname2.columns):
        month_list = dataname2.columns
    else:
        month_list = dataname1.columns
    for y in month_list:
        x1 = dataname1[y]
        x2 = dataname2[y]
        x = pd.concat([x1,x2],axis=1)
        x = x.dropna(axis=0,how='any')
        person_corr = x.corr().iloc[0,1]
        X.append(person_corr)
    X = pd.Series(X)
    x = X.mean()
    return x

def spearman_calculator(dataname1,dataname2):
    X = []
    if len(dataname1.columns)>=len(dataname2.columns):
        month_list = dataname2.columns
    else:
        month_list = dataname1.columns
    for y in month_list:
        x1 = dataname1[y]
        x2 = dataname2[y]
        x = pd.concat([x1,x2],axis=1)
        x = x.dropna(axis=0,how='any')
        spearman_corr = x.corr(method = 'spearman').iloc[0,1]
        X.append(spearman_corr)
    X = pd.Series(X)
    x = X.mean()
    return x

def corr_in_list(list_of_data):
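    '''
    *the order of the list of data must stay fixed
    The upper triangle (including the diagonal) holds Spearman correlations; the lower triangle holds Pearson correlations
    '''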
    name_of_data = ['1M','3M','6M','12M','1Y','2Y','3Y','5Y']
    X = pd.DataFrame([],index = name_of_data,columns = name_of_data)
    for i in range(len(list_of_data)):
        for j in range(len(list_of_data)):
            if i<=j:
                X.iloc[i,j] = spearman_calculator(list_of_data[i],list_of_data[j])
            else:
                X.iloc[i,j] = personcorr_calculator(list_of_data[i],list_of_data[j])
    X = X.applymap(lambda x:round(x, 2))
    return X

table2 = corr_in_list(skew_list)
table3 = corr_in_list(coskew_list)
table4 = corr_in_list(idioskew_list)

![Table2_3_4.png](https://i.loli.net/2020/05/01/5YqVRmkJwdSlsv7.png)

* The correlations increase as the amount of overlap in the estimation periods increases.
* For a fixed amount of data overlap, the correlations decrease as the amount of nonoverlapping data increases.
* These patterns are likely to be highly mechanical.
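
A small simulation can make the mechanical effect concrete. In the sketch below, returns are independent draws with no true skewness persistence at all, yet skewness measured over a 60-day window is strongly correlated with skewness measured over a 120-day window that contains the same 60 days, and essentially uncorrelated with skewness from a disjoint window. The window lengths, sample size, and the helper `skew_rows` are arbitrary choices for illustration.

```python
import numpy as np

def skew_rows(x):
    """Row-wise population skewness (no small-sample adjustment)."""
    d = x - x.mean(axis=1, keepdims=True)
    return (d**3).mean(axis=1) / (d**2).mean(axis=1)**1.5

rng = np.random.default_rng(2)
r = rng.normal(size=(2000, 180))            # 2000 stocks, 180 i.i.d. daily returns

skew_short = skew_rows(r[:, 60:120])        # 60-day window
skew_long_overlap = skew_rows(r[:, :120])   # 120-day window containing those 60 days
skew_disjoint = skew_rows(r[:, 120:])       # 60-day window sharing no days

corr_overlap = np.corrcoef(skew_short, skew_long_overlap)[0, 1]
corr_disjoint = np.corrcoef(skew_short, skew_disjoint)[0, 1]
print(round(corr_overlap, 2), round(corr_disjoint, 2))
```

Because the data are i.i.d. by construction, any correlation between the overlapping measures comes purely from the shared observations.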

In [11]:
######### Correlations between different skewness measures
def corr_in_list2(list_of_data1,list_of_data2,corr_type='person'):
    '''
    Parameters
    corr_type: 'person' (Pearson) or 'spearman'
    *the two lists of data must be in matching order
    '''
    X = []
    if corr_type =='person':
        for i in range(len(list_of_data1)):
            corr = personcorr_calculator(list_of_data1[i],list_of_data2[i])
            X.append(corr)
    else:
        for i in range(len(list_of_data1)):
            corr = spearman_calculator(list_of_data1[i],list_of_data2[i])
            X.append(corr)      
    X = pd.Series(X,index=['1M','3M','6M','12M','1Y','2Y','3Y','5Y'])
    return X

def corr_df(list_of_data1,list_of_data2,list_of_data3):
    x1 = corr_in_list2(list_of_data1,list_of_data2)
    x2 = corr_in_list2(list_of_data1,list_of_data3)
    x3 = corr_in_list2(list_of_data2,list_of_data3)
    X1 = pd.concat([x1,x2,x3],axis=1)
    X1.columns = ['Skew-CoSkew','Skew-IdioSkew','CoSkew-IdioSkew']
    x4 = corr_in_list2(list_of_data1,list_of_data2,corr_type='spearman')
    x5 = corr_in_list2(list_of_data1,list_of_data3,corr_type='spearman')
    x6 = corr_in_list2(list_of_data2,list_of_data3,corr_type='spearman')    
    X2 = pd.concat([x4,x5,x6],axis=1)
    X2.columns = ['Skew-CoSkew','Skew-IdioSkew','CoSkew-IdioSkew']
    X = pd.concat([X1.T,X2.T],axis=0,keys=['pearson','spearman'])
    return X
        
table5 = corr_df(skew_list,coskew_list,idioskew_list)

![Table5.png](https://i.loli.net/2020/05/01/2rA4aSh9TQtP1JN.png)

* a positive cross-sectional correlation between skewness and co-skewness
* a positive cross-sectional correlation between skewness and idiosyncratic skewness
* a weak negative cross-sectional correlation between co-skewness and idiosyncratic skewness
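
The correlations reported in these tables are averages over months of cross-sectional correlations. The sketch below is a compact variant of that calculation, not the notebook's own functions: `avg_cross_sectional_corr` is a hypothetical name, and Spearman correlation is obtained as the Pearson correlation of ranks. The synthetic example uses a monotone but nonlinear link between two variables to show why the two coefficients can differ.

```python
import numpy as np
import pandas as pd

def avg_cross_sectional_corr(a, b, method='pearson'):
    """Average over months of the cross-sectional correlation between a and b.

    a, b: DataFrames with stocks as rows and months as columns.
    """
    months = a.columns.intersection(b.columns)
    out = []
    for m in months:
        # Keep only stocks with valid values of both variables in month m
        pair = pd.concat([a[m], b[m]], axis=1, keys=['a', 'b']).dropna()
        if method == 'spearman':
            pair = pair.rank()       # Spearman = Pearson correlation of ranks
        out.append(pair['a'].corr(pair['b']))
    return float(np.mean(out))

rng = np.random.default_rng(3)
a = pd.DataFrame(rng.normal(size=(300, 12)), columns=range(1, 13))
b = np.exp(a)                        # monotone but nonlinear transform of a

pearson = avg_cross_sectional_corr(a, b)
spearman = avg_cross_sectional_corr(a, b, method='spearman')
print(round(pearson, 2), round(spearman, 2))
```

The rank-based coefficient is exactly one here because the ordering of stocks is identical under both variables, while the Pearson coefficient is pulled below one by the nonlinearity.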

In [12]:
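####################################################### Correlations with other factors: Table 6
def corr_in_list3(list_of_data1,list_of_data2,skew_name):
    '''
    list_of_data1 holds the skewness measures and list_of_data2 the other factors, ordered as in the textbook table
    skew_name is the measure name: 'skew', 'coskew' or 'idioskew'
    '''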
    index_name = [skew_name+'1M',skew_name+'3M',skew_name+'6M',skew_name+'12M',skew_name+'1Y',skew_name+'2Y',skew_name+'3Y',skew_name+'5Y']
    X1 = pd.DataFrame(index = index_name,columns=['β','Size','BM','Mom','Rev','Illiq'])
    X2 = X1.copy()
    for i in range(len(list_of_data1)):
        for j in range(len(list_of_data2)):
            X1.iloc[i,j] = personcorr_calculator(list_of_data1[i],list_of_data2[j])
            X2.iloc[i,j] = spearman_calculator(list_of_data1[i],list_of_data2[j])
    X = pd.concat([X1,X2],axis=1,keys=['pearson','spearman'])
    return X

######### Read the other factor values
beta = pd.read_csv('beta.csv',index_col=0)
beta.columns = range(1,241)
size = pd.read_csv('size.csv',index_col=0)
size.columns = range(1,241)
bm = pd.read_csv('bm.csv',index_col=0)
valid = [x for x in bm.index if x in beta.index]
bm = bm.loc[valid]
bm.columns = range(1,241)
mom = pd.read_csv('mom.csv',index_col=0)
mom.columns = range(1,241)
rev = pd.read_csv('rev.csv',index_col=0)
rev.columns = range(1,241)
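illiq = pd.read_csv('illiq12.csv',index_col=0)
valid = [x for x in bm.index if x in beta.index]
illiq = illiq.reindex(valid)  # reindex tolerates codes missing from illiq (their rows are filled with NaN)
illiq.columns = range(1,241)
######### Apply the sample screen to the other factors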
beta,size,bm,mom,rev,illiq = beta[codelist],size[codelist],bm[codelist],mom[codelist],rev[codelist],illiq[codelist]

otherfactor_list = [beta,size,bm,mom,rev,illiq]
table6A = corr_in_list3(skew_list,otherfactor_list,'Skew')
table6B = corr_in_list3(coskew_list,otherfactor_list,'CoSkew')
table6C = corr_in_list3(idioskew_list,otherfactor_list,'IdioSkew')


![Table6.png](https://i.loli.net/2020/05/01/sqfPlnoCItviVam.png)

In [13]:
######################################################### Persistence: Tables 7, 8, 9
def Persistence_calculator(df):
    corr = df.corr()
    delay_list = [1,3,6,12,24,36,48,60,120]
    X = pd.DataFrame([],index = df.columns,columns = delay_list)
    for x in range(len(df.columns)):
        for y in range(9):
            if x+delay_list[y] < df.shape[1]:
                X.iloc[x,y] = corr.iloc[x,x+delay_list[y]]
    stats_df = X.mean()
    return stats_df

def data_autocorr(list_of_data,name):
    X = pd.DataFrame()
    name_list=[name+'1M',name+'3M',name+'6M',name+'12M',name+'1Y',name+'2Y',name+'3Y',name+'5Y']
    for i in list_of_data:
        x = Persistence_calculator(i)
        X = pd.concat([X,x],axis=1)
    # Blank out the lags shorter than each measurement window, where the two windows would overlap
    del_list = [1,2,3,3,4,5,7]
    for j in range(7):
        k = del_list[j]
        X.iloc[:k,j+1] = np.nan
    X.columns = name_list
    X = X.applymap(lambda x:round(x, 2))
    return X          

table7 = data_autocorr(coskew_list,'CoSkew')        
table7[table7.isna()] = ' '
table8 = data_autocorr(idioskew_list,'IdioSkew')
table8[table8.isna()] = ' '
table9 = data_autocorr(skew_list,'Skew')    
table9[table9.isna()] = ' '

![Table7_8_9.png](https://i.loli.net/2020/05/01/ZInzuftSERkg8Dr.png)

* There is very little cross-sectional persistence in these variables.
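
Persistence here means the cross-sectional correlation between a variable measured in month t and the same variable measured in month t+τ, averaged over all t. The sketch below illustrates the calculation on two synthetic variables, one driven by a permanent stock characteristic and one that is pure noise; the function name `persistence` and the simulated panels are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def persistence(x, lags=(1, 3, 6, 12)):
    """Average cross-sectional correlation between x at month t and at month t+lag."""
    cols = list(x.columns)
    out = {}
    for lag in lags:
        # One cross-sectional correlation per starting month, then average over months
        cs = [x[cols[i]].corr(x[cols[i + lag]]) for i in range(len(cols) - lag)]
        out[lag] = float(np.mean(cs))
    return pd.Series(out)

rng = np.random.default_rng(4)
n_stocks, n_months = 400, 36
fixed = rng.normal(size=(n_stocks, 1))              # permanent stock characteristic
noise = rng.normal(size=(n_stocks, n_months))       # fresh draw every month

persistent = pd.DataFrame(fixed + 0.3 * noise, columns=range(1, n_months + 1))
transient = pd.DataFrame(noise, columns=range(1, n_months + 1))

p_persistent = persistence(persistent)
p_transient = persistence(transient)
print(p_persistent.round(2).to_dict(), p_transient.round(2).to_dict())
```

A variable behaving like `transient`, with average correlations near zero at every lag, is what "very little persistence" looks like in such a table.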

In [14]:
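######################################################### Univariate portfolio analysis: Tables 10, 11, 12
# Equal-weighted single-factor sort
def reg_equal_single(factor,factor_name,rt,ffc_factor):
    '''
    Sort stocks into deciles on one factor and report equal-weighted average excess returns, the 10-1 spread, and its FFC and FFCPS alphas, with Newey-West (six lags) t-statistics.
    
    Parameters
    factor: factor values; index is stock code, columns are month numbers
    factor_name: name of the factor used for sorting
    rt: excess returns, same layout as factor
    ffc_factor: factor returns; index is month number, columns are the factor abbreviations
    
    Returns
    DataFrame whose rows hold the average excess returns and alphas, followed by the t-statistics in parentheses
    columns are portfolios 1 to 10, the 10-1 spread, and the alphas
    '''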
    rt_list = pd.DataFrame()
    for i in rt.columns:
        temp_factor = factor[i]
        temp_rt = rt[i]
        x = pd.concat([temp_factor,temp_rt],axis=1)
        x.columns = ['factor','rt']
        x = x.dropna()
        x['group'] = pd.qcut(x['factor'],10,duplicates='drop',labels=False)
        rt_avg_i = x.groupby('group')['rt'].mean()
        rt_list = pd.concat([rt_list,rt_avg_i],axis=1)
    rt_list.columns = rt.columns
    rt_list = rt_list.T
    rt_list.columns = range(1,11)
    rt_list['10-1'] = rt_list[10] - rt_list[1]
    j ='10-1'
    reg_list = pd.concat([rt_list[j],ffc_factor],axis=1,join='inner')
#    reg_list.columns = ['rt','mkt_rf','smb','hml','mom']
    reg_list.columns = ['rt','mkt_rf','smb','hml','mom','pls']#########
    model = smf.ols('rt~mkt_rf+smb+hml+mom',reg_list).fit(cov_type='HAC',cov_kwds={'maxlags':6})
    alpha = model.params['Intercept']  # label-based lookup instead of positional [0]
    alpha_t = model.tvalues['Intercept']
    model2 = sm.OLS(rt_list[j],[1]*len(rt_list[j]), missing='drop').fit(cov_type='HAC',cov_kwds={'maxlags':6})
    avg_t = model2.tvalues[0]
# =============================================================================
    model3 = smf.ols('rt~mkt_rf+smb+hml+mom+pls',reg_list).fit(cov_type='HAC',cov_kwds={'maxlags':6})
    FFCPSalpha = model3.params['Intercept']
    FFCPSalpha_t = model3.tvalues['Intercept']
# =============================================================================
    rt_avg = round(rt_list.mean(),2)
    rt_avg.name = factor_name
#    temp = pd.Series([],index = [1,2,3,4,5,6,7,8,9,10,'10-1','FFCα'],name = ' ')
    temp = pd.DataFrame(index=[1,2,3,4,5,6,7,8,9,10,'10-1','FFC α'],columns= [' '])
    X = pd.concat([rt_avg,temp],axis=1).T
    X.loc[' ','10-1'] = '('+str(round(avg_t, 2))+')'
    X.loc[factor_name,'FFC α'] = round(alpha, 2)
    X.loc[' ','FFC α'] = '('+str(round(alpha_t, 2))+')'
# =============================================================================
    X.loc[factor_name,'FFCPS α'] = round(FFCPSalpha, 2)
    X.loc[' ','FFCPS α'] = '('+str(round(FFCPSalpha_t, 2))+')'
# =============================================================================
    return X

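def reg_mktweight_single(factor,factor_name,rt,mkt,ffc_factor):
    '''
    Sort stocks into deciles on one factor and report value-weighted average excess returns, the 10-1 spread, and its FFC and FFCPS alphas, with Newey-West (six lags) t-statistics.
    
    Parameters
    factor: factor values; index is stock code, columns are month numbers
    factor_name: name of the factor used for sorting
    rt: excess returns, same layout as factor
    mkt: values used as weights (market capitalization here), same layout as factor
    ffc_factor: factor returns; index is month number, columns are the factor abbreviations
    
    Returns
    DataFrame whose rows hold the average excess returns and alphas, followed by the t-statistics in parentheses
    columns are portfolios 1 to 10, the 10-1 spread, and the alphas
    '''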
    rt_list = pd.DataFrame()
    for i in rt.columns:
        temp_factor = factor[i]
        temp_rt = rt[i]
        temp_mkt = mkt[i]
        x = pd.concat([temp_factor,temp_rt,temp_mkt],axis=1)
        x.columns = ['factor','rt','mktcap']
        x['rt*mkt'] = x['rt']*x['mktcap']
        x = x.dropna()
        x['group'] = pd.qcut(temp_factor,10,duplicates='drop',labels=False)
        rt_avg_i = x.groupby('group')['rt*mkt'].sum()/x.groupby('group')['mktcap'].sum()
        rt_list = pd.concat([rt_list,rt_avg_i],axis=1)
    rt_list.columns = rt.columns
    rt_list = rt_list.T
    rt_list.columns = range(1,11)
    rt_list['10-1'] = rt_list[10] - rt_list[1]
    j ='10-1'
    reg_list = pd.concat([rt_list[j],ffc_factor],axis=1,join='inner')
#    reg_list.columns = ['rt','mkt_rf','smb','hml','mom']
    reg_list.columns = ['rt','mkt_rf','smb','hml','mom','pls']#########
    model = smf.ols('rt~mkt_rf+smb+hml+mom',reg_list).fit(cov_type='HAC',cov_kwds={'maxlags':6})
    alpha = model.params['Intercept']  # label-based lookup instead of positional [0]
    alpha_t = model.tvalues['Intercept']
    model2 = sm.OLS(rt_list[j],[1]*len(rt_list[j]), missing='drop').fit(cov_type='HAC',cov_kwds={'maxlags':6})
    avg_t = model2.tvalues[0]
# =============================================================================
    model3 = smf.ols('rt~mkt_rf+smb+hml+mom+pls',reg_list).fit(cov_type='HAC',cov_kwds={'maxlags':6})
    FFCPSalpha = model3.params['Intercept']
    FFCPSalpha_t = model3.tvalues['Intercept']
# =============================================================================
    rt_avg = round(rt_list.mean(),2)
    rt_avg.name = factor_name
#    temp = pd.Series([],index = [1,2,3,4,5,6,7,8,9,10,'10-1','FFCα'],name = ' ')
    temp = pd.DataFrame(index=[1,2,3,4,5,6,7,8,9,10,'10-1','FFC α'],columns= [' '])
    X = pd.concat([rt_avg,temp],axis=1).T
    X.loc[' ','10-1'] = '('+str(round(avg_t, 2))+')'
    X.loc[factor_name,'FFC α'] = round(alpha, 2)
    X.loc[' ','FFC α'] = '('+str(round(alpha_t, 2))+')'
# =============================================================================
    X.loc[factor_name,'FFCPS α'] = round(FFCPSalpha, 2)
    X.loc[' ','FFCPS α'] = '('+str(round(FFCPSalpha_t, 2))+')'
# =============================================================================
    return X

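def reg_list(list_of_factor,list_of_factor_name,rt,mkt,ffc_factor,mktweight=False):
    '''
    list_of_factor: factors to sort on; each table has stock codes as index and month numbers as columns
    list_of_factor_name: names of the sorting factors, in the same order as list_of_factor
    rt: excess returns, same layout as the factor data
    mkt: values used as weights (market capitalization here), same layout as the factor data
    ffc_factor: factor returns; index is month number, columns are the factor abbreviations
    mktweight: whether to weight by market capitalization (True or False)
    '''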
    X = pd.DataFrame()
    if mktweight:
        for i in range(len(list_of_factor)):
            x = reg_mktweight_single(list_of_factor[i],list_of_factor_name[i],rt,mkt,ffc_factor)
            X = pd.concat([X,x],axis=0) 
    else:
        for i in range(len(list_of_factor)):
            x = reg_equal_single(list_of_factor[i],list_of_factor_name[i],rt,ffc_factor)
            X = pd.concat([X,x],axis=0)
    return X

In [15]:
mktcap = pd.pivot_table(monthly,index='code',columns='month_num',values='mktcap')
mktcap = cut2000(mktcap)
mktcap = mktcap[codelist]
monthly_rt = pd.pivot_table(monthly,index='code',columns='month_num',values='rt')
monthly_rt = cut2000(monthly_rt)
monthly_rt = monthly_rt[codelist]
monthly_rt = monthly_rt.shift(-1,axis=1)

skew_name = ['Skew1M','Skew3M','Skew6M','Skew12M','Skew1Y','Skew2Y','Skew3Y','Skew5Y']
table10A =  reg_list(skew_list,skew_name,monthly_rt,mktcap,FFCPSfactor_monthly)
table10A[table10A.isna()] = ' '
table10B =  reg_list(skew_list,skew_name,monthly_rt,mktcap,FFCPSfactor_monthly,mktweight=True)
table10B[table10B.isna()] = ' '

coskew_name = ['CoSkew1M','CoSkew3M','CoSkew6M','CoSkew12M','CoSkew1Y','CoSkew2Y','CoSkew3Y','CoSkew5Y'] 
table11A =  reg_list(coskew_list,coskew_name,monthly_rt,mktcap,FFCPSfactor_monthly)
table11A[table11A.isna()] = ' '
table11B =  reg_list(coskew_list,coskew_name,monthly_rt,mktcap,FFCPSfactor_monthly,mktweight=True)
table11B[table11B.isna()] = ' '

idioskew_name = ['IdioSkew1M','IdioSkew3M','IdioSkew6M','IdioSkew12M','IdioSkew1Y','IdioSkew2Y','IdioSkew3Y','IdioSkew5Y']         
table12A =  reg_list(idioskew_list,idioskew_name,monthly_rt,mktcap,FFCPSfactor_monthly)
table12A[table12A.isna()] = ' '
table12B =  reg_list(idioskew_list,idioskew_name,monthly_rt,mktcap,FFCPSfactor_monthly,mktweight=True)
table12B[table12B.isna()] = ' '

![Table10.png](https://i.loli.net/2020/05/01/NEpneKlxskrZ5dF.png)

![Table11.png](https://i.loli.net/2020/05/01/2nJXxybKVDmRhSE.png)

![Table12.png](https://i.loli.net/2020/05/01/JWZV8Iq6sGoaRwb.png)

* Total skewness: the relation with expected stock returns is negative and highly statistically significant in EW portfolios; in VW portfolios it is negative but statistically insignificant, except for Skew1M.

* Co-skewness: the relation with expected stock returns is statistically significant in both EW and VW portfolios.

* Idiosyncratic skewness: the relation with expected stock returns is negative in both EW and VW portfolios, and statistically insignificant except for IdioSkew1M in EW portfolios.

* These results are consistent with the conclusions of Replicating Anomalies in China:
![re.png](https://i.loli.net/2020/05/02/oK1daONA7PHY4f6.png)
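
For reference, the EW and VW labels above differ only in how stock returns inside a portfolio are averaged. The following is a minimal sketch with made-up numbers; `portfolio_return` is a hypothetical helper, not one of the notebook's `reg_equal_single` or `reg_mktweight_single` functions.

```python
def portfolio_return(returns, mktcaps=None):
    """Average return of one portfolio.

    Equal-weighted if mktcaps is None, otherwise weighted by market cap.
    """
    if mktcaps is None:
        return sum(returns) / len(returns)
    total_cap = sum(mktcaps)
    return sum(r * m for r, m in zip(returns, mktcaps)) / total_cap

# Made-up numbers: two small stocks and one large stock.
rets = [0.10, -0.02, 0.01]
caps = [1.0, 1.0, 8.0]

ew = portfolio_return(rets)        # every stock counts equally
vw = portfolio_return(rets, caps)  # the large stock dominates
print(f"EW: {ew:.3f}, VW: {vw:.3f}")
```

In this toy case the large stock pulls the VW average toward its own return, so an effect concentrated in small stocks can look strong under EW and weak under VW.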

In [17]:
######################################################### Fama-MacBeth regressions for Tables 13, 14, 15
final = pd.DataFrame()
for temp in [monthly_rt,rev,beta,size,bm,mom,illiq,skew_1m,skew_3m,skew_6m,skew_12m,skew_1y,skew_2y,skew_3y,skew_5y,\
          coskew_1m,coskew_3m,coskew_6m,coskew_12m,coskew_1y,coskew_2y,coskew_3y,coskew_5y,\
          idioskew_1m,idioskew_3m,idioskew_6m,idioskew_12m,idioskew_1y,idioskew_2y,idioskew_3y,idioskew_5y]:
    temp=temp.stack()
    final = pd.concat([final,temp],axis=1, join='outer')
a = final.reset_index()
a.columns = ['code','month_num','rt_rf','rev','beta','size','bm','mom','illiq','Skew1M','Skew3M','Skew6M','Skew12M','Skew1Y','Skew2Y','Skew3Y','Skew5Y',\
             'CoSkew1M','CoSkew3M','CoSkew6M','CoSkew12M','CoSkew1Y','CoSkew2Y','CoSkew3Y','CoSkew5Y','IdioSkew1M','IdioSkew3M','IdioSkew6M','IdioSkew12M','IdioSkew1Y','IdioSkew2Y','IdioSkew3Y','IdioSkew5Y']

def FM_regression1(independent):
    coefs = []
    adj_R = []
    number = []
    # select the columns needed for this regression
    df = a.copy()
    FM_df = df[(['month_num','rt_rf'] + independent)].copy()
    month = FM_df[['month_num']].drop_duplicates()
    month = month.sort_values(by = 'month_num')
    month.index = range(1,241)
    for i in month['month_num'][:239]: # the last month is all NaN
        temp = FM_df[FM_df['month_num'] == i].copy()        
        temp = temp.dropna()
        number.append(len(temp)) # sample size
        temp = temp.drop(columns = 'month_num')
        temp[independent] = winsorize(temp[independent], limits=(0.005, 0.005))
        Y = temp['rt_rf']
        X = temp[independent]
        model = sm.OLS(Y.values,sm.add_constant(X).values).fit()
        coefs.append(model.params)
        adj_R.append(model.rsquared_adj)
    col = ['Intercept']+independent    
    result = pd.DataFrame(
            coefs, 
            index = month['month_num'][:239],
            columns = col
            )
    result['adj_R'] = adj_R
    result['n'] = number
    return result

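def NWtest_1sample(a, lags=6):
    adj_a = np.array(a)
    # regress the series on a constant, so the t-statistic tests whether its mean is zero (Newey-West standard errors)
    model = sm.OLS(adj_a, [1] * len(adj_a)).fit(cov_type='HAC', cov_kwds={'maxlags': lags})
    return adj_a.mean(), float(model.tvalues[0])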

def Table131415fun(table,name,index,colname):
    temp = pd.DataFrame()    
    for i in range(len(table)):  # use the argument rather than the global data_list
        data = table[i]
        value1 = data.iloc[:, :-2].apply(NWtest_1sample)
        value1 = np.array([list(x) for x in value1.values]).reshape(-1)
        value1[0:len(value1)-1:2] = [round(x,3) for x in value1[0:len(value1)-1:2]]
        value1 = list(value1)
        value1[1:len(value1):2] = ['('+str(round(x,2))+')' for x in value1[1:len(value1):2]]
        value2 = [round(x,3) for x in data.iloc[:, -2:].mean().values]
        value = pd.DataFrame(value1 + value2)
        inx = ['Intercept','t']+index[i]+['Adj_R2','n']
        value.index = inx
        temp = pd.concat([temp,value],axis = 1,join = 'outer')
    temp.columns = name
    df = temp.T
    df = df[colname]
    df = df.T
    return df

In [18]:
# Table 13
A1  = FM_regression1(['Skew1M'])
A2  = FM_regression1(['Skew3M'])
A3  = FM_regression1(['Skew6M'])
A4  = FM_regression1(['Skew12M'])
A5  = FM_regression1(['Skew1Y'])
A6  = FM_regression1(['Skew2Y'])
A7  = FM_regression1(['Skew3Y'])
A8  = FM_regression1(['Skew5Y'])

data_list = [A1,A2,A3,A4,A5,A6,A7,A8]
colname=['Skew1M','Skew1M_t','Skew3M','Skew3M_t','Skew6M','Skew6M_t','Skew12M','Skew12M_t','Skew1Y','Skew1Y_t','Skew2Y','Skew2Y_t','Skew3Y','Skew3Y_t','Skew5Y','Skew5Y_t','Intercept','t','Adj_R2','n']
data_name = ['(1)','(2)','(3)','(4)','(5)','(6)','(7)','(8)']
index_name = [['Skew1M','Skew1M_t'],['Skew3M','Skew3M_t'],['Skew6M','Skew6M_t'],['Skew12M','Skew12M_t'],['Skew1Y','Skew1Y_t'],['Skew2Y','Skew2Y_t'],['Skew3Y','Skew3Y_t'],['Skew5Y','Skew5Y_t']]
Table13A = Table131415fun(data_list,data_name,index_name,colname)
#Table13A = Table13A.applymap(lambda x:round(x, 3))
Table13A.loc['n',:] = Table13A.loc['n',:].astype('int')
Table13A[Table13A.isna()] = ' '

A1  = FM_regression1(['Skew1M','beta','size','bm','mom','rev','illiq'])
A2  = FM_regression1(['Skew3M','beta','size','bm','mom','rev','illiq'])
A3  = FM_regression1(['Skew6M','beta','size','bm','mom','rev','illiq'])
A4  = FM_regression1(['Skew12M','beta','size','bm','mom','rev','illiq'])
A5  = FM_regression1(['Skew1Y','beta','size','bm','mom','rev','illiq'])
A6  = FM_regression1(['Skew2Y','beta','size','bm','mom','rev','illiq'])
A7  = FM_regression1(['Skew3Y','beta','size','bm','mom','rev','illiq'])
A8  = FM_regression1(['Skew5Y','beta','size','bm','mom','rev','illiq'])

data_list = [A1,A2,A3,A4,A5,A6,A7,A8]
colname=['Skew1M','Skew1M_t','Skew3M','Skew3M_t','Skew6M','Skew6M_t','Skew12M','Skew12M_t','Skew1Y','Skew1Y_t','Skew2Y','Skew2Y_t','Skew3Y','Skew3Y_t','Skew5Y','Skew5Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t','Intercept','t','Adj_R2','n']
data_name = ['(1)','(2)','(3)','(4)','(5)','(6)','(7)','(8)']
index_name = [['Skew1M','Skew1M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['Skew3M','Skew3M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['Skew6M','Skew6M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['Skew12M','Skew12M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['Skew1Y','Skew1Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['Skew2Y','Skew2Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['Skew3Y','Skew3Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['Skew5Y','Skew5Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t']]
Table13B = Table131415fun(data_list,data_name,index_name,colname)
# Table13B = Table13B.applymap(lambda x:round(x, 3))
Table13B.loc['n',:] = Table13B.loc['n',:].astype('int')
Table13B[Table13B.isna()] = ' '

![Table13.png](https://i.loli.net/2020/05/01/sxdWNfzkruGc86I.png)
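
The two-step logic behind `FM_regression1` and `NWtest_1sample` can be sketched on synthetic data as follows. This is a simplified, standard-library-only illustration and not the notebook's implementation: it uses a single regressor, skips winsorization, and computes the Newey-West standard error by hand with Bartlett weights and no small-sample correction. The function names and the simulated panel are assumptions made for the example.

```python
import math
import random

def newey_west_tstat(series, lags=6):
    """Mean of `series` and its Newey-West t-statistic (Bartlett kernel)."""
    n = len(series)
    mean = sum(series) / n
    resid = [x - mean for x in series]
    # Long-run variance: lag-0 term plus Bartlett-weighted autocovariances.
    lrv = sum(e * e for e in resid) / n
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1)
        gamma_j = sum(resid[t] * resid[t - j] for t in range(j, n)) / n
        lrv += 2.0 * weight * gamma_j
    se = math.sqrt(lrv / n)
    return mean, mean / se

def cross_section_slope(x, y):
    """OLS slope of y on x (with an intercept) for a single month."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def fama_macbeth(panel, lags=6):
    """panel: list of (x, y) pairs, one pair of cross-sections per month."""
    # Step 1: one cross-sectional regression per month.
    slopes = [cross_section_slope(x, y) for x, y in panel]
    # Step 2: time-series mean of the slopes and its Newey-West t-statistic.
    return newey_west_tstat(slopes, lags=lags)

# Synthetic panel (made-up numbers): 60 months, 200 stocks, true slope -0.5.
rng = random.Random(0)
panel = []
for _ in range(60):
    x = [rng.gauss(0, 1) for _ in range(200)]
    y = [-0.5 * xi + rng.gauss(0, 1) for xi in x]
    panel.append((x, y))

avg_slope, t_stat = fama_macbeth(panel)
print(round(avg_slope, 3), round(t_stat, 2))
```

The average slope should land close to the simulated value of -0.5 with a large negative t-statistic. In the tables, the same reading applies to each reported coefficient and the t-statistic in parentheses beneath it.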

In [19]:
# Table14
A1  = FM_regression1(['CoSkew1M'])
A2  = FM_regression1(['CoSkew3M'])
A3  = FM_regression1(['CoSkew6M'])
A4  = FM_regression1(['CoSkew12M'])
A5  = FM_regression1(['CoSkew1Y'])
A6  = FM_regression1(['CoSkew2Y'])
A7  = FM_regression1(['CoSkew3Y'])
A8  = FM_regression1(['CoSkew5Y'])

data_list = [A1,A2,A3,A4,A5,A6,A7,A8]
colname=['CoSkew1M','CoSkew1M_t','CoSkew3M','CoSkew3M_t','CoSkew6M','CoSkew6M_t','CoSkew12M','CoSkew12M_t','CoSkew1Y','CoSkew1Y_t','CoSkew2Y','CoSkew2Y_t','CoSkew3Y','CoSkew3Y_t','CoSkew5Y','CoSkew5Y_t','Intercept','t','Adj_R2','n']
data_name = ['(1)','(2)','(3)','(4)','(5)','(6)','(7)','(8)']
index_name = [['CoSkew1M','CoSkew1M_t'],['CoSkew3M','CoSkew3M_t'],['CoSkew6M','CoSkew6M_t'],['CoSkew12M','CoSkew12M_t'],['CoSkew1Y','CoSkew1Y_t'],['CoSkew2Y','CoSkew2Y_t'],['CoSkew3Y','CoSkew3Y_t'],['CoSkew5Y','CoSkew5Y_t']]   
Table14A = Table131415fun(data_list,data_name,index_name,colname)
# Table14A = Table14A.applymap(lambda x:round(x, 3))
Table14A.loc['n',:] = Table14A.loc['n',:].astype('int')
Table14A[Table14A.isna()] = ' '

A1  = FM_regression1(['CoSkew1M','beta','size','bm','mom','rev','illiq'])
A2  = FM_regression1(['CoSkew3M','beta','size','bm','mom','rev','illiq'])
A3  = FM_regression1(['CoSkew6M','beta','size','bm','mom','rev','illiq'])
A4  = FM_regression1(['CoSkew12M','beta','size','bm','mom','rev','illiq'])
A5  = FM_regression1(['CoSkew1Y','beta','size','bm','mom','rev','illiq'])
A6  = FM_regression1(['CoSkew2Y','beta','size','bm','mom','rev','illiq'])
A7  = FM_regression1(['CoSkew3Y','beta','size','bm','mom','rev','illiq'])
A8  = FM_regression1(['CoSkew5Y','beta','size','bm','mom','rev','illiq'])

data_list = [A1,A2,A3,A4,A5,A6,A7,A8]
colname=['CoSkew1M','CoSkew1M_t','CoSkew3M','CoSkew3M_t','CoSkew6M','CoSkew6M_t','CoSkew12M','CoSkew12M_t','CoSkew1Y','CoSkew1Y_t','CoSkew2Y','CoSkew2Y_t','CoSkew3Y','CoSkew3Y_t','CoSkew5Y','CoSkew5Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t','Intercept','t','Adj_R2','n']
data_name = ['(1)','(2)','(3)','(4)','(5)','(6)','(7)','(8)']
index_name = [['CoSkew1M','CoSkew1M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['CoSkew3M','CoSkew3M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['CoSkew6M','CoSkew6M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['CoSkew12M','CoSkew12M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['CoSkew1Y','CoSkew1Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['CoSkew2Y','CoSkew2Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['CoSkew3Y','CoSkew3Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['CoSkew5Y','CoSkew5Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t']]   
Table14B = Table131415fun(data_list,data_name,index_name,colname)
# Table14B = Table14B.applymap(lambda x:round(x, 3))
Table14B.loc['n',:] = Table14B.loc['n',:].astype('int')
Table14B[Table14B.isna()] = ' '

![Table14.png](https://i.loli.net/2020/05/01/dt8Olf5jNF9RcwL.png)

In [20]:
# Table15
A1  = FM_regression1(['IdioSkew1M'])
A2  = FM_regression1(['IdioSkew3M'])
A3  = FM_regression1(['IdioSkew6M'])
A4  = FM_regression1(['IdioSkew12M'])
A5  = FM_regression1(['IdioSkew1Y'])
A6  = FM_regression1(['IdioSkew2Y'])
A7  = FM_regression1(['IdioSkew3Y'])
A8  = FM_regression1(['IdioSkew5Y'])

data_list = [A1,A2,A3,A4,A5,A6,A7,A8]
colname=['IdioSkew1M','IdioSkew1M_t','IdioSkew3M','IdioSkew3M_t','IdioSkew6M','IdioSkew6M_t','IdioSkew12M','IdioSkew12M_t','IdioSkew1Y','IdioSkew1Y_t','IdioSkew2Y','IdioSkew2Y_t','IdioSkew3Y','IdioSkew3Y_t','IdioSkew5Y','IdioSkew5Y_t','Intercept','t','Adj_R2','n']
data_name = ['(1)','(2)','(3)','(4)','(5)','(6)','(7)','(8)']
index_name = [['IdioSkew1M','IdioSkew1M_t'],['IdioSkew3M','IdioSkew3M_t'],['IdioSkew6M','IdioSkew6M_t'],['IdioSkew12M','IdioSkew12M_t'],['IdioSkew1Y','IdioSkew1Y_t'],['IdioSkew2Y','IdioSkew2Y_t'],['IdioSkew3Y','IdioSkew3Y_t'],['IdioSkew5Y','IdioSkew5Y_t']]
Table15A = Table131415fun(data_list,data_name,index_name,colname)
# Table15A = Table15A.applymap(lambda x:round(x, 3))
Table15A.loc['n',:] = Table15A.loc['n',:].astype('int')
Table15A[Table15A.isna()] = ' '

A1  = FM_regression1(['IdioSkew1M','beta','size','bm','mom','rev','illiq'])
A2  = FM_regression1(['IdioSkew3M','beta','size','bm','mom','rev','illiq'])
A3  = FM_regression1(['IdioSkew6M','beta','size','bm','mom','rev','illiq'])
A4  = FM_regression1(['IdioSkew12M','beta','size','bm','mom','rev','illiq'])
A5  = FM_regression1(['IdioSkew1Y','beta','size','bm','mom','rev','illiq'])
A6  = FM_regression1(['IdioSkew2Y','beta','size','bm','mom','rev','illiq'])
A7  = FM_regression1(['IdioSkew3Y','beta','size','bm','mom','rev','illiq'])
A8  = FM_regression1(['IdioSkew5Y','beta','size','bm','mom','rev','illiq'])

data_list = [A1,A2,A3,A4,A5,A6,A7,A8]
colname=['IdioSkew1M','IdioSkew1M_t','IdioSkew3M','IdioSkew3M_t','IdioSkew6M','IdioSkew6M_t','IdioSkew12M','IdioSkew12M_t','IdioSkew1Y','IdioSkew1Y_t','IdioSkew2Y','IdioSkew2Y_t','IdioSkew3Y','IdioSkew3Y_t','IdioSkew5Y','IdioSkew5Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t','Intercept','t','Adj_R2','n']
data_name = ['(1)','(2)','(3)','(4)','(5)','(6)','(7)','(8)']
index_name = [['IdioSkew1M','IdioSkew1M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['IdioSkew3M','IdioSkew3M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['IdioSkew6M','IdioSkew6M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['IdioSkew12M','IdioSkew12M_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['IdioSkew1Y','IdioSkew1Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['IdioSkew2Y','IdioSkew2Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['IdioSkew3Y','IdioSkew3Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t'],['IdioSkew5Y','IdioSkew5Y_t','beta','beta_t','Size','Size_t','BM','BM_t','Mom','Mom_t','Rev','Rev_t','Illiq','Illiq_t']]
Table15B = Table131415fun(data_list,data_name,index_name,colname)
# Table15B = Table15B.applymap(lambda x:round(x, 3))
Table15B.loc['n',:] = Table15B.loc['n',:].astype('int')
Table15B[Table15B.isna()] = ' '

![Table15.png](https://i.loli.net/2020/05/01/rRouEzgWYKd759m.png)