# 投资者情绪、有限套利与股价异象

同美国市场投资者情绪与截面股票未来收益短期正相关、长期负相关的现象不同，中国 A 股市场两者之间短期内就呈现负相关特征。

考虑到 A 股独特的市场交易结构与投资者行为特征，本文提出非主力资金买卖不平衡指标(BSI) ，通过捕捉散户的资金流向实现对个股投资者情绪的测算。

## 1. 文章所需变量

- 中国市场的ff3因子：市场因子、规模因子和价值因子

$$
SMB=(S/G+S/M+S/G)/3-( B/V + B /M + B /G) /3
$$

$$
VMG = ( S /V + B /V) /2 － ( S /G + B /G) /2
$$

- 换手率因子

需要注意的是：**这里的换手率为成交量除以总股本**

$$
PMO = ( S /P + B /P) /2 － ( S /O + B /O) /2
$$

- 组合收益率

本文每个交易日根据样本股票的 **平均换手率或非主力资金 BSI** 进行排序，重新平衡投资组合的 1 /5，并持有 5 个交易日，每天对 5 个部分的收益率进行平均。

其中这里的换手率计算采用的是**成交量除以自由流通股本**，计算公式中使用的股本既不是 Liu et al．(2019) 使用的总股本，也不是在中国市场投资中常用的流通股本，而是在流通股本基础上做了进一步的扣除。
![title](fig\fig1.png)




## 2. 样本选择和数据来源

### 2.1 样本选择（2010-02-01至2019-12-31）

- 考虑到新上市的股票的自由流通股本较少，而一年后有大量限售股解禁，因此我们要求股票上市时间超过 1 年才纳入样本。


- 我们要求股票最近 1 年至少有 120 天在交易，且股票最近 1 个月至少有15 天在交易，这样可以避免长期停牌的股票在复牌后出现异常收益，从而影响本文的分析结果。


- 本文剔除期末总市值最小的 30% 的股票，总市值为股价乘以总股本。Liu et al． ( 2019) 指出，由于中国 A 股市场的 IPO 制度，总市值最小的公司很有可能会被借壳上市或反向收购，其壳资源的价值波动将显著影响其收益率表现。


- 考虑到中国 A 股市场涨跌停板制度，我们剔除了交易日出现涨跌停状态但成交额明显小于近期平均成交额的股票，这些股票由于涨跌停板制度而并未实现充分交易，从而影响投资者情绪等指标的测量准确性。


- 由于中国 A 股市场尚未进行或实施股权分置改革的股票和特殊处理( ST) 股票的涨跌幅限制为 5% ，其交易规则与其他股票不同，因此本文也予以剔除

### 2.2 数据来源（日）

- 收益率：采用包含股息红利的收益率，来源国泰安


- 无风险收益率：银行一年期的存款利率，来源央财因子数据库


- 盈市率：与小组计算因子数据一致，来源国泰安


- 自由流通股：在流通股本基础上做了进一步的扣除，包括: ( 1) 持股不小于 5% 的大股东持有的流通股份; ( 2) 持股小于 5% 但关联方累计持有不小于 5% 的股东持有的流通股份; ( 3) 前 10大股东或前 10 大流通股东中公布的高管持股数，一般全流通情况下扣除 75% ，因为公司法规定高管每年实际可流通的不超过其持股数的 25% ，其他情况视上市公司具体公布情况。**由于wind数据库中有恰好对应指标，但是看不到具体计算过程，所以可能有误差。来源wind数据库。**

- 非主力净流入流出：计算方法为股票日流入流出数据减去主力流入流出数据。国泰安数据库中有大笔交易数据，内有大笔交易每日买入和卖出成交额以及买入卖出总成交额，数据可以进行替代，但是**本数据库对于大笔交易的定义是A股交易量超过100000股** 与文章数据“考虑到市场上常说的主力资金通常是根据股票交易中挂单额大于 20 万元的订单的交易额来测算，因此本文的非主力资金则定义为挂单额小于 20 万元的订单的交易额。”有差异，所以计算存在偏差。数据来源国泰安大笔交易数据库。


- 其他日度市场收益相关或衍生数据（来源国泰安）：
    - 收盘价
    - 总市值
    - 日总成交量和成交金额
    - 交易状体
    - 上市挂牌日期
    - 融资融券
    - 股指期货成分
    - …

## 3. 数据清洗和统一化

将下载的数据集合并对列的变量名称进行整理，采用长列表的格式储存。

并设置列表主键 

- ticker: 股票代码

- tradeDate: 交易日期

In [None]:
import pandas as pd
import numpy as np
import os
import warnings

warnings.filterwarnings("ignore")
#批量打开文件夹导入数据
def open_csv(path,fileName_list,csvName_list):
    data_list=[]
    path=path.replace('\\','/')
    for fileName in fileName_list:
        for csvName in csvName_list :
            path_exact=path+'/'+fileName+'/'+csvName+'.csv'
            if os.path.exists(path_exact):
                data_list.append(pd.read_csv(path_exact))
    return pd.concat(data_list)

def write_csv(df,path,csvName):
    path=path.replace('\\','/')
    path=path+'/'+csvName+'.csv'
    df.to_csv(path)

#设置主键，统一数据格式
def set_ticker_tradeDate(df,ticker_name,tradeDate_name):
    df.rename(columns={ticker_name:'ticker',tradeDate_name:'tradeDate'},inplace=True)
    if df.ticker.dtype != np.dtype('O'):
        df['ticker']=df['ticker'].astype('str')
        def set_ticker(ticker):
            if len(ticker)<6:
                diff=6-len(ticker)
                ticker='0'*diff+ticker
            return ticker
        df['ticker']=df['ticker'].apply(set_ticker)
    df['tradeDate']=pd.to_datetime(df['tradeDate'])
    return df
#————————————————————————————————————————————————————————————————————————————————————————
#清洗 成交量_流入流出金额_大笔流入流出金额 数据
path=r'D:\项目文件\项目文件-个人\Project_IS\数据\收盘价_成交量_流入流出金额_大笔流入流出金额'
fileName_list=['20090101-20131231','20140101-20181231','20190101-20191231']
csvName_list=['LT_Dailyinfo']
df=open_csv(path,fileName_list,csvName_list)
df_clean=set_ticker_tradeDate(df,'Stkcd','Trddt')

#细节改动
df_clean.rename(columns={'Prccls':'closePrice','Tolstknum':'tradeVol'},inplace=True)
df_clean.sort_values(['ticker','tradeDate'],inplace=True)

#导出数据
path='D:\项目文件\项目文件-个人\Project_IS\数据清洗'
csvName='closePrice_tradeVol_LrgBuySel_TolBuySel'
write_csv(df_clean.set_index('ticker'),path,csvName)

#————————————————————————————————————————————————————————————————————————————————————————
#清洗 收盘价_交易状态 数据
path=r'D:\项目文件\项目文件-个人\Project_IS\数据\收盘价_交易状态'
fileName_list=['20100101-20141231','20150101-20191231']
csvName_list=['TRD_Dalyr','TRD_Dalyr1','TRD_Dalyr2','TRD_Dalyr3']
df=open_csv(path,fileName_list,csvName_list)
df_clean=set_ticker_tradeDate(df,'Stkcd','Trddt')

#细节改动
df_clean.rename(columns={'Clsprc':'closePrice','Trdsta':'tradeState'},inplace=True)
df_clean.drop('Ahshrtrd_D',axis=1,inplace=True)
df_clean.sort_values(['ticker','tradeDate'],inplace=True)

#导出数据
path='D:\项目文件\项目文件-个人\Project_IS\数据清洗'
csvName='closePrice_tradeState'
write_csv(df_clean.set_index('ticker'),path,csvName)
#data=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_tradeState.csv')

## 4. 中国市场的ff3因子

中国市场ff3因子计算与Liu et al．(2019)不同，采用是市值不为流通市值，而是自由流通市值，并且在样本剔除条件上进行更改并增加了两项剔除条件。

### 4.1 剔除剔除近365天有120天有交易以及股票最近30天有15天在交易

In [None]:
import pandas as pd
def open_csv(path):
    df=pd.read_csv(path)
    df['ticker']=df['ticker'].astype(str)
    df['tradeDate']=pd.to_datetime(df['tradeDate'])
    return df

def set_ticker(df):
    def set_ticker_cross(ticker):
        n=len(ticker)
        if n<6 :
            diff=6-n
            ticker='0'*diff+ticker
        return ticker
    res=df['ticker'].apply(set_ticker_cross)
    return res

def exclude_ym(dff):
    df=dff.copy()
    df.set_index('tradeDate',inplace=True)
    df_map=df[['closePrice']].resample('D').apply(lambda x:0).rename(columns={'closePrice':'map'})
    df=df_map.merge(df,how='left',left_index=True,right_index=True)
    df['num_month']=df['closePrice'].fillna(-1).rolling(30).apply(lambda x: (x>0).sum()).shift(1)
    df['num_year']=df['closePrice'].fillna(-1).rolling(365).apply(lambda x: (x>0).sum()).shift(1)
    df.dropna(inplace=True)
    res=df[(df['num_month']>=15) & (df['num_year']>=120)].drop('map',axis=1).reset_index()
    return res

if __name__ == '__main__':
    closePrice_tradeState=open_csv('closePrice_totalvalue_tradeState.csv')
    closePrice_tradeState=closePrice_tradeState[['ticker','tradeDate','closePrice','tradeState']]
    closePrice_tradeState['ticker']=set_ticker(closePrice_tradeState)

    tradeState=pd.DataFrame({})
    for i,ticker in enumerate(closePrice_tradeState['ticker'].unique().tolist()):
        print(i,ticker)
        dff=closePrice_tradeState[closePrice_tradeState['ticker']==ticker]
        res=exclude_ym(dff)
        tradeState=pd.concat([tradeState,res])
    tradeState.to_csv('tradeState.csv')

### 4.2 因子收益率计算

对于样本股票的剔除，我们分步骤进行，在收益率表上剔除（1）剔除上市不满一年数据；（2）剔除近365天有120天有交易以及股票最近30天有15天在交易；（3）#剔除股改和st股票；（4）剔除成交量异常交易数据。在自由流通股上我们剔除了（5）剔除最小的30%市值股票。最后将股票合并取交集。


- 不过值得注意的是，文章并没有说明**剔除成交量异常交易数据**的具体方法。


“考虑到中国 A 股市场涨跌停板制度，我们剔除了交易日出现涨跌停状态但成交额明显小于近期平均成交额的股票，这些股票由于涨跌停板制度而并未实现充分交易，从而影响投资者情绪等指标的测量准确性”


- 所以在数据处理上我们采用当收益率的绝对值大于9.5%时，若成交量小于近20交易日均值的一半，则判断为异常收益率予以剔除

In [None]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
import warnings
warnings.filterwarnings("ignore")
#读取数据
def open_csv(path):
    df=pd.read_csv(path)
    df['ticker']=df['ticker'].astype(str)
    df['tradeDate']=pd.to_datetime(df['tradeDate'])
    return df
ret_day=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/ret_day.csv')
#设置格式
def set_ticker(df):
    def set_ticker_cross(ticker):
        n=len(ticker)
        if n<6 :
            diff=6-n
            ticker='0'*diff+ticker
        return ticker
    res=df['ticker'].apply(set_ticker_cross)
    return res
ret_day['ticker']=set_ticker(ret_day)
ret_day['year']=ret_day['tradeDate'].dt.year
ret_day['month']=ret_day['tradeDate'].dt.month
ret_day['factorMonth']=ret_day['year']*100+ret_day['month']
alldata=ret_day.copy()
alldata=ret_day[ret_day['year']>2009]
#超额收益率
ff3=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/ff3_factor.csv')
ff3['tradeDate']=pd.to_datetime(ff3['tradeDate'])
alldata=alldata.merge(ff3[['tradeDate','rf']],how='left',on='tradeDate')
alldata['rf']=alldata['rf']*100
alldata['ret']=alldata['ret']*100
alldata['ex_ret']=alldata['ret']-alldata['rf']

#月末时间点
calendar=pd.DataFrame(alldata['tradeDate'].unique(),columns=['tradeDate'])
calendar['year']=calendar['tradeDate'].dt.year
calendar['month']=calendar['tradeDate'].dt.month

def append_monthEnd(df):
    df.sort_values('tradeDate',inplace=True)
    monthEnd=df['tradeDate'].iloc[-1]
    df['monthEnd']=df['tradeDate']==monthEnd
    return df
calendar=calendar.groupby(['year','month']).apply(append_monthEnd)
calendar.index=calendar.index.droplevel([0,1])

calendar_map=calendar[['year','month']].copy()
calendar_map['year_month']=calendar_map['year']*100+calendar_map['month']
calendar_map=pd.DataFrame(calendar_map['year_month'].unique()).reset_index().rename(columns={'index':'monthNum'})
calendar_map['year']=calendar_map[0]//100
calendar_map['month']=calendar_map[0]%100
calendar_map.drop(0,axis=1,inplace=True)
alldata=alldata.merge(calendar_map,how='left',on=['year','month'])

#剔除上市不满一年数据
listDate=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/listDate.csv')
listDate['ticker']=listDate['ticker'].astype(str)
listDate['ticker']=set_ticker(listDate)
listDate['boundDate']=(pd.to_datetime(listDate['listDate'])+pd.Timedelta(days=180)).astype(str)
alldata=alldata.merge(listDate[['ticker','boundDate']],how='left',on='ticker')
alldata['boundDate']=alldata['boundDate'].fillna('1999-12-31')
alldata=alldata[alldata['tradeDate']>pd.to_datetime(alldata['boundDate'])]

#剔除近365天有120天有交易以及股票最近30天有15天在交易
tradeState=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/tradeState.csv')
tradeState['ticker']=tradeState['ticker'].astype(str)
tradeState['ticker']=set_ticker(tradeState)
alldata=alldata.merge(tradeState,how='inner',on=['ticker','tradeDate'])

#剔除股改和st股票
exclude1=alldata[alldata['tradeState'].isin([2,3,4,5,6,8,9,11,12,14,15])]
alldata=alldata[~alldata['tradeState'].isin([2,3,4,5,6,8,9,11,12,14,15])]

#剔除成交量异常交易数据
tradeVol=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_tradeVol.csv')
tradeVol=tradeVol[['ticker','tradeDate','tradeVol']]
tradeVol['ticker']=set_ticker(tradeVol)
ret_exclude=alldata.merge(tradeVol,how='left',on=['ticker','tradeDate'])

def append_mean_tradeVol(df):
    df['tradeVol']=df['tradeVol'].fillna(method='ffill')
    df['mean_tradeVol']=(df['tradeVol'].rolling(20).mean()).shift(1)
    df['mean_tradeVol']=df['mean_tradeVol'].fillna(method='bfill')
    return df
ret_exclude=ret_exclude.groupby('ticker').apply(append_mean_tradeVol)
exclude2=ret_exclude[((ret_exclude['ret']>9.5)|(ret_exclude['ret']<-9.5)) & (0.5*ret_exclude['mean_tradeVol']>ret_exclude['tradeVol'])]
ret_exclude=ret_exclude[~(((ret_exclude['ret']>9.5)|(ret_exclude['ret']<-9.5)) & (0.5*ret_exclude['mean_tradeVol']>ret_exclude['tradeVol']))]
ret_exclude=ret_exclude.merge(calendar[['tradeDate','monthEnd']],how='left',on='tradeDate')

#
ret=ret_exclude[['ticker','tradeDate','ex_ret']]
ret_wide=ret.pivot(index='tradeDate',columns='ticker',values='ex_ret')
ret_wide['factorMonth']=ret_wide.index.year*100+ret_wide.index.month
ret_wide=ret_wide[ret_wide['factorMonth']>201000]

#自由流通股
freeStock=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/ticker_freeStock.csv')
freeStock['ticker']=set_ticker(freeStock)
freeStock['freeStock']=freeStock['freeStock'].str.replace(',','').astype(float)
freeStock_end=freeStock.merge(calendar[['tradeDate','monthEnd']],how='left',on='tradeDate')

freeStock_end.dropna(inplace=True)
freeStock_month=freeStock_end[freeStock_end['monthEnd']]
#补充201001数据
freeStock_month=pd.concat([freeStock_month,freeStock[freeStock['tradeDate']=='2009-12-31']])
freeStock_month['factorMonth']=freeStock_month['tradeDate'].dt.year*100+freeStock_month['tradeDate'].dt.month

#自由流通市值
closePrice_tradeState=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_totalvalue_tradeState.csv')
closePrice_tradeState=closePrice_tradeState[['ticker','tradeDate','closePrice','tradeState']]
closePrice_tradeState['ticker']=set_ticker(closePrice_tradeState)

closePrice=closePrice_tradeState[['tradeDate','ticker','closePrice']]
closePrice_wide=closePrice.pivot(index='tradeDate',columns='ticker',values='closePrice')
closePrice_wide.fillna(method='ffill',inplace=True)
closePrice=closePrice_wide.reset_index().melt(id_vars='tradeDate',var_name='ticker',value_name='closePrice')
freeVol_month=freeStock_month.merge(closePrice,how='left',on=['ticker','tradeDate'])
freeVol_month['freeVol']=freeVol_month['freeStock']*freeVol_month['closePrice']

#剔除最小的30%
def exuclude_30(df):
    df.dropna(subset=['freeVol'],inplace=True)
    df=df[df['freeVol']!=0]
    df['rank']=df['freeVol'].rank(method='min')
    bound=df.iloc[int(len(df.sort_values('rank'))/10*3),:]['rank']
    df=df[df['rank']>=bound]
    return df.sort_values(['ticker','tradeDate'])
freeVol_month_ex=pd.DataFrame({})
for month in freeVol_month['factorMonth'].unique().tolist():
    df=freeVol_month[freeVol_month['factorMonth']==month]
    df=exuclude_30(df)
    freeVol_month_ex=pd.concat([freeVol_month_ex,df])
freeVol_month_wide_ex=freeVol_month_ex.pivot(index='factorMonth',columns='ticker',values='freeVol')
freeVol_month_wide_ex=freeVol_month_wide_ex[freeVol_month_wide_ex.index>=200912]
freeVol_month_wide_ex=freeVol_month_wide_ex.shift(1).dropna(how='all')
#——————————————————————————————————————————————————————————————————————————————————————————————————
#计算月度因子
#
EP=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/EP_month.csv')
EP['ticker']=EP['ticker'].astype(str)
EP['ticker']=set_ticker(EP)
EP_month=EP[EP['factorMonth']>201000]
EP_month_wide=EP_month.pivot(index='factorMonth',columns='ticker',values='EP_month')
#——————————————————————————————————————————————————————————————————————————————————————————————————
#分组函数
#
def get_factor_return(ret,f_me,f_ep,col):
    month_list=ret['factorMonth'].sort_values().unique().tolist()
    fac=pd.DataFrame(columns=col,index=ret.index)
    
    for i in range(len(month_list)):
        month=month_list[i]
        total=pd.concat([f_me.loc[month],f_ep.loc[month]],join='inner',axis=1)
        total.columns=['me','ep']
        total=total.dropna()
        
        freeStock_map=f_me.loc[month]
        
        retM=ret[ret['factorMonth']==month].drop('factorMonth',axis=1)
        retM=retM.reset_index().melt(id_vars='tradeDate',var_name='ticker',value_name='ex_ret')
        
        total_ret=retM.copy()
        total_ret['freeVol']=total_ret['ticker'].map(freeStock_map)
        total_ret['freeVol']=total_ret['freeVol'].fillna(0)
        total_ret['vw_ex_ret']=total_ret['ex_ret']*total_ret['freeVol']
        
        if len(total)>=6:
            mkt_rf=total_ret.groupby('tradeDate').apply(lambda x:(x['vw_ex_ret']/x['freeVol'].sum()).sum())
            mkt_rf=pd.DataFrame(mkt_rf,columns=[col[0]])
            fac.update(mkt_rf)
            #
            final = total.sort_values(by='me')  #根据me排序    
            final['rank']=final['me'].rank(method='min')
            
            bound=final.iloc[int(len(final)/2),:]['rank']
            S0 = final[final['rank']<=bound]#小
            B0 = final[final['rank']>=bound]#大
            
            #
            final = total.sort_values(by='ep')
            final['rank']=final['ep'].rank(method='min')
            
            bound1=final.iloc[int(len(final)/10*3),:]['rank']
            bound2=final.iloc[int(len(final)/10*7),:]['rank']
            G0=final[final['rank']<=bound1]#小
            M0=final[(final['rank']>=bound1) & (final['rank']<=bound2)]
            V0=final[final['rank']>=bound2]#大
            
            def get_portfolio_ret(total_ret,df1,df2):
                df12=pd.concat([df1,df2],join='inner',axis=1)
                df12_ticker=list(df12.index)
                df12vw=total_ret[total_ret['ticker'].isin(df12_ticker)]
                df12vw=df12vw.groupby('tradeDate').apply(lambda x:(x['vw_ex_ret']/x['freeVol'].sum()).sum()) 
                return df12vw
           
            SGvw=get_portfolio_ret(total_ret,S0,G0)
            SMvw=get_portfolio_ret(total_ret,S0,M0)
            SVvw=get_portfolio_ret(total_ret,S0,V0)
            BGvw=get_portfolio_ret(total_ret,B0,G0) 
            BMvw=get_portfolio_ret(total_ret,B0,M0) 
            BVvw=get_portfolio_ret(total_ret,B0,V0)

            #
            smb=(SGvw +SMvw+SVvw)/3-(BGvw+BMvw+BVvw)/3
            smb=pd.DataFrame(smb,columns=[col[1]])
            fac.update(smb)
            
            vmg=(SVvw +BVvw)/2-(SGvw+BGvw)/2
            vmg=pd.DataFrame(vmg,columns=[col[2]])
            fac.update(vmg)
    return fac

ff3_free=get_factor_return(ret_wide,freeVol_month_wide_ex,EP_month_wide,['ch_mkt_rf','ch_smb','ch_vmg'])
ff3_free1=ff3_free[ff3_free.index>='2010-02-01']

## 5. 中国市场换手率因子

计算换手率因子除以上样本筛选条件外，还需计算换手率因子，但此换手率因子与文章构造组合收益率的换手率指标计算方法不同，如原文所述：

“换手率变动因子收益率 PMO 的构造方法同价值因子收益率是类似的，只是在进行分组构造投资组合时将盈市比指标换成换手率变动指标，构造小盘低活跃股( S /P) 、小盘高活跃股( S /O) 、大盘低活跃股( B /P) 、大盘高活跃股( S /O) 。换手率因子收益率则为 2 个低换手率组合的平均收益率减去 2 个高换手率组合的平均收益率”

“计算 Liu et al． ( 2019) 的换手率变动因子的指标为最近 20 天的平均换手率除以最近 250 天的平均换手率。**这里的换手率为成交量除以总股本**”

### 5.1 计算换手率

由于文章用到两种换手率，一是分母为总股本的换手率，用来计算因子收益率；二是分母为自由流通股本的换手率，用来计算组合收益率，所以我们在此一并计算。

In [None]:
import pandas as pd
import warnings

warnings.filterwarnings("ignore")
def open_csv(path):
    df=pd.read_csv(path)
    df['ticker']=df['ticker'].astype(str)
    df['tradeDate']=pd.to_datetime(df['tradeDate'])
    return df


def set_ticker(df):
    def set_ticker_cross(ticker):
        n=len(ticker)
        if n<6 :
            diff=6-n
            ticker='0'*diff+ticker
        return ticker
    res=df['ticker'].apply(set_ticker_cross)
    return res

#成交金额
tradeValue_Turnover=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/tradeValue_Turnover.csv')
tradeValue=tradeValue_Turnover[['ticker','tradeDate','tradeValue']]
tradeValue['ticker']=tradeValue['ticker'].astype(str)
tradeValue['ticker']=set_ticker(tradeValue)

#自由流通市值
#收盘价
closePrice_tradeState=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_totalvalue_tradeState.csv')
closePrice_tradeState=closePrice_tradeState[['ticker','tradeDate','closePrice','tradeState']]
closePrice_tradeState['ticker']=set_ticker(closePrice_tradeState)
closePrice=closePrice_tradeState[['ticker','tradeDate','closePrice']]
#自由流通股
freeStock=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/ticker_freeStock.csv')
freeStock['ticker']=set_ticker(freeStock)
freeStock['freeStock']=freeStock['freeStock'].str.replace(',','').astype(float)

#总市值
totalvalue_tradeState=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_totalvalue_tradeState.csv')
totalvalue=totalvalue_tradeState[['ticker','tradeDate','stkCap']]
totalvalue['ticker']=totalvalue['ticker'].astype(str)
totalvalue['ticker']=set_ticker(totalvalue)

totalvalue['stkCap']=totalvalue['stkCap']*1000
#——————————————————————————————————————————————————————————————————————————————————————
#计算换手率：采用成交金额除以自由流通市值
freeValue=freeStock.merge(closePrice,how='left',on=['ticker','tradeDate'])
freeValue['freeValue']=freeValue['freeStock']*freeValue['closePrice']
freeValue=freeValue[['ticker','tradeDate','freeValue']]
#
turnover=freeValue.merge(tradeValue,how='left',on=['ticker','tradeDate'])
turnover['turnover']=turnover['tradeValue']/turnover['freeValue']
turnover.dropna(subset=['turnover'],inplace=True)
#turnover.to_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/turnover.csv')
#——————————————————————————————————————————————————————————————————————————————————————
#计算换手率2：采用成交金额除以总市值
turnover2=totalvalue.merge(tradeValue,how='left',on=['ticker','tradeDate'])
turnover2['turnover']=turnover2['tradeValue']/turnover2['stkCap']
turnover2.dropna(subset=['turnover'],inplace=True)
#turnover2.to_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/turnover2.csv')

In [7]:
import pandas as pd
import warnings

warnings.filterwarnings("ignore")
def open_csv(path):
    df=pd.read_csv(path)
    df['ticker']=df['ticker'].astype(str)
    df['tradeDate']=pd.to_datetime(df['tradeDate'])
    return df


def set_ticker(df):
    def set_ticker_cross(ticker):
        n=len(ticker)
        if n<6 :
            diff=6-n
            ticker='0'*diff+ticker
        return ticker
    res=df['ticker'].apply(set_ticker_cross)
    return res

In [4]:
#turnover
turnover=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/turnover.csv')
turnover['ticker']=turnover['ticker'].astype(str)
turnover['ticker']=set_ticker(turnover)
turnover.pivot(index='tradeDate',columns='ticker',values='turnover')

ticker,000001,000002,000004,000006,000008,000009,000010,000011,000012,000014,...,688333,688357,688358,688363,688366,688368,688369,688388,688389,688399
tradeDate,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2009-01-05,0.013071,0.011431,0.006408,0.013114,0.012828,0.015695,inf,0.010875,0.028371,0.021340,...,,,,,,,,,,
2009-01-06,0.024228,0.015062,0.013072,0.019230,0.016080,0.024875,inf,0.011199,0.026298,0.025397,...,,,,,,,,,,
2009-01-07,0.024135,0.010536,0.020456,0.019529,0.013581,0.020682,inf,0.009844,0.021728,0.026506,...,,,,,,,,,,
2009-01-08,0.015306,0.010306,0.036237,0.022065,0.042058,0.017706,inf,0.011536,0.032606,0.024617,...,,,,,,,,,,
2009-01-09,0.023186,0.007819,0.018435,0.020062,0.015361,0.020335,inf,0.010126,0.015484,0.025444,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2019-12-25,0.004848,0.011326,0.012778,0.008411,0.011285,0.085163,0.004639,0.008485,0.006834,0.011958,...,0.067414,0.144083,0.069875,0.030971,0.027317,0.080932,0.069171,0.199324,0.099086,0.109450
2019-12-26,0.004309,0.014607,0.048019,0.009479,0.010525,0.087926,0.004052,0.018226,0.012010,0.011233,...,0.088878,0.120908,0.315832,0.048145,0.042588,0.038632,0.060441,0.112570,0.099083,0.127422
2019-12-27,0.012175,0.011614,0.060116,0.009215,0.014898,0.209940,0.004735,0.012200,0.011569,0.009609,...,0.066073,0.102545,0.202208,0.031474,0.036617,0.079905,0.095410,0.077806,0.088599,0.187219
2019-12-30,0.011249,0.014980,0.035271,0.010411,0.014721,0.143118,0.004633,0.024758,0.029648,0.011924,...,0.043937,0.074730,0.137218,0.030782,0.033200,0.059356,0.063783,0.076972,0.056389,0.091321


In [5]:
#turnover2
turnover=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/turnover2.csv')
turnover['ticker']=turnover['ticker'].astype(str)
turnover['ticker']=set_ticker(turnover)
turnover.pivot(index='tradeDate',columns='ticker',values='turnover')

ticker,000001,000002,000004,000005,000006,000007,000008,000009,000010,000011,...,900947,900948,900949,900950,900951,900952,900953,900955,900956,900957
tradeDate,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2009-01-05,0.010864,0.009489,0.004724,0.008462,0.010088,0.005527,0.006038,0.011272,0.003163,0.002069,...,0.003022,0.003081,0.000539,0.001100,0.000488,0.001848,0.000579,0.000948,0.000546,0.000782
2009-01-06,0.020137,0.012502,0.009637,0.013535,0.014793,0.007775,0.007569,0.017864,0.003339,0.002131,...,0.009740,0.005171,0.001318,0.001837,0.000662,0.003763,0.001122,0.001440,0.001362,0.001069
2009-01-07,0.020060,0.008746,0.015081,0.011185,0.015023,0.010716,0.006392,0.014852,0.003637,0.001873,...,0.003930,0.003259,0.001106,0.000984,0.000351,0.001477,0.002169,0.000979,0.000741,0.000817
2009-01-08,0.012722,0.008554,0.026715,0.011302,0.016974,0.013393,0.019796,0.012716,0.004525,0.002195,...,0.003739,0.003426,0.000647,0.001700,0.000410,0.002068,0.000999,0.001643,0.000497,0.000535
2009-01-09,0.019271,0.006490,0.013591,0.014914,0.015433,0.013547,0.007230,0.014604,0.004163,0.001927,...,0.002438,0.001933,0.000639,,0.000586,0.001562,0.000729,0.001095,0.000514,0.000649
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2019-12-25,0.002207,0.007069,0.008315,0.004507,0.005373,0.013116,0.006817,0.057587,0.002767,0.002346,...,0.000285,0.000227,,,0.000312,0.000825,0.000113,0.000181,0.000471,0.000222
2019-12-26,0.001962,0.009116,0.031248,0.011637,0.006055,0.019658,0.006358,0.059456,0.002417,0.005039,...,0.000324,0.000242,,,0.000479,0.000522,0.000070,0.000412,0.000413,0.000206
2019-12-27,0.005543,0.007249,0.039119,0.008721,0.005886,0.030589,0.008999,0.141962,0.002824,0.003372,...,0.000284,0.000581,,,0.000111,0.000260,0.000221,0.000362,0.002896,0.000257
2019-12-30,0.005121,0.009350,0.022952,0.006267,0.006650,0.010728,0.008892,0.096777,0.002763,0.006844,...,0.000368,0.000767,,,0.000029,0.001757,0.000288,0.000557,0.001101,0.000584


In [6]:
#Turnover
turnover=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/tradeValue_Turnover.csv')
turnover['ticker']=turnover['ticker'].astype(str)
turnover['ticker']=set_ticker(turnover)
turnover.pivot(index='tradeDate',columns='ticker',values='Turnover')

ticker,000001,000002,000004,000005,000006,000007,000008,000009,000010,000011,...,900947,900948,900949,900950,900951,900952,900953,900955,900956,900957
tradeDate,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2009-01-05,0.01224,0.00997,0.00573,0.01072,0.01122,0.00708,0.00862,0.01590,0.00691,0.01093,...,0.00305,0.00683,0.00158,0.00276,0.00134,0.00186,0.00156,0.00095,0.00112,0.00149
2009-01-06,0.02282,0.01296,0.01160,0.01716,0.01653,0.01020,0.01100,0.02529,0.00727,0.01132,...,0.00996,0.01161,0.00388,0.00463,0.00183,0.00378,0.00299,0.00148,0.00282,0.00205
2009-01-07,0.02198,0.00889,0.01835,0.01388,0.01627,0.01362,0.00918,0.02055,0.00798,0.00987,...,0.00392,0.00714,0.00327,0.00242,0.00097,0.00148,0.00583,0.00097,0.00152,0.00155
2009-01-08,0.01410,0.00892,0.03228,0.01420,0.01873,0.01757,0.02790,0.01755,0.00986,0.01150,...,0.00379,0.00770,0.00187,0.00427,0.00112,0.00206,0.00269,0.00165,0.00102,0.00101
2009-01-09,0.02171,0.00668,0.01693,0.01906,0.01707,0.01713,0.01030,0.02054,0.00906,0.01022,...,0.00244,0.00424,0.00185,,0.00160,0.00157,0.00196,0.00110,0.00105,0.00124
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2019-12-25,0.00220,0.00705,0.00846,0.00451,0.00538,0.01449,0.00731,0.05962,0.00435,0.00235,...,0.00028,0.00056,,,0.00086,0.00083,0.00030,0.00018,0.00096,0.00042
2019-12-26,0.00197,0.00915,0.03234,0.01167,0.00607,0.02196,0.00684,0.06186,0.00378,0.00505,...,0.00033,0.00059,,,0.00132,0.00053,0.00019,0.00041,0.00085,0.00040
2019-12-27,0.00552,0.00724,0.03945,0.00869,0.00585,0.03457,0.00960,0.14625,0.00444,0.00337,...,0.00028,0.00143,,,0.00031,0.00026,0.00059,0.00036,0.00588,0.00048
2019-12-30,0.00517,0.00943,0.02312,0.00632,0.00667,0.01196,0.00954,0.09691,0.00435,0.00692,...,0.00037,0.00190,,,0.00008,0.00178,0.00077,0.00055,0.00226,0.00111


### 5.2 计算换手率因子收益率

我们采用换手率因子2，即turnover2.csv中的数据。

需要说明的是，文章中说明计算平均换手率的时候需要剔除一字涨跌停的股票，但是目前没想好怎么采用表达，直接剔除涨跌停股票，则会导致剔除股票数量过多，因子收益率效果不佳。

其次，换手率因子收益率对因子的值的变化十分敏感，具体表现为当历史均值计算窗口和最小窗口变化时，换手率因子收益率的均值将会发生较大变化。主要原因是换手率有许多缺失值，这是主要来源于成交量的缺失值。但是无法判断成交量缺失是否是当日无成交，或者是成交量缺失。而当采用过去20个交易日和250个交易日的历史均值，窗口值越满因子效果越好，但是缺失值会越多，所以折中采取20和220的最小窗口。

In [None]:
#计算换手率因子
#
turnover=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/turnover2.csv')
turnover['ticker']=turnover['ticker'].astype(str)
turnover['ticker']=set_ticker(turnover)
#剔除无穷
turnover=turnover[turnover['turnover']<999]
'''
#剔除一字涨跌停股票
turnover_ex=ret_day.merge(turnover,how='left',on=['ticker','tradeDate'])
exclude3=turnover_ex[(turnover_ex['ret']>0.099) | (turnover_ex['ret']<-0.099)]
turnover_ex=turnover_ex[~(turnover_ex['ret']>0.099) | (turnover_ex['ret']<-0.099)]
'''
turnover_ex=turnover

#
turnover_wide_ex=turnover_ex.pivot(index='tradeDate',columns='ticker',values='turnover')
turnover20_wide_ex=turnover_wide_ex.rolling(window=20,min_periods=20).mean().shift(1)
turnover250_wide_ex=turnover_wide_ex.rolling(window=250,min_periods=220).mean().shift(1)
turnover20_ex=turnover20_wide_ex.reset_index().melt(id_vars='tradeDate',var_name='ticker',value_name='turnover20').dropna()
turnover250_ex=turnover250_wide_ex.reset_index().melt(id_vars='tradeDate',var_name='ticker',value_name='turnover250').dropna()

#
turnFactor_ex=turnover20_ex.merge(turnover250_ex,how='inner',on=['ticker','tradeDate'])
turnFactor_ex['turnFactor']=turnFactor_ex['turnover20']/turnFactor_ex['turnover250']
turnFactor_ex=turnFactor_ex.merge(calendar,how='left',on='tradeDate')
turnFactor_month_ex=turnFactor_ex.copy()
turnFactor_month_ex.dropna(inplace=True)
turnFactor_month_ex=turnFactor_month_ex[turnFactor_month_ex['monthEnd']]
#增加200912数据
turnFactor_month_ex=pd.concat([turnFactor_month_ex,turnFactor_ex[turnFactor_ex['tradeDate']=='2009-12-31']])
turnFactor_month_ex.sort_values(['tradeDate','ticker'],inplace=True)

turnFactor_month_wide_ex=turnFactor_month_ex.pivot(index='tradeDate',columns='ticker',values='turnFactor')
turnFactor_month_wide_ex=turnFactor_month_wide_ex.shift(1).dropna(how='all')
turnFactor_month_wide_ex['factorMonth']=turnFactor_month_wide_ex.index.year*100+turnFactor_month_wide_ex.index.month
turnFactor_month_wide_ex=turnFactor_month_wide_ex.reset_index().drop('tradeDate',axis=1).set_index('factorMonth')

#
def get_factor_return2(ret,f_me,f_turn,col):
    month_list=f_turn.index.sort_values().unique().tolist()
    fac=pd.DataFrame(columns=col,index=ret.index)
    
    for i in range(len(month_list)):
        month=month_list[i]
        total=pd.concat([f_me.loc[month],f_turn.loc[month]],join='inner',axis=1)
        total.columns=['me','turn']
        total=total.dropna()
        
        freeStock_map=f_me.loc[month]
        
        retM=ret[ret['factorMonth']==month].drop('factorMonth',axis=1)
        retM=retM.reset_index().melt(id_vars='tradeDate',var_name='ticker',value_name='ex_ret')
        
        total_ret=retM.copy()
        total_ret['freeVol']=total_ret['ticker'].map(freeStock_map)
        total_ret['freeVol']=total_ret['freeVol'].fillna(0)
        total_ret['vw_ex_ret']=total_ret['ex_ret']*total_ret['freeVol']
        
        if len(total)>=6:
            mkt_rf=total_ret.groupby('tradeDate').apply(lambda x:(x['vw_ex_ret']/x['freeVol'].sum()).sum())
            mkt_rf=pd.DataFrame(mkt_rf,columns=[col[0]])
            fac.update(mkt_rf)
            #
            final = total.sort_values(by='me')  #根据me排序    
            final['rank']=final['me'].rank(method='min')
            
            bound=final.iloc[int(len(final)/2),:]['rank']
            S0 = final[final['rank']<=bound]#小
            B0 = final[final['rank']>=bound]#大
            
            #
            final = total.sort_values(by='turn')
            final['rank']=final['turn'].rank(method='min')
            
            bound=final.iloc[int(len(final)/2),:]['rank']
            P0=final[final['rank']<=bound]#小
            O0=final[final['rank']>=bound]#大
            
            def get_portfolio_ret(total_ret,df1,df2):
                df12=pd.concat([df1,df2],join='inner',axis=1)
                df12_ticker=list(df12.index)
                df12vw=total_ret[total_ret['ticker'].isin(df12_ticker)]
                df12vw=df12vw.groupby('tradeDate').apply(lambda x:(x['vw_ex_ret']/x['freeVol'].sum()).sum()) 
                return df12vw
           
            SPvw=get_portfolio_ret(total_ret,S0,P0)
            SOvw=get_portfolio_ret(total_ret,S0,O0)
            BPvw=get_portfolio_ret(total_ret,B0,P0) 
            BOvw=get_portfolio_ret(total_ret,B0,O0) 

            #
            smb=(SPvw +SOvw)/2-(BPvw+BOvw)/2
            smb=pd.DataFrame(smb,columns=[col[1]])
            fac.update(smb)
            
            pmo=(SPvw +BPvw)/2-(SOvw+BOvw)/2
            pmo=pd.DataFrame(pmo,columns=[col[2]])
            fac.update(pmo)
    return fac

pmo=get_factor_return2(ret_wide,freeVol_month_wide_ex,turnFactor_month_wide_ex,['n1','n2','pmo'])
pmo1=pmo[pmo.index>='2010-02-01']
ff4_free=pd.concat([ff3_free1,pmo1[['pmo']]],axis=1)
#ff4_free.to_csv('D:/项目文件/项目文件-个人/Project_IS/过程数据/ff4_ch.csv')

### 5.3 ff3因子和换手率因子结果汇报

In [1]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
import warnings
warnings.filterwarnings("ignore")
ff4_free=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/过程数据/ff4_ch.csv')
ff4_free.set_index('tradeDate',inplace=True)
#
def NWtest(a, lags = 5):
    # lags为滞后期
    adj_a = pd.DataFrame(a)
    adj_a = adj_a.dropna()
    n=len(adj_a)
    if n>0:
        adj_a = adj_a.astype(float)
        adj_a = np.array(adj_a)
        model = sm.OLS(adj_a, [1] * len(adj_a)).fit(cov_type = 'HAC', cov_kwds={'maxlags': lags})
        mean=round(float(model.params[0]),4)
        std_adj=round(float(model.bse*n**0.5),4)
        tvalues=round(float(model.tvalues),4)
        return [mean,std_adj,tvalues]
    else:
        return [np.nan]*3
    
result=pd.DataFrame({})
for factor_name in ff4_free.columns:
    res=NWtest(ff4_free[factor_name])
    res=pd.DataFrame(res,index=['mean','std_adj','tvalues'],columns=[factor_name])
    result=pd.concat([result,res],axis=1)
result=result.T
result['mean']=result['mean']*250
result['std_adj']=result['std_adj']*250**0.5
result

Unnamed: 0,mean,std_adj,tvalues
ch_mkt_rf,4.95,23.724988,0.6462
ch_smb,4.8,9.858401,1.4991
ch_vmg,12.875,11.970802,3.3137
pmo,16.425,10.909858,4.6512


![title](fig/fig2.png)

## 6.组合收益率计算

文章对组合收益率的描述非常……emmm奇怪。

- 一是组合的样本区间，文章没做具体说明则视为与因子收益率相同


- 二是构造方法，原文为“由于本文分析的是投资者情绪的收益率短期异象，参考 Liu et al． ( 2019) 的做法，本文每个交易日根据样本股票的平均换手率或非主力资金 BSI 进行排序，重新平衡投资组合的 1 /5，并持有 5 个交易日，每天对 5 个部分的收益率进行平均。”所以，在尝试构造组合收益率的时候用到两种方法：（1）周末调仓，并持有一周；（2）每日调仓，最终收益率取窗口期为5的滚动回归。经过对比最终选定为方法（2）。

In [None]:
import pandas as pd
import statsmodels.formula.api as smf
 
import warnings
warnings.filterwarnings("ignore")

def open_csv(path):
    df=pd.read_csv(path)
    df['ticker']=df['ticker'].astype(str)
    df['tradeDate']=pd.to_datetime(df['tradeDate'])
    return df

def set_ticker(df):
    def set_ticker_cross(ticker):
        n=len(ticker)
        if n<6 :
            diff=6-n
            ticker='0'*diff+ticker
        return ticker
    res=df['ticker'].apply(set_ticker_cross)
    return res
#——————————————————————————————————————————————————————————————————————————————————————————————————
#
ret_day=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/ret_day.csv')
ret_day['ticker']=ret_day['ticker'].astype(str)
ret_day['ticker']=set_ticker(ret_day)
alldata=ret_day.copy()

#超额收益率
ff3=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/ff3_factor.csv')
ff3['tradeDate']=pd.to_datetime(ff3['tradeDate'])
alldata=alldata.merge(ff3[['tradeDate','rf']],how='left',on='tradeDate')
alldata['rf']=alldata['rf']*100
alldata['ret']=alldata['ret']*100
alldata['ex_ret']=alldata['ret']-alldata['rf']

#剔除上市不满一年数据
listDate=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/listDate.csv')
listDate['ticker']=listDate['ticker'].astype(str)
listDate['ticker']=set_ticker(listDate)
listDate['boundDate']=(pd.to_datetime(listDate['listDate'])+pd.Timedelta(days=180)).astype(str)
alldata=alldata.merge(listDate[['ticker','boundDate']],how='left',on='ticker')
alldata['boundDate']=alldata['boundDate'].fillna('1999-12-31')
alldata=alldata[alldata['tradeDate']>pd.to_datetime(alldata['boundDate'])]

#剔除近365天有120天有交易以及股票最近30天有15天在交易
tradeState=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/tradeState.csv')
tradeState['ticker']=tradeState['ticker'].astype(str)
tradeState['ticker']=set_ticker(tradeState)
alldata=alldata.merge(tradeState,how='inner',on=['ticker','tradeDate'])

#剔除股改和st股票
exclude1=alldata[alldata['tradeState'].isin([2,3,4,5,6,8,9,11,12,14,15])]
alldata=alldata[~alldata['tradeState'].isin([2,3,4,5,6,8,9,11,12,14,15])]

#剔除成交量异常交易数据
tradeVol=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_tradeVol.csv')
tradeVol=tradeVol[['ticker','tradeDate','tradeVol']]
tradeVol['ticker']=set_ticker(tradeVol)
ret_exclude=alldata.merge(tradeVol,how='left',on=['ticker','tradeDate'])

def append_mean_tradeVol(df):
    df['tradeVol']=df['tradeVol'].fillna(method='ffill')
    df['mean_tradeVol']=(df['tradeVol'].rolling(20).mean()).shift(1)
    df['mean_tradeVol']=df['mean_tradeVol'].fillna(method='bfill')
    return df
ret_exclude=ret_exclude.groupby('ticker').apply(append_mean_tradeVol)
exclude2=ret_exclude[((ret_exclude['ret']>9.5)|(ret_exclude['ret']<-9.5)) & (0.1*ret_exclude['mean_tradeVol']>ret_exclude['tradeVol'])]
ret_exclude=ret_exclude[~(((ret_exclude['ret']>9)|(ret_exclude['ret']<-9)) & (0.5*ret_exclude['mean_tradeVol']>ret_exclude['tradeVol']))]

ret=ret_exclude[['ticker','tradeDate','ex_ret']]
ret_wide=ret.pivot(index='tradeDate',columns='ticker',values='ex_ret')
ret_wide=ret_wide[ret_wide.index>='2010-01-01']

#自由流通股
freeStock=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/ticker_freeStock.csv')
freeStock['ticker']=set_ticker(freeStock)
freeStock['freeStock']=freeStock['freeStock'].str.replace(',','').astype(float)
freeStock=freeStock[freeStock['tradeDate']>'2010-01-01']
#自由流通市值
closePrice_tradeState=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_totalvalue_tradeState.csv')
closePrice_tradeState=closePrice_tradeState[['ticker','tradeDate','closePrice','tradeState']]
closePrice_tradeState['ticker']=set_ticker(closePrice_tradeState)

closePrice=closePrice_tradeState[['tradeDate','ticker','closePrice']]
closePrice_wide=closePrice.pivot(index='tradeDate',columns='ticker',values='closePrice')
closePrice_wide.fillna(method='ffill',inplace=True)
closePrice=closePrice_wide.reset_index().melt(id_vars='tradeDate',var_name='ticker',value_name='closePrice')
freeVol=freeStock.merge(closePrice,how='left',on=['ticker','tradeDate'])
freeVol['freeVol']=freeVol['freeStock']*freeVol['closePrice']
freeVol.set_index('tradeDate',inplace=True)
freeVol_wide=freeVol.reset_index().pivot(index='tradeDate',columns='ticker',values='freeVol')
#剔除最小的30%
def exuclude_30(df):
    df.dropna(subset=['freeVol'],inplace=True)
    df=df[df['freeVol']!=0]
    df['rank']=df['freeVol'].rank(method='min')
    bound=df.iloc[int(len(df.sort_values('rank'))/10*3),:]['rank']
    df=df[df['rank']>=bound]
    return df.sort_values(['ticker','tradeDate'])
freeVol_ex=pd.DataFrame({})
date_list=freeVol.index.unique().sort_values()
for tradeDate in date_list:
    print(tradeDate)
    df=freeVol[freeVol.index==tradeDate]
    df=exuclude_30(df)
    freeVol_ex=pd.concat([freeVol_ex,df])
freeVol_wide_ex=freeVol_ex.reset_index().pivot(index='tradeDate',columns='ticker',values='freeVol')
freeVol_wide_ex=freeVol_wide_ex.shift(1)
#——————————————————————————————————————————————————————————————————————————————————————————————————
#换手率滞后期20平均-（原文）采用自由流通股数据
#
turnover=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/turnover.csv')
turnover.drop(['Unnamed: 0','freeValue','tradeValue'],axis=1,inplace=True)
turnover['ticker']=turnover['ticker'].astype(str)
turnover['ticker']=set_ticker(turnover)
#剔除正无穷
turnover=turnover[turnover['turnover']<999]
turnover_wide=turnover.pivot(index='tradeDate',columns='ticker',values='turnover')
#剔除一字涨跌停股票
turnover_ex=ret_day.merge(turnover,how='left',on=['ticker','tradeDate'])
#exclude3=turnover_ex[(turnover_ex['ret']>0.098) | (turnover_ex['ret']<-0.098)]
#turnover_ex=turnover_ex[~(turnover_ex['ret']>0.098) | (turnover_ex['ret']<-0.098)]

turnover_wide_ex=turnover_ex.pivot(index='tradeDate',columns='ticker',values='turnover')
turnover20_wide_ex=turnover_wide_ex.rolling(20,min_periods=5).mean().shift(1)
turnover20_wide_ex.dropna(how='all',inplace=True)

#非主力资金数据
#
BSIdata=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/LrgBuySel_TolBuySel.csv')
BSIdata['ticker']=BSIdata['ticker'].astype(str)
BSIdata['ticker']=set_ticker(BSIdata)

BSIdata['B']=BSIdata['Tolbuynva']-BSIdata['LrgTrdTolbuynva']
BSIdata['S']=BSIdata['Tolsellnva']-BSIdata['LrgTrdTolsellnva']
BSIdata['BSI']=(BSIdata['B']-BSIdata['S'])/(BSIdata['B']+BSIdata['S'])

bsi=BSIdata[['ticker','tradeDate','BSI']].dropna()
bsi_wide=bsi.pivot(index='tradeDate',columns='ticker',values='BSI')
#剔除一字涨跌停股票
bsi_ex=ret_day.merge(bsi,how='left',on=['ticker','tradeDate'])
#exclude4=bsi_ex[(bsi_ex['ret']>0.098) | (bsi_ex['ret']<-0.098)]
#bsi_ex=bsi_ex[~(bsi_ex['ret']>0.098) | (bsi_ex['ret']<-0.098)]

bsi_wide_ex=bsi_ex.pivot(index='tradeDate',columns='ticker',values='BSI')
bsi20_wide_ex=bsi_wide_ex.rolling(20,min_periods=5).mean().shift(1)
bsi20_wide_ex.dropna(how='all',inplace=True)
#——————————————————————————————————————————————————————————————————————————————————————————————————
#分组函数
#
def get_protfolio_return(ret,weight,factor,groups=10):
    date_list=factor.index.sort_values()
    #其余补未来一个月数据
    if groups==10:
        col=['gL','g2','g3','g4','g5','g6','g7','g8','g9','gH']
    elif groups==5:
        col=['gL','g2','g3','g4','gH']
    fac=pd.DataFrame(columns=col,index=ret.index)
    for i in range(len(date_list)):
        date=date_list[i]
        print(date)

        total_ret=ret[ret.index==date]
        total_ret.dropna(axis=1,inplace=True)
        if (len(total_ret.columns)>11) & (len(total_ret)>0):
            resD=total_ret.reset_index().melt(id_vars='tradeDate',value_name='ex_ret',var_name='ticker')
        
            total=pd.concat([weight[weight.index==date].T,factor[factor.index==date].T],join='inner',axis=1)
            total.columns=['weight','factor']
            total.dropna(inplace=True)
            total=total[total.index.isin(total_ret.columns[:-1])]
            if len(total)>10:
                #
                total=total.sort_values('factor')
                total['rank']=total['factor'].rank(method='min')
                bound=len(total)/groups
        
                for j in range(groups):
                    ticker_list=list(total[(total['rank']>=int(j*bound)) & (total['rank']<=int((j+1)*bound))].index)
                    ret_groups=resD[resD['ticker'].isin(ticker_list)]
                    ret_groups=ret_groups.merge(total[['weight']],how='left',left_on='ticker',right_index=True)
                    ret_groups.dropna(inplace=True)
                    ret_groups['vw_ex_ret']=ret_groups['ex_ret']*ret_groups['weight']
                    g=ret_groups.groupby('tradeDate').apply(lambda x : (x['vw_ex_ret']/x['weight'].sum()).sum())
                    g=pd.DataFrame(g)
                    g.columns=[col[j]]
                    fac.update(g)
    fac.dropna(how='all',inplace=True)
    fac=fac.rolling(5).mean()
    fac.dropna(how='all',inplace=True)
    fac['gL_H']=fac['gL']-fac['gH']
    return fac

#回归函数
def regression_result(portfolio_ret,ff3_ch):
    data=portfolio_ret.merge(ff3_ch,how='inner',left_index=True,right_on='tradeDate')
    data.dropna(inplace=True)
    data.set_index('tradeDate',inplace=True)
    data=data.astype(float)
    data=data*250
    result=pd.DataFrame({})
    for i in range(len(portfolio_ret.columns)):
        group=portfolio_ret.columns[i]
        model=smf.ols('%s ~ ch_mkt_rf + ch_smb +ch_vmg'%group,data=data).fit()
        params=model.params
        bse=model.bse
        res_params=pd.DataFrame(params,columns=[group])
        res_bse=pd.DataFrame(bse,columns=[group])
        res=pd.concat([res_params,res_bse])
        result=pd.concat([result,res],axis=1)
    return result

#
ff3_ch=pd.read_csv(r'D:/项目文件/项目文件-个人/Project_IS/过程数据/ff4_ch.csv')
ff3_ch['tradeDate']=pd.to_datetime(ff3_ch['tradeDate'])
ff3_ch=ff3_ch.iloc[:,:-1]
#————————————————————————————————————————————————————————————————————————————————————————
portfolio_ret=get_protfolio_return(ret_wide,freeVol_wide_ex,turnover20_wide_ex)
result=regression_result(portfolio_ret,ff3_ch)
portfolio2_ret=get_protfolio_return(ret_wide,freeVol_wide_ex,bsi20_wide_ex)
result2=regression_result(portfolio2_ret,ff3_ch)

In [1]:
result=pd.read_csv(r'D:/项目文件/项目文件-个人/Project_IS/过程数据/result.csv')
result.round(4)

NameError: name 'pd' is not defined

![title](fig/fig3.png)
![title](fig/fig4.png)

In [2]:
result2=pd.read_csv(r'D:/项目文件/项目文件-个人/Project_IS/过程数据/result2.csv')
result2

Unnamed: 0,mean_std,gL,g2,g3,g4,g5,g6,g7,g8,g9,gH,gL_H
0,Intercept,7.735360623,2.700314903,6.255437858,6.413544594,8.631140895,4.933376363,5.600089308,0.106256866,-0.129633968,-16.69909874,24.43445936
1,ch_mkt_rf,0.243510028,0.238245631,0.232451305,0.252060853,0.263605751,0.254543871,0.265862483,0.284352121,0.261681231,0.262401425,-0.018891396
2,ch_smb,0.231872878,0.157528,0.128037576,0.122512964,0.166847218,0.117125464,0.109438276,0.108905289,0.114443469,0.119111198,0.11276168
3,ch_vmg,-0.119642018,-0.148471644,-0.145362772,-0.161494164,-0.166355251,-0.152016906,-0.159450458,-0.168218186,-0.154536876,-0.166834736,0.047192718
4,-,-,-,-,-,-,-,-,-,-,-,-
5,Intercept,4.438800698,4.201359918,4.308547127,4.425986048,4.664096632,4.64708497,4.688238912,4.912895605,4.717515339,4.710308894,3.715842477
6,ch_mkt_rf,0.011921179,0.01128349,0.01157136,0.011886763,0.012526251,0.012480563,0.012591089,0.013194444,0.012669716,0.012650362,0.009979548
7,ch_smb,0.033744669,0.031939596,0.032754455,0.033647249,0.035457415,0.035328089,0.035640949,0.037348835,0.035863515,0.03580873,0.028248593
8,ch_vmg,0.027661428,0.02618176,0.026849723,0.027581571,0.029065413,0.028959401,0.029215861,0.030615862,0.029398304,0.029353395,0.023156145


![title](fig/fig5.png)

## 附录：

In [None]:
import pandas as pd
import statsmodels.formula.api as smf
 
import warnings
warnings.filterwarnings("ignore")

def open_csv(path):
    df=pd.read_csv(path)
    df['ticker']=df['ticker'].astype(str)
    df['tradeDate']=pd.to_datetime(df['tradeDate'])
    return df

def set_ticker(df):
    def set_ticker_cross(ticker):
        n=len(ticker)
        if n<6 :
            diff=6-n
            ticker='0'*diff+ticker
        return ticker
    res=df['ticker'].apply(set_ticker_cross)
    return res
#——————————————————————————————————————————————————————————————————————————————————————————————————
#
ret_day=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/ret_day.csv')
ret_day['ticker']=ret_day['ticker'].astype(str)
ret_day['ticker']=set_ticker(ret_day)
alldata=ret_day.copy()

#超额收益率
ff3=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/ff3_factor.csv')
ff3['tradeDate']=pd.to_datetime(ff3['tradeDate'])
alldata=alldata.merge(ff3[['tradeDate','rf']],how='left',on='tradeDate')
alldata['rf']=alldata['rf']*100
alldata['ret']=alldata['ret']*100
alldata['ex_ret']=alldata['ret']-alldata['rf']

#
calendar=pd.DataFrame(alldata['tradeDate'].unique(),columns=['tradeDate'])
calendar['weekdays']=calendar['tradeDate'].dt.dayofweek
calendar_end=calendar[calendar['weekdays']==4]
calendar_end=calendar_end[calendar_end['tradeDate']>='2010-01-01']
calendar_end.sort_values('tradeDate',inplace=True)
calendar_map=pd.DataFrame(range(len(calendar_end['tradeDate'])),index=calendar_end['tradeDate'].tolist(),columns=['week_num'])
calendar=calendar.merge(calendar_map,how='left',left_on='tradeDate',right_index=True)
calendar.fillna(method='bfill',inplace=True)


#剔除上市不满一年数据
listDate=pd.read_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/listDate.csv')
listDate['ticker']=listDate['ticker'].astype(str)
listDate['ticker']=set_ticker(listDate)
listDate['boundDate']=(pd.to_datetime(listDate['listDate'])+pd.Timedelta(days=180)).astype(str)
alldata=alldata.merge(listDate[['ticker','boundDate']],how='left',on='ticker')
alldata['boundDate']=alldata['boundDate'].fillna('1999-12-31')
alldata=alldata[alldata['tradeDate']>pd.to_datetime(alldata['boundDate'])]

#剔除近365天有120天有交易以及股票最近30天有15天在交易
tradeState=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/tradeState.csv')
tradeState['ticker']=tradeState['ticker'].astype(str)
tradeState['ticker']=set_ticker(tradeState)
alldata=alldata.merge(tradeState,how='inner',on=['ticker','tradeDate'])

#剔除股改和st股票
exclude1=alldata[alldata['tradeState'].isin([2,3,4,5,6,8,9,11,12,14,15])]
alldata=alldata[~alldata['tradeState'].isin([2,3,4,5,6,8,9,11,12,14,15])]

#剔除成交量异常交易数据
tradeVol=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_tradeVol.csv')
tradeVol=tradeVol[['ticker','tradeDate','tradeVol']]
tradeVol['ticker']=set_ticker(tradeVol)
ret_exclude=alldata.merge(tradeVol,how='left',on=['ticker','tradeDate'])

def append_mean_tradeVol(df):
    df['tradeVol']=df['tradeVol'].fillna(method='ffill')
    df['mean_tradeVol']=(df['tradeVol'].rolling(20).mean()).shift(1)
    df['mean_tradeVol']=df['mean_tradeVol'].fillna(method='bfill')
    return df
ret_exclude=ret_exclude.groupby('ticker').apply(append_mean_tradeVol)
exclude2=ret_exclude[((ret_exclude['ret']>9.5)|(ret_exclude['ret']<-9.5)) & (0.1*ret_exclude['mean_tradeVol']>ret_exclude['tradeVol'])]
ret_exclude=ret_exclude[~(((ret_exclude['ret']>9)|(ret_exclude['ret']<-9)) & (0.5*ret_exclude['mean_tradeVol']>ret_exclude['tradeVol']))]
#
ret=ret_exclude[['ticker','tradeDate','ex_ret']]
ret_wide=ret.pivot(index='tradeDate',columns='ticker',values='ex_ret')
c=ret_wide[ret_wide.index>='2010-01-01']
ret_wide=ret_wide.merge(calendar[['tradeDate','week_num']],how='left',left_index=True,right_on='tradeDate').set_index('tradeDate')

#自由流通股
freeStock=open_csv('D:/项目文件/项目文件-个人/Project_IS/数据清洗/ticker_freeStock.csv')
freeStock['ticker']=set_ticker(freeStock)
freeStock['freeStock']=freeStock['freeStock'].str.replace(',','').astype(float)
freeStock_end=freeStock.copy()
freeStock_end['weekdays']=freeStock_end['tradeDate'].dt.dayofweek

freeStock_end.dropna(inplace=True)
freeStock_week=freeStock_end[freeStock_end['weekdays']==4]

#自由流通市值
closePrice_tradeState=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/closePrice_totalvalue_tradeState.csv')
closePrice_tradeState=closePrice_tradeState[['ticker','tradeDate','closePrice','tradeState']]
closePrice_tradeState['ticker']=set_ticker(closePrice_tradeState)

closePrice=closePrice_tradeState[['tradeDate','ticker','closePrice']]
closePrice_wide=closePrice.pivot(index='tradeDate',columns='ticker',values='closePrice')
closePrice_wide.fillna(method='ffill',inplace=True)
closePrice=closePrice_wide.reset_index().melt(id_vars='tradeDate',var_name='ticker',value_name='closePrice')
freeVol_week=freeStock_week.merge(closePrice,how='left',on=['ticker','tradeDate'])
freeVol_week['freeVol']=freeVol_week['freeStock']*freeVol_week['closePrice']

#剔除最小的30%
def exuclude_30(df):
    df.dropna(subset=['freeVol'],inplace=True)
    df=df[df['freeVol']!=0]
    df['rank']=df['freeVol'].rank(method='min')
    bound=df.iloc[int(len(df.sort_values('rank'))/10*3),:]['rank']
    df=df[df['rank']>=bound]
    return df.sort_values(['ticker','tradeDate'])
freeVol_week_ex=pd.DataFrame({})
weekEnd_list=freeVol_week['tradeDate'].astype(str).unique().tolist()
for tradeDate in weekEnd_list:
    df=freeVol_week[freeVol_week['tradeDate']==tradeDate]
    df=exuclude_30(df)
    freeVol_week_ex=pd.concat([freeVol_week_ex,df])
    
freeVol_week_wide_ex=freeVol_week_ex.pivot(index='tradeDate',columns='ticker',values='freeVol')
freeVol_week_wide_ex=freeVol_week_wide_ex[freeVol_week_wide_ex.index>='2009-12-25']
freeVol_week_wide_ex=freeVol_week_wide_ex.shift(1).dropna(how='all')
freeVol_week_wide_ex=freeVol_week_wide_ex.merge(calendar[['tradeDate','week_num']],how='left',left_index=True,right_on='tradeDate').set_index('tradeDate')
#——————————————————————————————————————————————————————————————————————————————————————————————————
#换手率滞后期20平均-（原文）采用自由流通股数据
#
turnover=open_csv(r'D:/项目文件/项目文件-个人/Project_IS/数据清洗/turnover.csv')
turnover.drop(['Unnamed: 0','freeValue','tradeValue'],axis=1,inplace=True)
turnover['ticker']=turnover['ticker'].astype(str)
turnover['ticker']=set_ticker(turnover)
#剔除正无穷
turnover=turnover[turnover['turnover']<999]
turnover_wide=turnover.pivot(index='tradeDate',columns='ticker',values='turnover')
#剔除一字涨跌停股票
turnover_ex=ret_day.merge(turnover,how='left',on=['ticker','tradeDate'])
exclude=turnover_ex[(turnover_ex['ret']>0.098) | (turnover_ex['ret']<-0.098)]
turnover_ex=turnover_ex[~(turnover_ex['ret']>0.098) | (turnover_ex['ret']<-0.098)]

turnover_wide_ex=turnover_ex.pivot(index='tradeDate',columns='ticker',values='turnover')
turnover20_wide_ex=turnover_wide_ex.rolling(20,min_periods=5).mean().shift(1)
turnover20_wide_ex['weekdays']=turnover20_wide_ex.index.dayofweek
turnover20_week_wide_ex=turnover20_wide_ex[turnover20_wide_ex['weekdays']==4].drop('weekdays',axis=1)
turnover20_week_wide_ex.dropna(how='all',axis=1,inplace=True)
turnover20_week_wide_ex=turnover20_week_wide_ex.merge(calendar[['tradeDate','week_num']],how='left',left_index=True,right_on='tradeDate').set_index('tradeDate')
#——————————————————————————————————————————————————————————————————————————————————————————————————
#分组函数
#
def get_protfolio_return(ret,weight,factor,groups=10):
    weekNum_list=factor['week_num'].tolist()
    #其余补未来一个月数据
    if groups==10:
        col=['gL','g2','g3','g4','g5','g6','g7','g8','g9','gH']
    elif groups==5:
        col=['gL','g2','g3','g4','gH']
    fac=pd.DataFrame(columns=col,index=ret.index)
    for i in range(len(weekNum_list)):
        weekNum=weekNum_list[i]
        print(weekNum)

        total_ret=ret[ret['week_num']==i]
        total_ret.dropna(how='all',axis=1,inplace=True)
        if (len(total_ret)>4) & (len(total_ret.columns)>11):
            resW=total_ret.drop('week_num',axis=1).reset_index().melt(id_vars='tradeDate',value_name='ex_ret',var_name='ticker')
        
            total=pd.concat([weight[weight['week_num']==i].T,factor[factor['week_num']==i].T],join='inner',axis=1)
            total.columns=['weight','factor']
            total.drop('week_num',inplace=True)
            total.dropna(inplace=True)
            total=total[total.index.isin(total_ret.columns[:-1])]
            if len(total)>10:
                #
                total=total.sort_values('factor')
                total['rank']=total['factor'].rank(method='min')
                bound=len(total)/groups
        
                for j in range(groups):
                    ticker_list=list(total[(total['rank']>=int(j*bound)) & (total['rank']<=int((j+1)*bound))].index)
                    ret_groups=resW[resW['ticker'].isin(ticker_list)]
                    ret_groups=ret_groups.merge(total[['weight']],how='left',left_on='ticker',right_index=True)
                    ret_groups.dropna(inplace=True)
                    ret_groups['vw_ex_ret']=ret_groups['ex_ret']*ret_groups['weight']
                    g=ret_groups.groupby('tradeDate').apply(lambda x : (x['vw_ex_ret']/x['weight'].sum()).sum())
                    g=pd.DataFrame(g)
                    g.columns=[col[j]]
                    fac.update(g)
    return fac

#回归函数
def regression_result(portfolio_ret,ff3_ch):
    data=portfolio_ret.merge(ff3_ch,how='inner',left_index=True,right_on='tradeDate')
    data.dropna(inplace=True)
    data.set_index('tradeDate',inplace=True)
    data=data.astype(float)
    data=data*2500
    result=pd.DataFrame({})
    for i in range(len(portfolio_ret.columns)):
        group=portfolio_ret.columns[i]
        model=smf.ols('%s ~ ch_mkt_rf + ch_smb +ch_vmg'%group,data=data).fit()
        params=model.params
        bse=model.bse
        res_params=pd.DataFrame(params,columns=[group])
        res_bse=pd.DataFrame(bse,columns=[group])
        res=pd.concat([res_params,res_bse])
        result=pd.concat([result,res],axis=1)
    return result
#
ff3_ch=pd.read_csv(r'D:/项目文件/项目文件-个人/Project_IS/过程数据/ff4_ch.csv')
ff3_ch['tradeDate']=pd.to_datetime(ff3_ch['tradeDate'])
#————————————————————————————————————————————————————————————————————————————————————————
portfolio_ret_ex=get_protfolio_return(ret_wide,freeVol_week_wide_ex,turnover20_week_wide_ex)
portfolio_ret_ex['gL_H']=portfolio_ret_ex['gL']-portfolio_ret_ex['gH']
result_ex=regression_result(portfolio_ret_ex,ff3_ch)

In [4]:
result3=pd.read_csv(r'D:/项目文件/项目文件-个人/Project_IS/过程数据/result3.csv')
result3

Unnamed: 0,mean_std,gL,g2,g3,g4,g5,g6,g7,g8,g9,gH,gL_H
0,Intercept,-7.968692292,-67.99364045,-52.61521416,-61.99222905,13.52981181,27.02813911,19.43656361,62.14059649,10.26729996,-17.05194049,9.083248195
1,ch_mkt_rf,0.876727161,1.030089158,1.10813342,1.178438041,1.217782478,1.27530634,1.311180076,1.340475618,1.388117157,1.432640363,-0.555913202
2,ch_smb,-0.155458803,0.009886774,0.104353949,0.21842501,0.256864749,0.281069492,0.318336643,0.380231598,0.480630684,0.452540078,-0.607998881
3,ch_vmg,0.24822558,0.20674608,0.069411286,-0.016209978,0.002623825,-0.154806407,-0.181161368,-0.307373034,-0.398023749,-0.454786588,0.703012169
4,-,-,-,-,-,-,-,-,-,-,-,-
5,Intercept,36.33603202,36.72073006,37.02040481,35.4957996,38.59502912,40.39670245,41.34107541,42.77514355,46.59826752,62.58454911,79.41005505
6,ch_mkt_rf,0.009804534,0.009908337,0.009989198,0.009577815,0.010414079,0.010900223,0.011155043,0.011541997,0.012573589,0.016887159,0.021427178
7,ch_smb,0.027540639,0.027832218,0.028059355,0.026903791,0.02925283,0.030618396,0.031334176,0.032421118,0.035318828,0.047435517,0.060188291
8,ch_vmg,0.022479872,0.022717872,0.022903271,0.021960049,0.023877437,0.024992072,0.025576323,0.026463533,0.02882877,0.038718941,0.049128312
