# analysis

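## Introduction
Multi-dimensional analysis of a single factor. This newly added module evaluates a factor along three dimensions: factor IC, factor returns, and the potential return space of the stocks the factor selects.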

## ic_stats
- ` jaqs_fxdayu.research.signaldigger.analysis.ic_stats(signal_data) `

**Brief description:**

- Factor IC analysis table
- This method cannot be used for event factors (factors whose values are 0/1/-1)

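**Parameters:**

|Field|Required|Type|Description|
|:----    |:---|:----- |-----   |
|signal_data |Yes|pandas.DataFrame |MultiIndex of trade_date + symbol. Columns are signal (factor value), return (relative/absolute holding-period return, required), upside_ret (maximum potential upside return over the holding period, optional), downside_ret (maximum potential downside return over the holding period, optional), group (group/industry classification, optional), quantile (bucket assigned by factor value, optional)|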
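
To make the expected layout of `signal_data` concrete, here is a minimal sketch that builds a frame with the same index and column names by hand. All values are synthetic and only illustrate the shape; in practice the frame comes from `SignalDigger.process_signal_before_analysis`, as in the example further down.

```python
import pandas as pd

# Synthetic values for illustration only; real data is produced by
# SignalDigger.process_signal_before_analysis.
index = pd.MultiIndex.from_product(
    [[20170503, 20170504], ["000001.SZ", "000002.SZ", "000008.SZ"]],
    names=["trade_date", "symbol"],
)
signal_data = pd.DataFrame(
    {
        "signal": [6.8, 10.1, 43.0, 6.9, 10.0, 42.5],                 # factor value
        "return": [-0.006, 0.011, -0.049, 0.002, 0.004, -0.010],      # required
        "upside_ret": [0.003, 0.017, 0.001, 0.010, 0.008, 0.002],     # optional
        "downside_ret": [-0.042, -0.029, -0.093, -0.015, -0.012, -0.030],  # optional
        "group": ["480000", "430000", "640000"] * 2,                  # optional
        "quantile": [1, 2, 3, 1, 2, 3],                               # optional
    },
    index=index,
)
print(signal_data.head())
```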

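**Returns:**

Factor IC analysis table
* Columns:
  * return_ic/upside_ret_ic/downside_ret_ic
  * IC of the holding-period return / IC of the maximum upside space over the holding period / IC of the maximum downside space over the holding period
  
* Rows:
  *  "IC Mean", "IC Std.", "t-stat(IC)", "p-value(IC)", "IC Skew", "IC Kurtosis", "Ann. IR"
  * IC mean, IC standard deviation, t-statistic of the IC, p-value of the test that the mean IC is zero, IC skewness, IC kurtosis, annualized information ratio of the IC (mean / std)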
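
The rows above can be illustrated with a small sketch that computes a per-date rank IC on synthetic data and then summarizes the resulting series. This is not the library's implementation: the rank (Spearman) IC, the sample estimators pandas uses for skewness and kurtosis, and the toy data are all assumptions made for illustration. The p-value row is omitted; it could be obtained with a one-sample t-test such as `scipy.stats.ttest_1samp`.

```python
import numpy as np
import pandas as pd

# Synthetic cross-sections: 20 hypothetical dates x 30 hypothetical symbols.
rng = np.random.default_rng(0)
dates = [20170503 + i for i in range(20)]          # placeholder keys, not real trading days
symbols = [f"S{i:03d}" for i in range(30)]
index = pd.MultiIndex.from_product([dates, symbols], names=["trade_date", "symbol"])
signal = rng.normal(size=len(index))
toy = pd.DataFrame(
    {"signal": signal, "return": 0.1 * signal + rng.normal(size=len(index))},
    index=index,
)

def rank_ic(frame):
    # Spearman correlation equals the Pearson correlation of the ranks.
    return frame["signal"].rank().corr(frame["return"].rank())

# One IC value per trade_date.
ic = pd.Series({date: rank_ic(frame) for date, frame in toy.groupby(level="trade_date")})

n = ic.count()
summary = pd.Series(
    {
        "IC Mean": ic.mean(),
        "IC Std.": ic.std(),
        "t-stat(IC)": ic.mean() / (ic.std() / np.sqrt(n)),
        "IC Skew": ic.skew(),
        "IC Kurtosis": ic.kurt(),
        "Ann. IR": ic.mean() / ic.std(),  # mean / std, as described above
    }
)
print(summary)
```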


**Example:**

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
import jaqs_fxdayu 
jaqs_fxdayu.patch_all()
from jaqs.data import DataView
from jaqs.research import SignalDigger

# Load the DataView dataset
dv = DataView()
dataview_folder = './data'
dv.load_dataview(dataview_folder)

# Build signal_data via jaqs.research.signaldigger.digger.SignalDigger.process_signal_before_analysis(*args, **kwargs)
sd = SignalDigger()
sd.process_signal_before_analysis(signal=dv.get_ts("pe"),
                                  price=dv.get_ts("close_adj"),
                                  high=dv.get_ts("high_adj"),
                                  low=dv.get_ts("low_adj"),
                                  group=dv.get_ts("sw1"),
                                  n_quantiles=5,
                                  period=5,
                                  benchmark_price=dv.data_benchmark,
                                  )
signal_data = sd.signal_data
signal_data.head()

Dataview loaded successfully.
Nan Data Count (should be zero) : 0;  Percentage of effective data: 99%


|trade_date|symbol|signal|return|upside_ret|downside_ret|group|quantile|
|:----|:----|:----|:----|:----|:----|:----|:----|
|20170503|000001.SZ|6.7925|-0.005637|-0.003045|-0.042326|480000|1|
|20170503|000002.SZ|10.0821|0.011225|0.016697|-0.029432|430000|1|
|20170503|000008.SZ|42.9544|-0.049408|0.000463|-0.092972|640000|4|
|20170503|000009.SZ|79.4778|-0.069822|0.009714|-0.095426|510000|5|
|20170503|000027.SZ|20.4542|-0.019517|0.009404|-0.041616|410000|2|


In [3]:
from jaqs_fxdayu.research.signaldigger.analysis import ic_stats

ic_stats(signal_data)

| |return_ic|upside_ret_ic|downside_ret_ic|
|:----|:----|:----|:----|
|IC Mean|-0.022805|0.031198|-0.2035376|
|IC Std.|0.207325|0.159313|0.1692702|
|t-stat(IC)|-1.105467|1.968055|-12.08439|
|p-value(IC)|0.27161|0.051831|2.894849e-21|
|IC Skew|0.009493|-0.065715|0.440791|
|IC Kurtosis|-0.978744|-0.639758|-0.5878823|
|Ann. IR|-0.109998|0.195829|-1.202442|


## return_stats
- ` jaqs_fxdayu.research.signaldigger.analysis.return_stats(signal_data,is_event,period) `

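**Brief description:**

- Factor return analysis table: builds several portfolios from the factor and evaluates the factor's return-generating ability through their performance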

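**Parameters:**

|Field|Required|Type|Description|
|:----    |:---|:----- |-----   |
|signal_data |Yes|pandas.DataFrame |MultiIndex of trade_date + symbol. Columns are signal (factor value), return (relative/absolute holding-period return, required), upside_ret (maximum potential upside return over the holding period, optional), downside_ret (maximum potential downside return over the holding period, optional), group (group/industry classification, optional), quantile (bucket assigned by factor value, optional)|
|is_event |Yes|bool |Whether the factor is an event factor (values are 0/1/-1)|
|period |Yes|int |Rebalancing period in days. **Note:** it must match the period over which the returns in signal_data were computed|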

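**Returns:**

Return analysis table
* Columns:
  * long_ret/short_ret/long_short_ret/top_quantile_ret/bottom_quantile_ret/tmb_ret/all_sample_ret
  * Long portfolio return / short portfolio return / long-short portfolio return / return of the group with the highest factor values / return of the group with the lowest factor values / return of going long the highest group and short the lowest group / return of the full sample (regardless of signal size or direction), i.e. the benchmark portfolio
  
* Rows:
  * 't-stat', "p-value", "skewness", "kurtosis", "Ann. Ret", "Ann. Vol", "Ann. IR", "occurance"
  * t-statistic of the holding-period return, p-value of the test that the mean holding-period return is zero, skewness, kurtosis, annualized holding-period return, annualized volatility, annualized information ratio (annualized return / annualized volatility), number of samples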
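
As a rough illustration of the `tmb_ret` column and of why `period` must match the return horizon, the sketch below builds a top-minus-bottom return series from synthetic quantile returns and annualizes it. The simple scaling rule and the constant `TRADE_DAYS_PER_YEAR = 242` are assumptions for this sketch, not values confirmed by the source, and the library's own calculation may differ.

```python
import numpy as np
import pandas as pd

TRADE_DAYS_PER_YEAR = 242  # assumed for illustration
period = 5                 # must equal the horizon used to compute "return"

# Synthetic data: 40 hypothetical dates, 50 stocks per date, 5 quantile buckets.
rng = np.random.default_rng(1)
dates = [20170503 + i for i in range(40)]  # placeholder keys, not real trading days
n_stocks = 50
toy = pd.DataFrame(
    {
        "trade_date": np.repeat(dates, n_stocks),
        "quantile": np.tile(np.repeat(np.arange(1, 6), 10), len(dates)),
        "return": rng.normal(0.0, 0.03, size=len(dates) * n_stocks),
    }
)

# Equal-weighted mean return of each quantile bucket on each date.
by_quantile = toy.groupby(["trade_date", "quantile"])["return"].mean().unstack("quantile")

# Long the highest-factor bucket, short the lowest-factor bucket.
tmb = by_quantile[5] - by_quantile[1]

# Each observation spans `period` days, so the number of periods per year
# depends on `period`; a mismatched value would mis-scale every annualized row.
periods_per_year = TRADE_DAYS_PER_YEAR / period
ann_ret = tmb.mean() * periods_per_year
ann_vol = tmb.std() * np.sqrt(periods_per_year)
ann_ir = ann_ret / ann_vol
t_stat = tmb.mean() / (tmb.std() / np.sqrt(len(tmb)))
print(ann_ret, ann_vol, ann_ir, t_stat)
```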


**Example:**

In [4]:
from jaqs_fxdayu.research.signaldigger.analysis import return_stats

return_stats(signal_data,is_event=False,period=5)

| |long_ret|long_short_ret|top_quantile_ret|bottom_quantile_ret|tmb_ret|all_sample_ret|
|:----|:----|:----|:----|:----|:----|:----|
|t-stat|-1.203846|0.411628|-4.728619|-2.714885|-0.755901|-12.043624|
|p-value|0.23136|0.68145|0.0|0.00665|0.4514|0.0|
|skewness|-0.083057|0.37368|0.495042|1.348467|-0.261998|0.546392|
|kurtosis|-0.555038|0.042535|6.187667|9.207208|-0.272022|6.24135|
|Ann. Ret|-0.101735|0.021452|-0.12994|-0.051046|-0.078894|-0.120509|
|Ann. Vol|0.124471|0.076759|0.330355|0.22604|0.153727|0.268994|
|Ann. IR|-0.817333|0.279469|-0.393336|-0.225829|-0.513207|-0.447998|
|occurance|106.0|106.0|6996.0|6996.0|106.0|34980.0|


## space_stats
- ` jaqs_fxdayu.research.signaldigger.analysis.space_stats(signal_data,is_event) `

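**Brief description:**

- Factor potential return space analysis table: builds several portfolios from the factor and measures the maximum potential upside and maximum potential downside each portfolio could reach within the rebalancing period. This shows how much room there is to improve the factor's stock-selection returns and can help when designing a timing scheme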

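**Parameters:**

|Field|Required|Type|Description|
|:----    |:---|:----- |-----   |
|signal_data |Yes|pandas.DataFrame |MultiIndex of trade_date + symbol. Columns are signal (factor value), return (relative/absolute holding-period return, required), upside_ret (maximum potential upside return over the holding period, optional), downside_ret (maximum potential downside return over the holding period, optional), group (group/industry classification, optional), quantile (bucket assigned by factor value, optional)|
|is_event |Yes|bool |Whether the factor is an event factor (values are 0/1/-1)|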

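**Returns:**

Factor potential return space analysis table
* Columns:
  * long_space/short_space/long_short_space/top_quantile_space/bottom_quantile_space/tmb_space/all_sample_space
  * Long portfolio space / short portfolio space / long-short portfolio space / space of the group with the highest factor values / space of the group with the lowest factor values / space of going long the highest group and short the lowest group / space of the full sample (regardless of signal size or direction), i.e. the benchmark portfolio
  
* Rows:
  * 'Up_sp Mean','Up_sp Std','Up_sp IR','Up_sp Pct5', 'Up_sp Pct25 ','Up_sp Pct50 ', 'Up_sp Pct75','Up_sp Pct95','Up_sp Occur','Down_sp Mean','Down_sp Std', 'Down_sp IR', 'Down_sp Pct5','Down_sp Pct25 ','Down_sp Pct50 ','Down_sp Pct75', 'Down_sp Pct95','Down_sp Occur'
  * Mean upside space of the stocks held in the portfolio, standard deviation of upside space, upside-space information ratio (mean / standard deviation), 5th percentile of upside space, 25th percentile, median, 75th percentile, 95th percentile, number of upside-space samples. The Down_sp rows are the same statistics for downside space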
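
The row statistics above are ordinary distribution summaries. A minimal sketch, assuming a hypothetical helper `space_summary` and synthetic upside/downside values (neither is part of the library), shows how such rows could be produced for one portfolio:

```python
import numpy as np
import pandas as pd

def space_summary(values, prefix):
    """Summarize one portfolio's upside or downside space (hypothetical helper)."""
    values = pd.Series(values).dropna()
    rows = {
        f"{prefix} Mean": values.mean(),
        f"{prefix} Std": values.std(),
        f"{prefix} IR": values.mean() / values.std(),  # mean / standard deviation
    }
    for q in (5, 25, 50, 75, 95):
        rows[f"{prefix} Pct{q}"] = np.percentile(values, q)
    rows[f"{prefix} Occur"] = float(values.count())    # number of samples
    return pd.Series(rows)

# Synthetic upside and downside values, for illustration only.
rng = np.random.default_rng(2)
upside = np.abs(rng.normal(0.02, 0.03, size=500))
downside = -np.abs(rng.normal(0.03, 0.04, size=500))

stats = pd.concat([space_summary(upside, "Up_sp"), space_summary(downside, "Down_sp")])
print(stats)
```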


**Example:**

In [7]:
from jaqs_fxdayu.research.signaldigger.analysis import space_stats

space_stats(signal_data,is_event=False)

| |long_space|top_quantile_space|bottom_quantile_space|tmb_space|all_sample_space|
|:----|:----|:----|:----|:----|:----|
|Up_sp Mean|-0.091582|-0.089756|-0.016239|-0.013714|-0.026786|
|Up_sp Std|0.033321|0.343245|0.212997|0.017699|0.240319|
|Up_sp IR|-2.748454|-0.261492|-0.076242|-0.774819|-0.11146|
|Up_sp Pct5|-0.127152|-1.0008|-0.005893|-0.040333|-1.0008|
|Up_sp Pct25|-0.117286|0.002457|0.004533|-0.028591|0.005062|
|Up_sp Pct50|-0.101419|0.020756|0.017939|-0.013746|0.019105|
|Up_sp Pct75|-0.076478|0.04798|0.039831|-5.1e-05|0.041935|
|Up_sp Pct95|-0.031515|0.111557|0.090402|0.013496|0.098799|
|Up_sp Occur|106.0|6996.0|6996.0|106.0|34980.0|
|Down_sp Mean|-0.167327|-0.171114|-0.076042|-0.154875|-0.092512|


## analysis
- ` jaqs_fxdayu.research.signaldigger.analysis.analysis(signal_data,is_event,period) `

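**Brief description:**

- Returns the factor IC analysis table, the return analysis table, and the potential return space analysis table in one call. The APIs above compute each table separately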

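**Parameters:**

|Field|Required|Type|Description|
|:----    |:---|:----- |-----   |
|signal_data |Yes|pandas.DataFrame |MultiIndex of trade_date + symbol. Columns are signal (factor value), return (relative/absolute holding-period return, required), upside_ret (maximum potential upside return over the holding period, optional), downside_ret (maximum potential downside return over the holding period, optional), group (group/industry classification, optional), quantile (bucket assigned by factor value, optional)|
|is_event |Yes|bool |Whether the factor is an event factor (values are 0/1/-1)|
|period |Yes|int |Rebalancing period in days. **Note:** it must match the period over which the returns in signal_data were computed|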

**Returns:**

A dict made up of the factor IC analysis table, the return analysis table, and the potential return space analysis table
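
Since the result is a plain dict of tables, individual cells can be read with ordinary pandas indexing. The sketch below uses a hand-built stand-in for the returned dict; the keys and row/column labels follow the outputs shown in this document, the numbers are copied from those example outputs, and the stand-in tables are truncated to a few cells.

```python
import pandas as pd

# Stand-in for the dict returned by analysis(); truncated, values copied
# from the example outputs in this document.
result = {
    "ic": pd.DataFrame({"return_ic": [-0.022805, -0.109998]},
                       index=["IC Mean", "Ann. IR"]),
    "ret": pd.DataFrame({"long_ret": [-0.101735, -0.817333]},
                        index=["Ann. Ret", "Ann. IR"]),
    "space": pd.DataFrame({"long_space": [-0.091582]},
                          index=["Up_sp Mean"]),
}

# Pull a one-line headline out of each table with label-based indexing.
headline = {
    "IC mean (return)": result["ic"].loc["IC Mean", "return_ic"],
    "Ann. IR (long)": result["ret"].loc["Ann. IR", "long_ret"],
    "Mean upside space (long)": result["space"].loc["Up_sp Mean", "long_space"],
}
for name, value in headline.items():
    print(f"{name}: {value:.4f}")
```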

**Example:**

In [8]:
from jaqs_fxdayu.research.signaldigger.analysis import analysis

result = analysis(signal_data,is_event=False,period=5)
print(result.keys())
result["ic"]

dict_keys(['ic', 'ret', 'space'])


| |return_ic|upside_ret_ic|downside_ret_ic|
|:----|:----|:----|:----|
|IC Mean|-0.022805|0.031198|-0.2035376|
|IC Std.|0.207325|0.159313|0.1692702|
|t-stat(IC)|-1.105467|1.968055|-12.08439|
|p-value(IC)|0.27161|0.051831|2.894849e-21|
|IC Skew|0.009493|-0.065715|0.440791|
|IC Kurtosis|-0.978744|-0.639758|-0.5878823|
|Ann. IR|-0.109998|0.195829|-1.202442|
