# Step By Step 03 - Factor Design
----------

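Anyone designing factors will need to transform them at some point. **alpha-mind** relies on **finance-python** to compose factor transformations, which greatly improves the efficiency of factor design.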

## 1. Fetching Base Factors

In [1]:
import os
from alphamind.api import *

engine = SqlEngine(os.environ['DB_URI'])
engine

<alphamind.data.engines.sqlengine.mysql.SqlEngine at 0x23093d6cb08>

In [2]:
engine.fetch_factor("2020-07-30", factors=["EMA5D"], codes=[2010000083])

|   | EMA5D | code | chgPct | secShortName |
|---|---|---|---|---|
| 0 | 5.781778 | 2010000083 | 0.5199 | 广州发展 |


In [3]:
engine.fetch_factor("2020-07-30", factors=["EMV6D"], codes=[2010000083])

|   | EMV6D | code | chgPct | secShortName |
|---|---|---|---|---|
| 0 | -0.004164 | 2010000083 | 0.5199 | 广州发展 |


## 2. Arithmetic Operations

In [4]:
from PyFin.api import *

In [5]:
# Addition, subtraction, multiplication and division are written the intuitive way

added_factor = LAST("EMA5D") + LAST("EMV6D")
engine.fetch_factor("2020-07-30", factors={"added": added_factor}, codes=[2010000083])

|   | added | code | chgPct | secShortName |
|---|---|---|---|---|
| 0 | 5.777614 | 2010000083 | 0.5199 | 广州发展 |


In [6]:
# Expressions can be arbitrarily long

complex_factor = LAST("EMA5D") * LAST("EMV6D") / 2 + LAST("EMV6D")
engine.fetch_factor("2020-07-30", factors={"complex": complex_factor}, codes=[2010000083])

|   | complex | code | chgPct | secShortName |
|---|---|---|---|---|
| 0 | -0.016202 | 2010000083 | 0.5199 | 广州发展 |
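Both results can be checked by hand from the EMA5D and EMV6D values fetched in section 1. The sketch below is plain Python and is not finance-python's implementation: `Expr`, `Last` and `Binary` are hypothetical names, used only to illustrate how operator overloading lets an expression be built first and evaluated later against a row of factor values.

```python
import operator


class Expr:
    """Hypothetical lazy expression node (illustrative only, not the PyFin API)."""

    def evaluate(self, row):
        raise NotImplementedError

    # Each operator returns a new node instead of computing a number.
    def __add__(self, other):
        return Binary(operator.add, self, other)

    def __sub__(self, other):
        return Binary(operator.sub, self, other)

    def __mul__(self, other):
        return Binary(operator.mul, self, other)

    def __truediv__(self, other):
        return Binary(operator.truediv, self, other)


class Last(Expr):
    """Leaf node: looks up the latest value of a named factor."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, row):
        return row[self.name]


class Binary(Expr):
    """Inner node: applies an arithmetic operator to two operands."""

    def __init__(self, op, left, right):
        self.op = op
        self.left = left
        self.right = right  # either another Expr or a plain number such as 2

    def evaluate(self, row):
        right = self.right.evaluate(row) if isinstance(self.right, Expr) else self.right
        return self.op(self.left.evaluate(row), right)


# Factor values for code 2010000083 on 2020-07-30, copied from the outputs above.
row = {"EMA5D": 5.781778, "EMV6D": -0.004164}

added = Last("EMA5D") + Last("EMV6D")
complex_factor = Last("EMA5D") * Last("EMV6D") / 2 + Last("EMV6D")

print(round(added.evaluate(row), 6))           # 5.777614
print(round(complex_factor.evaluate(row), 6))  # -0.016202
```

The printed values match the `added` and `complex` columns returned by `fetch_factor`, which suggests the composed factors are plain element-wise arithmetic on the underlying factor values.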


## 3. Window-Based Rolling Computation

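Financial computation inevitably involves time-series processing, and time-series processing inevitably involves rolling calculations. **finance-python** provides full support for these as well.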

In [7]:
# A standard rolling calculation: MACD of EMA5D over a short and a long window

short_window = 10
long_window = 60
sma = MACD(short=short_window, long=long_window, x="EMA5D")
engine.fetch_factor("2020-07-30", factors={"sma_EMA5D": sma}, codes=[2010000083], warm_start=long_window)

|   | sma_EMA5D | code | chgPct | secShortName |
|---|---|---|---|---|
| 0 | -0.036377 | 2010000083 | 0.5199 | 广州发展 |
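As a rough illustration of what a short/long window indicator computes, and why the call above passes `warm_start=long_window` (presumably so that enough history is loaded to fill the long window), here is a minimal plain-Python sketch. It assumes the conventional MACD definition: a short-window exponential moving average minus a long-window one, with smoothing factor 2/(n+1). finance-python's exact smoothing and initialisation may differ, and both input series are made up.

```python
def ema(values, window):
    """Exponential moving average, seeded with the first observation (an assumption)."""
    alpha = 2.0 / (window + 1)
    result = [values[0]]
    for x in values[1:]:
        result.append(alpha * x + (1 - alpha) * result[-1])
    return result


def macd(values, short, long):
    """Short-window EMA minus long-window EMA at every time step."""
    short_ema = ema(values, short)
    long_ema = ema(values, long)
    return [s - l for s, l in zip(short_ema, long_ema)]


# Two made-up series of 60 observations each.
rising = [10.0 + 0.1 * i for i in range(60)]
flat = [10.0] * 60

rising_macd = macd(rising, short=10, long=60)
flat_macd = macd(flat, short=10, long=60)

# The short EMA tracks a rising series more closely, so the difference is positive.
print(rising_macd[-1] > 0)
# On a flat series both averages equal the constant, so the difference vanishes.
print(abs(flat_macd[-1]) < 1e-9)
```

The value only becomes meaningful once the long window has seen enough observations, which is why a rolling factor needs history before the query date.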


## 4. Cross-Sectional Processing

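Factor analysis often requires comparing individual stocks against one another, which calls for cross-sectional processing; for example, ranking the CSI 300 (HS300) constituents on a given indicator: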

In [8]:
# Use the CSI 300 (HS300) constituents as an example

universe = Universe("HS300")
codes = engine.fetch_codes("2020-07-30", universe)  # fetch the CSI 300 constituents
codes[:5]

['2010000001', '2010000005', '2010000010', '2010000011', '2010000012']

In [9]:
cross_rank = CSRank(x="EMA5D")
engine.fetch_factor("2020-07-30", factors={"cross_rank": cross_rank}, codes=codes)

|   | cross_rank | code | chgPct | secShortName |
|---|---|---|---|---|
| 0 | 98.0 | 2010000001 | -1.0466 | 浦发银行 |
| 1 | 130.0 | 2010000005 | -2.0237 | 白云机场 |
| 2 | 264.0 | 2010000010 | -1.7115 | 上海机场 |
| 3 | 1.0 | 2010000011 | -0.8475 | 包钢股份 |
| 4 | 33.0 | 2010000012 | 0.0000 | 华能国际 |
| ... | ... | ... | ... | ... |
| 295 | 9.0 | 2010031542 | -0.6536 | 中国广核 |
| 296 | 43.0 | 2010031616 | -0.9615 | 渝农商行 |
| 297 | 24.0 | 2010031720 | -0.9615 | 浙商银行 |
| 298 | 32.0 | 2010031773 | -0.6438 | 邮储银行 |
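To make the idea of a cross-sectional rank concrete, the following plain-Python sketch ranks the values of several stocks on a single date. It isn't finance-python's `CSRank` implementation: the helper name `cross_section_rank`, the ascending order (smallest value gets rank 1.0) and the tie handling are all assumptions, and the codes and values are made up.

```python
def cross_section_rank(values):
    """Rank each code's value among all codes on one date; smallest value -> 1.0.

    Ties are broken by insertion order here, which is an arbitrary choice.
    """
    ordered = sorted(values, key=values.get)
    return {code: float(position) for position, code in enumerate(ordered, start=1)}


# Hypothetical factor values for four stocks on the same date.
snapshot = {"A": 12.5, "B": 1.1, "C": 48.0, "D": 7.3}

ranks = cross_section_rank(snapshot)
print(ranks)  # {'B': 1.0, 'D': 2.0, 'A': 3.0, 'C': 4.0}
```

Unlike the rolling calculation in section 3, which looks along time for one stock, this operation looks across stocks at one point in time, so its result depends on which universe of codes is passed in.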


## 5. Composing Everything

All of the operations above can be composed with one another.

In [10]:
a_very_comple_example = CSRank(x=LAST("EMA5D") * LAST("EMV6D") / 2 + LAST("EMV6D")) + MACD(short=short_window, long=long_window, x="EMA5D")

In [11]:
engine.fetch_factor("2020-07-30", factors={"a_very_comple_example": a_very_comple_example}, codes=codes, warm_start=long_window)

|   | a_very_comple_example | code | chgPct | secShortName |
|---|---|---|---|---|
| 0 | 132.974768 | 2010000001 | -1.0466 | 浦发银行 |
| 1 | 87.376388 | 2010000005 | -2.0237 | 白云机场 |
| 2 | 15.101461 | 2010000010 | -1.7115 | 上海机场 |
| 3 | 204.043171 | 2010000011 | -0.8475 | 包钢股份 |
| 4 | 179.264889 | 2010000012 | 0.0000 | 华能国际 |
| ... | ... | ... | ... | ... |
| 295 | 187.073673 | 2010031542 | -0.6536 | 中国广核 |
| 296 | 161.121885 | 2010031616 | -0.9615 | 渝农商行 |
| 297 | 182.079833 | 2010031720 | -0.9615 | 浙商银行 |
| 298 | 184.833899 | 2010031773 | -0.6438 | 邮储银行 |
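One way to read the composite expression: the arithmetic inside `CSRank(...)` is evaluated per stock, the result is ranked across the universe, and the per-stock rolling term is added afterwards. That reading is consistent with the output, where each value looks like a whole-number rank plus a small fractional offset. The sketch below uses made-up numbers and a hypothetical `rank_ascending` helper, and assumes ascending ranks; it isn't alpha-mind's evaluation code.

```python
def rank_ascending(values):
    """Hypothetical helper: smallest value gets rank 1.0."""
    ordered = sorted(values, key=values.get)
    return {code: float(i) for i, code in enumerate(ordered, start=1)}


# Made-up per-stock inputs on one date.
ema5d = {"A": 10.0, "B": 4.0, "C": 20.0}
emv6d = {"A": 0.5, "B": -0.25, "C": 0.1}
rolling_term = {"A": -0.5, "B": 0.25, "C": 0.0}  # stands in for the MACD value

# Step 1: element-wise arithmetic, evaluated independently for each stock.
inner = {code: ema5d[code] * emv6d[code] / 2 + emv6d[code] for code in ema5d}

# Step 2: cross-sectional rank of the step 1 result.
ranked = rank_ascending(inner)

# Step 3: add the per-stock rolling term to the rank.
composite = {code: ranked[code] + rolling_term[code] for code in ranked}
print(composite)  # {'B': 1.25, 'C': 2.0, 'A': 2.5}
```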
