## `builtin`

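In addition to the standard `zipline` factors, filters, and classifiers, a `builtin` module has been added. It integrates fundamental data with the standard `pipeline` and modifies or emulates features of the `quantopian` IDE. It consists mainly of:
+ Custom factors
+ Custom filters
+ Custom classifiers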

In [1]:
from zipline.pipeline.fundamentals.reader import Fundamentals
from zipline.pipeline.builtin import TradingDays,QTradableStocks
from zipline.research import run_pipeline, select_output_by
from zipline.pipeline import Pipeline
from zipline.pipeline.data import USEquityPricing
from zipline.pipeline.factors import SimpleMovingAverage,Returns

### Factors

#### `SuccessiveYZ`
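Number of consecutive one-price limit bars (一字板), for both limit-up and limit-down.

**Note:** this counts consecutive one-price bars, not the total number of one-price bars in the period.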
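
The difference between a consecutive count and a period total can be illustrated with a minimal sketch. This is not the `SuccessiveYZ` implementation: `trailing_run_length` and the sample flags are hypothetical, and the sketch assumes the run is counted backwards from the most recent bar.

```python
def trailing_run_length(flags):
    """Count how many True values end the sequence without interruption."""
    run = 0
    for flag in reversed(flags):
        if not flag:
            break
        run += 1
    return run


# Hypothetical flags, oldest first: True marks a one-price limit-up bar.
limit_up_flags = [True, False, True, True, True]

consecutive = trailing_run_length(limit_up_flags)  # run ending on the last bar
total = sum(limit_up_flags)                        # every flagged bar in the window
print(consecutive, total)
```

The isolated bar at the start of the window contributes to the total but not to the consecutive count.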

In [2]:
from zipline.pipeline.builtin import SuccessiveYZ


def make_pipeline():
    window_length = 100
    yzzt, yzdt = SuccessiveYZ()
    return Pipeline(columns={
        '期间涨幅': Returns(window_length=window_length),
        '涨停个数': yzzt,
        '跌停个数': yzdt
    })


result = run_pipeline(make_pipeline(), '2018-4-24', '2018-4-27')
select_output_by(result, assets=['603876', '603733'])

Unnamed: 0,Unnamed: 1,期间涨幅,涨停个数,跌停个数
2018-04-24 00:00:00+00:00,仙鹤股份(603733),,1.0,1.0
2018-04-24 00:00:00+00:00,鼎胜新材(603876),,1.0,1.0
2018-04-25 00:00:00+00:00,仙鹤股份(603733),,1.0,1.0
2018-04-25 00:00:00+00:00,鼎胜新材(603876),,1.0,1.0
2018-04-26 00:00:00+00:00,仙鹤股份(603733),,1.0,1.0
2018-04-26 00:00:00+00:00,鼎胜新材(603876),,1.0,1.0
2018-04-27 00:00:00+00:00,仙鹤股份(603733),,1.0,1.0
2018-04-27 00:00:00+00:00,鼎胜新材(603876),,1.0,1.0


In [3]:
result = run_pipeline(make_pipeline(), '2018-2-5', '2018-2-9')
select_output_by(result, assets=['600150', '000693', '600074'])

Unnamed: 0,Unnamed: 1,期间涨幅,涨停个数,跌停个数
2018-02-05 00:00:00+00:00,*ST华泽(000693),0.0,1.0,1.0
2018-02-05 00:00:00+00:00,*ST保千(600074),-0.723773,1.0,1.0
2018-02-05 00:00:00+00:00,*ST船舶(600150),-0.004439,1.0,1.0
2018-02-06 00:00:00+00:00,*ST华泽(000693),0.0,1.0,1.0
2018-02-06 00:00:00+00:00,*ST保千(600074),-0.737247,1.0,1.0
2018-02-06 00:00:00+00:00,*ST船舶(600150),-0.010826,1.0,1.0
2018-02-07 00:00:00+00:00,*ST华泽(000693),0.0,1.0,1.0
2018-02-07 00:00:00+00:00,*ST保千(600074),-0.750722,1.0,1.0
2018-02-07 00:00:00+00:00,*ST船舶(600150),-0.010826,1.0,1.0
2018-02-08 00:00:00+00:00,*ST华泽(000693),0.0,1.0,1.0


#### `NDays` days since listing

In [4]:
from zipline.pipeline.builtin import NDays

In [5]:
def make_pipeline():
    ndays = NDays()
    return Pipeline(
        columns={
            '上市天数': ndays,
        }
    )

In [6]:
result = run_pipeline(make_pipeline(), '2018-1-20', '2018-1-26')

In [7]:
select_output_by(result,assets=['600645','603103','603214','603876'])

Unnamed: 0,Unnamed: 1,上市天数
2018-01-22 00:00:00+00:00,中源协和(600645),9029.0
2018-01-22 00:00:00+00:00,横店影视(603103),102.0
2018-01-23 00:00:00+00:00,中源协和(600645),9030.0
2018-01-23 00:00:00+00:00,横店影视(603103),103.0
2018-01-24 00:00:00+00:00,中源协和(600645),9031.0
2018-01-24 00:00:00+00:00,横店影视(603103),104.0
2018-01-25 00:00:00+00:00,中源协和(600645),9032.0
2018-01-25 00:00:00+00:00,横店影视(603103),105.0
2018-01-26 00:00:00+00:00,中源协和(600645),9033.0
2018-01-26 00:00:00+00:00,横店影视(603103),106.0


In [8]:
from zipline.pipeline.builtin import TradingDays

#### `TradingDays` trading days in the period
+ A day counts as a valid trading day when its volume is greater than 0
+ `window_length` must be specified
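
The counting rule above can be sketched in plain Python. The helper `count_trading_days` and the sample volumes are hypothetical and only illustrate the rule; the sketch assumes a positive `window_length`.

```python
def count_trading_days(volumes, window_length):
    """Count bars with volume > 0 among the last `window_length` bars."""
    window = volumes[-window_length:]
    return sum(1 for v in window if v > 0)


# Hypothetical daily volumes, oldest first; 0 means no trading that day.
volumes = [1200, 0, 0, 800, 950, 0, 400]

print(count_trading_days(volumes, window_length=5))
print(count_trading_days(volumes, window_length=7))
```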

In [9]:
def make_pipeline():
    t20 = TradingDays(window_length=20)
    t200 = TradingDays(window_length=200)
    return Pipeline(
        columns={
            '20天内有效交易天数': t20,
            '200天内有效交易天数': t200,           
        }, 
    )

In [10]:
result = run_pipeline(make_pipeline(), '2018-4-20', '2018-4-26')

In [11]:
select_output_by(result,'2018-04-23','2018-04-24',assets=['000001','600645','600076'])

Unnamed: 0,Unnamed: 1,20天内有效交易天数,200天内有效交易天数
2018-04-23 00:00:00+00:00,平安银行(000001),20.0,200.0
2018-04-23 00:00:00+00:00,康欣新材(600076),0.0,150.0
2018-04-23 00:00:00+00:00,中源协和(600645),20.0,127.0
2018-04-24 00:00:00+00:00,平安银行(000001),20.0,200.0
2018-04-24 00:00:00+00:00,康欣新材(600076),0.0,149.0
2018-04-24 00:00:00+00:00,中源协和(600645),20.0,127.0


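+ 600076 was suspended for three months during the period: it has no trades in the last 20 days but does have trades within the last 200 days
+ 600645 was suspended for part of the 200-day window, so its share of valid trading days is below 90%
+ 000001 traded normally every day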
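
As a quick arithmetic check, the share of valid trading days can be computed from the 2018-04-23 rows of the output above. The helper name `valid_trading_ratio` is hypothetical, and the 90% threshold is simply the figure quoted in the bullet; where `builtin` applies it is not shown here.

```python
# (valid trading days, window length) read from the 2018-04-23 rows above.
observed = {
    "000001": (200, 200),
    "600076": (150, 200),
    "600645": (127, 200),
}


def valid_trading_ratio(valid_days, window_length):
    """Share of bars in the window that had volume > 0."""
    return valid_days / window_length


ratios = {code: valid_trading_ratio(*pair) for code, pair in observed.items()}
below_90 = sorted(code for code, ratio in ratios.items() if ratio < 0.9)
print(ratios)
print(below_90)
```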

#### `SuccessiveSuspensionDays`
Number of consecutive suspension days

In [12]:
from zipline.pipeline.builtin import SuccessiveSuspensionDays

def make_pipeline():
    days_90 = SuccessiveSuspensionDays(window_length=90, include=True)
    return Pipeline(
        columns={
            '90天内停牌天数': days_90,
            '成交量': USEquityPricing.volume.latest,           
        }, 
    )

result = run_pipeline(make_pipeline(), '2018-1-1', '2018-4-27')
select_output_by(result,'2018-04-23','2018-04-24',assets=['000001','600645','600076'])

Unnamed: 0,Unnamed: 1,90天内停牌天数,成交量
2018-04-23 00:00:00+00:00,平安银行(000001),0.0,95860000.0
2018-04-23 00:00:00+00:00,康欣新材(600076),50.0,0.0
2018-04-23 00:00:00+00:00,中源协和(600645),31.0,6070000.0
2018-04-24 00:00:00+00:00,平安银行(000001),0.0,107020000.0
2018-04-24 00:00:00+00:00,康欣新材(600076),51.0,0.0
2018-04-24 00:00:00+00:00,中源协和(600645),30.0,4820000.0


### Filters

#### `IsST`
+ Whether the stock is currently in ST status

In [13]:
from zipline.pipeline.builtin import IsST

In [14]:
def make_pipeline():
    is_st = IsST()
    # Used as the mask argument: stocks that are not ST evaluate to NaN
    ma200 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], mask=is_st, window_length=200)
    return Pipeline(columns={
        '平均收盘': ma200,
    })

In [15]:
result = run_pipeline(make_pipeline(), '2018-4-20', '2018-4-26')

In [16]:
select_output_by(
    result, '2018-04-23', assets=['600408', '600645', '600076'])

Unnamed: 0,Unnamed: 1,平均收盘
2018-04-23 00:00:00+00:00,康欣新材(600076),
2018-04-23 00:00:00+00:00,*ST安泰(600408),
2018-04-23 00:00:00+00:00,中源协和(600645),25.04455
2018-04-24 00:00:00+00:00,康欣新材(600076),
2018-04-24 00:00:00+00:00,*ST安泰(600408),
2018-04-24 00:00:00+00:00,中源协和(600645),25.0464
2018-04-25 00:00:00+00:00,康欣新材(600076),
2018-04-25 00:00:00+00:00,*ST安泰(600408),
2018-04-25 00:00:00+00:00,中源协和(600645),25.04905
2018-04-26 00:00:00+00:00,康欣新材(600076),


In [17]:
def make_pipeline():
    is_st = IsST()
    ma20 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], window_length=20)
    # Used as the screen argument: stocks that are not ST are not shown
    return Pipeline(
        columns={
            '平均收盘': ma20,
        }, 
        screen=is_st
    )

In [18]:
result = run_pipeline(make_pipeline(), '2018-4-20', '2018-4-26')

In [19]:
select_output_by(
    result, '2018-04-23', assets=['600408', '600645', '600076'])

Unnamed: 0,Unnamed: 1,平均收盘
2018-04-23 00:00:00+00:00,中源协和(600645),23.159
2018-04-24 00:00:00+00:00,中源协和(600645),23.0225
2018-04-25 00:00:00+00:00,中源协和(600645),23.023
2018-04-26 00:00:00+00:00,中源协和(600645),22.9925
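
The two examples above differ in how the filter is applied: as `mask`, rows that fail the filter stay in the output with NaN; as `screen`, they are dropped. A minimal sketch of that difference follows, using hypothetical assets, values, and helper names rather than the pipeline engine itself.

```python
import math

# Hypothetical per-asset values and filter results for a single day.
values = {"A": 10.0, "B": 20.0, "C": 30.0}
passes = {"A": True, "B": False, "C": True}


def apply_mask(values, passes):
    """Keep every row, but emit NaN where the filter is False."""
    return {k: (v if passes[k] else math.nan) for k, v in values.items()}


def apply_screen(values, passes):
    """Drop the rows where the filter is False."""
    return {k: v for k, v in values.items() if passes[k]}


masked = apply_mask(values, passes)
screened = apply_screen(values, passes)
print(masked)
print(screened)
```

In the real engine a mask also limits which assets the factor is computed over, which this sketch does not model.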


#### `IsNewShare` recently listed stocks
+ `days`: a stock listed for fewer than this many days is treated as recently listed; defaults to 90
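
The rule reduces to a strict comparison against the threshold. The function `is_new_share` below is a hypothetical stand-in, not the filter itself; the sample day counts come from the `NDays` and `IsNewShare` outputs in this document.

```python
def is_new_share(days_listed, days=90):
    """True when the stock has been listed for fewer than `days` days."""
    return days_listed < days


print(is_new_share(74))    # 华西证券(002926) on 2018-04-20
print(is_new_share(9029))  # 中源协和(600645) on 2018-01-22
print(is_new_share(90))    # exactly at the threshold
```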

In [20]:
from zipline.pipeline.builtin import IsNewShare

In [21]:
def make_pipeline():
    ndays = NDays()
    return Pipeline(
        columns={
            '上市天数': ndays,
        }, 
        screen=IsNewShare()
    )

In [22]:
run_pipeline(make_pipeline(), '2018-4-20', '2018-4-26')

Unnamed: 0,Unnamed: 1,上市天数
2018-04-20 00:00:00+00:00,华西证券(002926),74.0
2018-04-20 00:00:00+00:00,泰永长征(002927),56.0
2018-04-20 00:00:00+00:00,华夏航空(002928),49.0
2018-04-20 00:00:00+00:00,润建通信(002929),50.0
2018-04-20 00:00:00+00:00,宏川智慧(002930),23.0
2018-04-20 00:00:00+00:00,锋龙股份(002931),17.0
2018-04-20 00:00:00+00:00,天邑股份(300504),21.0
2018-04-20 00:00:00+00:00,彩讯股份(300634),28.0
2018-04-20 00:00:00+00:00,南京聚隆(300644),73.0
2018-04-20 00:00:00+00:00,科顺股份(300737),85.0


#### `QTradableStocks` quant-tradable stocks

In [23]:
from zipline.pipeline.builtin import QTradableStocks

In [24]:
def make_pipeline():
    stocks = QTradableStocks()
    ma20 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], window_length=20)
    return Pipeline(
        columns={
            'ma20': ma20,
        }, 
        screen=stocks
    )

In [25]:
result = run_pipeline(make_pipeline(), '2018-4-20', '2018-4-26')

In [26]:
ds = result.index.get_level_values(0).unique()

In [27]:
for d in ds:
    print('On {}, {} stocks qualify'.format(d.date(), result.loc[d].shape[0]))

On 2018-04-20, 2623 stocks qualify
On 2018-04-23, 2622 stocks qualify
On 2018-04-24, 2626 stocks qualify
On 2018-04-25, 2630 stocks qualify
On 2018-04-26, 2631 stocks qualify


#### `TopAverageAmount` & `TAA` top N by average trading amount
+ Stocks ranked in the top N by average trading amount. Defaults to the top 500
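
A rough sketch of a top-N selection by average trading amount. The helper `top_average_amount`, the asset names, and the amounts are all hypothetical; the averaging window that `TAA` actually uses is not stated here, so the sketch averages whatever history it is given.

```python
def top_average_amount(amounts_by_asset, n=500):
    """Return the set of assets ranked in the top `n` by mean trading amount."""
    averages = {
        asset: sum(amounts) / len(amounts)
        for asset, amounts in amounts_by_asset.items()
    }
    ranked = sorted(averages, key=averages.get, reverse=True)
    return set(ranked[:n])


# Hypothetical daily trading amounts per asset.
amounts = {
    "A": [100, 300],
    "B": [500, 700],
    "C": [50, 50],
    "D": [400, 400],
}

print(sorted(top_average_amount(amounts, n=2)))
```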

In [28]:
from zipline.pipeline.builtin import TAA

def make_pipeline():
    stocks = QTradableStocks()
    ma20 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], window_length=20)
    return Pipeline(
        columns={
            'ma20': ma20,
        }, 
        screen=TAA()
    )

result = run_pipeline(make_pipeline(), '2018-4-20', '2018-4-26')
result.loc['2018-04-20'].shape == (500,1)

True

#### `IsYZZT` & `IsYZDT`
+ One-price limit bars of ST stocks are included by default
+ To exclude them, set `include_st=False`
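
For readers unfamiliar with the term, a one-price limit bar can be sketched as follows. The sketch rests on assumptions that are not taken from the `builtin` source: open, high, low, and close are all equal to the limit price, and the limit is 10% for regular stocks and 5% for ST stocks, rounded half-up to two decimals. Both function names are hypothetical, and special cases such as newly listed stocks are ignored.

```python
from decimal import Decimal, ROUND_HALF_UP


def limit_price(prev_close, is_st=False, up=True):
    """Assumed daily limit price: +/-10% (5% for ST), rounded to 0.01."""
    pct = Decimal("0.05") if is_st else Decimal("0.10")
    factor = Decimal("1") + pct if up else Decimal("1") - pct
    raw = Decimal(str(prev_close)) * factor
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def is_one_price_limit(open_, high, low, close, prev_close, is_st=False, up=True):
    """True when all four prices are equal and sit exactly at the limit price."""
    prices = {Decimal(str(p)) for p in (open_, high, low, close)}
    return len(prices) == 1 and prices.pop() == limit_price(prev_close, is_st, up)


print(is_one_price_limit(11.0, 11.0, 11.0, 11.0, prev_close=10.0))
print(is_one_price_limit(10.5, 10.5, 10.5, 10.5, prev_close=10.0, is_st=True))
print(is_one_price_limit(10.8, 11.0, 10.8, 11.0, prev_close=10.0))
```

Decimal arithmetic is used so that the rounding to two decimals does not depend on binary floating-point representation.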

In [29]:
from zipline.pipeline.builtin import IsYZZT
from zipline.pipeline.factors import DailyReturns
def make_pipeline():
    dr = DailyReturns()
    ma20 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], window_length=20)
    return Pipeline(
        columns={
            '涨幅':dr,
            'ma20': ma20,
        }, 
        screen=IsYZZT()
    )
# Only one-price limit-up rows are output
result = run_pipeline(make_pipeline(), '2017-9-20', '2017-9-24')
# Note that ST stocks such as 600228 and 000403 are included
select_output_by(result, assets=['600228', '000403', '002893','300699'])

Unnamed: 0,Unnamed: 1,涨幅,ma20
2017-09-20 00:00:00+00:00,华通热力(002893),0.100295,13.603333
2017-09-20 00:00:00+00:00,光威复材(300699),0.099935,30.569231
2017-09-21 00:00:00+00:00,华通热力(002893),0.099866,14.305
2017-09-21 00:00:00+00:00,光威复材(300699),0.100098,32.381429
2017-09-22 00:00:00+00:00,ST生化(000403),0.050081,31.00745
2017-09-22 00:00:00+00:00,华通热力(002893),0.099939,15.054
2017-09-22 00:00:00+00:00,光威复材(300699),0.099928,34.324667
2017-09-22 00:00:00+00:00,ST昌九(600228),0.049951,10.2355


In [30]:
def make_pipeline():
    dr = DailyReturns()
    ma20 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], window_length=20)
    return Pipeline(
        columns={
            '涨幅':dr,
            'ma20': ma20,
        }, 
        # Exclude ST
        screen=IsYZZT(include_st=False)
    )
# Only one-price limit-up rows are output
result = run_pipeline(make_pipeline(), '2017-9-20', '2017-9-24')
# Note that ST stocks such as 600228 and 000403 are no longer included
select_output_by(result, assets=['600228', '000403', '002893','300699'])

Unnamed: 0,Unnamed: 1,涨幅,ma20
2017-09-20 00:00:00+00:00,华通热力(002893),0.100295,13.603333
2017-09-20 00:00:00+00:00,光威复材(300699),0.099935,30.569231
2017-09-21 00:00:00+00:00,华通热力(002893),0.099866,14.305
2017-09-21 00:00:00+00:00,光威复材(300699),0.100098,32.381429
2017-09-22 00:00:00+00:00,华通热力(002893),0.099939,15.054
2017-09-22 00:00:00+00:00,光威复材(300699),0.099928,34.324667


In [31]:
from zipline.pipeline.builtin import IsYZDT


def make_pipeline():
    dr = DailyReturns()
    ma20 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], window_length=20)
    return Pipeline(
        columns={
            '涨幅': dr,
            'ma20': ma20,
        },
        screen=IsYZDT())


# Only one-price limit-down rows are output
run_pipeline(make_pipeline(), '2018-4-24', '2018-4-26')

Unnamed: 0,Unnamed: 1,涨幅,ma20
2018-04-24 00:00:00+00:00,*ST华泽(000693),-0.049296,6.8845
2018-04-24 00:00:00+00:00,*ST三维(000755),-0.049149,6.1065
2018-04-24 00:00:00+00:00,万丰奥威(002085),-0.099919,12.348
2018-04-24 00:00:00+00:00,*ST尤夫(002427),-0.050237,17.0545
2018-04-24 00:00:00+00:00,*ST龙力(002604),-0.05,7.879
2018-04-25 00:00:00+00:00,*ST华泽(000693),-0.049383,6.5405
2018-04-25 00:00:00+00:00,万丰奥威(002085),-0.100269,12.23
2018-04-25 00:00:00+00:00,*ST尤夫(002427),-0.0499,16.201
2018-04-25 00:00:00+00:00,*ST龙力(002604),-0.050817,7.7035
2018-04-25 00:00:00+00:00,*ST船舶(600150),-0.049708,18.37


In [32]:
def make_pipeline():
    dr = DailyReturns()
    ma20 = SimpleMovingAverage(
        inputs=[USEquityPricing.close], window_length=20)
    return Pipeline(
        columns={
            '涨幅': dr,
            'ma20': ma20,
        },
        # Exclude ST
        screen=IsYZDT(include_st=False))


# Output one-price limit-down rows, excluding ST
run_pipeline(make_pipeline(), '2018-4-24', '2018-4-26')

Unnamed: 0,Unnamed: 1,涨幅,ma20
2018-04-24 00:00:00+00:00,万丰奥威(002085),-0.099919,12.348
2018-04-25 00:00:00+00:00,万丰奥威(002085),-0.100269,12.23
2018-04-26 00:00:00+00:00,*ST华信(002018),-0.100186,5.363
2018-04-26 00:00:00+00:00,圣阳股份(002580),-0.100358,8.328


#### Suspension and resumption
Stocks resuming trading on the day

![Suspension list](./images/20180426_list.png)
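
One way to read "resuming on the day" is sketched below in plain Python. It assumes that a suspension shows up as zero volume and that a stock resumes when today's volume is positive after at least one zero-volume bar; both helper names are hypothetical, and the actual `IsResumed` logic may differ.

```python
def is_resumed(volumes):
    """True when the latest bar traded right after a zero-volume bar."""
    if len(volumes) < 2:
        return False
    return volumes[-1] > 0 and volumes[-2] == 0


def suspension_run_before_today(volumes):
    """Length of the zero-volume run that ends on the bar before the latest one."""
    run = 0
    for volume in reversed(volumes[:-1]):
        if volume != 0:
            break
        run += 1
    return run


# Hypothetical daily volumes, oldest first.
volumes = [500, 0, 0, 0, 800]
print(is_resumed(volumes), suspension_run_before_today(volumes))
```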

In [33]:
# Can be used to restrict the universe
from zipline.pipeline.filters import StaticSids

In [34]:
from zipline.pipeline.builtin import IsResumed, SuccessiveSuspensionDays

def make_pipeline():
    #target = StaticSids([2163,600051])
    dr = DailyReturns()
    return Pipeline(
        columns={
            '涨幅': dr,
            '停牌天数':SuccessiveSuspensionDays(include=True)
        },
        screen=IsResumed())


# Only output the returns of stocks that resume trading on the day after a continuous suspension
run_pipeline(make_pipeline(), '2018-4-26', '2018-4-27')

Unnamed: 0,Unnamed: 1,涨幅,停牌天数
2018-04-26 00:00:00+00:00,珠海中富(000659),0.005348,1.0
2018-04-26 00:00:00+00:00,*ST华信(002018),-0.100186,24.0
2018-04-26 00:00:00+00:00,圣阳股份(002580),-0.100358,89.0
2018-04-26 00:00:00+00:00,*ST哈空(600202),-0.049342,1.0
2018-04-26 00:00:00+00:00,*ST狮头(600539),-0.037037,63.0
2018-04-26 00:00:00+00:00,苏美达(600710),0.100885,6.0
2018-04-26 00:00:00+00:00,宁波中百(600857),0.100284,2.0
2018-04-26 00:00:00+00:00,星湖科技(600866),0.101149,60.0
2018-04-26 00:00:00+00:00,渤海汽车(600960),0.002714,89.0
2018-04-26 00:00:00+00:00,*ST蓝科(601798),-0.050157,1.0


### Classifiers

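Stock classification covers region, industry, and concept. To keep things simple, classification data is not tracked over time; a static snapshot is used instead, so every backtest uses the latest classification data. For example, suppose a stock's region is Shanghai as of 2018-2-1 and, after a change of registered address, becomes Beijing as of 2018-4-1. A backtest run on 2018-2-1 uses the latest data available at that time, Shanghai, whereas a backtest run on 2018-4-1 uses Beijing. The drawback is that running the same strategy over the same period at different times can give different results. Keep this discrepancy in mind.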
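
The caveat can be made concrete with a small sketch. The snapshot table, the asset code `DEMO`, and the helper `region_used_in_backtest` are all hypothetical; the point is only that the lookup depends on when the backtest is launched, not on the simulated dates.

```python
# Hypothetical snapshots of the static classification table,
# keyed by the date on which the backtest is launched.
snapshots = {
    "2018-02-01": {"DEMO": "Shanghai"},
    "2018-04-01": {"DEMO": "Beijing"},
}


def region_used_in_backtest(run_date, asset):
    """Return the region a backtest launched on `run_date` would see."""
    return snapshots[run_date][asset]


# Same strategy, same simulated period, launched on two different days.
early = region_used_in_backtest("2018-02-01", "DEMO")
late = region_used_in_backtest("2018-04-01", "DEMO")
print(early, late, early == late)
```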

Classifiers use the `Fundamentals` container class directly.

In [35]:
def make_pipeline():
    dqfl = Fundamentals.info.region.latest
    return Pipeline(
        columns={
            'dqfl': dqfl,
        }, 
        screen=dqfl.element_of([0,1,22])
    )

In [36]:
run_pipeline(make_pipeline(), '2018-4-20', '2018-4-26')

Unnamed: 0,Unnamed: 1,dqfl
2018-04-20 00:00:00+00:00,*ST宜化(000422),22
2018-04-20 00:00:00+00:00,鄂武商Ａ(000501),22
2018-04-20 00:00:00+00:00,长航凤凰(000520),22
2018-04-20 00:00:00+00:00,云南白药(000538),1
2018-04-20 00:00:00+00:00,沙隆达Ａ(000553),22
2018-04-20 00:00:00+00:00,我爱我家(000560),1
2018-04-20 00:00:00+00:00,天茂集团(000627),22
2018-04-20 00:00:00+00:00,湖北广电(000665),22
2018-04-20 00:00:00+00:00,盈方微(000670),22
2018-04-20 00:00:00+00:00,襄阳轴承(000678),22


In [37]:
# Look up the meaning of a classification code
Fundamentals.region_cname(0)

'上海其它'

In [38]:
Fundamentals.region_cname(22)

'湖北'