In [1]:
from quantopian.pipeline import Pipeline
from quantopian.research import run_pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage, AverageDollarVolume

##Putting It All Together
Now that we've covered the basic components of the Pipeline API, let's construct a pipeline that we might want to use in an algorithm.

To start, let's first create a filter to narrow down the types of securities coming out of our pipeline. In this example, we will create a filter to select for securities that meet all of the following criteria:
- Is a primary share
- Is listed as a common stock
- Is not a [depositary receipt](http://www.investopedia.com/terms/d/depositaryreceipt.asp) (ADR/GDR)
- Is not trading [over-the-counter](http://www.investopedia.com/terms/o/otc.asp) (OTC)
- Is not [when-issued](http://www.investopedia.com/terms/w/wi.asp) (WI)
- Doesn't have a name indicating it's a [limited partnership](http://www.investopedia.com/terms/l/limitedpartnership.asp) (LP)
- Doesn't have a company reference entry indicating it's a LP
- Is not an [ETF](http://www.investopedia.com/terms/e/etf.asp) (has Morningstar fundamental data)
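
As a rough picture of what "meet all of the following criteria" means, here is a minimal plain-Python sketch. It is not the Pipeline API: the records and field names (`primary_share`, `exchange`, and so on) are hypothetical and chosen only for illustration, and the limited-partnership checks are left out for brevity.

```python
# Hypothetical security records; field names are illustrative assumptions,
# not Morningstar field names.
securities = [
    {"symbol": "AAA", "primary_share": True, "common_stock": True,
     "depositary_receipt": False, "exchange": "NYS", "has_fundamentals": True},
    {"symbol": "BBB", "primary_share": True, "common_stock": True,
     "depositary_receipt": True, "exchange": "NAS", "has_fundamentals": True},
    {"symbol": "CCC.WI", "primary_share": True, "common_stock": True,
     "depositary_receipt": False, "exchange": "NYS", "has_fundamentals": True},
    {"symbol": "DDD", "primary_share": True, "common_stock": True,
     "depositary_receipt": False, "exchange": "OTCBB", "has_fundamentals": True},
]

# One predicate per criterion; a security must pass every one of them.
criteria = [
    lambda s: s["primary_share"],
    lambda s: s["common_stock"],
    lambda s: not s["depositary_receipt"],
    lambda s: not s["exchange"].startswith("OTC"),
    lambda s: not s["symbol"].endswith(".WI"),
    lambda s: s["has_fundamentals"],
]


def is_tradeable(security):
    """Return True only if the security satisfies all criteria."""
    return all(check(security) for check in criteria)


tradeable = [s["symbol"] for s in securities if is_tradeable(s)]
print(tradeable)
```

Only `AAA` passes here: `BBB` is a depositary receipt, `CCC.WI` is when-issued, and `DDD` trades over-the-counter.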


####Why These Criteria?
Selecting for primary shares and common stock helps us to select only a single security for each company. In general, primary shares are a good representative asset of a company so we will select for these in our pipeline.

ADRs and GDRs are issuances in the US equity market for stocks that trade on other exchanges. Frequently, there is inherent risk associated with depositary receipts due to currency fluctuations so we exclude them from our pipeline.

OTC, WI, and LP securities are not tradeable with most brokers. As a result, we exclude them from our pipeline.

###Creating Our Pipeline
Let's create a filter for each criterion and combine them to create a `tradeable_stocks` filter. First, we need to import the Morningstar `Fundamentals` `DataSet` as well as the built-in `IsPrimaryShare` filter.

In [2]:
from quantopian.pipeline.data import Fundamentals
from quantopian.pipeline.filters.fundamentals import IsPrimaryShare

Now we can define our filters:

In [3]:
# Filter for primary share equities. IsPrimaryShare is a built-in filter.
primary_share = IsPrimaryShare()

# Equities listed as common stock (as opposed to, say, preferred stock).
# 'ST00000001' indicates common stock.
common_stock = Fundamentals.security_type.latest.eq('ST00000001')

# Non-depositary receipts. Recall that the ~ operator inverts filters,
# turning Trues into Falses and vice versa
not_depositary = ~Fundamentals.is_depositary_receipt.latest

# Equities not trading over-the-counter.
not_otc = ~Fundamentals.exchange_id.latest.startswith('OTC')

# Not when-issued equities.
not_wi = ~Fundamentals.symbol.latest.endswith('.WI')

# Equities without LP in their name. The .matches method tests each name
# against a regular expression.
not_lp_name = ~Fundamentals.standard_name.latest.matches('.* L[. ]?P.?$')

# Equities with a null value in the limited_partnership Morningstar
# fundamental field.
not_lp_balance_sheet = Fundamentals.limited_partnership.latest.isnull()

# Equities whose most recent Morningstar market cap is not null have
# fundamental data and therefore are not ETFs.
have_market_cap = Fundamentals.market_cap.latest.notnull()

# Filter for stocks that pass all of our previous filters.
tradeable_stocks = (primary_share & common_stock & not_depositary & not_otc 
                    & not_wi & not_lp_name & not_lp_balance_sheet & have_market_cap)

Note that when defining our filters, we used several `Classifier` methods that we haven't yet seen including `notnull`, `startswith`, `endswith`, and `matches`. Documentation on these methods is available [here](https://www.quantopian.com/help#quantopian_pipeline_classifiers_Classifier).
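
To get a feel for what the LP name pattern accepts, it can be tried outside of Pipeline with Python's `re` module. This is a sketch under the assumption that `matches` behaves like `re.match` (anchored at the start of the string); the company names are made up for illustration.

```python
import re

# Same pattern as the one passed to .matches in the filter definitions.
LP_NAME_PATTERN = re.compile(r'.* L[. ]?P.?$')


def looks_like_lp(name):
    """Return True if the name ends in something like ' LP', ' L.P.' or ' L P'."""
    # re.match anchors at the start; the pattern's `$` anchors at the end.
    return LP_NAME_PATTERN.match(name) is not None


# Hypothetical names, not real securities.
names = [
    "Example Pipeline Partners L.P.",
    "Example Energy LP",
    "Example Holdings L P",
    "Example Inc",
    "Example LPG",
]

for name in names:
    print(name, looks_like_lp(name))
```

Note that the trailing `.?` is an unescaped dot, so it accepts any single character after the `P`. That is why the made-up name `Example LPG` also matches, even though it is not written as a limited partnership.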

Next, let's create a filter for the top 10% of tradeable stocks by 30-day average dollar volume. We'll call this our `base_universe`.

In [7]:
def make_pipeline():
    # Top 10% of tradeable stocks by 30-day average dollar volume.
    base_universe = AverageDollarVolume(window_length=30, mask=tradeable_stocks).percentile_between(90, 100)

    return Pipeline(columns={'base universe': base_universe}, screen=base_universe)

# Run the pipeline once and reuse the result for both the count and the preview.
result = run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27')
print("Number of securities that passed the filter: %d" % len(result))
result.head(20)



Number of securities that passed the filter: 418

After March 31, 2018, this field will no longer be active, and new data will contain only NaNs. We recommend that you remove 'limited_partnership' from your code.


| | | base universe |
|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(24 [AAPL]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(62 [ABT]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(67 [ADSK]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(114 [ADBE]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(122 [ADI]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(128 [ADM]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(161 [AEP]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(168 [AET]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(185 [AFL]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(216 [HES]) | True |
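
The idea behind selecting the top 10% by average dollar volume can be sketched in plain Python. This is only an approximation for intuition: it uses a simple rank cutoff rather than whatever percentile computation `percentile_between` performs internally, a 3-day window instead of 30 days, and made-up symbols and prices.

```python
def average_dollar_volume(closes, volumes):
    """Mean of close * volume over the window."""
    dollar_volumes = [c * v for c, v in zip(closes, volumes)]
    return sum(dollar_volumes) / len(dollar_volumes)


def top_fraction(values_by_symbol, fraction=0.10):
    """Return the symbols ranked in the top `fraction` by value (at least one)."""
    ranked = sorted(values_by_symbol, key=values_by_symbol.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return set(ranked[:keep])


# Hypothetical data: ten symbols, each with a constant close and volume
# over a 3-day window (shortened from 30 days to keep the example small).
window = 3
data = {
    "S%d" % i: ([10.0 + i] * window, [1000 * (i + 1)] * window)
    for i in range(10)
}

adv = {symbol: average_dollar_volume(closes, volumes)
       for symbol, (closes, volumes) in data.items()}

base_universe = top_fraction(adv)
print(base_universe)
```

With ten symbols, the top 10% is a single security: the one with the largest average dollar volume.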


####Built-in Base Universe

We have just defined our own base universe to select 'tradeable' securities with high dollar volume. However, Quantopian has several built-in filters that do something similar, the newest of which is the [QTradableStocksUS](https://www.quantopian.com/help#quantopian_pipeline_filters_QTradableStocksUS). The QTradableStocksUS is a built-in pipeline filter that selects a daily universe of stocks in three filtering passes, applying a set of criteria designed to yield the most liquid universe possible. Unlike its predecessors, the [Q500US](https://www.quantopian.com/help#quantopian_pipeline_filters_Q500US) and the [Q1500US](https://www.quantopian.com/help#quantopian_pipeline_filters_Q1500US), it has no size cutoff. More detail on the selection criteria of the QTradableStocksUS can be found [here](https://www.quantopian.com/posts/working-on-our-best-universe-yet-qtradablestocksus).

To simplify our pipeline, let's replace what we've already written for our `base_universe` with the `QTradableStocksUS` built-in filter. First, we need to import it.

In [8]:
from quantopian.pipeline.filters import QTradableStocksUS

Then, let's set our `base_universe` to the `QTradableStocksUS`.

In [11]:
def make_pipeline():
    base_universe = QTradableStocksUS()
    
    return Pipeline(columns = {'QTradableStockUS': base_universe}, screen = base_universe)

run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27').head(10)



| | | QTradableStockUS |
|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(2 [HWM]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(24 [AAPL]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(31 [ABAX]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(41 [ARCB]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(52 [ABM]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(53 [ABMD]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(62 [ABT]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(64 [GOLD]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(67 [ADSK]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(76 [TAP]) | True |


Now that we have a filter `base_universe` that we can use to select a subset of securities, let's focus on creating factors for this subset. For this example, let's create a pipeline for a mean reversion strategy. In this strategy, we'll compare the 30-day and 60-day moving averages of the close price by computing the percent difference of the 30-day average relative to the 60-day average. Let's plan to open equally weighted long positions in the 50 securities with the least (most negative) percent difference and equally weighted short positions in the 50 with the greatest percent difference. To do this, let's create two moving average factors using our `base_universe` filter as a mask. Then let's combine them into a factor computing the percent difference.

In [12]:
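# Base universe filter, defined at the top level so it can be used as a mask here.
base_universe = QTradableStocksUS()

# 30-day close price average.
mean_30 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30, mask=base_universe)

# 60-day close price average.
mean_60 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=60, mask=base_universe)

# Percent difference of the 30-day average relative to the 60-day average.
percent_difference = (mean_30 - mean_60) / mean_60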
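
The arithmetic behind these factors can be checked by hand for a single security. The following is a plain-Python sketch, not the Pipeline implementation; the price series is made up (closes rising from 1.0 to 60.0) so that the averages are easy to verify.

```python
def simple_moving_average(prices, window_length):
    """Mean of the most recent `window_length` prices."""
    if len(prices) < window_length:
        raise ValueError("not enough prices for the requested window")
    window = prices[-window_length:]
    return sum(window) / window_length


# Hypothetical close prices for one security: 1.0, 2.0, ..., 60.0.
closes = [float(p) for p in range(1, 61)]

mean_30 = simple_moving_average(closes, 30)   # average of 31.0 .. 60.0
mean_60 = simple_moving_average(closes, 60)   # average of 1.0 .. 60.0

# Positive when the recent average sits above the longer-term average.
percent_difference = (mean_30 - mean_60) / mean_60

print(mean_30, mean_60, round(percent_difference, 4))
```

Because this series has been rising, the 30-day average (45.5) is above the 60-day average (30.5) and the percent difference is positive, which in the strategy described above would make the security a candidate for a short position.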

Next, let's create filters for the top 50 and bottom 50 equities by `percent_difference`.

In [13]:
# Create a filter to select securities to short.
shorts = percent_difference.top(50)

# Create a filter to select securities to long.
longs = percent_difference.bottom(50)

Let's then combine `shorts` and `longs` to create a new filter that we can use as the screen of our pipeline:

In [14]:
securities_to_trade = (shorts | longs)

Since our earlier filters were used as masks as we built up to this final filter, when we use `securities_to_trade` as a screen, the output securities will meet the criteria outlined at the beginning of the lesson (primary shares, non-ETFs, etc.). They will also have high dollar volume.
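
The effect of masking can be illustrated with a small plain-Python sketch. It is a simplification: here the mask is applied at the ranking step, whereas in the pipeline it is applied when the moving averages are computed. The symbols and percent differences are hypothetical.

```python
def top(values, n, mask):
    """Symbols with the n largest values, considering only symbols in the mask."""
    eligible = [symbol for symbol in values if symbol in mask]
    return set(sorted(eligible, key=values.get, reverse=True)[:n])


def bottom(values, n, mask):
    """Symbols with the n smallest values, considering only symbols in the mask."""
    eligible = [symbol for symbol in values if symbol in mask]
    return set(sorted(eligible, key=values.get)[:n])


# Hypothetical percent differences. EEE and FFF have the most extreme values
# but are not part of the base universe.
percent_difference = {
    "AAA": 0.08, "BBB": -0.05, "CCC": 0.02,
    "DDD": -0.01, "EEE": 0.30, "FFF": -0.40,
}
base_universe = {"AAA", "BBB", "CCC", "DDD"}

shorts = top(percent_difference, 1, base_universe)
longs = bottom(percent_difference, 1, base_universe)

# Union of the two sets, analogous to (shorts | longs) on filters.
securities_to_trade = shorts | longs
print(shorts, longs, securities_to_trade)
```

Even though `EEE` and `FFF` have the largest and smallest values overall, they never appear in the result because they fall outside the masked universe.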

Finally, let's instantiate our pipeline. Since we are planning on opening equally weighted long and short positions later, the only information that we actually need from our pipeline is which securities we want to trade (the pipeline index) and whether or not to open a long or a short position. Let's add our `longs` and `shorts` filters to our pipeline and set our screen to be `securities_to_trade`.

In [15]:
def make_pipeline():
    
    # Base universe filter.
    base_universe = QTradableStocksUS()
    
    # 30-day close price average.
    mean_30 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30, mask=base_universe)

    # 60-day close price average.
    mean_60 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=60, mask=base_universe)

    # Percent difference factor.
    percent_difference = (mean_30 - mean_60) / mean_60
    
    # Create a filter to select securities to short.
    shorts = percent_difference.top(50)

    # Create a filter to select securities to long.
    longs = percent_difference.bottom(50)
    
    # Filter for the securities that we want to trade.
    securities_to_trade = (shorts | longs)
    
    return Pipeline(columns={'longs': longs, 'shorts': shorts}, screen=securities_to_trade)

Running this pipeline will result in a DataFrame containing 2 columns. Each day, the columns will contain boolean values that we can use to decide whether we want to open a long or short position in each security.

In [16]:
result = run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27')
result.head(20)



| | | longs | shorts |
|---|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(915 [BKE]) | False | True |
| 2017-12-27 00:00:00+00:00 | Equity(2069 [FTR]) | True | False |
| 2017-12-27 00:00:00+00:00 | Equity(2460 [EFII]) | True | False |
| 2017-12-27 00:00:00+00:00 | Equity(2614 [ESL]) | True | False |
| 2017-12-27 00:00:00+00:00 | Equity(3150 [INO]) | True | False |
| 2017-12-27 00:00:00+00:00 | Equity(3585 [HL]) | True | False |
| 2017-12-27 00:00:00+00:00 | Equity(4265 [KEM]) | True | False |
| 2017-12-27 00:00:00+00:00 | Equity(4413 [AXGN]) | False | True |
| 2017-12-27 00:00:00+00:00 | Equity(4564 [LB]) | False | True |
| 2017-12-27 00:00:00+00:00 | Equity(4751 [MDP]) | False | True |
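
One way the boolean columns might be turned into equally weighted positions is sketched below in plain Python. The weighting scheme (half of the portfolio long, half short) and the `equal_weights` helper are assumptions made for illustration, not something this lesson prescribes; the four rows are taken from the output above.

```python
# A few rows from the pipeline output, keyed by symbol.
rows = {
    "BKE": {"longs": False, "shorts": True},
    "FTR": {"longs": True, "shorts": False},
    "EFII": {"longs": True, "shorts": False},
    "AXGN": {"longs": False, "shorts": True},
}


def equal_weights(rows, gross_long=0.5, gross_short=0.5):
    """Split `gross_long` evenly across longs and `gross_short` across shorts.

    The 50/50 split is an assumed convention for this sketch.
    """
    longs = [symbol for symbol, row in rows.items() if row["longs"]]
    shorts = [symbol for symbol, row in rows.items() if row["shorts"]]

    weights = {}
    for symbol in longs:
        weights[symbol] = gross_long / len(longs)
    for symbol in shorts:
        # Short positions carry negative weights.
        weights[symbol] = -gross_short / len(shorts)
    return weights


weights = equal_weights(rows)
print(weights)
```

With two longs and two shorts, each long receives a weight of 0.25 and each short a weight of -0.25, so the weights sum to zero.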


In the next lesson, we'll add this pipeline to an algorithm.