In [1]:
from quantopian.pipeline import Pipeline
from quantopian.research import run_pipeline
from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.pipeline.factors import SimpleMovingAverage

##Filters
A Filter is a function from an asset and a moment in time to a boolean:
```
F(asset, timestamp) -> boolean
```
In Pipeline, [Filters](https://www.quantopian.com/help#quantopian_pipeline_filters_Filter) are used for narrowing down the set of securities included in a computation or in the final output of a pipeline. There are two common ways to create a `Filter`: comparison operators and `Factor`/`Classifier` methods.
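
To make the definition concrete, here is a minimal plain-Python sketch of a filter as a function from an asset and a timestamp to a boolean. It does not use the Pipeline API; the `CLOSES` table, the tickers, and the `close_above_100` name are hypothetical and the prices are invented for illustration.

```python
# Hypothetical closing prices keyed by (asset, date); values are invented.
CLOSES = {
    ("AAA", "2017-12-26"): 98.0,
    ("AAA", "2017-12-27"): 101.5,
    ("BBB", "2017-12-26"): 42.0,
    ("BBB", "2017-12-27"): 41.0,
}


def close_above_100(asset, timestamp):
    """A filter in the F(asset, timestamp) -> boolean sense."""
    return CLOSES[(asset, timestamp)] > 100


# Evaluate the filter once per (asset, date) pair: one boolean each.
results = {key: close_above_100(*key) for key in sorted(CLOSES)}
for asset, date in sorted(results):
    print('%s %s %s' % (date, asset, results[(asset, date)]))
```

A Pipeline `Filter` plays the same role, except that Pipeline evaluates it for every asset on every day for us.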

###Comparison Operators
Comparison operators on `Factors` and `Classifiers` produce Filters. Since we haven't looked at `Classifiers` yet, let's stick to examples using `Factors`. The following example produces a filter that returns `True` whenever the latest close price is above $100.

In [13]:
last_close_price = USEquityPricing.close.latest

def make_pipeline():
    close_price_filter = last_close_price > 100
    
    return Pipeline(columns = {'close price filter': close_price_filter})

run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27')



| | | close price filter |
|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(2 [HWM]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(21 [AAME]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(24 [AAPL]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(25 [HWM_PR]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(31 [ABAX]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(41 [ARCB]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(52 [ABM]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(53 [ABMD]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(62 [ABT]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(64 [GOLD]) | False |


And this example produces a filter that returns `True` whenever the 15-day mean close price is below the 60-day mean close price.

In [14]:
def make_pipeline():
    mean_close_15 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=15)
    mean_close_60 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=60)
    mean_crossover_filter = mean_close_15 < mean_close_60
    
    return Pipeline(columns = {'mean crossover filter': mean_crossover_filter})

run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27')



| | | mean crossover filter |
|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(2 [HWM]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(21 [AAME]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(24 [AAPL]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(25 [HWM_PR]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(31 [ABAX]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(41 [ARCB]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(52 [ABM]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(53 [ABMD]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(62 [ABT]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(64 [GOLD]) | True |


Remember, each security will get its own `True` or `False` value each day.
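
To make the per-security comparison concrete, here is a plain-Python sketch of the crossover test. It is not the Pipeline API: `simple_moving_average` is a hypothetical helper, the tickers and prices are invented, and 2- and 6-day windows stand in for the 15- and 60-day ones.

```python
def simple_moving_average(prices, window_length):
    """Mean of the most recent `window_length` prices (oldest first)."""
    window = prices[-window_length:]
    return sum(window) / float(len(window))


# Hypothetical closing prices, oldest first; invented for illustration.
closes = {
    "AAA": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],  # steadily rising
    "BBB": [15.0, 14.0, 13.0, 12.0, 11.0, 10.0],  # steadily falling
}

# One boolean per security: is the short mean below the long mean?
crossover = {}
for asset, prices in closes.items():
    short_mean = simple_moving_average(prices, 2)
    long_mean = simple_moving_average(prices, 6)
    crossover[asset] = short_mean < long_mean

for asset in sorted(crossover):
    print('%s %s' % (asset, crossover[asset]))
```

The falling series passes the filter and the rising one does not; repeating the comparison each day gives each security a fresh value.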

###Factor/Classifier Methods
Various methods of the `Factor` and `Classifier` classes return `Filters`. Again, since we haven't yet looked at `Classifiers`, let's stick to `Factor` methods for now (we'll look at `Classifier` methods later). The `Factor.top(n)` method produces a `Filter` that returns `True` for the top `n` securities of a given `Factor`. The following example produces a filter that returns `True` for exactly 150 securities every day, indicating that those securities were in the top 150 by last close price across all known securities.

In [18]:
def make_pipeline():
    last_close_price = USEquityPricing.close.latest
    top_close_price_filter = last_close_price.top(150)
    
    return Pipeline(columns = {'top close price': top_close_price_filter})

run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27')



| | | top close price |
|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(2 [HWM]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(21 [AAME]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(24 [AAPL]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(25 [HWM_PR]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(31 [ABAX]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(41 [ARCB]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(52 [ABM]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(53 [ABMD]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(62 [ABT]) | False |
| 2017-12-27 00:00:00+00:00 | Equity(64 [GOLD]) | False |


For a full list of `Factor` methods that return `Filters`, see [this link](https://www.quantopian.com/help#quantopian_pipeline_factors_Factor).

For a full list of `Classifier` methods that return `Filters`, see [this link](https://www.quantopian.com/help#quantopian_pipeline_classifiers_Classifier).
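
As a rough illustration of what a `top(n)`-style filter computes on a single day, here is a plain-Python sketch. The `top` function below is a hypothetical stand-in, not the `Factor.top` implementation, and it ignores details such as ties and missing values; the tickers and prices are invented.

```python
def top(values, n):
    """Return {asset: bool}, True for the n assets with the largest values."""
    ranked = sorted(values, key=values.get, reverse=True)
    winners = set(ranked[:n])
    return {asset: asset in winners for asset in values}


# Hypothetical last close prices for one day; invented for illustration.
last_close = {"AAA": 310.0, "BBB": 12.5, "CCC": 95.0, "DDD": 1040.0}

top_2 = top(last_close, 2)
for asset in sorted(top_2):
    print('%s %s' % (asset, top_2[asset]))
```

Exactly `n` securities come back `True`, which mirrors the "exactly 150 securities every day" behavior described for `last_close_price.top(150)`.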

##Dollar Volume Filter
As a starting example, let's create a filter that returns `True` if a security's 150-day average dollar volume is above $50,000,000. To do this, we'll first need to create an `AverageDollarVolume` factor to compute the average dollar volume over the last 150 trading days. Let's include the built-in `AverageDollarVolume` factor in our imports:

In [19]:
from quantopian.pipeline.factors import AverageDollarVolume

And then, let's instantiate our average dollar volume factor.

In [20]:
dollar_volume = AverageDollarVolume(window_length=150)

By default, `AverageDollarVolume` uses `USEquityPricing.close` and `USEquityPricing.volume` as its `inputs`, so we don't specify them.
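
Conceptually, average dollar volume is the mean of each day's close price times that day's volume over the window. The sketch below shows that arithmetic in plain Python. It assumes complete data for the window; `average_dollar_volume` is a hypothetical helper rather than the built-in factor, which may treat missing values differently, and the numbers are invented.

```python
def average_dollar_volume(closes, volumes, window_length):
    """Mean of close * volume over the most recent `window_length` days."""
    recent = list(zip(closes, volumes))[-window_length:]
    return sum(close * volume for close, volume in recent) / float(len(recent))


# Hypothetical daily data for one security, oldest first; invented values.
closes = [10.0, 20.0, 30.0]
volumes = [1000, 2000, 1000]

adv = average_dollar_volume(closes, volumes, window_length=2)
print(adv)  # (20 * 2000 + 30 * 1000) / 2
```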

Now that we have a dollar volume factor, we can create a filter with a boolean expression. The following line creates a filter returning `True` for securities with a `dollar_volume` greater than 50,000,000:

In [22]:
high_dollar_volume = (dollar_volume > 50000000)

To see what this filter looks like, let's add it as a column to the pipeline we defined in the previous lesson.

In [23]:
def make_pipeline():

    mean_close_60 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=60)
    mean_close_30 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30)

    percent_difference = (mean_close_30 - mean_close_60) / mean_close_60
    
    dollar_volume = AverageDollarVolume(window_length=150)
    high_dollar_volume = (dollar_volume > 50000000)

    return Pipeline(
        columns={
            'percent_difference': percent_difference,
            'high_dollar_volume': high_dollar_volume
        }
    )

If we make and run our pipeline, we now have a column `high_dollar_volume` with a boolean value corresponding to the result of the expression for each security.

In [25]:
result = run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27')
result.head(10)



| | | high_dollar_volume | percent_difference |
|---|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(2 [HWM]) | True | -0.023961 |
| 2017-12-27 00:00:00+00:00 | Equity(21 [AAME]) | False | 0.025186 |
| 2017-12-27 00:00:00+00:00 | Equity(24 [AAPL]) | True | 0.032539 |
| 2017-12-27 00:00:00+00:00 | Equity(25 [HWM_PR]) | False | 0.004791 |
| 2017-12-27 00:00:00+00:00 | Equity(31 [ABAX]) | False | 0.01875 |
| 2017-12-27 00:00:00+00:00 | Equity(41 [ARCB]) | False | 0.049752 |
| 2017-12-27 00:00:00+00:00 | Equity(52 [ABM]) | False | -0.013418 |
| 2017-12-27 00:00:00+00:00 | Equity(53 [ABMD]) | False | 0.035248 |
| 2017-12-27 00:00:00+00:00 | Equity(62 [ABT]) | True | 0.006952 |
| 2017-12-27 00:00:00+00:00 | Equity(64 [GOLD]) | True | -0.048604 |


##Applying a Screen
By default, a pipeline produces computed values each day for every asset in the Quantopian database. Very often, however, we only care about a subset of securities that meet specific criteria (for example, we might only care about securities that have enough daily trading volume to fill our orders quickly). We can tell our Pipeline to ignore securities for which a filter produces `False` by passing that filter to our Pipeline via the `screen` keyword.
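
As a rough illustration of what a screen does to the output, here is a plain-Python sketch for a single day. It is not how Pipeline is implemented; `apply_screen` is a hypothetical helper and the tickers and values are invented.

```python
# Hypothetical one-day outputs; values are invented for illustration.
percent_difference = {"AAA": 0.03, "BBB": -0.01, "CCC": 0.02}
high_dollar_volume = {"AAA": True, "BBB": False, "CCC": True}


def apply_screen(column, screen):
    """Keep only the assets for which the screen filter is True."""
    return {asset: value for asset, value in column.items() if screen[asset]}


screened = apply_screen(percent_difference, high_dollar_volume)
for asset in sorted(screened):
    print('%s %s' % (asset, screened[asset]))
```

The screened-out security is dropped from the result entirely rather than reported with a `False` flag.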

To screen our pipeline output for securities with a 60-day average dollar volume greater than $20,000,000, we can simply pass our `high_dollar_volume` filter as the `screen` argument. This is what our `make_pipeline` function now looks like:

In [26]:
def make_pipeline():

    mean_close_60 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=60)
    mean_close_30 = SimpleMovingAverage(inputs=[USEquityPricing.close], window_length=30)

    percent_difference = (mean_close_30 - mean_close_60) / mean_close_60

    dollar_volume = AverageDollarVolume(window_length=60)
    high_dollar_volume = dollar_volume > 20000000

    return Pipeline(
        columns={'percent_difference': percent_difference},
        screen=high_dollar_volume)

When we run this, the pipeline output only includes securities that pass the `high_dollar_volume` filter on a given day. For example, running this pipeline on December 27th, 2017 results in an output for 1,511 securities.

In [27]:
result = run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27')
print('Number of securities that passed the filter: %d' % len(result))
result.head(20)



Number of securities that passed the filter: 1511


| | | percent_difference |
|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(2 [HWM]) | -0.023961 |
| 2017-12-27 00:00:00+00:00 | Equity(24 [AAPL]) | 0.032539 |
| 2017-12-27 00:00:00+00:00 | Equity(53 [ABMD]) | 0.035248 |
| 2017-12-27 00:00:00+00:00 | Equity(62 [ABT]) | 0.006952 |
| 2017-12-27 00:00:00+00:00 | Equity(64 [GOLD]) | -0.048604 |
| 2017-12-27 00:00:00+00:00 | Equity(67 [ADSK]) | -0.025277 |
| 2017-12-27 00:00:00+00:00 | Equity(76 [TAP]) | -0.009901 |
| 2017-12-27 00:00:00+00:00 | Equity(114 [ADBE]) | 0.034462 |
| 2017-12-27 00:00:00+00:00 | Equity(122 [ADI]) | -0.008602 |
| 2017-12-27 00:00:00+00:00 | Equity(128 [ADM]) | -0.022268 |


##Inverting a Filter
The `~` operator inverts a filter, turning every `True` value into `False` and vice versa. For example, we can write the following to filter for low dollar volume securities:

In [29]:
def make_pipeline():
    low_dollar_volume = ~high_dollar_volume
    
    return Pipeline(columns = {'low dollar volume': low_dollar_volume}, screen = low_dollar_volume)

run_pipeline(make_pipeline(), '2017-12-27', '2017-12-27').head(20)



| | | low dollar volume |
|---|---|---|
| 2017-12-27 00:00:00+00:00 | Equity(21 [AAME]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(25 [HWM_PR]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(31 [ABAX]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(41 [ARCB]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(52 [ABM]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(53 [ABMD]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(66 [AB]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(70 [VBF]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(84 [ACET]) | True |
| 2017-12-27 00:00:00+00:00 | Equity(100 [IEP]) | True |


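This returns `True` for every security whose 150-day average dollar volume is less than or equal to $50,000,000. Note that `high_dollar_volume` here is the notebook-level filter defined earlier (`AverageDollarVolume(window_length=150) > 50000000`), not the 60-day, $20,000,000 filter defined locally inside the previous `make_pipeline`.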
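
The effect of inverting can be sketched in plain Python for a single day, using invented tickers and values. The sketch uses `not` because `~` on a plain Python boolean is a bitwise operation (`~True` is `-2`); it only gives the flipping behavior described here because Pipeline filters define their own `~`.

```python
# Hypothetical one-day filter results; values are invented for illustration.
high_dollar_volume = {"AAA": True, "BBB": False, "CCC": True}

# Flip every boolean: what was True becomes False and vice versa.
low_dollar_volume = {
    asset: not passed for asset, passed in high_dollar_volume.items()
}

for asset in sorted(low_dollar_volume):
    print('%s %s' % (asset, low_dollar_volume[asset]))
```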

In the next lesson, we will look at combining filters.