<a href="https://www.quantrocket.com"><img alt="QuantRocket logo" src="https://www.quantrocket.com/assets/img/notebook-header-logo.png"></a>

<a href="https://www.quantrocket.com/disclaimer/">Disclaimer</a>

# Define a Base Universe

Before researching specific factors, we will define a base universe. We want to exclude certain securities, such as ETFs and ADRs, from all of our subsequent analysis. By defining the base universe in a separate file, we can import and use the definition in our notebooks without having to redefine the universe rules in each notebook.

The base universe will still be quite broad, for two reasons. First, we can always add more rules to the base rules in any given notebook to narrow the universe. Second, using a broad universe will help us see how factors behave across the US equities market, even if we subsequently wish to narrow the universe for trading or further analysis.   

## Explore Sharadar Categories

Different types of securities are categorized in the `sharadar_Category` field of the securities master database. Let's query all Sharadar records in the securities master database and group by `sharadar_Category` to see a breakdown of security types:

In [1]:
from quantrocket.master import get_securities
securities = get_securities(vendors="sharadar", fields=["Symbol", "sharadar_Category"])

securities.groupby("sharadar_Category").Symbol.count()

sharadar_Category
ADR                                          2
ADR Common Stock                          2052
ADR Common Stock Primary Class             138
ADR Common Stock Secondary Class            95
ADR Preferred                                6
ADR Preferred Stock                         96
ADR Stock Warrant                          145
CEF                                       1075
Canadian                                     1
Canadian Common Stock                      368
Canadian Common Stock Primary Class         10
Canadian Common Stock Secondary Class        3
Canadian Preferred Stock                     3
Canadian Stock Warrant                       8
Domestic                                    76
Domestic Common Stock                    13625
Domestic Common Stock Primary Class       1177
Domestic Common Stock Secondary Class     1081
Domestic Preferred                          45
Domestic Preferred Stock                  1142
Domestic Primary                          

We will focus on domestic common stocks. Since some companies have multiple share classes, we will exclude "Domestic Common Stock Secondary Class". The following Pipeline expression will satisfy these requirements:

In [2]:
from zipline.pipeline import master

category = master.SecuritiesMaster.sharadar_Category.latest
base_universe = (
    # domestic common stocks
    category.has_substring("Domestic Common")
    # no secondary shares
    & ~category.has_substring("Secondary")
)
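
To see which categories these two substring rules keep, the same logic can be applied in plain Python to some of the category names from the breakdown above. This is only an illustrative sketch: the `categories` list is copied by hand from the output, and `in_base_universe` is a hypothetical helper, not part of Zipline or QuantRocket.

```python
# A sample of category names copied from the groupby output above.
categories = [
    "ADR Common Stock",
    "CEF",
    "Canadian Common Stock",
    "Domestic",
    "Domestic Common Stock",
    "Domestic Common Stock Primary Class",
    "Domestic Common Stock Secondary Class",
    "Domestic Preferred Stock",
]


def in_base_universe(category):
    """Mimic has_substring("Domestic Common") & ~has_substring("Secondary")."""
    return "Domestic Common" in category and "Secondary" not in category


kept = [c for c in categories if in_base_universe(c)]
print(kept)
# ['Domestic Common Stock', 'Domestic Common Stock Primary Class']
```

Matching on substrings rather than exact names means a single rule covers both the plain and the "Primary Class" variants.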

## Liquidity Filter

Even though we want our base universe to be broad and include companies of all sizes, it is still important to add a basic liquidity filter. We will limit the universe to stocks that have had at least some trading volume on each trading day of the past month (approximately 21 trading days). Stocks that have zero trading volume are not only untradable but are also more likely to have suspect prices that can cause unexpected results in Alphalens tear sheets and other analyses.  

In [3]:
from zipline.pipeline import EquityPricing

# require nonzero volume on each of the last 21 trading days
base_universe = (EquityPricing.volume.latest > 0).all(21, mask=base_universe)
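
The effect of an "every day in the window" rule can be illustrated without Zipline. The sketch below uses synthetic volume data for a single stock and a hypothetical helper, `all_over_window`; treating days without a full 21-day history as failing is a simplification of this sketch, not a statement about how Pipeline handles lookback data.

```python
def all_over_window(flags, window):
    """For each day, return True only if `flags` was True on that day and
    on the preceding `window - 1` days. Days without a full window are False."""
    result = []
    for i in range(len(flags)):
        if i + 1 < window:
            result.append(False)
        else:
            result.append(all(flags[i + 1 - window : i + 1]))
    return result


# Synthetic daily volume: 60 days, with one zero-volume day at index 30.
volume = [1000] * 60
volume[30] = 0

has_volume = [v > 0 for v in volume]
passes = all_over_window(has_volume, 21)

print(passes[29])  # True: days 9 through 29 all had volume
print(passes[30])  # False: the zero-volume day itself
print(passes[50])  # False: the window (days 30 through 50) still contains day 30
print(passes[51])  # True: the window (days 31 through 51) no longer contains day 30
```

In this sketch, a single day without volume keeps the stock out of the universe for the next 21 trading days, which is what makes the rule stricter than a simple average-volume threshold.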

## Penny Stock Filter

In addition to the liquidity filter, we will also filter out penny stocks by requiring the closing price to be above $1.00 on each of the past 21 trading days. Penny stocks often undergo dramatic price jumps and drops that, if included in the analysis, can bias the results and make it harder to interpret overall factor performance.

In [4]:
# require a closing price above $1.00 on each of the last 21 trading days
base_universe = (EquityPricing.close.latest > 1.00).all(21, mask=base_universe)
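
Because each step passes the previous filter as `mask`, the intent is that every rule narrows the universe further, so a stock must satisfy all three rules to remain. A minimal plain-Python sketch of that combined logic follows; the tickers, prices, and volumes are made up for illustration, and `in_universe` is a hypothetical helper.

```python
WINDOW = 21

# Hypothetical securities with synthetic 21-day histories.
securities = {
    "AAA": {"category": "Domestic Common Stock",
            "volume": [500] * WINDOW, "close": [25.0] * WINDOW},
    "BBB": {"category": "ADR Common Stock",
            "volume": [500] * WINDOW, "close": [25.0] * WINDOW},
    "CCC": {"category": "Domestic Common Stock",
            "volume": [500] * (WINDOW - 1) + [0], "close": [25.0] * WINDOW},
    "DDD": {"category": "Domestic Common Stock Primary Class",
            "volume": [500] * WINDOW, "close": [0.80] * WINDOW},
}


def in_universe(sec):
    """Apply the category, liquidity, and penny stock rules together."""
    category_ok = ("Domestic Common" in sec["category"]
                   and "Secondary" not in sec["category"])
    liquid = all(v > 0 for v in sec["volume"][-WINDOW:])
    not_penny = all(c > 1.00 for c in sec["close"][-WINDOW:])
    return category_ok and liquid and not_penny


selected = sorted(t for t, s in securities.items() if in_universe(s))
print(selected)
# ['AAA']
```

Here `BBB` fails the category rule, `CCC` fails the liquidity rule, and `DDD` fails the penny stock rule.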

## Helper File

To be able to reuse the base universe, we put it in a separate file, [universe.py](universe.py). The universe can be imported and used as follows:

In [5]:
from codeload.fundamental_factors.universe import BaseUniverse

universe = BaseUniverse()

***

## *Next Up*

Lesson 3: [Basic Usage](Lesson03-Basic-Usage.ipynb)