<a href="https://www.quantrocket.com"><img alt="QuantRocket logo" src="https://www.quantrocket.com/assets/img/notebook-header-logo.png"></a><br>
<a href="https://www.quantrocket.com/disclaimer/">Disclaimer</a>

# Universe Selection

As a reminder, QVAL stipulates the following universe: 

* all NYSE stocks
* exclude financials, ADRs, REITs
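
The universe rule above amounts to a set difference: start from all NYSE stocks and remove anything flagged as a financial, an ADR, or a REIT. The following is a minimal sketch of that logic on an invented DataFrame; the rows are hypothetical, and the column names (`Sector`, `Industry`, `LongName`) are assumed to mirror the IB master file columns used in the rest of this notebook.

```python
import pandas as pd

# Toy stand-in for a securities master file. All rows are invented for illustration.
securities = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "LongName": ["ALPHA CORP", "BETA BANK", "GAMMA REALTY TRUST",
                 "DELTA PLC-SPON ADR", "EPSILON INC"],
    "Sector": ["Industrial", "Financial", "Financial", "Energy", "Technology"],
    "Industry": ["Machinery", "Banks", "REITS", "Oil&Gas", "Software"],
})

# One boolean mask per exclusion rule.
is_financial = securities.Sector == "Financial"
is_reit = securities.Industry.fillna("").str.contains("REIT")
is_adr = securities.LongName.fillna("").str.contains(" ADR")

# Keep only rows that match none of the exclusion rules.
qval_universe = securities[~(is_financial | is_reit | is_adr)]
print(qval_universe.Symbol.tolist())  # → ['AAA', 'EEE']
```

In the notebook itself the exclusions are stored as separate named universes rather than applied as masks, which keeps each rule reusable on its own.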

To use Sharadar fundamentals with IB price data, we first need to re-create our universes in the IB master database (`quantrocket.master.main.sqlite`), since we previously created the universes in the Sharadar master database (`quantrocket.master.sharadar.sqlite`).

## Collect NYSE listings

The first step is to collect NYSE listing details from IB. To do so, start IB Gateway:

In [1]:
from quantrocket.launchpad import start_gateways
start_gateways(wait=True)

{'ibg1': {'status': 'running'}}

Then collect the NYSE stock listings:

In [2]:
from quantrocket.master import collect_listings
collect_listings(exchanges="NYSE", sec_types="STK")

{'status': 'the listing details will be collected asynchronously'}

Monitor flightlog for the completion message:

```
quantrocket.master: INFO Collecting NYSE STK listings from IB website
quantrocket.master: INFO Requesting details for 8520 NYSE listings found on IB website (expected runtime: 0:20:09)
quantrocket.master: INFO Saved 3111 NYSE listings to securities master database
```
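
If you save the log lines to a file or capture them as text, the completion message can be picked out programmatically. The sketch below is only an illustration: the pattern is inferred from the single sample message shown above, and the helper name `find_saved_count` is hypothetical rather than part of any QuantRocket API.

```python
import re

# Sample lines copied from the flightlog output shown above.
log_lines = [
    "quantrocket.master: INFO Collecting NYSE STK listings from IB website",
    "quantrocket.master: INFO Requesting details for 8520 NYSE listings found on IB website (expected runtime: 0:20:09)",
    "quantrocket.master: INFO Saved 3111 NYSE listings to securities master database",
]

# Assumed message format, inferred only from the sample above.
SAVED_PATTERN = re.compile(r"Saved (\d+) (\w+) listings to securities master database")

def find_saved_count(lines):
    """Return (count, exchange) from the first completion message, or None if absent."""
    for line in lines:
        match = SAVED_PATTERN.search(line)
        if match:
            return int(match.group(1)), match.group(2)
    return None

result = find_saved_count(log_lines)
print(result)  # → (3111, 'NYSE')
```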

## Create Universes

### All NYSE securities

First, download a CSV of all NYSE securities from the IB master: 

In [3]:
from quantrocket.master import download_master_file
download_master_file("nyse_securities.csv", exchanges="NYSE")

We can use the file to create the universe of all NYSE securities:

In [4]:
from quantrocket.master import create_universe
create_universe("nyse-stk", "nyse_securities.csv")

{'code': 'nyse-stk',
 'provided': 3390,
 'inserted': 3390,
 'total_after_insert': 3390}

### Financials

Next we create a universe of financials so we can exclude them.

First, load the securities into pandas and list the sectors:

In [5]:
import pandas as pd
nyse_securities = pd.read_csv("nyse_securities.csv")
nyse_securities.Sector.unique()

array(['Consumer, Cyclical', 'Consumer, Non-cyclical', 'Financial', nan,
       'Basic Materials', 'Utilities', 'Industrial', 'Communications',
       'Energy', 'Government', 'Technology', 'Diversified'], dtype=object)

In the IB data, the financial sector is called "Financial". We filter the DataFrame to stocks in this sector, write them to a file (we use an in-memory file so as not to clutter the hard drive), and upload the file to create the universe of financial stocks:

In [6]:
import io
f = io.StringIO()
nyse_securities[nyse_securities.Sector == "Financial"].to_csv(f)
create_universe("nyse-financials", f)

{'code': 'nyse-financials',
 'provided': 1035,
 'inserted': 1035,
 'total_after_insert': 1035}
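
The in-memory file pattern used above can be tried in isolation. This is a small sketch with an invented DataFrame showing what ends up in the `io.StringIO` buffer. One detail worth knowing: after `to_csv` writes to the buffer, the stream position sits at the end, so code that reads the buffer directly needs `getvalue()` or `seek(0)` first. The sketch makes no assumption about how any particular consumer handles the buffer.

```python
import io
import pandas as pd

# Invented rows for illustration only.
df = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC", "DDD"],
    "Sector": ["Energy", "Financial", "Technology", "Financial"],
})

# Write the filtered rows to an in-memory text buffer instead of a file on disk.
f = io.StringIO()
df[df.Sector == "Financial"].to_csv(f, index=False)

# getvalue() returns the full contents regardless of the stream position.
csv_text = f.getvalue()

# A plain read() at this point would return "" because the position is at the end,
# so rewind before reading the buffer back.
f.seek(0)
round_trip = pd.read_csv(f)
print(round_trip.Symbol.tolist())  # → ['BBB', 'DDD']
```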

### REITs

Next we create a universe of REITs.

> In the IB data, all REITs are actually categorized under the Financial sector, meaning that REITs would be excluded when we exclude financials, even if we didn't create a separate REIT universe.

From inspecting the master file we know that REITs are identified in the "Industry" column:

In [7]:
f = io.StringIO()
nyse_securities[nyse_securities.Industry.fillna("").str.contains("REIT")].to_csv(f)
create_universe("nyse-reits", f)

{'code': 'nyse-reits',
 'provided': 393,
 'inserted': 393,
 'total_after_insert': 393}
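
The claim that every REIT falls under the Financial sector can be checked directly rather than taken on trust. The following sketch shows one way to do it on an invented DataFrame; to run the same check on real data, substitute the `nyse_securities` DataFrame loaded earlier. Matching on `Symbol` is an assumption made for this toy example.

```python
import pandas as pd

# Invented rows for illustration only; the last row mimics a security with missing data.
securities = pd.DataFrame({
    "Symbol": ["AAA", "BBB", "CCC", "DDD"],
    "Sector": ["Financial", "Financial", "Energy", None],
    "Industry": ["REITS", "Banks", "Oil&Gas", None],
})

reits = securities[securities.Industry.fillna("").str.contains("REIT")]
financials = securities[securities.Sector == "Financial"]

# Any REIT that is not also in the financials set would show up here.
reits_outside_financials = reits[~reits.Symbol.isin(financials.Symbol)]
all_reits_are_financials = reits_outside_financials.empty
print(all_reits_are_financials)  # → True
```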

### ADRs

To find ADRs in the IB master file, we have to search the `LongName` field for the text "ADR". 

> Note the space in front of " ADR" in the search below, which is intended to prevent matching a word that ends with "ADR". Consider a regex search for finer-grained matching.

First have a peek:

In [8]:
adrs = nyse_securities[nyse_securities.LongName.str.contains(" ADR", na=False)]
adrs[["Symbol","LongName"]].head()

|    | Symbol | LongName |
|----|--------|----------|
| 22 | AMX | AMERICA MOVIL-SPN ADR CL L |
| 35 | AU | ANGLOGOLD ASHANTI-SPON ADR |
| 44 | BBVA | BANCO BILBAO VIZCAYA-SP ADR |
| 49 | BCS | BARCLAYS PLC-SPONS ADR |
| 60 | BHP | BHP BILLITON LTD-SPON ADR |
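
The note above suggests a regex search for finer-grained matching. As a minimal sketch of what that could look like, the example below compares the plain substring search with a word-boundary pattern (`\bADR\b`). The first two names come from the output above; the others are hypothetical and chosen to show where the two approaches differ.

```python
import pandas as pd

names = pd.Series([
    "AMERICA MOVIL-SPN ADR CL L",   # from the output above
    "BARCLAYS PLC-SPONS ADR",       # from the output above
    "EXAMPLE ADRIATIC HOLDINGS",    # hypothetical: a word that merely starts with "ADR"
    "EXAMPLE CORP",                 # hypothetical: no match expected
    None,                           # missing name
])

# Plain substring search with a leading space, as in the notebook.
plain = names.str.contains(" ADR", regex=False, na=False)

# Word-boundary regex: "ADR" must stand alone as a whole word.
whole_word = names.str.contains(r"\bADR\b", regex=True, na=False)

print(plain.tolist())       # → [True, True, True, False, False]
print(whole_word.tolist())  # → [True, True, False, False, False]
```

The leading space guards against words that end in "ADR" but not against words that begin with it, which is the case the word-boundary pattern also handles.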


Then create the ADR universe:

In [9]:
f = io.StringIO()
adrs.to_csv(f)
create_universe("nyse-adrs", f)

{'code': 'nyse-adrs',
 'provided': 144,
 'inserted': 144,
 'total_after_insert': 144}

***

## *Next Up*

Part 2: [Collect IB Historical Data](Part2-Collect-IB-Historical-Data.ipynb)