<img alt="QuantRocket logo" src="https://www.quantrocket.com/assets/img/notebook-header-logo.png">

# Define a universe

QuantRocket relies heavily on the concept of universes, which are user-defined groupings of securities. Universes provide a convenient way to refer to and manipulate groups of securities when collecting historical data, running a trading strategy, etc. You can create universes based on exchanges, security types, sectors, liquidity, or any criteria you like. A universe could consist of one or two securities or one or two thousand securities.

## Download master file
To create our first universe, we will download a CSV of all the listings for our exchange, pare down the CSV to a smaller number of listings, then upload the pared-down CSV to create our universe. The usage guide outlines [several other ways to create universes](https://www.quantrocket.com/docs/#universe).

First, download the listings from the securities master database to a CSV file (substituting your exchange for NASDAQ):

In [1]:
from quantrocket.master import download_master_file
download_master_file("securities.csv", exchanges="NASDAQ", sec_types="STK")

> In QuantRocket terminology, the word "collect" refers to retrieving data from IB (Interactive Brokers) and saving it to your QuantRocket databases. The word "download" refers to retrieving data out of your QuantRocket databases into a file for use by you or your algorithms.

We can load the CSV into pandas:

In [2]:
import pandas as pd
securities = pd.read_csv("securities.csv")
securities.head()

| | ConId | Symbol | Etf | SecType | PrimaryExchange | Currency | LocalSymbol | TradingClass | MarketName | LongName | ... | UnderSymbol | UnderSecType | MarketRuleIds | Strike | Right | Cusip | EvRule | EvMultiplier | Delisted | DateDelisted |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4157 | ADI | 0 | STK | NASDAQ | USD | ADI | NMS | NMS | ANALOG DEVICES INC | ... | | | 26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,2... | 0.0 | | ISIN:US0326541051 | | 0.0 | 0 | |
| 1 | 4391 | AMD | 0 | STK | NASDAQ | USD | AMD | NMS | NMS | ADVANCED MICRO DEVICES | ... | | | 26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,2... | 0.0 | | ISIN:US0079031078 | | 0.0 | 0 | |
| 2 | 4661 | ADP | 0 | STK | NASDAQ | USD | ADP | NMS | NMS | AUTOMATIC DATA PROCESSING | ... | | | 26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,2... | 0.0 | | ISIN:US0530151036 | | 0.0 | 0 | |
| 3 | 4691 | AVT | 0 | STK | NASDAQ | USD | AVT | NMS | NMS | AVNET INC | ... | | | 26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,2... | 0.0 | | ISIN:US0538071038 | | 0.0 | 0 | |
| 4 | 5552 | CDNS | 0 | STK | NASDAQ | USD | CDNS | NMS | NMS | CADENCE DESIGN SYS INC | ... | | | 26,26,26,26,26,26,26,26,26,26,26,26,26,26,26,2... | 0.0 | | ISIN:US1273871087 | | 0.0 | 0 | |


Note the `ConId` column in the CSV file: ConId is short for "contract ID" and is IB's unique identifier for a particular security or contract. ConIds are used throughout QuantRocket to refer to securities.
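Because each ConId identifies exactly one security, it works well as a lookup key. The following is a minimal sketch using three rows copied from the output above; the variable names `sample` and `by_conid` are illustrative and not part of QuantRocket.

```python
import pandas as pd

# Three rows copied from the master file output above (only three columns kept)
sample = pd.DataFrame({
    "ConId": [4157, 4391, 4661],
    "Symbol": ["ADI", "AMD", "ADP"],
    "LongName": [
        "ANALOG DEVICES INC",
        "ADVANCED MICRO DEVICES",
        "AUTOMATIC DATA PROCESSING",
    ],
})

# ConIds are expected to be unique, so check before using them as the index
assert sample["ConId"].is_unique
by_conid = sample.set_index("ConId")

# Look up a security's symbol by its ConId
symbol = by_conid.loc[4391, "Symbol"]
print(symbol)
```

Indexing by ConId rather than by symbol avoids ambiguity when the same ticker is listed on more than one exchange.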

## Filter master file

QuantRocket supports working with large universes such as every stock on an exchange. However, for this introductory tutorial we will pare down the master file to create a modestly sized universe. This will keep the tutorial fast and simple as well as help illustrate the flexibility of universe creation; you can create larger universes later. 

To pare down the master file we'll use `qgrid`, a tool that provides Excel-like filtering and sorting of DataFrames inside Jupyter notebooks. We limit the number of columns to make the grid more readable:

In [None]:
import qgrid
widget = qgrid.show_grid(securities[["ConId","Symbol","LongName","Sector","Industry","Category"]])
widget

> The image below is a static screenshot of the grid. Execute the cell above to see the interactive grid.

![QGrid widget](static/qgrid-widget.png)

Use the grid to filter the DataFrame by symbol, name, or sector. You can hand-pick a list of symbols, select a sector and industry, or choose a random range of ConIds.

When you've filtered the grid to a smaller size (say 50-100 securities), use `get_changed_df()` to access the filtered DataFrame:

In [4]:
filtered_securities = widget.get_changed_df()
filtered_securities.head()

| | ConId | Symbol | LongName | Sector | Industry | Category |
|---|---|---|---|---|---|---|
| 24 | 265598 | AAPL | APPLE INC | Technology | Computers | Computers |
| 70 | 267892 | COKE | COCA-COLA BOTTLING CO CONSOL | Consumer, Non-cyclical | Beverages | Beverages-Non-alcoholic |
| 226 | 274105 | SBUX | STARBUCKS CORP | Consumer, Cyclical | Retail | Retail-Restaurants |
| 331 | 3691937 | AMZN | AMAZON.COM INC | Communications | Internet | E-Commerce/Products |
| 766 | 15124833 | NFLX | NETFLIX INC | Communications | Internet | Internet Content-Entmnt |
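
If you prefer code to a widget, the same kinds of filters can be written directly in pandas. This is only a sketch of an alternative, not the approach the tutorial relies on: it uses the five rows shown above, and the names `sample`, `internet`, `hand_picked`, and `conid_range`, as well as the ConId bounds, are made up for illustration.

```python
import pandas as pd

# Rows copied from the filtered output above
sample = pd.DataFrame({
    "ConId": [265598, 267892, 274105, 3691937, 15124833],
    "Symbol": ["AAPL", "COKE", "SBUX", "AMZN", "NFLX"],
    "Sector": [
        "Technology",
        "Consumer, Non-cyclical",
        "Consumer, Cyclical",
        "Communications",
        "Communications",
    ],
    "Industry": ["Computers", "Beverages", "Retail", "Internet", "Internet"],
})

# Select a sector and industry
internet = sample[
    (sample["Sector"] == "Communications") & (sample["Industry"] == "Internet")
]

# Hand-pick a list of symbols
hand_picked = sample[sample["Symbol"].isin(["AAPL", "SBUX"])]

# Keep a range of ConIds (bounds are inclusive and chosen arbitrarily here)
conid_range = sample[sample["ConId"].between(265000, 275000)]

print(internet["Symbol"].tolist())
print(hand_picked["Symbol"].tolist())
print(conid_range["Symbol"].tolist())
```

Any of these filtered DataFrames could take the place of `filtered_securities` in the steps that follow, since each keeps the `ConId` column.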


## Create universe

To create a universe from the filtered securities, we write the DataFrame to a CSV and upload the CSV. (Only the ConId column in the CSV matters for this purpose; other columns are ignored.) We'll name the universe "demo-stocks":

In [5]:
filtered_securities.to_csv("filtered_securities.csv")
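
Since only the ConId column matters for the upload, you could also write a leaner CSV that contains nothing else. The sketch below is optional and uses a small stand-in DataFrame named `sample` (hypothetical) with an in-memory buffer so the result is easy to inspect; writing to a filename works the same way.

```python
import io

import pandas as pd

# Stand-in for the filtered DataFrame, using three rows from the output above
sample = pd.DataFrame({
    "ConId": [265598, 267892, 274105],
    "Symbol": ["AAPL", "COKE", "SBUX"],
})

# Write only the ConId column and leave out the DataFrame index
buffer = io.StringIO()
sample[["ConId"]].to_csv(buffer, index=False)

csv_text = buffer.getvalue()
print(csv_text)
```

Passing `index=False` keeps the unnamed index column out of the file, which makes the CSV easier to read if you open it later.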

In [6]:
from quantrocket.master import create_universe
create_universe("demo-stocks", infilepath_or_buffer="filtered_securities.csv")

{'code': 'demo-stocks', 'provided': 60, 'inserted': 60, 'total_after_insert': 60}

The function output confirms the name and size of our new universe.
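
As a quick sanity check, you can compare the counts in the response. The sketch below copies the response shown above into a plain dict; the interpretation of the keys (`provided` as the number of ConIds in the uploaded file, `inserted` as the number newly added) is an assumption based on their names, not something confirmed here.

```python
# The response shown above, copied as a plain dict
response = {
    "code": "demo-stocks",
    "provided": 60,
    "inserted": 60,
    "total_after_insert": 60,
}

# Assumption: "provided" counts the ConIds in the uploaded file and
# "inserted" counts those newly added to the universe
not_inserted = response["provided"] - response["inserted"]

summary = (
    f"{response['code']}: {response['total_after_insert']} securities "
    f"({not_inserted} not inserted)"
)
print(summary)
```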

Now that we have a universe, the next step is to collect historical data for our backtest.

***

## *Next Up*

Part 3: [Collect Historical Data](Part3-Collect-Historical-Data.ipynb)