# Notebook Instructions

1. All the <u>code and data files</u> used in this course are available in the downloadable unit of the <u>last section of this course</u>.
2. You can run the notebook document sequentially (one cell at a time) by pressing **shift + enter**. 
3. While a cell is running, a [*] is shown on the left. After the cell is run, the output will appear on the next line.

This course is based on specific versions of Python packages. You can find the details of the packages in <a href='https://quantra.quantinsti.com/quantra-notebook' target="_blank" >this manual</a>.

# Annual Filtering

In a previous notebook on cross-sectional momentum, you took a position on a cross section of stocks and held them for a preset period. This cross section came from a portfolio of stocks that was selected once, at the start of the period over which you ran the strategy.

The portfolio needs to be refreshed periodically to add stocks showing relatively better momentum among the stocks in the S&P 500 universe. 

In this notebook, you will learn how to generate an annual list of filtered stocks. This is done for each year in the dataset. 


<img src="https://d2a032ejo53cab.cloudfront.net/Glossary/TYKwWxyS/flowmom.png"/>




In this notebook, you will perform the following steps:

1. [Read price and volume data](#read)
2. [Generate annual list of filtered stocks](#filter) 

<a id='read'></a> 

## Read price and volume data

You will read the pickle files, which store the price and volume data of the S&P 500 stocks. This data is available in the downloadable unit in the last section of this course. You can also download price and volume data for these stocks from finance.yahoo.com.

Syntax: 
```python
import pandas as pd
pd.read_pickle(filename)
```
Parameter:
1. filename: name of the file, as a string
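
As a minimal sketch of how `pd.read_pickle` pairs with `DataFrame.to_pickle`, the snippet below writes a small dataframe to a temporary `.bz2` file and reads it back. The toy data, the tickers `AAA` and `BBB`, and the file name `toy_prices.bz2` are made up for illustration and are not part of the course data.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy prices standing in for the course's price file
toy_prices = pd.DataFrame(
    {"AAA": [10.0, 10.5, 11.0], "BBB": [20.0, 19.5, 19.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    # The file name is made up; the '.bz2' suffix lets pandas infer
    # bz2 compression when writing the pickle
    toy_file = os.path.join(tmp_dir, "toy_prices.bz2")
    toy_prices.to_pickle(toy_file)

    # read_pickle infers the compression from the extension as well
    restored = pd.read_pickle(toy_file)

# Check that the round trip preserved the data
print(restored.equals(toy_prices))
```

Because the compression is inferred from the file extension by default, the course files ending in `.bz2` can be read without passing any extra arguments.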

In [1]:
# Import pandas and numpy
import pandas as pd
import numpy as np

# Import matplotlib and set the style
import matplotlib.pyplot as plt
%matplotlib inline
# Note: matplotlib 3.6 and later name this style 'seaborn-v0_8-darkgrid'
plt.style.use('seaborn-darkgrid')

# The data is stored in the directory 'data'
path = '../data/'

def read_data(filename):
    data = pd.read_pickle(path+filename)
    return data

# Read price and volume data
stocks_price = read_data('spy_stocks_price_2010_2020.bz2')
stocks_volume = read_data('spy_stocks_volume_2010_2020.bz2')

# Remove the first day as it is the only data point of that year
stocks_price = stocks_price[1:]
stocks_volume = stocks_volume[1:]

<a id='filter'></a> 
## Generate annual list of filtered stocks

1. Multiply price and volume over the first `n` periods in a year.
2. Take the mean of this product to get the average dollar volume.

Average dollar volume and average daily turnover (defined as the ratio of the number of shares traded to the number of shares outstanding) are used to predict the magnitude and the persistence of future price momentum. Momentum is stronger among stocks with a high average turnover or a high average dollar volume.

For illustration purposes, `n` is taken as 90.
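
The average dollar volume calculation can be sketched on toy data as follows. The tickers, prices, volumes, and the small values `n = 3` and `top_k = 2` are assumptions chosen so that the arithmetic is easy to follow. They are not taken from the course dataset.

```python
import pandas as pd

dates = pd.bdate_range("2020-01-01", periods=4)

# Hypothetical prices and volumes for three made-up tickers
toy_price = pd.DataFrame(
    {"AAA": [10.0, 10.0, 10.0, 10.0],
     "BBB": [50.0, 50.0, 50.0, 50.0],
     "CCC": [5.0, 5.0, 5.0, 5.0]},
    index=dates,
)
toy_volume = pd.DataFrame(
    {"AAA": [100, 200, 300, 1000],
     "BBB": [10, 10, 10, 10],
     "CCC": [1000, 1000, 1000, 1000]},
    index=dates,
)

n = 3      # number of periods to average over (90 in the notebook)
top_k = 2  # number of stocks to keep (100 in the notebook)

# Multiply price and volume over the first n periods
dollar_volume = toy_price.iloc[:n] * toy_volume.iloc[:n]

# Average the product to get the average dollar volume per stock
avg_dollar_volume = dollar_volume.mean()

# Rank the stocks and keep the top_k tickers
top_stocks = avg_dollar_volume.sort_values(ascending=False).index[:top_k]

print(avg_dollar_volume)
print(list(top_stocks))
```

Here `AAA` averages (1000 + 2000 + 3000) / 3 = 2000, `BBB` averages 500, and `CCC` averages 5000, so the two stocks kept are `CCC` and `AAA`. The fourth row is ignored because only the first `n` rows enter the average.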

We group the data into years using the `groupby()` function. The syntax is given below:

Syntax: 
```python
dataframe_name.groupby(list_of_column_names)
```
1. dataframe_name: name of the dataframe whose rows are to be grouped
2. list_of_column_names: ordered list of the columns (or other keys, such as the year of the index) by which the original dataframe is grouped
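
As a small illustration of grouping by year, the sketch below groups a toy dataframe on the calendar year of its `DatetimeIndex`, which is the same kind of key used later in this notebook. The data and the ticker `AAA` are hypothetical.

```python
import pandas as pd

# Hypothetical prices spanning two calendar years
toy = pd.DataFrame(
    {"AAA": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(
        ["2019-12-30", "2019-12-31", "2020-01-02", "2020-01-03"]
    ),
)

# Group the rows by the calendar year of the index
year_groups = toy.groupby(toy.index.year)

# Number of rows that fall in each year
rows_per_year = year_groups.size()

# First row of each yearly group
first_rows = year_groups.first()

print(rows_per_year)
print(first_rows)
```

Each group holds only the rows of one year, so any function applied to the groups sees one year of data at a time.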

In [2]:
def annual_universe_selector(stocks_price, stocks_volume):
    # Align the volume data with the dates of this year's price group
    stocks_volume = stocks_volume.loc[stocks_price.index]
    # Average dollar volume over the first 90 rows of the group,
    # sorted in descending order; keep the top 100 tickers
    filtered_stocks = (stocks_price[:90] * stocks_volume[:90]
                       ).mean().sort_values(ascending=False).index[:100]

    # Return the filtered stocks as a Series
    return pd.Series(filtered_stocks)

In [3]:
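# Create yearly groups of the dataset on shifted data to avoid look-ahead bias.
# Only the prices are shifted by 90 rows; the volumes are aligned on each
# group's own dates inside annual_universe_selector
year_groups = stocks_price.shift(90).groupby([stocks_price.index.year])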

# Use the annual_universe_selector function to generate this list for each year
yearly_filtered_stocks = pd.DataFrame(year_groups.apply(annual_universe_selector,stocks_volume))
yearly_filtered_stocks.head()

| Date | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | ... | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | MMM | ABT | ABBV | ABMD | ACN | ATVI | ADBE | AMD | AAP | AES | ... | CBRE | CDW | CE | CNC | CNP | CTL | CERN | CF | SCHW | CHTR |
| 2011 | AAPL | BAC | C | GOOGL | MSFT | CSCO | INTC | JPM | XOM | NFLX | ... | MA | DVN | AMAT | AIG | MO | CSX | DD | CL | CF | WBA |
| 2012 | AAPL | BAC | GOOGL | AMZN | C | MSFT | NFLX | XOM | JPM | ORCL | ... | MMM | EA | ANTM | DD | NVDA | CTSH | CMG | NKE | CL | COG |
| 2013 | AAPL | GOOGL | BAC | MSFT | FB | C | AMZN | XOM | INTC | JPM | ... | ALXN | HON | HES | MU | EXC | PXD | BAX | NTAP | PSX | HUM |
| 2014 | AAPL | FB | GOOGL | AMZN | BAC | VZ | C | MSFT | TWTR | NFLX | ... | APA | NEM | LLY | TXN | ANTM | TMO | MCK | DG | AAL | MYL |


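In the code above, you learnt how to generate an annual list of stocks. This filtered list is generated based on the average dollar volume of the stocks, computed with prices from the previous 90 days. You can use these lists to improve the cross-sectional momentum strategy covered in the previous units.

Note that the 2010 row should not be read as a ranking. The shifted prices are missing for the first 90 rows of the dataset, so the average dollar volume is `NaN` for every stock in that year, and the row most likely just reflects the column order of the dataset.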
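
One possible variation, which is not the notebook's own code, is to lag the dollar volume itself so that both price and volume come from before the start of each year, and to skip years that have no prior history. The sketch below shows this on toy data. The tickers, values, `n = 2`, `top_k = 1`, and the dictionary `universe` are all assumptions made for illustration.

```python
import pandas as pd

idx = pd.to_datetime(
    ["2019-12-27", "2019-12-30", "2019-12-31",
     "2020-01-02", "2020-01-03", "2020-01-06"]
)

# Hypothetical data: AAA trades heavily in late 2019, BBB in early 2020
price = pd.DataFrame({"AAA": [10.0] * 6, "BBB": [10.0] * 6}, index=idx)
volume = pd.DataFrame(
    {"AAA": [1, 100, 100, 1, 1, 1],
     "BBB": [1, 10, 10, 500, 500, 500]},
    index=idx,
)

n = 2      # look-back length in rows (90 in the notebook)
top_k = 1  # number of stocks to keep (100 in the notebook)

# Shift the product, so price and volume are lagged together
lagged_dollar_volume = (price * volume).shift(n)

universe = {}
for year, group in lagged_dollar_volume.groupby(lagged_dollar_volume.index.year):
    # The first n rows of the group hold values from the n rows before them
    avg = group.iloc[:n].mean().dropna()
    if avg.empty:
        # No history before this year, so no ranking is possible
        continue
    universe[year] = list(avg.sort_values(ascending=False).index[:top_k])

print(universe)
```

With these numbers, 2019 is skipped because nothing precedes it, and the 2020 list is built only from the December 2019 rows. `AAA` is therefore selected for 2020, even though `BBB` has the larger dollar volume once 2020 begins.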
