<a href="https://colab.research.google.com/github/byunsy/quantitative-momentum/blob/main/Quantitative_Momentum_Strategy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Quantitative Momentum Strategy


Momentum investing is a trading strategy in which investors buy stocks that are rising and sell them when they appear to have peaked.

In this quantitative momentum strategy study, we will build an investment strategy that **selects the 50 stocks with the highest price momentum** (i.e. the highest high-quality-momentum score). Afterwards, we will calculate the recommended number of shares to purchase based on an equally weighted portfolio of the selected 50 stocks. 



## Import Necessary Modules

In [1]:
import numpy as np 
import pandas as pd 
import requests 
import math 

We need to upload `sp_500_stocks.csv` to obtain the complete listing of companies in the S&P 500. 


In [2]:
from google.colab import files
uploaded = files.upload()

## Obtain the S&P 500 Stock Listing

Get a list of all the constituents in the S&P 500. 

In [3]:
sp500 = pd.read_csv('sp_500_stocks.csv')
sp500

Unnamed: 0,Symbol
0,A
1,AAL
2,AAP
3,AAPL
4,ABBV
...,...
500,YUM
501,ZBH
502,ZBRA
503,ZION


## API Call

We first need a test API token to use the IEX Cloud APIs (this token will remain private). You can get a sandbox test token from the IEX Cloud API website. 

In [4]:
from iex_api import IEX_CLOUD_API_TOKEN
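The `iex_api` module itself is not shown in this notebook. One way such a module could keep the token out of the notebook is to read it from an environment variable. The sketch below assumes a variable named `IEX_CLOUD_API_TOKEN`; the helper name `load_token` is hypothetical and not part of the original module.

```python
import os

def load_token(env=None, name="IEX_CLOUD_API_TOKEN"):
    """Return the API token stored in an environment variable.

    The variable name and this helper are assumptions made for
    illustration; they are not taken from the original `iex_api` module.
    """
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable first.")
    return token

# Example with an explicit mapping instead of the real environment.
print(load_token({"IEX_CLOUD_API_TOKEN": "abc123"}))
```

Passing a plain dictionary makes the helper easy to try without touching the real environment.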

**NOTE:**

The structure of `api_url` depends on the endpoint we are querying. We need to read the IEX Cloud API docs carefully to know the correct structure for each one. 
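As a rough sketch of what "structure" means here, the two URL shapes this notebook calls (the single-symbol `stats` endpoint and the `market/batch` endpoint) could be assembled with small helpers. The helper names `stats_url` and `batch_url` are hypothetical, and the paths are simply the ones that appear in this notebook's requests, not a complete description of the IEX Cloud API.

```python
from urllib.parse import urlencode

BASE_URL = "https://sandbox.iexapis.com/stable"

def stats_url(symbol, token):
    """URL for the single-symbol stats endpoint used in this notebook."""
    return f"{BASE_URL}/stock/{symbol}/stats?{urlencode({'token': token})}"

def batch_url(symbols, token, types=("stats", "quote")):
    """URL for the batch endpoint, with symbols and types comma-joined."""
    query = urlencode(
        {"types": ",".join(types), "symbols": ",".join(symbols), "token": token},
        safe=",",  # keep the commas readable instead of percent-encoding them
    )
    return f"{BASE_URL}/stock/market/batch/?{query}"

print(stats_url("MSFT", "TOKEN"))
print(batch_url(["A", "AAL"], "TOKEN"))
```

Building the query string with `urlencode` avoids hand-concatenating parameters, which is where most URL mistakes tend to creep in.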

In [5]:
# As an example of what IEX Cloud returns, we query the stats for Microsoft
symbol='MSFT'
api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/stats?token={IEX_CLOUD_API_TOKEN}'
ms_data = requests.get(api_url).json()

ms_data

{'avg10Volume': 32274084,
 'avg30Volume': 27415686,
 'beta': 1.1564834753412272,
 'companyName': 'Microsoft Corporation',
 'day200MovingAvg': 218.28,
 'day30ChangePercent': 0.023121686663047432,
 'day50MovingAvg': 225.36,
 'day5ChangePercent': 0.0648883834482844,
 'dividendYield': 0.009453471136648016,
 'employees': 0,
 'exDividendDate': '2020-11-16',
 'float': 0,
 'marketcap': 1746783765732,
 'maxChangePercent': 8.732411350019836,
 'month1ChangePercent': 0.009117084712751675,
 'month3ChangePercent': 0.05488515283244261,
 'month6ChangePercent': 0.07287746793386388,
 'nextDividendDate': '2021-02-14',
 'nextEarningsDate': '0',
 'peRatio': 35.99275346774564,
 'sharesOutstanding': 7632593509,
 'ttmDividendRate': 2.1596155903237766,
 'ttmEPS': 6.2,
 'week52change': 0.3837531657089547,
 'week52high': 237.52,
 'week52low': 140.68,
 'year1ChangePercent': 0.379391682883754,
 'year2ChangePercent': 1.2093480958762546,
 'year5ChangePercent': 3.7377482466658165,
 'ytdChangePercent': 0.0162421614067}

We can now pull specific fields out of the response by looking them up by key. 

In [6]:
print("One Year Percentage Change:", ms_data['year1ChangePercent'])

One Year Percentage Change: 0.379391682883754


**NOTE:**

Since we are using sandbox test APIs, the values returned are not real. 

## Data Analysis


First, let's retrieve the one-year price return for every constituent of the S&P 500. This tells us which companies had the highest momentum over the past year. 

We will use batch API calls, which fetch up to 100 symbols per request and are therefore much more efficient than one request per symbol.

In [7]:
# For Batch API Call
def chunks(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
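To see what the generator produces, here is a small self-contained run on five of the tickers from the listing above, using a batch size of 2 instead of 100 so the result is easy to read.

```python
def chunks(lst, n):
    """Yield successive n-sized slices of lst; the last one may be shorter."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

symbols = ["A", "AAL", "AAP", "AAPL", "ABBV"]
batches = list(chunks(symbols, 2))

# Each batch becomes one comma-separated string for the API query.
demo_strings = [",".join(batch) for batch in batches]
print(demo_strings)  # ['A,AAL', 'AAP,AAPL', 'ABBV']
```

With batches of 100, a listing of just over 500 symbols needs 6 requests rather than one request per symbol.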

In [8]:
symbol_batch = list(chunks(sp500['Symbol'], 100))
symbol_strings = []

for batch in symbol_batch:
    symbol_strings.append(','.join(batch))

df_columns = ['Symbol', 'Price', 'One-Year Price Return']
rows = []

for symbol_string in symbol_strings:

    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=stats,quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()

    for symbol in symbol_string.split(','):
        rows.append([symbol, data[symbol]['quote']['latestPrice'],
                     data[symbol]['stats']['year1ChangePercent']])

# DataFrame.append was removed in pandas 2.0, so we collect the rows in a
# list and build the data frame in one step.
df = pd.DataFrame(rows, columns=df_columns)
        
df

Unnamed: 0,Symbol,Price,One-Year Price Return
0,A,131.34,0.414052
1,AAL,16.39,-0.42886
2,AAP,168.26,0.123336
3,AAPL,142.85,0.773768
4,ABBV,115.11,0.340278
...,...,...,...
500,YUM,111.82,0.0254913
501,ZBH,164.49,0.0762515
502,ZBRA,411.77,0.624899
503,ZION,48.52,0.0446758
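The loop above indexes straight into the response, so a symbol or field that is absent from the payload would raise a `KeyError`. A more defensive parse might look like the sketch below. The function name `rows_from_batch` and the mock payload are hypothetical, and whether the sandbox ever omits a symbol or returns an empty `stats` object is an assumption rather than something this notebook demonstrates.

```python
import pandas as pd

def rows_from_batch(data, symbols):
    """Build [symbol, price, one-year return] rows from a batch payload.

    Symbols missing from the payload are skipped, and a missing field
    becomes None instead of raising KeyError.
    """
    rows = []
    for symbol in symbols:
        entry = data.get(symbol)
        if entry is None:
            continue
        quote = entry.get("quote") or {}
        stats = entry.get("stats") or {}
        rows.append([symbol, quote.get("latestPrice"),
                     stats.get("year1ChangePercent")])
    return rows

# Hypothetical payload shaped like the batch response used above.
mock = {
    "A": {"quote": {"latestPrice": 131.34},
          "stats": {"year1ChangePercent": 0.414052}},
    "AAL": {"quote": {"latestPrice": 16.39}, "stats": None},
}
demo_rows = rows_from_batch(mock, ["A", "AAL", "ZZZZ"])
demo_df = pd.DataFrame(demo_rows,
                       columns=["Symbol", "Price", "One-Year Price Return"])
print(demo_df)
```

The missing return for `AAL` shows up as `NaN` in the data frame, which can then be handled explicitly before ranking.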


## 50 Highest-Momentum Stocks in the S&P 500

We can now sort the data frame obtained above and keep only the top 50 companies.

In [9]:
df_sorted = df.sort_values('One-Year Price Return', ascending=False)
df_sorted = df_sorted[:50]

# inplace=True makes changes on this current data frame.
df_sorted.reset_index(drop=True, inplace=True)

# Print the data frame 
df_sorted

Unnamed: 0,Symbol,Price,One-Year Price Return
0,CARR,39.92,2.3132
1,FCX,31.89,1.51745
2,LB,45.53,1.3426
3,NVDA,561.1,1.21355
4,ALB,175.4,1.20755
5,PYPL,252.0,1.19385
6,ALGN,548.75,0.941802
7,PWR,80.17,0.937272
8,WST,302.75,0.915959
9,ABMD,359.8,0.906723
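The sort-then-slice step can also be written with `DataFrame.nlargest`. The sketch below uses five rows copied from the one-year return table earlier in this notebook and keeps the top 3 instead of the top 50; it assumes the return column has a numeric dtype.

```python
import pandas as pd

demo = pd.DataFrame({
    "Symbol": ["A", "AAL", "AAP", "AAPL", "ABBV"],
    "One-Year Price Return": [0.414052, -0.42886, 0.123336, 0.773768, 0.340278],
})

# Keep the three largest returns, already sorted in descending order.
top3 = demo.nlargest(3, "One-Year Price Return").reset_index(drop=True)
print(top3["Symbol"].tolist())  # ['AAPL', 'A', 'ABBV']
```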


We now have a list of the 50 companies with the highest momentum over the past year. However, this is quite a simple (and somewhat naive) model, because it looks at a single time window.

We will take a closer look using additional fields from the API to get a more holistic view of the S&P 500 companies, which will help us craft a better investing strategy. 

## Momentum Strategy

It is important to distinguish between 'high-quality' and 'low-quality' momentum. 

**High-quality momentum** stocks show **slow and steady outperformance** throughout the entire year, whereas low-quality momentum stocks may owe their return to a sudden (and often short-lived) upward surge. 

From the IEX Cloud API, we will use four price-return factors. For each factor we compute every stock's percentile rank among the S&P 500 constituents, and the HQM (High-Quality Momentum) score is the arithmetic mean of those four percentiles. The four factors are: 

- 1-Year Price Returns
- 6-Month Price Returns
- 3-Month Price Returns
- 1-Month Price Returns
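To illustrate why averaging percentiles across the four windows favours steady performers, consider two made-up stocks. All numbers below are hypothetical percentile ranks chosen for illustration, and `hqm_score` is just a name for the arithmetic mean described above.

```python
from statistics import mean

def hqm_score(percentiles):
    """Arithmetic mean of the four return percentiles."""
    return mean(percentiles)

# Hypothetical percentile ranks, ordered 1-year, 6-month, 3-month, 1-month.
steady = [85, 82, 80, 78]   # consistently strong in every window
spiky = [35, 40, 55, 99]    # one recent surge, unremarkable before that

print(hqm_score(steady))  # 81.25
print(hqm_score(spiky))   # 57.25
```

Even though the second stock has the better one-month percentile, its weak longer windows pull its score well below that of the steady stock.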



In [10]:
hqm_columns = ['Symbol', 'Price', 'HQM Score', 'Num Shares to Purchase', 
               '1-Year Price Ret', '1-Year Return Per',
               '6-Month Price Ret','6-Month Return Per',
               '3-Month Price Ret','3-Month Return Per',
               '1-Month Price Ret','1-Month Return Per']

Let's first fill in the price return columns for each time period (1-year, 6-month, 3-month, 1-month). The percentile, HQM score, and share count columns are filled with an `'N/A'` placeholder for now.

In [11]:
hqm_rows = []

for symbol_string in symbol_strings:

    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=stats,quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_call_url).json()

    for symbol in symbol_string.split(','):
        hqm_rows.append([symbol, data[symbol]['quote']['latestPrice'],'N/A','N/A',
                         data[symbol]['stats']['year1ChangePercent'],'N/A',
                         data[symbol]['stats']['month6ChangePercent'],'N/A',
                         data[symbol]['stats']['month3ChangePercent'],'N/A',
                         data[symbol]['stats']['month1ChangePercent'],'N/A'])

# DataFrame.append was removed in pandas 2.0, so we build the frame in one step.
hqm_df = pd.DataFrame(hqm_rows, columns=hqm_columns)

# Print the data frame 
hqm_df

Unnamed: 0,Symbol,Price,HQM Score,Num Shares to Purchase,1-Year Price Ret,1-Year Return Per,6-Month Price Ret,6-Month Return Per,3-Month Price Ret,3-Month Return Per,1-Month Price Ret,1-Month Return Per
0,A,126.64,,,0.419223,,0.319719,,0.190862,,0.0799613,
1,AAL,15.97,,,-0.424611,,0.4034,,0.2106,,0.0224372,
2,AAP,170.26,,,0.119062,,0.102362,,0.0730322,,0.0282053,
3,AAPL,141.03,,,0.794042,,0.452122,,0.21121,,0.0549944,
4,ABBV,113.74,,,0.352683,,0.16899,,0.3412,,0.0915006,
...,...,...,...,...,...,...,...,...,...,...,...,...
500,YUM,112.19,,,0.0258312,,0.155875,,0.0804741,,-0.00296807,
501,ZBH,162.47,,,0.0767299,,0.221134,,0.12931,,0.071491,
502,ZBRA,424.28,,,0.61762,,0.479265,,0.367908,,0.0764444,
503,ZION,49.13,,,0.0454474,,0.495711,,0.527443,,0.148822,


Now we can calculate each stock's percentile within every time period using `percentileofscore` from the `scipy.stats` module.

In [12]:
from scipy import stats

time_periods = ['1-Year','6-Month','3-Month','1-Month']

# Missing returns (None or NaN) would break the percentile calculation,
# so we first replace them with zeros.
for row in hqm_df.index:
    for time_period in time_periods:
        if pd.isna(hqm_df.loc[row, f'{time_period} Price Ret']):
            hqm_df.loc[row, f'{time_period} Price Ret'] = 0


for row in hqm_df.index:
    for time_period in time_periods:

        hqm_df.loc[row, f'{time_period} Return Per'] = stats.percentileofscore(a=hqm_df[f'{time_period} Price Ret'], 
                                                                               score=hqm_df.loc[row, f'{time_period} Price Ret'])

# Print the data frame    
hqm_df

Unnamed: 0,Symbol,Price,HQM Score,Num Shares to Purchase,1-Year Price Ret,1-Year Return Per,6-Month Price Ret,6-Month Return Per,3-Month Price Ret,3-Month Return Per,1-Month Price Ret,1-Month Return Per
0,A,126.64,,,0.419223,87.3267,0.319719,68.5149,0.190862,64.3564,0.0799613,69.901
1,AAL,15.97,,,-0.424611,1.78218,0.4034,78.0198,0.2106,67.3267,0.0224372,40.198
2,AAP,170.26,,,0.119062,55.0495,0.102362,31.0891,0.0730322,34.6535,0.0282053,43.3663
3,AAPL,141.03,,,0.794042,96.8317,0.452122,83.1683,0.21121,67.5248,0.0549944,59.4059
4,ABBV,113.74,,,0.352683,82.3762,0.16899,44.5545,0.3412,82.3762,0.0915006,74.0594
...,...,...,...,...,...,...,...,...,...,...,...,...
500,YUM,112.19,,,0.0258312,40.7921,0.155875,41.1881,0.0804741,37.4257,-0.00296807,24.7525
501,ZBH,162.47,,,0.0767299,49.1089,0.221134,53.8614,0.12931,50.099,0.071491,66.9307
502,ZBRA,424.28,,,0.61762,93.6634,0.479265,85.3465,0.367908,85.1485,0.0764444,68.7129
503,ZION,49.13,,,0.0454474,43.7624,0.495711,86.3366,0.527443,94.4554,0.148822,89.1089
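The row-by-row loop calls `percentileofscore` once per cell. As a sketch of a vectorized alternative, `Series.rank(pct=True)` computes a comparable quantity in one call: with its default tie handling (average rank), it should agree with the default `kind='rank'` of `percentileofscore` whenever the score is itself one of the values in the column. The tickers and returns below are hypothetical.

```python
import pandas as pd

# Hypothetical returns for four made-up tickers; X and Z are tied.
returns = pd.Series([0.10, 0.40, 0.25, 0.40], index=["W", "X", "Y", "Z"])

# rank(pct=True) gives rank / count, so multiply by 100 for a percentile.
percentiles = returns.rank(pct=True) * 100
print(percentiles.to_dict())
# {'W': 25.0, 'X': 87.5, 'Y': 50.0, 'Z': 87.5}
```

The two tied values share the average of ranks 3 and 4, which is why both land on 87.5.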


## HQM (High Quality Momentum) Score

With the percentiles in place, we can now take their arithmetic mean to get an overall HQM score for each stock. 

In [13]:
from statistics import mean

for row in hqm_df.index:
    momentum_percentiles = []

    for time_period in time_periods:
        momentum_percentiles.append(hqm_df.loc[row, f'{time_period} Return Per'])

    hqm_df.loc[row, 'HQM Score'] = mean(momentum_percentiles)
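The same mean can be taken across columns in a single call. The sketch below uses the percentiles for `A` and `AAL` copied from the table above and assumes the percentile columns hold plain floats; `demo` is a stand-in frame, not the notebook's `hqm_df`.

```python
import pandas as pd

time_periods = ['1-Year', '6-Month', '3-Month', '1-Month']
percentile_cols = [f'{p} Return Per' for p in time_periods]

# Percentiles for A and AAL, copied from the table above.
demo = pd.DataFrame(
    [[87.3267, 68.5149, 64.3564, 69.901],
     [1.78218, 78.0198, 67.3267, 40.198]],
    index=['A', 'AAL'], columns=percentile_cols,
)

# Row-wise arithmetic mean of the four percentile columns.
demo['HQM Score'] = demo[percentile_cols].mean(axis=1)
print(demo['HQM Score'].round(2).tolist())  # [72.52, 46.83]
```

For `A` this is (87.3267 + 68.5149 + 64.3564 + 69.901) / 4 = 72.52475.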

In [14]:
hqm_df

Unnamed: 0,Symbol,Price,HQM Score,Num Shares to Purchase,1-Year Price Ret,1-Year Return Per,6-Month Price Ret,6-Month Return Per,3-Month Price Ret,3-Month Return Per,1-Month Price Ret,1-Month Return Per
0,A,126.64,72.5248,,0.419223,87.3267,0.319719,68.5149,0.190862,64.3564,0.0799613,69.901
1,AAL,15.97,46.8317,,-0.424611,1.78218,0.4034,78.0198,0.2106,67.3267,0.0224372,40.198
2,AAP,170.26,41.0396,,0.119062,55.0495,0.102362,31.0891,0.0730322,34.6535,0.0282053,43.3663
3,AAPL,141.03,76.7327,,0.794042,96.8317,0.452122,83.1683,0.21121,67.5248,0.0549944,59.4059
4,ABBV,113.74,70.8416,,0.352683,82.3762,0.16899,44.5545,0.3412,82.3762,0.0915006,74.0594
...,...,...,...,...,...,...,...,...,...,...,...,...
500,YUM,112.19,36.0396,,0.0258312,40.7921,0.155875,41.1881,0.0804741,37.4257,-0.00296807,24.7525
501,ZBH,162.47,55,,0.0767299,49.1089,0.221134,53.8614,0.12931,50.099,0.071491,66.9307
502,ZBRA,424.28,83.2178,,0.61762,93.6634,0.479265,85.3465,0.367908,85.1485,0.0764444,68.7129
503,ZION,49.13,78.4158,,0.0454474,43.7624,0.495711,86.3366,0.527443,94.4554,0.148822,89.1089


Let's sort the entire data frame by HQM score and keep the 50 highest-scoring stocks. 

In [15]:
hqm_df_sorted = hqm_df.sort_values(by='HQM Score', ascending=False)
hqm_df_sorted = hqm_df_sorted[:50]

hqm_df_sorted.reset_index(inplace=True)

# Print the data frame 
hqm_df_sorted

Unnamed: 0,index,Symbol,Price,HQM Score,Num Shares to Purchase,1-Year Price Ret,1-Year Return Per,6-Month Price Ret,6-Month Return Per,3-Month Price Ret,3-Month Return Per,1-Month Price Ret,1-Month Return Per
0,179,FCX,31.39,98.6139,,1.48024,99.802,1.28988,99.604,0.694604,97.6238,0.265655,97.4257
1,410,SIVB,501.15,98.5149,,0.890685,98.0198,1.1675,99.4059,0.69711,97.8218,0.311886,98.8119
2,23,ALB,181.45,97.6733,,1.20655,99.2079,1.01428,98.4158,0.840748,98.8119,0.188299,94.2574
3,203,GM,57.5,96.2376,,0.610027,93.4653,1.15817,99.0099,0.490443,93.0693,0.357577,99.4059
4,29,AMAT,108.66,95.8416,,0.689523,94.6535,0.669732,93.4653,0.732374,98.2178,0.253186,97.0297
5,288,LRCX,569.58,95.0,,0.847742,97.6238,0.582913,91.4851,0.604001,96.8317,0.187196,94.0594
6,314,MOS,29.22,94.8515,,0.44798,88.1188,1.12818,98.8119,0.521801,94.2574,0.296449,98.2178
7,275,LB,46.97,94.4554,,1.31678,99.604,1.34125,99.802,0.346582,83.1683,0.200428,95.2475
8,266,KLAC,311.98,93.1188,,0.748679,96.0396,0.512013,87.9208,0.529586,94.6535,0.185704,93.8614
9,324,MU,85.76,91.5842,,0.397184,85.9406,0.601379,92.2772,0.529834,95.0495,0.171935,93.0693


## The Number of Shares to Purchase

We split the portfolio equally across the 50 selected stocks, then divide each position's dollar value by the share price and round down to a whole number of shares.

In [16]:
PORTFOLIO_SIZE = 100000
position_size = float(PORTFOLIO_SIZE) / len(hqm_df_sorted.index)
print(position_size)

2000.0


In [17]:
for i in range(len(hqm_df_sorted)):
    hqm_df_sorted.loc[i, 'Num Shares to Purchase'] = math.floor(position_size / hqm_df_sorted.loc[i, 'Price'])

# Print the data frame 
hqm_df_sorted

Unnamed: 0,index,Symbol,Price,HQM Score,Num Shares to Purchase,1-Year Price Ret,1-Year Return Per,6-Month Price Ret,6-Month Return Per,3-Month Price Ret,3-Month Return Per,1-Month Price Ret,1-Month Return Per
0,179,FCX,31.39,98.6139,63,1.48024,99.802,1.28988,99.604,0.694604,97.6238,0.265655,97.4257
1,410,SIVB,501.15,98.5149,3,0.890685,98.0198,1.1675,99.4059,0.69711,97.8218,0.311886,98.8119
2,23,ALB,181.45,97.6733,11,1.20655,99.2079,1.01428,98.4158,0.840748,98.8119,0.188299,94.2574
3,203,GM,57.5,96.2376,34,0.610027,93.4653,1.15817,99.0099,0.490443,93.0693,0.357577,99.4059
4,29,AMAT,108.66,95.8416,18,0.689523,94.6535,0.669732,93.4653,0.732374,98.2178,0.253186,97.0297
5,288,LRCX,569.58,95.0,3,0.847742,97.6238,0.582913,91.4851,0.604001,96.8317,0.187196,94.0594
6,314,MOS,29.22,94.8515,68,0.44798,88.1188,1.12818,98.8119,0.521801,94.2574,0.296449,98.2178
7,275,LB,46.97,94.4554,42,1.31678,99.604,1.34125,99.802,0.346582,83.1683,0.200428,95.2475
8,266,KLAC,311.98,93.1188,6,0.748679,96.0396,0.512013,87.9208,0.529586,94.6535,0.185704,93.8614
9,324,MU,85.76,91.5842,23,0.397184,85.9406,0.601379,92.2772,0.529834,95.0495,0.171935,93.0693
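As a worked check of the share counts, the sketch below recomputes the first three rows of the table above and also reports how much of each 2,000-dollar position stays uninvested after rounding down. The helper name `shares_to_buy` is hypothetical.

```python
import math

PORTFOLIO_SIZE = 100000
NUM_POSITIONS = 50
position_size = PORTFOLIO_SIZE / NUM_POSITIONS  # 2000.0 per stock

def shares_to_buy(position_size, price):
    """Whole shares affordable within one equally weighted position."""
    return math.floor(position_size / price)

# Prices copied from the first three rows of the table above.
prices = {"FCX": 31.39, "SIVB": 501.15, "ALB": 181.45}

shares = {symbol: shares_to_buy(position_size, price)
          for symbol, price in prices.items()}
print(shares)  # {'FCX': 63, 'SIVB': 3, 'ALB': 11}

# Cash left over in each position because fractional shares are not bought.
for symbol, price in prices.items():
    leftover = position_size - shares[symbol] * price
    print(symbol, round(leftover, 2))
```

High-priced stocks such as `SIVB` leave a noticeably larger share of their position in cash, so the resulting portfolio is only approximately equally weighted.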
