# Momentum Investing Strategy # 

_Momentum investing_ means investing in the stocks whose prices have increased the most. 

For this project, we're going to build an investing strategy that selects the 50 stocks with the highest price momentum. From there, we will calculate recommended trades for an equal-weight portfolio of these 50 stocks.

## Library Imports ##

In [51]:
import numpy as np
import pandas as pd

import requests
import math

from scipy.stats import percentileofscore as score
import xlsxwriter
import yfinance as yf

from secrets import ALPHAVANTAGE_API_KEY



## Optionally load the saved data files to skip the slow download cells below
# final_dataframe = pd.read_csv("OneYearPriceReturn.csv", index_col=0)
# hqm_df = pd.read_csv("HQM.csv", index_col=0)

## Importing List of Stocks ## 

As before, we'll need to import our list of stocks.

In [49]:
tickers = pd.read_csv("constituents.csv")
tickers = tickers.loc[:,"Symbol"]
tickers

0       MMM
1       AOS
2       ABT
3      ABBV
4      ABMD
       ... 
500     YUM
501    ZBRA
502     ZBH
503    ZION
504     ZTS
Name: Symbol, Length: 505, dtype: object

***
## Getting Market Data ## 

For this project we will pull each stock's latest price and its one-year return. The one-year return is the Rate of Return (RoR), which is given by `(Year End Price - Year Start Price)/Year Start Price`.
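
As a quick illustration of the formula, here is a minimal sketch in plain Python. The function name `rate_of_return` and the sample prices are made up for this example and are not part of the notebook.

```python
def rate_of_return(start_price, end_price):
    """Return the fractional price change from start_price to end_price."""
    if start_price <= 0:
        raise ValueError("start_price must be positive")
    return (end_price - start_price) / start_price


# Hypothetical prices: a stock that goes from $80 to $100 over the year.
print(rate_of_return(80.0, 100.0))  # 0.25
```

A result of `0.25` means the price rose by 25% over the period; a negative value means it fell.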

In [3]:
## THIS CELL TAKES APPROX 5 MINS TO RUN

my_columns = ["Ticker", "Price", "One-YearPriceReturn", "NumberSharesToBuy"]
final_dataframe = pd.DataFrame(columns=my_columns)
i = 0
n = len(tickers)

for tick in tickers:
    ## Timer if wanted
    # i += 1
    # p = np.round((100*i)/n)
    # print(f"{p}% completed,", tick)
    try:
        stock = yf.Ticker(tick)
        data = stock.history(period="1y")
        ## use .iloc for positional access, since the index holds dates
        start_price = data.Close.iloc[0]
        end_price = data.Close.iloc[-1]
        one_year_return = (end_price - start_price)/start_price
        row = pd.Series([tick, end_price, one_year_return, 'N/A'], index = my_columns)
        final_dataframe = pd.concat([final_dataframe, row.to_frame().T], ignore_index=True)
    except Exception:
        ## skip tickers with no data (e.g. delisted symbols)
        print("Error with: ", tick)

final_dataframe.to_csv("OneYearPriceReturn.csv")

0.0% completed, MMM
0.0% completed, AOS
1.0% completed, ABT
1.0% completed, ABBV
1.0% completed, ABMD
1.0% completed, ACN
1.0% completed, ATVI
2.0% completed, ADM
2.0% completed, ADBE
2.0% completed, AAP
2.0% completed, AMD
2.0% completed, AES
3.0% completed, AFL
3.0% completed, A
3.0% completed, APD
3.0% completed, AKAM
3.0% completed, ALK
4.0% completed, ALB
4.0% completed, ARE
4.0% completed, ALGN
4.0% completed, ALLE
4.0% completed, LNT
5.0% completed, ALL
5.0% completed, GOOGL
5.0% completed, GOOG
5.0% completed, MO
5.0% completed, AMZN
6.0% completed, AMCR
6.0% completed, AEE
6.0% completed, AAL
6.0% completed, AEP
6.0% completed, AXP
7.0% completed, AIG
7.0% completed, AMT
7.0% completed, AWK
7.0% completed, AMP
7.0% completed, ABC
8.0% completed, AME
8.0% completed, AMGN
8.0% completed, APH
8.0% completed, ADI
8.0% completed, ANSS
9.0% completed, ANTM
- ANTM: No data found, symbol may be delisted
Error with:  ANTM
9.0% completed, AON
9.0% completed, APA
9.0% completed, AAPL
...

### Removing Low-Momentum Stocks

The investment strategy that we're building seeks to identify the 50 highest momentum stocks in the S&P 500. 

Because of this, the next thing we need to do is remove all stocks from `final_dataframe` that fall below this momentum threshold. We'll sort `final_dataframe` by each stock's one-year price return and drop all stocks outside the top 50.

In [2]:
## Sort the dataframe, select the top 50, and reset the index
top_50 = (
    final_dataframe
    .sort_values("One-YearPriceReturn", ascending=False)
    .head(50)
    .reset_index(drop=True)
)

## view the top 50
top_50

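| Unnamed: 0.1 | index | Unnamed: 0 | Ticker | Price | One-YearPriceReturn | NumberSharesToBuy |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 334 | 334 | OXY | 64.544998 | 0.913691 | |
| 1 | 451 | 451 | VLO | 144.324997 | 0.902265 | |
| 2 | 284 | 284 | MPC | 129.179092 | 0.882966 | |
| 3 | 217 | 217 | HES | 152.940002 | 0.745513 | |
| 4 | 162 | 162 | ENPH | 219.164993 | 0.71089 | |
| 5 | 329 | 329 | NUE | 153.684998 | 0.639244 | |
| 6 | 178 | 178 | XOM | 113.675003 | 0.623636 | |
| 7 | 283 | 283 | MRO | 27.934999 | 0.5455 | |
| 8 | 392 | 392 | SLB | 55.34 | 0.534512 | |
| 9 | 437 | 437 | TWTR | 53.700001 | 0.53166 | |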


### Calculating the Number of Shares to Buy

We now need to calculate the number of shares to buy. We'll wrap this in a function, since we'll be using it again later. This portfolio will be equally weighted among the top 50 stocks, since we haven't pulled market capitalization data. 
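
Before writing the notebook's version, the equal-weight arithmetic can be checked by hand on a small case. The sketch below uses plain Python; the helper name `shares_for_equal_weight`, the tickers, and the prices are all hypothetical and only serve to illustrate the calculation.

```python
import math


def shares_for_equal_weight(portfolio_size, prices):
    """Split portfolio_size equally across tickers and round down to whole shares."""
    position_size = portfolio_size / len(prices)
    return {ticker: math.floor(position_size / price) for ticker, price in prices.items()}


# Hypothetical prices, not real market data.
example_prices = {"AAA": 50.0, "BBB": 125.0, "CCC": 300.0, "DDD": 19.99}
print(shares_for_equal_weight(10_000, example_prices))
```

With four stocks and a $10,000 portfolio, each position gets $2,500. Rounding down means a little cash is always left over, which is the simplest way to avoid spending more than the portfolio holds.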

In [17]:
def portfolio_size_input():
    ## Keep asking until the user enters a valid number
    while True:
        portfolio_size = input("Enter the value of your portfolio:")
        try:
            return float(portfolio_size)
        except ValueError:
            print("Portfolio Size is not a valid number.\n")


def number_shares_to_buy(portfolio_size, portfolio):
    ## equal weight: every stock gets the same share of the portfolio
    value = portfolio_size/len(portfolio)
    ## loop through the portfolio and place number of shares
    for i in range(0,len(portfolio)) :
        portfolio.loc[i,"NumberSharesToBuy"] = math.floor(value/portfolio.loc[i,"Price"])

We now use the functions to update our portfolio.

In [6]:
portfolio_size = portfolio_size_input()
number_shares_to_buy(portfolio_size, top_50)

top_50

| Unnamed: 0.1 | index | Unnamed: 0 | Ticker | Price | One-YearPriceReturn | NumberSharesToBuy |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 334 | 334 | OXY | 64.544998 | 0.913691 | 309.0 |
| 1 | 451 | 451 | VLO | 144.324997 | 0.902265 | 138.0 |
| 2 | 284 | 284 | MPC | 129.179092 | 0.882966 | 154.0 |
| 3 | 217 | 217 | HES | 152.940002 | 0.745513 | 130.0 |
| 4 | 162 | 162 | ENPH | 219.164993 | 0.71089 | 91.0 |
| 5 | 329 | 329 | NUE | 153.684998 | 0.639244 | 130.0 |
| 6 | 178 | 178 | XOM | 113.675003 | 0.623636 | 175.0 |
| 7 | 283 | 283 | MRO | 27.934999 | 0.5455 | 715.0 |
| 8 | 392 | 392 | SLB | 55.34 | 0.534512 | 361.0 |
| 9 | 437 | 437 | TWTR | 53.700001 | 0.53166 | 372.0 |


***
## Building a Better (and more realistic) Momentum Strategy

Real-world quantitative firms differentiate between "high quality" and "low quality" momentum stocks. These are defined as follows:

- High-quality momentum stocks show "slow and steady" outperformance over long periods of time. 
- Low-quality momentum stocks might not show any momentum for a long time, and then surge upwards.

High-quality momentum stocks are preferred because low-quality momentum is often caused by short-term news that is unlikely to be repeated in the future (such as a biotech company receiving FDA approval).

To identify high-quality momentum, we're going to build a strategy that selects stocks from the highest percentile of:

- 1-month price returns
- 3-month price returns
- 6-month price returns
- 1-year price returns
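
To see why several horizons help, here is a small sketch in plain Python that computes the four returns from a list of monthly closes. The helper `horizon_returns` and both price series are hypothetical. The offsets assume monthly data ordered oldest first, with the one-year return measured from the oldest close available.

```python
def horizon_returns(monthly_closes):
    """Return one-year, six-, three-, and one-month returns from monthly closes (oldest first)."""
    if len(monthly_closes) < 7:
        raise ValueError("need at least 7 monthly closes")
    end_price = monthly_closes[-1]
    # Start prices counted back from the latest close.
    starts = {
        "OneYear": monthly_closes[0],
        "SixMonth": monthly_closes[-7],
        "ThreeMonth": monthly_closes[-4],
        "OneMonth": monthly_closes[-2],
    }
    return {name: (end_price - start) / start for name, start in starts.items()}


# Hypothetical monthly closes, oldest first (not real market data).
steady = [100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 125]
one_off = [100, 125, 125, 125, 125, 125, 125, 125, 125, 125, 125, 125, 125]

print(horizon_returns(steady))
print(horizon_returns(one_off))
```

Both series gain 25% over the year, but only `steady` shows positive returns on the shorter horizons. A strategy that looks at all four horizons would rank it above `one_off`, whose gain came from a single jump long ago.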

Let's start by building our DataFrame. We use `hqm` for High Quality Momentum.

In [52]:
# Can Reload the old data here and skip to percentiles
hqm_df = pd.read_csv("HQM.csv", index_col=0)

In [50]:
hqm_columns = [
    "Ticker",
    "Price",
    "NumberSharesToBuy",
    "OneYearPriceReturn",
    "OneYearReturnPercentile",
    "SixMonthPriceReturn",
    "SixMonthReturnPercentile",
    "ThreeMonthPriceReturn",
    "ThreeMonthReturnPercentile",
    "OneMonthPriceReturn",
    "OneMonthReturnPercentile",
    "HQM Score"
]

hqm_df = pd.DataFrame(columns=hqm_columns)

We now loop through as before, completing the columns as we go.

In [11]:
## THIS CELL TAKES 7 MINS to RUN
i = 0
for tick in tickers:
    # Timer if wanted
    i += 1
    p = np.round((100*i)/len(tickers))
    print(f"{p}% completed,", tick)
    try:
        # Get data
        stock = yf.Ticker(tick)
        data = stock.history(period="1y", interval="1mo").Close.dropna()

        # latest monthly close (.iloc gives positional access on the date index)
        end_price = data.iloc[-1]

        # one-year price return
        start_price = data.iloc[0]
        one_year_price_return = (end_price - start_price)/start_price

        # six month price return
        start_price = data.iloc[-7]
        six_month_price_return = (end_price - start_price)/start_price

        # three month price return
        start_price = data.iloc[-4]
        three_month_price_return = (end_price - start_price)/start_price

        # one month price return
        start_price = data.iloc[-2]
        one_month_price_return = (end_price - start_price)/start_price

        # update data frame
        row = pd.Series([tick, end_price, "N/A", one_year_price_return, 'N/A', 
                                    six_month_price_return, "N/A",
                                    three_month_price_return, "N/A",
                                    one_month_price_return, "N/A", "N/A"], index = hqm_columns)
        hqm_df = pd.concat([hqm_df, row.to_frame().T], ignore_index=True)
    except Exception:
        # skip tickers with missing or too-short price history
        print("Error with: ", tick)

hqm_df.to_csv("HQM.csv")

0.0% completed, MMM
0.0% completed, AOS
1.0% completed, ABT
1.0% completed, ABBV
1.0% completed, ABMD
1.0% completed, ACN
1.0% completed, ATVI
2.0% completed, ADM
2.0% completed, ADBE
2.0% completed, AAP
2.0% completed, AMD
2.0% completed, AES
3.0% completed, AFL
3.0% completed, A
3.0% completed, APD
3.0% completed, AKAM
3.0% completed, ALK
4.0% completed, ALB
4.0% completed, ARE
4.0% completed, ALGN
4.0% completed, ALLE
4.0% completed, LNT
5.0% completed, ALL
5.0% completed, GOOGL
5.0% completed, GOOG
5.0% completed, MO
5.0% completed, AMZN
6.0% completed, AMCR
6.0% completed, AEE
6.0% completed, AAL
6.0% completed, AEP
6.0% completed, AXP
7.0% completed, AIG
7.0% completed, AMT
7.0% completed, AWK
7.0% completed, AMP
7.0% completed, ABC
8.0% completed, AME
8.0% completed, AMGN
8.0% completed, APH
8.0% completed, ADI
8.0% completed, ANSS
9.0% completed, ANTM
- ANTM: No data found, symbol may be delisted
Error with:  ANTM
9.0% completed, AON
9.0% completed, APA
9.0% completed, AAPL
...

### Calculating Momentum Percentiles

We now need to calculate momentum percentile scores for every stock in the dataframe. More specifically, we need to compute percentile scores for the following metrics for every stock:

- `OneYearPriceReturn`
- `SixMonthPriceReturn`
- `ThreeMonthPriceReturn`
- `OneMonthPriceReturn`

Here's how we do this:

In [53]:
time_periods = ["OneYear", "SixMonth", "ThreeMonth", "OneMonth"]

for row in hqm_df.index:
    for time_period in time_periods:
        percentile_col = f"{time_period}ReturnPercentile"
        compute_col = f"{time_period}PriceReturn"
        hqm_df.loc[row, percentile_col] = score(hqm_df[compute_col], hqm_df.loc[row, compute_col])/100

hqm_df

| Unnamed: 0 | Ticker | Price | NumberSharesToBuy | OneYearPriceReturn | OneYearReturnPercentile | SixMonthPriceReturn | SixMonthReturnPercentile | ThreeMonthPriceReturn | ThreeMonthReturnPercentile | OneMonthPriceReturn | OneMonthReturnPercentile | HQM Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | MMM | 115.000000 | | -0.193568 | 0.184805 | -0.054760 | 0.199179 | -0.076395 | 0.090349 | -0.062143 | 0.004107 | |
| 1 | AOS | 60.669998 | | -0.102064 | 0.338809 | 0.081005 | 0.544148 | -0.001153 | 0.498973 | 0.007640 | 0.751540 | |
| 2 | ABT | 112.529999 | | -0.050470 | 0.418891 | 0.106409 | 0.618070 | 0.050748 | 0.749487 | -0.008512 | 0.273101 | |
| 3 | ABBV | 147.690002 | | 0.037503 | 0.591376 | 0.120118 | 0.655031 | -0.074985 | 0.098563 | 0.003662 | 0.636550 | |
| 4 | ABMD | 381.019989 | | 0.226170 | 0.913758 | 0.539413 | 0.991786 | 0.551005 | 1.000000 | 0.008550 | 0.767967 | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 482 | YUM | 128.509995 | | 0.068528 | 0.689938 | 0.166235 | 0.759754 | 0.003412 | 0.531828 | 0.014526 | 0.893224 | |
| 483 | ZBRA | 310.079987 | | -0.249819 | 0.117043 | 0.027980 | 0.392197 | 0.147255 | 0.946612 | -0.024169 | 0.049281 | |
| 484 | ZBH | 125.480003 | | 0.024626 | 0.560575 | 0.185094 | 0.790554 | 0.046777 | 0.724846 | -0.003653 | 0.420945 | |
| 485 | ZION | 51.770000 | | -0.249603 | 0.119097 | -0.044934 | 0.223819 | 0.007178 | 0.552361 | -0.016714 | 0.125257 | |
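
To make the percentile scores less of a black box, here is a plain-Python sketch of the calculation. It follows our understanding of the default `kind="rank"` behaviour of `percentileofscore`, rescaled to the 0 to 1 range; the function name `percentile_rank` and the sample returns are hypothetical.

```python
def percentile_rank(values, score):
    """Fraction (0 to 1) of values at or below score, averaging the rank over ties."""
    n = len(values)
    below = sum(1 for v in values if v < score)
    at_or_below = sum(1 for v in values if v <= score)
    # When score appears in values, count it once more so ties share an average rank.
    tie_adjust = 1 if at_or_below > below else 0
    return (below + at_or_below + tie_adjust) / (2 * n)


# Hypothetical one-year returns for five stocks (not real market data).
returns = [-0.20, -0.05, 0.10, 0.30, 0.90]
for r in returns:
    print(r, percentile_rank(returns, r))
```

The best performer gets a score of 1.0 and the worst gets `1/n`, which matches the pattern in the output above, where the top three-month return is assigned a percentile of exactly 1.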


### Calculating the HQM Score

We'll now calculate our `HQM Score`, which is the high-quality momentum score that we'll use to filter for stocks. 

The `HQM Score` will be the arithmetic mean of the 4 momentum percentile scores that we calculated in the last section. 
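
As a quick check of the arithmetic, we can compute the score for a single stock by hand. The four percentile values below are copied from the MMM row in the table above; the variable names are just for this illustration.

```python
from statistics import mean

# Percentile scores for MMM, copied from the output table above.
mmm_percentiles = {
    "OneYear": 0.184805,
    "SixMonth": 0.199179,
    "ThreeMonth": 0.090349,
    "OneMonth": 0.004107,
}

hqm_score = mean(list(mmm_percentiles.values()))
print(round(hqm_score, 6))
```

Because every horizon carries the same weight, a stock needs to rank well across all four periods to earn a high score.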

In [54]:
from statistics import mean 

for row in hqm_df.index:
    momentum_percentiles = []
    for time_period in time_periods:
        momentum_percentiles.append(hqm_df.loc[row, f"{time_period}ReturnPercentile"])
    
    ## compute mean of momemtum percentiles and then append to dataframe
    hqm_df.loc[row, "HQM Score"] = mean(momentum_percentiles)
    
hqm_df

| Unnamed: 0 | Ticker | Price | NumberSharesToBuy | OneYearPriceReturn | OneYearReturnPercentile | SixMonthPriceReturn | SixMonthReturnPercentile | ThreeMonthPriceReturn | ThreeMonthReturnPercentile | OneMonthPriceReturn | OneMonthReturnPercentile | HQM Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | MMM | 115.000000 | | -0.193568 | 0.184805 | -0.054760 | 0.199179 | -0.076395 | 0.090349 | -0.062143 | 0.004107 | 0.119610 |
| 1 | AOS | 60.669998 | | -0.102064 | 0.338809 | 0.081005 | 0.544148 | -0.001153 | 0.498973 | 0.007640 | 0.751540 | 0.533368 |
| 2 | ABT | 112.529999 | | -0.050470 | 0.418891 | 0.106409 | 0.618070 | 0.050748 | 0.749487 | -0.008512 | 0.273101 | 0.514887 |
| 3 | ABBV | 147.690002 | | 0.037503 | 0.591376 | 0.120118 | 0.655031 | -0.074985 | 0.098563 | 0.003662 | 0.636550 | 0.495380 |
| 4 | ABMD | 381.019989 | | 0.226170 | 0.913758 | 0.539413 | 0.991786 | 0.551005 | 1.000000 | 0.008550 | 0.767967 | 0.918378 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 482 | YUM | 128.509995 | | 0.068528 | 0.689938 | 0.166235 | 0.759754 | 0.003412 | 0.531828 | 0.014526 | 0.893224 | 0.718686 |
| 483 | ZBRA | 310.079987 | | -0.249819 | 0.117043 | 0.027980 | 0.392197 | 0.147255 | 0.946612 | -0.024169 | 0.049281 | 0.376283 |
| 484 | ZBH | 125.480003 | | 0.024626 | 0.560575 | 0.185094 | 0.790554 | 0.046777 | 0.724846 | -0.003653 | 0.420945 | 0.624230 |
| 485 | ZION | 51.770000 | | -0.249603 | 0.119097 | -0.044934 | 0.223819 | 0.007178 | 0.552361 | -0.016714 | 0.125257 | 0.255133 |


### Selecting the 50 Best Momentum Stocks

We now rank the dataframe by the `HQM Score` and pick out the top 50. 

In [55]:
# sort by HQM Score, take the top 50, and reset the index
hqm_df = (
    hqm_df
    .sort_values("HQM Score", ascending=False)
    .head(50)
    .reset_index(drop=True)
)

hqm_df

| Unnamed: 0 | Ticker | Price | NumberSharesToBuy | OneYearPriceReturn | OneYearReturnPercentile | SixMonthPriceReturn | SixMonthReturnPercentile | ThreeMonthPriceReturn | ThreeMonthReturnPercentile | OneMonthPriceReturn | OneMonthReturnPercentile | HQM Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | CAT | 257.609985 | | 0.4055 | 0.977413 | 0.4107 | 0.973306 | 0.094871 | 0.880903 | 0.024292 | 0.973306 | 0.951232 |
| 1 | MPC | 130.210007 | | 0.715676 | 0.997947 | 0.308422 | 0.934292 | 0.075653 | 0.833676 | 0.026245 | 0.975359 | 0.935318 |
| 2 | PCAR | 111.080002 | | 0.262352 | 0.936345 | 0.31436 | 0.938398 | 0.082095 | 0.852156 | 0.086357 | 0.997947 | 0.931211 |
| 3 | ABMD | 381.019989 | | 0.22617 | 0.913758 | 0.539413 | 0.991786 | 0.551005 | 1.0 | 0.00855 | 0.767967 | 0.918378 |
| 4 | VLO | 145.029999 | | 0.798937 | 1.0 | 0.257898 | 0.897331 | 0.093004 | 0.876797 | 0.013629 | 0.882957 | 0.914271 |
| 5 | DHI | 97.0 | | 0.149283 | 0.831622 | 0.371453 | 0.960986 | 0.131195 | 0.936345 | 0.013796 | 0.887064 | 0.904004 |
| 6 | TWTR | 53.700001 | | 0.510549 | 0.99384 | 0.095471 | 0.587269 | 0.290555 | 0.995893 | 0.224909 | 1.0 | 0.894251 |
| 7 | LEN | 99.089996 | | 0.118628 | 0.790554 | 0.285574 | 0.921971 | 0.128202 | 0.930185 | 0.015995 | 0.911704 | 0.888604 |
| 8 | LW | 96.980003 | | 0.479565 | 0.98768 | 0.226567 | 0.86653 | 0.119178 | 0.921971 | 0.008842 | 0.776181 | 0.88809 |
| 9 | GE | 80.699997 | | 0.087026 | 0.731006 | 0.410743 | 0.975359 | 0.203636 | 0.979466 | 0.011659 | 0.843943 | 0.882444 |


### Calculating the Number of Shares to Buy

In [56]:
portfolio_size = portfolio_size_input()
number_shares_to_buy(portfolio_size, hqm_df)

hqm_df

| Unnamed: 0 | Ticker | Price | NumberSharesToBuy | OneYearPriceReturn | OneYearReturnPercentile | SixMonthPriceReturn | SixMonthReturnPercentile | ThreeMonthPriceReturn | ThreeMonthReturnPercentile | OneMonthPriceReturn | OneMonthReturnPercentile | HQM Score |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | CAT | 257.609985 | 77.0 | 0.4055 | 0.977413 | 0.4107 | 0.973306 | 0.094871 | 0.880903 | 0.024292 | 0.973306 | 0.951232 |
| 1 | MPC | 130.210007 | 153.0 | 0.715676 | 0.997947 | 0.308422 | 0.934292 | 0.075653 | 0.833676 | 0.026245 | 0.975359 | 0.935318 |
| 2 | PCAR | 111.080002 | 180.0 | 0.262352 | 0.936345 | 0.31436 | 0.938398 | 0.082095 | 0.852156 | 0.086357 | 0.997947 | 0.931211 |
| 3 | ABMD | 381.019989 | 52.0 | 0.22617 | 0.913758 | 0.539413 | 0.991786 | 0.551005 | 1.0 | 0.00855 | 0.767967 | 0.918378 |
| 4 | VLO | 145.029999 | 137.0 | 0.798937 | 1.0 | 0.257898 | 0.897331 | 0.093004 | 0.876797 | 0.013629 | 0.882957 | 0.914271 |
| 5 | DHI | 97.0 | 206.0 | 0.149283 | 0.831622 | 0.371453 | 0.960986 | 0.131195 | 0.936345 | 0.013796 | 0.887064 | 0.904004 |
| 6 | TWTR | 53.700001 | 372.0 | 0.510549 | 0.99384 | 0.095471 | 0.587269 | 0.290555 | 0.995893 | 0.224909 | 1.0 | 0.894251 |
| 7 | LEN | 99.089996 | 201.0 | 0.118628 | 0.790554 | 0.285574 | 0.921971 | 0.128202 | 0.930185 | 0.015995 | 0.911704 | 0.888604 |
| 8 | LW | 96.980003 | 206.0 | 0.479565 | 0.98768 | 0.226567 | 0.86653 | 0.119178 | 0.921971 | 0.008842 | 0.776181 | 0.88809 |
| 9 | GE | 80.699997 | 247.0 | 0.087026 | 0.731006 | 0.410743 | 0.975359 | 0.203636 | 0.979466 | 0.011659 | 0.843943 | 0.882444 |


***
## Formatting the Excel Output

We will use the `xlsxwriter` library for Python to create our nicely formatted Excel output.

In [39]:
writer = pd.ExcelWriter("high_quality_momentum_strategy.xlsx", engine="xlsxwriter")

hqm_df.to_excel(writer, sheet_name="High Quality Momentum Strategy", index=False)

### Creating the Formats We'll Need For Our `.xlsx` File

Formats include colors, fonts, and also symbols like `%` and `$`. We'll need four main formats for our Excel document:
- string format for tickers
- $XX.XX format for stock prices 
- integer format for the number of shares to buy
- XX.X% format for returns, percentiles, and the HQM score

In [40]:
background_color = "#0a0a23"
font_color = "#ffffff"

string_format = writer.book.add_format(
    {
        "font_color" : font_color,
        "bg_color" : background_color,
        "border" : 1
    }
)

dollar_format = writer.book.add_format(
    {
        "num_format" : "$0.00",
        "font_color" : font_color,
        "bg_color" : background_color,
        "border" : 1
    }
)

integer_format = writer.book.add_format(
    {
        "num_format" : "0",
        "font_color" : font_color,
        "bg_color" : background_color,
        "border" : 1
    }
)

percent_format = writer.book.add_format(
    {
        "num_format" : "0.0%",
        "font_color" : font_color,
        "bg_color" : background_color,
        "border" : 1
    }
)

### Applying the Formats to our Columns in our `.xlsx` File ###

We can use the `set_column` method of the worksheet object, which we access through `writer.sheets`, to apply formats to specific columns of our spreadsheet. 

In [41]:
column_formats = {
    "A" : ["Ticker", string_format],
    "B" : ["Price", dollar_format],
    "C" : ["NumberSharesToBuy", integer_format],
    "D" : ["OneYearPriceReturn", percent_format], 
    "E" : ["OneYearReturnPercentile", percent_format],
    "F" : ["SixMonthPriceReturn", percent_format],
    "G" : ["SixMonthReturnPercentile", percent_format],
    "H" : ["ThreeMonthPriceReturn", percent_format],
    "I" : ["ThreeMonthReturnPercentile", percent_format],
    "J" : ["OneMonthPriceReturn", percent_format],
    "K" : ["OneMonthReturnPercentile", percent_format],
    "L" : ["HQM Score", percent_format]
}

for column in column_formats.keys():
    writer.sheets["High Quality Momentum Strategy"].set_column(f"{column}:{column}", 30, column_formats[column][1])
    # write headers
    writer.sheets["High Quality Momentum Strategy"].write(f"{column}1", column_formats[column][0], column_formats[column][1])
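
The column letters in `column_formats` are typed by hand, which is easy to get wrong if the column order changes. One possible alternative, sketched here in plain Python, derives the letters from the list of headers. It assumes no more than 26 columns, and the name `column_letters` is only for this illustration.

```python
from string import ascii_uppercase

hqm_columns = [
    "Ticker",
    "Price",
    "NumberSharesToBuy",
    "OneYearPriceReturn",
    "OneYearReturnPercentile",
    "SixMonthPriceReturn",
    "SixMonthReturnPercentile",
    "ThreeMonthPriceReturn",
    "ThreeMonthReturnPercentile",
    "OneMonthPriceReturn",
    "OneMonthReturnPercentile",
    "HQM Score",
]

# Pair each header with its spreadsheet column letter: A, B, C, ...
column_letters = dict(zip(ascii_uppercase, hqm_columns))

for letter, header in column_letters.items():
    print(f"{letter}:{letter} -> {header}")
```

Each `letter:letter` string has the same shape as the range passed to `set_column` in the cell above, so the mapping could be combined with a header-to-format lookup if desired.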

### Saving Excel Output

In [42]:
writer.close()