# Quantitative Momentum Investing Strategy

"Momentum investing" means investing in the stocks that have increased in price the most.

For this project, we're going to build an investing strategy that selects the 50 stocks with the highest price momentum. From there, we will calculate recommended trades for an equal-weight portfolio of these 50 stocks.

In [1]:
import numpy as np #The Numpy numerical computing library
import pandas as pd #The Pandas data science library
import requests #The requests library for HTTP requests in Python
import xlsxwriter #The XlsxWriter library for writing Excel files
import math #The Python math module
from scipy import stats #The SciPy stats module
from statistics import mean #The mean function that we'll use to build a more realistic momentum strategy

We start by reading our data into a pandas DataFrame.

In [2]:
stocks = pd.read_csv('sp_500_stocks.csv')
stocks

Unnamed: 0,Ticker
0,A
1,AAL
2,AAP
3,AAPL
4,ABBV
...,...
500,YUM
501,ZBH
502,ZBRA
503,ZION


In [3]:
stocks.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 505 entries, 0 to 504
Data columns (total 1 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Ticker  505 non-null    object
dtypes: object(1)
memory usage: 2.0+ KB


The data has no null entries. However, we know that four of the stocks listed here have been delisted, so we will remove them from the list (see the Equal Weight index fund project).

I later realised that the API also returns None values for some other tickers, which have to be dealt with. The commented-out line in the cell below handles them. Don't run it yet: when you hit the error later on, remove the '#' and re-run this cell and all later cells.

In [4]:
stocks = stocks[~stocks['Ticker'].isin(['DISCA', 'HFC', 'VIAC', 'WLTW'])] #to get rid of the 4 rows with delisted stocks
#stocks = stocks[~stocks['Ticker'].isin(['CTL', 'ETFC', 'MYL', 'NBL'])] #for an error that arises later on
stocks # we don't care about the index here as we will create a new data frame and store the values there

Unnamed: 0,Ticker
0,A
1,AAL
2,AAP
3,AAPL
4,ABBV
...,...
500,YUM
501,ZBH
502,ZBRA
503,ZION


We then import the API token necessary for the API call.

In [5]:
from secrets import IEX_CLOUD_API_TOKEN

## Testing our API call

Using the same method as before, we will create a URL for the API call from the base URL: https://sandbox.iexapis.com

Now it's time to structure our API calls to IEX Cloud.
We need the following information from the API:
1. One-year stock returns
2. Price of each stock

Since we want these metrics for particular stocks, we'll use the stock endpoint. However, the Stats endpoint that we end up using does not give us the latest price of the stock, so we'll have to use it in tandem with the /quote/ endpoint.
After quite a bit of searching, we land on a response attribute called year1ChangePercent, which is in the Stats endpoint. We also add /stable/ to the path to use the most stable version of the API.
https://sandbox.iexapis.com/stable/stock/{symbol}/stats?token={TOKEN}

In [7]:
symbol = 'AAPL'
api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/stats?token={IEX_CLOUD_API_TOKEN}'
data = requests.get(api_url)
data.status_code

200

That gives us status code 200, which means the API call is working. Now we parse the response as a dictionary. The key that we want is 'year1ChangePercent'.

In [18]:
data = requests.get(api_url).json()
print(data)
data['year1ChangePercent']

{'companyName': 'Apple Inc', 'marketcap': 2447461811099, 'week52high': 183.63, 'week52low': 133.11, 'week52highSplitAdjustOnly': 191.4, 'week52lowSplitAdjustOnly': 129.86, 'week52change': 0.03911628927111773, 'sharesOutstanding': 16929212573, 'float': 0, 'avg10Volume': 76884786, 'avg30Volume': 75583933, 'day200MovingAvg': 160.08, 'day50MovingAvg': 143.65, 'employees': 153812, 'ttmEPS': 6.22, 'ttmDividendRate': 0.8961801053419308, 'dividendYield': 0.006130655266382796, 'nextDividendDate': '', 'exDividendDate': '2022-04-28', 'nextEarningsDate': '2022-07-25', 'peRatio': 24.18322848614018, 'beta': 1.2835373627891, 'maxChangePercent': 59.29533110443586, 'year5ChangePercent': 3.382863832063613, 'year2ChangePercent': 0.6045487879939576, 'year1ChangePercent': 0.06681828242681734, 'ytdChangePercent': -0.15344087316122115, 'month6ChangePercent': -0.09082201979189025, 'month3ChangePercent': -0.09889398449597887, 'month1ChangePercent': 0.1503432292965604, 'day30ChangePercent': 0.1505803341625761, 

0.06681828242681734
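
The URL pattern used above can be wrapped in a small helper so that the symbol and token are filled in consistently. This is only a sketch: `build_stats_url` and `one_year_return` are hypothetical names that are not part of the original notebook, and `'DEMO_TOKEN'` is a placeholder rather than a real token.

```python
from urllib.parse import urlencode

BASE_URL = 'https://sandbox.iexapis.com/stable'  # sandbox base URL used in this notebook

def build_stats_url(symbol, token):
    """Return the Stats endpoint URL for one symbol (hypothetical helper)."""
    query = urlencode({'token': token})
    return f'{BASE_URL}/stock/{symbol}/stats?{query}'

def one_year_return(stats):
    """Read year1ChangePercent from a parsed Stats response, or None if the key is missing."""
    return stats.get('year1ChangePercent')

example_url = build_stats_url('AAPL', 'DEMO_TOKEN')  # placeholder token
example_return = one_year_return({'year1ChangePercent': 0.06681828242681734})
```

Using `.get()` instead of indexing means a response without the key yields None rather than raising a KeyError, which makes missing data easier to spot.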

## Creating chunks for the API call and building our Data Frame.

We start by creating chunks of the data and adding them to a list.

In [6]:
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
symbol_groups = list(chunks(stocks['Ticker'], 100))
print(len(symbol_groups))
symbol_groups

6


[0         A
 1       AAL
 2       AAP
 3      AAPL
 4      ABBV
       ...  
 95     CINF
 96       CL
 97      CLX
 98      CMA
 99    CMCSA
 Name: Ticker, Length: 100, dtype: object,
 100     CME
 101     CMG
 102     CMI
 103     CMS
 104     CNC
        ... 
 196     FTV
 197      GD
 198      GE
 199    GILD
 200     GIS
 Name: Ticker, Length: 100, dtype: object,
 201       GL
 202      GLW
 203       GM
 204     GOOG
 205    GOOGL
        ...  
 297      MAS
 298      MCD
 299     MCHP
 300      MCK
 301      MCO
 Name: Ticker, Length: 100, dtype: object,
 302    MDLZ
 303     MDT
 304     MET
 305     MGM
 306     MHK
        ... 
 397      RL
 398     RMD
 399     ROK
 400     ROL
 401     ROP
 Name: Ticker, Length: 100, dtype: object,
 402    ROST
 403     RSG
 404     RTX
 405    SBAC
 406    SBUX
        ... 
 499     XYL
 500     YUM
 501     ZBH
 502    ZBRA
 503    ZION
 Name: Ticker, Length: 100, dtype: object,
 504    ZTS
 Name: Ticker, dtype: object]

Next, we take each chunk and join its tickers using a comma as the separator, so that we can pass them to IEX Cloud within the limit of 100 stocks per batch API call.

In [7]:
symbol_strings = []
for i in range(0, len(symbol_groups)): # loop runs for each chunk
    symbol_strings.append(','.join(symbol_groups[i])) # join() joins all keys in the chunk and append() add each joined chunk
    print(symbol_strings[i])

A,AAL,AAP,AAPL,ABBV,ABC,ABMD,ABT,ACN,ADBE,ADI,ADM,ADP,ADSK,AEE,AEP,AES,AFL,AIG,AIV,AIZ,AJG,AKAM,ALB,ALGN,ALK,ALL,ALLE,ALXN,AMAT,AMCR,AMD,AME,AMGN,AMP,AMT,AMZN,ANET,ANSS,ANTM,AON,AOS,APA,APD,APH,APTV,ARE,ATO,ATVI,AVB,AVGO,AVY,AWK,AXP,AZO,BA,BAC,BAX,BBY,BDX,BEN,BF.B,BIIB,BIO,BK,BKNG,BKR,BLK,BLL,BMY,BR,BRK.B,BSX,BWA,BXP,C,CAG,CAH,CARR,CAT,CB,CBOE,CBRE,CCI,CCL,CDNS,CDW,CE,CERN,CF,CFG,CHD,CHRW,CHTR,CI,CINF,CL,CLX,CMA,CMCSA
CME,CMG,CMI,CMS,CNC,CNP,COF,COG,COO,COP,COST,COTY,CPB,CPRT,CRM,CSCO,CSX,CTAS,CTL,CTSH,CTVA,CTXS,CVS,CVX,CXO,D,DAL,DD,DE,DFS,DG,DGX,DHI,DHR,DIS,DISCK,DISH,DLR,DLTR,DOV,DOW,DPZ,DRE,DRI,DTE,DUK,DVA,DVN,DXC,DXCM,EA,EBAY,ECL,ED,EFX,EIX,EL,EMN,EMR,EOG,EQIX,EQR,ES,ESS,ETFC,ETN,ETR,EVRG,EW,EXC,EXPD,EXPE,EXR,F,FANG,FAST,FB,FBHS,FCX,FDX,FE,FFIV,FIS,FISV,FITB,FLIR,FLS,FLT,FMC,FOX,FOXA,FRC,FRT,FTI,FTNT,FTV,GD,GE,GILD,GIS
GL,GLW,GM,GOOG,GOOGL,GPC,GPN,GPS,GRMN,GS,GWW,HAL,HAS,HBAN,HBI,HCA,HD,HES,HIG,HII,HLT,HOLX,HON,HPE,HPQ,HRB,HRL,HSIC,HST,HSY,HUM,HWM,IBM,ICE,IDXX,IEX,IFF,ILMN,INCY,INF
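
To see what the chunking and joining steps produce without the full ticker list, here is the same idea on a toy list with a chunk size of 2. The names `tickers`, `groups` and `strings` are only for this illustration.

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

tickers = ['A', 'AAL', 'AAP', 'AAPL', 'ABBV']  # first five tickers from the list above
groups = list(chunks(tickers, 2))              # the last chunk holds the leftover ticker
strings = [','.join(group) for group in groups]
print(strings)
```

The last chunk is simply shorter than the others, which is why the full list ends with a group that contains a single ticker.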

We now create an empty pandas DataFrame to hold the data from the API call.

In [8]:
my_columns = ['Ticker', 'Price', 'One-Year Price Return', 'Number of Shares to Buy']
final_dataframe = pd.DataFrame(columns = my_columns)

Now we push the data into our data frame. We will use both the stats and quote endpoints to get the data we require.

In [130]:
final_dataframe = pd.DataFrame(columns = my_columns)
for symbol_string in symbol_strings:
    batch_api_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=stats,quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_url).json()
    #print(data)
    for symbol in symbol_string.split(','):
        final_dataframe = final_dataframe.append(
        pd.Series(
        [symbol,
        data[symbol]['quote']['latestPrice'],
        data[symbol]['stats']['year1ChangePercent'],
        'N/A'
        ],
        index = my_columns),
        ignore_index = True)
final_dataframe
        

Unnamed: 0,Ticker,Price,One-Year Price Return,Number of Shares to Buy
0,A,120.06,-0.193485,
1,AAL,15.28,-0.20841,
2,AAP,195.60,-0.049764,
3,AAPL,158.27,0.0692547,
4,ABBV,153.40,0.361053,
...,...,...,...,...
496,YUM,121.83,0.076254,
497,ZBH,113.00,-0.287721,
498,ZBRA,336.04,-0.387568,
499,ZION,53.66,0.107201,
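
Note that `DataFrame.append` was removed in pandas 2.0, so the cell above only runs on older pandas versions. One possible alternative, sketched here under the assumption that the parsed batch response has the same nested layout as above, is to collect the rows in a plain list and build the DataFrame once. `fake_data` is a stand-in for the API response, using two rows from the table above.

```python
import pandas as pd

my_columns = ['Ticker', 'Price', 'One-Year Price Return', 'Number of Shares to Buy']

# Stand-in for the parsed batch response (values copied from the table above)
fake_data = {
    'AAPL': {'quote': {'latestPrice': 158.27}, 'stats': {'year1ChangePercent': 0.0692547}},
    'ABBV': {'quote': {'latestPrice': 153.40}, 'stats': {'year1ChangePercent': 0.361053}},
}

rows = []
for symbol in ['AAPL', 'ABBV']:
    rows.append([
        symbol,
        fake_data[symbol]['quote']['latestPrice'],
        fake_data[symbol]['stats']['year1ChangePercent'],
        'N/A',  # placeholder, filled in once the portfolio size is known
    ])

# Build the frame in one step instead of appending row by row
sketch_dataframe = pd.DataFrame(rows, columns=my_columns)
```

Building the frame this way lets pandas infer numeric column types, so missing values may show up as NaN rather than None. Checks written with `is None` would then need `isnull()` instead.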


## Removing Low-Momentum Stocks

The investment strategy that we're building seeks to identify the 50 highest-momentum stocks in the S&P 500.

Because of this, the next thing we need to do is remove all the stocks in our DataFrame that fall below this momentum threshold. We'll sort the DataFrame by the stocks' one-year price return, and drop all stocks outside the top 50.

In [12]:
final_dataframe.sort_values('One-Year Price Return', ascending = False, inplace = True) #inplace = True is used to modify the original data frame
final_dataframe = final_dataframe[:50]
final_dataframe

Unnamed: 0,Ticker,Price,One-Year Price Return,Number of Shares to Buy
273,LB,82.25,2.38275,
355,OXY,64.24,1.57744,
147,DVN,60.16,1.47269,
315,MRO,23.12,1.10622,
42,APA,35.83,1.02434,
89,CF,89.79,0.949801,
465,VLO,112.67,0.872257,
313,MPC,91.3,0.784474,
298,MCK,337.8,0.775873,
138,DLTR,178.7,0.74788,


We need to reset the index.

In [13]:
final_dataframe.reset_index(drop = True, inplace = True)
final_dataframe

Unnamed: 0,Ticker,Price,One-Year Price Return,Number of Shares to Buy
0,LB,82.25,2.38275,
1,OXY,64.24,1.57744,
2,DVN,60.16,1.47269,
3,MRO,23.12,1.10622,
4,APA,35.83,1.02434,
5,CF,89.79,0.949801,
6,VLO,112.67,0.872257,
7,MPC,91.3,0.784474,
8,MCK,337.8,0.775873,
9,DLTR,178.7,0.74788,
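
The sort, slice and index reset above can also be written as one chained expression. This is a minimal sketch on a toy frame, assuming the return column holds numeric values (`nlargest` does not accept object-typed columns); `toy` and `top` are illustrative names.

```python
import pandas as pd

toy = pd.DataFrame({
    'Ticker': ['LB', 'AAL', 'OXY', 'A'],
    'One-Year Price Return': [2.38275, -0.20841, 1.57744, -0.193485],
})

# Keep the 2 rows with the highest return, then renumber them from 0
top = toy.nlargest(2, 'One-Year Price Return').reset_index(drop=True)
print(top)
```

Because `nlargest` returns a new DataFrame rather than a slice of the original, later assignments to `top` do not touch `toy`.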


In [14]:
def portfolio_input():
    global portfolio_size
    while True: # keep asking until the input can be converted to a number
        portfolio_size = input('Please enter the value of your portfolio:')
        try:
            float(portfolio_size)
            break
        except ValueError: # float() raises ValueError for non-numeric strings
            print('Please enter the portfolio value as a number without currency')
portfolio_input()

Please enter the value of your portfolio: 1000000


## Calculating the number of shares to buy:

In [15]:
position_size = float(portfolio_size)/len(final_dataframe.index)
for index in final_dataframe.index:
    final_dataframe.loc[index, 'Number of Shares to Buy'] = math.floor(position_size/final_dataframe['Price'][index])
final_dataframe

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  isetter(loc, value)


Unnamed: 0,Ticker,Price,One-Year Price Return,Number of Shares to Buy
0,LB,82.25,2.38275,243
1,OXY,64.24,1.57744,311
2,DVN,60.16,1.47269,332
3,MRO,23.12,1.10622,865
4,APA,35.83,1.02434,558
5,CF,89.79,0.949801,222
6,VLO,112.67,0.872257,177
7,MPC,91.3,0.784474,219
8,MCK,337.8,0.775873,59
9,DLTR,178.7,0.74788,111
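
The warning printed above is pandas' SettingWithCopyWarning. It appears because the top-50 frame was created by slicing another DataFrame, so pandas cannot tell whether the assignment is meant for the slice or the original. One way to avoid it, sketched here on the first few prices from the table, is to take an explicit copy of the slice and compute the whole column at once. The names `frame` and `top` are illustrative, and the position size assumes the 1,000,000 portfolio split over 50 stocks used in this run.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    'Ticker': ['LB', 'OXY', 'DVN', 'MRO'],
    'Price': [82.25, 64.24, 60.16, 23.12],
})

top = frame[:3].copy()          # explicit copy, so later assignments are unambiguous
position_size = 1_000_000 / 50  # equal weight per stock, as in the run above

# Whole shares only: divide, round down, store as integers
top['Number of Shares to Buy'] = np.floor(position_size / top['Price']).astype(int)
print(top)
```

The first three results (243, 311 and 332 shares) agree with the table above.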


## Building a More Realistic Momentum Strategy:

Real-world quantitative investment firms differentiate between "high quality" and "low quality" momentum stocks:

- High-quality momentum stocks show "slow and steady" outperformance over long periods of time
- Low-quality momentum stocks might not show any momentum for a long time, and then surge upwards.

High-quality momentum stocks are preferred because low-quality momentum can often be caused by short-term news that is unlikely to be repeated in the future (such as an FDA approval for a biotechnology company).


To identify high-quality momentum, we're going to build a strategy that selects stocks from the highest percentiles of:

- 1-month price returns
- 3-month price returns
- 6-month price returns
- 1-year price returns

Let's start by building our DataFrame. You'll notice that I use the abbreviation hqm often. It stands for high-quality momentum.

In [9]:
hqm_columns = ['Ticker', 
               'Price',
               'One-month Return',
               'One-month Return Percentile', 
               'Three-month Return',
               'Three-month Return Percentile', 
               'Six-month Return',
               'Six-month Return Percentile', 
               'One-Year Return',
               'One-Year Return Percentile', 
               'Number of Shares to Buy']
hqm_dataframe = pd.DataFrame(columns = hqm_columns)
hqm_dataframe

Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy


Now we need to make an API call to get these metrics. We can adapt the code from above to fit our requirement here. Reusability is one of the key properties of good code.

In [10]:
for symbol_string in symbol_strings:
    batch_api_url = f'https://sandbox.iexapis.com/stable/stock/market/batch/?types=stats,quote&symbols={symbol_string}&token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(batch_api_url).json()
    #print(data)
    for symbol in symbol_string.split(','):
        hqm_dataframe = hqm_dataframe.append( # the only changes that are made are the columns and the name of the data frame
        pd.Series( 
        [symbol,
         data[symbol]['quote']['latestPrice'],
         data[symbol]['stats']['month1ChangePercent'],
         'N/A',
         data[symbol]['stats']['month3ChangePercent'],
         'N/A',
         data[symbol]['stats']['month6ChangePercent'],
         'N/A',
         data[symbol]['stats']['year1ChangePercent'],
         'N/A',
         'N/A'],
        index = hqm_columns),
        ignore_index = True)
hqm_dataframe

Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy
0,A,125.200,0.120315,,0.0177166,,-0.0734235,,-0.144705,,
1,AAL,13.970,0.0826888,,-0.307683,,-0.139079,,-0.353409,,
2,AAP,198.310,0.107551,,-0.158497,,-0.151976,,-0.070692,,
3,AAPL,153.520,0.144247,,-0.0680674,,-0.04238,,0.0771907,,
4,ABBV,150.830,0.0397097,,-0.060248,,0.143063,,0.317363,,
...,...,...,...,...,...,...,...,...,...,...,...
496,YUM,120.110,0.0812466,,-0.0292748,,-0.0229501,,0.0494523,,
497,ZBH,110.688,0.0634575,,-0.174896,,-0.116461,,-0.317956,,
498,ZBRA,326.200,0.124716,,-0.155061,,-0.328445,,-0.391153,,
499,ZION,53.310,0.000583426,,-0.148942,,-0.146286,,0.0274332,,


## Calculating Momentum Percentiles:

We now need to calculate momentum percentile scores for every stock in the universe. More specifically, we need to calculate percentile scores for the following metrics for every stock:

- One-Year Price Return
- Six-Month Price Return
- Three-Month Price Return
- One-Month Price Return

Here's how we'll do this:

In [12]:
time_periods = [
                'One-month',
                'Three-month',
                'Six-month',
                'One-Year',
                ]
for time_period in time_periods:
    for index in hqm_dataframe.index:
        hqm_dataframe.loc[index, f'{time_period} Return Percentile'] = stats.percentileofscore(
            hqm_dataframe[f'{time_period} Return'],
            hqm_dataframe.loc[index, f'{time_period} Return'])
hqm_dataframe

TypeError: '<' not supported between instances of 'NoneType' and 'float'

This should have worked, but we kept getting an error saying that 'NoneType' and 'float' cannot be compared, which means some rows in this data have None values in them. So we'll run a test loop to find them.

In [49]:
tests = time_periods
test_dataframe = pd.DataFrame(columns = tests)
for test in tests:
    a = 0
    for index in hqm_dataframe.index:
        if hqm_dataframe[f'{test} Return'][index] is None:
            test_dataframe.loc[a, test] = index
            a = a + 1
test_dataframe

Unnamed: 0,One-month,Three-month,Six-month,One-Year
0,118,118,118,118
1,164,164,164,164
2,324,324,324,324
3,325,325,325,325


We see that every column has None values at the same four indices, i.e., four rows have None values in them. We could deal with this by setting all of them to 0, but that would skew our percentile scores, so it is better to simply remove those rows from the data.

I wrote the loop above only to realise that the isnull() method could be used here instead. I'm new to Python and may not find the optimal solution in one go, but I can usually work it out. We can select all the rows that have a None value in any column.

In [22]:
hqm_dataframe[hqm_dataframe.isnull().any(axis=1)]

Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy
118,CTL,11.0,,,,,,,,,
164,ETFC,51.44,,,,,,,,,
324,MYL,16.484,,,,,,,,,
325,NBL,8.76,,,,,,,,,
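
As an alternative to editing the ticker list by hand, the rows with missing returns could be dropped directly from the DataFrame. This is a sketch on a toy frame rather than the approach taken in this notebook; `toy`, `return_columns`, `dropped` and `cleaned` are illustrative names.

```python
import pandas as pd

toy = pd.DataFrame({
    'Ticker': ['A', 'CTL', 'AAL'],
    'One-month Return': [0.120315, None, 0.0826888],
    'One-Year Return': [-0.144705, None, -0.353409],
})
return_columns = ['One-month Return', 'One-Year Return']

# Tickers that have a missing value in any of the return columns
dropped = toy[toy[return_columns].isnull().any(axis=1)]['Ticker'].tolist()

# Remove those rows and renumber the index
cleaned = toy.dropna(subset=return_columns).reset_index(drop=True)
print(dropped)
print(cleaned)
```

Restricting `dropna` to the return columns matters here, because other columns may hold placeholder values that should not decide which rows are removed.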


We need to go back and remove these tickers from the stock list, by uncommenting the second line of cell In [4] and re-running every cell after it.

Now that that's done, we run the same code again and it should work. I am leaving the error message above in order to showcase the live debugging of the code, which is a learning process for every coder.

In [91]:
time_periods = [
                'One-month',
                'Three-month',
                'Six-month',
                'One-Year',
                ]
for time_period in time_periods:
    for index in hqm_dataframe.index:
        hqm_dataframe.loc[index, f'{time_period} Return Percentile'] = stats.percentileofscore(
            hqm_dataframe[f'{time_period} Return'],
            hqm_dataframe.loc[index, f'{time_period} Return'])
hqm_dataframe

Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy
0,A,123.36,0.060916,43.0584,-0.060015,69.8189,-0.153141,39.4366,-0.188820,24.3461,
1,AAL,15.93,0.174730,97.1831,-0.226758,13.2797,-0.128714,45.2716,-0.207755,21.7304,
2,AAP,193.79,0.142489,91.1469,-0.139120,37.0221,-0.174562,32.3944,-0.050404,49.6982,
3,AAPL,159.73,0.154477,93.5614,-0.097903,54.7284,-0.091762,55.1308,0.066542,71.831,
4,ABBV,148.64,0.096924,72.6358,-0.033592,77.0624,0.127758,92.1529,0.365672,94.165,
...,...,...,...,...,...,...,...,...,...,...,...
492,YUM,122.47,0.098578,73.6419,-0.034082,76.66,-0.028462,69.2153,0.077936,73.0382,
493,ZBH,113.00,0.047974,34.2052,-0.183446,23.34,-0.128652,45.4728,-0.286008,11.0664,
494,ZBRA,335.54,0.082818,62.7767,-0.237666,11.0664,-0.399843,3.21932,-0.392609,4.82897,
495,ZION,55.19,0.006512,18.3099,-0.200994,18.1087,-0.215026,25.1509,0.107400,76.0563,
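
To make the percentile step concrete, here is a small sketch on four made-up returns. It runs the same `stats.percentileofscore` call as the loop above and compares it with pandas' `rank(pct=True)`, which gives the same numbers for scores drawn from the data itself when `percentileofscore` is left at its default `kind='rank'`. The values in `returns` are invented for illustration.

```python
import pandas as pd
from scipy import stats

returns = pd.Series([0.10, -0.05, 0.30, 0.02])  # made-up returns

# Same call as in the notebook, once per value
loop_percentiles = [stats.percentileofscore(returns, value) for value in returns]

# Vectorised alternative: percentage rank of each value, scaled to 0-100
rank_percentiles = (returns.rank(pct=True) * 100).tolist()

print(loop_percentiles)
print(rank_percentiles)
```

On a few hundred rows the loop is fast enough, so the vectorised form is mainly a matter of brevity.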


## Calculating the HQM Score:

We'll now calculate our HQM Score, which is the high-quality momentum score that we'll use to filter for stocks in this investing strategy.

The HQM Score will be the arithmetic mean of the 4 momentum percentile scores that we calculated in the last section.

To calculate arithmetic mean, we will use the mean function from Python's built-in statistics module.

In [92]:
for row in hqm_dataframe.index:
    momentum_percentiles = []
    for time_period in time_periods:
        momentum_percentiles.append(hqm_dataframe.loc[row, f'{time_period} Return Percentile'])
    hqm_dataframe.loc[row, 'HQM Score'] = mean(momentum_percentiles) #.loc[] will create a column with the key it can't find
hqm_dataframe # so we don't need to go back and create a new column in the data frame

Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy,HQM Score
0,A,123.36,0.060916,43.0584,-0.060015,69.8189,-0.153141,39.4366,-0.188820,24.3461,,44.164990
1,AAL,15.93,0.174730,97.1831,-0.226758,13.2797,-0.128714,45.2716,-0.207755,21.7304,,44.366197
2,AAP,193.79,0.142489,91.1469,-0.139120,37.0221,-0.174562,32.3944,-0.050404,49.6982,,52.565392
3,AAPL,159.73,0.154477,93.5614,-0.097903,54.7284,-0.091762,55.1308,0.066542,71.831,,68.812877
4,ABBV,148.64,0.096924,72.6358,-0.033592,77.0624,0.127758,92.1529,0.365672,94.165,,84.004024
...,...,...,...,...,...,...,...,...,...,...,...,...
492,YUM,122.47,0.098578,73.6419,-0.034082,76.66,-0.028462,69.2153,0.077936,73.0382,,73.138833
493,ZBH,113.00,0.047974,34.2052,-0.183446,23.34,-0.128652,45.4728,-0.286008,11.0664,,28.521127
494,ZBRA,335.54,0.082818,62.7767,-0.237666,11.0664,-0.399843,3.21932,-0.392609,4.82897,,20.472837
495,ZION,55.19,0.006512,18.3099,-0.200994,18.1087,-0.215026,25.1509,0.107400,76.0563,,34.406439


That looks fine. 
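
The row-by-row mean above can also be computed for all rows at once with `DataFrame.mean(axis=1)`. This is a sketch using the four percentiles of ticker A from the table, assuming the percentile columns hold floats (in the notebook they may need `.astype(float)` first); `toy` is an illustrative name.

```python
import pandas as pd

time_periods = ['One-month', 'Three-month', 'Six-month', 'One-Year']
percentile_columns = [f'{period} Return Percentile' for period in time_periods]

# Percentile scores for ticker A, copied from the table above
toy = pd.DataFrame([[43.0584, 69.8189, 39.4366, 24.3461]], columns=percentile_columns)

# Arithmetic mean across the four percentile columns, one value per row
toy['HQM Score'] = toy[percentile_columns].mean(axis=1)
print(toy['HQM Score'])
```

The result is about 44.165, which agrees with the HQM Score shown for A.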

## Selecting the 50 Best Momentum Stocks:

Now all we need to do is arrange them in descending order of HQM Score and select the first 50 stocks, which we can then use to calculate the number of shares.

In [93]:
hqm_dataframe.sort_values(by = 'HQM Score', ascending = False, inplace = True)
hqm_dataframe = hqm_dataframe[:50]
hqm_dataframe

Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy,HQM Score
223,HRB,37.8,0.156539,94.7686,0.38686,100.0,0.763774,99.5976,0.704857,97.5855,,97.987928
107,COG,22.71,0.249634,99.7988,0.258928,99.5976,0.189137,95.9759,0.350293,93.9638,,97.334004
104,CNC,92.15,0.199724,98.3903,0.064946,96.3783,0.180378,95.5734,0.284714,91.1469,,95.372233
351,OXY,65.98,0.131648,88.33,0.042675,93.5614,0.794246,99.7988,1.572829,99.7988,,95.372233
279,LLY,329.66,0.123064,85.3119,0.104113,98.3903,0.344187,99.3964,0.425287,94.9698,,94.517103
424,TAP,59.89,0.168368,96.5795,0.058268,95.7746,0.189731,96.1771,0.20757,87.3239,,93.963783
452,UNH,541.8,0.182839,98.1891,-0.004546,83.501,0.165225,94.9698,0.33524,93.5614,,92.555332
137,DLTR,180.94,0.135509,89.7384,-0.009931,82.0926,0.305426,97.9879,0.747583,98.1891,,92.002012
94,CI,270.93,0.126987,87.1227,0.058172,95.5734,0.168215,95.171,0.231371,88.9336,,91.700201
228,HUM,495.04,0.161029,95.7746,0.084339,97.3843,0.326749,99.1952,0.082631,73.4406,,91.448692


Resetting the indices.

In [94]:
hqm_dataframe.reset_index(drop = True, inplace = True)
hqm_dataframe

Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy,HQM Score
0,HRB,37.8,0.156539,94.7686,0.38686,100.0,0.763774,99.5976,0.704857,97.5855,,97.987928
1,COG,22.71,0.249634,99.7988,0.258928,99.5976,0.189137,95.9759,0.350293,93.9638,,97.334004
2,CNC,92.15,0.199724,98.3903,0.064946,96.3783,0.180378,95.5734,0.284714,91.1469,,95.372233
3,OXY,65.98,0.131648,88.33,0.042675,93.5614,0.794246,99.7988,1.572829,99.7988,,95.372233
4,LLY,329.66,0.123064,85.3119,0.104113,98.3903,0.344187,99.3964,0.425287,94.9698,,94.517103
5,TAP,59.89,0.168368,96.5795,0.058268,95.7746,0.189731,96.1771,0.20757,87.3239,,93.963783
6,UNH,541.8,0.182839,98.1891,-0.004546,83.501,0.165225,94.9698,0.33524,93.5614,,92.555332
7,DLTR,180.94,0.135509,89.7384,-0.009931,82.0926,0.305426,97.9879,0.747583,98.1891,,92.002012
8,CI,270.93,0.126987,87.1227,0.058172,95.5734,0.168215,95.171,0.231371,88.9336,,91.700201
9,HUM,495.04,0.161029,95.7746,0.084339,97.3843,0.326749,99.1952,0.082631,73.4406,,91.448692


That looks good. Now, we can proceed to calculating the number of shares of each stock that one must buy.

## Calculating the Number of Shares to Buy:

We'll use the portfolio_input function that we created earlier to accept our portfolio size. Then we will use similar logic in a for loop to calculate the number of shares to buy for each stock in our investment universe.

In [79]:
portfolio_input()

Please enter the value of your portfolio: 1000000


In [95]:
position_size = float(portfolio_size)/len(hqm_dataframe.index)
for index in hqm_dataframe.index:
    hqm_dataframe.loc[index, 'Number of Shares to Buy'] = math.floor(position_size/hqm_dataframe['Price'][index])
hqm_dataframe

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  isetter(loc, value)


Unnamed: 0,Ticker,Price,One-month Return,One-month Return Percentile,Three-month Return,Three-month Return Percentile,Six-month Return,Six-month Return Percentile,One-Year Return,One-Year Return Percentile,Number of Shares to Buy,HQM Score
0,HRB,37.8,0.156539,94.7686,0.38686,100.0,0.763774,99.5976,0.704857,97.5855,529,97.987928
1,COG,22.71,0.249634,99.7988,0.258928,99.5976,0.189137,95.9759,0.350293,93.9638,880,97.334004
2,CNC,92.15,0.199724,98.3903,0.064946,96.3783,0.180378,95.5734,0.284714,91.1469,217,95.372233
3,OXY,65.98,0.131648,88.33,0.042675,93.5614,0.794246,99.7988,1.572829,99.7988,303,95.372233
4,LLY,329.66,0.123064,85.3119,0.104113,98.3903,0.344187,99.3964,0.425287,94.9698,60,94.517103
5,TAP,59.89,0.168368,96.5795,0.058268,95.7746,0.189731,96.1771,0.20757,87.3239,333,93.963783
6,UNH,541.8,0.182839,98.1891,-0.004546,83.501,0.165225,94.9698,0.33524,93.5614,36,92.555332
7,DLTR,180.94,0.135509,89.7384,-0.009931,82.0926,0.305426,97.9879,0.747583,98.1891,110,92.002012
8,CI,270.93,0.126987,87.1227,0.058172,95.5734,0.168215,95.171,0.231371,88.9336,73,91.700201
9,HUM,495.04,0.161029,95.7746,0.084339,97.3843,0.326749,99.1952,0.082631,73.4406,40,91.448692


We're almost finished. All we need to do is format our output into an Excel file so that it is accessible to a general audience.

## Formatting Our Excel Output:

We will be using the XlsxWriter library for Python to create nicely-formatted Excel files. XlsxWriter is an excellent package and offers tons of customization. We start by initialising the xlsxwriter object.

In [121]:
writer = pd.ExcelWriter('momentum_strategy.xlsx', engine='xlsxwriter')
hqm_dataframe.to_excel(writer, sheet_name='Momentum Strategy', index = False)

You'll recall from our first project (Equal Weight S&P 500) that formats include colors, fonts, and also symbols like % and $. We'll need the following formats for our Excel document:

- String format for tickers
- $XX.XX format for stock prices
- Integer format for the number of shares to purchase
- Percent format for the returns
- Decimal format for the percentiles and the HQM Score

Since we already built the first three formats in the last project, I've included them below for you. Run this code cell before proceeding.

In [122]:
background_color = '#ffffff'
font_color = '#000000'
# let's start with string format
string_format = writer.book.add_format(
        {
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )
dollar_format = writer.book.add_format(
        {
            'num_format': '$0.00',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )
integer_format = writer.book.add_format(
        {
            'num_format': '0',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )
# this assigns formats to each of the variables

To these we'll add a few other formats that are missing, which handle percentages, decimals, and bold letters for the column labels.

In [123]:
percent_format = writer.book.add_format(
        {
            'num_format':'0.0%',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )
decimal_format = writer.book.add_format(
        {
            'num_format': '0.00',
            'font_color': font_color,
            'bg_color': background_color,
            'border': 1
        }
    )
bold_format = writer.book.add_format({'bold': True})

Now we apply the formats to the columns of the data frame. We create a dictionary with the column details and loop through that.

In [124]:
column_formats = { 
                    'A': ['Ticker', string_format],
                    'B': ['Price', dollar_format],
                    'C': ['One-month Return', percent_format],
                    'D': ['One-month Return Percentile', decimal_format],
                    'E': ['Three-month Return', percent_format],
                    'F': ['Three-month Return Percentile', decimal_format],
                    'G': ['Six-month Return', percent_format],
                    'H': ['Six-month Return Percentile', decimal_format],
                    'I': ['One-Year Return', percent_format],
                    'J': ['One-Year Return Percentile', decimal_format],
                    'K': ['Number of Shares to Buy', integer_format],
                    'L': ['HQM Score', decimal_format]
                    }
for column in column_formats.keys():
    writer.sheets['Momentum Strategy'].set_column(f'{column}:{column}', 20, column_formats[column][1])
    writer.sheets['Momentum Strategy'].write(f'{column}1', column_formats[column][0], bold_format)
writer.close() # writes the file to disk; writer.save() was removed in pandas 2.0
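
The dictionary above hard-codes the column letters, so it has to be edited by hand if the column order changes. As a sketch of one way around that, the letters can be derived from the column list itself. This assumes at most 26 columns (A to Z), and `column_letters` is an illustrative name.

```python
from string import ascii_uppercase

# Column order of the final data frame, as shown in the tables above
hqm_columns = [
    'Ticker', 'Price',
    'One-month Return', 'One-month Return Percentile',
    'Three-month Return', 'Three-month Return Percentile',
    'Six-month Return', 'Six-month Return Percentile',
    'One-Year Return', 'One-Year Return Percentile',
    'Number of Shares to Buy', 'HQM Score',
]

# Pair each column name with its spreadsheet letter: A, B, C, ...
column_letters = dict(zip(ascii_uppercase, hqm_columns))
print(column_letters)
```

A second lookup from column name to format would still be needed to reproduce the full `column_formats` dictionary.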

Voila! And that brings us to the end of this project, the purpose of which was to practice API calling, data cleaning and exploration, debugging, and formatting our data into xlsx files.