# Equal-Weight S&P 500 Index Fund

# Start of Project
"The goal of this section is to create a Python script that will accept the value of the user's portfolio and tell the user how many shares of each S&P 500 constituent the user should purchase to get an equal-weight version of the index fund." 

In [1]:
#Import all the open source libraries 
import numpy as np
import pandas as pd
import requests
import xlsxwriter
import math

Read a CSV file saved in this directory that lists the ticker symbols of the companies comprising the S&P 500, and store it as a pandas DataFrame.

In [2]:
stocks = pd.read_csv("sp_500_stocks.csv")
type(stocks)

pandas.core.frame.DataFrame

In [3]:
stocks

Unnamed: 0,Ticker
0,A
1,AAL
2,AAP
3,AAPL
4,ABBV
...,...
500,YUM
501,ZBH
502,ZBRA
503,ZION


Import the API (sandbox) token from the IEX Cloud account (https://iexcloud.io/console/), our data provider. The token is stored in a local file, secrets.py. A token is normally secret and should not be pushed to Git.

In [4]:
from secrets import IEX_CLOUD_API_TOKEN
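
One possible alternative to a local secrets.py file is to read the token from an environment variable. This is a minimal sketch, not part of the original notebook: the helper name `load_api_token` and the variable name `IEX_CLOUD_API_TOKEN` are assumptions chosen for illustration. A side benefit is that it avoids naming a local file `secrets.py`, which shadows Python's standard-library `secrets` module.

```python
import os

def load_api_token(env_var="IEX_CLOUD_API_TOKEN"):
    """Return the API token stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you exported in your shell.
    """
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of sending an empty token
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return token
```

With the variable exported in the shell that starts Jupyter, `IEX_CLOUD_API_TOKEN = load_api_token()` would take the place of the import above.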

Making our first API call to IEX Cloud. 
We are looking for these fields for each symbol in our pandas DataFrame:
- market capitalization
- stock price

Testing the call for one stock

In [5]:
# Test for 1 symbol (AAPL)
symbol = "AAPL"
# The API URL comes from the API documentation (https://iexcloud.io/docs/api/).
# The production base URL is https://cloud.iexapis.com/, but we are using the sandbox (https://sandbox.iexapis.com).
#api_url = "https://sandbox.iexapis.com" + endpoint 
# Pick the endpoint: the quote endpoint returns both the stock price and the market cap.
# HTTP request: GET /stock/{symbol}/quote/{field}
# Append the endpoint to the URL: "https://sandbox.iexapis.com/stock/{symbol}/quote/"
# Use an f-string so that the {symbol} placeholder is filled in.
api_url = f'https://sandbox.iexapis.com/stock/{symbol}/quote/'
print(api_url)

https://sandbox.iexapis.com/stock/AAPL/quote/


In [6]:
# Add secret API token
api_url = f'https://sandbox.iexapis.com/stock/{symbol}/quote/?token={IEX_CLOUD_API_TOKEN}'
print(api_url)

https://sandbox.iexapis.com/stock/AAPL/quote/?token=Tpk_f5a406208fe6498c881594c2cd1ba5b8


Next: do a get request and store the data in a variable


In [7]:
data = requests.get(api_url)
type(data)

requests.models.Response

In [8]:
print(data.status_code)

404


The 404 status code appears because we did not specify that we wanted to access the latest STABLE API version (as opposed to the latest API version).

In [9]:
#Adding /stable/ to the source address
api_url = f'https://sandbox.iexapis.com/stable/stock/{symbol}/quote/?token={IEX_CLOUD_API_TOKEN}'
data = requests.get(api_url)
type(data)

requests.models.Response

In [10]:
print(data.status_code == 200) #200 meaning the data came through

True
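
The URL assembly from the cells above can be wrapped in a small helper so that the `/stable/` prefix is never forgotten. This is only a sketch: the function name `build_quote_url` and the constant `BASE_URL` are hypothetical, and the token shown is a placeholder.

```python
from urllib.parse import quote

# Sandbox host plus the /stable/ version prefix used in the cells above
BASE_URL = "https://sandbox.iexapis.com/stable"

def build_quote_url(symbol, token, base_url=BASE_URL):
    """Build the quote-endpoint URL for one symbol (hypothetical helper)."""
    # quote() percent-encodes characters that are not safe inside a URL
    return f"{base_url}/stock/{quote(symbol)}/quote/?token={quote(token)}"

url = build_quote_url("AAPL", "YOUR_TOKEN")
print(url)
```

After `response = requests.get(url)`, calling `response.raise_for_status()` raises an exception on a 4xx or 5xx status, which can replace the manual `status_code` check.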


In [11]:
#Checking the data itself: turning that data into a json object using the json() method
data = requests.get(api_url).json()
print(data)

{'avgTotalVolume': 88317780, 'calculationPrice': 'close', 'change': -1.6, 'changePercent': -0.01066, 'close': 150.59, 'closeSource': 'ifcifola', 'closeTime': 1690521603022, 'companyName': 'Apple Inc', 'currency': 'USD', 'delayedPrice': 148.56, 'delayedPriceTime': 1636926413052, 'extendedChange': -0.35, 'extendedChangePercent': -0.00244, 'extendedPrice': 145.48, 'extendedPriceTime': 1634548507034, 'high': 150.87, 'highSource': 'e5d e  niaeyd1trlicemup', 'highTime': 1695956894095, 'iexAskPrice': 0, 'iexAskSize': 0, 'iexBidPrice': 0, 'iexBidSize': 0, 'iexClose': 148.47, 'iexCloseTime': 1664406878877, 'iexLastUpdated': 1704379774065, 'iexMarketPercent': 0.011733182987883688, 'iexOpen': 148.52, 'iexOpenTime': 1636812165075, 'iexRealtimePrice': 152.24, 'iexRealtimeSize': 1, 'iexVolume': 871058, 'lastTradeTime': 1680943185329, 'latestPrice': 150.49, 'latestSource': 'Close', 'latestTime': 'September 27, 2021', 'latestUpdate': 1694419072116, 'latestVolume': 74333611, 'low': 147.65, 'lowSource':

The result is a Python dictionary, so we can now read the key/value pairs we are interested in.

In [12]:
price = data['latestPrice']
price

150.49

In [13]:
market_cap = data['marketCap']
market_cap

2477658375161

In [14]:
#make it readable :)
print(market_cap/1000000000000)

2.477658375161


In [15]:
market_cap = market_cap/1000000000000
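
As a small illustration of the scaling above, the conversion to trillions can be kept separate from the raw value and applied only when printing. The helper name `market_cap_in_trillions` is made up for this sketch; the figure is the AAPL market cap returned earlier.

```python
def market_cap_in_trillions(market_cap_usd):
    """Convert a market cap in dollars to trillions of dollars."""
    return market_cap_usd / 1e12

cap = 2477658375161  # raw value from the quote endpoint above
print(f"${market_cap_in_trillions(cap):.2f}T")  # prints $2.48T
```

Keeping the raw dollar value in the variable and formatting on output avoids mixing units between cells.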

# Adding Our Stocks Data to a Pandas DataFrame
The next thing we need to do is add our stock's price and market capitalization to a pandas DataFrame. 
We need to think about what we want that DataFrame to look like and contain.

In [16]:
my_columns = ['Ticker', 'Stock Price', 'Market Cap', '# of shares to buy'] # a simple Python list
# use the pandas DataFrame constructor to initialize the DataFrame
# with one row of default values of 0 (a list within a list)
final_dataframe = pd.DataFrame([[0,0,0,0]],columns = my_columns)
final_dataframe

Unnamed: 0,Ticker,Stock Price,Market Cap,# of shares to buy
0,0,0,0,0


Now we need to append the values we obtained from our data.

In [17]:
# Append a pandas Series holding the new row's values.
# Every row and every column of a pandas DataFrame is a pandas Series; pd.Series takes a Python list as its argument.
# Note: DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0 (pd.concat replaces it), so this cell needs an older pandas.
final_dataframe.append(
    pd.Series(
        [
            symbol,
            price,
            market_cap,
            "N/A"  # placeholder; the number of shares to buy is calculated later
        ],
        index = my_columns  # maps each value to its column
    ),
    ignore_index = True  # required when appending an unnamed Series
)

Unnamed: 0,Ticker,Stock Price,Market Cap,# of shares to buy
0,0,0.0,0.0,0.0
1,AAPL,150.49,2.477658,
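
On pandas 2.0 and later, where `DataFrame.append` no longer exists, the same step can be written with `pd.concat`. The following is a sketch of that substitution, not the notebook's original method; the values are the ones shown in the output above.

```python
import pandas as pd

my_columns = ['Ticker', 'Stock Price', 'Market Cap', '# of shares to buy']

# Same starting point as above: one placeholder row of zeros
final_dataframe = pd.DataFrame([[0, 0, 0, 0]], columns=my_columns)

# Build the new row as a single-row DataFrame with matching columns
new_row = pd.DataFrame([['AAPL', 150.49, 2.477658, 'N/A']], columns=my_columns)

# pd.concat returns a new DataFrame; ignore_index renumbers the rows from 0
final_dataframe = pd.concat([final_dataframe, new_row], ignore_index=True)
print(final_dataframe)
```

Like `append`, `pd.concat` does not modify its inputs, so the result has to be assigned back to `final_dataframe`.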


Now that we know that it works for 1 symbol, we can run the program for all the symbols.
We will overwrite the existing dataframe, and start a new one using a loop through the csv file's ticker column.

In [18]:
# Try the loop with the first 5 symbols
final_dataframe = pd.DataFrame(columns = my_columns)
for stock in stocks["Ticker"][:5]:  # the first 5 tickers in the list
    # call the API, substituting the current ticker into the URL
    api_url = f'https://sandbox.iexapis.com/stable/stock/{stock}/quote/?token={IEX_CLOUD_API_TOKEN}'
    data = requests.get(api_url).json()

    # DataFrame.append returns a new DataFrame, so reassign the result
    final_dataframe = final_dataframe.append(
        pd.Series(
            [
                stock,
                data['latestPrice'],
                data['marketCap']/1000000000000,  # market cap in trillions of dollars
                "N/A"
            ],
            index = my_columns  # maps each value to its column
        ),
        ignore_index = True  # required when appending an unnamed Series
    )
    
    

In [19]:
final_dataframe

Unnamed: 0,Ticker,Stock Price,Market Cap,# of shares to buy
0,A,172.6,0.050762,
1,AAL,22.53,0.014076,
2,AAP,222.9,0.014246,
3,AAPL,152.0,2.410098,
4,ABBV,110.21,0.195004,


Given that the ticker list is about 500 symbols long, it is best to run this in batches (one request per ticker takes too long to run),
overwriting the empty DataFrame with the results from the append method.

Nick McCullum's notes:
"Using Batch API Calls to Improve Performance
Batch API calls are one of the easiest ways to improve the performance of your code. This is because HTTP requests are typically one of the slowest components of a script. Also, API providers will often give you discounted rates for using batch API calls since they are easier for the API provider to respond to.

IEX Cloud limits their batch API calls to 100 tickers per request. Still, this reduces the number of API calls we'll make in this section from 500 to 5 - huge improvement! In this section, we'll split our list of stocks into groups of 100 and then make a batch API call for each group."


In [20]:
def chunks(lst, n):
    # yields successive n-sized chunks from lst
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
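
To see what the generator produces, here is a small usage sketch with the first five tickers from the CSV and a chunk size of 2 instead of 100. The variable names `tickers` and `groups` are illustrative only.

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

tickers = ['A', 'AAL', 'AAP', 'AAPL', 'ABBV']  # first five tickers from the CSV
groups = list(chunks(tickers, 2))
print(groups)  # [['A', 'AAL'], ['AAP', 'AAPL'], ['ABBV']]

# Each chunk becomes one comma-separated string for a batch request
symbol_strings = [','.join(group) for group in groups]
print(symbol_strings)  # ['A,AAL', 'AAP,AAPL', 'ABBV']
```

The last chunk is simply shorter when the list length is not a multiple of `n`.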

In [21]:
symbol_groups = list(chunks(stocks['Ticker'], 100)) # a list of chunks, each holding at most 100 tickers
symbol_strings = []
for group in symbol_groups:
    # join each chunk into one comma-separated string for the batch endpoint
    symbol_strings.append(','.join(group))
final_dataframe = pd.DataFrame(columns = my_columns)
#final_dataframe   
#for symbol_string in symbol_strings[:1]: --> test for first chunk
for symbol_string in symbol_strings:
    #print(symbol_string)
    batch_api_call_url = f'https://sandbox.iexapis.com/stable/stock/market/batch?symbols={symbol_string}&types=quote&token={IEX_CLOUD_API_TOKEN}'
    #print(batch_api_call_url)
    #data = requests.get(batch_api_call_url)
    #print(data.status_code)
    data = requests.get(batch_api_call_url).json()
    for symbol in symbol_string.split(','):
        #print(symbol)
        final_dataframe = final_dataframe.append(
            pd.Series(
            [
                symbol,
                data[symbol]['quote']['latestPrice'],
                data[symbol]['quote']['marketCap'],
                "N/A"
            ],
                index = my_columns
            ),
            ignore_index = True
        )
final_dataframe
        

Unnamed: 0,Ticker,Stock Price,Market Cap,# of shares to buy
0,A,171.75,50683839369,
1,AAL,22.18,14351556314,
2,AAP,220.70,14292581103,
3,AAPL,147.62,2496315609890,
4,ABBV,109.68,196836548185,
...,...,...,...,...
500,YUM,131.49,38083428422,
501,ZBH,157.28,33204895304,
502,ZBRA,574.68,29600088897,
503,ZION,64.08,10534798712,
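
The parsing step of the batch loop can be tried offline. The sketch below uses a hand-written dictionary in place of a real response, mirroring the `data[symbol]['quote'][...]` shape used above with prices taken from the table; the helper `batch_to_rows` and the `'MISSING'` ticker are hypothetical. It also collects plain rows and builds the DataFrame once, which avoids repeated appends.

```python
import pandas as pd

my_columns = ['Ticker', 'Stock Price', 'Market Cap', '# of shares to buy']

# Hand-written stand-in for the parsed JSON of one batch call
sample_batch = {
    'A': {'quote': {'latestPrice': 171.75, 'marketCap': 50683839369}},
    'AAL': {'quote': {'latestPrice': 22.18, 'marketCap': 14351556314}},
}

def batch_to_rows(batch, symbols):
    """Turn one parsed batch response into a list of row lists."""
    rows = []
    for symbol in symbols:
        quote = batch.get(symbol, {}).get('quote')
        if quote is None:
            continue  # skip symbols the response does not contain
        rows.append([symbol, quote['latestPrice'], quote['marketCap'], 'N/A'])
    return rows

rows = batch_to_rows(sample_batch, ['A', 'AAL', 'MISSING'])
final_dataframe = pd.DataFrame(rows, columns=my_columns)
print(final_dataframe)
```

Skipping missing symbols is a design choice for this sketch; the loop above would instead raise a `KeyError` if a ticker were absent from the response.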


# Calculating the Number of Shares to Buy
We take user input for the size of the portfolio (the money available to buy shares).

In [22]:
# keep asking until the input can be converted to a number
while True:
    portfolio_size = input("Enter the amount you have available to buy shares: ")
    try:
        val = float(portfolio_size)
        break
    except ValueError:
        print("That is not a number. Please enter a valid number (digits only).")
    

Enter the amount you have available to buy shares: 1000000


In [23]:
#in this program we will allocate the portfolio size equally among all shares in the index
position_size = val/len(final_dataframe.index)
position_size

1980.1980198019803

Here is the calculation to know the number of shares that can be bought for each stock

In [24]:
#number_of_shares = position_size / stock price
#constraint: fractional share purchase not allowed
#testing for 1 stock

#number_of_apple_shares = math.floor(position_size/500)
#number_of_apple_shares

In [25]:
for i in range (0, len(final_dataframe.index)):
    #print(i) --> 504 rows
    #now we need to replace the N/A in the '# of shares to buy' column with the calculated value
    #.loc[i, column] selects the cell in row i of that column
    final_dataframe.loc[i, '# of shares to buy'] = math.floor(position_size/final_dataframe.loc[i, 'Stock Price'])
    
    
final_dataframe
    


Unnamed: 0,Ticker,Stock Price,Market Cap,# of shares to buy
0,A,171.75,50683839369,11
1,AAL,22.18,14351556314,89
2,AAP,220.70,14292581103,8
3,AAPL,147.62,2496315609890,13
4,ABBV,109.68,196836548185,18
...,...,...,...,...
500,YUM,131.49,38083428422,15
501,ZBH,157.28,33204895304,12
502,ZBRA,574.68,29600088897,3
503,ZION,64.08,10534798712,30
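
The row-by-row loop can also be written as a single vectorized expression. This is a sketch on three rows from the table above, not the notebook's original approach. The count of 505 positions is an assumption inferred from the position size printed earlier (1,000,000 / 1980.198...).

```python
import numpy as np
import pandas as pd

prices = pd.DataFrame({
    'Ticker': ['A', 'AAL', 'AAPL'],
    'Stock Price': [171.75, 22.18, 147.62],
})

portfolio_size = 1_000_000
number_of_positions = 505  # assumption: implied by the position size shown earlier
position_size = portfolio_size / number_of_positions

# Divide the whole column at once, then round down because
# fractional share purchases are not allowed
prices['# of shares to buy'] = np.floor(position_size / prices['Stock Price']).astype(int)
print(prices)
```

For these three tickers the result is 11, 89 and 13 shares, matching the loop's output above.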


# Formatting Our Excel Output
Saving the pandas DataFrame in an Excel file.
We will be using the XlsxWriter library for Python to create nicely-formatted Excel files.

## Initializing our XlsxWriter Object

In [26]:
# initialize the ExcelWriter class from the pandas library, using XlsxWriter as the engine
writer = pd.ExcelWriter("recommended trades.xlsx", engine = "xlsxwriter")
# next, pass in our pandas DataFrame and specify which tab of the Excel file it should go to:
# the writer object first, then the tab name, and finally index = False to omit the index column
final_dataframe.to_excel(writer, sheet_name = "Recommended Trades", index = False)

# Creating the Formats We'll Need For Our .xlsx File
Formats include colors, fonts, and also symbols like % and $. We'll need four main formats for our Excel document:

- String format for tickers
- \$XX.XX format for stock prices
- \$XX,XXX format for market capitalization
- Integer format for the number of shares to purchase

In [27]:
# creating 2 variables that specify the color scheme of the Excel file, so we can reference them later on
background_color = "#808080"
font_color = "#ffffff"

string_format = writer.book.add_format(
    {
        'font_color': font_color,
        'bg_color' : background_color,
        'border' : 1
    }

)
dollar_format = writer.book.add_format(
    {
        'num_format': '$0.00',
        'font_color': font_color,
        'bg_color' : background_color,
        'border' : 1
    }

)
integer_format = writer.book.add_format(
    {
        'num_format': '0',
        'font_color': font_color,
        'bg_color' : background_color,
        'border' : 1
    }

)

# Applying the Formats to the Columns of Our .xlsx File
We can use the set_column method of the writer.sheets['Recommended Trades'] object to apply formats to specific columns of our spreadsheet.

Here's an example:

writer.sheets['Recommended Trades'].set_column('B:B', #This tells the method to apply the format to column B
                     18, #This tells the method to apply a column width of 18 (in character units)
                     string_format #This applies the format 'string_format' to the column
                    )

In [28]:
#we test that for one column, and then incorporate in a loop so the program extends to all columns
#writer.sheets['Recommended Trades'].set_column('A:A', 18, string_format)
#since the first column will contain the ticker
#we must save this
#writer.save()

#writer.sheets['Recommended Trades'].set_column('B:B', 18, string_format)
#writer.sheets['Recommended Trades'].set_column('C:C', 18, string_format)
#writer.sheets['Recommended Trades'].set_column('D:D', 18, string_format)
#writer.save()

#writer.sheets['Recommended Trades'].write('A1', 'Ticker', string_format)
#writer.sheets['Recommended Trades'].write('B1', 'Stock Price', dollar_format)
#writer.sheets['Recommended Trades'].write('C1', 'Market Cap', dollar_format)
#writer.sheets['Recommended Trades'].write('D1', '# of shares to buy', integer_format)

In [29]:
#this works, but it violates the "don't repeat yourself" (DRY) principle
#we fix that by moving the repeated calls into a loop
#driven by a dictionary that maps each column letter to its header and format
column_formats = {
    'A':['Ticker', string_format], # value for the key is a list
    'B':['Stock Price', dollar_format],
    'C':['Market Cap', dollar_format],
    'D':['# of shares to buy', integer_format]
}

for column in column_formats.keys():
    writer.sheets['Recommended Trades'].set_column(f'{column}:{column}', 18, column_formats[column][1])
    writer.sheets['Recommended Trades'].write(f'{column}1', column_formats[column][0], column_formats[column][1])
                                                   
writer.close() # writes the file to disk; ExcelWriter.save() was removed in pandas 2.0