## Intermediate Programming in Python S&P Stock and Weight Analysis: Gabriel Haley

### Research Question

Do higher stock prices indicate a higher weight in the S&P 500 index?

### Source

https://www.slickcharts.com/sp500

In [17]:
import requests
import pandas as pd
from bs4 import BeautifulSoup

In [19]:
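url = "https://www.slickcharts.com/sp500"
# Send a browser-like User-Agent: this scrapes the HTML page (not an API),
# and the site may refuse requests that lack one.
headers = {'User-Agent': 'Mozilla/5'}
response = requests.get(url, headers=headers)
response.raise_for_status()  # stop early if the page was not returned
soup = BeautifulSoup(response.content, 'html.parser')
table = soup.find('table', class_='table')  # first table with class "table"
rows = table.find_all('tr')  # extract all rows in the table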

data = []
for row in rows[1:]:  # skip the header row
    cols = row.find_all('td')  # all cells in this row
    if len(cols) >= 5:  # cols[1] through cols[4] are read below
        company = cols[1].text.strip()
        symbol = cols[2].text.strip()
        weight = cols[3].text.strip().replace('%', '')  # drop the percent sign
        price = cols[4].text.strip().replace('$', '')  # drop the dollar sign
        data.append([company, symbol, weight, price])
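
The loop above strips only `%` and `$` from the cell text before numeric conversion. As a minimal sketch, assuming some cells might also contain thousands separators or placeholder text (an assumption, not something confirmed by the page itself), a hypothetical helper named `parse_number` could normalise a cell before it is stored:

```python
def parse_number(text):
    """Convert cell text such as '$1,234.56' or '7.23%' to a float.

    Hypothetical helper, not part of the original notebook. Returns None
    when the cleaned text cannot be parsed as a number.
    """
    cleaned = text.strip()
    # Remove currency, percent, and thousands-separator characters.
    for symbol in ('$', '%', ','):
        cleaned = cleaned.replace(symbol, '')
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_number('$1,234.56'))  # 1234.56
print(parse_number('7.23%'))      # 7.23
print(parse_number('n/a'))        # None
```

Returning `None` keeps the failure visible, so rows that still cannot be parsed can be inspected rather than silently lost.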


In [21]:
# Build the DataFrame
df = pd.DataFrame(data, columns=['Company', 'Symbol', 'Weight (%)', 'Price'])
# Convert price and weight to numbers; errors='coerce' turns any value
# that cannot be parsed into NaN
df['Price'] = pd.to_numeric(df['Price'], errors='coerce')
df['Weight (%)'] = pd.to_numeric(df['Weight (%)'], errors='coerce')
# Drop rows where either conversion failed
df = df.dropna(subset=['Weight (%)', 'Price'])
print(df)

                      Company Symbol  Weight (%)   Price
0                      Nvidia   NVDA        7.23  171.47
1                   Microsoft   MSFT        6.49  504.40
2                  Apple Inc.   AAPL        6.07  236.25
3                      Amazon   AMZN        4.17  225.70
4              Meta Platforms   META        3.20  736.99
..                        ...    ...         ...     ...
498  Eastman Chemical Company    EMN        0.01   67.66
499               MarketAxess   MKTX        0.01  179.89
500       News Corp (Class B)    NWS        0.01   32.52
501     Caesars Entertainment    CZR        0.01   25.04
502            Enphase Energy   ENPH        0.01   37.31

[494 rows x 4 columns]
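
The index in the printed frame runs from 0 to 502 (503 scraped rows), yet only 494 rows remain, so `dropna` removed nine rows whose price or weight could not be converted. One way to see which rows those are is to build a mask before dropping them. The sketch below shows the pattern on a small made-up table; the company names and values are placeholders, not scraped data, and the thousands separator is only one possible cause of a failed conversion.

```python
import pandas as pd

# Placeholder rows for illustration only; not taken from the scraped table.
raw = pd.DataFrame({
    'Company': ['Example A', 'Example B', 'Example C'],
    'Weight (%)': ['0.50', '0.25', '0.10'],
    'Price': ['101.50', '1,250.00', '37.31'],
})

# Same conversion as in the notebook: text that cannot be parsed becomes NaN.
converted = raw.copy()
for col in ['Weight (%)', 'Price']:
    converted[col] = pd.to_numeric(raw[col], errors='coerce')

# True for every row that dropna(subset=[...]) would remove.
failed = converted[['Weight (%)', 'Price']].isna().any(axis=1)

# Show the original text of the rows that failed to convert.
print(raw[failed])
dropped_companies = raw.loc[failed, 'Company'].tolist()
```

Applied to the real data, the same mask would list the nine excluded companies, which matters here because the correlation is computed only on the rows that survive.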


In [23]:
correlation = df['Price'].corr(df['Weight (%)'])  # Pearson correlation between price and weight
print(f"""With a correlation of {correlation}, the linear relationship between price and weight is weak, which suggests
that a higher stock price is not a reliable indicator of a higher index weight.""")

With a correlation of 0.18756373530947143, the linear relationship between price and weight is weak, which suggests
that a higher stock price is not a reliable indicator of a higher index weight.
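
A likely reason for the weak correlation is that the S&P 500 is weighted by (float-adjusted) market capitalization, which is roughly share price times shares outstanding, so a high price alone does not imply a large weight. The sketch below illustrates this with four hypothetical companies; the prices and share counts are invented for the example, and it ignores the float adjustment.

```python
import pandas as pd

# Hypothetical companies; prices and share counts are invented for illustration.
toy = pd.DataFrame({
    'Company': ['A', 'B', 'C', 'D'],
    'Price': [50.0, 500.0, 100.0, 200.0],
    'Shares': [1000, 10, 200, 125],
})

# Market capitalization = price * shares outstanding.
toy['MarketCap'] = toy['Price'] * toy['Shares']
# Weight = each company's share of the total market cap, in percent.
toy['Weight (%)'] = 100 * toy['MarketCap'] / toy['MarketCap'].sum()

print(toy[['Company', 'Price', 'Weight (%)']])

# Pearson correlation between price and weight, as in the notebook.
pearson = toy['Price'].corr(toy['Weight (%)'])
print(pearson)
```

In this toy table the highest-priced company (B) has the smallest weight because it has the fewest shares, so price and weight are even negatively correlated.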
