# Examining the relation between Nvidia stock prices and consumer hardware market share


By Izaan Syed, 100922342

Nvidia's stock price has risen sharply in recent years. Because the part of Nvidia's business closest to consumers is the sale of discrete graphics cards, I wanted to see whether its consumer GPU market share can be linked to the rise in its stock price.

The period observed runs from November 2019 to October 2024. The end date was chosen because the Steam Hardware Survey data was last updated for October 2024.

The stock data was collected from MarketWatch (https://www.marketwatch.com/investing/stock/amd/download-data?startDate=11/15/2019&endDate=11/15/2020), and the market share data comes from the Steam Hardware Survey, available in CSV format here: https://github.com/jdegene/steamHWsurvey

Unfortunately, the Steam Hardware Survey reports only percentages, not the total number of participants. The comparison therefore has to be made between Nvidia's market share (relative to Intel and AMD) and its stock price, rather than between GPU units and stock price.
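One way to quantify the comparison described above is a Pearson correlation between the two monthly series. The sketch below is only an illustration, not the analysis itself: it uses the first six monthly rows printed later in this notebook as sample input, and the variable names `sample`, `level_corr`, and `change_corr` are my own. Correlating month-over-month changes as well as raw levels is an added suggestion, since two series that merely trend over time can correlate strongly without being related.

```python
import pandas as pd

# Sample input: the first six monthly rows of the merged table printed later in this notebook.
sample = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-11-01", "2019-12-01", "2020-01-01",
        "2020-02-01", "2020-03-01", "2020-04-01",
    ]),
    "Stock Price": [5.24, 5.41, 5.97, 5.89, 6.92, 6.39],
    "Marketshare Percentage": [0.6919, 0.6935, 0.6921, 0.6861, 0.6804, 0.6680],
})

# Pearson correlation between the raw monthly levels.
level_corr = sample["Stock Price"].corr(sample["Marketshare Percentage"])

# Pearson correlation between month-over-month changes (assumption: a useful
# cross-check, because it is less sensitive to shared long-term trends).
changes = sample[["Stock Price", "Marketshare Percentage"]].diff().dropna()
change_corr = changes["Stock Price"].corr(changes["Marketshare Percentage"])

print(f"level correlation:  {level_corr:.3f}")
print(f"change correlation: {change_corr:.3f}")
```

Six rows are far too few to draw a conclusion; the same two calls can be applied to the full merged DataFrame once it is built.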

Spikes in market share can be attributed to Nvidia's 20/16 series, 30 series, and 40 series launches, and dips to AMD's 5XXX, 6XXX, and 7XXX series launches.

Stock split dates: 6/10/2024 and 7/20/2021.
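Stock splits matter because a price recorded before a split is not directly comparable to one recorded after it. The sketch below shows how raw prices could be put on a common basis. The split dates are the ones listed above; the ratios (4-for-1 in 2021 and 10-for-1 in 2024) are my assumption and are not stated in this notebook. The function name `split_adjust` and the sample prices are made up for illustration.

```python
import pandas as pd

# Split dates from the text above. The ratios are an assumption, not from this notebook.
SPLITS = [
    (pd.Timestamp("2021-07-20"), 4),
    (pd.Timestamp("2024-06-10"), 10),
]

def split_adjust(dates: pd.Series, prices: pd.Series) -> pd.Series:
    """Divide each raw price by the ratio of every split that happened after its date."""
    adjusted = prices.astype(float)
    for split_date, ratio in SPLITS:
        # Prices dated on or after the split are kept; earlier ones are divided.
        adjusted = adjusted.where(dates >= split_date, adjusted / ratio)
    return adjusted

# Made-up raw prices, one before both splits, one between them, one after both.
raw = pd.DataFrame({
    "Date": pd.to_datetime(["2019-11-01", "2021-08-02", "2024-07-01"]),
    "Raw Price": [200.0, 200.0, 120.0],
})
raw["Adjusted Price"] = split_adjust(raw["Date"], raw["Raw Price"])
print(raw)
```

Many data providers already publish split-adjusted historical prices, so it is worth checking the downloaded CSV files first; if they are already adjusted, this step should be skipped.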

In [48]:
#Cleaning of the Steam Hardware Survey (SHS) data for Nvidia marketshare

import pandas as pd

shs = pd.read_csv('shs.csv')

# find the index of the row of 2019-11-01
idx = shs[shs['date'] == '2019-11-01'].index[0]

# delete all rows prior to idx
shs = shs.loc[idx:]

# keep only rows whose 'category' column contains 'Video Card Description'
shs = shs[shs['category'].str.contains('Video Card Description')]

# keep only rows whose 'name' column contains 'NVIDIA', e.g. 'NVIDIA GeForce 840M'
shs = shs[shs['name'].str.contains('NVIDIA', na=False, case=False)]

# save the filtered rows as shs2.csv
shs.to_csv('shs2.csv', index=False)

# Sum the percentages of all NVIDIA cards per month to get the total NVIDIA
# marketshare for each month, stored in a separate variable
# Convert the 'date' column to datetime for grouping
shs['date'] = pd.to_datetime(shs['date'])

# Group by year and month, and sum up the percentages
nvidia_marketshare = shs.groupby(shs['date'].dt.to_period('M'))['percentage'].sum().reset_index()

# Convert period to a string representation for easier handling
nvidia_marketshare['date'] = nvidia_marketshare['date'].astype(str)

#add day to YYYY-MM
nvidia_marketshare['date'] = nvidia_marketshare['date'] + '-01'

#change 'date' to 'Date' and 'percentage' to 'Marketshare Percentage'
nvidia_marketshare.columns = ['Date', 'Marketshare Percentage']

# Save NVIDIA market share data to a new CSV file if needed
nvidia_marketshare.to_csv('nvidia_marketshare.csv', index=False)

# Print the resulting market share DataFrame
print(nvidia_marketshare)


          Date  Marketshare Percentage
0   2019-11-01                  0.6919
1   2019-12-01                  0.6935
2   2020-01-01                  0.6921
3   2020-02-01                  0.6861
4   2020-03-01                  0.6804
5   2020-04-01                  0.6680
6   2020-05-01                  0.6621
7   2020-06-01                  0.6744
8   2020-07-01                  0.6785
9   2020-08-01                  0.6786
10  2020-09-01                  0.6776
11  2020-10-01                  0.6766
12  2020-11-01                  0.6822
13  2020-12-01                  0.7219
14  2021-01-01                  0.6856
15  2021-02-01                  0.6914
16  2021-03-01                  0.6961
17  2021-04-01                  0.7009
18  2021-05-01                  0.6965
19  2021-06-01                  0.7060
20  2021-07-01                  0.6916
21  2021-08-01                  0.7142
22  2021-09-01                  0.7033
23  2021-10-01                  0.6952
24  2021-11-01           
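The monthly totals above can be scanned for spikes by looking at month-over-month differences. The sketch below is a minimal illustration using five values copied from the output above (October 2020 to February 2021). The `THRESHOLD` value of 0.02 is a hypothetical cut-off chosen only for this example, not something derived from the data.

```python
import pandas as pd

# Five monthly totals copied from the printed output above.
share = pd.Series(
    [0.6766, 0.6822, 0.7219, 0.6856, 0.6914],
    index=pd.to_datetime([
        "2020-10-01", "2020-11-01", "2020-12-01", "2021-01-01", "2021-02-01",
    ]),
    name="Marketshare Percentage",
)

# Change relative to the previous month (the first month has no predecessor).
change = share.diff()

# Hypothetical cut-off: flag months whose share moved by more than 2 percentage points.
THRESHOLD = 0.02
spikes = change[change.abs() > THRESHOLD]

print(spikes)
```

On this sample the flagged months are December 2020 (a rise) and January 2021 (a fall of similar size). Flagged months like these could then be checked against known product launch dates before attributing them to a launch.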

In [49]:
#Cleaning of Nvidia Stock market data.

import pandas as pd

#combine NVDA1, NVDA2, NVDA3, NVDA4, NVDA5 csv files

nvda1 = pd.read_csv('NVDA1.csv')
nvda2 = pd.read_csv('NVDA2.csv')
nvda3 = pd.read_csv('NVDA3.csv')
nvda4 = pd.read_csv('NVDA4.csv')
nvda5 = pd.read_csv('NVDA5.csv')

nvda = pd.concat([nvda1, nvda2, nvda3, nvda4, nvda5])

#Keep only the Date and Open columns (drops High, Low, Close, Volume). Open is used because it matches the SHS measurement dates most closely
nvda = nvda[['Date', 'Open']]

#Rename 'Open' to 'Stock Price'
nvda = nvda.rename(columns={'Open': 'Stock Price'})


#Insert a day component into each date string (MM/YYYY to MM/01/YYYY)
nvda['Date'] = nvda['Date'].apply(lambda x: f"{x.split('/')[0]}/01/{x.split('/')[1]}") #chatgpt

#Parse the strings into datetime values (MM/DD/YYYY text becomes YYYY-MM-DD dates)
nvda['Date'] = pd.to_datetime(nvda['Date'])

#sort in ascending order

nvda = nvda.sort_values(by='Date')

#save as NVDA-Final.csv
nvda.to_csv('NVDA-Final.csv', index=False)

print(nvda)


         Date  Stock Price
12 2019-11-01         5.24
11 2019-12-01         5.41
10 2020-01-01         5.97
9  2020-02-01         5.89
8  2020-03-01         6.92
7  2020-04-01         6.39
6  2020-05-01         7.11
5  2020-06-01         8.83
4  2020-07-01         9.52
3  2020-08-01        10.73
2  2020-09-01        13.48
1  2020-10-01        13.76
0  2020-11-01        12.66
11 2020-12-01        13.49
10 2021-01-01        13.10
9  2021-02-01        13.05
8  2021-03-01        13.88
7  2021-04-01        13.57
6  2021-05-01        15.13
5  2021-06-01        16.27
4  2021-07-01        20.13
3  2021-08-01        19.70
2  2021-09-01        22.49
1  2021-10-01        20.75
0  2021-11-01        25.65
11 2021-12-01        33.22
10 2022-01-01        29.82
9  2022-02-01        25.10
8  2022-03-01        24.29
7  2022-04-01        27.38
6  2022-05-01        18.54
5  2022-06-01        18.72
4  2022-07-01        14.90
3  2022-08-01        18.18
2  2022-09-01        14.21
1  2022-10-01        12.35
0
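The repeated index labels in the output above (12 down to 0, then 11 down to 0, and so on) arise because each CSV file keeps its own row numbers when passed through `pd.concat`. This does not affect the later merge, which joins on `Date`, but a clean index makes the table easier to read. The sketch below is one possible alternative, assuming the `Date` strings are in MM/YYYY form as the cell's comment states. The two small frames `part_a` and `part_b` are stand-ins for the CSV files, with values copied from the output above.

```python
import pandas as pd

# Stand-ins for two of the CSV files (newest row first, as in the original data).
part_a = pd.DataFrame({"Date": ["11/2020", "10/2020"], "Open": [12.66, 13.76]})
part_b = pd.DataFrame({"Date": ["12/2020"], "Open": [13.49]})

# ignore_index=True discards the per-file row numbers and builds one 0..n-1 index.
combined = pd.concat([part_a, part_b], ignore_index=True)

# Parse MM/YYYY directly; a missing day defaults to the first of the month.
combined["Date"] = pd.to_datetime(combined["Date"], format="%m/%Y")

# Sort chronologically, then renumber so the index follows the sorted order.
combined = combined.sort_values("Date").reset_index(drop=True)

print(combined)
```

Passing an explicit `format` to `pd.to_datetime` also removes the need to splice a day into each string by hand.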

In [50]:
# Ensure the 'Date' column in both DataFrames is of the same type
nvda['Date'] = pd.to_datetime(nvda['Date'])  # Convert to datetime
nvidia_marketshare['Date'] = pd.to_datetime(nvidia_marketshare['Date'])  # Convert to datetime

# Merge the two DataFrames on the 'Date' column
nvidia_analysis = pd.merge(nvda, nvidia_marketshare, on='Date', how='outer')

# Sort by date after merging to maintain chronological order
nvidia_analysis = nvidia_analysis.sort_values(by='Date')

# Save the combined DataFrame to a new CSV file
nvidia_analysis.to_csv('nvidia_analysis.csv', index=False)

# Print the resulting combined DataFrame
print(nvidia_analysis)


         Date  Stock Price  Marketshare Percentage
0  2019-11-01         5.24                  0.6919
1  2019-12-01         5.41                  0.6935
2  2020-01-01         5.97                  0.6921
3  2020-02-01         5.89                  0.6861
4  2020-03-01         6.92                  0.6804
5  2020-04-01         6.39                  0.6680
6  2020-05-01         7.11                  0.6621
7  2020-06-01         8.83                  0.6744
8  2020-07-01         9.52                  0.6785
9  2020-08-01        10.73                  0.6786
10 2020-09-01        13.48                  0.6776
11 2020-10-01        13.76                  0.6766
12 2020-11-01        12.66                  0.6822
13 2020-12-01        13.49                  0.7219
14 2021-01-01        13.10                  0.6856
15 2021-02-01        13.05                  0.6914
16 2021-03-01        13.88                  0.6961
17 2021-04-01        13.57                  0.7009
18 2021-05-01        15.13     