# Examining the relationship between Nvidia stock prices and consumer hardware marketshare


By Izaan Syed, 100922342


Nvidia's stock price has skyrocketed recently. I wanted to see whether that rise can be attributed to Nvidia's consumer GPU marketshare.

The period observed runs from November 2019 to October 2024. October 2024 was chosen as the end point because it is the most recent month in the Steam Hardware Survey data and the last full month of stock trading (November has not finished yet).

The stock data was collected from MarketWatch: https://www.marketwatch.com/investing/stock/amd/download-data?startDate=11/15/2019&endDate=11/15/2020 . The Steam Hardware Survey data is available in CSV format here: https://github.com/jdegene/steamHWsurvey

Unfortunately, the Steam Hardware Survey does not report total participants, only percentages. This means the comparison has to be made between marketshare dominance (relative to Intel and AMD) and stock price, rather than between GPU units and stock price.
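As a rough illustration of that comparison, the sketch below lines up the two monthly series by month and computes a Pearson correlation. It uses only the first five months of values that the cells in this notebook print; the variable names `share`, `stock`, `merged`, and `corr` are chosen for this example only. Five points are far too few to draw a conclusion, so this only shows the mechanics.

```python
import pandas as pd

# First five months of each series, copied from the cell outputs in this notebook.
share = pd.DataFrame({
    'date': ['2019-11', '2019-12', '2020-01', '2020-02', '2020-03'],
    'percentage': [0.6919, 0.6935, 0.6921, 0.6861, 0.6804],
})
stock = pd.DataFrame({
    'Date': ['03/2020', '02/2020', '01/2020', '12/2019', '11/2019'],
    'Close': [6.59, 6.75, 5.91, 5.88, 5.42],
})

# The stock file uses MM/YYYY, the marketshare file uses YYYY-MM; bring both to YYYY-MM.
stock['date'] = (
    pd.to_datetime(stock['Date'], format='%m/%Y').dt.to_period('M').astype(str)
)

# Inner join so that only months present in both series are compared.
merged = share.merge(stock[['date', 'Close']], on='date', how='inner').sort_values('date')

# Pearson correlation between monthly marketshare and monthly closing price.
corr = merged['percentage'].corr(merged['Close'])
print(merged)
print(corr)
```

The same steps should apply unchanged to the full `nvidia_marketshare.csv` and `NVDA-Final.csv` files, assuming their columns keep the names shown in the outputs.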

Spikes can be attributed to product launches: Nvidia's 20/16 series, 30 series, and 40 series, and AMD's 5XXX, 6XXX, and 7XXX series.

Stock split dates: 6/10/2024, 7/20/2021

In [None]:
# Cleaning of the Steam Hardware Survey (SHS) data to get Nvidia marketshare

import pandas as pd

shs = pd.read_csv('shs.csv')

# find the index of the row of 2019-11-01
idx = shs[shs['date'] == '2019-11-01'].index[0]

# delete all rows prior to idx
shs = shs.loc[idx:]

# keep only the rows whose 'category' is 'Video Card Description'
shs = shs[shs['category'].str.contains('Video Card Description')]

# keep only the rows whose 'name' contains 'NVIDIA', for example 'NVIDIA GeForce 840M'
shs = shs[shs['name'].str.contains('NVIDIA', na=False, case=False)]

# save the filtered rows as shs2.csv
shs.to_csv('shs2.csv', index=False)

# For each month, add up the percentages of all NVIDIA video cards to get a total NVIDIA marketshare per month.
# The result is stored in a separate variable.

# Convert the 'date' column to datetime for grouping
shs['date'] = pd.to_datetime(shs['date'])

# Group by year and month, and sum up the percentages
nvidia_marketshare = shs.groupby(shs['date'].dt.to_period('M'))['percentage'].sum().reset_index()

# Convert period to a string representation for easier handling
nvidia_marketshare['date'] = nvidia_marketshare['date'].astype(str)

# Save NVIDIA market share data to a new CSV file if needed
nvidia_marketshare.to_csv('nvidia_marketshare.csv', index=False)

# Print the resulting market share DataFrame
print(nvidia_marketshare)


       date  percentage
0   2019-11      0.6919
1   2019-12      0.6935
2   2020-01      0.6921
3   2020-02      0.6861
4   2020-03      0.6804
5   2020-04      0.6680
6   2020-05      0.6621
7   2020-06      0.6744
8   2020-07      0.6785
9   2020-08      0.6786
10  2020-09      0.6776
11  2020-10      0.6766
12  2020-11      0.6822
13  2020-12      0.7219
14  2021-01      0.6856
15  2021-02      0.6914
16  2021-03      0.6961
17  2021-04      0.7009
18  2021-05      0.6965
19  2021-06      0.7060
20  2021-07      0.6916
21  2021-08      0.7142
22  2021-09      0.7033
23  2021-10      0.6952
24  2021-11      0.6988
25  2021-12      0.7064
26  2022-01      0.6973
27  2022-02      0.7067
28  2022-03      0.7168
29  2022-04      0.7073
30  2022-05      0.7214
31  2022-06      0.7062
32  2022-07      0.7395
33  2022-08      0.7085
34  2022-09      0.7218
35  2022-10      0.7357
36  2022-11      0.7057
37  2022-12      0.7031
38  2023-01      0.6936
39  2023-02      0.7105
40  2023-03     
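The cell above drops earlier rows by position (`shs.loc[idx:]`), which relies on the CSV being sorted by date. A minimal sketch of the same filtering done by value is shown below, so row order does not matter. The toy rows reuse the column names from `shs.csv`, but the percentages are made up for illustration, and the names `toy`, `mask`, `nvidia_rows`, and `monthly` exist only in this example.

```python
import pandas as pd

# Toy rows in the same column layout as shs.csv; the percentages are made up.
toy = pd.DataFrame({
    'date': ['2019-10-01', '2019-11-01', '2019-11-01', '2019-12-01'],
    'category': ['Video Card Description'] * 4,
    'name': ['NVIDIA GeForce 840M', 'NVIDIA GeForce 840M',
             'AMD Radeon RX 580', 'NVIDIA GeForce 840M'],
    'percentage': [0.01, 0.02, 0.03, 0.04],
})
toy['date'] = pd.to_datetime(toy['date'])

# Select by value instead of by row position.
mask = (
    (toy['date'] >= pd.Timestamp('2019-11-01'))
    & (toy['category'] == 'Video Card Description')
    & toy['name'].str.contains('NVIDIA', case=False, na=False)
)
nvidia_rows = toy[mask]

# Sum the per-card percentages into one value per month.
monthly = nvidia_rows.groupby(nvidia_rows['date'].dt.to_period('M'))['percentage'].sum()
print(monthly)
```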

In [29]:
# Cleaning of the Nvidia stock market data

import pandas as pd

#combine NVDA1, NVDA2, NVDA3, NVDA4, NVDA5 csv files

nvda1 = pd.read_csv('NVDA1.csv')
nvda2 = pd.read_csv('NVDA2.csv')
nvda3 = pd.read_csv('NVDA3.csv')
nvda4 = pd.read_csv('NVDA4.csv')
nvda5 = pd.read_csv('NVDA5.csv')

nvda = pd.concat([nvda1, nvda2, nvda3, nvda4, nvda5])

#Novembers are recorded twice. Remove the first November row, as it is not measured on the correct day.
#example:
#11/2023,49.94,50.55,46.42,46.77,"5,023,731,040"
#11/2023,40.88,49.96,40.87,48.89,"4,601,516,714"

#save as NVDA-Final.csv
nvda.to_csv('NVDA-Final.csv', index=False)

print(nvda)


       Date    Open    High     Low   Close          Volume
0   11/2024  134.70  149.77  134.57  141.98   2,197,686,188
1   10/2024  121.77  144.42  115.14  132.76   5,628,704,705
2   09/2024  116.01  127.67  100.95  121.44   6,270,527,952
3   08/2024  117.53  131.26   90.69  119.37   8,105,367,751
4   07/2024  123.47  136.15  102.54  117.02   6,407,092,844
..      ...     ...     ...     ...     ...             ...
8   03/2020    6.92    7.12    4.52    6.59  15,777,535,522
9   02/2020    5.89    7.91    5.89    6.75  11,855,193,681
10  01/2020    5.97    6.49    5.78    5.91   6,132,787,699
11  12/2019    5.41    6.05    5.01    5.88   6,572,843,654
12  11/2019    5.24    5.54    5.02    5.42   4,281,434,917

[61 rows x 6 columns]
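The cell above does not show how the duplicate November rows were removed. One possible way is sketched below, using the two 11/2023 rows quoted in the cell comment. It assumes "remove the first November" means keeping the later of the two rows in concatenation order; `dupes` and `deduped` are names used only in this example.

```python
import pandas as pd

# The two 11/2023 rows quoted in the cell comment, in the order they would appear after pd.concat.
dupes = pd.DataFrame({
    'Date': ['11/2023', '11/2023'],
    'Open': [49.94, 40.88],
    'High': [50.55, 49.96],
    'Low': [46.42, 40.87],
    'Close': [46.77, 48.89],
    'Volume': ['5,023,731,040', '4,601,516,714'],
})

# Assumption: drop the first occurrence of each repeated month and keep the last one.
deduped = dupes.drop_duplicates(subset='Date', keep='last').reset_index(drop=True)

# Volume is text because of the thousands separators; convert it to integers.
deduped['Volume'] = deduped['Volume'].str.replace(',', '', regex=False).astype('int64')
print(deduped)
```

Alternatively, passing `thousands=','` to `pd.read_csv` parses the Volume column as a number at load time.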
