# CSCA 5622 Final Project - Consumer PC Hardware Trends and Predictions
### By Moshiur Howlader

## Introduction

In the digital age, computer hardware is ubiquitous, and its performance continues to improve year by year. Gordon Moore, a co-founder of Intel, made an influential observation about how computers would evolve, now known as **Moore's Law** ([see Wikipedia](https://en.wikipedia.org/wiki/Moore%27s_law)). It states that the number of transistors in an integrated circuit (IC) doubles approximately every two years. The chart below illustrates the trend from 1970 to 2020:

<br><br>
<img src="../images/moores_law_transistor_count_1970_2020.png" alt="Transistor count over time" width="1200" height="800">
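
As a rough illustration of what a two-year doubling implies, the sketch below projects a transistor count forward from a starting point. The function name and the default doubling period are illustrative choices, and the starting value (about 2,300 transistors for the Intel 4004 in 1971) is a commonly cited figure, not a number taken from this project's data.

```python
def projected_transistors(start_count, start_year, year, doubling_years=2.0):
    """Project a transistor count assuming one doubling every `doubling_years`."""
    doublings = (year - start_year) / doubling_years
    return start_count * 2 ** doublings

# Assumed starting point: Intel 4004 (1971), roughly 2,300 transistors.
for year in (1971, 1981, 2001, 2021):
    count = projected_transistors(2_300, 1971, year)
    print(f"{year}: ~{count:,.0f} transistors")
```

Fifty years of doubling every two years is 25 doublings, which is why even a small starting count grows into the tens of billions.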

Based on Moore's Law, consumers might expect to get computer hardware with double the transistors every two years, leading to predictable and consistent increases in computing power. However, the reality is far more complex. As more transistors are crammed into a fixed area, each transistor must shrink, and at very small scales **quantum effects** (such as electron tunneling) begin to interfere, imposing physical limitations. These constraints prevent engineers from following Moore's Law indefinitely. According to [nano.gov](https://www.nano.gov/nanotech-101/what/nano-size), a single gold atom is about a third of a nanometer in diameter. Clearly, there is a limit to how many transistors can be packed into computer parts. Below are the trends in chip lithography size according to Wikipedia:

| Feature Size | Year |
|--------------|------|
| 20 μm        | 1968 |
| 10 μm        | 1971 |
| 6 μm         | 1974 |
| 3 μm         | 1977 |
| 1.5 μm       | 1981 |
| 1 μm         | 1984 |
| 800 nm       | 1987 |
| 600 nm       | 1990 |
| 350 nm       | 1993 |
| 250 nm       | 1996 |
| 180 nm       | 1999 |
| 130 nm       | 2001 |
| 90 nm        | 2003 |
| 65 nm        | 2005 |
| 45 nm        | 2007 |
| 32 nm        | 2009 |
| 28 nm        | 2010 |
| 22 nm        | 2012 |
| 14 nm        | 2014 |
| 10 nm        | 2016 |
| 7 nm         | 2018 |
| 5 nm         | 2020 |
| 3 nm         | 2022 |
| 2 nm         | ~2025 (Future) |
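
To put a number on the trend in the table, the sketch below estimates how long it took, on average, for the feature size to halve between the first row (20 μm, 1968) and the last already-shipped row (3 nm, 2022). The function name is illustrative, and the link to transistor density relies on an idealized assumption (density proportional to the inverse square of the feature size) that real processes only approximate.

```python
import math

def halving_time_years(size_start_nm, year_start, size_end_nm, year_end):
    """Average number of years it took the feature size to halve."""
    halvings = math.log2(size_start_nm / size_end_nm)
    return (year_end - year_start) / halvings

# First row and last already-shipped row of the table (20 μm = 20,000 nm).
linear_halving = halving_time_years(20_000, 1968, 3, 2022)

# Idealized assumption: density scales with 1 / (feature size)^2, so density
# doubles in half the time it takes the linear feature size to halve.
density_doubling = linear_halving / 2

print(f"Feature size halved about every {linear_halving:.1f} years")
print(f"Implied density doubling about every {density_doubling:.1f} years")
```

Under that assumption, a feature size that halves a bit more often than every four and a half years corresponds to density doubling roughly every two years, which is consistent with the two-year cadence of Moore's Law.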

According to Jensen Huang, the CEO of Nvidia, **Moore's Law is dead** ([TechSpot article](https://www.techspot.com/news/96094-nvidia-jensen-huang-once-again-claims-moore-law.html)). This statement seems reasonable given the physical limitations of current chip designs. As the rate of improvement in transistor count decreases year over year, will consumers start paying more for diminishing performance gains?

## Why Should Consumers Care About the Death of Moore's Law?

With the decline of Moore's Law, we can expect smaller improvements in transistor density in upcoming generations. As traditional computing approaches its physical limits, each generation's gains will shrink, so consumers may end up paying more for diminishing returns on performance, an outcome that potentially benefits corporations more than consumers.

## The Economic Reality Today

Inflation has steadily eroded the purchasing power of the US dollar over the last 50 years. As prices rise, consumer goods, including technology, become less affordable unless incomes keep pace. Here are links to inflation-related data:

- [Purchasing power of the US dollar over time](https://elements.visualcapitalist.com/purchasing-power-of-the-u-s-dollar-over-time/)
- [America's growing rent burden](https://www.axios.com/2023/05/22/americas-growing-rent-burden)

## Purpose of This Project

This project aims to answer the following key questions:

1. What are the trends in CPU and GPU parts over the past 20 years?
2. Is the price-to-performance ratio of these parts keeping up? Are consumers getting a fair deal compared to 10 to 20 years ago?
3. Can we predict the performance of next-gen, unreleased CPU and GPU parts using supervised machine learning models?
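
One way to make the price-to-performance question concrete is to compare performance per inflation-adjusted dollar. The sketch below is only an illustration of that idea: the function names are hypothetical, the CPI-ratio adjustment is one common convention, and every number in it is a made-up placeholder, not data from this project.

```python
def inflation_adjusted_price(nominal_price, cpi_at_launch, cpi_today):
    """Convert a launch price into today's dollars using a CPI ratio."""
    return nominal_price * cpi_today / cpi_at_launch

def performance_per_dollar(score, nominal_price, cpi_at_launch, cpi_today):
    """Benchmark score per inflation-adjusted dollar."""
    return score / inflation_adjusted_price(nominal_price, cpi_at_launch, cpi_today)

# Hypothetical placeholder values, NOT real measurements or real CPI figures.
old_part = performance_per_dollar(score=1_000, nominal_price=200,
                                  cpi_at_launch=100, cpi_today=150)
new_part = performance_per_dollar(score=6_000, nominal_price=600,
                                  cpi_at_launch=150, cpi_today=150)

print(f"Old part: {old_part:.2f} points per dollar")
print(f"New part: {new_part:.2f} points per dollar")
print(f"Improvement factor: {new_part / old_part:.2f}x")
```

Adjusting the older part's price into today's dollars before dividing keeps the comparison from overstating how cheap older hardware was.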


## Data Collection & Description

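The data for both CPUs and GPUs was collected from the [TechPowerUp](https://www.techpowerup.com/) GPU and CPU specification databases, whose base URLs appear in the download script below.

Several other sources were considered, but they were either difficult to scrape or their data was not thorough enough, so they were not used as data sources: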
- https://www.hwcompare.com/
- https://www.userbenchmark.com/Software
- https://www.tomshardware.com/reviews/gpu-hierarchy,4388.html
- https://www.tomshardware.com/reviews/cpu-hierarchy,4312.html

The script used to download the per-year listing pages is below (uncomment the entire block to run it):

In [4]:
# import os
# import requests
# import time

# # Define the relative paths to save HTML files for GPU and CPU
# gpu_html_directory = os.path.join('..', 'data', 'gpu')
# cpu_html_directory = os.path.join('..', 'data', 'cpu')

# # Create the directories if they don't exist
# for directory in [gpu_html_directory, cpu_html_directory]:
#     os.makedirs(directory, exist_ok=True)

# # Base URLs for TechPowerUp GPU and CPU Specs by year
# gpu_base_url = 'https://www.techpowerup.com/gpu-specs/?released='
# cpu_base_url = 'https://www.techpowerup.com/cpu-specs/?released='

# # List of years from 2004 to 2024
# years = list(range(2004, 2024 + 1))

# def get_filename_from_year_and_type(year, spec_type):
#     # Generate the filename in the format <year>_<type>_database_TechPowerUp.html
#     return f'{year}_{spec_type}_database_TechPowerUp.html'

# def download_html_for_year(year, spec_type, base_url, directory):
#     full_url = f'{base_url}{year}&sort=name'
#     file_name = get_filename_from_year_and_type(year, spec_type)
#     file_path = os.path.join(directory, file_name)

#     # If file already exists, skip downloading
#     if os.path.exists(file_path):
#         print(f"{file_name} already exists. Skipping download.")
#         return True  # Indicate that the download was successful or skipped

#     try:
#         print(f"Downloading {full_url}...")
#         response = requests.get(full_url, timeout=10)  # Set a 10-second timeout for the request

#         # Check if request was successful
#         if response.status_code == 200:
#             # Write the HTML content to a file
#             with open(file_path, 'w', encoding='utf-8') as file:
#                 file.write(response.text)
#             print(f"Saved {file_name}")
#             return True  # Indicate success
#         else:
#             print(f"Error downloading {full_url}: {response.status_code}")
#             return False  # Indicate failure
#     except requests.exceptions.Timeout:
#         print(f"Timeout error occurred for {year}. Retrying...")
#         return False  # Indicate failure due to timeout
#     except Exception as e:
#         print(f"Failed to download {full_url}: {e}")
#         return False  # Indicate failure due to other exceptions

# # Function to handle downloading for both GPU and CPU
# def download_specs(spec_type, base_url, directory):
#     idx = 0
#     retries = 0
#     while idx < len(years):
#         year = years[idx]

#         success = download_html_for_year(year, spec_type, base_url, directory)
#         if success:
#             idx += 1  # Move to the next year only if the download is successful
#             retries = 0  # Reset retries after successful download
#         else:
#             retries += 1
#             if retries >= 3:
#                 print(f"Skipping year {year} after 3 failed attempts.")
#                 idx += 1  # Move to the next year after 3 failed attempts
#                 retries = 0  # Reset retries for the next year
#             else:
#                 print(f"Retrying download for {spec_type} year {year} due to error...")
#                 time.sleep(10)  # Wait 10 seconds before retrying

#         time.sleep(1)  # Add a delay between requests to avoid overwhelming the server

#         # Pause for 1 minute after every 7 successfully processed years (not while retrying)
#         if success and idx % 7 == 0 and idx != 0:
#             print(f"Pausing for 1 minute after {idx} {spec_type} iterations...")
#             time.sleep(60)  # Sleep for 1 minute

# # Start downloading GPU and CPU specs
# download_specs('GPU', gpu_base_url, gpu_html_directory)
# download_specs('CPU', cpu_base_url, cpu_html_directory)
# print("Script finished!")

Please note that the website rate-limits how much data you can scrape in a day.
After running the script above, I manually collected the sublinks for every GPU and CPU from the saved pages and stored them in Python lists, to be used for scraping the individual GPU and CPU detail pages.

I essentially looked for parts of the HTML that started with:
```html
<div id="list" class="table-wrapper">
```
and ended near:
```html
<div id="ajaxresults" class="table-wrapper">
```
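
The manual step described above could, in principle, be automated. The sketch below is one possible approach and not the method actually used here: it slices the saved HTML between the two markers and collects the spec links with a regular expression. The function name, the double-quoted `href` assumption, and the inline sample HTML are all illustrative.

```python
import re

START_MARKER = '<div id="list" class="table-wrapper">'
END_MARKER = '<div id="ajaxresults" class="table-wrapper">'

def extract_spec_links(html, prefix="/gpu-specs/"):
    """Return unique hrefs under `prefix` found between the two markers."""
    start = html.find(START_MARKER)
    if start == -1:
        return []
    end = html.find(END_MARKER, start)
    if end == -1:
        end = len(html)  # no end marker: search to the end of the page
    # Assumes double-quoted href attributes; '?' is excluded so that
    # query links such as "/gpu-specs/?released=2024" are skipped.
    pattern = re.compile(r'href="(' + re.escape(prefix) + r'[^"?]+)"')
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(pattern.findall(html[start:end])))

# Tiny made-up sample in the shape described above.
sample_html = """
<a href="/gpu-specs/outside-before.c1">ignored</a>
<div id="list" class="table-wrapper">
  <a href="/gpu-specs/radeon-rx-7600.c4153">Radeon RX 7600</a>
  <a href="/gpu-specs/radeon-rx-7600.c4153">duplicate link</a>
  <a href="/gpu-specs/?released=2023">ignored query link</a>
</div>
<div id="ajaxresults" class="table-wrapper">
  <a href="/gpu-specs/outside-after.c2">ignored</a>
"""
print(extract_spec_links(sample_html))
```

To use it on a downloaded page, read one of the saved files (for example `2024_GPU_database_TechPowerUp.html` under `../data/gpu`) into a string and pass it in, with `prefix="/cpu-specs/"` for the CPU pages.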

In [21]:
# Obtained list for GPU and CPU:
gpu_2024 = [
    '/gpu-specs/playstation-5-pro-gpu.c4232',
    '/gpu-specs/amd-oberon-plus.g1019',
    '/gpu-specs/radeon-880m.c4225',
    '/gpu-specs/amd-strix-point.g1079',
    '/gpu-specs/radeon-890m.c4224',
    '/gpu-specs/radeon-instinct-mi325x.c4231',
    '/gpu-specs/amd-aqua-vanjaram.g1023',
    '/gpu-specs/radeon-rx-7600-xt.c4190',
    '/gpu-specs/amd-navi-33.g1001',
    '/gpu-specs/radeon-rx-7700.c4159',
    '/gpu-specs/amd-navi-32.g1000',
    '/gpu-specs/radeon-rx-8800-xt.c4229',
    '/gpu-specs/amd-navi-48.g1071',
    '/gpu-specs/data-center-gpu-max-next.c4069',
    '/gpu-specs/intel-rialto-bridge.g1047',
    '/gpu-specs/b200-sxm-192-gb.c4210',
    '/gpu-specs/nvidia-gb100.g1069',
    '/gpu-specs/geforce-rtx-3050-6-gb.c4188',
    '/gpu-specs/nvidia-ga107.g988',
    '/gpu-specs/geforce-rtx-3050-a-mobile.c4227',
    '/gpu-specs/nvidia-ga106.g966',
    '/gpu-specs/geforce-rtx-4060-ad106.c3891',
    '/gpu-specs/nvidia-ad106.g1014',
    '/gpu-specs/geforce-rtx-4060-ti-ad104.c4204',
    '/gpu-specs/nvidia-ad104.g1013',
    '/gpu-specs/geforce-rtx-4070-10-gb.c4226',
    '/gpu-specs/geforce-rtx-4070-ad103.c4205',
    '/gpu-specs/nvidia-ad103.g1012',
    '/gpu-specs/geforce-rtx-4070-gddr6.c4228',
    '/gpu-specs/geforce-rtx-4070-super.c4186',
    '/gpu-specs/geforce-rtx-4070-ti-super.c4187',
    '/gpu-specs/geforce-rtx-4070-ti-super-ad102.c4215',
    '/gpu-specs/nvidia-ad102.g1005',
    '/gpu-specs/geforce-rtx-4080-super.c4182',
    '/gpu-specs/rtx-1000-mobile-ada-generation.c4208',
    '/gpu-specs/rtx-2000-ada-generation.c4199',
    '/gpu-specs/rtx-500-mobile-ada-generation.c4207',
    '/gpu-specs/rtx-5880-ada-generation.c4191',
    '/gpu-specs/rtx-a1000.c4211',
    '/gpu-specs/rtx-a400.c4212'
]

gpu_2023 = [
    "/gpu-specs/rog-ally-extreme-gpu.c4157",
    "/gpu-specs/rog-ally-gpu.c4158",
    "/gpu-specs/radeon-740m.c4162",
    "/gpu-specs/radeon-760m.c4022",
    "/gpu-specs/radeon-760m.c4222",
    "/gpu-specs/radeon-780m.c4020",
    "/gpu-specs/radeon-780m.c4221",
    "/gpu-specs/radeon-instinct-mi300.c4019",
    "/gpu-specs/radeon-instinct-mi300x.c4179",
    "/gpu-specs/radeon-pro-w7500.c4170",
    "/gpu-specs/radeon-pro-w7600.c4169",
    "/gpu-specs/radeon-pro-w7700.c4184",
    "/gpu-specs/radeon-pro-w7800.c4148",
    "/gpu-specs/radeon-pro-w7900.c4147",
    "/gpu-specs/radeon-rx-6450m.c4018",
    "/gpu-specs/radeon-rx-6550m.c4017",
    "/gpu-specs/radeon-rx-6550s.c3981",
    "/gpu-specs/radeon-rx-6600-le.c4223",
    "/gpu-specs/radeon-rx-6750-gre-10-gb.c4192",
    "/gpu-specs/radeon-rx-6750-gre-12-gb.c4183",
    "/gpu-specs/radeon-rx-7500-xt.c4116",
    "/gpu-specs/radeon-rx-7600.c4153",
    "/gpu-specs/radeon-rx-7600m.c4014",
    "/gpu-specs/radeon-rx-7600m-xt.c4013",
    "/gpu-specs/radeon-rx-7600s.c4016",
    "/gpu-specs/radeon-rx-7700-xt.c3911",
    "/gpu-specs/radeon-rx-7700s.c4015",
    "/gpu-specs/radeon-rx-7800-xt.c3839",
    "/gpu-specs/radeon-rx-7900-gre.c4166",
    "/gpu-specs/radeon-rx-7900m.c4178",
    "/gpu-specs/radeon-rx-7990-xtx.c3973",
    "/gpu-specs/steam-deck-oled-gpu.c4185",
    "/gpu-specs/arc-a380m.c4060",
    "/gpu-specs/arc-a530m.c4167",
    "/gpu-specs/arc-a570m.c4168",
    "/gpu-specs/arc-a580.c3928",
    "/gpu-specs/arc-graphics-112eu-mobile.c4196",
    "/gpu-specs/arc-graphics-128eu-mobile.c4193",
    "/gpu-specs/arc-graphics-48eu-mobile.c4198",
    "/gpu-specs/arc-graphics-64eu-mobile.c4194",
    "/gpu-specs/arc-pro-a60.c4160",
    "/gpu-specs/arc-pro-a60m.c4161",
    "/gpu-specs/data-center-gpu-max-1100.c4066",
    "/gpu-specs/data-center-gpu-max-1350.c4067",
    "/gpu-specs/data-center-gpu-max-1550.c4068",
    "/gpu-specs/data-center-gpu-max-subsystem.c4070",
    "/gpu-specs/iris-xe-graphics-80eu-mobile.c4059",
    "/gpu-specs/iris-xe-graphics-96eu-mobile.c4145",
    "/gpu-specs/uhd-graphics-64eu-mobile.c4143",
    "/gpu-specs/uhd-graphics-710-mobile.c4129",
    "/gpu-specs/uhd-graphics-710-mobile.c4128",
    "/gpu-specs/uhd-graphics-770-mobile.c4127",
    "/gpu-specs/geforce-rtx-4050.c3892",
    "/gpu-specs/geforce-rtx-4050-max-q.c3987",
    "/gpu-specs/geforce-rtx-4050-mobile.c3953",
    "/gpu-specs/geforce-rtx-4060.c4107",
    "/gpu-specs/geforce-rtx-4060-max-q.c3986",
    "/gpu-specs/geforce-rtx-4060-mobile.c3946",
    "/gpu-specs/geforce-rtx-4060-ti-16-gb.c4155",
    "/gpu-specs/geforce-rtx-4060-ti-8-gb.c3890",
    "/gpu-specs/geforce-rtx-4070.c3924",
    "/gpu-specs/geforce-rtx-4070-max-q.c3954",
    "/gpu-specs/geforce-rtx-4070-mobile.c3944",
    "/gpu-specs/geforce-rtx-4070-ti.c3950",
    "/gpu-specs/geforce-rtx-4080-max-q.c3948",
    "/gpu-specs/geforce-rtx-4080-mobile.c3947",
    "/gpu-specs/geforce-rtx-4080-ti.c3887",
    "/gpu-specs/geforce-rtx-4090-d.c4189",
    "/gpu-specs/geforce-rtx-4090-mobile.c3949",
    "/gpu-specs/geforce-rtx-4090-ti.c3917",
    "/gpu-specs/h100-cnx.c4131",
    "/gpu-specs/h100-pcie-80-gb.c3899",
    "/gpu-specs/h100-pcie-96-gb.c4164",
    "/gpu-specs/h100-sxm5-64-gb.c4165",
    "/gpu-specs/h100-sxm5-80-gb.c3900",
    "/gpu-specs/h100-sxm5-96-gb.c3974",
    "/gpu-specs/h800-pcie-80-gb.c4181",
    "/gpu-specs/h800-sxm5.c3975",
    "/gpu-specs/jetson-agx-orin-32-gb.c4084",
    "/gpu-specs/jetson-agx-orin-64-gb.c4085",
    "/gpu-specs/jetson-orin-nx-16-gb.c4086",
    "/gpu-specs/jetson-orin-nx-8-gb.c4081",
    "/gpu-specs/jetson-orin-nano-4-gb.c4083",
    "/gpu-specs/jetson-orin-nano-8-gb.c4082",
    "/gpu-specs/l20.c4206",
    "/gpu-specs/l4.c4091",
    "/gpu-specs/rtx-2000-embedded-ada-generation.c4177",
    "/gpu-specs/rtx-2000-max-q-ada-generation.c4094",
    "/gpu-specs/rtx-2000-mobile-ada-generation.c4093",
    "/gpu-specs/rtx-3000-mobile-ada-generation.c4095",
    "/gpu-specs/rtx-3500-embedded-ada-generation.c4201",
    "/gpu-specs/rtx-3500-mobile-ada-generation.c4098",
    "/gpu-specs/rtx-4000-ada-generation.c4171",
    "/gpu-specs/rtx-4000-mobile-ada-generation.c4096",
    "/gpu-specs/rtx-4000-sff-ada-generation.c4139",
    "/gpu-specs/rtx-4500-ada-generation.c4172",
    "/gpu-specs/rtx-5000-ada-generation.c4152",
    "/gpu-specs/rtx-5000-embedded-ada-generation.c4176",
    "/gpu-specs/rtx-5000-max-q-ada-generation.c4154",
    "/gpu-specs/rtx-5000-mobile-ada-generation.c4097",
    "/gpu-specs/titan-ada.c3985"
]

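# Placeholders: the lists for the remaining years still need to be filled in, in the same format.
# gpu_2022, gpu_2021, gpu_2020, gpu_2019, gpu_2018, gpu_2017, gpu_2016,
# gpu_2015, gpu_2014, gpu_2013, gpu_2012, gpu_2011, gpu_2010, gpu_2009,
# gpu_2008, gpu_2007, gpu_2006, gpu_2005, gpu_2004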
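
Before scraping the individual detail pages, the sublinks above need to be turned into full URLs. The sketch below shows one way to do that. It also filters the list: `gpu_2024` mixes paths ending in `.c<number>` with paths ending in `.g<number>`, and the sketch assumes (this is an assumption, not something verified here) that the `.c` paths are the individual card pages and the `.g` paths are GPU chip pages. The function name and `BASE_URL` constant are illustrative.

```python
import re
from urllib.parse import urljoin

BASE_URL = "https://www.techpowerup.com"

def card_page_urls(paths):
    """Return absolute URLs for the unique '.c<id>' paths, keeping order."""
    # Assumption: '.c<digits>' marks a card page, '.g<digits>' a chip page.
    card_paths = [p for p in paths if re.search(r"\.c\d+$", p)]
    unique_paths = dict.fromkeys(card_paths)  # de-duplicate, keep order
    return [urljoin(BASE_URL, p) for p in unique_paths]

# A few entries copied from gpu_2024 above.
example_paths = [
    "/gpu-specs/playstation-5-pro-gpu.c4232",
    "/gpu-specs/amd-oberon-plus.g1019",
    "/gpu-specs/radeon-880m.c4225",
    "/gpu-specs/playstation-5-pro-gpu.c4232",
]
for url in card_page_urls(example_paths):
    print(url)
```

If the `.g` pages turn out to be useful as well, the filter can be dropped and the suffix kept as a separate column instead.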

## Feature Engineering

## Model Selection

## Model Evaluation

## Prediction & Results

## Conclusion

## References