# Assessing the Impact of Product Specifications and Brand Origin on the Pricing of Mechanical Keyboards in 2025

## Problem Statement
Global supply chains have undergone significant political and economic disruption in recent years, particularly in the technology and consumer electronics industries. 

Mechanical keyboards, an essential component of modern computing and creative work, have become a notable example of how Chinese manufacturers have entered the enthusiast market with competitive alternatives.

Historically, branding and Western design heritage contributed greatly to pricing. However, with increased transparency and direct-to-consumer models from Chinese factories, this may no longer hold true.

## Goal
This project aims to use mechanical keyboard listings as a case study to explore whether technical specifications and country/brand of origin still meaningfully influence pricing in 2025.

## Hypothesis
H₀ (Null Hypothesis): Product specifications and brand origin (e.g., Chinese vs Western brands) have no significant effect on price.

H₁ (Alternative Hypothesis): Product specifications and brand origin significantly affect price.

## Objectives
- Determine which features (e.g., switch type, brand, layout, connectivity) influence pricing.

- Analyze whether branding and origin remain significant predictors of pricing.

- Provide insights into broader consumer electronics pricing trends following the globalization of supply chains.



# Seeing what data we are working with

In [16]:
import pandas as pd

df = pd.read_csv('../ds_capstone_project/keebfinder_keyboards2.csv')
df.head()

Unnamed: 0,title,price,layout,mount,hall_effect,hotswap,case_material,backlight,connectivity,screen,knob
0,0.01 Z62,$59,"60%,",Plate Mount,no,no,"Alu case,",yes,"Wired,",no,no
1,0.01 Z62 Blank Blank,$59,"60%,",Plate Mount,no,no,"Alu case,",yes,"Wired,",no,no
2,80retros GB65 X Click Inc,$169,"65%,",Gasket Mount,no,yes,"Alu case,",no,"Wired,",no,no
3,80retros Pad Numpad X Click Inc,$129,,Gasket Mount,no,yes,"Alu case,",no,,no,no
4,8BitDo Retro,$119,"80%,",Top Mount,no,yes,"Alu case,",no,"Wireless,",no,yes


In [17]:
df.shape

(2368, 11)

In [18]:
df.isna().sum()

title               0
price               0
layout            223
mount             599
hall_effect         0
hotswap             0
case_material    1357
backlight           0
connectivity      254
screen              0
knob                0
dtype: int64
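
To put the counts above in context, the sketch below converts them into a share of all rows. The numbers are copied by hand from the `df.shape` and `df.isna().sum()` outputs; the variable names are illustrative only. On the real frame, `df.isna().mean()` should give the same proportions directly.

```python
# Row count and missing-value counts copied from the outputs above.
n_rows = 2368
missing_counts = {
    "layout": 223,
    "mount": 599,
    "case_material": 1357,
    "connectivity": 254,
}

# Percentage of rows with a missing value, per column, rounded to one decimal.
missing_share = {col: round(100 * n / n_rows, 1) for col, n in missing_counts.items()}

# Print the columns from most to least incomplete.
for col, pct in sorted(missing_share.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{col:<14} {pct:5.1f}%")
```

More than half of the `case_material` values are missing, so that column probably deserves more caution than the others in any later modeling step.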

## Identifying and handling the missing values

In [19]:
# check for all the unique values in the layout column
df['layout'].unique()

array(['60%,', '65%,', nan, '80%,', '100%,', '75%,', '96%,', '98%,',
       '95%,', '40%,', '68%,', '60%', '64%,', '66%,', '85%,', '90%,',
       '70%,', '97%,', '80%', '40%', '65%', '50%,', '100%', '87%,', '75%',
       '78%,', '84%,'], dtype=object)

In [20]:
# check for all the unique values in the mount column
df['mount'].unique()

array(['Plate Mount', 'Gasket Mount', 'Top Mount', nan, 'Tray Mount',
       'Sandwich Mount', 'Bottom Mount', 'PCB Mount'], dtype=object)

In [21]:
# check for all the unique values in the case_material column
df['case_material'].unique()

array(['Alu case,', nan, 'Alu case', 'PCB Mount'], dtype=object)

In [22]:
# check for all the unique values in the 'connectivity' column
df['connectivity'].unique()

array(['Wired,', nan, 'Wireless,', 'Wireless', 'Wired'], dtype=object)
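
The `unique()` outputs above show the same category spelled with and without a trailing comma (for example `'60%,'` and `'60%'`), which would otherwise be treated as two separate levels. A minimal sketch of one way to normalize them follows. It runs on a small toy frame named `sample` (an assumption for illustration, not the real data) so that it can be tried on its own.

```python
import pandas as pd

# Toy frame reproducing the inconsistent labels seen in the unique() outputs.
sample = pd.DataFrame({
    "layout": ["60%,", "60%", "75%,", None],
    "case_material": ["Alu case,", "Alu case", None, "PCB Mount"],
    "connectivity": ["Wired,", "Wired", "Wireless,", None],
})

for col in ["layout", "case_material", "connectivity"]:
    # Strip trailing commas and spaces, then any leftover surrounding whitespace.
    # Missing values pass through unchanged.
    sample[col] = sample[col].str.rstrip(", ").str.strip()

print(sample)
```

The same loop could be applied to `df` before filling missing values. Note also that `'PCB Mount'` appears among the `case_material` values; it looks like a mount label, so it may be worth inspecting those rows separately.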

In [23]:
# fill missing values with 'Unknown' for categorical columns
# (assign the result back instead of calling fillna(inplace=True) on a column
# selection, which raises a pandas FutureWarning and stops working in pandas 3.0)
for col in ['layout', 'mount', 'case_material', 'connectivity']:
    df[col] = df[col].fillna('Unknown')

In [24]:
df

Unnamed: 0,title,price,layout,mount,hall_effect,hotswap,case_material,backlight,connectivity,screen,knob
0,0.01 Z62,$59,"60%,",Plate Mount,no,no,"Alu case,",yes,"Wired,",no,no
1,0.01 Z62 Blank Blank,$59,"60%,",Plate Mount,no,no,"Alu case,",yes,"Wired,",no,no
2,80retros GB65 X Click Inc,$169,"65%,",Gasket Mount,no,yes,"Alu case,",no,"Wired,",no,no
3,80retros Pad Numpad X Click Inc,$129,Unknown,Gasket Mount,no,yes,"Alu case,",no,Unknown,no,no
4,8BitDo Retro,$119,"80%,",Top Mount,no,yes,"Alu case,",no,"Wireless,",no,yes
...,...,...,...,...,...,...,...,...,...,...,...
2363,zFrontier Y2K 76 Metropolis,$215,"75%,",Top Mount,no,yes,Unknown,no,Unknown,yes,yes
2364,zFrontier Y2K 76 Redline,$195,"75%,",Top Mount,no,yes,Unknown,no,Unknown,yes,yes
2365,zFrontier Y2K 76 Strong Spirit,$195,"75%,",Top Mount,no,yes,Unknown,no,Unknown,yes,yes
2366,zFrontier Y2K 76 Superuser,$165,"75%,",Top Mount,no,yes,Unknown,yes,Unknown,yes,yes
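
The `price` column is still stored as text such as `$59`, so it has to be converted to a number before any pricing analysis. The sketch below shows one possible conversion on a toy series. The `$1,299` entry is a hypothetical value added to show thousands separators being handled, and `price_usd` is an illustrative name.

```python
import pandas as pd

# Toy prices: two taken from the listing above, one hypothetical value with a
# thousands separator, and one missing entry.
prices = pd.Series(["$59", "$169", "$1,299", None], name="price")

# Drop the dollar sign and commas, then convert to numbers.
# errors="coerce" turns anything unparseable into NaN instead of raising.
price_usd = pd.to_numeric(
    prices.str.replace(r"[$,]", "", regex=True),
    errors="coerce",
)

print(price_usd)
```

On the real frame the same expression could be assigned to a new column, leaving the original `price` text in place for reference.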


In [None]:
# extract the brand from the first word of the title and store it in a new column
df['brand'] = df['title'].str.split().str[0]

# reorder the columns so that brand comes first
df = df[['brand', 'title', 'price', 'layout', 'mount', 'hall_effect', 'hotswap', 'case_material', 'backlight', 'connectivity', 'screen', 'knob']]
df.head()



Unnamed: 0,brand,title,price,layout,mount,hall_effect,hotswap,case_material,backlight,connectivity,screen,knob
0,0.01,0.01 Z62,$59,"60%,",Plate Mount,no,no,"Alu case,",yes,"Wired,",no,no
1,0.01,0.01 Z62 Blank Blank,$59,"60%,",Plate Mount,no,no,"Alu case,",yes,"Wired,",no,no
2,80retros,80retros GB65 X Click Inc,$169,"65%,",Gasket Mount,no,yes,"Alu case,",no,"Wired,",no,no
3,80retros,80retros Pad Numpad X Click Inc,$129,Unknown,Gasket Mount,no,yes,"Alu case,",no,Unknown,no,no
4,8BitDo,8BitDo Retro,$119,"80%,",Top Mount,no,yes,"Alu case,",no,"Wireless,",no,yes


In [None]:
# save updated dataframe to a new CSV file
# df.to_csv('keebfinder_keyboards_rev3.csv', index=False)

In [None]:
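# categorize brands into pricing tiers (Provided by GPT-4)
# NOTE: "MelGeek" is listed under both premium and midrange, and "Akko" under both
# midrange and budget; the mapping loop below keeps the last tier listed for each.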
brand_categories = {
    "premium": [
        "HHKB", "Realforce", "Geon", "Monokei", "Wooting", "Ergodox", "Keebwerk",
        "Protozoa", "Mechboards", "Meletrix", "MelGeek", "Dygma", "Keycult", "HIBI", "GMK"
    ],
    "midrange": [
        "Keychron", "Akko", "Varmilo", "Ducky", "Leopold", "IQUNIX", "Glorious", "Novelkeys",
        "Mistel", "Tex", "Vortex", "KBDFans", "KBParadise", "Nuphy", "Epomaker", "MelGeek"
    ],
    "budget": [
        "Ajazz", "Redragon", "Royalaxe", "Feker", "Skyloong", "Dareu", "Delux", "Zerodate",
        "Outemu", "Jamesdonkey", "Kemove", "AULA", "Langtu", "Womier", "Akko", "Dagk", "GamaKay",
        "MIIIW", "Darmoshark", "Monka", "Monsgeek", "Keydous", "Irok", "Newmen", "Niuniu"
    ]
}

# Create a mapping from brand name to category (Provided by GPT-4)
brand_to_category = {}
for category, brands in brand_categories.items():
    for brand in brands:
        brand_to_category[brand] = category

# Default uncategorized brands to 'unknown' (Provided by GPT-4)
unique_brands = df['brand'].dropna().unique()
for brand in unique_brands:
    if brand not in brand_to_category:
        brand_to_category[brand] = "unknown"

# Map the category to the dataframe
df['brand_category'] = df['brand'].map(brand_to_category)
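
Because the tier lists are maintained by hand, a brand can end up in more than one tier, and the mapping loop silently keeps whichever tier comes last. The sketch below is one way to surface such overlaps before mapping. It uses an abbreviated copy of the lists above (only a few brands per tier) so that it runs on its own; `tiers_by_brand` and `overlaps` are illustrative names.

```python
from collections import defaultdict

# Abbreviated copy of the tier lists, keeping only a few brands per tier.
brand_categories = {
    "premium": ["HHKB", "Wooting", "MelGeek"],
    "midrange": ["Keychron", "Akko", "MelGeek"],
    "budget": ["Ajazz", "Redragon", "Akko"],
}

# Collect every tier in which each brand appears.
tiers_by_brand = defaultdict(list)
for category, brands in brand_categories.items():
    for brand in brands:
        tiers_by_brand[brand].append(category)

# Brands assigned to more than one tier need a manual decision.
overlaps = {brand: tiers for brand, tiers in tiers_by_brand.items() if len(tiers) > 1}
print(overlaps)
```

Running the same check on the full dictionary should flag the brands that need to be assigned to a single tier before `brand_category` is used as a predictor.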