<p><a name="sections"></a></p>


# Gaming Mice in Newegg

- <a href="#DP">Data Processing</a><br>
- <a href="#GA">General Analysis</a><br>
- <a href="#FP">Factors Contributing to Price</a><br>
- <a href="#FR">Factors Contributing to Reviews</a><br>


<p><a name="DP"></a></p>

### Data Processing

The data was scraped from Newegg. The listings considered are new, right-handed gaming mice that ship from the U.S.

The fields scraped include:
- Brand
- Product Name
- Model
- Grip Style
- Maximum DPI
- Buttons
- Connection Type
- Color
- Average Review Rating (1-5 Stars to the nearest star)
- Number of Reviews
- Price

In [1]:
import numpy as np
import pandas as pd

# Read in csv
mice = pd.read_csv('./mice/mice.csv')

# Rearrange and rename columns
mice = mice[['brand', 'name', 'model', 'style', 'dpi', 'buttons', 'ctype', 'color', 'rating', 'reviews', 'price']]
mice.columns = ['Brand', 'Name', 'Model', 'Style', 'DPI', 'Buttons', 'Connection', 'Color', 'Rating', 'Reviews', 'Price']
mice.head()

Unnamed: 0,Brand,Name,Model,Style,DPI,Buttons,Connection,Color,Rating,Reviews,Price
0,Corsair,HARPOON,CH-9311011-NA,Claw,10000 dpi,6,Wired / Wireless,Black,4.0,83,49.99
1,UtechSmart,Venus,US-D16400-GM,,16400 dpi,19,Wired,Black,4.0,242,29.99
2,Sades,,,Palm,3200dpi,6,Wired,Pink,4.0,4,18.49
3,AULA,Wired Gaming Mouse,F805,,6400 dpi,7,Wired,Black,4.0,2,14.99
4,RAZER,DeathAdder Elite,RZ01-02010100-R3U1,,16000 dpi,7,Wired,Black,4.0,642,30.99


**Brand**
- Convert brand names to uppercase to reduce inconsistencies.
- Check for and correct entries that do not make sense.

**Grip Style**
- Convert entries that list more than one grip style, as well as "All" and "Adjustable", to "Multiple".
- Check for and correct entries that do not make sense.

**Maximum DPI**
- Check for and correct entries that do not make sense.
- Convert to int.

**Buttons**
- Check for and correct entries that do not make sense.

**Connection Type**
- Only differentiate between wired and wireless (mice that support both count as wireless).

In [2]:
# Brand
mice['Brand'] = mice['Brand'].str.upper()
mice.loc[(mice.Brand == 'FREE WORF'),'Brand']='FREE WOLF'

In [3]:
# Style
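# Drop the word "Grip" and any whitespace it leaves behind
mice['Style'] = mice['Style'].str.replace('Grip', '', regex=False).str.strip()
# Entries containing "brand" are set to missing
mice.loc[mice['Style'].str.contains('brand', na=False), 'Style'] = np.nan
# Collapse "All", "Adjustable", and multi-style entries into "Multiple" (na=False keeps missing values out of the mask)
mice.loc[(mice.Style == 'All') | mice.Style.str.contains('/| and | or |,', na=False) | (mice.Style == 'Adjustable'), 'Style'] = 'Multiple'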
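
The grip-style rule can be tried in isolation. The following is a minimal sketch on hypothetical values (not taken from the scraped data) showing how the separator pattern flags multi-style entries while missing values stay missing:

```python
import numpy as np
import pandas as pd

# Hypothetical grip-style values, similar in shape to the scraped field
style = pd.Series(['Claw Grip', 'Palm/Claw', 'All', 'Palm and Claw', np.nan, 'Adjustable'])

# Drop the word "Grip" and any whitespace left behind
style = style.str.replace('Grip', '', regex=False).str.strip()

# na=False makes missing values evaluate to False instead of NaN in the mask
multi = (
    style.isin(['All', 'Adjustable'])
    | style.str.contains(r'/| and | or |,', na=False)
)
style = style.mask(multi, 'Multiple')
print(style.tolist())
```

Only the single-style entry keeps its own label; the missing entry is left untouched.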

In [4]:
# DPI
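# Manually set entries whose first number would not be the maximum DPI
mice.loc[(mice['DPI'].str.contains('200-', na=False)),'DPI']='10000'
mice.loc[(mice['DPI'].str.contains('50~', na=False)),'DPI']='6400'
mice.loc[(mice['DPI'].str.contains('Pixart', na=False)),'DPI']='16000'
# Keep the first run of digits in each entry (raw string avoids an invalid-escape warning)
mice['DPI'] = mice['DPI'].str.extract(r'(\d+)', expand=False)
mice.loc[mice['DPI'].notnull(), 'DPI'] = mice.loc[mice['DPI'].notnull(), 'DPI'].astype(int)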
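
As an illustration of the digit extraction, here is a small sketch on hypothetical DPI strings. The thousands-separator step and the nullable `Int64` dtype are assumptions added for the example and are not part of the cell above:

```python
import pandas as pd

# Hypothetical raw values, mimicking inconsistent formats in the DPI field
raw = pd.Series(['10000 dpi', '3200dpi', 'Up to 16,000', None])

# Remove thousands separators so "16,000" is read as one number
digits = raw.str.replace(',', '', regex=False).str.extract(r'(\d+)', expand=False)

# The nullable Int64 dtype keeps missing values without falling back to float
dpi = pd.to_numeric(digits, errors='coerce').astype('Int64')
print(dpi.tolist())
```

With a nullable integer column, the values would stay integers in memory even when some entries are missing.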

In [5]:
# Buttons
mice.loc[(mice['Buttons'].str.contains('digital', na=False)),'Buttons']='9'
mice.loc[(mice['Buttons'].str.contains('programmable buttons', na=False)),'Buttons']='6'
mice.loc[(mice['Buttons'].str.contains('8 programmable', na=False)),'Buttons']='9'
mice.loc[(mice['Buttons'].str.contains('OMRON', na=False)),'Buttons']='5'
mice.loc[mice['Buttons'].notnull(), 'Buttons'] = mice.loc[mice['Buttons'].notnull(), 'Buttons'].astype(int)
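
One thing that may be worth checking: the replacements above are substring matches applied in order, so a general pattern can rewrite an entry before a more specific one sees it. The sketch below uses hypothetical strings (not taken from the scraped data) and applies the more specific pattern first:

```python
import pandas as pd

# Hypothetical free-text values, not taken from the scraped data
buttons = pd.Series(['8 programmable buttons', '6 programmable buttons', '7'])

# Specific pattern first, general pattern second
rules = [('8 programmable', '9'), ('programmable buttons', '6')]

cleaned = buttons.copy()
for pattern, value in rules:
    # Literal substring match; missing values never match
    hit = cleaned.str.contains(pattern, regex=False, na=False)
    cleaned = cleaned.mask(hit, value)

print(cleaned.astype(int).tolist())
```

If the two rules were swapped, the first entry would be set to 6 before the `'8 programmable'` rule could apply.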

In [6]:
# Connection Type
mice.loc[(mice.Connection.str.contains('wireless|Wireless', na=False)),'Connection']='Wireless'
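
The pattern above covers two capitalizations of "wireless". A possible alternative, sketched here on hypothetical connection strings, is a case-insensitive match:

```python
import pandas as pd

# Hypothetical connection strings
conn = pd.Series(['Wired', 'Wired / Wireless', 'wireless', 'Dual Mode: WIRELESS + USB', None])

# case=False matches any capitalization; na=False treats missing values as no match
is_wireless = conn.str.contains('wireless', case=False, na=False)
conn = conn.mask(is_wireless, 'Wireless')
print(conn.tolist())
```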

In [7]:
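# Save to a new file and reload. to_csv writes the index by default,
# which is why an extra unnamed index column appears after reloading.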
mice.to_csv(r'./mice/mice_modified.csv')
mice = pd.read_csv('./mice/mice_modified.csv')

In [8]:
mice.head()

Unnamed: 0.1,Unnamed: 0,Brand,Name,Model,Style,DPI,Buttons,Connection,Color,Rating,Reviews,Price
0,0,CORSAIR,HARPOON,CH-9311011-NA,Claw,10000.0,6.0,Wireless,Black,4.0,83,49.99
1,1,UTECHSMART,Venus,US-D16400-GM,,16400.0,19.0,Wired,Black,4.0,242,29.99
2,2,SADES,,,Palm,3200.0,6.0,Wired,Pink,4.0,4,18.49
3,3,AULA,Wired Gaming Mouse,F805,,6400.0,7.0,Wired,Black,4.0,2,14.99
4,4,RAZER,DeathAdder Elite,RZ01-02010100-R3U1,,16000.0,7.0,Wired,Black,4.0,642,30.99
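
The extra unnamed column in the reloaded table comes from the index being written to the CSV. A minimal sketch of the difference, using an in-memory buffer instead of a file and two rows from the table above:

```python
import io
import pandas as pd

df = pd.DataFrame({'Brand': ['CORSAIR', 'RAZER'], 'Price': [49.99, 30.99]})

# Default: the index is written as an unnamed first column
with_index = io.StringIO()
df.to_csv(with_index)
with_index.seek(0)
cols_with_index = pd.read_csv(with_index).columns.tolist()
print(cols_with_index)      # ['Unnamed: 0', 'Brand', 'Price']

# index=False: only the data columns are written
without_index = io.StringIO()
df.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = pd.read_csv(without_index).columns.tolist()
print(cols_without_index)   # ['Brand', 'Price']
```

Passing `index=False` when saving would keep the reloaded frame free of the stray column.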


<p><a name="GA"></a></p>

### General Analysis

**Brand**

Observations:
- The brand with the most products (ZELOTES) is not very well known and only sells mice in the lower price range. 5 of the top 10 brands by product count only produce lower-priced mice: no product from these brands costs more than 40 dollars, and each brand's median price is at most 30 dollars.
- Only 5 of the top 10 brands by product count have 100 or more total reviews, indicating that having more products does not necessarily mean more popularity among customers.
- With the exception of 2, the top 10 brands by review count all have a mean rating of around 4 stars and a median of 4 stars. This suggests that products from these 8 brands are consistently well received, and none of them stands out as disliked or of noticeably poorer quality.

In [49]:
colFun = {'Brand':['size'],
          'Rating': ['mean', 'median'], 
          'Reviews': ['sum'],
          'Price':['min','max', 'median']}

mice_brand = mice.groupby('Brand').agg(colFun)
mice_brand_product = mice_brand.sort_values(by=[('Brand', 'size')], ascending = False).head(10)
mice_brand_reviews = mice_brand.sort_values(by=[('Reviews', 'sum')], ascending = False).head(10)

print(mice_brand_product)
print('\n' * 2)
print(mice_brand_reviews)

            Brand Rating        Reviews  Price                
             size   mean median     sum    min     max  median
Brand                                                         
ZELOTES        30   3.90   4.50      88  11.99   28.99   14.99
CORSAIR        24   3.88   4.00    1290  29.99  128.33   79.99
LOGITECH       22   4.18   4.00    2522  39.88  199.99   95.38
RAZER          18   3.88   4.00    5827  25.46  236.99   79.99
IMICE          17    NaN    NaN       0  13.99   23.99   14.99
STEELSERIES    11   4.33   4.00     390  39.75  213.00  107.99
LUOM           11   5.00   5.00      12  13.99   14.99   13.99
GLORIOUS       11   4.30   4.00     100  78.99  145.99   81.99
AULA           10   4.00   4.00       2  14.99   39.99   25.49
RAJFOO         10    NaN    NaN       0  16.99   18.99   18.99



            Brand Rating        Reviews   Price                
             size   mean median     sum     min     max  median
Brand                                             
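
The price-range observation can be checked programmatically rather than by reading the table. The sketch below uses a hypothetical toy frame (brands `A`, `B`, `C` and all values are made up) and named aggregation, which gives flat column names instead of the two-level header above:

```python
import pandas as pd

# Hypothetical rows; the real notebook would use the cleaned `mice` frame
toy = pd.DataFrame({
    'Brand':   ['A', 'A', 'B', 'B', 'B', 'C'],
    'Reviews': [10, 5, 200, 0, 50, 3],
    'Price':   [14.99, 28.99, 79.99, 129.99, 39.99, 18.99],
})

# Named aggregation: output_name=(source_column, function)
summary = toy.groupby('Brand').agg(
    products=('Price', 'size'),
    total_reviews=('Reviews', 'sum'),
    max_price=('Price', 'max'),
    median_price=('Price', 'median'),
)

# Brands whose whole range stays in the lower price band used in the observations
budget = summary[(summary['max_price'] <= 40) & (summary['median_price'] <= 30)]
print(budget.index.tolist())
```

The same filter applied to the top 10 brands by product count would give the number of budget-only brands directly.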

### Things to Note

Ratings are rounded to the nearest star, so they have poor resolution.

The scraped listing information can be incorrect or inconsistent.

Significant factors not looked at in this project:
- RGB lighting is most likely one of the biggest contributing factors, but it is difficult to extract whether a product has RGB.
- Aesthetics: more "gamer" or futuristic looks versus basic or sleeker designs, as well as secondary color.
- Ergonomics or type of mouse. For example, some mice are vertical or have the form factor of a gun.
- Weight adjustability.