In [7]:
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression

# Importing dataset
weapons = pd.read_csv('cleaned dataset with categories.csv')

#weapons.loc[weapons['category'] == 'none']

## Weapon category indicator inputs

The cell below counts the individual weapons that belong to each weapon category.

In [8]:
# Calculating the number of weapons in each category
weapons.groupby("category")["category"].count()

category
heavy     1089
knife     2513
pistol    2123
rifle     2348
smg       1457
Name: category, dtype: int64

Next, we create indicator (dummy) variables that record each weapon's membership in a specific category. We will use these dummy variables as inputs to the linear regressions fitted later on.

In [9]:
# Creating dummy variables for weapon category
category_dummies = pd.get_dummies(weapons["category"])
category_dummies.head()


| | heavy | knife | pistol | rifle | smg |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 | 0 |
| 1 | 0 | 0 | 0 | 1 | 0 |
| 2 | 0 | 0 | 0 | 1 | 0 |
| 3 | 0 | 0 | 0 | 1 | 0 |
| 4 | 0 | 0 | 0 | 1 | 0 |
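
Depending on the pandas version, `pd.get_dummies` may return boolean columns rather than the 0/1 integers shown above. The following is a minimal sketch, on hypothetical toy data rather than the `weapons` dataset, of requesting integer indicators explicitly with the `dtype` argument:

```python
import pandas as pd

# Hypothetical toy data standing in for weapons["category"]
toy = pd.DataFrame({"category": ["rifle", "knife", "smg", "rifle"]})

# dtype=int requests 0/1 integer columns regardless of the version's default
toy_dummies = pd.get_dummies(toy["category"], dtype=int)

# Each row should contain exactly one 1, marking that row's category
row_sums = toy_dummies.sum(axis=1)

print(toy_dummies)
print(row_sums.tolist())
```

Either representation should work as regression input, since booleans are treated as 0 and 1 when converted to a numeric array.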


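In the cell below, we fit a separate simple linear regression of price on each category indicator and display its slope to see the relationship between a weapon's category and its price. Because each indicator is either 0 or 1, the slope equals the difference between the average price of weapons in that category and the average price of all other weapons.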

In [27]:
# Calculating linear regression for each column in category_dummies
heavies = category_dummies['heavy']
knives = category_dummies['knife']
pistols = category_dummies['pistol']
rifles = category_dummies['rifle']
smgs = category_dummies['smg']

# Creating linear regression models
heavy_model = LinearRegression().fit(np.array(heavies).reshape(-1,1), weapons['price'])
knife_model = LinearRegression().fit(np.array(knives).reshape(-1,1), weapons['price'])
pistol_model = LinearRegression().fit(np.array(pistols).reshape(-1,1), weapons['price'])
rifle_model = LinearRegression().fit(np.array(rifles).reshape(-1,1), weapons['price'])
smg_model = LinearRegression().fit(np.array(smgs).reshape(-1,1), weapons['price'])

# Displaying slopes of each regression
print("Heavies:", heavy_model.coef_[0])
print("Knives:", knife_model.coef_[0])
print("Pistols:", pistol_model.coef_[0])
print("Rifles:", rifle_model.coef_[0])
print("SMGs:", smg_model.coef_[0])

Heavies: -96.90968171227775
Knives: 274.90222174491913
Pistols: -101.10398529264816
Rifles: -68.49790460914652
SMGs: -103.02027279472205
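
To make these slopes easier to read: when the only input is a 0/1 indicator, the fitted slope is the mean price inside the category minus the mean price outside it, and the intercept is the mean price outside it. The sketch below checks this on hypothetical toy numbers (the `indicator` and `price` arrays are invented for illustration and are not taken from the dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: 1 marks membership in a category, 0 otherwise
indicator = np.array([1, 1, 0, 0, 0])
price = np.array([300.0, 500.0, 100.0, 150.0, 50.0])

# Same fitting pattern as the cell above: one indicator column as input
model = LinearRegression().fit(indicator.reshape(-1, 1), price)
slope = model.coef_[0]
intercept = model.intercept_

# Group means computed directly from the data
mean_in = price[indicator == 1].mean()
mean_out = price[indicator == 0].mean()
mean_diff = mean_in - mean_out

print("slope:", slope, "difference in means:", mean_diff)
print("intercept:", intercept, "mean outside category:", mean_out)
```

Read this way, a slope of about $-100$ says that weapons in the category cost roughly 100 less on average than all other weapons combined.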


## Interpretation

Based on the slopes displayed above, we can see how the category that a weapon falls into relates to its price. Each slope is the difference between the average price of weapons in a category and the average price of all other weapons, so it measures the size of a price gap rather than the strength of a correlation. The slope of `knife_model` is approximately $275$, which is by far the highest of any category and the only positive one. On average, knives are much more expensive than any other weapon type. Since they fetch the highest prices, this suggests that knives are the most desirable weapons on the marketplace.

Based on this interpretation, rifles appear to be the second most desirable category of weapon on the market, given that the slope of `rifle_model` is the second highest, at around $-68$. The negative sign means that rifles are cheaper on average than all non-rifle weapons taken together, a group whose average is pulled up by the knives. On average, rifles are less expensive than knives, but more expensive than SMGs, pistols, and heavies.

The models for the remaining categories (heavies, pistols, and SMGs) have very similar slopes of around $-100$. These weapons are the cheapest on average and, by the same reasoning, the least desirable on the market.
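
One way to cross-check a ranking like this is to compare the average price per category directly, since each slope above compares one category against all the others combined. The following is a minimal sketch on hypothetical toy data (the prices are invented for illustration; on the real data the same `groupby` would presumably be applied to `weapons`):

```python
import pandas as pd

# Hypothetical toy data with the same column names as the weapons dataset
toy = pd.DataFrame({
    "category": ["knife", "knife", "rifle", "rifle", "smg", "smg"],
    "price": [400.0, 600.0, 120.0, 180.0, 40.0, 60.0],
})

# Average price per category, sorted from most to least expensive
mean_price = (
    toy.groupby("category")["price"]
    .mean()
    .sort_values(ascending=False)
)

print(mean_price)
```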