![WoW logo](datasets/logo.png)


# What is World of Warcraft?
According to Wikipedia:

> World of Warcraft (WoW) is a massively multiplayer online role-playing game (MMORPG) released in 2004 by Blizzard Entertainment. It is the fourth released game set in the Warcraft fantasy universe. World of Warcraft takes place within the Warcraft world of Azeroth, approximately four years after the events at the conclusion of Blizzard's previous Warcraft release, Warcraft III: The Frozen Throne. The game was announced in 2001, and was released for the 10th anniversary of the Warcraft franchise on November 23, 2004. Since launch, World of Warcraft has had eight major expansion packs produced for it: The Burning Crusade, Wrath of the Lich King, Cataclysm, Mists of Pandaria, Warlords of Draenor, Legion, Battle for Azeroth and Shadowlands.

> World of Warcraft was the world's most popular MMORPG by player count of nearly 10 million in 2009. The game had a total of over a hundred million registered accounts by 2014. By 2017, the game had grossed over $9.23 billion in revenue, making it one of the highest-grossing video game franchises of all time. At BlizzCon 2017, a vanilla version of the game titled World of Warcraft Classic was announced, which planned to provide a way to experience the base game before any of its expansions launched. It was released in August 2019.

### Why is it worth investigating?

World of Warcraft has already been a gold mine for scientific work (see [Google Scholar](https://scholar.google.com/scholar?hl=pl&as_sdt=0,5&q=world+of+warcraft)). From social studies to finance, the world created by Blizzard has yielded a lot of information about how players behave with their characters, which helps model how they act in real life. In this notebook, I will explore how demand for raid supplies drove auction house prices over a few months.

### What is the meaning of all those strange words?

Like any other game, World of Warcraft has its own terminology for things that exist in its world. Here is a summary of the items and terms used in this notebook.

- **PvE** - Player versus Environment, gameplay that focuses on fighting computer generated enemies,
- **PvP** - Player versus Player, gameplay that focuses on fighting other players,
- **Party** - A group of 2-5 players, created for participating in PvE or PvP scenarios,
- **Raid Group** - A group of 6-40 players divided in groups of 5, created for participating in PvE or PvP scenarios,
- **Raid** - An instanced dungeon with several bosses that poses a challenge for a raid group,
- **Flask** - A one-hour character stats buff that persists through death,
- **Potion** - A short boost to stats that increases damage, survivability, etc.,
- **Food** - A long character stats buff that disappears on death,
- **Feast** - A table that provides food buffs and can be used by multiple people in a group,
- **Money** - The in-game currency, divided into gold, silver and copper:
   - 100 copper = 1 silver,
   - 100 silver = 1 gold.
- **Auction House (AH)** - Place where people sell their goods in-game.  
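
The currency rule from the list above (100 copper = 1 silver, 100 silver = 1 gold) implies that 1 gold equals 10,000 copper. As a minimal sketch, the conversion could be written as below; the helper names `split_copper` and `copper_to_gold` are hypothetical and only serve as an illustration. The example value is the `item_market_value` of one row from the raw data preview, assuming the raw prices are stored in copper.

```python
from typing import Tuple

COPPER_PER_SILVER = 100
SILVER_PER_GOLD = 100
COPPER_PER_GOLD = COPPER_PER_SILVER * SILVER_PER_GOLD  # 10,000


def split_copper(total_copper: int) -> Tuple[int, int, int]:
    """Split a copper amount into (gold, silver, copper). Hypothetical helper."""
    gold, remainder = divmod(total_copper, COPPER_PER_GOLD)
    silver, copper = divmod(remainder, COPPER_PER_SILVER)
    return gold, silver, copper


def copper_to_gold(total_copper: int) -> float:
    """Express a copper amount as a fractional number of gold. Hypothetical helper."""
    return total_copper / COPPER_PER_GOLD


print(split_copper(6730521))    # (673, 5, 21)
print(copper_to_gold(6730521))  # 673.0521
```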

# Loading in the data

In this section we will load the data from the CSV file and configure a few things:

- Use `;` as the separator and `id` as the index,
- Load `item_name` and `item_subclass` as categorical data,
- Parse `created_at` as dates; this is the time the data was gathered from the AH.

In [1]:
# Import libraries
import pandas as pd
import seaborn as sns

In [2]:
# Read the data and preview sample rows
wow = pd.read_csv(
    'datasets/tsm_data.csv',
    sep=';',
    index_col='id',
    parse_dates=['created_at'],
    dtype={
        'item_name': 'category',
        'item_subclass': 'category',
    },
)
display(wow.sample(5, random_state=42))
wow.info()

| id | item_name | item_subclass | item_vendor_buy | item_vendor_sell | item_market_value | item_min_buyout | item_quantity | item_num_auctions | created_at |
|---|---|---|---|---|---|---|---|---|---|
| 38240 | Greater Flask of the Vast Horizon | Flask | 10000 | 2500 | 6730521 | 6899900 | 1821 | 37 | 2020-03-13 19:54:26 |
| 9850 | Potion of Wild Mending | Potion | 10000 | 2500 | 759781 | 739548 | 449 | 39 | 2019-12-08 15:30:40 |
| 2819 | Potion of Unbridled Fury | Potion | 10000 | 2500 | 1026773 | 1000200 | 7629 | 532 | 2019-11-22 23:03:07 |
| 30800 | Superior Battle Potion of Agility | Potion | 10000 | 2500 | 1192344 | 1399800 | 2100 | 30 | 2020-02-26 05:56:37 |
| 43488 | Fragrant Kakavia | Food & Drink | 25000 | 1250 | 54392 | 45800 | 2005 | 10 | 2020-03-25 14:31:04 |


<class 'pandas.core.frame.DataFrame'>
Int64Index: 44859 entries, 305 to 45163
Data columns (total 9 columns):
 #   Column             Non-Null Count  Dtype         
---  ------             --------------  -----         
 0   item_name          44859 non-null  category      
 1   item_subclass      44859 non-null  category      
 2   item_vendor_buy    44859 non-null  int64         
 3   item_vendor_sell   44859 non-null  int64         
 4   item_market_value  44859 non-null  int64         
 5   item_min_buyout    44859 non-null  int64         
 6   item_quantity      44859 non-null  int64         
 7   item_num_auctions  44859 non-null  int64         
 8   created_at         44859 non-null  datetime64[ns]
dtypes: category(2), datetime64[ns](1), int64(6)
memory usage: 2.8 MB
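
To see what the loading options do without needing the dataset file, here is a small self-contained sketch that reads an in-memory CSV with the same `read_csv` arguments. The two rows are copied from the preview above, reduced to a few columns; the variable names `sample_csv` and `sample` are assumptions made for this illustration.

```python
import io

import pandas as pd

# In-memory stand-in for the real file: same ';' separator, fewer columns.
sample_csv = io.StringIO(
    "id;item_name;item_subclass;item_market_value;created_at\n"
    "9850;Potion of Wild Mending;Potion;759781;2019-12-08 15:30:40\n"
    "43488;Fragrant Kakavia;Food & Drink;54392;2020-03-25 14:31:04\n"
)

sample = pd.read_csv(
    sample_csv,
    sep=';',                      # fields are separated by semicolons
    index_col='id',               # use the id column as the index
    parse_dates=['created_at'],   # convert snapshot timestamps to datetimes
    dtype={
        'item_name': 'category',
        'item_subclass': 'category',
    },
)

# Inspect the resulting column types
print(sample.dtypes)
```

Storing the repeated item names and subclasses as `category` keeps memory usage lower than plain strings would, which matters more as the number of snapshots grows.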


# Data Cleaning

The good news is that we don't have any null values. In this section we will still do a few things:

- Remove the `item_vendor_buy` and `item_vendor_sell` columns, as they don't carry any relevant information,
- Filter out rows where an item had `0` auctions at the time of the snapshot,
- Convert prices from copper to gold for readability (no one really uses copper as their base unit).

In [3]:
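# Keep only rows with at least one auction.
# .copy() makes the result an independent frame, which avoids
# SettingWithCopyWarning on the assignments below.
wow = wow[wow['item_num_auctions'] > 0].copy()

# Remove irrelevant columns
wow = wow.drop(
    columns=[
        'item_vendor_buy',
        'item_vendor_sell',
    ]
)

# Convert copper to gold (1 gold = 100 silver = 10,000 copper)
wow['item_market_value'] = wow['item_market_value'] / 10000
wow['item_min_buyout'] = wow['item_min_buyout'] / 10000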

In [4]:
# Preview dataframe after cleaning
display(wow.sample(5, random_state=42))
wow.info()

| id | item_name | item_subclass | item_market_value | item_min_buyout | item_quantity | item_num_auctions | created_at |
|---|---|---|---|---|---|---|---|
| 30736 | Greater Flask of the Undertow | Flask | 397.4604 | 399.99 | 2863 | 38 | 2020-02-26 01:56:45 |
| 26097 | Greater Flask of the Currents | Flask | 498.9769 | 399.0 | 12532 | 154 | 2020-01-25 10:06:57 |
| 29099 | Greater Flask of the Currents | Flask | 388.1892 | 380.87 | 4256 | 56 | 2020-02-22 10:16:50 |
| 16381 | Superior Battle Potion of Stamina | Potion | 68.7077 | 38.4997 | 234 | 45 | 2019-12-24 01:14:48 |
| 38055 | Baked Port Tato | Food & Drink | 35.0065 | 50.95 | 21003 | 87 | 2020-03-13 09:54:34 |


<class 'pandas.core.frame.DataFrame'>
Int64Index: 44707 entries, 305 to 45163
Data columns (total 7 columns):
 #   Column             Non-Null Count  Dtype         
---  ------             --------------  -----         
 0   item_name          44707 non-null  category      
 1   item_subclass      44707 non-null  category      
 2   item_market_value  44707 non-null  float64       
 3   item_min_buyout    44707 non-null  float64       
 4   item_quantity      44707 non-null  int64         
 5   item_num_auctions  44707 non-null  int64         
 6   created_at         44707 non-null  datetime64[ns]
dtypes: category(2), datetime64[ns](1), float64(2), int64(2)
memory usage: 2.1 MB
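
The cleaning steps can also be packaged into a single function that returns a new frame and leaves its input untouched, which makes them easy to check on a tiny example. This is only a sketch: the function name `clean_auctions` and the toy frame are assumptions made for illustration, and the second toy row (with zero auctions) is invented to exercise the filter.

```python
import pandas as pd

COPPER_PER_GOLD = 10_000  # 100 copper = 1 silver, 100 silver = 1 gold


def clean_auctions(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the auction data (hypothetical helper)."""
    cleaned = (
        raw[raw['item_num_auctions'] > 0]                       # drop empty snapshots
        .drop(columns=['item_vendor_buy', 'item_vendor_sell'])  # drop vendor prices
        .copy()
    )
    # Convert prices from copper to gold
    for column in ['item_market_value', 'item_min_buyout']:
        cleaned[column] = cleaned[column] / COPPER_PER_GOLD
    return cleaned


# Toy input: the first row mirrors a real preview row, the second is invented.
toy = pd.DataFrame({
    'item_vendor_buy': [10000, 10000],
    'item_vendor_sell': [2500, 2500],
    'item_market_value': [759781, 1026773],
    'item_min_buyout': [739548, 0],
    'item_num_auctions': [39, 0],
})

cleaned_toy = clean_auctions(toy)
print(cleaned_toy)
```

Because the function works on a copy, it can be re-run on the raw data without first reloading the CSV.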


# Data Analysis