# <center>Analysis of World of Warcraft Population Data</center>
## <center>Part Three: Exploratory Data Analysis (EDA)</center>
### <center>Data obtained from [realmpop.com](https://realmpop.com/)</center>

This blog post uses names and images from World of Warcraft and data proprietary to World of Warcraft. World of Warcraft, Warcraft and Blizzard Entertainment are trademarks or registered trademarks of Blizzard Entertainment, Inc. in the U.S. and/or other countries.

Welcome back to the third in my series of notebooks detailing the analysis of World of Warcraft population data from [Realm Pop](https://realmpop.com). In the previous entries in this series ([Part One](https://github.com/ereidelbach/wow/blob/master/warcraft1_data_collection.ipynb) and [Part Two](https://github.com/ereidelbach/wow/blob/master/warcraft2_exploratory_data_analysis.ipynb)) we collected the data for every realm, transformed it into a flattened structure, and engineered a few features we thought might be useful in our analysis.

With all those steps complete, we're now ready to begin the fun process of uncovering the insights that are buried within the data. Let's begin!

(Note: Almost all of these statistics are already available in the higher-level realm data we obtained in [Part One](https://github.com/ereidelbach/wow/blob/master/warcraft1_data_collection.ipynb), with the exception of the Name analysis. However, we're going to forge ahead with our own analysis to verify the integrity of that data when compared with the realm-level data.)

# 1. Data Ingest

In [13]:
# Import the necessary packages
import altair as alt
import json
import pandas as pd

# Enable Altair in the notebook
alt.renderers.enable('notebook')

# Remove the Altair limit on plotting a maximum of 5000 rows of data
alt.data_transformers.enable('default', max_rows=None)

# Load the data for Ysera
with open('data/realms-us/us-ysera_08-15-2018-15:46_flattened.json', 'r') as f:
    df_ysera = pd.DataFrame.from_dict(json.load(f), orient='index')

In [14]:
df_ysera.head()

| | class | faction | gender | healer | level | race | realm | tank |
|---|---|---|---|---|---|---|---|---|
| AabankD07267 | Rogue | Horde | Male | 0 | 1 | Undead | ysera | 0 |
| Aabankman | Warrior | Alliance | Male | 0 | 22 | Dwarf | ysera | 1 |
| Aabbee | Priest | Alliance | Female | 1 | 1 | Human | ysera | 0 |
| Aabbyy | Hunter | Horde | Female | 0 | 36 | Goblin | ysera | 0 |
| Aabf | Priest | Alliance | Female | 1 | 1 | Human | ysera | 0 |
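
One thing worth checking right after the load is the column dtypes. The `index.map(int)` call used later in this notebook hints that `level` may arrive as strings from the JSON, though that is an assumption rather than something the output above confirms. Here is a minimal sketch using a hand-built dictionary that mirrors a few columns of the five rows shown above, with `level` deliberately stored as strings; `sample` and `df_sample` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample shaped like the flattened realm JSON:
# character name -> attributes. Storing level as a string is an
# assumption made for illustration.
sample = {
    'AabankD07267': {'class': 'Rogue', 'faction': 'Horde', 'level': '1'},
    'Aabankman': {'class': 'Warrior', 'faction': 'Alliance', 'level': '22'},
    'Aabbee': {'class': 'Priest', 'faction': 'Alliance', 'level': '1'},
    'Aabbyy': {'class': 'Hunter', 'faction': 'Horde', 'level': '36'},
    'Aabf': {'class': 'Priest', 'faction': 'Alliance', 'level': '1'},
}
df_sample = pd.DataFrame.from_dict(sample, orient='index')

# Convert level to integers so sorting and comparisons are numeric
# rather than lexicographic
df_sample['level'] = pd.to_numeric(df_sample['level']).astype(int)

print(df_sample.dtypes)
print(df_sample.sort_values('level'))
```

If `level` turns out to be numeric already, the conversion is harmless and can be dropped.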


# 2. Realm Data Analysis

### A. Question: Does one faction struggle getting to level 110 or above more than another?

To begin, let's start by grouping characters by faction and then analyzing the breakdown of player levels within those groups.

In [54]:
# Enable in-line plotting of charts
%matplotlib inline

# Count Alliance characters at each level
groupby_faction = df_ysera[df_ysera['faction'] == 'Alliance'].groupby('level').count()

# Convert the level labels in the index to integers
groupby_faction.index = groupby_faction.index.map(int)

# Note: sort_index() returns a new DataFrame; the result is not assigned here,
# so groupby_faction keeps its original row order
groupby_faction.sort_index()

groupby_faction.head()

| level | class | faction | gender | healer | race | realm | tank |
|---|---|---|---|---|---|---|---|
| 1 | 1985 | 1985 | 1985 | 1985 | 1985 | 1985 | 1985 |
| 10 | 1124 | 1124 | 1124 | 1124 | 1124 | 1124 | 1124 |
| 100 | 15195 | 15195 | 15195 | 15195 | 15195 | 15195 | 15195 |
| 101 | 1512 | 1512 | 1512 | 1512 | 1512 | 1512 | 1512 |
| 102 | 1063 | 1063 | 1063 | 1063 | 1063 | 1063 | 1063 |
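
The rows above appear in the order 1, 10, 100, 101, 102, which is string order rather than numeric order, and they cover only the Alliance. To get at the original question more directly, one option is to compute, for each faction, the share of characters at level 110 or above. The sketch below uses made-up toy data in place of `df_ysera`; the names `df_toy`, `is_110_plus`, `share_by_faction` and `counts` are hypothetical, and the 110 threshold is taken from the question itself.

```python
import pandas as pd

# Toy data standing in for df_ysera; the values are invented for illustration
df_toy = pd.DataFrame({
    'faction': ['Alliance', 'Alliance', 'Alliance', 'Alliance',
                'Horde', 'Horde', 'Horde', 'Horde'],
    'level': ['1', '110', '100', '120', '110', '22', '10', '36'],
})

# Cast levels to integers so comparisons and sorting are numeric
df_toy['level'] = df_toy['level'].astype(int)

# Flag characters at level 110 or above
df_toy['is_110_plus'] = df_toy['level'] >= 110

# The mean of a boolean column is the fraction of True values,
# i.e. the share of each faction's characters at level 110+
share_by_faction = df_toy.groupby('faction')['is_110_plus'].mean()

# Character counts per level with one column per faction, in numeric level order
counts = (
    df_toy.groupby(['faction', 'level'])
    .size()
    .unstack('faction', fill_value=0)
    .sort_index()
)

print(share_by_faction)
print(counts)
```

Comparing shares rather than raw counts keeps the comparison fair when one faction has far more characters than the other.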


In [10]:
# Altair plot of player counts by faction
# alt.Chart(df_ysera).mark_bar().encode(
#     alt.X('faction:N', bin=False),
#     y='count()',
#     tooltip='count()',
#     color='class'
# )

# import seaborn as sns
# sns.countplot(x='faction', data=df_ysera)

# 2. Realm Data Extraction

# 3. Region-Based Analysis

## A. Faction Info

## B. Gender Info

## C. Class Info

## D. Race Info

## E. Level Info

## F. Name Info

# 4. Realm-Based Statistics

## A. Faction Info

#### 1. Horde Players vs Alliance Players

## B. Gender Info

## C. Class Info

## D. Race Info

## E. Level Info

## F. Name Info