## Data Visualization in an e-Sports Environment
### Nathaniel Cooper

e-Sports is, historically, a recent phenomenon: video game competitions with well-regulated play for prize money. It is a fast-growing business, with 900 million USD generated in 2018, up from 497 million USD in 2016, according to https://newzoo.com/key-numbers/. These tournaments are streamed on sites such as twitch.tv and youtube.com. For example, Method, a top World of Warcraft (WoW) guild (https://www.youtube.com/user/MethodNetwork/videos?flow=grid&view=0&sort=p), has YouTube videos with views numbering in the millions. Blizzard, the publisher of World of Warcraft, hosts an annual event, BlizzCon, that had 35,000+ attendees in 2017 (https://wow.gamepedia.com/BlizzCon_2017), along with 'Virtual Ticket' holders who had access to streams of the events. BlizzCon features e-Sports events for many of Blizzard's games (https://esports.blizzard.com/en-us/), such as WoW, StarCraft, and Overwatch.

This analysis will focus on World of Warcraft, a Massively Multiplayer Online Roleplaying Game. As in pen-and-paper roleplaying games such as Dungeons and Dragons, players control characters in a fantasy environment with the goal of defeating challenges, often fantasy-style antagonists such as monsters and humanoid villains. Defeating a villain gives players experience points (at lower levels) and more powerful gear that allows them to defeat more powerful challenges. To accomplish these goals, players organize into teams called Guilds.

Although fighting other players (PvP) is one aspect of the game, this analysis will focus on fighting non-player characters in Dungeons and Raids, which is referred to as Player vs Environment (PvE). Dungeons are set areas, not necessarily actual dungeons (they can be towns, ruins, castles, etc.), that have a fixed number of powerful enemies to defeat, called Bosses (typically 3 to 5), and their henchmen, called Trash. Dungeons require 5 players: a tank, who keeps enemies attacking them and absorbs damage; a healer, who uses magic spells to heal damage to the tank and other players; and 3 dps (damage per second) players, who kill the bad guys. Raids are typically more challenging and require 10-30 players; the difficulty of the bosses and trash scales with the number of players. Dungeons can be set to four difficulties: Normal, Heroic, Mythic, and Mythic Plus (2-15). Raid difficulties are Raid Finder, Normal, Heroic, and Mythic. Mythic Raids require exactly 25 players. Note that Mythic Dungeons are about as difficult as Normal Raids, and Mythic Plus Dungeons are about as difficult as Heroic or Mythic Raids, depending on the plus level.


## The Data

To show that the data are sufficiently complex for graduate-level analysis, I have provided an example below. These data are from the last half of a run in the Uldir raid. This raid is set in the ruins of a long-abandoned laboratory where Titans (think Greek Mythos) did research to defeat The Old Gods (think H.P. Lovecraft). The lab is defended by two Golems (magical robots), Taloc and M.O.T.H.E.R, who are tasked with keeping people away from the failed experiments within (Vectis, Fetid Devourer, Zek'vos) and the followers (Zul, Mythrax) of an Old God (G'Huun) that has taken the facility as its home. All named characters are Bosses. The data cover my Guild's (Wicked Claw-Lightbringer, https://www.wowprogress.com/guild/us/lightbringer/Wicked+Claw) fights versus the bosses Vectis, Zek'vos, Zul, and Mythrax.

The data frame contains 524594 rows in 31 columns. Data types are dates, timestamps, numerical, and string. A challenge is that WoW's combat logger does not write column headers, so columns must be identified from documentation (https://wow.gamepedia.com/COMBAT_LOG_EVENT) and context. Some exploration will be required to make sure that I am graphing spell damage and not a backend id number.

In [1]:
import sys
from io import StringIO
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [2]:
import plotly
# Credentials redacted: supply your own plotly API key here
plotly.tools.set_credentials_file(username='njcooper137', api_key='YOUR_API_KEY')
import plotly.plotly as py
import plotly.graph_objs as go
from plotly import tools

In [3]:
file = "C:\\Users\\Nate\\Documents\\DataSet\\warcraftlogsarchive\\WoWCombatLog-archive-2018-11-06T14-20-52.865Z.txt"
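# Read the whole log as UTF-8 (character names contain accented letters);
# the with-block closes the file handle afterwards
with open(file, mode='r', encoding='utf-8') as raw_log:
    data = raw_log.read()
# Set all delimiters to spaces for consistency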
data = data.replace(',', ' ')
data = StringIO(data)

In [4]:
# Since the number of columns varies between rows,
# I had to set the column count manually to get pandas to read all rows.
#my_cols = list(range(0,35))
# Column titles came after a lot of trial and error: the documentation is out of date.
# My technique was to record a fight with the combat log showing so I could identify the
# column containing the damage or healing dealt.
my_cols = ["Day", "Time", "Event", "sourceGUID", "sourceName", "sourceFlags", 
          "sourceRaidFlags", "destGUID", "destName", "destFlags", "destRaidFlags",
          "spellId", "spellName", "spellSchool", "amount/type", "extraInfo1", 
           "extraInfo2","dsp_resist/heal_crit","blocked", "absorbed", "critical", 
           "glancing", "crushing", "isOffHand", "extraInfo3", "extraInfo4","extraInfo5" ,
           "swing_damage","extraInfo7","extraInfo8","spell_damage"]
wow_log = pd.read_table(data,
                        delim_whitespace=True, 
                        skiprows=1, 
                        names = my_cols)


Columns (16,17,19,20,21,22,23,24,26,27) have mixed types. Specify dtype option on import or set low_memory=False.
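
The mixed-type warning means some columns hold both numbers and strings, so a column intended for plotting may not be numeric yet. A minimal sketch of one way to coerce such a column, assuming non-numeric entries (for example `nil`) can be treated as missing; the helper name `coerce_numeric` and the toy frame are illustrative and not part of the original notebook:

```python
import pandas as pd

def coerce_numeric(df, columns):
    """Return a copy of df with the given columns converted to numbers.

    Values that cannot be parsed become NaN (assumption: they are not amounts).
    """
    out = df.copy()
    for col in columns:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out

# Toy stand-in for a mixed-type log column (hypothetical values).
sample = pd.DataFrame({
    "Event": ["SPELL_DAMAGE", "SPELL_AURA_APPLIED", "SPELL_DAMAGE"],
    "spell_damage": ["1152", "nil", 369.0],
})
clean = coerce_numeric(sample, ["spell_damage"])
print(clean["spell_damage"].tolist())
```

Coercing on a copy keeps the raw strings available for inspection, which helps when checking whether a column really holds amounts rather than id numbers.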



In [5]:
wow_log.head(5)

Unnamed: 0,Day,Time,Event,sourceGUID,sourceName,sourceFlags,sourceRaidFlags,destGUID,destName,destFlags,...,glancing,crushing,isOffHand,extraInfo3,extraInfo4,extraInfo5,swing_damage,extraInfo7,extraInfo8,spell_damage
0,11/3,22:18:57.470,SPELL_AURA_REFRESH,Player-3694-07C4D75C,Krodorr-Lightbringer,0x514,0x0,Player-3694-07C4D75C,Krodorr-Lightbringer,0x514,...,,,,,,,,,,
1,11/3,22:18:59.288,SPELL_AURA_APPLIED,Player-3694-079BEF72,Yosana-Lightbringer,0x514,0x0,Player-3694-079BEF72,Yosana-Lightbringer,0x514,...,,,,,,,,,,
2,11/3,22:19:02.304,SPELL_AURA_REFRESH,Player-3694-08370F7D,Fahris-Lightbringer,0x514,0x0,Player-3694-08370F7D,Fahris-Lightbringer,0x514,...,,,,,,,,,,
3,11/3,22:19:07.649,SPELL_AURA_APPLIED,Player-3694-0882D23B,Redshell-Lightbringer,0x40514,0x0,Player-3694-0882D23B,Redshell-Lightbringer,0x40514,...,,,,,,,,,,
4,11/3,22:19:07.649,SPELL_CAST_SUCCESS,Player-3694-0882D23B,Redshell-Lightbringer,0x40514,0x0,0000000000000000,nil,0x80000000,...,3.0,105.0,105.0,0.0,659.9,-257.71,1152.0,0.0885,369.0,


The names below that end in "-Lightbringer" are player characters. The set at the top of the list are my teammates and me. Names without "-Lightbringer" are non-player entities: player pets (e.g., 'Ghostwinkle'), Bosses (e.g., 'Vectis'), friendly NPCs (e.g., 'Brann Bronzebeard'), trash (e.g., 'Nazmani Bloodweaver'), and spell names (e.g., 'Efflorescence').

In [6]:
#Identify unique players and NPCs in the set
wow_log.sourceName.unique()

array(['Krodorr-Lightbringer', 'Yosana-Lightbringer',
       'Fahris-Lightbringer', 'Redshell-Lightbringer',
       'Novakaan-Lightbringer', 'Zoobí-Lightbringer',
       'Aohige-Lightbringer', 'Kylaara-Lightbringer', 'Ayl-Lightbringer',
       'Shameonjohn-Lightbringer', 'Hippydoc-Lightbringer',
       'Efflorescence', 'Mefistophel-Lightbringer', 'Pocanu-Lightbringer',
       'Vectis', '557', '853', '851', '897', '5886', '554', '5591', '628',
       '725', '629', '689', '1216', '710', 'Furyosá-Lightbringer',
       'Gibolt-Lightbringer', 'Ragnaar-Lightbringer', 'Ghostwinkle',
       'Jaktog', 'Piprin', 'Kupmir', 'nil', 'Batte-Lightbringer',
       'Infernal', 'Mindbender', "Pip'tok", 'Primal Fire Elemental',
       'Liquid Magma Totem', 'Treant', 'Darkglare', 'Plague Amalgam',
       'Lightspawn', 'Ember Elemental', 'Shadowfiend', 'Blood Ritual',
       'Brann Bronzebeard', 'Defense Grid', 'Nazmani Dominator',
       'Orb of Harmony', 'Demonic Gateway', '6581', '5353', 'MOTHER',
 

In [7]:
wow_log.shape

(524594, 31)

In [8]:
#Dataframe that will allow me to subset based on Boss Fights
boss_fights_df = wow_log[(wow_log.Event == 'ENCOUNTER_START') | 
                          (wow_log.Event == 'ENCOUNTER_END')]

In [9]:
boss_fights_df.head()

Unnamed: 0,Day,Time,Event,sourceGUID,sourceName,sourceFlags,sourceRaidFlags,destGUID,destName,destFlags,...,glancing,crushing,isOffHand,extraInfo3,extraInfo4,extraInfo5,swing_damage,extraInfo7,extraInfo8,spell_damage
56,11/3,22:19:24.220,ENCOUNTER_START,2134,Vectis,15,17,1861,,,...,,,,,,,,,,
40114,11/3,22:23:52.445,ENCOUNTER_END,2134,Vectis,15,17,0,,,...,,,,,,,,,,
40767,11/3,22:26:21.482,ENCOUNTER_START,2134,Vectis,15,17,1861,,,...,,,,,,,,,,
63085,11/3,22:28:43.796,ENCOUNTER_END,2134,Vectis,15,17,0,,,...,,,,,,,,,,
63524,11/3,22:30:54.684,ENCOUNTER_START,2134,Vectis,15,17,1861,,,...,,,,,,,,,,


In [10]:
# Using the above dataframe, I can slice out the rows for our Vectis kill
vectis_kill_df = wow_log.loc[63524:109849]
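
The slice bounds above were read off the `boss_fights_df` index by hand. A hedged sketch of deriving them programmatically by pairing each `ENCOUNTER_START` with the next `ENCOUNTER_END`; `encounter_spans` and the toy log are hypothetical, and the sketch assumes the encounter name lands in the `sourceName` column, as it does in the output above:

```python
import pandas as pd

def encounter_spans(log):
    """Return (boss, start_label, end_label) for each START/END pair, in log order."""
    markers = log[log["Event"].isin(["ENCOUNTER_START", "ENCOUNTER_END"])]
    spans = []
    open_start = None
    for label, row in markers.iterrows():
        if row["Event"] == "ENCOUNTER_START":
            open_start = (label, row["sourceName"])
        elif open_start is not None:
            spans.append((open_start[1], open_start[0], label))
            open_start = None
    return spans

# Toy log with two attempts (values are made up for illustration).
toy_log = pd.DataFrame({
    "Event": ["SPELL_DAMAGE", "ENCOUNTER_START", "SPELL_DAMAGE", "ENCOUNTER_END",
              "ENCOUNTER_START", "SPELL_HEAL", "ENCOUNTER_END"],
    "sourceName": ["Hippydoc-Lightbringer", "Vectis", "Hippydoc-Lightbringer", "Vectis",
                   "Vectis", "Hippydoc-Lightbringer", "Vectis"],
})
spans = encounter_spans(toy_log)
boss, start, end = spans[-1]
last_attempt = toy_log.loc[start:end]  # .loc slicing includes both end labels
```

Each span can be passed straight to `.loc`, so a particular attempt is selected by position in the list rather than by copying index values.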

In [11]:
# Break the dataframe into one dataframe per Event type, keyed as "df_<EVENT>"
d = {}
for category in vectis_kill_df.Event.unique():
    d["df_{0}".format(category)] = vectis_kill_df[vectis_kill_df.Event == category]

In [12]:
d.keys()

dict_keys(['df_ENCOUNTER_START', 'df_COMBATANT_INFO', 'df_SPELL_CAST_START', 'df_SPELL_AURA_APPLIED', 'df_SPELL_AURA_REFRESH', 'df_SPELL_CAST_SUCCESS', 'df_SPELL_HEAL', 'df_SPELL_PERIODIC_HEAL', 'df_SPELL_DAMAGE', 'df_SPELL_CAST_FAILED', 'df_SPELL_ENERGIZE', 'df_RANGE_DAMAGE', 'df_SPELL_AURA_REMOVED_DOSE', 'df_SPELL_PERIODIC_ENERGIZE', 'df_SPELL_AURA_REMOVED', 'df_SWING_MISSED', 'df_SWING_DAMAGE', 'df_SWING_DAMAGE_LANDED', 'df_SPELL_MISSED', 'df_SPELL_AURA_BROKEN_SPELL', 'df_SPELL_AURA_APPLIED_DOSE', 'df_SPELL_SUMMON', 'df_SPELL_PERIODIC_DAMAGE', 'df_SPELL_ABSORBED', 'df_SPELL_PERIODIC_MISSED', 'df_SPELL_HEAL_ABSORBED', 'df_UNIT_DIED', 'df_SPELL_DRAIN', 'df_PARTY_KILL', 'df_EMOTE', 'df_ENCOUNTER_END'])
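
The same per-event split can also be written with `groupby`, which scans the frame once instead of filtering it once per event type. A small sketch on toy data (the frame below is illustrative only):

```python
import pandas as pd

toy = pd.DataFrame({
    "Event": ["SPELL_HEAL", "SPELL_DAMAGE", "SPELL_HEAL"],
    "spell_damage": [100, 250, 0],
})

# One sub-frame per event type, using the same "df_<EVENT>" key convention.
by_event = {"df_{0}".format(event): frame for event, frame in toy.groupby("Event")}
print(sorted(by_event))
```

Note that `groupby` orders its groups alphabetically, so the key order can differ from the order of first appearance that `unique()` gives.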

In [13]:
Vectis_spell_damage = d["df_SPELL_DAMAGE"][(d["df_SPELL_DAMAGE"].sourceName == 'Vectis') 
                                           | (d["df_SPELL_DAMAGE"].sourceName == 'Plague Amalgam')]
# Raid damage: the source is neither Vectis nor its Plague Amalgam adds
Raid_spell_damage = d["df_SPELL_DAMAGE"][(d["df_SPELL_DAMAGE"].sourceName != 'Vectis') 
                                           & (d["df_SPELL_DAMAGE"].sourceName != 'Plague Amalgam')]
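
Excluding several source names at once is easy to get wrong, because a row only belongs to the raid if it differs from every boss-side name. A minimal sketch using a negated `isin` on a toy frame (the `boss_side` list mirrors the two names used in this cell; the damage values are made up):

```python
import pandas as pd

toy = pd.DataFrame({
    "sourceName": ["Vectis", "Hippydoc-Lightbringer", "Plague Amalgam", "Ayl-Lightbringer"],
    "spell_damage": [9000, 1200, 4000, 800],
})
boss_side = ["Vectis", "Plague Amalgam"]

is_boss = toy["sourceName"].isin(boss_side)
boss_rows = toy[is_boss]
raid_rows = toy[~is_boss]  # same rows as (name != A) & (name != B)
```

Building both subsets from one mask guarantees that every row lands in exactly one of them.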

In [14]:
trace0 = go.Scatter(
    x = Raid_spell_damage['Time'],
    y = Raid_spell_damage['spell_damage'],
    name = 'Raid Spell Damage'
)
trace1 = go.Scatter(
    x = Vectis_spell_damage['Time'],
    y = Vectis_spell_damage['spell_damage'],
    name = 'Boss Spell Damage'
)
data = [trace0,trace1]

py.iplot(data, filename='damage-line')

### Damage Time Series

The initial burst of high damage from the raid is not surprising. We all drink a stat-boosting potion just before the tank starts the encounter. We also have a team member use an ability that speeds everyone up for 30s.

The Boss Vectis tends to do steady damage throughout the encounter. This is largely due to careful placement of where teammates are standing through the fight. This prevents a large damage effect from hitting multiple people. Note that the bursts represent about 30% of a tank's health points in a single hit. If healers are falling behind, this could easily kill a tank and trigger a wipe. 
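
Because dps is a per-second rate, one way to smooth the event-level traces is to bin damage into one-second totals. A rough sketch, assuming the `Day` and `Time` columns look like `11/3` and `22:30:54.684` as in the output above, and assuming the year 2018 (taken from the archive file name, since the log rows carry no year); `damage_per_second` and the toy values are illustrative:

```python
import pandas as pd

def damage_per_second(df, year=2018):
    """Sum the spell_damage column into one-second bins."""
    # Build full timestamps; the year is an assumption, not stored in the log.
    stamps = pd.to_datetime(
        str(year) + "/" + df["Day"] + " " + df["Time"],
        format="%Y/%m/%d %H:%M:%S.%f",
    )
    # Non-numeric amounts are treated as zero damage.
    amounts = pd.to_numeric(df["spell_damage"], errors="coerce").fillna(0)
    per_event = pd.Series(amounts.to_numpy(), index=pd.DatetimeIndex(stamps))
    return per_event.resample(pd.Timedelta(seconds=1)).sum()

# Toy events (made-up amounts): two hits in one second, a gap, then one hit.
toy = pd.DataFrame({
    "Day": ["11/3", "11/3", "11/3"],
    "Time": ["22:30:54.100", "22:30:54.900", "22:30:56.200"],
    "spell_damage": [100, 50, 25],
})
per_second = damage_per_second(toy)
print(per_second)
```

Seconds with no events appear as zeros, which makes lulls in the fight visible instead of being skipped over by a line between two distant points.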

In [15]:
Vectis_spell_heal = d["df_SPELL_HEAL"][(d["df_SPELL_HEAL"].sourceName == 'Vectis') 
                                           | (d["df_SPELL_HEAL"].sourceName == 'Plague Amalgam')]
#Vectis has no self healing so the above df is empty
# Raid heals: the source is neither Vectis nor Plague Amalgam
Raid_spell_heal = d["df_SPELL_HEAL"][(d["df_SPELL_HEAL"].sourceName != 'Vectis') 
                                           & (d["df_SPELL_HEAL"].sourceName != 'Plague Amalgam')]

In [16]:
trace = go.Scatter(
    x = Raid_spell_heal['Time'],
    y = Raid_spell_heal['spell_damage'],
    name = 'Raid Spell Heals'
)

data = [trace]

py.iplot(data, filename='heal-line')

### Healing Spell Time Series

Healing spells apply a single healing effect after the cast, although it can land on multiple targets. If a healing spell lands when someone is at 100% health, zero healing is done. If you mouse over the data, you will see several points where this is the case. It is impossible to coordinate healing effects with 100% effectiveness, so sometimes two or more healing effects land at the same time, rendering one or more of them useless.

I am surprised by the number of heals above 40K health points. This represents about 1/3 of a non-tank's health pool. Such large heals are the result of critical successes.
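
The two observations above, heals that land for zero and heals above 40K, can be counted rather than read off the plot. A small sketch on made-up amounts; `heal_summary` and its `big_threshold` parameter are hypothetical names, and treating unparseable entries as missing is an assumption:

```python
import pandas as pd

def heal_summary(amounts, big_threshold=40_000):
    """Count total, zero-amount, and above-threshold heal events."""
    amounts = pd.to_numeric(pd.Series(amounts), errors="coerce").dropna()
    return {
        "events": int(len(amounts)),
        "zero_heals": int((amounts == 0).sum()),
        "big_heals": int((amounts > big_threshold).sum()),
    }

# Illustrative values only, not taken from the log.
summary = heal_summary([0, 12000, 45000, 0, 8000])
print(summary)
```

In the notebook this could be called on a heal column such as `Raid_spell_heal['spell_damage']`.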

In [17]:
# Raid HoTs: the source is neither Vectis nor Plague Amalgam
Raid_spell_HoTs = d['df_SPELL_PERIODIC_HEAL'][(d['df_SPELL_PERIODIC_HEAL'].sourceName != 'Vectis') 
                                           & (d['df_SPELL_PERIODIC_HEAL'].sourceName != 'Plague Amalgam')]

In [18]:
trace = go.Scatter(
    x = Raid_spell_HoTs['Time'],
    y = Raid_spell_HoTs['spell_damage'],
    name = 'Raid Spell HoTs'
)

data = [trace]

py.iplot(data, filename='hot-line')

### Heal over Time (HoT) Time Series

Heal over time (HoT) spells deliver their healing in discrete amounts (called ticks) over a specified amount of time. These spells often account for more health points per spell than a Heal spell and provide a baseline of incoming health points, whereas Healing spells can offset sudden drops in a player's health more effectively. HoTs stack with other HoTs; that is, a player can have several active on their character simultaneously.

You will note that the HoTs graph has a higher frequency of data points than the Healing graph, although each tick heals for less than a Healing spell's effect. As the bar chart below shows, HoTs account for only about 1/3 of total heals. HoTs are often instant-cast spells, so they can be applied while the healer is moving, whereas normal heal spells often require the healer to stand still. HoTs can therefore provide healing during phases that require a lot of movement.

In [19]:
hot_total = Raid_spell_HoTs.spell_damage.sum()
heal_total = Raid_spell_heal.spell_damage.sum()
data = [go.Bar(
            x=['HoTs', 'Heals'],
            y=[hot_total, heal_total]
    )]

py.iplot(data, filename='heals-bar')

### An Analysis of My Character, Hippydoc 

The character I played in this data set is Hippydoc, a Restoration Druid. Restoration Druids specialize in Heal over Time effects, which, as we see above, account for about 29% of the total healing. Did I carry my weight in this Boss fight?

In [20]:
Hippydoc_Spell_heal = Raid_spell_heal[Raid_spell_heal.sourceName== "Hippydoc-Lightbringer"]
Hippydoc_Spell_HoTs = Raid_spell_HoTs[Raid_spell_HoTs.sourceName== "Hippydoc-Lightbringer"]

In [21]:
trace = go.Scatter(
    x = Hippydoc_Spell_heal['Time'],
    y = Hippydoc_Spell_heal['spell_damage'],
    name = 'Hippydoc Spell Heals'
)

data = [trace]

py.iplot(data, filename='hippy_heals-line')

We can see that I did provide periodic healing effects. The smaller, more frequent ones come from a spell called Efflorescence, an area of effect that heals three teammates standing within it. The bursts are from Regrowth (which also applies a HoT) and Swiftmend. You will also notice two places, one starting at 22:32:08 and the other at 22:34:27, where the baseline is over 10K for several seconds. This is from Tranquility, a raid-wide channeled heal (I have to stand still for the duration).

In [22]:
trace = go.Scatter(
    x = Hippydoc_Spell_HoTs['Time'],
    y = Hippydoc_Spell_HoTs['spell_damage'],
    mode = 'markers',
    name = 'Hippydoc Spell HoTs'
)
data = [trace]

py.iplot(data, filename='hippy_hots-line')

We can see from the HoT time series that HoTs can be active on several targets at once. This made markers more practical than lines for showing the data.

In [23]:
Hippydoc_Spell_HoTs.spellName.unique()

array(['Rejuvenation', 'Lifebloom', 'Wild Growth', 'Regrowth',
       'Cenarion Ward', 'Azerite Veins', 'Tranquility'], dtype=object)

In [24]:
Hippydoc_Spell_heal.spellName.unique()

array(['Efflorescence', 'Autumn Leaves', "Ysera's Gift", 'Regrowth',
       'Lifebloom', 'Leech', 'Swiftmend', 'Tranquility',
       'Azerite Fortification', 'Mutating Antibody'], dtype=object)

In [25]:
hippy_hot_total =  Hippydoc_Spell_HoTs.spell_damage.sum()
data = [go.Bar(
            x=['Raid HoTs', 'Hippydoc Hots'],
            y=[hot_total, hippy_hot_total]
    )]

py.iplot(data, filename='hots-hippy-bar')

I wanted to see how much I contributed to healing compared to my teammates. Above we can see that I accounted for about 81% of the HoT healing.  

In [26]:
hippy_heal_total =  Hippydoc_Spell_heal.spell_damage.sum()
data = [go.Bar(
            x=['Raid Heals', 'Hippydoc Heals'],
            y=[heal_total, hippy_heal_total]
    )]

py.iplot(data, filename='heal-hippy-bar')

However, I only accounted for about 16% of the non-periodic healing.

In [27]:
data = [go.Bar(
            x=['Raid Heals Total', 'Hippydoc Heals Totals'],
            y=[heal_total+hot_total, hippy_hot_total+hippy_heal_total]
    )]

py.iplot(data, filename='heal-hippy-total-bar')

In [28]:
Raid_spell_heal.sourceName.unique()

array(['Hippydoc-Lightbringer', 'Redshell-Lightbringer',
       'Batte-Lightbringer', 'Ayl-Lightbringer', 'Furyosá-Lightbringer',
       'Ragnaar-Lightbringer', 'Zoobí-Lightbringer',
       'Fahris-Lightbringer', 'Novakaan-Lightbringer',
       'Kylaara-Lightbringer', 'Aohige-Lightbringer',
       'Mefistophel-Lightbringer', 'Ghostwinkle', 'Pocanu-Lightbringer',
       'Shameonjohn-Lightbringer', 'Krodorr-Lightbringer',
       'Yosana-Lightbringer', 'Gibolt-Lightbringer'], dtype=object)

My total healing accounts for about 31% of the total (7.97M of 25.38M). All 17 team members, plus 1 pet, contributed some healing. There were 3 healers: Batte, Ayl, and Hippydoc. Batte and Ayl are priests and specialize in non-periodic healing, so Hippydoc doing the majority of the HoT healing and a relatively small amount of regular healing was a matter of specialization.
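
The 31% figure is 7.97 / 25.38 ≈ 0.314. The same share can be computed for every healer at once by grouping on `sourceName`. A minimal sketch, assuming the heal and HoT frames share the `sourceName` and `spell_damage` columns used above; `healing_share` and the toy numbers are illustrative, not values from the log:

```python
import pandas as pd

def healing_share(*frames):
    """Return each source's fraction of total healing, largest first."""
    combined = pd.concat(frames, ignore_index=True)
    totals = combined.groupby("sourceName")["spell_damage"].sum()
    return (totals / totals.sum()).sort_values(ascending=False)

# Toy heal and HoT frames (made-up amounts).
toy_heals = pd.DataFrame({
    "sourceName": ["Batte-Lightbringer", "Hippydoc-Lightbringer", "Ayl-Lightbringer"],
    "spell_damage": [500, 100, 300],
})
toy_hots = pd.DataFrame({
    "sourceName": ["Hippydoc-Lightbringer", "Batte-Lightbringer"],
    "spell_damage": [900, 200],
})
shares = healing_share(toy_heals, toy_hots)
print(shares)
```

In the notebook, `healing_share(Raid_spell_heal, Raid_spell_HoTs)` would give one row per contributor, which makes the comparison between the three healers direct.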

## Summary

- The data log format is inconsistent with the documentation. Further testing may be needed to make sure the data are presented correctly.
- Healing data are consistent with character specs in terms of healing types and percent contributions.
- Critical successes for damage and healing were higher than I expected.
- Next step is to build a user input method of slicing the data for analysis.