# Dota K-Means
Welcome to a data exploration on Dota! Dota is a multiplayer online battle arena (MOBA) that I played throughout high school. I love the passion the community has for the game, and it was tons of fun to play with my friends. I look forward to seeing what we can find :D

# The Story

Some of the heroes have recently felt that their assignments to the three categories of Strength, Agility, and Intelligence aren't the most accurate.  After years of buffs, nerfs, and reworks, the heroes think it's time for one big reassignment. What would happen if the heroes were sorted into new groupings?  Let's find out!

# The Data Process

The data used to calculate these groupings are the standard Dota hero attributes.  Each hero has a fixed set of base stats inherent to their character.  For example, Abaddon has a shorter attack range than a hero like Windrunner, because Abaddon uses melee attacks while Windrunner uses ranged attacks.  Certain parts of the data will be excluded (e.g. vision range and number of legs).  I will also normalize every column, as I don't want any one stat's scale to outweigh the others in determining the groupings.
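
To make the normalization point concrete, here is a minimal sketch on made-up numbers. The hero names, the `STR` and `RANGE` values, and the `distance` helper are all hypothetical and only there for illustration; the scaler is the same `MinMaxScaler` imported in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stats for three made-up heroes (not real Dota values)
toy_df = pd.DataFrame(
    {"STR": [18.0, 22.0, 25.0], "RANGE": [150.0, 600.0, 150.0]},
    index=["hero_a", "hero_b", "hero_c"],
)

# Min-max scaling maps each column to [0, 1] via (x - min) / (max - min)
toy_scaled = pd.DataFrame(
    MinMaxScaler().fit_transform(toy_df),
    columns=toy_df.columns,
    index=toy_df.index,
)


def distance(df, first, second):
    """Euclidean distance between two rows of a DataFrame."""
    return float(np.linalg.norm(df.loc[first] - df.loc[second]))


# Before scaling, RANGE (hundreds of units) swamps STR (tens of units)
raw_ab = distance(toy_df, "hero_a", "hero_b")
raw_ac = distance(toy_df, "hero_a", "hero_c")

# After scaling, both columns contribute on the same 0-to-1 scale
scaled_ab = distance(toy_scaled, "hero_a", "hero_b")
scaled_ac = distance(toy_scaled, "hero_a", "hero_c")

print(toy_scaled)
print(raw_ab, raw_ac, scaled_ab, scaled_ac)
```

On the raw numbers, the gap between `hero_a` and `hero_b` is driven almost entirely by `RANGE`; after scaling, the two distances are of comparable size, which is the behavior K-Means needs when it compares heroes by Euclidean distance.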

In [1]:
# Imports
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import MinMaxScaler

In [2]:
"""
Standard data cleaning.  Removing columns that may not be of 
interest (e.g. number of legs)
"""
# Data pulled from Dota wiki.  See "webscraping" section of project
#  for more details!
dota_df = pd.read_csv("data/hero_stats.csv")

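# Encode the attribute column ('A') as integers, keeping the original
# class names so they can be printed next to each hero later
le = LabelEncoder()
le.fit(dota_df['A'])

trueClasses = list(le.classes_)

dota_df['A'] = le.transform(dota_df['A'])

# Keep hero names and original classes aside, then drop every column
# that should not influence the clustering
slice_df = dota_df[['HERO', 'A']]
dota_df = dota_df.drop(columns=['L', 'A', 'COL', 'HERO', 'VS-D', 'VS-N'])

# Rescale every remaining column to [0, 1] so no single stat dominates
min_max_scaler = MinMaxScaler()
scaled_values = min_max_scaler.fit_transform(dota_df)
dota_df = pd.DataFrame(scaled_values, columns=dota_df.columns, index=dota_df.index)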

print(dota_df)

          STR      STR+     STR30       AGI      AGI+     AGI30       INT  \
0    0.600000  0.515152  0.505397  0.676471  0.312500  0.412531  0.333333   
1    0.733333  0.424242  0.439647  0.647059  0.312500  0.406328  0.722222   
2    0.400000  0.181818  0.162905  0.588235  0.458333  0.519851  0.611111   
3    0.600000  0.000000  0.021590  0.705882  0.625000  0.688586  0.000000   
4    0.733333  0.515152  0.525025  0.441176  0.500000  0.524814  0.666667   
..        ...       ...       ...       ...       ...       ...       ...   
114  0.266667  0.515152  0.456330  0.500000  0.291667  0.357320  0.333333   
115  0.666667  0.393939  0.401374  0.470588  0.395833  0.441067  0.777778   
116  0.266667  0.303030  0.257115  0.382353  0.291667  0.332506  0.555556   
117  0.533333  0.575758  0.552502  0.529412  0.354167  0.417494  0.333333   
118  0.466667  0.242424  0.229637  0.323529  0.250000  0.284119  0.555556   

         INT+     INT30         T  ...        AR  DMG(MIN)  DMG(MAX)  \
0  

In [3]:
kmeans = KMeans(n_clusters=3, n_init=25)
kmeans.fit(dota_df)

KMeans(algorithm='auto', copy_x=True, init='k-means++', max_iter=300,
       n_clusters=3, n_init=25, n_jobs=None, precompute_distances='auto',
       random_state=None, tol=0.0001, verbose=0)
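
The fitted model above reports `random_state=None`, so the centroid initialization (and potentially the final grouping) can differ from run to run. As a hedged sketch, assuming reproducibility is wanted, a fixed seed can be passed. The seed value `42` and the synthetic `toy_X` array are arbitrary stand-ins, not part of the original analysis.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the scaled hero table (values in [0, 1])
rng = np.random.default_rng(0)
toy_X = rng.random((60, 5))

# With the same integer seed, both fits start from the same initializations
first = KMeans(n_clusters=3, n_init=25, random_state=42).fit(toy_X)
second = KMeans(n_clusters=3, n_init=25, random_state=42).fit(toy_X)

labels_match = np.array_equal(first.labels_, second.labels_)
print(labels_match)
```

Fixing the seed only makes repeated runs comparable; it does not make one particular clustering more "correct" than another.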

In [4]:
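# Collect (hero name, original class) pairs for each predicted cluster
hero_groups = [[], [], []]

for hero_ind, pred_label in enumerate(kmeans.labels_):
    hero_name = slice_df['HERO'][hero_ind]
    true_label = trueClasses[slice_df['A'][hero_ind]]
    hero_groups[pred_label].append((hero_name, true_label))

# K-Means cluster ids are arbitrary, so order the groups by their first
# (hero, class) pair; the A/B/C naming then doesn't depend on those ids
hero_groups.sort()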

# The Results: A First Look

With the processing out of the way, let's take a look at how the heroes grouped up!

# Group A

In [5]:
print("Group A:")

for hero in hero_groups[0]:
    print("{:<20s} \t {}".format(hero[0], hero[1]))

Group A:
Abaddon              	 Strength
Alchemist            	 Strength
Axe                  	 Strength
Batrider             	 Intelligence
Beastmaster          	 Strength
Bloodseeker          	 Agility
Bounty Hunter        	 Agility
Brewmaster           	 Strength
Bristleback          	 Strength
Centaur Warrunner    	 Strength
Chaos Knight         	 Strength
Clockwerk            	 Strength
Dark Seer            	 Intelligence
Doom                 	 Strength
Dragon Knight        	 Strength
Earth Spirit         	 Strength
Earthshaker          	 Strength
Elder Titan          	 Strength
Ember Spirit         	 Agility
Faceless Void        	 Agility
Kunkka               	 Strength
Legion Commander     	 Strength
Lifestealer          	 Strength
Lycan                	 Strength
Magnus               	 Strength
Mars                 	 Strength
Meepo                	 Agility
Naga Siren           	 Agility
Night Stalker        	 Strength
Nyx Assassin         	 Agility
Ogre Magi            	 Intelli

As we can see, **Group A** has a ton of representation from the past classification of "Strength".  It makes sense that the majority of each group would come from a single traditional classification, as these classifications are reflected in a hero's initial stats and stat growth.

Looking over the heroes in this grouping, we can see tons of melee representation, with the only two ranged outliers being Batrider and Phoenix.

Looking across all the attribute outliers (taking Strength as the baseline assignment for **Group A**), we get the following heroes:
- Batrider
- Bloodseeker
- Bounty Hunter
- Dark Seer
- Ember Spirit
- Faceless Void
- Meepo
- Naga Siren
- Nyx Assassin
- Ogre Magi
- Riki
- Slark
- Spectre
- Ursa
- Void Spirit

With the exception of Phoenix and Batrider, these are all melee characters with some of the highest initial strength stats in the game.  The only other characters with a comparably high initial strength stat that aren't part of this grouping are all ranged, which strongly suggests that a hero's attack range weighs heavily in its grouping.
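
One way to check a hunch like "attack range weighs heavily" is to compare the cluster centroids feature by feature. The sketch below does this on a tiny made-up table; the column names `STR` and `RANGE` and all values are hypothetical, and the spread of centroid values is only a rough indicator, not a formal feature-importance measure.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical, already-scaled stats: STR barely varies, RANGE forms 3 bands
toy_df = pd.DataFrame({
    "STR":   [0.50, 0.55, 0.45, 0.50, 0.52, 0.48],
    "RANGE": [0.00, 0.05, 0.50, 0.55, 0.95, 1.00],
})

toy_kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(toy_df)

# One row per cluster, one column per feature
centers = pd.DataFrame(toy_kmeans.cluster_centers_, columns=toy_df.columns)

# A feature whose centroid values are far apart does more to separate clusters
center_spread = (centers.max() - centers.min()).sort_values(ascending=False)
print(center_spread)
```

Applied to the real model, the same idea would use `kmeans.cluster_centers_` together with the column names of the scaled `dota_df`.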

Looking at the two outliers, Batrider and Phoenix, it is important to note that although they do have 'ranged' attacks, Batrider's range is only 375 units, compared to the roughly standard 500 units that most ranged heroes have.  Phoenix, on the other hand, has a range of 525, the longest in all of **Group A**.  Phoenix truly is an outlier, and may be interesting to explore further.
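
A possible starting point for exploring a borderline hero is to look at how far it sits from each centroid. The sketch below uses `KMeans.transform`, which returns the distance from every row to every cluster center, on made-up 2-D points. The point names and coordinates are hypothetical, and the "margin" is just an illustrative heuristic.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical 2-D points: three tight blobs plus one point between two of them
toy_points = pd.DataFrame(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
     [1.0, 0.0], [0.9, 0.0], [1.0, 0.1],
     [0.5, 1.0], [0.4, 1.0], [0.6, 1.0],
     [0.45, 0.05]],
    columns=["x", "y"],
    index=[f"hero_{i}" for i in range(9)] + ["in_between"],
)

toy_kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(toy_points)

# transform() returns the distance from every row to every centroid
distances = toy_kmeans.transform(toy_points)
sorted_distances = np.sort(distances, axis=1)

# Margin = gap between the second-closest and the closest centroid;
# a small margin means the point sits near a cluster boundary
margin = pd.Series(
    sorted_distances[:, 1] - sorted_distances[:, 0], index=toy_points.index
)
print(margin.sort_values().head(3))
```

Heroes with the smallest margins are the ones most likely to switch groups between runs, which makes them natural candidates for a closer look.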

# Group B

In [6]:
for hero in hero_groups[1]:
    print("{:<20s} \t {}".format(hero[0], hero[1]))

Ancient Apparition   	 Intelligence
Arc Warden           	 Agility
Bane                 	 Intelligence
Chen                 	 Intelligence
Crystal Maiden       	 Intelligence
Dark Willow          	 Intelligence
Dazzle               	 Intelligence
Death Prophet        	 Intelligence
Disruptor            	 Intelligence
Enchantress          	 Intelligence
Enigma               	 Intelligence
Grimstroke           	 Intelligence
Invoker              	 Intelligence
Jakiro               	 Intelligence
Keeper of the Light  	 Intelligence
Leshrac              	 Intelligence
Lich                 	 Intelligence
Lina                 	 Intelligence
Lion                 	 Intelligence
Medusa               	 Agility
Nature's Prophet     	 Intelligence
Necrophos            	 Intelligence
Oracle               	 Intelligence
Outworld Devourer    	 Intelligence
Puck                 	 Intelligence
Pugna                	 Intelligence
Queen of Pain        	 Intelligence
Rubick               	 Intelligence
Sh

**Group B** is made up almost entirely of heroes from the past classification of "Intelligence".  The only two heroes that weren't originally part of the Intelligence group are Arc Warden and Medusa.

# Group C

In [7]:
for hero in hero_groups[2]:
    print("{:<20s} \t {}".format(hero[0], hero[1]))

Anti-Mage            	 Agility
Broodmother          	 Agility
Clinkz               	 Agility
Drow Ranger          	 Agility
Gyrocopter           	 Agility
Huskar               	 Strength
Io                   	 Strength
Juggernaut           	 Agility
Lone Druid           	 Agility
Luna                 	 Agility
Mirana               	 Agility
Monkey King          	 Agility
Morphling            	 Agility
Pangolier            	 Agility
Phantom Assassin     	 Agility
Phantom Lancer       	 Agility
Razor                	 Agility
Shadow Fiend         	 Agility
Snapfire             	 Strength
Sniper               	 Agility
Templar Assassin     	 Agility
Terrorblade          	 Agility
Troll Warlord        	 Agility
Vengeful Spirit      	 Agility
Venomancer           	 Agility
Viper                	 Agility
Weaver               	 Agility
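
To summarize how the new groups line up with the original classes, a contingency table is a compact option. The sketch below uses `pd.crosstab` on a handful of rows copied from the printed groups above; it is a small subset for illustration, not the full result, and the `sample` frame and its column names are my own.

```python
import pandas as pd

# A few (hero, original class, new group) rows taken from the printed output;
# this is a subset for illustration, not the complete clustering
sample = pd.DataFrame(
    [
        ("Abaddon", "Strength", "A"),
        ("Axe", "Strength", "A"),
        ("Batrider", "Intelligence", "A"),
        ("Bloodseeker", "Agility", "A"),
        ("Arc Warden", "Agility", "B"),
        ("Bane", "Intelligence", "B"),
        ("Lina", "Intelligence", "B"),
        ("Anti-Mage", "Agility", "C"),
        ("Huskar", "Strength", "C"),
        ("Sniper", "Agility", "C"),
    ],
    columns=["hero", "original", "group"],
)

# Rows are the new groups, columns are the original classes
summary = pd.crosstab(sample["group"], sample["original"])
print(summary)
```

With the notebook's own data, the same table could be built by flattening `hero_groups` into rows of hero name, original class, and group index before calling `pd.crosstab`.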
