# Analyzing League of Legends - Bot Lane Match Ups 

Carl Xiong

# Introduction

League of Legends is a MOBA (Multiplayer Online Battle Arena) game. It is a competitive game in which two teams of 5 players each (5v5, 10 players in total) battle it out. Each player selects a unique champion and has to work together with their team against the enemy. The game demands not only a lot of skill but strategy as well.

The premise of the game is simple. Players are placed on a map called Summoner's Rift. By defeating the computer-controlled monsters, called creeps, each player's champion gains experience and gold. Experience allows the champion to level up, and gold lets the champion purchase items from the shop. Levels and items increase a champion's stats, such as health and attack damage. The end objective is to become more powerful than the opposing team's champions and eventually defeat them by destroying the Nexus, the central structure of their base.

In this tutorial, I will go over some basic mechanics and patterns that have developed over the life of the game. I will analyze the win rates of certain champions against others and examine the reasons behind them. The game has developed a meta (short for metagame), which means the game within the game. Players often expand it as "Most Effective Tactic Available." In practice, a majority of players follow this set of guidelines because it yields the best results, in this case wins.

The game is set up as 5v5, but each of the five players on a team plays differently. Usually each player takes a unique lane, or role. The most widely accepted roles are Top, Jungle, Mid, Bot, and Support. For this tutorial I will be analyzing the Bot role and its matchup dynamics.

# Step 1 : Collecting the data.

Before we start analyzing anything, we have to find data to analyze. In my case, I will be using a dataset provided on Kaggle. The dataset comes in the form of a CSV (comma-separated values) file. This format is useful for transferring tabular data: each "cell" of a table, like in an Excel sheet, is separated from the next by a comma. That makes it easy to move data between different programs, for example from Excel to Google Sheets.
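
To make the format concrete, here is a small sketch that parses CSV text with pandas. The three rows reuse the gameId, gameDuration, and winner values of the first games in this dataset; `io.StringIO` and the name `sample` are only stand-ins for a real file on disk.

```python
import io
import pandas as pd

# CSV text: the first line names the columns, every later line is one row.
csv_text = """gameId,gameDuration,winner
3326086514,1949,1
3229566029,1851,1
3327363504,1493,1
"""

# StringIO lets read_csv treat the string like an open file.
sample = pd.read_csv(io.StringIO(csv_text))
print(sample.shape)          # (rows, columns)
print(list(sample.columns))  # column names taken from the header line
```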

The data can be found at the link below. The file is called games.csv.
https://www.kaggle.com/jaytegge/league-of-legends-data-analysis/data

To begin processing it, we will use some Python libraries, namely pandas and NumPy.

In [18]:
import pandas as pd
import numpy as np
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

We will use pandas' built-in read_csv function to read the CSV file into a pandas DataFrame.

In [21]:
games = pd.read_csv("games.csv")
games.head()

Unnamed: 0,gameId,creationTime,gameDuration,seasonId,winner,firstBlood,firstTower,firstInhibitor,firstBaron,firstDragon,...,t2_towerKills,t2_inhibitorKills,t2_baronKills,t2_dragonKills,t2_riftHeraldKills,t2_ban1,t2_ban2,t2_ban3,t2_ban4,t2_ban5
0,3326086514,1504279457970,1949,9,1,2,1,1,1,1,...,5,0,0,1,1,114,67,43,16,51
1,3229566029,1497848803862,1851,9,1,1,1,1,0,1,...,2,0,0,0,0,11,67,238,51,420
2,3327363504,1504360103310,1493,9,1,2,1,1,1,2,...,2,0,0,1,0,157,238,121,57,28
3,3326856598,1504348503996,1758,9,1,1,1,1,1,1,...,0,0,0,0,0,164,18,141,40,51
4,3330080762,1504554410899,2094,9,1,2,1,1,1,1,...,3,0,0,1,0,86,11,201,122,18


In the table above, each row corresponds to a different game, and the columns hold the statistics of that game, such as its duration (in seconds), the winner (team 1 or team 2), and the number of objectives each team took (tower kills, inhibitor kills, etc.). The table also contains the champion IDs for each team. I will be analyzing these champion columns in particular in order to draw conclusions about the underlying mechanics of each bot lane matchup.
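
As a quick illustration of reading these columns, the sketch below rebuilds just the gameDuration and winner values of the five rows displayed above and summarizes them. The names `preview`, `durationMinutes`, and `winsPerTeam` are introduced here for illustration only.

```python
import pandas as pd

# Durations and winners copied from the five rows displayed above.
preview = pd.DataFrame({
    "gameDuration": [1949, 1851, 1493, 1758, 2094],
    "winner": [1, 1, 1, 1, 1],
})

# gameDuration is stored in seconds; minutes are easier to read.
preview["durationMinutes"] = preview["gameDuration"] / 60

# How many games each team won in this small sample.
winsPerTeam = preview["winner"].value_counts()
print(preview["durationMinutes"].round(1).tolist())
print(winsPerTeam.to_dict())
```

On the full dataset the same two lines would run against the games DataFrame instead of this five-row preview.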

I'm mainly concerned with the win rates between champion matchups, so let's focus only on those columns.

# Step 2 : Data Cleaning

Now that we have usable data in a DataFrame, we have to tidy it up so that it is easier to focus on the important columns. For my particular analysis, I will be focusing on the win rates of the bot lane champions. So the columns I will need are mainly the winning team and the individual champion IDs of each team.

Down below, I simply take a slice of the above dataframe with only the relevant columns.

In [132]:
wins = games[['winner','t1_champ1id','t1_champ2id','t1_champ3id','t1_champ4id','t1_champ5id','t2_champ1id','t2_champ2id','t2_champ3id','t2_champ4id','t2_champ5id']]
wins.head()

Unnamed: 0,winner,t1_champ1id,t1_champ2id,t1_champ3id,t1_champ4id,t1_champ5id,t2_champ1id,t2_champ2id,t2_champ3id,t2_champ4id,t2_champ5id
0,1,8,432,96,11,112,104,498,122,238,412
1,1,119,39,76,10,35,54,25,120,157,92
2,1,18,141,267,68,38,69,412,126,24,22
3,1,57,63,29,61,36,90,19,412,92,22
4,1,19,29,40,119,134,37,59,141,38,51


The table above shows the winner of each game, whether it was team 1 or team 2, along with the champion IDs. However, it is not very intuitive to look at the data this way, so let's try to match the champion IDs with their respective names. Thankfully, the Kaggle dataset also includes a JSON file that contains exactly that.

In [9]:
jDict = pd.read_json('champion_info.json')
champInfo = pd.read_json((jDict['data']).to_json(), orient='index')
champInfo.head()

Unnamed: 0,id,key,name,title
1,1,Annie,Annie,the Dark Child
10,10,Kayle,Kayle,The Judicator
101,101,Xerath,Xerath,the Magus Ascendant
102,102,Shyvana,Shyvana,the Half-Dragon
103,103,Ahri,Ahri,the Nine-Tailed Fox


If we wanted to get the name of Champion Id 119, this is how we would go about doing so.

In [41]:
champId = 119  # a dedicated name avoids shadowing Python's built-in id()
champInfo.loc[champId,'key']

'Draven'

Now we have a table mapping the champion IDs to their names. I will save the renaming for last because it is very computationally intensive. The next part focuses on the win rates. I can simply count the number of times a particular champion won or lost against another and derive the win percentage from those counts. I will represent this win data as a 2D matrix.

I will create 2 tables, one to record the wins and one to record the losses. At the end of the tallying, I can divide the wins by the total number of games (wins plus losses) to get a win rate for each champion against every other champion.
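
A win percentage is usually computed as wins divided by games played, that is wins / (wins + losses). The sketch below shows that calculation on made-up counts for three hypothetical champions A, B, and C; the names `winCounts`, `lossCounts`, `gamesPlayed`, and `winRate` are illustrative and not part of the dataset.

```python
import numpy as np
import pandas as pd

# Made-up counts: row champion's wins against the column champion.
labels = ["A", "B", "C"]
winCounts = pd.DataFrame([[0, 6, 2], [4, 0, 0], [3, 0, 0]],
                         index=labels, columns=labels)

# Every loss of the row champion is a win for the column champion,
# so the loss table is the transpose of the win table.
lossCounts = winCounts.T

gamesPlayed = winCounts + lossCounts
# Pairs that never met become NaN instead of causing a division by zero.
winRate = winCounts / gamesPlayed.replace(0, np.nan)
print(winRate.round(2))
```

Here A beat B in 6 of their 10 games, so A's win rate against B is 0.6 and B's against A is 0.4, while B and C never met and stay undefined.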

Each of these win/loss tables will have one row and one column for every champion currently in the game.

In [327]:
champList = (champInfo.loc[:,"id"])
champions = [0]
winRatio = pd.DataFrame(columns=champList)
winRatio['champions'] = champList.values
#winRatio.rename(index=str, columns={"Annie": "Champion", "champions": "Annie"})
#winRatio['Champion'] = winRatio['Annie']
winRatio.head()

id,1,10,101,102,103,104,105,106,107,11,...,86,89,9,90,91,92,96,98,99,champions
0,,,,,,,,,,,...,,,,,,,,,,1
1,,,,,,,,,,,...,,,,,,,,,,10
2,,,,,,,,,,,...,,,,,,,,,,101
3,,,,,,,,,,,...,,,,,,,,,,102
4,,,,,,,,,,,...,,,,,,,,,,103


I will also set all the values to 0.

In [328]:
i = 0
j = 0
while i < len(winRatio.index):
    j = 0
    while j < len(winRatio.index):
        winRatio.iloc[i,j] = 0
        j += 1
    i +=1
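
The nested loop above can also be expressed as a single constructor call. This is a minimal sketch, assuming a short hypothetical list `champIds` in place of the id column of champInfo; `zeroTable` is likewise an illustrative name.

```python
import pandas as pd

# Hypothetical short list of champion IDs; the tutorial uses champInfo's id column.
champIds = [1, 10, 101, 102, 103]

# One row and one column per champion, every cell starting at 0.
zeroTable = pd.DataFrame(0, index=champIds, columns=champIds)
print(zeroTable.shape)
```

Labeling the rows by champion ID would allow lookups such as zeroTable.loc[101, 10] without a separate champions column. That is a design alternative, not the layout the rest of this tutorial assumes.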
    

Now we have a 2D matrix with one row and one column for every champion ID. Next I will create a copy of this table for the losses and a third copy for the final win percentage.

In [241]:
lossRatio = winRatio.copy()
finalRatio = winRatio.copy()

In [316]:
winRatio

id,1,10,101,102,103,104,105,106,107,11,...,86,89,9,90,91,92,96,98,99,champions
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,10
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,101
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,102
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,103
5,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,104
6,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,105
7,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,106
8,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,107
9,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,11


Now I will begin tallying up the wins and losses for each champion against every other champion.

In [297]:
temp = winRatio.loc[winRatio['champions'] == 101]
id = temp.index
winRatio.loc[id, 10] +=1

In [325]:
winRatio.loc[122,104]

0

In [None]:
# Positions in wins: column 0 is the winner, 1-5 are team 1, 6-10 are team 2.
for i in range(len(wins.index)):
    winTeam = wins.iloc[i,0]
    team1 = wins.iloc[i,1:6].tolist()
    team2 = wins.iloc[i,6:11].tolist()
    if winTeam == 1:
        winners, losers = team1, team2
    elif winTeam == 2:
        winners, losers = team2, team1
    else:
        # Skip rows whose winner is neither team 1 nor team 2
        continue
    # Every champion on the winning team gets a win against each loser
    for currID in winners:
        temp = winRatio.loc[winRatio['champions'] == currID]
        rowIdx = temp.index.tolist()[0]
        for lostID in losers:
            winRatio.loc[rowIdx,lostID] += 1
    # Every champion on the losing team gets a loss against each winner
    for currID in losers:
        temp = winRatio.loc[winRatio['champions'] == currID]
        rowIdx = temp.index.tolist()[0]
        for wonID in winners:
            lossRatio.loc[rowIdx,wonID] += 1
            


In [333]:
winRatio.head()

id,1,10,101,102,103,104,105,106,107,11,...,86,89,9,90,91,92,96,98,99,champions
0,0,9,2,3,23,6,5,5,5,26,...,7,11,6,14,7,18,6,9,18,1
1,2,0,1,6,9,0,3,0,5,8,...,8,10,2,12,9,10,4,6,13,10
2,5,0,0,6,11,3,10,0,4,8,...,1,9,0,6,6,6,9,4,16,101
3,7,1,2,0,10,4,9,2,6,14,...,6,10,1,4,7,10,10,8,15,102
4,14,11,4,20,0,3,14,7,12,37,...,27,20,4,13,19,35,15,16,41,103


This is an example of what the output should look like, but the program takes a long time to run, so I manually stopped it early.
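
Most of the running time goes into updating one DataFrame cell at a time. One possible way to speed this up is to do the counting on NumPy arrays, as in the sketch below. It is not the method used in this tutorial: `tally_wins` is a hypothetical helper, the two demo games and champion IDs 1-10 are made up, and it assumes the winner column only contains 1 or 2 and that every champion ID appears in `champ_ids`.

```python
import numpy as np
import pandas as pd

def tally_wins(wins, champ_ids):
    """Count how often each row champion's team beat each column champion's team.

    Assumes column 0 is the winner (1 or 2), columns 1-5 hold team 1's
    champion IDs and columns 6-10 hold team 2's, like the wins table above.
    """
    ids = pd.Index(champ_ids)
    values = wins.to_numpy()
    # Translate champion IDs into 0-based matrix positions.
    team1 = ids.get_indexer(values[:, 1:6].ravel()).reshape(-1, 5)
    team2 = ids.get_indexer(values[:, 6:11].ravel()).reshape(-1, 5)
    team1_won = (values[:, 0] == 1)[:, None]
    winners = np.where(team1_won, team1, team2)
    losers = np.where(team1_won, team2, team1)
    # Pair every winner with every loser: 25 pairs per game.
    rows = np.repeat(winners, 5, axis=1).ravel()
    cols = np.tile(losers, (1, 5)).ravel()
    counts = np.zeros((len(ids), len(ids)), dtype=int)
    np.add.at(counts, (rows, cols), 1)  # accumulates repeated pairs correctly
    return pd.DataFrame(counts, index=ids, columns=ids)

# Two made-up games between hypothetical champion IDs 1-10.
columns = (["winner"]
           + [f"t1_champ{k}id" for k in range(1, 6)]
           + [f"t2_champ{k}id" for k in range(1, 6)])
demo = pd.DataFrame([[1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                     [2, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], columns=columns)

winCounts = tally_wins(demo, range(1, 11))
# A loss for the row champion is a win for the column champion.
lossCounts = winCounts.T
print(winCounts.loc[1, 6], winCounts.loc[6, 1])
```

Because each loss is someone else's win, the loss table in this sketch is just the transpose of the win table, so only one pass over the games is needed.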

# Step 3 : Visualization

After cleaning the data, this step is about ease of use. It lets an uninformed reader interpret the data easily.

Below, I replace all the ugly champion IDs with their respective names so that we can get a better understanding of who is being picked into the game and how the matchups are.

In [159]:
new = pd.DataFrame(columns=['winner','t1_champ1id','t1_champ2id','t1_champ3id','t1_champ4id','t1_champ5id','t2_champ1id','t2_champ2id','t2_champ3id','t2_champ4id','t2_champ5id'])
#new = pd.DataFrame()
new.head()

Unnamed: 0,winner,t1_champ1id,t1_champ2id,t1_champ3id,t1_champ4id,t1_champ5id,t2_champ1id,t2_champ2id,t2_champ3id,t2_champ4id,t2_champ5id


In [168]:
i = 0
while i < len(wins.index):
    j = 1
    test = wins.iloc[i,:]
    new = new.append({'winner': wins.iloc[i,0]},ignore_index=True)
    while j < len(test):
        tempId = test.iloc[j]  # positional lookup of the j-th champion ID
        champName = champInfo.loc[tempId,'key']
        #testtemp.append(champName)
        new.iloc[i,j] = champName
        j += 1
    i += 1
        

KeyboardInterrupt: 

In [167]:
new.head()

Unnamed: 0,winner,t1_champ1id,t1_champ2id,t1_champ3id,t1_champ4id,t1_champ5id,t2_champ1id,t2_champ2id,t2_champ3id,t2_champ4id,t2_champ5id
0,1.0,Vladimir,Bard,KogMaw,MasterYi,Viktor,Graves,Xayah,Darius,Zed,Thresh
1,1.0,Draven,Irelia,Nidalee,Kayle,Shaco,Malphite,Morgana,Hecarim,Yasuo,Riven
2,1.0,Tristana,Kayn,Nami,Rumble,Kassadin,Cassiopeia,Thresh,Jayce,Jax,Ashe
3,1.0,Maokai,Brand,Twitch,Orianna,DrMundo,Malzahar,Warwick,Thresh,Riven,Ashe
4,1.0,Warwick,Twitch,Janna,Draven,Syndra,Sona,JarvanIV,Kayn,Kassadin,Caitlyn


This is an example of how the table would look if the program had finished running, but the loop is very computationally intensive, so I manually stopped it.
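
The renaming does not have to go cell by cell. As a hedged alternative to the loop above, the sketch below maps whole columns at once with Series.map. The four ID-to-name pairs come from the tables in this tutorial, while `smallWins` is a made-up, cut-down table with two champions per team; in the real notebook the lookup Series would presumably be the key column of champInfo.

```python
import pandas as pd

# Four ID-to-name pairs taken from the tables in this tutorial.
champNames = pd.Series({8: "Vladimir", 432: "Bard", 104: "Graves", 498: "Xayah"})

# A cut-down, made-up version of wins with two champions per team.
smallWins = pd.DataFrame({
    "winner": [1, 2],
    "t1_champ1id": [8, 104],
    "t1_champ2id": [432, 498],
    "t2_champ1id": [104, 8],
    "t2_champ2id": [498, 432],
})

named = smallWins.copy()
champCols = [c for c in named.columns if c != "winner"]
# Series.map looks up a whole column at once instead of one cell at a time.
for col in champCols:
    named[col] = named[col].map(champNames)
print(named)
```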

Moving on, things can get a little messy, as a champion can be played in several different lanes. I will use champion.gg to classify whether or not a particular champion belongs to a given lane. Obviously this method is not perfect, but since we don't have any further data, inferring each champion's role this way is the best we can do.
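
A role lookup of this kind could be sketched as follows. The set `BOT_CARRIES` is a hand-picked, illustrative list of champions commonly played as the bot lane carry; it is not champion.gg's actual classification, and `find_bot_carry` is a hypothetical helper. The two teams are those of the first named row shown above.

```python
# Hand-picked, illustrative set; NOT champion.gg's data. A real analysis
# would load the site's role classification instead.
BOT_CARRIES = {"Draven", "Tristana", "Twitch", "Ashe", "Caitlyn", "KogMaw", "Xayah"}

def find_bot_carry(team):
    """Return the first champion in team listed as a bot carry, or None."""
    for name in team:
        if name in BOT_CARRIES:
            return name
    return None

# Teams from the first named row shown above.
team1 = ["Vladimir", "Bard", "KogMaw", "MasterYi", "Viktor"]
team2 = ["Graves", "Xayah", "Darius", "Zed", "Thresh"]

# The inferred bot lane matchup for that game.
matchup = (find_bot_carry(team1), find_bot_carry(team2))
print(matchup)
```

A lookup like this misclassifies any game where a listed champion is played in another lane, which is the imperfection mentioned above.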

# Step 4 : Conclusion

Had I let the entire program run its course, I would have been left with data that I could draw an inference from. However, that is not the case, so I have to base my conclusion on the data I currently have. The only conclusion I can draw from this data is that most teams follow the meta pretty closely, usually filling the roles of top, jungle, mid, bot, and support.