## League of Legends Competitive Analysis

by Varun Nadgir

## Milestone Report

### The Problem & Client

I would like to put myself in the role of an esports analyst, so my client could be any of the North American organizations with a League of Legends team. An analyst in this role needs to be familiar with trends in the local region, and when international tournaments come around, the organization relies on its analysts to help prepare the players for unfamiliar competition. Identifying trends and patterns in another team's or region's playstyle would be a significant advantage, and that comes from careful study of the data.

### The Data

All of the competitive League of Legends data is collected by [Oracle's Elixir](http://oracleselixir.com/match-data/), and for this study I will be looking at the Spring 2018 data. There are about 15,000 rows and almost 100 columns in this dataset. Each match has 12 rows: 5 for the players of one team, 5 for the players of the other, and 2 for team-wide stats. After importing the .csv file, preparing the data for each question was a matter of subsetting it into smaller tables.
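The 5 + 5 + 2 layout described above can be checked before any subsetting. The following is a minimal sketch, assuming the `gameid` and `player` columns used later in this report and that team-wide rows are marked with `player == 'Team'`; the helper names `rows_per_game` and `malformed_games` are hypothetical and not part of the original notebook.

```python
import pandas as pd

def rows_per_game(df):
    """Count player rows and team-wide rows for each gameid."""
    # assumption: team-wide rows carry the literal value 'Team' in 'player'
    is_team = df['player'] == 'Team'
    return pd.DataFrame({
        'player_rows': (~is_team).groupby(df['gameid']).sum(),
        'team_rows': is_team.groupby(df['gameid']).sum(),
    })

def malformed_games(df):
    """Return the games that do not have 10 player rows and 2 team rows."""
    counts = rows_per_game(df)
    return counts[(counts['player_rows'] != 10) | (counts['team_rows'] != 2)]

# e.g. malformed_games(loldf) once the csv has been imported
```

An empty result would suggest every match follows the expected layout; any rows returned are games worth inspecting before they feed into the later tables.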

At some point in the future, it would also be helpful to look at the data collected by Riot Games from the competitive ranked ladder. These data could shed light on how certain Champion matchups might be expected to go or whether some objectives tend to lead to wins more than others, among other things. Hundreds of thousands of games are played around the world daily, so a lot of valuable data is being generated constantly.

### Importing Data

In [2]:
# import packages
import pandas as pd

# read in csv
loldf = pd.read_csv('loldata.csv')


### Data Wrangling

##### Champion Head to Head Performance

In [7]:
# filter out specific columns needed
df = loldf[['gameid', 'league', 'position', 'team', 'champion', 'result']]

# subset data for the 5 major regions
major_regions = df[df['league'].isin(['NALCS', 'EULCS', 'LCK', 'LPL', 'LMS'])]

# subset data to just Top lane 
major_regions_top = major_regions[major_regions['position']=='Top']

# get list of unique game IDs (taken from the Top lane subset so every ID exists in the groupby below)
major_regions_gamelist = major_regions_top['gameid'].unique()

# create groupby object based on 'gameid', each item contains winner and loser of the game
major_group = major_regions_top.groupby('gameid')[['champion', 'result']]

# create a square matrix of 0's with rows/columns equal to number of unique champions picked
major_matrix = pd.DataFrame(0, index=major_regions_top['champion'].unique(), 
                            columns=major_regions_top['champion'].unique())

# loop through each unique gameid and add a 1 to each cell location where the ROW defeats the COLUMN
for gameid in major_regions_gamelist:
    # sort so the winner (result == 1) comes first and the loser second
    major_group_champs = major_group.get_group(gameid).sort_values('result', ascending=False)['champion']
    win_champ = major_group_champs.iloc[0]
    lose_champ = major_group_champs.iloc[1]
    # a single .loc call avoids chained assignment, which may update a copy instead of major_matrix
    major_matrix.loc[win_champ, lose_champ] += 1
    
# example of what one row looks like
major_matrix.iloc[[0]]

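| | Ornn | Gnar | Gangplank | Vladimir | Illaoi | Lucian | Camille | Jayce | Ryze | Cho'gath | ... | Malphite | Yasuo | Kassadin | Kennen | Swain | Aatrox | Renekton | Darius | Cassiopeia | Singed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Ornn | 0 | 31 | 19 | 9 | 4 | 0 | 10 | 0 | 1 | 9 | ... | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 |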
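The per-game loop in the cell above can also be written without explicit iteration. This is a sketch of an alternative, not the method used in the notebook: it pairs each game's winning and losing Top lane champion with a merge and counts the pairs with `pd.crosstab`. It assumes `result` is 1 for the winner and 0 for the loser, with exactly one Top lane row per team per game; the function name `head_to_head` is hypothetical.

```python
import pandas as pd

def head_to_head(top_rows):
    """Square matrix where cell [row, column] counts wins of ROW over COLUMN."""
    # assumption: result is 1 for the winning side and 0 for the losing side
    winners = top_rows.loc[top_rows['result'] == 1, ['gameid', 'champion']]
    losers = top_rows.loc[top_rows['result'] == 0, ['gameid', 'champion']]

    # one row per game: winning champion next to losing champion
    pairs = winners.merge(losers, on='gameid', suffixes=('_win', '_lose'))

    # count each (winner, loser) pair, then pad to a square matrix
    counts = pd.crosstab(pairs['champion_win'], pairs['champion_lose'])
    champs = top_rows['champion'].unique()
    return counts.reindex(index=champs, columns=champs, fill_value=0)

# e.g. head_to_head(major_regions_top)
```

The `reindex` step matters because `crosstab` only returns champions that actually won (rows) or lost (columns) at least once, so without it the matrix would not be square.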


##### Inferential Statistics 

In [8]:
# get data for just team-wide statistics
teams = loldf[loldf['player']=='Team']
# subset specific columns having to do with in-game performance
teams = teams[['gameid', 'league', 'team', 'result', 'ft', 'firstmidouter', 'firsttothreetowers', 'fbaron', 'totalgold', 'goldspent', 'goldat10', 'gdat10', 'goldat15', 'gdat15', 'xpat10', 'xpdat10']]

# split up data into winning team data and losing team data
win_teams = teams[teams['result']==1]
lose_teams = teams[teams['result']==0]

### Initial Findings

Just from a first look at the final matrix, Champions such as Ornn, Gnar, Gangplank, Vladimir, and Camille saw a lot of play. At the time, these Champions were very strong and could be picked regardless of the opponent because of how reliable and useful they were to their teams. Other Champions in this matrix with fewer games could be seen as 'counterpicks', or response picks chosen specifically to answer another Champion.
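How much play each Champion saw can be read off the matrix directly: a row sum is that Champion's wins and a column sum is its losses. The sketch below assumes a square matrix shaped like `major_matrix`, with the same champion labels on both axes; the helper name `champion_summary` is hypothetical.

```python
import pandas as pd

def champion_summary(matrix):
    """Games, wins, and win rate per champion from a head-to-head matrix."""
    wins = matrix.sum(axis=1)    # row sums: times the champion won
    losses = matrix.sum(axis=0)  # column sums: times the champion lost
    games = wins + losses
    summary = pd.DataFrame({
        'games': games,
        'wins': wins,
        'win_rate': wins / games,  # NaN for a champion with no games
    })
    return summary.sort_values('games', ascending=False)

# e.g. champion_summary(major_matrix).head(10)
```

Champions near the top of this table are the most frequently picked, while those with few games are the candidates for the counterpick reading, although a small sample makes their win rates unreliable.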

In comparing the winning teams to the losing teams, I also noticed that winning teams take the first tower, the first mid-outer tower, and the first three towers at much higher rates. The next question this raises is whether teams win because they take these towers, or whether they were able to take these towers because they were already winning. In other words, at what point in the game do we say the teams are even, and at what point do we say one has a comfortable lead?
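The comparison between winning and losing teams can be summarized in a single table. This is a minimal sketch, assuming the objective columns in `teams` (`ft`, `firstmidouter`, `firsttothreetowers`) are 0/1 flags, so that their mean is the share of teams that secured the objective; the function name `objective_rates` is hypothetical.

```python
import pandas as pd

def objective_rates(teams, objectives=('ft', 'firstmidouter', 'firsttothreetowers')):
    """Mean of each objective column, split by game result (0 = loss, 1 = win)."""
    # assumption: each objective column is a 0/1 flag, so the mean is a rate
    return teams.groupby('result')[list(objectives)].mean()

# e.g. objective_rates(teams)
# e.g. objective_rates(teams, objectives=('gdat10', 'gdat15')) for average gold difference
```

Passing the gold difference columns instead of the tower flags gives the average lead at 10 and 15 minutes for eventual winners and losers, which is one way to start on the question of when a lead becomes comfortable. It describes association only and does not settle which came first.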