
# League of Legends Data Project
___

## Part I: Pulling Data from the Riot API
using the [Cassiopeia](https://zenodo.org/record/1183285#.WukZMIgvyUk) package
___

This section outlines the steps necessary to connect to the Riot Games API and query its data. 

It is intended as a guide for current *League of Legends* players who wish to learn how to access and analyze their own data. To that end, replicating the results has been made as simple as possible—no prior Python or programming experience is required. One simply needs to follow the ***Before You Begin*** steps below, and then modify a few lines of code to specify their Summoner name and any desired changes to the types of data requested.

### Table of Contents
___
* [Getting Started](#Getting-Started)
* [Requesting Match History Data](#Calling-Match-History-Data)
    * [Dictionaries](#Creating-Customized-Data-Fields-with-Dictionaries)
    * [Player Stats](#Player-Stats)
    * [Teammate Data](#Teammate-Data)
* [Merging Dataframes](#Merging-All-Data-Together)
___

##### Before You Begin

If you haven't yet, you'll want to download [Anaconda (Python 3.x Version)](https://www.anaconda.com/download/). This is an all-in-one data science package that includes the Python programming language, numerous packages for data prep, analysis, and visualization, and the Jupyter Notebook, for executing code and creating documents just like this one.

Next, follow the instructions below to install [Cassiopeia](https://github.com/meraki-analytics/cassiopeia/blob/master/README.md). The link takes you to the project's GitHub repository, where you can find the package's documentation and examples.
* Open **Anaconda Navigator**.
* In the **Environments** tab, click the "play" button next to `base (root)`.
* Select **Open Terminal**.
* In the command prompt, simply type `pip install cassiopeia` and hit enter. 
* Close the prompt once installation is complete.

Finally, generate your [Riot Games API Key](https://developer.riotgames.com/) after signing in with your League of Legends account.

### Getting Started
___

In [1]:
import cassiopeia as cass
import pandas as pd
import numpy as np
import sys
import os

In [2]:
# Hide deprecation warning messages
import warnings
warnings.filterwarnings("ignore")

In [3]:
print("This project was created in Python version %s.%s.%s" % sys.version_info[:3])

This project was created in Python version 3.6.4


In [1]:
# Get current working directory
os.getcwd()

In [9]:
# Change working directory, if necessary (put the folder path between the quotes)
os.chdir('')

Paste in your API Key from the [Riot Developer Portal](https://developer.riotgames.com/) and set your default region below.

Remember, your key will expire 24 hours after it is generated. You'll need to return to the portal and generate a new key when this happens.

In [10]:
cass.set_riot_api_key("PASTE_API_KEY_HERE")
cass.set_default_region("NA")
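
Since the key expires daily, one option is to keep it out of the notebook and read it from an environment variable instead. The sketch below is only an illustration: the variable name `RIOT_API_KEY` and the helper `load_api_key` are assumptions, not something Riot or Cassiopeia requires.

```python
import os

def load_api_key(var_name="RIOT_API_KEY", default="PASTE_API_KEY_HERE"):
    """Return the API key stored in an environment variable.

    `RIOT_API_KEY` is an assumed (hypothetical) variable name; the default
    mirrors the placeholder used in the cell above.
    """
    return os.environ.get(var_name, default)

api_key = load_api_key()
# cass.set_riot_api_key(api_key)  # same call as in the cell above
```

With this approach, a new key only needs to be updated in one place outside the notebook.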

Enter your Summoner Name in the code below (once at the top, again at the bottom).

This will test to make sure you're connecting to the API.

In [12]:
summoner = cass.get_summoner(name = 'YOUR_NAME_HERE', region = 'NA')

def print_summoner(name: str, region: str):
    # Look up the Summoner from the arguments instead of relying on the global above
    s = cass.get_summoner(name = name, region = region)
    print("Name:", s.name)
    print("ID:", s.id)
    print("Account ID:", s.account.id)
    print("Level:", s.level)
    print("Revision date:", s.revision_date)
    print("Profile icon URL:", s.profile_icon.url)

if __name__ == "__main__":
    print_summoner("YOUR_NAME_HERE", "NA")

Name: Proxy Fox
ID: 51575158
Account ID: 214362000
Level: 74
Revision date: 2018-05-02
Profile icon URL: https://ddragon.leagueoflegends.com/cdn/8.8.2/img/profileicon/3382.png


___

We can pull a list of all the Summoner's played Champions and their respective Mastery Levels.

In [13]:
# Get a list of all played champions and their mastery levels. (Mastery 0 indicates never played)
my_masteries = summoner.champion_masteries.filter(lambda cm: cm.level >= 1)
print([(cm.champion.name, cm.level) for cm in my_masteries])

Making call: https://ddragon.leagueoflegends.com/cdn/8.8.2/data/en_US/championFull.json
Making call: https://na1.api.riotgames.com/lol/champion-mastery/v3/champion-masteries/by-summoner/51575158
[('Janna', 5), ('Nami', 5), ('Miss Fortune', 5), ('Karma', 5), ('Lux', 5), ('Ahri', 4), ('Veigar', 4), ("Kog'Maw", 4), ('Kayle', 4), ('Rakan', 4), ('Taric', 4), ('Nasus', 4), ('Blitzcrank', 3), ('Morgana', 3), ('Singed', 3), ('Twisted Fate', 3), ('Soraka', 3), ('Maokai', 3), ('Karthus', 3), ('Ziggs', 3), ('Swain', 3), ('Master Yi', 3), ('Galio', 3), ('Sona', 2), ('Xayah', 2), ('Caitlyn', 2), ('Ashe', 2), ('Twitch', 2), ('Zyra', 2), ('Sivir', 2), ('Syndra', 2), ('Ezreal', 2), ('Lulu', 2), ('Jayce', 2), ("Vel'Koz", 2), ('Lucian', 2), ('Heimerdinger', 2), ('Cassiopeia', 2), ('Vladimir', 2), ('Vi', 2), ('Thresh', 2), ('Ivern', 2), ('Wukong', 2), ('Vayne', 1), ('Hecarim', 1), ('Jinx', 1), ('Brand', 1), ('Malphite', 1), ('Graves', 1), ('Viktor', 1), ('Amumu', 1), ('Illaoi', 1), ('Zed', 1), ('Yasuo', 

> These can be stored permanently, to avoid having to make this call on the API again in the future.
>
> *Note: starting from scratch, ignoring the cell above.*
> * Create an empty list, `masteries = []`
> * For each item in the list pulled from the server, append the champion.name and level attributes
> * Create a pandas dataframe from the list, complete with named columns
> * Preview the result with your Top 10 Champions by Mastery

In [14]:
masteries = []

for cm in summoner.champion_masteries:
    masteries.append((cm.champion.name, cm.level))

df_cm = pd.DataFrame(masteries, columns = ('champion', 'mastery_level'))

df_cm.head(10)

| | champion | mastery_level |
|---|---|---|
| 0 | Janna | 5 |
| 1 | Nami | 5 |
| 2 | Miss Fortune | 5 |
| 3 | Karma | 5 |
| 4 | Lux | 5 |
| 5 | Ahri | 4 |
| 6 | Veigar | 4 |
| 7 | Kog'Maw | 4 |
| 8 | Kayle | 4 |
| 9 | Rakan | 4 |
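
Note that `head(10)` simply shows the first ten rows in the order they were collected. A minimal sketch of making the "Top 10 by Mastery" ordering explicit, using a few pairs copied from the output above (the name `masteries_sample` is made up for this illustration):

```python
# A few (champion, mastery_level) pairs copied from the output above,
# deliberately shuffled to show that the sort does the ordering.
masteries_sample = [("Ahri", 4), ("Sona", 2), ("Janna", 5), ("Vayne", 1), ("Nami", 5)]

# Sort by mastery level, highest first. sorted() is stable, so champions
# with the same level keep their original relative order.
top = sorted(masteries_sample, key=lambda pair: pair[1], reverse=True)

for champion, level in top[:3]:
    print(champion, level)
```

With the dataframe itself, the equivalent would be `df_cm.sort_values('mastery_level', ascending = False).head(10)`.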


### Calling Match History Data
___
> First, define an object (`match_hist`) to represent your personal match history. ("summoner" is an object we defined earlier using the chosen Summoner Name. In this example, the `summoner` object references me, "Proxy Fox".)
>
> You can filter by `Queue`, `Season`, and more. The default (an empty set of parentheses) returns the entire match history.
> For this example, I'm pulling all my ARAM matches.
>
> If, for example, you instead wanted data for only your Ranked matches in Season 8, you would change the code as follows:

```
match_hist = summoner.match_history(queues = {Queue.ranked_solo_fives, Queue.ranked_flex_fives}, seasons = {Season.season_8})
```

In [15]:
from cassiopeia import Queue, Season
match_hist = summoner.match_history(queues = {Queue.aram, 
                                              Queue.depreciated_aram})

len(match_hist)

Making call: https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/214362000?beginIndex=0&endIndex=100&queue=65&queue=450
Making call: https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/214362000?beginIndex=100&endIndex=200&queue=65&queue=450
Making call: https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/214362000?beginIndex=200&endIndex=300&queue=65&queue=450
Making call: https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/214362000?beginIndex=300&endIndex=400&queue=65&queue=450
Making call: https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/214362000?beginIndex=400&endIndex=500&queue=65&queue=450
Making call: https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/214362000?beginIndex=500&endIndex=600&queue=65&queue=450
Making call: https://na1.api.riotgames.com/lol/match/v3/matchlists/by-account/214362000?beginIndex=600&endIndex=700&queue=65&queue=450
Making call: https://na1.api.riotgames.com/lol/match/v3/m

1100

> Calling `len(match_hist)` gives the number of matches in the history that we just created. Now, we can use the `match_hist` object as the foundation for gathering specific data from single matches.
>
> The stat we most want to focus on is match outcome (i.e. Did we win?), so let's use that as an example.
>
> This code looks at every match in your `match_hist`, defines "p" as the participant identified as `[summoner]`, and returns a boolean (True or False) value based on whether your team won or lost that match.

In [16]:
# Example code, print the 10 most recent match outcomes
for m in match_hist[0:10]:
    p = m.participants[summoner]
    print(p.team.win)

Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2775186012
Making call: https://na1.api.riotgames.com/lol/summoner/v3/summoners/51575158
Making call: https://na1.api.riotgames.com/lol/summoner/v3/summoners/by-account/214362000
False
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2775165096
True
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774959782
True
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774936273
True
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774941224
False
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774915162
True
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774402461
True
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774348536
True
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774308114
True
Making call: https://na1.api.riotgames.com/lol/match/v3/matches/2774344903
False


**Tip:** *The first time we run a code chunk that calls the API, the output will include* "Making call: [url]" *for every call. Simply run the code chunk again to display only the desired output.*

In [17]:
# Example code, print the 10 most recent match outcomes
for m in match_hist[0:10]:
    p = m.participants[summoner]
    print(p.team.win)

False
True
True
True
False
True
True
True
True
False
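
Because each outcome is a boolean, a win rate falls out with very little code. The sketch below hard-codes the ten outcomes printed above rather than calling the API; the variable names are chosen for this illustration only.

```python
# The ten outcomes printed above, most recent match first
outcomes = [False, True, True, True, False, True, True, True, True, False]

# True counts as 1 and False as 0, so sum() gives the number of wins
wins = sum(outcomes)
win_rate = wins / len(outcomes)

print(f"{wins} wins in {len(outcomes)} matches ({win_rate:.0%})")
```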


> In the example, we asked for only the 10 most recent matches, by using `[0:10]`. (This Pythonese can be slightly counterintuitive. The first item in a list is always numbered 0. The range `[0:10]` specifies Item 0 through Item 9, ten items in total. The slice stops at, but does not include, the index given after the colon.)
>
> To pull from the entire match history, we just remove that filter. This causes the code to iterate over all matches.
>
> **Note: This will take several minutes.** API calls are rate-limited. Cassiopeia optimizes the time between calls automatically, but this means that anything more than a few dozen matches will require significant time.
>
> The upside to this, however, is that Cass will access **all** the data for these matches, so if we decide later on to add more statistics to the database, we won't need to re-run the entire call.
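
The slice notation described above can be tried on an ordinary list. This is a sketch with made-up placeholder strings standing in for matches, not real Cassiopeia objects:

```python
# Twelve placeholder "matches", numbered the way Python numbers them (from 0)
matches = ["m0", "m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8", "m9", "m10", "m11"]

first_ten = matches[0:10]   # items at index 0 through 9
print(len(first_ten))       # ten items
print(first_ten[-1])        # the last one kept is "m9"; "m10" is excluded

# Omitting the start index means "from the beginning"
assert matches[:10] == first_ten

# Looping without any slice visits every item
count = 0
for m in matches:
    count += 1
print(count)
```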

In [21]:
# Pull a set of stats for the entire match history

# First full-length API call
# Will take several minutes!

match_data = []

for m in match_hist:
    p = m.participants[summoner]
    match_data.append((m.id, 
                       m.duration, 
                       p.side.name, 
                       p.team.win, 
                       p.team.first_blood, 
                       p.team.first_tower))
    
len(match_data)

1100

> ##### Understanding Objects in Python
> ___
> Python is an Object-oriented programming (OOP) language, meaning we can work with data by defining and manipulating the objects which contain it. Python uses "dot notation" to indicate an attribute, method (function), or even another object, belonging to a parent object. The code above has many such examples, which we can break down line-by-line.
>
> `for m in match_hist:` - In this line, `match_hist` is an **object**: the collection of matches we defined earlier as our match history. `m` is a **loop variable** that stands for a single **object** (match) within that collection, one match at a time. The loop variable can be named anything. I used "m" for "match", but it would be just as acceptable to use "x", "match", "game", or "banana", as long as it makes some sort of sense (so "banana" is probably out) and is consistent throughout the code chunk.
>
> The next lines of code are indented to show that they belong to the first line. That is, the first line says, *"For each object in the list, do the following:"*
>
> `p = m.participants[summoner]` — Here, we're just assigning a simpler name, `p`, to an object that identifies the Summoner (i.e. yourself) within a match. Notice that we reuse `m` from the previous line. This is an object representing a match. The dot (period) following `m` allows us to access the objects and attributes contained within a "match" object. Finally, the [brackets] act like a filter. Out of the ten objects (players) within `m.participants`, we only want the object that matches [summoner]. 
>
> `match_data.append((m.id, ...` — Now, we start actually collecting data. We first defined an empty list, `match_data = []` to act as a "data container". In this line, `match_data` is an **object**, and `append` is a **method**, or a function of that object. We will be appending (adding to) the list all of the following items contained in parentheses. 
>
> `m.id` and `m.duration` are **attributes** of a "match" object `m`. `p.side.name` contains two levels of objects: `p`, which we defined as our Summoner; and `side`, which contains various ways of expressing the "side" on which the participant played. (Blue team starts in the bottom-left corner of the map and plays left-to-right, Red team is opposite.) Finally, `name` is the attribute that gives us the data we want.
>
> ___
>
> ***Python pro tip:*** After typing a dot, press [Tab] to show a list of the possible objects and/or attributes within an object. For example, typing "p.team." in the appropriate spot in the code above, then pressing [Tab], will give you a list of the objects and attributes containing data collected on "participant's team".
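
To see dot notation without touching the API, here is a toy version of the same structure. The classes `Side`, `Participant`, and `Match` below are hypothetical stand-ins written for this sketch; they are far simpler than Cassiopeia's real objects and only mimic the attribute names used above.

```python
class Side:
    def __init__(self, name):
        self.name = name                      # an attribute

class Participant:
    def __init__(self, summoner_name, side):
        self.summoner_name = summoner_name
        self.side = side                      # an object nested inside another object

class Match:
    def __init__(self, match_id, participants):
        self.id = match_id
        # A dict lets us pick out one participant with [brackets]
        self.participants = {p.summoner_name: p for p in participants}

m = Match(1, [Participant("Proxy Fox", Side("blue")),
              Participant("SomeoneElse", Side("red"))])

p = m.participants["Proxy Fox"]               # one participant out of the match

match_data = []                               # the empty "data container"
match_data.append((m.id, p.side.name))        # append is a method of the list
print(match_data)
```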


In [22]:
# Convert the list into a pandas dataframe
df_match = pd.DataFrame(match_data, columns = ['match_id', 
                                               'duration', 
                                               'side', 
                                               'win', 
                                               'team_first_blood', 
                                               'team_first_tower'])

> ##### Pandas Dataframes
> ___
> The pandas library is a data scientist's best friend in Python. It facilitates working with "dataframes", which can be thought of as similar to an Excel spreadsheet. In fact, pandas dataframes can easily be imported from, and exported to, Excel files.
>
> We've turned the `match_data` "container" into a fully-functional dataframe, complete with column names. It can be previewed, and then saved to disk for easy retrieval later on.

In [23]:
# Preview
df_match.head()

| | match_id | duration | side | win | team_first_blood | team_first_tower |
|---|---|---|---|---|---|---|
| 0 | 2775186012 | 00:18:12 | blue | False | True | False |
| 1 | 2775165096 | 00:18:38 | red | True | True | True |
| 2 | 2774959782 | 00:22:05 | red | True | True | False |
| 3 | 2774936273 | 00:18:25 | red | True | False | True |
| 4 | 2774941224 | 00:19:21 | blue | False | False | False |


In [24]:
# Write it to a CSV file for permanent storage
df_match.to_csv('data/matches.csv', sep = ',', index = False)
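
One thing to watch: writing to `data/matches.csv` assumes a `data` folder already exists in the working directory. A minimal sketch of creating a folder safely, run here inside a temporary directory so it leaves nothing behind (the temporary location is only for this illustration):

```python
import os
import tempfile

# Work inside a throwaway folder so the sketch does not touch your project
base = tempfile.mkdtemp()
data_dir = os.path.join(base, "data")

# exist_ok=True makes this safe to re-run: no error if the folder is already there
os.makedirs(data_dir, exist_ok=True)
os.makedirs(data_dir, exist_ok=True)

csv_path = os.path.join(data_dir, "matches.csv")
print(os.path.isdir(data_dir))
```

In the notebook, running `os.makedirs('data', exist_ok = True)` once before the `to_csv` call should be enough, and `pd.read_csv('data/matches.csv')` reads the saved file back in a later session.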

#### Creating Customized Data Fields with Dictionaries
___
> Next, I wanted to include a column for Champion type (one of Assassin, Fighter, Mage, Marksman, Support, Tank), but no such item existed in the data structure of the API.
>
> Below is a manually-created *(you're welcome)* dictionary that associates each Champion with its type. 
> 
> Similar to Excel's VLOOKUP function, this can be used to fill a "type" column with the value associated with the name in a "champion" column.

In [28]:
# Define each Champion by its respective type.

type_dict = {"Akali":'Assassin', "Ekko":'Assassin', "Evelynn":'Assassin', "Fizz":'Assassin', 
             "Kassadin":'Assassin', "Katarina":'Assassin', "Kha'Zix":'Assassin', "LeBlanc":'Assassin', 
             "Master Yi":'Assassin', "Nidalee":'Assassin', "Nocturne":'Assassin', "Rengar":'Assassin', 
             "Shaco":'Assassin', "Talon":'Assassin', "Zed":'Assassin', 
             
             "Aatrox":'Fighter', "Camille":'Fighter', "Darius":'Fighter', "Diana":'Fighter', 
             "Dr. Mundo":'Fighter', "Fiora":'Fighter', "Gangplank":'Fighter', "Garen":'Fighter', 
             "Gnar":'Fighter', "Gragas":'Fighter', "Hecarim":'Fighter', "Illaoi":'Fighter', 
             "Irelia":'Fighter', "Jax":'Fighter', "Jayce":'Fighter', "Kayle":'Fighter', 
             "Kayn":'Fighter', "Kled":'Fighter', "Lee Sin":'Fighter', "Mordekaiser":'Fighter', 
             "Nasus":'Fighter', "Olaf":'Fighter', "Pantheon":'Fighter', "Rek'Sai":'Fighter', 
             "Renekton":'Fighter', "Riven":'Fighter', "Rumble":'Fighter', "Shyvana":'Fighter', 
             "Skarner":'Fighter', "Trundle":'Fighter', "Tryndamere":'Fighter', "Udyr":'Fighter', 
             "Urgot":'Fighter', "Vi":'Fighter', "Volibear":'Fighter', "Warwick":'Fighter', 
             "Wukong":'Fighter', "Xin Zhao":'Fighter', "Yasuo":'Fighter', "Yorick":'Fighter',
             
             "Ahri":'Mage', "Anivia":'Mage', "Annie":'Mage', "Aurelion Sol":'Mage', 
             "Azir":'Mage', "Brand":'Mage', "Cassiopeia":'Mage', "Elise":'Mage', 
             "Fiddlesticks":'Mage', "Heimerdinger":'Mage', "Karma":'Mage', "Karthus":'Mage', 
             "Kennen":'Mage', "Lissandra":'Mage', "Lux":'Mage', "Malzahar":'Mage', 
             "Morgana":'Mage', "Orianna":'Mage', "Ryze":'Mage', "Swain":'Mage', 
             "Syndra":'Mage', "Taliyah":'Mage', "Twisted Fate":'Mage', "Veigar":'Mage', 
             "Vel'Koz":'Mage', "Viktor":'Mage', "Vladimir":'Mage', "Xerath":'Mage', 
             "Ziggs":'Mage', "Zoe":'Mage', "Zyra":'Mage', 
             
             "Ashe":'Marksman', "Caitlyn":'Marksman', "Corki":'Marksman', "Draven":'Marksman', 
             "Ezreal":'Marksman', "Graves":'Marksman', "Jhin":'Marksman', "Jinx":'Marksman', 
             "Kai'Sa":'Marksman', "Kalista":'Marksman', "Kindred":'Marksman', "Kog'Maw":'Marksman', 
             "Lucian":'Marksman', "Miss Fortune":'Marksman', "Quinn":'Marksman', "Sivir":'Marksman', 
             "Teemo":'Marksman', "Tristana":'Marksman', "Twitch":'Marksman', "Varus":'Marksman', 
             "Vayne":'Marksman', "Xayah":'Marksman',
             
             "Bard":'Support', "Braum":'Support', "Ivern":'Support', "Janna":'Support', 
             "Lulu":'Support', "Nami":'Support', "Nunu":'Support', "Rakan":'Support', 
             "Sona":'Support', "Soraka":'Support', "Tahm Kench":'Support', "Taric":'Support', 
             "Thresh":'Support', "Zilean":'Support', 
             
             "Alistar":'Tank', "Amumu":'Tank', "Blitzcrank":'Tank', "Cho'Gath":'Tank', 
             "Galio":'Tank', "Jarvan IV":'Tank', "Leona":'Tank', "Malphite":'Tank', 
             "Maokai":'Tank', "Nautilus":'Tank', "Ornn":'Tank', "Poppy":'Tank', 
             "Rammus":'Tank', "Sejuani":'Tank', "Shen":'Tank', "Singed":'Tank', 
             "Sion":'Tank', "Zac":'Tank'}

len(type_dict)

140

> **Step 1:** Get the list of Champions played by [summoner] in every match.

In [29]:
# Convert Champion ID (may change over time) to fixed Champion Name
champion_id_to_name_mapping = {champion.id : champion.name for champion in cass.get_champions(region = 'NA')}

# Start a list that will contain the Champion played for every match
champion = []

for m in match_hist:
    p = m.participants[summoner]
    champion.append(champion_id_to_name_mapping[p.champion.id])

In [30]:
# Create an index of Match ID #s
m_id = []

for m in match_hist:
    m_id.append(m.id)

# Convert the list into a pandas dataframe
df_champion = pd.DataFrame(champion, columns = ['champion'], 
                           index = m_id)

> **Step 2:** Use the dictionary `type_dict` to get the Type associated with each Champion.

In [31]:
# Create a new column, fill it with the "translation" of each Champion to its Type
df_champion['champion_type'] = df_champion['champion'].apply(lambda x: type_dict[x])

In [32]:
# Preview
df_champion.head()

| | champion | champion_type |
|---|---|---|
| 2775186012 | Xayah | Marksman |
| 2775165096 | Rakan | Support |
| 2774959782 | Zyra | Mage |
| 2774936273 | Maokai | Tank |
| 2774941224 | Rakan | Support |


> ##### ELI5: Dictionaries
> ___
> In Python, a dictionary contains pairs of keys and values, where a key is defined by its value. Just as in a real dictionary, these can be thought of as "entries" and "definitions". Entries must be unique, but definitions don't need to be. In this case, each Champion is an entry (key) and that Champion's Type is the definition (value). Dictionaries are enclosed in {braces} and take the form `{'key':'value', 'entry':'definition'}`.
>
> First, we define the column to add to the dataframe (`df_champion['champion_type']`).
>
> Then, we specify how to fill that column, by passing a command to *"Apply the* `type_dict` *dictionary to the 'champion' column."*
>
> In other words, *"look at the name in the 'champion' column, find its entry in the dictionary, and give me the corresponding Type."*
>
> Rakan is a Support-type, so his dictionary entry would be {"Rakan" : 'Support'}. Thus, when I play a match as "Rakan", the function will output a value of "Support".
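
The lookup can be tried on a small excerpt of the dictionary. In this sketch, `type_sample` and the champion name `"NewChampion"` are made up for illustration; the point is what happens when a name has no entry, for example a Champion released after the dictionary was written.

```python
# A three-entry excerpt of the dictionary defined above
type_sample = {"Rakan": "Support", "Xayah": "Marksman", "Zyra": "Mage"}

# Square brackets return the value for a key that exists...
print(type_sample["Rakan"])

# ...but raise KeyError for a key that does not.
try:
    type_sample["NewChampion"]
except KeyError:
    print("No entry for NewChampion")

# .get() returns a fallback value instead of raising an error
print(type_sample.get("NewChampion", "Unknown"))
```

In pandas, `df_champion['champion'].map(type_dict)` behaves like the forgiving version: names without an entry become `NaN` rather than stopping the cell with an error.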

> ##### Match ID: A unique identifier
> ___
> We used `index = m_id` when creating the dataframe. This defines each row by its unique Match ID.
>
> When joining multiple dataframes together, it is necessary to have a *key*: a unique value that exists in both dataframes, allowing the two to be related and joined properly.
>
> Although this list should be in the same order as other data collected, this is just an added failsafe to ensure that data is always where it belongs when these fragments are all merged into a single dataframe.
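
A small sketch of why the key matters: the two frames below hold the same two Match IDs in different row orders, and the join still lines them up. The frame names `df_a` and `df_b` are made up for this illustration; the values come from the previews above.

```python
import pandas as pd

# Two small frames indexed by Match ID, with rows deliberately in different orders
df_a = pd.DataFrame({"champion": ["Xayah", "Rakan"]},
                    index=[2775186012, 2775165096])
df_b = pd.DataFrame({"win": [True, False]},
                    index=[2775165096, 2775186012])

# join() aligns rows on the index (the Match ID), not on their position
merged = df_a.join(df_b)
print(merged)
```

Had the two frames been glued together by position instead, the Xayah match would have been paired with the wrong outcome.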

> **Step 3:** Repeat the process by creating a dictionary for your Mastery levels.

In [33]:
# Same code from before, only now we convert it to a dictionary
masteries = []

for cm in summoner.champion_masteries:
    masteries.append((cm.champion.name, cm.level))
    
masteries_dict = dict(masteries)

In [34]:
# Add the Mastery Level column
df_champion['mastery_level'] = df_champion['champion'].apply(lambda x: masteries_dict[x])

In [35]:
# Preview
df_champion.head()

| | champion | champion_type | mastery_level |
|---|---|---|---|
| 2775186012 | Xayah | Marksman | 2 |
| 2775165096 | Rakan | Support | 4 |
| 2774959782 | Zyra | Mage | 2 |
| 2774936273 | Maokai | Tank | 3 |
| 2774941224 | Rakan | Support | 4 |


In [36]:
# Remove spaces in names
df_champion['champion'] = df_champion['champion'].str.replace(' ', '_')

In [37]:
# Write it to a CSV file for permanent storage
df_champion.to_csv('data/champions.csv', sep = ',', index = True)

#### Player Stats
___
> And now for all the gameplay stats.
>
> You can find the full list of stats collected on individuals in the [Cassiopeia documentation](http://cassiopeia.readthedocs.io/en/latest/cassiopeia/match.html#cassiopeia.core.match.ParticipantStats).
>
> *Note:* These are just the post-game stats for each individual participant. You can also use [Team-wide stats](http://cassiopeia.readthedocs.io/en/latest/cassiopeia/match.html#cassiopeia.core.match.Team), like `p.team.win` earlier, as well as top-level [Participant info](http://cassiopeia.readthedocs.io/en/latest/cassiopeia/match.html#cassiopeia.core.match.Participant), of which `stats` is just a subset category.

In [38]:
stats = []

for m in match_hist:
    p = m.participants[summoner]
    stats.append((p.stats.level, 
                  p.stats.kills, 
                  p.stats.deaths, 
                  p.stats.assists, 
                  p.stats.total_damage_dealt_to_champions, 
                  p.stats.physical_damage_dealt_to_champions, 
                  p.stats.magic_damage_dealt_to_champions, 
                  p.stats.true_damage_dealt_to_champions, 
                  p.stats.total_damage_dealt, 
                  p.stats.physical_damage_dealt, 
                  p.stats.magic_damage_dealt, 
                  p.stats.true_damage_dealt, 
                  p.stats.largest_critical_strike, 
                  p.stats.largest_killing_spree, 
                  p.stats.largest_multi_kill, 
                  p.stats.killing_sprees, 
                  p.stats.damage_dealt_to_objectives, 
                  p.stats.total_heal, 
                  p.stats.damage_self_mitigated, 
                  p.stats.total_damage_taken, 
                  p.stats.physical_damage_taken, 
                  p.stats.magical_damage_taken, 
                  p.stats.true_damage_taken, 
                  p.stats.total_minions_killed, 
                  p.stats.gold_earned, 
                  p.stats.total_time_crowd_control_dealt, 
                  p.stats.longest_time_spent_living))

In [39]:
# Create an index of Match ID #s
m_id = []

for m in match_hist:
    m_id.append(m.id)

# Convert to dataframe and name all columns
df_stats = pd.DataFrame(stats, columns = ['level',
                                          'kills',
                                          'deaths', 
                                          'assists',
                                          'total_damage_dealt_to_champions',
                                          'physical_damage_dealt_to_champions', 
                                          'magic_damage_dealt_to_champions',
                                          'true_damage_dealt_to_champions',
                                          'total_damage_dealt', 
                                          'physical_damage_dealt',
                                          'magic_damage_dealt',
                                          'true_damage_dealt', 
                                          'largest_critical_strike',
                                          'largest_killing_spree',
                                          'largest_multi_kill', 
                                          'killing_sprees',
                                          'damage_dealt_to_objectives',
                                          'total_heal', 
                                          'damage_self_mitigated',
                                          'total_damage_taken',
                                          'physical_damage_taken', 
                                          'magical_damage_taken',
                                          'true_damage_taken',
                                          'total_minions_killed', 
                                          'gold_earned',
                                          'total_time_crowd_control_dealt',
                                          'longest_time_spent_living'], 
                        index = m_id)
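
The two cells above list every stat name twice: once when collecting values and once when naming columns. One possible way to keep the two in sync is to drive both from a single list with Python's built-in `getattr`. This is only a sketch: `fake_stats` is a hypothetical stand-in for a real `p.stats` object, filled with values from the first match in the preview output.

```python
from types import SimpleNamespace

# Hypothetical stand-in for one participant's `p.stats` object
fake_stats = SimpleNamespace(level=16, kills=2, deaths=4, assists=14)

# One list drives both the values collected and the column names
stat_names = ["level", "kills", "deaths", "assists"]

# getattr(obj, "kills") is the same as writing obj.kills
row = tuple(getattr(fake_stats, name) for name in stat_names)
print(row)
```

The same list could then be passed as `columns = stat_names` when building the dataframe.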

In [40]:
# Preview
df_stats.head()

| | level | kills | deaths | assists | total_damage_dealt_to_champions | physical_damage_dealt_to_champions | magic_damage_dealt_to_champions | true_damage_dealt_to_champions | total_damage_dealt | physical_damage_dealt | ... | total_heal | damage_self_mitigated | total_damage_taken | physical_damage_taken | magical_damage_taken | true_damage_taken | total_minions_killed | gold_earned | total_time_crowd_control_dealt | longest_time_spent_living |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2775186012 | 16 | 2 | 4 | 14 | 17025 | 15911 | 1113 | 0 | 81015 | 67851 | ... | 4416 | 4402 | 12283 | 5785 | 6232 | 265 | 69 | 10808 | 660 | 425 |
| 2775165096 | 17 | 2 | 7 | 35 | 7433 | 1306 | 5277 | 849 | 17139 | 4673 | ... | 6392 | 30909 | 20110 | 11038 | 8793 | 279 | 9 | 10921 | 105 | 168 |
| 2774959782 | 18 | 6 | 7 | 23 | 44403 | 512 | 43890 | 0 | 109344 | 4253 | ... | 2520 | 5081 | 18443 | 1930 | 15777 | 734 | 93 | 14227 | 307 | 354 |
| 2774936273 | 17 | 10 | 5 | 14 | 26350 | 1697 | 24382 | 270 | 65390 | 5322 | ... | 11140 | 18993 | 26980 | 10224 | 16755 | 0 | 48 | 12037 | 429 | 213 |
| 2774941224 | 16 | 2 | 11 | 20 | 9735 | 729 | 7291 | 1714 | 25409 | 4587 | ... | 6242 | 31773 | 28825 | 12743 | 15308 | 773 | 11 | 10200 | 3071 | 174 |


In [41]:
# Save to CSV
df_stats.to_csv('data/stats.csv', sep = ',', index = True)

#### Teammate Data
___
> Next, let's get a list of the Champions played by each of summoner's teammates.
>
> Note that we don't want to gather data for ourselves again, so we need to exclude `summoner`.
>
> The line
> `if (p.team == m.participants[summoner].team) & (p != m.participants[summoner]):` <br>
> can be read as: <br>
> *"If a participant's team is the same as my team, AND the participant is NOT me:"* 
>
> Then, in the indented lines beneath, we print participant attributes which meet those criteria.
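
The same two-part test can be tried on plain data. In this sketch the participants are simple `(name, side)` tuples rather than Cassiopeia objects, and the enemy names are made up:

```python
# Hypothetical stand-ins for participants: (summoner name, team side)
participants = [("Proxy Fox", "blue"), ("WolveNova", "blue"), ("TheWhiteChip", "blue"),
                ("Enemy1", "red"), ("Enemy2", "red")]
me = ("Proxy Fox", "blue")

# Keep a participant only if they are on my team AND they are not me
teammates = [p for p in participants
             if p[1] == me[1] and p != me]

print([name for name, side in teammates])
```

The sketch uses `and`, the usual Python spelling of a logical AND between two True/False values. The notebook's `&` gives the same result when both sides are plain booleans, which is assumed to be the case for these comparisons, but it needs the parentheses around each comparison, as in the line quoted above.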

In [45]:
# Print an example to make sure it's working as intended
# The additional info helps us to verify that every four records belong to one match.
for m in match_hist[0:3]:
    print("Teammates for {player}".format(player = summoner.name))
    print("Match #{id}".format(id = m.id))
    print("")
    for p in m.participants:
        if (p.team == m.participants[summoner].team) & (p != m.participants[summoner]):
            print("Champion:", p.champion.name)
            print("Summoner:", p.summoner.name)
            print("Match won:", p.team.win)
            print("Map side:", p.side.name)
            print("")
    print("===================") # Insert a break after each match
    print("")                    # Indentation matters. This ensures the break happens only after every match.

Teammates for Proxy Fox
Match #2775186012

Champion: Zed
Summoner: WolveNova
Match won: False
Map side: blue

Champion: Urgot
Summoner: Chocolate Malk
Match won: False
Map side: blue

Champion: Akali
Summoner: TheWhiteChip
Match won: False
Map side: blue

Champion: Nidalee
Summoner: BillCosbysDrink
Match won: False
Map side: blue


Teammates for Proxy Fox
Match #2775165096

Champion: Ahri
Summoner: honest kodaline
Match won: True
Map side: red

Champion: Varus
Summoner: IamFabbs
Match won: True
Map side: red

Champion: Rumble
Summoner: Strydex
Match won: True
Map side: red

Champion: Kog'Maw
Summoner: STonED7
Match won: True
Map side: red


Teammates for Proxy Fox
Match #2774959782

Champion: Sona
Summoner: flipsyyy
Match won: True
Map side: red

Champion: Garen
Summoner: Aedheroth
Match won: True
Map side: red

Champion: Graves
Summoner: PrincessKenny0
Match won: True
Map side: red

Champion: Volibear
Summoner: TheTruthx
Match won: True
Map side: red




> The code below creates a "list of lists", where each "inner list" represents a group of teammates for a single match.
>
> We again include match ID, just to guarantee that we join these teammate lists to the correct match when we merge all the data together.

In [46]:
tmate_champions = []

for m in match_hist:
    tmates = [m.id]
    for p in m.participants:
        if (p.team == m.participants[summoner].team) & (p != m.participants[summoner]):
            tmates.append(champion_id_to_name_mapping[p.champion.id])
    tmate_champions.append(tmates)

In [47]:
tmate_champions[0:3]

[[2775186012, 'Zed', 'Urgot', 'Akali', 'Nidalee'],
 [2775165096, 'Ahri', 'Varus', 'Rumble', "Kog'Maw"],
 [2774959782, 'Sona', 'Garen', 'Graves', 'Volibear']]

> Check to make sure this is the same data that we printed out above.

Now those lists can be converted to the rows of a dataframe.

In [48]:
df_team = pd.DataFrame(tmate_champions, 
                       columns = ['match_id', 'tmate_1', 'tmate_2', 'tmate_3', 'tmate_4'])

In [49]:
df_team.head()

| | match_id | tmate_1 | tmate_2 | tmate_3 | tmate_4 |
|---|---|---|---|---|---|
| 0 | 2775186012 | Zed | Urgot | Akali | Nidalee |
| 1 | 2775165096 | Ahri | Varus | Rumble | Kog'Maw |
| 2 | 2774959782 | Sona | Garen | Graves | Volibear |
| 3 | 2774936273 | Alistar | Janna | Fizz | Ezreal |
| 4 | 2774941224 | Nunu | Kog'Maw | Varus | Taliyah |


> We now have a nice clean dataframe of our four teammates for every match.
>
> However, if we want to use this information for analytics, we'll need a way to convert those Champion names into numbers. (You can't use "Teemo" in a mathematical model.)
>
> To do this we create "dummy variables"—expanding each column into one column for every possible value, which is filled with either a 0 or 1, representing True or False.
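
To make the idea concrete, here is a hand-rolled version of dummy variables for a single column, written in plain Python. It is a sketch of the concept rather than of how pandas implements it; the values are the `tmate_1` column from the preview above.

```python
# The tmate_1 column from the five-row preview above
tmate_1 = ["Zed", "Ahri", "Sona", "Alistar", "Nunu"]

# One dummy column per unique value, in alphabetical order
categories = sorted(set(tmate_1))

# Each row gets a 1 in the column matching its value and 0 everywhere else
dummies = [[1 if value == cat else 0 for cat in categories] for value in tmate_1]

print(categories)
for row in dummies:
    print(row)
```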

In [50]:
for col in df_team.columns[1:]:
    # Expand each text column into one 0/1 dummy column per unique value
    if df_team[col].dtype == object:
        dummies = pd.get_dummies(df_team[col], prefix = col)
        df_team = pd.concat([df_team, dummies], axis = 1)
        del df_team[col]

In [51]:
df_team.head()

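| | match_id | tmate_1_Aatrox | tmate_1_Ahri | tmate_1_Akali | tmate_1_Alistar | tmate_1_Amumu | tmate_1_Anivia | tmate_1_Annie | tmate_1_Ashe | tmate_1_Aurelion Sol | ... | tmate_4_Xerath | tmate_4_Xin Zhao | tmate_4_Yasuo | tmate_4_Yorick | tmate_4_Zac | tmate_4_Zed | tmate_4_Ziggs | tmate_4_Zilean | tmate_4_Zoe | tmate_4_Zyra |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2775186012 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 2775165096 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 2774959782 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 2774936273 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 2774941224 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |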


> Oops.
>
> This is exactly what we **don't** want. When the team was split up so that each player got their own column, the dummy variable converter created the full list of 140 champions **for each teammate**. At 556 columns, that's a lot of extra complexity.
>
> We'll need to "trick" pandas into accepting a list of champions as a single column.

In [52]:
# Re-create the list, but this time, make Match ID its own separate list
tmate_champions = []

for m in match_hist:
    tmates = []
    for p in m.participants:
        if p.team == m.participants[summoner].team and p != m.participants[summoner]:
            tmates.append(champion_id_to_name_mapping[p.champion.id])
    tmate_champions.append(tmates)
    
m_id = []

for m in match_hist:
    m_id.append(m.id)

> ##### Enemies?
> ___
> The same code, with slight modifications, can be used to build the enemy team's roster. I've chosen not to collect this data because, for the purpose of self-improvement analytics, the enemy team's composition is entirely out of my control. (Not to mention, it would add another 140 variables.) However, the modified code is provided below.
>
> The main modification is the "not equal to" operator (`!=`): we only want data for the participants (`p`) who are not on our team.
>
> ```
> enemy_champions = []
>
> for m in match_hist:
>     enemies = []
>     for p in m.participants:
>         if (p.team != m.participants[summoner].team):
>             enemies.append(champion_id_to_name_mapping[p.champion.id])
>     enemy_champions.append(enemies)
> ```
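
If both rosters are ever needed, the two loops can be folded into a single helper. The sketch below is only an illustration: `Player` is a hypothetical stand-in for a match participant (not Cassiopeia's actual class), `split_rosters` is a made-up helper name, and the toy match is invented.

```python
from collections import namedtuple

# Hypothetical stand-in for a match participant (not Cassiopeia's real class)
Player = namedtuple('Player', ['name', 'team', 'champion'])

def split_rosters(participants, me):
    """Return (teammate champions, enemy champions) for the player `me`."""
    teammates = [p.champion for p in participants
                 if p.team == me.team and p != me]
    enemies = [p.champion for p in participants if p.team != me.team]
    return teammates, enemies

# Toy match (made up): a 2-v-2 instead of a full 5-v-5
me = Player('me', 'blue', 'Xayah')
players = [me, Player('a', 'blue', 'Zed'),
           Player('b', 'red', 'Ahri'), Player('c', 'red', 'Sona')]

teammates, enemies = split_rosters(players, me)
print(teammates, enemies)
```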

In [53]:
df_team = pd.DataFrame({'teammate_champions' : tmate_champions}, index = m_id)

In [54]:
df_team.head()

Unnamed: 0,teammate_champions
2775186012,"[Zed, Urgot, Akali, Nidalee]"
2775165096,"[Ahri, Varus, Rumble, Kog'Maw]"
2774959782,"[Sona, Garen, Graves, Volibear]"
2774936273,"[Alistar, Janna, Fizz, Ezreal]"
2774941224,"[Nunu, Kog'Maw, Varus, Taliyah]"


> Now we need to get this out of list form (brackets) and tell Python/pandas to treat these as four separate, comma-separated items.

In [55]:
df_team['teammate_champions'] = df_team.teammate_champions.astype(str)

In [56]:
# Strip the square brackets left over from the list's string representation
df_team['teammate_champions'] = df_team.teammate_champions.str.strip("[]")

In [57]:
df_team.head()

Unnamed: 0,teammate_champions
2775186012,"'Zed', 'Urgot', 'Akali', 'Nidalee'"
2775165096,"'Ahri', 'Varus', 'Rumble', ""Kog'Maw"""
2774959782,"'Sona', 'Garen', 'Graves', 'Volibear'"
2774936273,"'Alistar', 'Janna', 'Fizz', 'Ezreal'"
2774941224,"'Nunu', ""Kog'Maw"", 'Varus', 'Taliyah'"


The single line of code below does exactly what we need: `str.get_dummies` splits each string on `', '` and creates one 0/1 column for every distinct item. (This solution comes from a [StackOverflow answer](https://stackoverflow.com/questions/46674157/pandas-turn-multiple-variables-into-a-single-set-of-dummy-variables).)

In [58]:
df_team_dummycoded = df_team.teammate_champions.str.get_dummies(', ')

In [59]:
df_team_dummycoded.head(10)

Unnamed: 0,"""Cho'Gath""","""Kai'Sa""","""Kha'Zix""","""Kog'Maw""","""Rek'Sai""","""Vel'Koz""",'Aatrox','Ahri','Akali','Alistar',...,'Xerath','Xin Zhao','Yasuo','Yorick','Zac','Zed','Ziggs','Zilean','Zoe','Zyra'
2775186012,0,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,1,0,0,0,0
2775165096,0,0,0,1,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
2774959782,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2774936273,0,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2774941224,0,0,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2774915162,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,1,0,0,0
2774402461,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2774348536,0,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2774308114,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0
2774344903,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
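
The behavior of `str.get_dummies` is easier to see on a small scale. A minimal sketch with made-up data (the match IDs and rosters here are hypothetical): unlike the per-column approach, a single row can hold several 1s.

```python
import pandas as pd

# Toy data (made up): each string holds one match's teammates, indexed by match ID
teams = pd.Series(['Zed, Urgot', 'Ahri, Zed'], index=[101, 102])

# Split each string on ', ' and build one 0/1 column per distinct name
team_dummies = teams.str.get_dummies(', ')
print(team_dummies)
```

Zed appears in both matches but still gets only one column, which is what keeps the column count down to one per Champion.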


> This is what we want, but there's some cleaning we can do. Names containing an apostrophe are wrapped in double quotes (Python's way of keeping the apostrophe from ending the string), while all other names are wrapped in single quotes. Let's strip all the outer quotes and then re-sort the columns alphabetically.
>
> Additionally, we can replace spaces, like in "Xin Zhao", with underscores to make column names more code-friendly. We should also add a prefix to identify these as teammates, to differentiate these from the columns for the player's own Champion.

In [60]:
# Just some housekeeping with column names

# Removing quotes from column names: single quotes first...
rm_quotes = lambda x: str(x).strip("'")
df_team_dummycoded = df_team_dummycoded.rename(columns = rm_quotes)

# ...then the double quotes around names that contain an apostrophe
rm_dblquotes = lambda x: str(x).strip('"')
df_team_dummycoded = df_team_dummycoded.rename(columns = rm_dblquotes)

# Sorting alphabetically
df_team_dummycoded = df_team_dummycoded.reindex(sorted(df_team_dummycoded.columns), axis=1)

# No spaces
no_spaces = lambda x: str(x).replace(" ", "_")
df_team_dummycoded = df_team_dummycoded.rename(columns = no_spaces)

# Prefix to identify that these are teammates
df_team_dummycoded.columns = ['tmate_' + str(col) for col in df_team_dummycoded.columns]
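
As a possible alternative to the string round-trip and quote clean-up, the lists can be joined with a delimiter before dummy coding, so that quotes and brackets never enter the strings. This sketch is not the method used in this notebook; it uses made-up toy data and assumes that `'|'` (the default separator of `str.get_dummies`) never occurs in a Champion name.

```python
import pandas as pd

# Toy data (made up): lists of teammate names, indexed by match ID
rosters = pd.Series([['Xin Zhao', "Kog'Maw"], ["Kog'Maw", 'Zed']], index=[101, 102])

# Join each list with '|' so no quotes or brackets ever enter the strings
alt_dummies = rosters.str.join('|').str.get_dummies(sep='|')

# Same housekeeping as in the cell above: underscores for spaces, then a prefix
alt_dummies.columns = alt_dummies.columns.str.replace(' ', '_')
alt_dummies = alt_dummies.add_prefix('tmate_')
print(alt_dummies.columns.tolist())
```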

In [62]:
df_team_dummycoded.head(3)

Unnamed: 0,tmate_Aatrox,tmate_Ahri,tmate_Akali,tmate_Alistar,tmate_Amumu,tmate_Anivia,tmate_Annie,tmate_Ashe,tmate_Aurelion_Sol,tmate_Azir,...,tmate_Xerath,tmate_Xin_Zhao,tmate_Yasuo,tmate_Yorick,tmate_Zac,tmate_Zed,tmate_Ziggs,tmate_Zilean,tmate_Zoe,tmate_Zyra
2775186012,0,0,1,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
2775165096,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2774959782,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


> We have a winner.
> 
> For every match, exactly four of these 140 columns will have a True value (1). This will tell us (and, more importantly, our analytics algorithms) which four Champions the [summoner] in question played alongside.
>
> **An example of how this information can be used in analytics:**
>
> *Maybe you, personally, prefer damage-dealing or support types because you don't have the best luck initiating fights and soaking up damage. However, it's helpful to have a tank or two on your team who can do just that. An algorithm could find that you're more likely to win if, for example* `tmate_Alistar == 1` *or* `tmate_Cho'Gath == 1`*, meaning you had some sturdy meat-shields to hide behind.*
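
A quick sanity check on the "exactly four" claim is to sum each row of the dummy-coded frame. The sketch below uses a made-up stand-in for `df_team_dummycoded` and assumes no Champion appears twice on the same team (a duplicate would collapse into a single 1).

```python
import pandas as pd

# Toy stand-in (made up) for df_team_dummycoded: two matches, five Champion columns
toy_team = pd.DataFrame(
    [[1, 1, 0, 1, 1],
     [0, 1, 1, 1, 1]],
    index=[101, 102],
    columns=['tmate_Ahri', 'tmate_Akali', 'tmate_Sona', 'tmate_Zac', 'tmate_Zed'],
)

# Count the 1s in each row; every match should have exactly four teammates
teammates_per_match = toy_team.sum(axis=1)
all_rows_ok = bool((teammates_per_match == 4).all())
print(all_rows_ok)
```

Running the same two lines against the real dataframe would flag any match where a teammate went missing during the conversion.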

In [63]:
# Save both variations of the data as CSV files
df_team.to_csv('data/teams.csv', sep = ',', index = True)
df_team_dummycoded.to_csv('data/teams_dummycoded.csv', sep = ',', index = True)

### Merging All Data Together
___
> Now that we've pulled all the data we need, it must be merged into a single dataframe. 
>
> This will be fairly easy if we use the Match ID.
>
> The dataframes that have been created are:
> * `df_match` (general match data)
> * `df_champion` (my champion, including type and mastery level)
> * `df_stats` (collection of post-game stats)
> * `df_team_dummycoded` (sparse matrix of team comp)

In [64]:
df = df_match.merge(df_champion, how = 'inner', left_on = 'match_id', right_index = True)
df = df.merge(df_stats, how = 'inner', left_on = 'match_id', right_index = True)
df = df.merge(df_team_dummycoded, how = 'inner', left_on = 'match_id', right_index = True)

> Merge the four dataframes together in order, starting with `df_match`.

> ##### About Merging Dataframes
> ___
>
> The syntax for merging dataframes is modeled on joins in relational database management systems (RDBMS) and their query language, SQL. We want to use an Inner Join method, `how = 'inner'`, which keeps only the entries that occur in both dataframes. (Since we know that all entries are present in both frames, it technically doesn't matter in this case. However, it's good form for ensuring that only complete records make it into the final dataframe.)
>
> We want to use `'match_id'` as the unique key field for the "left" (first) dataframe, so we specify `left_on = 'match_id'`.
>
> In each of the other three dataframes, the index serves as the key. Thus, `right_index = True`. Rows are combined by comparing the keys in both sources, and a row is merged only when the keys match.
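
The effect of these three arguments can be seen on a pair of toy frames. This is a minimal sketch with made-up data; the frame names `left` and `right` and the `kills` column are hypothetical.

```python
import pandas as pd

# Toy data (made up): the left frame keeps match_id as a regular column...
left = pd.DataFrame({'match_id': [101, 102, 103], 'win': [True, False, True]})
# ...while the right frame uses the match ID as its index
right = pd.DataFrame({'kills': [7, 3]}, index=[101, 102])

# Inner join: only match IDs present in both frames survive
merged = left.merge(right, how='inner', left_on='match_id', right_index=True)
print(merged)
```

Match 103 is dropped because it has no row in `right`. With `how='left'` it would be kept, with a missing value in the `kills` column.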

In [70]:
# Preview
df.head()

Unnamed: 0,match_id,duration,side,win,team_first_blood,team_first_tower,champion,champion_type,mastery_level,level,...,tmate_Xerath,tmate_Xin_Zhao,tmate_Yasuo,tmate_Yorick,tmate_Zac,tmate_Zed,tmate_Ziggs,tmate_Zilean,tmate_Zoe,tmate_Zyra
0,2775186012,00:18:12,blue,False,True,False,Xayah,Marksman,2,16,...,0,0,0,0,0,1,0,0,0,0
1,2775165096,00:18:38,red,True,True,True,Rakan,Support,4,17,...,0,0,0,0,0,0,0,0,0,0
2,2774959782,00:22:05,red,True,True,False,Zyra,Mage,2,18,...,0,0,0,0,0,0,0,0,0,0
3,2774936273,00:18:25,red,True,False,True,Maokai,Tank,3,17,...,0,0,0,0,0,0,0,0,0,0
4,2774941224,00:19:21,blue,False,False,False,Rakan,Support,4,16,...,0,0,0,0,0,0,0,0,0,0


> The dataframe is now complete, and ready to be used for analytics.

In [67]:
# Save the final df to CSV
df.to_csv('data/proxy_fox_aram_1100.csv', sep = ',', index = False)

We can also export a table containing summary statistics (average, min and max values, etc.) for all relevant variables.

In [71]:
# And output summary stats for the relevant variables
summary_stats = df.iloc[:, 1:36].describe()

In [72]:
summary_stats.to_csv('data/proxy_fox_aram_1100_summary.csv', sep = ',', index = True)
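
One detail worth knowing: when a dataframe mixes column types, `describe()` summarizes only the numeric columns by default, so text columns such as `side` do not appear in the summary table. A small sketch with made-up data (the column names mirror the preview above, but the values are hypothetical):

```python
import pandas as pd

# Toy data (made up); column names mirror a few fields from the final dataframe
toy_df = pd.DataFrame({
    'side': ['blue', 'red', 'red'],
    'mastery_level': [2, 4, 2],
    'level': [16, 17, 18],
})

# Default: only numeric columns are summarized
numeric_summary = toy_df.describe()

# include='all' also reports text columns (count, unique, top, freq)
full_summary = toy_df.describe(include='all')
print(numeric_summary)
```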

> Please see Part II, **Analytics & Predictions**, for the next steps in using this data.

<br>

___
<center><i>— End of Notebook —</i></center>
___

[Respawn at Base](#top)