# **Extracting League of Legends Match Data using Riot API - PART I, II and III**

**What motivated me to create this guide?**

I have played League since season 4 and have always been a fan of the game. Recently I started studying Data Science and the analyst sector in esports, mainly in LoL. One day, while trying to come up with a good project for my portfolio, I came across the idea of using League of Legends match data for exploratory analysis, data visualization and ML.

So why not?

Since then, I have been looking for places to find LoL match data for my project. I found some sources with data already available, but I decided to "create" my own data source to make the project more unique. After that decision, I started studying the Riot API to understand how I could use it to build my DataFrame. Once I figured out how to do this with Python + the Riot API, I immediately thought about writing this guide to help others who might be in the same situation I was.

#### **Do I need to know python?**

This guide does not require you to know everything about Python. If you are new to it or only know a little, I recommend following some free Python tutorials online and Googling the parts where you do not understand what is going on.

If you have any questions, feel free to send me a message on LinkedIn.

### **Importing the Libraries**

We will need 3 libraries throughout this guide:
- Requests: managing all our API requests
- Pandas: manipulating data
- Time: waiting when we hit the Riot API rate limit


In [2]:
import requests
import pandas as pd
import time

### **Getting your API Key**

Log in with your [Riot Account](https://developer.riotgames.com/) and accept the terms of service. At the bottom of the page, check "I'm not a robot" and regenerate your API key.

Paste your API key below:

In [36]:
api_key = 'RGAPI-4e950325-1e13-428c-8460-37d232e75654'
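
Hard-coding the key works for a quick test, but it is easy to leak it when sharing the notebook. As a minimal sketch of an alternative, the key can be read from an environment variable. The variable name `RIOT_API_KEY` and the helper `load_api_key` are assumptions made for this example, not something the Riot API requires.

```python
import os


def load_api_key(env_var="RIOT_API_KEY"):
    """Return the API key stored in an environment variable.

    `RIOT_API_KEY` is a hypothetical name chosen for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {} environment variable to your Riot API key".format(env_var)
        )
    return key
```

With the variable set, `api_key = load_api_key()` could replace the hard-coded assignment above.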

### **Part I: [LEAGUE-V4 Endpoint](https://developer.riotgames.com/apis#league-v4)**

In this part we will use the LEAGUE-V4 endpoint to get summoner information for players ranked Master or higher.

First, let's make our first API call. [Click here](https://developer.riotgames.com/apis#league-v4) to go to the LEAGUE-V4 endpoint.

Now click on the endpoint "/lol/league/v4/challengerleagues/by-queue/{queue}", scroll to the bottom of the page and select RANKED_SOLO_5x5. Then click "EXECUTE REQUEST".

If everything is correct you should see a 200 response code. Below it, in the Response Body, you should see information about, in this case, Challenger players.

Now scroll back up to where you can see the request URL. This is the URL we are going to use to make the request, so copy it and paste it into the code below.

**IMPORTANT:** The API key is sent as a query parameter, so we need to add "?" at the end of the URL, followed by the parameter name and value (`api_key=...`), as in the code below.
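
To make the role of "?" concrete, here is a small sketch that builds the same kind of URL with `urllib.parse.urlencode` from the standard library. The helper name `with_query` and the placeholder key are assumptions for illustration only.

```python
from urllib.parse import urlencode


def with_query(base_url, **params):
    """Append query parameters to a URL that has no query string yet."""
    # urlencode turns {'api_key': 'abc'} into 'api_key=abc' and escapes
    # any special characters in the values.
    return base_url + "?" + urlencode(params)


base = "https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5"
example_url = with_query(base, api_key="RGAPI-your-key-here")  # placeholder key
print(example_url)
```

`requests.get` also accepts a `params` dictionary and builds the query string itself, so `requests.get(base, params={'api_key': api_key})` should produce an equivalent request.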

In [46]:
api_ulr = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5' + '?api_key='+ api_key

resp = requests.get(api_ulr)
resp

<Response [200]>

We are getting a 200 response code, which means the request was successful. To see the data returned by this request we need to call the `json` method.

In [None]:
api_ulr = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5' + '?api_key='+ api_key

resp = requests.get(api_ulr)
resp.json()

The important information is stored under the `entries` key, so if we pass it between `[]` we can access only that part of the data.

In [None]:
resp.json()['entries']
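
To make the structure concrete, the sketch below uses a trimmed, made-up payload shaped like the response: a dictionary with some league-level fields plus an `entries` list holding one dictionary per player. All values are invented, only a few fields are shown, and field names other than `entries` and `summonerId` are assumptions for this example.

```python
# Made-up sample shaped like the challenger league response
sample_response = {
    "tier": "CHALLENGER",
    "queue": "RANKED_SOLO_5x5",
    "entries": [
        {"summonerId": "id-aaa", "leaguePoints": 1200, "wins": 150, "losses": 100},
        {"summonerId": "id-bbb", "leaguePoints": 1100, "wins": 140, "losses": 110},
    ],
}

entries = sample_response["entries"]               # list with one dict per player
summoner_ids = [e["summonerId"] for e in entries]  # pull one field out of each dict
print(len(entries), summoner_ids)
```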

Now, to turn this information into a DataFrame, we can apply `pd.json_normalize` to the JSON data returned by the GET request, which flattens it into a table.

In [None]:
api_ulr = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5' + '?api_key='+ api_key
 
resp = requests.get(api_ulr)
challenger_summoners = pd.json_normalize(resp.json()['entries'])
challenger_summoners.head()

Now that we know the basics of sending a request, reading the response data and transforming it into a DataFrame, we can put it all together in a function that returns the information for all players from Master to Challenger in Brazil, using the [League V4 Endpoint](https://developer.riotgames.com/apis#league-v4/GET_getMasterLeague).

For this we will create a function that takes the api_key as a parameter.

Learn more about functions [here](https://www.programiz.com/python-programming/function)

In [6]:
def get_summonerId(api_key):
    challenger_url = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5?api_key={}'.format(api_key) # Challenger endpoint
    grandmaster_url = 'https://br1.api.riotgames.com/lol/league/v4/grandmasterleagues/by-queue/RANKED_SOLO_5x5?api_key={}'.format(api_key) # Grandmaster endpoint
    master_url = 'https://br1.api.riotgames.com/lol/league/v4/masterleagues/by-queue/RANKED_SOLO_5x5?api_key={}'.format(api_key) # Master endpoint

    league_url_lists = [challenger_url, grandmaster_url, master_url] # A list with the 3 endpoints (challenger, grandmaster and master)

    df_list = [] # A list that will store one DataFrame (league_df) per endpoint

    for url in league_url_lists: # Iterating over league_url_lists
        resp = requests.get(url) # Sending a GET request to the url
        resp.raise_for_status() # Stop early on errors such as an expired API key (403)
        league_df = pd.json_normalize(resp.json()['entries']) # Flattening the 'entries' list into a DataFrame
        df_list.append(league_df)
    
    final_league_df = pd.concat(df_list) # Concatenating the DataFrames stored in df_list to create the final_league_df

    return final_league_df
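
The three URLs in `get_summonerId` differ only in the league name, so they could also be generated from one template. The following is a sketch under that assumption; the helper `build_league_urls` and its `region` and `queue` parameters are hypothetical names introduced here, not part of the original function.

```python
LEAGUE_URL = (
    "https://{region}.api.riotgames.com/lol/league/v4/"
    "{league}leagues/by-queue/{queue}?api_key={api_key}"
)


def build_league_urls(api_key, region="br1", queue="RANKED_SOLO_5x5"):
    """Return the Challenger, Grandmaster and Master URLs, in that order."""
    leagues = ["challenger", "grandmaster", "master"]
    return [
        LEAGUE_URL.format(region=region, league=league, queue=queue, api_key=api_key)
        for league in leagues
    ]


for url in build_league_urls("RGAPI-your-key-here"):  # placeholder key
    print(url)
```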

If we execute the function and assign the result to a variable, we can see the results:

In [None]:
league_df = get_summonerId(api_key)
league_df.head()

Now that we have this information we can head to the Second part of this guide.

### **PART II: [SUMMONER-V4 Endpoint](https://developer.riotgames.com/apis#summoner-v4/GET_getBySummonerId)**

In this part we will use the `summonerId` column from the `league_df` that we have just created to retrieve the `puuid` and other information for each player, using the [SUMMONER-V4 Endpoint](https://developer.riotgames.com/apis#summoner-v4/GET_getBySummonerId).


Let's understand how this endpoint works. [Click here](https://developer.riotgames.com/apis#summoner-v4) to go to the SUMMONER-V4 endpoint.

Click on the endpoint "/lol/summoner/v4/summoners/{encryptedSummonerId}", scroll down and enter one summonerId as the encryptedSummonerId value. Then click "EXECUTE REQUEST".

As we can see, the Response Body contains the summoner's information, including the puuid, which is what we want.

Once again, copy the request URL into the code below.

The process is very similar to our first API call, with one slight difference: we won't need `['entries']` to get the data, since the response is just a single dictionary.

#### **Testing with one summonerId**

In [10]:
api_ulr = 'https://br1.api.riotgames.com/lol/summoner/v4/summoners/mx0nf-5HtF7GNZN7u_brogQVWKxSeabEQY4JNtfeiS9lmdQ' + '?api_key='+ api_key
 
resp = requests.get(api_ulr)
summoner_info = pd.json_normalize(resp.json())
summoner_info.head()

| | id | accountId | puuid | name | profileIconId | revisionDate | summonerLevel |
|---|---|---|---|---|---|---|---|
| 0 | mx0nf-5HtF7GNZN7u_brogQVWKxSeabEQY4JNtfeiS9lmdQ | cr94iSRf_z9J4NzALZ335TO7Q2GMWMrO8Gu0RVw95ttdJ6... | -00t9W6o2B8dQYAEBHJoGifyJhObO-Kg14aTIOFMlQk9Xi... | Griba | 7 | 1666128939575 | 244 |


Now that we are able to retrieve the summoner_info for 1 player, we need to repeat this process for each summonerId in the league_df.

To do that we will create another function that receives a DataFrame containing a column named "summonerId" (league_df) and the api_key. In this function we will use a for loop to iterate over each summonerId, repeating the process we have just done above.

#### **Creating the function**

In [5]:
def get_summonerinfo_by_summonerId(df, api_key):
    
    summoner_list = [] # list that will store each summoner_info DataFrame

    for i in range(len(df)): # len() gets the number of rows of df. Then range() creates a sequence of numbers from 0 up to (not including) that number.
                            # With that we can iterate through each row of df (our function parameter), using iloc to get the summonerId.

        # Creating a dynamic api_url for each record in the summonerId column using iloc
        api_url = 'https://br1.api.riotgames.com/lol/summoner/v4/summoners/{}?api_key={}'.format(df['summonerId'].iloc[i], api_key)

        # Repeating the process we did at the beginning
        resp = requests.get(api_url)
        print(resp.status_code)

        # while loop to deal with the Riot API rate limit (429 = rate limit exceeded)
        while resp.status_code == 429:
            print('429: rate limit exceeded, waiting 10 seconds before retrying')
            time.sleep(10)

            # Trying again
            resp = requests.get(api_url)
            print(resp.status_code)

        if resp.status_code != 200: # Any other error (e.g. 403 or 404): skip this summonerId instead of storing an error payload
            continue

        summoner_info = pd.json_normalize(resp.json()) # The same process we did at the beginning to create the DataFrame
        summoner_list.append(summoner_info) # Appending the DataFrame generated (summoner_info) to the list "summoner_list"
        
    final_summoner_info = pd.concat(summoner_list) # Using pd.concat to concatenate the results stored in the list "summoner_list"
            
    return final_summoner_info
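
The retry logic above could also be factored into a small helper. The sketch below is one possible version, not the approach used in the function: it retries only on 429, gives up after a fixed number of attempts, and waits for the number of seconds given in the `Retry-After` response header when the response includes one (falling back to 10 seconds otherwise). The names `get_with_retry`, `fetch`, `max_retries` and `default_wait` are assumptions made for this example; `fetch` is expected to behave like `requests.get`.

```python
import time


def get_with_retry(fetch, url, max_retries=5, default_wait=10, sleep=time.sleep):
    """Call fetch(url), retrying while the status code is 429.

    fetch: a callable such as requests.get that returns an object with
    `status_code` and `headers` attributes.
    """
    resp = fetch(url)
    retries = 0
    while resp.status_code == 429 and retries < max_retries:
        # Use the server's hint when available, else the default wait.
        wait = int(resp.headers.get("Retry-After", default_wait))
        sleep(wait)
        resp = fetch(url)
        retries += 1
    return resp  # the caller still has to check resp.status_code
```

Inside the loop, `resp = get_with_retry(requests.get, api_url)` would then take the place of the request and the retry block.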
    

If we execute the function and assign the result to a variable, we can see the results.

Be careful: this code took about 1 hour and 30 minutes to run on my computer, so you might want to save the result for later use.

In [None]:
summoner_df = get_summonerinfo_by_summonerId(league_df, api_key)

Resetting the index

In [None]:
summoner_df.reset_index(drop=True, inplace=True) # Resetting the index and discarding the old one
summoner_df.head()
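
The index needs resetting because `pd.concat` keeps each piece's own index, so every one-row DataFrame contributes a row labelled 0. As a small sketch with made-up data, passing `ignore_index=True` to `pd.concat` produces a fresh index directly, which would make the separate reset step unnecessary.

```python
import pandas as pd

# Two one-row DataFrames with invented values, like the per-summoner results
part_a = pd.DataFrame({"name": ["PlayerA"], "summonerLevel": [100]})
part_b = pd.DataFrame({"name": ["PlayerB"], "summonerLevel": [200]})

kept = pd.concat([part_a, part_b])                      # each piece keeps its own index
fresh = pd.concat([part_a, part_b], ignore_index=True)  # index is rebuilt from 0
print(list(kept.index), list(fresh.index))
```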

Saving the summoner_df

In [None]:
summoner_df.to_csv('summoner_df.csv', index = False) 

### **PART III: [MATCH-V5 Endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch)**

In this part we will use the `puuid` column from the `summoner_df`, that we've just created, to retrieve the last 20 matchIds for each `puuid`, using the [match V5 Endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch).

So first let's understand how this endpoint works. [Click here](https://developer.riotgames.com/apis#match-v5) to go to the MATCH-V5 endpoint.

Click on the endpoint "/lol/match/v5/matches/by-puuid/{puuid}/ids", scroll down and, in the path parameters, insert this puuid: "907TX3ji-JNb3NFS2m0hq0SBc4t0MFIuR0zG_nN-MAe1KY415e8GNgslVgdaBMXvfmFzLh5m7SvsiA". Now, in the query parameters, insert '420' as the queue and select ranked for the type. After that, click "EXECUTE REQUEST".

As we can see, the Response Body contains a list of the last 20 match IDs for the player, so everything is working.

Once again, copy the request URL and paste it into the code below.
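
Since this endpoint takes several query parameters, the URL can also be assembled from a dictionary instead of being typed by hand. The sketch below assumes the same parameters used in this guide (`queue`, `type`, `start`, `count`); the helper name `build_matchids_url` and its `routing` parameter are hypothetical. Note that the MATCH-V5 URLs in this guide use the `americas` host rather than the `br1` host used in Parts I and II.

```python
from urllib.parse import urlencode


def build_matchids_url(puuid, api_key, routing="americas", queue=420,
                       match_type="ranked", start=0, count=20):
    """Build the match-ID list URL for one puuid."""
    base = "https://{}.api.riotgames.com/lol/match/v5/matches/by-puuid/{}/ids".format(
        routing, puuid
    )
    params = {
        "queue": queue,       # 420 = 5v5 Ranked Solo
        "type": match_type,
        "start": start,       # offset into the match history
        "count": count,       # number of match IDs to return
        "api_key": api_key,
    }
    return base + "?" + urlencode(params)


print(build_matchids_url("some-puuid", "RGAPI-your-key-here"))  # placeholder values
```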

#### **Testing with one puuid**

In [None]:
# & instead of ?, because there's already a ? in the original url
api_ulr = 'https://americas.api.riotgames.com/lol/match/v5/matches/by-puuid/907TX3ji-JNb3NFS2m0hq0SBc4t0MFIuR0zG_nN-MAe1KY415e8GNgslVgdaBMXvfmFzLh5m7SvsiA/ids?queue=420&type=ranked&start=0&count=20' + '&api_key=' + api_key
 
resp = requests.get(api_ulr)
resp.json()

We can see that the data returned by `resp.json()` is a list of matchIds. To transform it into a DataFrame we can use `pd.DataFrame()`, passing `resp.json()` as the data and 'matchid' as the column name.

In [None]:
api_ulr = 'https://americas.api.riotgames.com/lol/match/v5/matches/by-puuid/907TX3ji-JNb3NFS2m0hq0SBc4t0MFIuR0zG_nN-MAe1KY415e8GNgslVgdaBMXvfmFzLh5m7SvsiA/ids?queue=420&type=ranked&start=0&count=20' + '&api_key=' + api_key
 
resp = requests.get(api_ulr)
matchids = pd.DataFrame(resp.json(), columns = ['matchid'])
matchids.head()

Now that we are able to retrieve the last 20 5v5 Ranked Solo games for 1 player, we need to repeat this process for each puuid in the summoner_df.

To do that we will create a function that receives a DataFrame containing a column named "puuid" (summoner_df) and the api_key. In this function we will use a for loop to iterate over each puuid, repeating the process we have just done above.

#### **Creating the function**

In [14]:
def get_matchids_by_puuid(df, api_key):

    matchids_list = [] # list that will store each matchids DataFrame

    for i in range(len(df)):

        # Creating a dynamic api_url for each record in the puuid column using iloc
        api_url = 'https://americas.api.riotgames.com/lol/match/v5/matches/by-puuid/{}/ids?queue=420&type=ranked&start=0&count=20&api_key={}'.format(df['puuid'].iloc[i], api_key)

        resp = requests.get(api_url) # Sending a GET request to the api_url
        print(resp.status_code)

        # while loop to deal with the Riot API rate limit (429 = rate limit exceeded)
        while resp.status_code == 429:
            print('429: rate limit exceeded, waiting 10 seconds before retrying')
            time.sleep(10)

            # Trying again
            resp = requests.get(api_url)
            print(resp.status_code)

        if resp.status_code != 200: # Any other error: skip this puuid instead of building a DataFrame from an error payload
            continue

        matchids = pd.DataFrame(resp.json(), columns = ['matchid']) 
        matchids_list.append(matchids) # Appending the DataFrame generated to the list "matchids_list"
    
    matchids_df = pd.concat(matchids_list) # Using pd.concat to concatenate the results stored in the "matchids_list"
    
    return matchids_df

#### **Executing the `get_matchids_by_puuid` function**

Be careful: if you try to execute the function on the entire summoner_df it will take about 1 hour and 30 minutes, and it can fail many times before it runs successfully. To avoid that, I split the summoner_df in half, ran the function once per half and then concatenated both results into a single DataFrame. Each run takes about 45 minutes to complete, so you can do something else while you wait. After that, save the result for later use.

In [22]:
print('summoner_df has {} rows'.format(summoner_df.shape[0]))

summoner_df has 4399 rows


In [None]:
matchids_df = get_matchids_by_puuid(summoner_df.loc[:2200], api_key)

In [None]:
matchids_df_1 = get_matchids_by_puuid(summoner_df.loc[2201:], api_key)
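
The same idea generalizes to any number of pieces. As a sketch, the hypothetical helper below yields `(start, stop)` row positions for fixed-size chunks, which could be used with `iloc` so that each chunk is fetched and saved separately. The chunk size and the file name pattern in the note after the code are assumptions.

```python
def chunk_bounds(n_rows, chunk_size):
    """Yield (start, stop) positions covering range(n_rows) in chunks."""
    for start in range(0, n_rows, chunk_size):
        yield start, min(start + chunk_size, n_rows)


# With the 4399 rows of summoner_df and chunks of 2200 rows:
print(list(chunk_bounds(4399, 2200)))
```

Each chunk could then be processed with `get_matchids_by_puuid(summoner_df.iloc[start:stop], api_key)` and written to its own CSV (for example `matchids_{start}.csv`), so that a failure only costs one chunk.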

In [None]:
matchids_df_list = [matchids_df, matchids_df_1]

final_matchids_df= pd.concat(matchids_df_list)
final_matchids_df

Resetting the index

In [None]:
final_matchids_df.reset_index(drop=True, inplace=True) # Resetting the index and discarding the old one
final_matchids_df.head()

Dropping duplicates, since we only want unique matchids

In [43]:
final_matchids_df_without_duplicates = final_matchids_df.drop_duplicates()

In [44]:
matchids_count_before = final_matchids_df.shape[0]
matchids_count_after = final_matchids_df_without_duplicates.shape[0]
print('Before we had {} matchids, after we have {} matchids'.format(matchids_count_before, matchids_count_after))

Before we had 87980 matchids, after we have 35537 matchids


Saving the `final_matchids_df_without_duplicates`

In [45]:
final_matchids_df_without_duplicates.to_csv('matchids_df.csv', index = False)