# **Extracting League of Legends Match Data using Riot API**

#### **What motivated me to create this guide?**

I have played League since season 4 and have always been a fan of the game. Recently I started studying Data Science and the analyst side of esports, mainly in LoL. One day, while trying to come up with a good project for my portfolio, I came across the idea of using League of Legends match data for exploratory analysis, data visualization and ML.

So why not?

Since then, I have been looking for places to find LoL match data for my project. I found some existing data sources, but I decided to "create" my own so that my project would be more unique. After that decision, I started studying the Riot API to understand how I could use it to build my DataFrame. Once I figured out how to do this with Python and the Riot API, I immediately thought about writing this guide to help others who might be in the same situation I was.

#### **Do I need to know python?**

This guide does not require you to know everything about Python. But if you do not know it, or only know a little, I recommend looking for free Python tutorials online and Googling the parts where you do not understand what is going on.

If you have any questions, feel free to send me a message on LinkedIn.

### **Importing Libraries**

We will need 3 libraries throughout this guide:
- requests: managing all our API requests
- pandas: manipulating data
- time: waiting when we hit the Riot API rate limit


In [3]:
import requests
import pandas as pd
import time
from warnings import simplefilter
simplefilter(action="ignore", category=pd.errors.PerformanceWarning)

### **Getting your API Key**

Log in with your [Riot Account](https://developer.riotgames.com/) and accept the terms of service. At the bottom, click "I'm not a Robot" and regenerate your API key.

Paste your API key below

In [9]:
api_key = 'RGAPI-4e5ff2f3-9da2-488b-89e4-fb3227803e53'
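Hard-coding the key in a notebook makes it easy to publish it by accident. A minimal sketch of an alternative, assuming the key has been stored in an environment variable beforehand (the variable name `RIOT_API_KEY` and the helper `load_api_key` are hypothetical choices, not something the Riot API requires):

```python
import os


def load_api_key(env_var="RIOT_API_KEY"):
    """Read the Riot API key from an environment variable (hypothetical name)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError("Set the {} environment variable to your Riot API key".format(env_var))
    return key


# Usage, once the variable is set in your shell or notebook environment:
# api_key = load_api_key()
```

The rest of the guide works the same way, since it only needs `api_key` to be a string.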

### **PART I: [LEAGUE-V4 Endpoint](https://developer.riotgames.com/apis#league-v4)**

In this part we will use the League-V4 Endpoint to get summoner information for players ranked Master or above.

First, let's make our first API call. [Click here](https://developer.riotgames.com/apis#league-v4) to go to the League-V4 ENDPOINT.

Now click on the ENDPOINT "/lol/league/v4/challengerleagues/by-queue/{queue}", scroll to the bottom of the page and select RANKED_SOLO_5x5. Then, click "EXECUTE REQUEST".

If all is correct you should see a 200 response code. Below it, in the Response Body, you should see, in this case, information about Challenger players.

Now go back to the beginning, where you can see the request URL. This is the URL we are going to use to make the request, so copy and paste it into the code below.


**IMPORTANT:** The API key is sent as a query parameter, so we add "?" to the end of the URL followed by the parameter name and value (`api_key=...`), as in the code below.

In [46]:
api_ulr = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5' + '?api_key='+ api_key

resp = requests.get(api_ulr)
resp

<Response [200]>

We are getting a 200 response code, which means the request was successful. But to see the data from this request we need to use the `json` method.

In [None]:
api_ulr = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5' + '?api_key='+ api_key

resp = requests.get(api_ulr)
resp.json()

The important information is stored under the `entries` key, so if we pass it between [] we can access only this part of the data.

In [None]:
resp.json()['entries']

Now, to turn this information into a DataFrame, we can apply `pd.json_normalize` to the JSON data returned by the GET request, which flattens it into a table.

In [53]:
api_ulr = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5' + '?api_key='+ api_key
 
resp = requests.get(api_ulr)
challenger_summoners = pd.json_normalize(resp.json()['entries'])
challenger_summoners.head(3)

Unnamed: 0,summonerId,summonerName,leaguePoints,rank,wins,losses,veteran,inactive,freshBlood,hotStreak
0,cJS17ysl3L7wK6rvTybv93LQQIFfNXmzJTHyn6U7y44g_w,Atrasia,996,I,249,168,False,False,True,False
1,Qxhotb8LesY9wNI7iU2UlQQHMV7wLbODaFFWMvDWgjiluQ,Youngatti,1161,I,408,353,True,False,False,False
2,-IaH1nm8elIFihOFSYPAowajun_zfpuU1pTudYfYv-pHYQ,ELOHIGH PUMITA,1004,I,193,147,False,False,False,False
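To see what `pd.json_normalize` does without spending an API call, here is a small sketch on a hand-written payload. The two entries are made up; only the overall shape (a dictionary with an `entries` list) follows the response shown above.

```python
import pandas as pd

# Made-up payload that imitates the shape of the LEAGUE-V4 response
sample_response = {
    "entries": [
        {"summonerName": "Player A", "leaguePoints": 996, "wins": 249, "losses": 168},
        {"summonerName": "Player B", "leaguePoints": 1161, "wins": 408, "losses": 353},
    ],
}

# Each dictionary in the list becomes one row, each key becomes one column
sample_df = pd.json_normalize(sample_response["entries"])

# A derived column, just to show the result behaves like any other DataFrame
sample_df["games"] = sample_df["wins"] + sample_df["losses"]
print(sample_df)
```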


Now that we know the basics of sending a request, reading the response data and transforming it into a DataFrame, we can combine everything into a function that returns the information of all players from Master to Challenger in Brazil, using the [League V4 Endpoint](https://developer.riotgames.com/apis#league-v4/GET_getMasterLeague).

For this we will create a function that takes the api_key as a parameter.

Learn more about functions [here](https://www.programiz.com/python-programming/function).

In [55]:
def get_summonerId(api_key):
    challenger_url = 'https://br1.api.riotgames.com/lol/league/v4/challengerleagues/by-queue/RANKED_SOLO_5x5?api_key={}'.format(api_key) # Challenger ENDPOINT
    grandmaster_url = 'https://br1.api.riotgames.com/lol/league/v4/grandmasterleagues/by-queue/RANKED_SOLO_5x5?api_key={}'.format(api_key) # Grandmaster ENDPOINT
    master_url = 'https://br1.api.riotgames.com/lol/league/v4/masterleagues/by-queue/RANKED_SOLO_5x5?api_key={}'.format(api_key) # Master ENDPOINT

    league_url_lists = [challenger_url, grandmaster_url, master_url] # Creating a list with the 3 endpoints (master, grandmaster and challenger)

    df_list = [] # A list that will store the DataFrame built from each response (league_df)

    for url in league_url_lists: # Looping through league_url_lists
        resp = requests.get(url) # Sending a get request to the url
        league_df = pd.json_normalize(resp.json()['entries']) # Flattening the 'entries' list into a DataFrame
        df_list.append(league_df)
    
    final_league_df = pd.concat(df_list) # Concatenating the DataFrames stored in df_list to create the final_league_df

    return final_league_df

If we execute the function and assign the result to a variable, we can see the results.

In [62]:
league_df = get_summonerId(api_key)
print('league_df has information about {} players between master and challenger elo'.format(league_df.shape[0]))
league_df.head(3)

league_df has information about 4574 players between master and challenger elo


Unnamed: 0,summonerId,summonerName,leaguePoints,rank,wins,losses,veteran,inactive,freshBlood,hotStreak
0,h4bbmLl0hgcuN6OX4_xI2JVCkiCG14ONmR0MMQ3VtvLJ,SSG Corëjj,1192,I,316,257,True,False,False,False
1,TwGS6c2Z3vsbdmmHOGH4LHgTOsbw1rZSboYSQ0qOPaUp9v...,não sou o sup,1440,I,181,120,True,False,False,False
2,89g5w4-d6pXyPob_6PiAgJGlC2arZ-b5rr_VUnLlrWxcGQ,Kick a boo,1265,I,359,294,True,False,False,False
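One thing `league_df` does not record is which of the three endpoints each row came from, and `pd.concat` keeps the original row labels, so the index restarts at 0 for every tier. A possible variation, sketched on tiny made-up DataFrames instead of real responses (the `tier` column and the `combine_leagues` helper are hypothetical additions, not part of the function above):

```python
import pandas as pd


def combine_leagues(frames_by_tier):
    """Concatenate one DataFrame per tier, tagging each row with its tier."""
    tagged = []
    for tier, frame in frames_by_tier.items():
        tagged.append(frame.assign(tier=tier))  # assign() returns a copy with the new column
    # ignore_index=True gives the result a fresh 0..n-1 index
    return pd.concat(tagged, ignore_index=True)


# Made-up stand-ins for the three DataFrames built inside get_summonerId
frames = {
    "CHALLENGER": pd.DataFrame({"summonerId": ["a", "b"], "leaguePoints": [1200, 1100]}),
    "GRANDMASTER": pd.DataFrame({"summonerId": ["c"], "leaguePoints": [700]}),
    "MASTER": pd.DataFrame({"summonerId": ["d", "e"], "leaguePoints": [300, 10]}),
}

combined = combine_leagues(frames)
print(combined)
```

With a fresh index there is also no need for a separate index reset after concatenating.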


Now that we have this information we can head to the second part of this guide.

### **PART II: [SUMMONER-V4 Endpoint](https://developer.riotgames.com/apis#summoner-v4/GET_getBySummonerId)**

In this part we will use the `summonerId` column from the `league_df` that we have just created to retrieve the `puuid` and other information for each player, using the [SUMMONER-V4 Endpoint](https://developer.riotgames.com/apis#summoner-v4/GET_getBySummonerId).


Let's understand how this ENDPOINT works. [Click here](https://developer.riotgames.com/apis#summoner-v4) to go to the Summoner-V4 Endpoint.

Click on the ENDPOINT "/lol/summoner/v4/summoners/{encryptedSummonerId}", scroll down and enter one summonerId as the encryptedSummonerId value. Then click "EXECUTE REQUEST".

As we can see, the Response Body contains the summoner's information, including the `puuid`, which is what we want.

Once again, copy the request URL into the code below.

The process is very similar to our first API call, with one slight difference: we do not need `['entries']` to get the data, since the response is a single dictionary.

#### **Testing with one summonerId**

In [10]:
api_ulr = 'https://br1.api.riotgames.com/lol/summoner/v4/summoners/mx0nf-5HtF7GNZN7u_brogQVWKxSeabEQY4JNtfeiS9lmdQ' + '?api_key='+ api_key
 
resp = requests.get(api_ulr)
summoner_info = pd.json_normalize(resp.json())
summoner_info.head()

Unnamed: 0,id,accountId,puuid,name,profileIconId,revisionDate,summonerLevel
0,mx0nf-5HtF7GNZN7u_brogQVWKxSeabEQY4JNtfeiS9lmdQ,cr94iSRf_z9J4NzALZ335TO7Q2GMWMrO8Gu0RVw95ttdJ6...,-00t9W6o2B8dQYAEBHJoGifyJhObO-Kg14aTIOFMlQk9Xi...,Griba,7,1666128939575,244


Now that we were able to retrieve the summoner_info for 1 player, we need to repeat this process for each summonerId in the league_df.

To do that we will create another function that receives a DataFrame containing a column named "summonerId" (league_df) and the api_key. In this function we will use a for loop to iterate over each summonerId, repeating the process we have just done above.

#### **Creating the function**

In [5]:
def get_summonerinfo_by_summonerId(df, api_key):
    
    summoner_list = [] # List that will store each summoner_info DataFrame

    for i in range(len(df)): # len() gets the number of rows of df. Then range() creates a sequence of numbers from 0 up to (but not including) that number.
                            # With that we can iterate through each row of df (our function parameter), using iloc to get the summonerId.

        # Creating a dynamic api_url for each entry in the summonerId column using iloc
        api_url = 'https://br1.api.riotgames.com/lol/summoner/v4/summoners/{}?api_key={}'.format(df['summonerId'].iloc[i], api_key)

        # Repeating the process we did at the beginning
        resp = requests.get(api_url)
        print(resp.status_code)

        # While loop to deal with the Riot API rate limit: a 429 status code means we exceeded it
        while resp.status_code == 429:
            print('429: waiting 10 seconds before trying again')
            time.sleep(10)

            # Trying again
            resp = requests.get(api_url)
            print(resp.status_code)

        summoner_info = pd.json_normalize(resp.json()) # The same process we did at the beginning to create the DataFrame
        summoner_list.append(summoner_info) # Appending the DataFrame generated (summoner_info) to the list "summoner_list"
        
    final_summoner_info = pd.concat(summoner_list) # Using pd.concat to concatenate the results stored in the list "summoner_list"
            
    return final_summoner_info
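The wait-and-retry logic is the same for every endpoint in this guide, so it can live in one helper. The sketch below is a variation, not the notebook's own code: it takes any object with a `.get(url)` method (for example a `requests.Session()`), uses the `Retry-After` response header when one is present (assuming it is given in seconds), and gives up after a fixed number of attempts. The names `get_with_retry`, `max_retries` and `default_wait` are hypothetical.

```python
import time


def get_with_retry(session, url, max_retries=5, default_wait=10, sleep=time.sleep):
    """GET `url`, waiting and retrying while the server answers 429."""
    resp = session.get(url)
    retries = 0
    while resp.status_code == 429 and retries < max_retries:
        # Use the server's suggested wait if present, otherwise fall back to default_wait
        wait = int(resp.headers.get("Retry-After", default_wait))
        print("429 received, waiting {} seconds".format(wait))
        sleep(wait)
        resp = session.get(url)
        retries += 1
    return resp


# Usage with requests (not executed here, it would call the API):
# import requests
# session = requests.Session()
# resp = get_with_retry(session, api_url)
```

Capping the number of retries means a request that never recovers returns its last response instead of blocking the loop forever; the caller can then check `resp.status_code` before using the data.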
    

#### **Executing the `get_summonerinfo_by_summonerId` function**

If we execute the function and assign the result to a variable, we can see the results.

Be careful: the code took about 1 hour and 30 minutes to run on my computer, so you might want to save the result for later use.

In [None]:
summoner_df = get_summonerinfo_by_summonerId(league_df, api_key)

Resetting the index

In [None]:
summoner_df.reset_index(inplace = True) # Resetting the index
summoner_df.drop(columns='index', inplace = True) # Dropping the old index column
summoner_df.head()

Saving the summoner_df

In [None]:
summoner_df.to_csv('summoner_df.csv', index = False) 

### **PART III: [MATCH-V5 Endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch)**

In this part we will use the `puuid` column from the `summoner_df`, that we've just created, to retrieve the last 20 matchIds for each `puuid`, using the [MATCH-V5 Endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch).

So first let's understand how this ENDPOINT works. [Click here](https://developer.riotgames.com/apis#match-v5) to go to the Match-V5 Endpoint.

Click on the ENDPOINT "/lol/match/v5/matches/by-puuid/{puuid}/ids", scroll down and, in the path parameters, insert this puuid: "907TX3ji-JNb3NFS2m0hq0SBc4t0MFIuR0zG_nN-MAe1KY415e8GNgslVgdaBMXvfmFzLh5m7SvsiA". Then, in the query parameters, insert '420' as the queue (you can check the queue IDs [here](https://github.com/Lacerdash/Extracting-League-of-Legends-data-with-Riot-Api/blob/main/Files/queues%20Id.xlsx) or [here](https://static.developer.riotgames.com/docs/lol/queues.json)) and select ranked as the type. After that, click "EXECUTE REQUEST".

As we can see, the Response Body contains a list of the last 20 match IDs for the player, so everything is working.

Once again, copy the request URL into the code below.

#### **Testing with one puuid**

In [None]:
# & instead of ?, because there's already a ? in the original url
api_ulr = 'https://americas.api.riotgames.com/lol/match/v5/matches/by-puuid/907TX3ji-JNb3NFS2m0hq0SBc4t0MFIuR0zG_nN-MAe1KY415e8GNgslVgdaBMXvfmFzLh5m7SvsiA/ids?queue=420&type=ranked&start=0&count=20' + '&api_key=' + api_key
 
resp = requests.get(api_ulr)
resp.json()

We can see that the data returned by `resp.json()` is a list of matchIds. To transform it into a DataFrame we can use `pd.DataFrame()`, passing `resp.json()` as the data and 'matchid' as the column name.

In [52]:
api_ulr = 'https://americas.api.riotgames.com/lol/match/v5/matches/by-puuid/907TX3ji-JNb3NFS2m0hq0SBc4t0MFIuR0zG_nN-MAe1KY415e8GNgslVgdaBMXvfmFzLh5m7SvsiA/ids?queue=420&type=ranked&start=0&count=20' + '&api_key=' + api_key
 
resp = requests.get(api_ulr)
matchids = pd.DataFrame(resp.json(), columns = ['matchid'])
matchids.head(3)

Unnamed: 0,matchid
0,BR1_2614740582
1,BR1_2614698761
2,BR1_2613549556
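The part of the URL after `?` in the request above is the query string: `name=value` pairs joined by `&`. Instead of concatenating it by hand, it can be assembled from a dictionary. A small sketch using only the standard library (the helper name `build_match_ids_url` and the placeholder values are hypothetical; the parameters are the ones used in the request above):

```python
from urllib.parse import urlencode


def build_match_ids_url(puuid, api_key, queue=420, start=0, count=20):
    """Build the by-puuid match-ids URL used above from its individual parts."""
    base = "https://americas.api.riotgames.com/lol/match/v5/matches/by-puuid/{}/ids".format(puuid)
    params = {"queue": queue, "type": "ranked", "start": start, "count": count, "api_key": api_key}
    # urlencode joins the pairs with "&" and escapes special characters
    return base + "?" + urlencode(params)


print(build_match_ids_url("EXAMPLE_PUUID", "RGAPI-example"))
```

`requests.get` also accepts such a dictionary directly through its `params` argument, which builds the query string for you.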


Now that we were able to retrieve the last 20 5v5 Ranked Solo games for 1 player, we need to be able to repeat this process for each puuid in the summoner_df.

To do that we will create a function that receives a DataFrame containing a column named "puuid" (summoner_df) and the api_key. In this function we will use a for loop to iterate over each puuid, repeating the process we have just done above.

#### **Creating the function**

In [14]:
def get_matchids_by_puuid(df, api_key):

    matchids_list = [] # List that will store the matchids DataFrame of each player

    for i in range(len(df)):

        api_url = 'https://americas.api.riotgames.com/lol/match/v5/matches/by-puuid/{}/ids?queue=420&type=ranked&start=0&count=20&api_key={}'.format(df['puuid'].iloc[i], api_key)

        resp = requests.get(api_url) # Sending a get request to the api_url
        print(resp.status_code)

        # A 429 status code means we exceeded the Riot API rate limit, so we wait and try again until we stop getting it
        while resp.status_code == 429:
            print('429: waiting 10 seconds before trying again')
            time.sleep(10)

            resp = requests.get(api_url)
            print(resp.status_code)

        matchids = pd.DataFrame(resp.json(), columns = ['matchid']) 
        matchids_list.append(matchids) # Appending the DataFrame generated to the list "matchids_list"
    
    matchids_df = pd.concat(matchids_list) # Using pd.concat to concatenate the results stored in the "matchids_list"
    
    return matchids_df

#### **Executing the `get_matchids_by_puuid` function**

Be careful: if you run the function on the entire summoner_df it will take about 1 hour and 30 minutes, and it can fail many times before it runs successfully. To avoid that, I split the summoner_df in half, ran the function twice and then concatenated both results into a single DataFrame. Each run takes about 45 minutes, so you can do something else while you wait. After that, save the result for later use.

In [22]:
print('summoner_df has {} rows'.format(summoner_df.shape[0]))

summoner_df has 4399 rows


In [None]:
matchids_df = get_matchids_by_puuid(summoner_df.loc[:2200], api_key)

In [None]:
matchids_df_1 = get_matchids_by_puuid(summoner_df.loc[2201:], api_key)

In [50]:
matchids_df_list = [matchids_df, matchids_df_1]

final_matchids_df= pd.concat(matchids_df_list)
final_matchids_df.head(3)

Unnamed: 0,matchid
0,BR1_2614740582
1,BR1_2614698761
2,BR1_2613549556


Resetting the index

In [None]:
final_matchids_df.reset_index(inplace=True)
final_matchids_df.drop(columns = 'index', inplace = True )
final_matchids_df.head()

Dropping duplicates, since we only want unique matchids

In [43]:
final_matchids_df_without_duplicates = final_matchids_df.drop_duplicates()

In [44]:
matchids_count_before = final_matchids_df.shape[0]
matchids_count_after = final_matchids_df_without_duplicates.shape[0]
print('Before we had {} matchids, after we have {} matchids'.format(matchids_count_before, matchids_count_after))

Before we had 87980 matchids, after we have 35537 matchids


Saving the `final_matchids_df_without_duplicates`

In [45]:
final_matchids_df_without_duplicates.to_csv('matchids_df.csv', index = False)

### **PART IV: [MATCH-V5 Endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch)**

In this part we will use the `matchid` column from the `matchids_df` to retrieve the match data for each `matchid`, using the [MATCH-V5 Endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch).

In [4]:
# Importing the matchids_df.csv generated in "PART III: MATCH-V5 Endpoint" above
matchids_df = pd.read_csv('matchids_df.csv')
print('There is a total of {} matchIds in matchids_df.csv'.format(matchids_df.shape[0]))

There is a total of 35537 matchIds in matchids_df.csv


So first let's understand how this ENDPOINT works. [Click here](https://developer.riotgames.com/apis#match-v5) to go to the Match-V5 Endpoint.

Click on the ENDPOINT "/lol/match/v5/matches/{matchId}" and scroll down. In the path parameters insert one matchid from the `matchids_df` and select the REGION you are working with, which in my case is AMERICAS. Then click "EXECUTE REQUEST".

We can see in the Response Body the information retrieved.

Now scroll up the page to where you can see the request URL. This is the URL we are going to use to make our request in the code below. 

#### **Testing with one match id**

Setting the api_ulr to make the get request

In [None]:
# The match id I used is BR1_2583404284, inserted right before the "?api_key="; you can replace it with the one you used.
api_ulr = 'https://americas.api.riotgames.com/lol/match/v5/matches/BR1_2583404284?api_key={}'.format(api_key)  
 
resp = requests.get(api_ulr)
resp.json()

There is a lot of data there, so we need to understand how the JSON is structured to access the data we want, which is the match data.

[Here](https://developer.riotgames.com/apis#match-v5/GET_getMatch) you can find more information about the JSON returned by this ENDPOINT.

Basically it has two top-level keys: 'metadata' and 'info'. Let's explore each of them to understand what data is inside.

In [None]:
df = pd.json_normalize(resp.json()['metadata'])
df

Unnamed: 0,dataVersion,matchId,participants
0,2,BR1_2583404284,[tzx5O014rLZrM98l42XxZD9-N-cOXWqoi8FeMpOOP5KSM...


The `'metadata'` key contains a list with the puuid of every player in the game, stored in the `'participants'` column.

**IMPORTANT:** The order in which these puuids appear in the `'participants'` list is the same order in which the players appear elsewhere in this dictionary.
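In practice this means that position `i` in `metadata['participants']` and position `i` in `info['participants']` describe the same player. A minimal sketch on a made-up, heavily shortened payload (a real match has 10 participants and far more fields):

```python
# Made-up payload with the same two top-level keys as the real response
sample_match = {
    "metadata": {"matchId": "BR1_0000000000", "participants": ["puuid-1", "puuid-2"]},
    "info": {
        "participants": [
            {"championName": "Udyr", "kills": 3},
            {"championName": "Azir", "kills": 7},
        ]
    },
}

# zip() pairs the two lists position by position
champion_by_puuid = {
    puuid: player["championName"]
    for puuid, player in zip(sample_match["metadata"]["participants"],
                             sample_match["info"]["participants"])
}
print(champion_by_puuid)
```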

In [None]:
df = pd.json_normalize(resp.json()['info'])
df

Unnamed: 0,gameCreation,gameDuration,gameEndTimestamp,gameId,gameMode,gameName,gameStartTimestamp,gameType,gameVersion,mapId,participants,platformId,queueId,teams,tournamentCode
0,1661728836215,2245,1661731199572,2583404284,CLASSIC,teambuilder-match-2583404284,1661728954351,MATCHED_GAME,12.16.462.4391,11,"[{'assists': 6, 'baronKills': 0, 'basicPings':...",BR1,440,"[{'bans': [{'championId': 157, 'pickTurn': 1},...",


The `'info'` key has a lot of information about the match, but the main information is stored in the `'participants'` column. This column holds a list with one dictionary per player, containing kills, assists, deaths and other information for that player.

So the information we want from the `'info'` key is the `'participants'` column.

In [None]:
# Getting the information inside the 'participants' columns
info_participant = pd.json_normalize(resp.json()['info']['participants'])
info_participant.head(10)

Unnamed: 0,assists,baronKills,basicPings,bountyLevel,champExperience,champLevel,championId,championName,championTransform,consumablesPurchased,...,perks.statPerks.flex,perks.statPerks.offense,perks.styles,challenges.controlWardTimeCoverageInRiverOrEnemyHalf,challenges.highestChampionDamage,challenges.highestWardKills,challenges.junglerKillsEarlyJungle,challenges.killsOnLanersEarlyJungleAsJungler,challenges.fasterSupportQuestCompletion,challenges.highestCrowdControlScore
0,6,0,2,0,21605,18,77,Udyr,0,1,...,5008,5007,"[{'description': 'primaryStyle', 'selections':...",,,,,,,
1,5,2,30,1,21097,18,121,Khazix,0,14,...,5008,5008,"[{'description': 'primaryStyle', 'selections':...",0.589931,1.0,1.0,0.0,1.0,,
2,4,0,22,0,19262,18,268,Azir,0,6,...,5008,5005,"[{'description': 'primaryStyle', 'selections':...",0.328858,,,,,,
3,5,0,2,1,15151,16,221,Zeri,0,1,...,5008,5005,"[{'description': 'primaryStyle', 'selections':...",,,,,,,
4,9,0,3,0,14112,15,99,Lux,0,2,...,5008,5007,"[{'description': 'primaryStyle', 'selections':...",,,,,,,
5,1,0,36,0,14629,15,126,Jayce,0,9,...,5008,5008,"[{'description': 'primaryStyle', 'selections':...",,,,,,,
6,7,0,25,1,16249,16,102,Shyvana,0,1,...,5008,5005,"[{'description': 'primaryStyle', 'selections':...",0.114875,,,0.0,0.0,,
7,5,0,73,0,16582,17,61,Orianna,0,12,...,5008,5005,"[{'description': 'primaryStyle', 'selections':...",0.068368,,,,,,
8,15,0,10,0,13237,15,412,Thresh,0,13,...,5002,5007,"[{'description': 'primaryStyle', 'selections':...",0.425645,,,,,1.0,1.0
9,6,0,12,2,18840,18,51,Caitlyn,0,3,...,5008,5005,"[{'description': 'primaryStyle', 'selections':...",0.061545,,,,,,


As we can see we have data about all 10 players that played the match.
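One detail to keep in mind: the participant rows shown above do not appear to carry the match ID, which matters once the rows of many matches are concatenated into one table. A possible fix, sketched on a made-up payload, is to copy the `matchId` from `metadata` into every participant row (the helper `participants_with_match_id` is hypothetical):

```python
import pandas as pd


def participants_with_match_id(match_json):
    """Flatten the participants of one match and label every row with its matchId."""
    participants = pd.json_normalize(match_json["info"]["participants"])
    # Insert the match ID as the first column; the single value is repeated on every row
    participants.insert(0, "matchId", match_json["metadata"]["matchId"])
    return participants


# Made-up, shortened payload; a real response has 10 participants and many more fields
sample_match = {
    "metadata": {"matchId": "BR1_0000000000"},
    "info": {
        "participants": [
            {"championName": "Udyr", "kills": 3, "perks": {"statPerks": {"flex": 5008}}},
            {"championName": "Azir", "kills": 7, "perks": {"statPerks": {"flex": 5008}}},
        ]
    },
}

labeled = participants_with_match_id(sample_match)
print(labeled)
```

With that column in place, rows can later be grouped by match, for example to compare the two teams of the same game.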

Now that we understand how to use the Riot ENDPOINTS to get the data for a single match, we can repeat this process for all the matchids stored in `matchids_df`, merging all the information into 1 DataFrame at the end.

To do that we will create a function that takes as parameters a DataFrame with a column named `'matchid'` containing the matchIds, and the api_key. The final result will be a single DataFrame with the information of all matches.

In [None]:
def get_match_info_by_matchId(match_ids, api_key):

    matchid_info_list = [] # List that will store the DataFrame generated for each match id in the for loop

    for i in range(len(match_ids)): # len() gets the number of rows of match_ids. Then range() creates a sequence of numbers from 0 up to (but not including) that number.
                            # With that we can iterate through each row of the match_ids DataFrame (our function parameter), using iloc to get the matchid.

        api_url = 'https://americas.api.riotgames.com/lol/match/v5/matches/{}?api_key={}'.format(match_ids['matchid'].iloc[i], api_key) # Here we use iloc, as explained above, to build the API URL for each matchid
        resp = requests.get(api_url) # Sending a get request to the api_url
        print(resp.status_code) # Printing the status_code of the request

        # The Riot API rate limit is 20 requests every 1 second and 100 requests every 2 minutes.
        # A 200 status_code means the request was successful, so we can move on.
        # A 429 means we have exceeded the rate limit, so we wait and then try again until we stop getting a 429.
        while resp.status_code == 429:
            print('429: waiting 10 seconds before trying again')
            time.sleep(10) # Sleep

            resp = requests.get(api_url) # Trying the request again
            print(resp.status_code)

        # Same process we did in the beginning
        matchid_info = pd.json_normalize(resp.json()['info']['participants'])
        matchid_info_list.append(matchid_info) # Storing the DataFrame of this match in matchid_info_list

    matchid_info_df = pd.concat(matchid_info_list) # Merging all the DataFrames stored in matchid_info_list with the concat() function

    return matchid_info_df

#### **Executing the `get_match_info_by_matchId` function**

In [None]:
print('There is a total of {} matchIds in matchids_df.csv'.format(matchids_df.shape[0]))

There is a total of 35537 matchIds in matchids_df.csv


Since the DataFrame has 35,537 rows, it would take a tremendous amount of time to loop through it. Not only because of the rate limit, but also because, in many cases, when I tried to do it all at once an unexpected error occurred in the middle of the run and I simply lost all the information.

So to avoid that I decided to take a sample of 10,000 matchIds from `matchids_df`, which is a considerable amount, and split it into 10 parts of 1,000, making it easier to run the function and concatenate everything at the end.

In [None]:
# Dividing the sample DF
matchids_df_reduced = matchids_df.sample(10000, random_state = 57)

matchids_sample_1 = matchids_df_reduced.iloc[:1000]
matchids_sample_2 = matchids_df_reduced.iloc[1000:2000]
matchids_sample_3 = matchids_df_reduced.iloc[2000:3000]
matchids_sample_4 = matchids_df_reduced.iloc[3000:4000]
matchids_sample_5 = matchids_df_reduced.iloc[4000:5000]
matchids_sample_6 = matchids_df_reduced.iloc[5000:6000]
matchids_sample_7 = matchids_df_reduced.iloc[6000:7000]
matchids_sample_8 = matchids_df_reduced.iloc[7000:8000]
matchids_sample_9 = matchids_df_reduced.iloc[8000:9000]
matchids_sample_10 = matchids_df_reduced.iloc[9000:10000]
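To get a rough idea of how long each part will take, the rate limit quoted in the function comments (100 requests every 2 minutes) can be turned into a quick estimate. This is back-of-the-envelope arithmetic that ignores the time each request itself takes; `estimate_minutes` is a hypothetical helper:

```python
def estimate_minutes(n_requests, requests_per_window=100, window_seconds=120):
    """Rough run time in minutes when limited to a fixed number of requests per window."""
    return n_requests / requests_per_window * window_seconds / 60


for n in (1000, 10000, 35537):
    print("{} requests: about {:.0f} minutes".format(n, estimate_minutes(n)))
```

Under that assumption, one part of 1,000 matches needs about 20 minutes, the 10,000-match sample about 200 minutes, and the full 35,537 matchIds close to 12 hours.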

In [None]:
matchid_info_df_1 = get_match_info_by_matchId(matchids_sample_1, api_key)
matchid_info_df_1.to_csv('matchid_info_df_1.csv', index = False)

In [None]:
matchid_info_df_2 = get_match_info_by_matchId(matchids_sample_2, api_key)
matchid_info_df_2.to_csv('matchid_info_df_2.csv', index = False)

In [None]:
matchid_info_df_3 = get_match_info_by_matchId(matchids_sample_3, api_key)
matchid_info_df_3.to_csv('matchid_info_df_3.csv', index = False)

In [None]:
matchid_info_df_4 = get_match_info_by_matchId(matchids_sample_4, api_key)
matchid_info_df_4.to_csv('matchid_info_df_4.csv', index = False)

In [None]:
matchid_info_df_5 = get_match_info_by_matchId(matchids_sample_5, api_key)
matchid_info_df_5.to_csv('matchid_info_df_5.csv', index = False)

In [None]:
matchid_info_df_6 = get_match_info_by_matchId(matchids_sample_6, api_key)
matchid_info_df_6.to_csv('matchid_info_df_6.csv', index = False)

In [None]:
matchid_info_df_7 = get_match_info_by_matchId(matchids_sample_7, api_key)
matchid_info_df_7.to_csv('matchid_info_df_7.csv', index = False)

In [None]:
matchid_info_df_8 = get_match_info_by_matchId(matchids_sample_8, api_key)
matchid_info_df_8.to_csv('matchid_info_df_8.csv', index = False)

In [None]:
matchid_info_df_9 = get_match_info_by_matchId(matchids_sample_9, api_key)
matchid_info_df_9.to_csv('matchid_info_df_9.csv', index = False)

In [None]:
matchid_info_df_10 = get_match_info_by_matchId(matchids_sample_10, api_key)
matchid_info_df_10.to_csv('matchid_info_df_10.csv', index = False)
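The ten cells above differ only in the part number, so they can also be written as a loop. The sketch below is one way to do it, under a few assumptions: `fetch` stands for any function with the same signature as `get_match_info_by_matchId`, the files follow the `matchid_info_df_{n}.csv` naming used above, and parts whose file already exists are skipped so an interrupted run can be resumed. The helper name `fetch_in_parts` is hypothetical.

```python
import os
import pandas as pd


def fetch_in_parts(match_ids, api_key, fetch, part_size=1000, out_dir="."):
    """Run `fetch` on consecutive slices of match_ids, saving each result as its own CSV."""
    saved = []
    for part, start in enumerate(range(0, len(match_ids), part_size), start=1):
        path = os.path.join(out_dir, "matchid_info_df_{}.csv".format(part))
        if not os.path.exists(path):  # skip parts finished in an earlier run
            chunk = match_ids.iloc[start:start + part_size]
            fetch(chunk, api_key).to_csv(path, index=False)
        saved.append(path)
    return saved


# Usage (not executed here, it would call the API):
# fetch_in_parts(matchids_df_reduced, api_key, get_match_info_by_matchId)
```

Because every part is written to disk as soon as it finishes, an error in part 7 costs only that part, not the six before it.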

#### **Reading all the 10 DFs saved, to create 1 unique DF to store all the matches information**

In [13]:
matchid_info_df_list = []
for i in range(1, 11):
    df = pd.read_csv('matchid_info_df_{}.csv'.format(i))
    matchid_info_df_list.append(df)

matchData_df= pd.concat(matchid_info_df_list)
matchData_df

Unnamed: 0,assists,baronKills,basicPings,bountyLevel,champExperience,champLevel,championId,championName,championTransform,consumablesPurchased,...,challenges.soloTurretsLategame,challenges.highestCrowdControlScore,challenges.highestWardKills,challenges.hadAfkTeammate,challenges.shortestTimeToAceFromFirstTakedown,challenges.baronBuffGoldAdvantageOverThreshold,challenges.fasterSupportQuestCompletion,challenges.thirdInhibitorDestroyedTime,challenges.earliestElderDragon,challenges.mejaisFullStackInTime
0,10,0,10.0,1,13643,15,79,Gragas,0,3,...,,,,,,,,,,
1,7,1,100.0,3,14532,15,131,Diana,0,5,...,,,,,,,,,,
2,11,0,24.0,1,13207,15,157,Yasuo,0,2,...,,,,,,,,,,
3,7,0,13.0,3,12462,14,145,Kaisa,0,3,...,,,,,,,,,,
4,20,0,52.0,2,11338,13,526,Rell,0,19,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,0,0,7.0,0,4179,8,27,Singed,0,0,...,,,,,,,,,,
9996,3,0,2.0,0,3753,7,35,Shaco,0,2,...,,,,,,,,,,
9997,0,0,0.0,5,6035,9,157,Yasuo,0,3,...,,,,,,,,,,
9998,1,0,1.0,0,3149,6,15,Sivir,0,2,...,,,,,,,,,,


Saving the `'matchData_df'`

In [14]:
matchData_df.to_csv('matchData_df.csv', index = False)

### **Next steps?**

With that, we've concluded our guide to the Riot API: Extracting League of Legends Match Data

Now you have created your own dataset, which you can use for exploratory analysis to create hypotheses. In addition, you can build data visualizations using Python or another tool.

This guide covered just 1 of many use cases for the Riot API. To help you think of potential ideas, here are a few of the popular kinds of data you can retrieve:

- How much Mastery a player has on each Champion
- In-depth game detail for every minute of the game (i.e. how much Gold/XP each player has at 12 minutes)
- In-depth objective and kill data, like who killed who, when and where
- Ranked information, such as their current rank for each queue
- Who is currently in Challenger, Grandmaster & Master (& every queue below that too!)
- And much more...!

You can connect with me on LinkedIn to ask questions: https://www.linkedin.com/in/fernando-lacerda-/

I'm starting to write articles about statistics/AI/esports/gaming, you can subscribe here: 

I listed below the main columns from the matchData_df, so you do not have to worry about figuring out which columns to use. If you want to know more about the columns, you can check them in the [MATCH-V5 Endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch)

In [15]:
useful_columns = ['assists', 'champLevel', 'championName', 'deaths', 'detectorWardsPlaced', 'gameEndedInEarlySurrender', 
                'gameEndedInSurrender', 'goldEarned', 'inhibitorsLost', 'kills', 'magicDamageDealt', 'magicDamageDealtToChampions', 
                'neutralMinionsKilled', 'physicalDamageDealt', 'physicalDamageDealtToChampions', 'physicalDamageTaken', 'profileIcon', 
                'sightWardsBoughtInGame', 'teamId', 'teamPosition', 'totalDamageDealt', 'totalDamageDealtToChampions', 'totalMinionsKilled', 
                'trueDamageDealt', 'trueDamageDealtToChampions', 'turretsLost', 'visionScore', 'wardsKilled', 'wardsPlaced', 'win'
                ]

print('The original dataset has {} columns, I chose to use {} columns'.format(matchData_df.shape[1], len(useful_columns)))

The original dataset has 238 columns, I chose to use 30 columns
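As a first step with the saved data, a column list like this can be used to select those columns and compute a simple statistic. A minimal sketch on a few made-up rows (in the guide, `matchData_df` would take the place of `sample`, and `useful_columns` the place of the shortened `columns_to_keep` list):

```python
import pandas as pd

# Made-up rows standing in for matchData_df
sample = pd.DataFrame({
    "championName": ["Yasuo", "Yasuo", "Lux", "Lux", "Lux"],
    "win": [True, False, True, True, False],
    "kills": [5, 2, 1, 3, 0],
    "profileIcon": [7, 7, 12, 12, 3],
})

columns_to_keep = ["championName", "win", "kills"]  # shortened stand-in for useful_columns
subset = sample[columns_to_keep]

# The mean of a boolean column is the share of True values, i.e. the win rate
win_rate = subset.groupby("championName")["win"].mean()
print(win_rate)
```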
