# Extracting League of Legends Match Data using Riot API - PART IV

## Imports

In [1]:
import pandas as pd
import requests
import time
from warnings import simplefilter
simplefilter(action="ignore", category=pd.errors.PerformanceWarning)

Importing the matchId file we generated previously

In [2]:
# Importing the matchids_df.csv generated in the "Extracting League of Legends Match Data using Riot API - PART I, II and III.ipynb"
matchids_df = pd.read_csv('matchids_df.csv')
print('There is a total of {} matchIds in matchids_df.csv'.format(matchids_df.shape[0]))

There is a total of 37971 matchIds in matchids_df.csv


In [4]:
matchids_df.head(3)

Unnamed: 0,matchid
0,BR1_2610790841
1,BR1_2610780078
2,BR1_2610769542


## PART 4

We use the [MATCH-V5 endpoint](https://developer.riotgames.com/apis#match-v5/GET_getMatch) to pull the match data. Given a match id, this endpoint returns the data for that match.
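
The request URL follows a fixed pattern: a regional routing host, the MATCH-V5 path, and the match id. A minimal sketch of a helper that builds it is shown here; the function name `build_match_url` and the `region` default are illustrative choices, not part of the Riot API.

```python
def build_match_url(match_id, region="americas"):
    """Return the MATCH-V5 URL for one match id.

    `region` is the regional routing value; "americas" is the one used
    throughout this notebook for BR1 matches.
    """
    base = "https://{}.api.riotgames.com/lol/match/v5/matches/{}"
    return base.format(region, match_id)


print(build_match_url("BR1_2583404284"))
```

The `?api_key=` query string used in this notebook can then be appended to the returned URL.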

### Testing with one match id

Setting the api_key

In [1]:
api_key = 'RGAPI-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'  # Replace with your own key; never publish a real key
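
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the key from an environment variable; the variable name `RIOT_API_KEY` and the helper `load_api_key` are assumptions made for illustration.

```python
import os


def load_api_key(env_var="RIOT_API_KEY"):
    """Read the API key from the environment (the variable name is hypothetical)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError("Set the {} environment variable first".format(env_var))
    return key


# Demonstration with a dummy value; in practice the variable is set outside the notebook
os.environ["RIOT_API_KEY"] = "RGAPI-example"
demo_key = load_api_key()
print(demo_key.startswith("RGAPI-"))
```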

Setting the api_url to make the GET request

In [5]:
# The match id we are going to use is BR1_2583404284, inserted right before ?api_key=
api_url = 'https://americas.api.riotgames.com/lol/match/v5/matches/BR1_2583404284?api_key={}'.format(api_key)

resp = requests.get(api_url)
resp.json()

{'metadata': {'dataVersion': '2',
  'matchId': 'BR1_2583404284',
  'participants': ['tzx5O014rLZrM98l42XxZD9-N-cOXWqoi8FeMpOOP5KSMNv-BVVNqaQB7W0vyumsNVBHRVVjWHrTfw',
   '6jEznZWk4oV5nXe56vBoZYZF_EjEHZ7RXuCc-r6HvquSUXsMuae30lfO3V2SE8pgn95TblQwWxhlGQ',
   'nYGlKDJX8D5sAkhW4F0UWxIu9FFzmmau0oy8puArmoFwNYHL1q_q1XcYOMgS_-87exEfLHrtMon-rg',
   'vbZS6WC35IJtIC-giQhbcI1_UR8UJjylu6FBrFj0_MYNuHsNnkw_DeRD304HziJp-vMW2TfkLl_Czw',
   '_7cwEDziZ_bbtzNFIZRIKmeqnrwOM3ijML26DTcfzhOYqU6AHJr9x_GpmJOzkDiT9f65Y6RJZ4i7Xw',
   'mGmuiEyYCdQ6kW0aa1nvCn9B1ci7T5s-Gg4Rtgvb_5nHmSIIvzxNPjR98SbatKiSYW3Srn_l8KX7wQ',
   'gue3kaBmSozygOUnJwovJ4eFImyiscktmTvxVcgDzuxk_ch8lAw3N4A5Eyrw5xVjhoHR4jz9TnJsYQ',
   'v91YYISBiPwb-g8ID_eWe37pPiz7r27K1qS0kSBhbxFCgJDKNyL57kOb7wtwTn3EOXggBMA9btkBEw',
   'u3z4ZWNSfoZjzZL5ztBCBLqUzQkvyAqrIKsyD43KEZuW5LBgviFm5LptZP-lDILVLibzLEcZNFc1CA',
   '6ig7Yf8O9gCs5Ak56TFulo0LcguA-lEExrty3ukZD-oRHYu1eOOfpoYg4_gv14x7EWumRI-86uHatQ']},
 'info': {'gameCreation': 1661728836215,
  'gameDuration': 2245,
  

Now that we can retrieve all the match information, we need to understand how the JSON response is structured so we can access the data we want.

[Here](https://developer.riotgames.com/apis#match-v5/GET_getMatch) you can find more information about the JSON returned by this endpoint.

In short, it has two top-level keys: 'metadata' and 'info'. We will use the 'info' key, because that is where most of the information we want is stored.
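
To make the layout concrete, here is a hand-written stand-in for the response with most fields omitted. The values are copied from this notebook's outputs, the two participant entries are heavily abbreviated, and `sample_match` is a hypothetical name.

```python
# Abbreviated stand-in for resp.json(); the real response has many more fields
sample_match = {
    "metadata": {"dataVersion": "2", "matchId": "BR1_2583404284"},
    "info": {
        "gameCreation": 1661728836215,
        "gameDuration": 2245,
        "participants": [
            {"championName": "Udyr", "kills": 4},
            {"championName": "Khazix", "kills": 20},
        ],
    },
}

print(list(sample_match))          # top-level keys
print(list(sample_match["info"]))  # keys under 'info'

# Each entry of 'participants' is one player's dictionary
for player in sample_match["info"]["participants"]:
    print(player["championName"], player["kills"])
```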

In [6]:
df = pd.json_normalize(resp.json()['metadata'])
df

Unnamed: 0,dataVersion,matchId,participants
0,2,BR1_2583404284,[tzx5O014rLZrM98l42XxZD9-N-cOXWqoi8FeMpOOP5KSM...


As we can see, the `metadata` key holds the `dataVersion`, the `matchId` and a `participants` list (made up of the participants' PUUIDs), none of which are useful to us.

In [7]:
df = pd.json_normalize(resp.json()['info'])
df

Unnamed: 0,gameCreation,gameDuration,gameEndTimestamp,gameId,gameMode,gameName,gameStartTimestamp,gameType,gameVersion,mapId,participants,platformId,queueId,teams,tournamentCode
0,1661728836215,2245,1661731199572,2583404284,CLASSIC,teambuilder-match-2583404284,1661728954351,MATCHED_GAME,12.16.462.4391,11,"[{'assists': 6, 'baronKills': 0, 'basicPings':...",BR1,440,"[{'bans': [{'championId': 157, 'pickTurn': 1},...",


The `'info'` key has a lot of information about the match, but the main information is stored inside the `'participants'` column. This column holds a list with one dictionary per player, containing kills, assists, deaths and other information for that player.

That said, the information we want from the `'infoDto'` (Dto stands for data transfer object) is the `'participants'` column.

In [8]:
# Getting the information inside the 'participants' column
info_participant = pd.json_normalize(resp.json()['info']['participants'])
info_participant.head(1)

Unnamed: 0,assists,baronKills,basicPings,bountyLevel,champExperience,champLevel,championId,championName,championTransform,consumablesPurchased,...,perks.statPerks.flex,perks.statPerks.offense,perks.styles,challenges.controlWardTimeCoverageInRiverOrEnemyHalf,challenges.highestChampionDamage,challenges.highestWardKills,challenges.junglerKillsEarlyJungle,challenges.killsOnLanersEarlyJungleAsJungler,challenges.fasterSupportQuestCompletion,challenges.highestCrowdControlScore
0,6,0,2,0,21605,18,77,Udyr,0,1,...,5008,5007,"[{'description': 'primaryStyle', 'selections':...",,,,,,,
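
Column names such as `perks.statPerks.flex` come from `pd.json_normalize`, which flattens nested dictionaries and joins the keys with a dot. Here is a small sketch with two made-up records (illustrative, not real API output):

```python
import pandas as pd

# Two toy participant records with different nested keys
records = [
    {"championName": "Udyr", "kills": 4, "perks": {"statPerks": {"flex": 5008}}},
    {"championName": "Lux", "kills": 1, "challenges": {"highestWardKills": 1}},
]

# Nested keys become dotted column names; keys absent from a record become NaN
flat = pd.json_normalize(records)
print(flat.columns.tolist())
print(flat)
```

Keys missing from a record become NaN in that row, which is why many `challenges.*` columns in the output above are empty.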


Although we now know where the important information is stored, not all of it will be used. I therefore listed the columns that I judged to be the most important and necessary for us.

They are the following:

In [9]:
useful_columns = ['assists', 'champLevel', 'championName', 'deaths', 'detectorWardsPlaced', 'gameEndedInEarlySurrender', 
                'gameEndedInSurrender', 'goldEarned', 'inhibitorsLost', 'kills', 'magicDamageDealt', 'magicDamageDealtToChampions', 
                'neutralMinionsKilled', 'physicalDamageDealt', 'physicalDamageDealtToChampions', 'physicalDamageTaken', 'profileIcon', 
                'sightWardsBoughtInGame', 'teamId', 'teamPosition', 'totalDamageDealt', 'totalDamageDealtToChampions', 'totalMinionsKilled', 
                'trueDamageDealt', 'trueDamageDealtToChampions', 'turretsLost', 'visionScore', 'wardsKilled', 'wardsPlaced', 'win'
                ]

print('The original dataset has {}, I choose to use {} columns'.format(info_participant.shape[1], len(useful_columns)))

The original dataset has 232, I choose to use 30 columns
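
Selecting with `df[useful_columns]` raises a `KeyError` if any listed column is absent from a response. One defensive option, sketched below on a toy frame, is `DataFrame.reindex(columns=...)`, which keeps the requested order and fills absent columns with NaN. The short `wanted` list and the toy data are assumptions for illustration.

```python
import pandas as pd

toy = pd.DataFrame({"kills": [4, 20], "deaths": [2, 6], "basicPings": [2, 0]})
wanted = ["kills", "deaths", "visionScore"]  # 'visionScore' is missing from toy

# reindex keeps only the wanted columns and creates missing ones filled with NaN
selected = toy.reindex(columns=wanted)
print(selected)

# Report which requested columns were not present in the data
missing = [col for col in wanted if col not in toy.columns]
print("missing columns:", missing)
```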


In the end, what we want is a DataFrame like the one below for each matchId in our matchids_df.

In [10]:
info_participant = pd.json_normalize(resp.json()['info']['participants'])
info_participant = info_participant[useful_columns] # Selecting only the columns from the "useful_columns" list to update the DF
info_participant

Unnamed: 0,assists,champLevel,championName,deaths,detectorWardsPlaced,gameEndedInEarlySurrender,gameEndedInSurrender,goldEarned,inhibitorsLost,kills,...,totalDamageDealt,totalDamageDealtToChampions,totalMinionsKilled,trueDamageDealt,trueDamageDealtToChampions,turretsLost,visionScore,wardsKilled,wardsPlaced,win
0,6,18,Udyr,2,0,False,False,14258,0,4,...,209647,16763,235,19195,0,1,26,7,11,True
1,5,18,Khazix,6,10,False,False,21597,0,20,...,294846,41272,29,27545,1884,1,43,10,10,True
2,4,18,Azir,6,4,False,False,15028,0,5,...,226425,31171,223,5451,827,1,25,2,13,True
3,5,16,Zeri,6,0,False,False,13146,0,3,...,202313,9635,186,82857,269,1,25,5,10,True
4,9,15,Lux,2,0,False,False,9620,0,1,...,56990,11634,59,0,0,1,61,7,36,True
5,1,15,Jayce,10,4,False,False,10847,3,3,...,157144,24339,216,2385,0,11,33,4,17,False
6,7,16,Shyvana,8,1,False,False,12691,3,5,...,224989,23015,85,7304,1828,11,32,8,3,False
7,5,17,Orianna,7,2,False,False,13259,3,5,...,148492,22021,197,7071,896,11,23,3,13,False
8,15,15,Thresh,5,9,False,False,8347,3,1,...,27513,9707,34,6034,1674,11,75,6,39,False
9,6,18,Caitlyn,3,2,False,False,16130,3,8,...,249859,29474,277,23398,2005,11,45,2,14,False


Now that we understand how to use the Riot endpoint to get the data for a single match, and which information we want from it, we want to repeat this process for all the match ids stored in matchids_df and merge everything into one DataFrame at the end.

To do that, we will create a function that takes two parameters: a DataFrame with a column named "matchid" containing the matchIds, and the api_key. The final result will be a single DataFrame with the information from all the matches.

In [3]:
def get_match_info_by_matchId(match_ids, api_key):

    matchid_info_list = []  # List that will store the DataFrame generated for each match id in the for loop

    # Iterate through every match id in the "matchid" column of the match_ids DataFrame (our function parameter)
    for match_id in match_ids['matchid']:

        # Form the API link for this match id
        api_url = 'https://americas.api.riotgames.com/lol/match/v5/matches/{}?api_key={}'.format(match_id, api_key)
        resp = requests.get(api_url)  # Sending a GET request to the api_url
        print(resp.status_code)  # Printing the status code of the request

        # The Riot API rate limit is 20 requests every 1 second and 100 requests every 2 minutes.
        # A 429 means that we have exceeded it, so we wait and retry until the response is no longer a 429.
        if resp.status_code == 429:
            while resp.status_code == 429:
                print('429 delay try 10 second')  # Note: approximately 110 seconds of waiting may be needed in total
                time.sleep(10)  # Sleep
                resp = requests.get(api_url)  # Trying the request again
                print(resp.status_code)
            if resp.status_code == 200:
                print('limit cost resolve')

        # Any other non-200 status (for example 404 or 5xx) carries no match data, so we skip this match id
        if resp.status_code != 200:
            continue

        # Same process we did in the beginning
        matchid_info = pd.json_normalize(resp.json()['info']['participants'])
        matchid_info_list.append(matchid_info)

    # Merging all the DataFrames stored in matchid_info_list with the concat() function
    matchid_info_df = pd.concat(matchid_info_list)

    return matchid_info_df
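
A fixed 10-second sleep works, but a 429 response can also carry a `Retry-After` header saying how many seconds to wait. The sketch below shows one way to honour it with a capped number of attempts. To keep it runnable without network access, the HTTP call and the sleep are passed in as parameters; `fetch_with_retry`, `FakeResponse`, and the default values are all hypothetical names chosen for this example, and the header is assumed to hold a whole number of seconds.

```python
import time


def fetch_with_retry(url, get, sleep=time.sleep, max_retries=5, default_wait=10):
    """Call get(url) until the response is not a 429 or retries run out."""
    resp = get(url)
    retries = 0
    while resp.status_code == 429 and retries < max_retries:
        # Prefer the server's hint; fall back to a fixed wait if it is absent.
        # Assumption: the header value is a number of seconds, not an HTTP date.
        wait = int(resp.headers.get("Retry-After", default_wait))
        sleep(wait)
        resp = get(url)
        retries += 1
    return resp


class FakeResponse:
    """Stand-in for requests.Response, used only for this demonstration."""

    def __init__(self, status_code, headers=None):
        self.status_code = status_code
        self.headers = headers or {}


# Simulate two rate-limited replies followed by a success
replies = iter([FakeResponse(429, {"Retry-After": "3"}), FakeResponse(429), FakeResponse(200)])
waits = []  # records the requested sleep durations instead of really sleeping
result = fetch_with_retry("https://example.invalid", get=lambda url: next(replies), sleep=waits.append)
print(result.status_code, waits)
```

With the real library this could be called as `fetch_with_retry(api_url, get=requests.get)`.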

In [27]:
print('There is a total of {} matchIds in matchids_df.csv'.format(matchids_df.shape[0]))

There is a total of 37971 matchIds in matchids_df.csv


Since the DataFrame has 37,971 rows, it would take a tremendous amount of time to loop through it. This is not only because of the rate limit, but also because, in many of my attempts, an unexpected error occurred in the middle of the run and all the information gathered so far was lost.

So I decided to take a sample of 10,000, which is a considerable amount, and split it into 10 parts of 1,000 to make it easier to gather all the information.
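
Saving each part to its own CSV also makes the run resumable: parts whose file already exists can be skipped after a crash. Below is a minimal sketch of that check, using only the standard library; `pending_parts` is a hypothetical helper, and the file pattern mirrors the CSV names used in this notebook.

```python
import os
import tempfile


def pending_parts(n_parts, folder=".", pattern="matchid_info_df_{}.csv"):
    """Return the part numbers (1-based) whose CSV file does not exist yet."""
    return [
        i for i in range(1, n_parts + 1)
        if not os.path.exists(os.path.join(folder, pattern.format(i)))
    ]


# Demonstration in a temporary folder: pretend parts 1 and 2 were already saved
with tempfile.TemporaryDirectory() as folder:
    for i in (1, 2):
        open(os.path.join(folder, "matchid_info_df_{}.csv".format(i)), "w").close()
    todo = pending_parts(5, folder)
    print(todo)
```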

In [7]:
# Dividing the sample DF
matchids_df_reduced = matchids_df.sample(10000, random_state = 57)

matchids_sample_1 = matchids_df_reduced.iloc[:1000]
matchids_sample_2 = matchids_df_reduced.iloc[1000:2000]
matchids_sample_3 = matchids_df_reduced.iloc[2000:3000]
matchids_sample_4 = matchids_df_reduced.iloc[3000:4000]
matchids_sample_5 = matchids_df_reduced.iloc[4000:5000]
matchids_sample_6 = matchids_df_reduced.iloc[5000:6000]
matchids_sample_7 = matchids_df_reduced.iloc[6000:7000]
matchids_sample_8 = matchids_df_reduced.iloc[7000:8000]
matchids_sample_9 = matchids_df_reduced.iloc[8000:9000]
matchids_sample_10 = matchids_df_reduced.iloc[9000:10000]
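
The ten slices above follow a regular pattern, so they could also be produced in a loop. The sketch below computes the `(start, stop)` pairs; `chunk_bounds` is a hypothetical helper, and each pair would be used as `matchids_df_reduced.iloc[start:stop]`.

```python
def chunk_bounds(n_rows, chunk_size):
    """Yield (start, stop) index pairs that cover n_rows in order."""
    for start in range(0, n_rows, chunk_size):
        # The last chunk is clipped so it never runs past n_rows
        yield start, min(start + chunk_size, n_rows)


bounds = list(chunk_bounds(10000, 1000))
print(len(bounds))
print(bounds[0], bounds[-1])
```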

In [None]:
matchid_info_df_1 = get_match_info_by_matchId(matchids_sample_1, api_key)
matchid_info_df_1.to_csv('matchid_info_df_1.csv', index = False)

In [None]:
matchid_info_df_2 = get_match_info_by_matchId(matchids_sample_2, api_key)
matchid_info_df_2.to_csv('matchid_info_df_2.csv', index = False)

In [36]:
matchid_info_df_3 = get_match_info_by_matchId(matchids_sample_3, api_key)
matchid_info_df_3.to_csv('matchid_info_df_3.csv', index = False)

200
200
200
...
429
429 delay try 10 second
429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...



In [40]:
matchid_info_df_4 = get_match_info_by_matchId(matchids_sample_4, api_key)
matchid_info_df_4.to_csv('matchid_info_df_4.csv', index = False)

200
200
200
...
429
429 delay try 10 second
429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...

In [41]:
matchid_info_df_5 = get_match_info_by_matchId(matchids_sample_5, api_key)
matchid_info_df_5.to_csv('matchid_info_df_5.csv', index = False)

429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...

In [8]:
matchid_info_df_6 = get_match_info_by_matchId(matchids_sample_6, api_key)
matchid_info_df_6.to_csv('matchid_info_df_6.csv', index = False)

200
200
200
...
429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...

In [9]:
matchid_info_df_7 = get_match_info_by_matchId(matchids_sample_7, api_key)
matchid_info_df_7.to_csv('matchid_info_df_7.csv', index = False)

429
429 delay try 10 second
429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...

In [10]:
matchid_info_df_8 = get_match_info_by_matchId(matchids_sample_8, api_key)
matchid_info_df_8.to_csv('matchid_info_df_8.csv', index = False)

429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...

In [11]:
matchid_info_df_9 = get_match_info_by_matchId(matchids_sample_9, api_key)
matchid_info_df_9.to_csv('matchid_info_df_9.csv', index = False)

200
200
200
...
429
429 delay try 10 second
429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...

In [12]:
matchid_info_df_10 = get_match_info_by_matchId(matchids_sample_10, api_key)
matchid_info_df_10.to_csv('matchid_info_df_10.csv', index = False)

429
429 delay try 10 second
429
429 delay try 10 second
200
limit cost resolve
200
200
200
...

Reading all 10 saved DataFrames to create a single DataFrame that stores the information from all the matches

In [20]:
matchid_info_df_list = []
for i in range(1, 11):
    df = pd.read_csv('matchid_info_df_{}.csv'.format(i))
    matchid_info_df_list.append(df)

matchData_df= pd.concat(matchid_info_df_list)
matchData_df

Unnamed: 0,assists,baronKills,basicPings,bountyLevel,champExperience,champLevel,championId,championName,championTransform,consumablesPurchased,...,challenges.earliestBaron,challenges.soloTurretsLategame,challenges.teleportTakedowns,challenges.thirdInhibitorDestroyedTime,challenges.fastestLegendary,challenges.shortestTimeToAceFromFirstTakedown,challenges.hadAfkTeammate,challenges.baronBuffGoldAdvantageOverThreshold,challenges.earliestElderDragon,challenges.mejaisFullStackInTime
0,4,0,14.0,1,9053,12,516,Ornn,0,3,...,,,,,,,,,,
1,7,0,39.0,3,9669,12,30,Karthus,0,2,...,,,,,,,,,,
2,4,0,19.0,5,9442,12,101,Xerath,0,5,...,,,,,,,,,,
3,2,0,54.0,3,7208,10,22,Ashe,0,5,...,,,,,,,,,,
4,6,0,24.0,1,4748,8,57,Maokai,0,9,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9995,3,0,1.0,0,7996,11,92,Riven,0,3,...,,,,,,,,,,
9996,8,0,15.0,0,11486,14,64,LeeSin,0,5,...,,,,,623.275066,32.231992,,,,
9997,20,0,11.0,4,10461,13,163,Taliyah,0,4,...,,,,,,32.231992,,,,
9998,2,0,12.0,0,9016,12,145,Kaisa,0,3,...,,1.0,,,,32.231992,,,,
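
Because each CSV is read with its own 0-based index, the concatenated frame repeats index values across parts. If a single running index is preferred, `pd.concat` accepts `ignore_index=True`. The sketch below demonstrates this on two tiny in-memory CSVs; the data is made up for illustration.

```python
import io

import pandas as pd

# Two tiny in-memory "files" standing in for the saved CSV parts
part_1 = io.StringIO("championName,kills\nOrnn,1\nKarthus,7\n")
part_2 = io.StringIO("championName,kills\nRiven,3\nLeeSin,5\n")

frames = [pd.read_csv(part) for part in (part_1, part_2)]

repeated = pd.concat(frames)                    # each part keeps its own index
running = pd.concat(frames, ignore_index=True)  # one continuous index

print(repeated.index.tolist())
print(running.index.tolist())
```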


Saving the final matchData_df

In [None]:
matchData_df.to_csv('matchData_df.csv', index = False)

In [None]:
useful_columns = ['assists', 'champLevel', 'championName', 'deaths', 'detectorWardsPlaced', 'gameEndedInEarlySurrender', 
                'gameEndedInSurrender', 'goldEarned', 'inhibitorsLost', 'kills', 'magicDamageDealt', 'magicDamageDealtToChampions', 
                'neutralMinionsKilled', 'physicalDamageDealt', 'physicalDamageDealtToChampions', 'physicalDamageTaken', 'profileIcon', 
                'sightWardsBoughtInGame', 'teamId', 'teamPosition', 'totalDamageDealt', 'totalDamageDealtToChampions', 'totalMinionsKilled', 
                'trueDamageDealt', 'trueDamageDealtToChampions', 'turretsLost', 'visionScore', 'wardsKilled', 'wardsPlaced', 'win'
                ]