# Instructions for Wrangling Match Info

This notebook streamlines the match data acquisition process. To run it, either make a copy of this file or paste all of the code into a new notebook.

In [1]:
import numpy as np
import pandas as pd
import requests
import prepare
from time import sleep
from env import api_key

In [2]:
#Load the match_ids.csv file
matches_df = pd.read_csv('match_ids.csv')
matches_df

Unnamed: 0,0
0,NA1_4062082172
1,NA1_4093908534
2,NA1_4094915650
3,NA1_4082128299
4,NA1_4083500908
...,...
37279,NA1_4097058147
37280,NA1_4096721229
37281,NA1_4094049450
37282,NA1_4091691873


There are 37,284 match ids that need to be run. The first 4,000 have already been acquired, which leaves 33,284 to divide among the 5 of us. Four of us will each acquire 6,657, while the last person will acquire 6,656.
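
The slice boundaries used in the next cell can be double-checked with a few lines of arithmetic. This is only a sketch: the variable names (`total_ids`, `already_acquired`, `people`, `bounds`) are made up for illustration and are not used anywhere else in this notebook.

```python
# Sanity check for the slice boundaries (illustrative names, not used elsewhere)
total_ids = 37_284         # total match ids, from the text above
already_acquired = 4_000   # ids that have already been pulled
people = 5

remaining = total_ids - already_acquired

# base is the minimum share; the first `extra` people each take one more id
base, extra = divmod(remaining, people)

bounds = []
start = already_acquired
for i in range(people):
    size = base + (1 if i < extra else 0)
    bounds.append((start, start + size))
    start += size

for begin, end in bounds:
    print(f'[{begin}:{end}] -> {end - begin} match ids')
```

Each `(begin, end)` pair should line up with one of the slices below, with the last slice ending at the total number of ids.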

In [3]:
#Set up the match id list. You only need to uncomment the line following your name

#Johnathon
#match_ids = list(matches_df['0'])[4000:10_657]

#Joshua Bryant
#match_ids = list(matches_df['0'])[10_657:17_314]

#Jared
#match_ids = list(matches_df['0'])[17_314:23_971]

#Chris
match_ids = list(matches_df['0'])[23_971:30_628]

#Joshua Chavez
#match_ids = list(matches_df['0'])[30_628:37_284]

In [4]:
#Set up your username. You only need to uncomment the line following your name.
#The username is only used for naming the saved json files.

#Johnathon
#username = 'smith'

#Joshua Bryant
#username = 'bryant'

#Jared
#username = 'vahle'

#Chris
username = 'everts'

#Joshua Chavez
#username = 'chavez'

### Define the Function

In [5]:
def get_match_info(match_ids, api_key, username, time = 20):
    """
    This function takes in a list of match ids and iterates through them. For each iteration,
    it will make two api calls and retrieve the necessary information for our project.
    
    Two json lists will be created, and once all the info has been gathered for each match id,
    this function will save both json lists for future reference. 
    
    This function also takes in a username string. This will be used for naming the saved files.
    
    Finally, this function will funnel the resulting json lists into a prepare function and
    return a complete df.
    
    The time parameter is used in the prepare function that will be called at the end
    and represents the timeframe of the data we want to gather and prepare.
    The default value is 20 for the 20 minute mark of the match.
    
    """
    
    #Create an empty list to store the timeline json data
    timeline_data = []
    
    #Create an empty list to store the other json data
    game_data = []
    
    #Set up main url
    url = 'https://americas.api.riotgames.com/'
    
    #Create a count var
    n = 0
    
    #Store the length of the match_ids list in a var
    match_count = len(match_ids)
    
    #Loop through each match_id
    for match in match_ids:
        #Update the count var
        n += 1
        
        #After every 50 match ids (100 total requests), wait 150 seconds (2.5 minutes)
        if n % 50 == 0:
            print(f'Completed {n} of {match_count} match IDs.')
            print(f'Waiting 150 seconds...\n')
            
            #Sleep 150 seconds
            sleep(150)
            
            print('Continuing!\n')
        
        ############################# TIMELINE DATA #######################
        #Set up timeline url
        timeline_query = f'lol/match/v5/matches/{match}/timeline/?api_key={api_key}'
        
        #Grab timeline json data
        timeline_response = requests.get(url + timeline_query)
        
        #Check response and leave a status message
        if timeline_response.status_code != 200:
            print(f'Something went wrong getting TIMELINE DATA! Status Code {timeline_response.status_code} for match ID: {match}.')
            print(f'Skipping this match ID.\n')
            continue

        #Turn it into json format
        timeline_json = timeline_response.json()
        
        #Append this data to the timeline_data list
        timeline_data.append(timeline_json)
        
        ############################# OTHER GAME DATA #######################
        #Set up game data url
        game_query = f'lol/match/v5/matches/{match}?api_key={api_key}'
        
        #Grab game json data
        game_response = requests.get(url + game_query)
        
        #Check response and leave a status message.
        if game_response.status_code != 200:
            print(f'Something went wrong getting OTHER GAME DATA! Status Code {game_response.status_code} for match ID: {match}.')
            print(f'Skipping this match ID and REMOVING PREVIOUS TIMELINE ENTRY.\n')
            
            #Remove the last entry in the timeline_data list
            timeline_data.pop()
            continue
        
        #Turn it into json format
        game_json = game_response.json()
        
        #Append this data to the game_data list
        game_data.append(game_json)
        
    ################################## END LOOP ##############################
    
    #Now that all of the json data has been gathered, save each file for future reference
    
    #Save timeline data.
    #First, convert it to a df
    timeline_df = pd.DataFrame(timeline_data)
    
    #Save as json file. Use username string to identify whose file it is
    timeline_df.to_json(f'timeline_data_{username}.json')
    
    #Print status message
    print(f'Created timeline_data_{username}.json')
    
    #Save other game data
    #First, convert it to a df
    game_df = pd.DataFrame(game_data)
    
    #Save as json file. Use username string to identify whose file it is
    game_df.to_json(f'other_game_data_{username}.json')
    
    #Print status message
    print(f'Created other_game_data_{username}.json\n')
    
    ################################ PREPARE DATA #########################
    
    #The following section will funnel the data into the prepare function
    #Written by Joshua C.
    
    #Leave a status message
    print('Preparing the data...\n')
    
    #Begin preparing the data
    df = prepare.prepare(timeline_data, game_data, time)
    
    #Finally, return the prepared df
    return df
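
The function above skips any match id whose request does not come back with status 200, which includes rate-limit responses (status 429). If you would rather retry those instead of losing the match, something like the sketch below could wrap each request. It is not part of the function above. The helper name `get_with_retry` and its parameters are hypothetical, and it assumes the API sends a `Retry-After` header (in seconds) with a 429 response; when that header is missing it falls back to `default_wait`.

```python
import time

def get_with_retry(get, url, max_retries=3, default_wait=10, sleep=time.sleep):
    """
    Hypothetical helper: call get(url) and retry when the response is a 429.

    `get` is any callable that behaves like requests.get, meaning it returns
    an object with .status_code and .headers attributes.
    """
    response = get(url)

    for _ in range(max_retries):
        #Anything other than a rate-limit response is returned as-is
        if response.status_code != 429:
            break

        #Assumption: Retry-After holds a number of seconds
        wait = int(response.headers.get('Retry-After', default_wait))
        sleep(wait)

        #Try the same url again
        response = get(url)

    return response
```

Inside the loop, a call such as `requests.get(url + timeline_query)` would become `get_with_retry(requests.get, url + timeline_query)`. The existing status check still applies afterwards, since the helper returns the last response even if every retry was rate limited.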

### Use the Function

Once the function has finished running, you will have two new json files: one for the timeline data and another for the other game data. The function also returns a prepared df that you can immediately save as a .csv if you like.

In [None]:
#Call the get_match_info function
df = get_match_info(match_ids, api_key, username)

Completed 50 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 100 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 150 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 200 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 250 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 300 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 350 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 400 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 450 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 500 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 550 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 600 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 650 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 700 of 6657 match IDs.
Waiting 150 seconds...

Continuing!

Completed 750 of 6657

### Save the Prepared DF as a .csv

In [None]:
#This will save the prepared dataframe as .csv using the username format. 
#This will prevent confusion and allow us to easily load and concatenate 
#all of our data into one dataframe later.

df.to_csv(f'prepared_data_{username}.csv', index = False)
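
Once everyone has saved their file, loading and concatenating them could look roughly like the sketch below. The function name `combine_prepared` and the `pattern` argument are hypothetical, and the default pattern assumes the files carry a `.csv` extension and sit in the current directory. Adjust the pattern if your files are named or located differently.

```python
import pandas as pd

#Usernames taken from the username cell above
usernames = ['smith', 'bryant', 'vahle', 'everts', 'chavez']

def combine_prepared(usernames, pattern = 'prepared_data_{}.csv'):
    """
    Hypothetical helper: load each person's prepared file and stack
    them into a single dataframe.
    """
    #Read one dataframe per username
    frames = [pd.read_csv(pattern.format(name)) for name in usernames]

    #ignore_index = True rebuilds a clean 0..n-1 index across all files
    return pd.concat(frames, ignore_index = True)
```

Calling `combine_prepared(usernames)` will only work once all five files exist. Pass a shorter list of usernames to combine whatever is available so far.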

### Done!