# Some Title League
    by Gerald Liu and On Choi
     - CMSC 320 Fall 2019

<img src="https://i.pinimg.com/originals/4c/0b/51/4c0b5190cf49d2d3e16d8ab56c632b44.jpg" width="500" height="200"/>
<img src="https://media.comicbook.com/2019/04/riot-games-logo-1167492-1280x0.jpeg"  width="200" height="200"/>


## 1. Introduction

League of Legends (LoL) is a MOBA (Multiplayer Online Battle Arena) published by Riot Games in 2009 and is the [most popular online PC game](https://newzoo.com/insights/rankings/top-20-core-pc-games/) as of November 2019. Its competitive e-sports scene is popular among players: the most recent international tournament had an estimated 99.6 million viewers and a $2.5 million prize pool for over 100 teams across 13 regions around the world [(source)](https://www.businessinsider.com/league-of-legends-world-championship-winner-funplus-phoenix-photos-2019-11). As of 2017, an estimated 100+ million matches of League were played per day, with 8 million concurrent players at any given time, and those numbers have only gone up. As a result, LoL has generated a plethora of data over the years, and much can be learned from it for many perspectives and purposes.


In our tutorial, we will be looking at match and player data from the North American server for the latest season as of December 2019. We will crawl the League API to gather and tidy useful data.

## 2. Required Libraries/Resources

 - All libraries and code will be imported/written with Python 3 in mind. If you have Python 2, use the future library or upgrade :D
 - To scrape League of Legends data, we need requests and beautifulsoup4
 - pandas and NumPy will be used to transform and tidy our dataset
 - matplotlib and seaborn for data visualization
 - sklearn for predictions and machine learning
 - urllib for percent-encoding special characters in strings (for hitting the API)
 - time to sleep between requests and stay under the rate limit
 - random to pick the next player at random while crawling the Riot API
 
 
 - We need an API key to access the Riot API. We will load it from a hidden file called 'api_key.json'. Generate yours at https://developer.riotgames.com/. 

In [68]:
# Imports
import pandas as pd
import numpy as np
from sklearn import linear_model
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split, validation_curve, cross_val_score
import matplotlib.pyplot as plt
import seaborn as sns
import requests
import json
from bs4 import BeautifulSoup
import urllib
import time
import random
import warnings
warnings.filterwarnings('ignore')

# API key obtained from the Riot Developer Portal (unique per account; expires every 16 hours)
# However, you can apply for a permanent API key at https://developer.riotgames.com/app-type
API_KEY = 'XXXXX-XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX'
# We load our API key from a JSON file (it can also be stored as an environment variable if you would like)
with open('api_key.json') as f:
    API_KEY = json.load(f)['API_KEY']

# Set random seed
random.seed(1)
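
The comment in the cell above mentions that the key can also live in an environment variable. A minimal sketch of that option follows. The variable name `RIOT_API_KEY` and the helper `load_api_key` are assumptions made for this example and are not part of Riot's tooling; the helper falls back to the same `api_key.json` layout used above.

```python
import json
import os


def load_api_key(path="api_key.json", env_var="RIOT_API_KEY"):
    """Return the API key from an environment variable, else from a JSON file.

    `RIOT_API_KEY` is an assumed variable name chosen for this sketch.
    """
    key = os.environ.get(env_var)
    if key:
        return key
    # Fall back to the JSON file, expected to look like {"API_KEY": "..."}
    with open(path) as f:
        return json.load(f)["API_KEY"]
```

With this in place, `API_KEY = load_api_key()` could replace the `with open(...)` block, which keeps the key out of the notebook either way.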

## 3. Data Collection (API Crawling)

 - To gather match data, Riot provides an [API](https://developer.riotgames.com/apis) that has data for matches, players (summoners), champions (playable characters), etc.
 
 
 - However, Riot only serves match data for a given matchID, and match IDs cannot be sampled at random. Therefore, we need to start from a seed player and crawl through the matches of that player's teammates and opponents to generate enough data to perform data analysis.
 
 
 - Our seed will be one of our own summoner names, "50g Tibbers", on the North American server. 
 
 
 ### Using Riot's API
 
 - Riot has a [developer portal](https://developer.riotgames.com/), for which you will need a Riot [account](https://developer.riotgames.com/login). 
 - You will need an API key, which you must regenerate daily unless you [apply for a product API key](https://developer.riotgames.com/app-type), which persists.
 - Documentation can be found at https://developer.riotgames.com/apis
 - Riot has rate limits of 20 requests/second and 100 requests/2 min, so collecting enough data to learn on will take a while. Therefore, we will export data to a file to pull from later, and generate more data as often as we can. 
 - Example for calling summoner (player) information for current API version (v4):
     - https://na1.api.riotgames.com/lol/summoner/v4/summoners/by-name/SUMMONER_NAME?api_key= + API_KEY
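
The two rate limits quoted above can be enforced with a small sliding-window limiter instead of fixed sleeps. This is only a sketch under the limits stated in this tutorial (20 requests/second and 100 requests/2 min); the class name `SlidingWindowLimiter` is hypothetical, and the limits attached to your own key may differ.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Block until a request fits inside every (max_calls, period_seconds) window."""

    def __init__(self, limits=((20, 1.0), (100, 120.0)),
                 clock=time.monotonic, sleep=time.sleep):
        self.limits = limits
        self.clock = clock    # injectable so the limiter can be tested without waiting
        self.sleep = sleep
        self.calls = deque()  # timestamps of recent requests, oldest first

    def wait(self):
        longest = max(period for _, period in self.limits)
        while True:
            now = self.clock()
            # Forget requests older than the longest window
            while self.calls and now - self.calls[0] >= longest:
                self.calls.popleft()
            delay = 0.0
            for max_calls, period in self.limits:
                recent = [t for t in self.calls if now - t < period]
                if len(recent) >= max_calls:
                    # Wait until enough old requests leave this window
                    delay = max(delay, recent[-max_calls] + period - now)
            if delay <= 0:
                self.calls.append(now)
                return
            self.sleep(delay)
```

A caller would create one `limiter = SlidingWindowLimiter()` and call `limiter.wait()` immediately before each `requests.get(...)`. Compared with a fixed delay per request, this lets short bursts through and only pauses when a window is actually full.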
     

In [3]:
response = requests.get('https://na1.api.riotgames.com/lol/summoner/v4/summoners/by-name/50g%20Tibbers?api_key=' + API_KEY)

In [5]:
response.json()

{'id': 'IpnPo_LsY61sOsbPsRgVhwe1jmNdXF-m84rnkO3UwISwRY0',
 'accountId': 'e9Po7IYch-H6xHFLKmwAAoVDSXHRRIvdpbWQv-6mtLM7njA',
 'puuid': 'RZkfzEJAMhOrIZ7nM6EMbSNg0ww6phNe_RhEQPR3YgU2gCSg6yTRkgQEiSgjGrCEDxFnif23D93kcQ',
 'name': '50g Tibbers',
 'profileIconId': 3800,
 'revisionDate': 1575959895000,
 'summonerLevel': 103}
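
The request above hard-codes `50g%20Tibbers`, where the space in the summoner name has been percent-encoded. A short sketch of building that URL for any name with `urllib.parse.quote` is shown below; the helper name `summoner_by_name_url` is hypothetical and introduced only for illustration.

```python
from urllib.parse import quote


def summoner_by_name_url(name, region="na1"):
    """Build the summoner-v4 by-name URL, percent-encoding the summoner name."""
    # safe="" also escapes "/", so the name cannot alter the URL path
    return "https://{}.api.riotgames.com/lol/summoner/v4/summoners/by-name/{}".format(
        region, quote(name, safe="")
    )


print(summoner_by_name_url("50g Tibbers"))
# https://na1.api.riotgames.com/lol/summoner/v4/summoners/by-name/50g%20Tibbers
```

The key can then be passed separately, for example `requests.get(summoner_by_name_url("50g Tibbers"), params={"api_key": API_KEY})`, so it never needs to be typed into a URL string.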

In [114]:
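def get_random_matches(num_matches=100, seed_summoner="50g Tibbers", key=API_KEY, rank="any", region="na1", season=13, sleep_delay=0.2):
    """Crawl Riot's match-v4 API from a seed summoner and return a list of match JSON dicts.

    Each iteration takes the current player's most recent unseen match from `season`,
    then moves on to a random unvisited participant of that match. `rank` is accepted
    but not used in the body below.
    """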
    start_time = time.time()
    
    # Estimated Time based on rate limit
    seconds = (num_matches / 50 * 60) * 3 + num_matches * sleep_delay
    print("This will take minimum {} seconds to complete. If rank option is selected, it will take longer.\n".format(seconds))
    
    # Get seed summoner accountId
    seed_request = requests.get("https://{}.api.riotgames.com/lol/summoner/v4/summoners/by-name/{}?api_key={}".format(region, seed_summoner, key))
    if seed_request.status_code == 404:
        print("Error Code 404. Invalid seed_summoner or region.")
        return
    seed_json = seed_request.json()
    accountId = seed_json['accountId']
    print("Seed Summoner = {}".format(seed_summoner))
    
    # Initialize the request counter and the visited sets after getting the seed summoner's accountId
    request_num = 1
    visited_players = set()
    visited_matches = set()
    visited_players.add(accountId)
    matches = []

    for m in range(num_matches):
        # Sleep to account for rate limit. 100 requests per 2 minutes
        time.sleep(1.2) # Request 1

        # Grab account's match history
        match_history = requests.get("https://{}.api.riotgames.com/lol/match/v4/matchlists/by-account/{}?api_key={}".format(region, accountId, key)).json()
        time.sleep(1.2) # Request 2

        # Get most recent game from season parameter. Then grab gameId to get match info.
        most_recent = None
        match_counter = 0
        if 'matches' in match_history:
            
            for match in match_history['matches']:
                if match['season'] == season and match['gameId'] not in visited_matches:
                    most_recent = match
                    visited_matches.add(match['gameId'])
                    break
                match_counter += 1
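                if match_counter == len(match_history['matches']):
                    # Every listed match was already seen or is from another season;
                    # fall back to the last collected match, if there is one
                    print("Found repeat...")
                    most_recent = matches[-1] if matches else None
        else:
            # No 'matches' key (e.g. an error response), so do not index into it
            print("Found no matches...")
            most_recent = matches[-1] if matches else None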

        if most_recent is None:
            print("Make sure your seed account has a game (most recent 100) from the right season. (season {})".format(season))
            print("Or... all matches in this player's history have been seen...")
            # Return whatever was collected so far instead of discarding it
            return matches
        gameId = most_recent['gameId']

        # Grab match info
        match = requests.get("https://{}.api.riotgames.com/lol/match/v4/matches/{}?api_key={}".format(region, gameId, key)).json()
        time.sleep(1.2) # Request 3
        matches.append(match)
        print("Found match #{}: {} for {}".format(m+1, gameId, accountId))

        # Now we need to crawl. Grab a random teammate or enemy and continue match lookup. 
        player_counter = 0
        # len(match['participantIdentities']) should be 10
        if 'participantIdentities' in match:
            random_order = list(range(len(match['participantIdentities'])))
            random.shuffle(random_order)
            for n in random_order:
                chosen_one = match['participantIdentities'][n]['player']['accountId']
                if chosen_one not in visited_players:
                    visited_players.add(chosen_one)
                    accountId = chosen_one
                    break
                player_counter += 1
                # If every player in this match has already been visited, stop crawling
                # and return the matches collected so far
                if player_counter == len(match['participantIdentities']):
                    print("uh oh...")
                    return matches
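        else:
            # The response is not a valid match (e.g. an error payload), so drop it.
            # Reassigning the loop variable m has no effect in a for loop, so this
            # iteration is not retried.
            print('Invalid match found... {}'.format(gameId))
            matches.pop()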
            
        # Sleep for good measure (may be removed later)
        time.sleep(sleep_delay)
        
        # Now, we should have another accountId to work with
        # Continue loop until we have num_matches
        
    # Display runtime and return matches list
    print("Found {} matches in {} seconds.".format(len(matches), time.time() - start_time))
    return matches
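
Because crawling is slow under the rate limit, the "Using Riot's API" notes above suggest exporting data to a file and adding to it over several runs. One possible way to do that is sketched below; the helper name `save_matches` and the file name `matches.json` are assumptions for this example, and the sketch relies on each match dict carrying a `gameId` key, as the crawler above does.

```python
import json
import os


def save_matches(new_matches, path="matches.json"):
    """Merge new matches into a JSON file, keeping one entry per gameId."""
    existing = []
    if os.path.exists(path):
        with open(path) as f:
            existing = json.load(f)
    by_id = {m["gameId"]: m for m in existing}
    for m in new_matches:
        # Skip error payloads (e.g. rate-limit responses) that carry no gameId
        if isinstance(m, dict) and "gameId" in m:
            by_id[m["gameId"]] = m
    merged = list(by_id.values())
    with open(path, "w") as f:
        json.dump(merged, f)
    # Number of unique matches now stored in the file
    return len(merged)
```

Calling `save_matches(matches)` after each crawl accumulates a deduplicated dataset, and `json.load(open("matches.json"))` reads it back for analysis without hitting the API again.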

In [115]:
matches = get_random_matches(num_matches=100)

This will take minimum 380.0 seconds to complete. If rank option is selected, it will take longer.

Seed Summoner = 50g Tibbers
Found match #1: 3230641357 for e9Po7IYch-H6xHFLKmwAAoVDSXHRRIvdpbWQv-6mtLM7njA
Found match #2: 3231394878 for ICRNB-sdlRSKD5kkmQPn7nktjgkyT-MU3xnNm0eiordh1fU
Found match #3: 3231406131 for YJZEQUR1Ix9L_QpNqbgS3mVwk91gAGCJggFGOyYetPYroSFQ-DQ-gZKb
Found match #4: 3231395498 for 748vB_Op96Q20klM6mxAK6821K3xmlohBLd6v4fn_V_Hpj0
Found match #5: 3231388046 for 2hI5Q6TguGYWZZtzN7P8PvhE1tZcON5rmND5UCjw8jVfXw
Found match #6: 3231281274 for YEFg3hbi2kDPMQ5aBn7aCs8cAY5Su-nYUWvwx0wPSCjEVw
Found match #7: 3231308684 for nhDXOeJLnXd1O53p95-Pk7q_QwH_VIBe0VMGU3DEk2w
Found match #8: 3231284466 for BY8wwtHsPUlOn0nTVq8K1wT3RRKaSzHqgh-jNpcJqoYjCw
Found match #9: 3231384897 for yHvVFuK9VLduqfB7C47QX6OrdAILsSZBXZwFUzFdD-1oKUc
Found match #10: 3231314535 for SvWlRDU5ycvy3YvAYgp38Tqq0hxAYrSr8oo2y7KzgnXG5XQuKmSHpY6Q
Found match #11: 3231309967 for kOdW6KVVebqoEBtLT3j-UYr0_n-orzCX0oFf7F