# Match Crawler

This notebook handles the data collection step. It crawls high-elo League of Legends matches through the Riot Games API, using the `riotwatcher` package.

In [1]:
from riotwatcher import LolWatcher, ApiError
import pandas as pd

import os
import json
import re

api_key = 'RGAPI-28637d7d-49ae-49f1-993f-e6a41359ff1e'
watcher = LolWatcher(api_key=api_key)
REGION = 'kr'

### Fetch Challenger League

Fetch the summoners in the challenger league of the Korea server and cache the result locally in `data/kr_challenger_league.json`. The data was last cached on 2022-05-23.

In [2]:
if os.path.exists('data/kr_challenger_league.json'):
    with open('data/kr_challenger_league.json', 'r') as f:
        kr_challenger_league = json.load(f)
    print("Loaded previously cached data.")
else:
    kr_challenger_league = watcher.league.challenger_by_queue(REGION, "RANKED_SOLO_5x5")
    with open('data/kr_challenger_league.json', 'w') as f:
        json.dump(kr_challenger_league, f)

Loaded previously cached data.
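
The load-or-fetch pattern in the cell above can be factored into a small helper. The following is a minimal sketch, not part of the original notebook: `load_or_fetch` and its `fetch` callback are hypothetical names introduced here for illustration, and the helper additionally creates the cache directory, which the original cell assumes already exists.

```python
import json
import os
from typing import Any, Callable


def load_or_fetch(path: str, fetch: Callable[[], Any]) -> Any:
    """Return the JSON cached at `path`, or call `fetch()` and cache its result."""
    if os.path.exists(path):
        with open(path, "r") as f:
            return json.load(f)
    data = fetch()
    # Create the parent directory if needed so the first run does not fail.
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w") as f:
        json.dump(data, f)
    return data
```

With this helper, the cell above could be written as `load_or_fetch('data/kr_challenger_league.json', lambda: watcher.league.challenger_by_queue(REGION, "RANKED_SOLO_5x5"))`, so the API is only called when no cache file is present.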


Fetch the `puuid` of each challenger account, because matches can only be looked up by `puuid`. The league entries only carry a `summonerId`, so each account is requested by that ID.

In [3]:
# List of challenger summonerIds
challengers = [entry['summonerId'] for entry in kr_challenger_league['entries']]

# Index of the last account fetched successfully, stored on disk so that an
# interrupted crawl can resume (-1 means nothing has been fetched yet)
if os.path.exists('data/counter.txt'):
    with open('data/counter.txt', 'r') as f:
        counter = int(f.read())
    print(f"Loaded previously cached counter: {counter}.")
else:
    counter = -1

if os.path.exists('data/accounts.json'):
    with open('data/accounts.json', 'r') as f:
        accounts = json.load(f)
    print("Loaded previously cached accounts")
else:
    for i, challenger in enumerate(challengers):
        # Skip accounts already appended to accounts.txt in an earlier run
        if i <= counter:
            continue
        account = watcher.summoner.by_id(REGION, challenger)
        with open('data/accounts.txt', 'a') as f:
            f.write(str(account))
        with open('data/counter.txt', 'w') as f:
            f.write(str(i))
        print(f"Counter: {i}.")

    with open('data/accounts.txt', 'r') as f:
        accounts = f.read()

    # Records were appended back to back as dict reprs, so split at every
    # '}{' boundary; each element of accounts is a string, not a dict
    accounts = accounts.replace('}{', '}@@@{').split('@@@')

    with open('data/accounts.json', 'w') as f:
        json.dump(accounts, f)

Loaded previously cached counter: 299.
Loaded previously cached accounts
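
The cell above checkpoints progress with a separate counter file and stores each account as `str(account)`, which later has to be split apart again. An alternative, sketched below, is to append one JSON object per line (JSON Lines), so the number of lines already on disk serves as the resume position. This is a swapped-in technique, not what the notebook does; `fetch_with_checkpoint`, `fetch_one`, and the file path are hypothetical names chosen for illustration.

```python
import json
import os
from typing import Callable, Iterable, List


def fetch_with_checkpoint(
    ids: Iterable[str],
    fetch_one: Callable[[str], dict],
    path: str,
) -> List[dict]:
    """Fetch one record per id, appending each to a JSON Lines file at `path`."""
    records: List[dict] = []
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]

    # The number of records already on disk doubles as the resume position.
    for i, item_id in enumerate(ids):
        if i < len(records):
            continue
        record = fetch_one(item_id)
        with open(path, "a", encoding="utf-8") as f:
            # ensure_ascii=False keeps non-ASCII summoner names readable.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
        records.append(record)
    return records
```

In the notebook's setting, `fetch_one` would be something like `lambda sid: watcher.summoner.by_id(REGION, sid)`. The records then come back as dicts rather than strings, so any later cell that expects strings would need to be adjusted.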


### Fetch High Elo Matches

Fetch the recent matches of each challenger account by its `puuid`.
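
Because each account was cached as `str(account)`, every entry of `accounts` is a string holding the Python repr of a dict, not a dict itself. A minimal sketch of recovering the `puuid` values, assuming every entry is a well-formed dict literal; `extract_puuids` and the sample record are hypothetical and only mirror the shape of the cached data:

```python
import ast
from typing import List


def extract_puuids(raw_accounts: List[str]) -> List[str]:
    """Parse dict-repr strings and return the 'puuid' field of each."""
    puuids = []
    for raw in raw_accounts:
        # literal_eval only evaluates Python literals, so unlike eval it
        # will not execute arbitrary code embedded in the string.
        account = ast.literal_eval(raw)
        puuids.append(account["puuid"])
    return puuids


# Hypothetical record with the same shape as the cached entries.
sample = ["{'id': 'abc', 'puuid': 'example-puuid', 'summonerLevel': 67}"]
print(extract_puuids(sample))  # prints ['example-puuid']
```

`json.loads` would not work here, since the cached strings use single quotes and are therefore not valid JSON.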

In [4]:
accounts[0]

"{'id': 'eZSUIU09ORLZonrW_DdxTsPDch_S8u0-i3foyt2ahl8ma1_-DdR_crmDjA', 'accountId': 'HbvN6-lt4ao_4Ps2_ed0V7kBoFGOFlOaoSoKTiYTTJJaA5o5yVIe30zV', 'puuid': 'x2IpDOTbQi8g9NwT53TFTk2Flg5qO9Pj0JthidWeAl6D2_UNmWYKGbSzXQTz03lgG7_Q6VS4jJ8Fdw', 'name': '순삭쿠키', 'profileIconId': 5212, 'revisionDate': 1652510405000, 'summonerLevel': 67}"