<a id='intro'></a>
<div class="alert alert-block alert-warning"> 
<h1 align="center">Introduction</h1>
</div>

> To begin this project, I have to first explore the API, and extract any important information.
<br> First, I will familiarize myself with the structure of the data returned by each API request. </br>
<br> Once I understand the structure of the data, I will create a function to extract the necessary information and store it in an easily accessible DataFrame. </br>
<br> Next, I will create another function to get a chunk of data using a specified date, and store the information in a DataFrame using the previously created function from the step above. </br>
<br> Lastly, I will collect data from various days and hours for the months of January 2018 to April 2018. </br>

> Once I have collected my data, I will have to create a mapping between the character IDs and their corresponding names. Luckily, this information is given in two separate files I found through the official documentation: a JSON file, "gameplay.json", and an INI file, "English.ini".

<a id='data'></a>
<div class="alert alert-block alert-warning"> 
<h1 align="center">Data Collection</h1>
</div>

<a id='data1'></a>
### Explore API structure
#### Description:
> I will send an API request and explore the structure of the returned data, and determine how to extract the data I require.

#### Procedure:
> I will first create a function to make an API request, make a request, then explore the structure of the returned data.
1. Build a function to get the data from an API request
2. Explore the data

In [4]:
import json
import requests
import pandas as pd
from collections import defaultdict
import time
from datetime import date, datetime, timedelta

##### 1. Build a function to get the data from an API request

In [5]:
def get_data(url, start=None, end=None):
    '''
    Send API request for data between the start and end dates
    '''
    # Read the API key, dropping any trailing newline, and close the file
    with open("api_key.txt", "r") as f:
        api_key = f.read().strip()
    
    header = {
        "Authorization": api_key,
        "Accept": "application/vnd.api+json"
    }
    
    if not start and not end:
        r = requests.get(url, headers=header)
    else:
        # Format the datetimes as ISO 8601 UTC timestamps
        start = start.strftime('%Y-%m-%dT%H:%M:%SZ')
        end = end.strftime('%Y-%m-%dT%H:%M:%SZ')
        query = {
            "sort": "createdAt",
            "filter[createdAt-start]": start,
            "filter[createdAt-end]": end
        }
        r = requests.get(url, headers=header, params=query)
    
    return r.json()
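
The date-range filter can be checked without sending a request. The following is a minimal sketch, assuming the API accepts ISO 8601 UTC timestamps in the form used above; `build_query` is a hypothetical helper name and is not part of the notebook.

```python
from datetime import datetime, timedelta


def build_query(start, end):
    """Build the date-range query parameters used by get_data.

    build_query is a hypothetical helper; the parameter names are the
    ones already used in get_data above.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return {
        "sort": "createdAt",
        "filter[createdAt-start]": start.strftime(fmt),
        "filter[createdAt-end]": end.strftime(fmt),
    }


start = datetime(2018, 4, 1)
query = build_query(start, start + timedelta(days=1))
print(query["filter[createdAt-start]"])
print(query["filter[createdAt-end]"])
```

Printing the parameters first makes it easy to confirm the one-day window before spending any of the request quota.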

##### 2. Explore the data

In [6]:
'''Make an API request, and store the returned data'''
url = "https://api.dc01.gamelockerapp.com/shards/global/matches"

start = datetime(2018, 4, 1, 0, 0, 0, 0)
end = start + timedelta(days=1)

data = get_data(url, start, end)

In [7]:
'''Explore the 'data' key'''
print(json.dumps(data['data'][0], indent=4))

{
    "type": "match",
    "id": "34F63A9E2F544B53913C0AD840572330",
    "attributes": {
        "createdAt": "2018-04-01T00:00:00Z",
        "duration": 303,
        "gameMode": "1733162751",
        "patchVersion": "44504",
        "shardId": "global",
        "stats": {
            "mapID": "319DDC57E70174B6C85EF137BAF34E9E",
            "type": "QUICK2V2"
        },
        "tags": {
            "rankingType": "UNRANKED",
            "serverType": "QUICK2V2"
        },
        "titleId": "stunlock-studios-battlerite"
    },
    "relationships": {
        "assets": {
            "data": [
                {
                    "type": "asset",
                    "id": "626990a2-3541-11e8-b891-0a586460a616"
                }
            ]
        },
        "rosters": {
            "data": [
                {
                    "type": "roster",
                    "id": "4b490036-c870-4293-8e6f-06eb60c5ca30"
                },
                {
                    "type": "roster",

> The 'data' key holds match data.
<br> Each entry within 'data' includes information on the date the match was played, the match ID, the game mode played (2v2 or 3v3), and an asset ID used to access the match's telemetry data. </br>
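
As a rough illustration of where those fields live, the sketch below pulls them out of a trimmed copy of the entry printed above. `summarize_match` is a hypothetical helper name used only for this example.

```python
from datetime import datetime

# Trimmed copy of the match entry printed above.
sample_match = {
    "type": "match",
    "id": "34F63A9E2F544B53913C0AD840572330",
    "attributes": {
        "createdAt": "2018-04-01T00:00:00Z",
        "stats": {"type": "QUICK2V2"},
    },
    "relationships": {
        "assets": {
            "data": [
                {"type": "asset", "id": "626990a2-3541-11e8-b891-0a586460a616"}
            ]
        }
    },
}


def summarize_match(entry):
    """Return the fields of interest from one entry of data['data'].

    summarize_match is a hypothetical helper used only for illustration.
    """
    attributes = entry["attributes"]
    return {
        "date": datetime.strptime(attributes["createdAt"], "%Y-%m-%dT%H:%M:%SZ"),
        "match_id": entry["id"],
        # The last three characters of the type are "2V2" or "3V3".
        "game_mode": attributes["stats"]["type"][-3:],
        # The asset ID is the key used to look up the telemetry URL.
        "asset_id": entry["relationships"]["assets"]["data"][0]["id"],
    }


summary = summarize_match(sample_match)
print(summary)
```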

In [17]:
'''Explore telemetry data'''
# Get telemetry urls
telemetry_urls = {}
for entry in data['included']:
    if entry['type'] == 'asset':
        telemetry_urls[str(entry['id'])] = entry['attributes']['URL']

# Get telemetry data
asset_id = data['data'][0]['relationships']['assets']['data'][0]['id']
url = telemetry_urls[asset_id]
r = requests.get(url, headers={"Accept":"application/vnd.api+json"})
telemetry = r.json()
print(json.dumps(telemetry, indent=4))

[
    {
        "cursor": 882384555,
        "type": "Structures.MatchReservedUser",
        "dataObject": {
            "time": 1522540800310,
            "accountId": "783238845914296320",
            "matchId": "34F63A9E2F544B53913C0AD840572330",
            "serverType": "QUICK2V2",
            "characterLevel": 15,
            "teamId": "980201125078806528",
            "totalTimePlayed": 278448,
            "characterTimePlayed": 51115,
            "character": 39373466,
            "team": 1,
            "rankingType": "UNRANKED",
            "mount": 132750854,
            "attachment": 520549809,
            "outfit": 1471164257,
            "emote": 553323459,
            "league": 2,
            "division": 4,
            "divisionRating": 86,
            "seasonId": 7
        }
    },
    {
        "cursor": 882956068,
        "type": "Structures.UserRoundSpell",
        "dataObject": {
            "time": 1522541111616,
            "accountId": "815094029707522048",
      

> There are two important TelemetryData types here: "Structures.MatchReservedUser", and "Structures.MatchFinishedEvent".
<br> "Structures.MatchReservedUser" holds player information such as the ID of the character that was played, the player's ranking (bronze, silver, gold, etc.), the player's team, and the match's format (ranked or casual). </br>
<br> "Structures.MatchFinishedEvent" holds each team's score, which will be useful in determining which players won the match. </br>
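
The sketch below shows one way these two event types could be combined to label each player as a winner or loser. The telemetry list is hand-made: the field names follow the output above and the score fields `teamOneScore` and `teamTwoScore` used later in this notebook, but the values are illustrative, and `split_events` is a hypothetical helper name.

```python
# Hand-made telemetry sample; values are illustrative, not real match data.
sample_telemetry = [
    {"type": "Structures.MatchReservedUser",
     "dataObject": {"character": 39373466, "team": 1, "league": 2,
                    "rankingType": "UNRANKED"}},
    {"type": "Structures.MatchReservedUser",
     "dataObject": {"character": 467463015, "team": 2, "league": 3,
                    "rankingType": "UNRANKED"}},
    {"type": "Structures.UserRoundSpell", "dataObject": {}},
    {"type": "Structures.MatchFinishedEvent",
     "dataObject": {"teamOneScore": 3, "teamTwoScore": 1}},
]


def split_events(telemetry):
    """Return (players, winner) from a list of telemetry events.

    winner is 1 or 2, or None if no MatchFinishedEvent is present.
    """
    players = [event["dataObject"] for event in telemetry
               if event["type"] == "Structures.MatchReservedUser"]
    winner = None
    for event in telemetry:
        if event["type"] == "Structures.MatchFinishedEvent":
            scores = event["dataObject"]
            winner = 1 if scores["teamOneScore"] > scores["teamTwoScore"] else 2
    return players, winner


players, winner = split_events(sample_telemetry)
wins = [player["team"] == winner for player in players]
print(winner, wins)
```

Returning `None` when no finish event is found lets the caller skip incomplete matches instead of guessing a result.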

In [6]:
'''Explore the 'included' key'''
print(json.dumps(data['included'], indent=4))

[
    {
        "type": "participant",
        "id": "6c7665d4-b033-4245-bd4a-c7fdec5c8569",
        "attributes": {
            "actor": "1208445212",
            "shardId": "global",
            "stats": {
                "abilityUses": 232,
                "attachment": 2143958509,
                "damageDone": 935,
                "damageReceived": 938,
                "deaths": 3,
                "disablesDone": 217,
                "disablesReceived": 166,
                "emote": 914102918,
                "energyGained": 536,
                "energyUsed": 375,
                "healingDone": 130,
                "healingReceived": 232,
                "kills": 1,
                "mount": 1302058678,
                "outfit": 1379696215,
                "score": 1282,
                "side": 1,
                "timeAlive": 190
            }
        },
        "relationships": {
            "player": {
                "data": {
                    "type": "player",
               

> The 'included' key holds various pieces of information, but the only entries I am interested in are the assets, which hold the telemetry URLs for each match.
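
A minimal sketch of picking out just the asset entries is shown below. The sample list is hand-made, and the asset URL is a placeholder rather than a real telemetry address.

```python
# Hand-made sample of data['included']; the URL is a placeholder.
sample_included = [
    {"type": "participant",
     "id": "6c7665d4-b033-4245-bd4a-c7fdec5c8569",
     "attributes": {"actor": "1208445212"}},
    {"type": "asset",
     "id": "626990a2-3541-11e8-b891-0a586460a616",
     "attributes": {"URL": "https://example.com/telemetry.json"}},
]

# Map each asset ID to its telemetry URL, ignoring every other entry type.
telemetry_urls = {
    entry["id"]: entry["attributes"]["URL"]
    for entry in sample_included
    if entry["type"] == "asset"
}
print(telemetry_urls)
```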

<a id='data2'></a>
### Collect data
#### Description:
> Now that I understand the structure of the data, I can start collecting my data.

#### Procedure:
> I will first create two functions, one to extract the necessary data from a response and store it as a DataFrame, and another to get a chunk of data for a specified date. Once these functions have been created, I will collect data from various days and hours for the months of January 2018 to April 2018. Because of the current date (April 10), I will only collect data for the first week of April.
1. Build a function to extract data
2. Build a function to get data for specified date
3. Collect data

##### 1. Build a function to extract data

In [7]:
def extract(data):
    '''
    Get match data, and store in a DataFrame
    '''
    
    # Get telemetry urls
    telemetry_urls = {}
    for entry in data['included']:
        if entry['type'] == 'asset':
            telemetry_urls[str(entry['id'])] = entry['attributes']['URL']

    # Get match data
    match_data = {'date':[], 'match_id':[], 'game_mode':[], 'ranked':[], 'character':[], 'league':[], 'win':[]}
    for entry in data['data']:
        if entry['attributes']['stats']['type'] in ['QUICK2V2', 'QUICK3V3', 'LEAGUE2V2', 'LEAGUE3V3']:
            # Get date (stored once per player further down)
            match_date = datetime.strptime(entry['attributes']['createdAt'], '%Y-%m-%dT%H:%M:%SZ')

            # Get match id
            match_id = entry['id']

            # Get game mode (2v2 or 3v3)
            game_mode = entry['attributes']['stats']['type'][-3:]

            # Get telemetry data
            asset_id = entry['relationships']['assets']['data'][0]['id']
            url = telemetry_urls[asset_id]
            r = requests.get(url, headers={"Accept":"application/vnd.api+json"})
            telemetry = r.json()

            # Get player info
            players = []
            winner = None
            for event in telemetry:
                # Determine winning team
                if event['type'] == 'Structures.MatchFinishedEvent':
                    if event['dataObject']['teamOneScore'] > event['dataObject']['teamTwoScore']:
                        winner = 1
                    else:
                        winner = 2
                
                # Gather player info
                if event['type'] == 'Structures.MatchReservedUser':
                    players.append(event)

            # Skip matches with no players or no recorded result
            if not players or winner is None:
                continue

            for player in players:
                # Gather player info
                character = player['dataObject']['character']
                league = player['dataObject']['league']
                win = player['dataObject']['team'] == winner

                # Store player info
                match_data['character'].append(character)
                match_data['league'].append(league)
                match_data['win'].append(win)
            
            # Determine if the game was ranked
            ranked = players[0]['dataObject']['rankingType'] == 'RANKED'

            count = len(players)
            # Store remaining data, one value per player
            match_data['date'] += [match_date]*count
            match_data['match_id'] += [match_id]*count
            match_data['game_mode'] += [game_mode]*count
            match_data['ranked'] += [ranked]*count
            
    # Convert into DataFrame
    DataFrame = pd.DataFrame(match_data)
    
    return DataFrame

##### 2. Build a function to get data for specified date

In [8]:
def get_chunk(df=pd.DataFrame(), year=2018, month=1, day=1, hour=0, minute=0):
    '''
    Get chunk of data
    '''
    
    # Initialize start and end dates
    start = datetime(year, month, day, hour, minute, 0, 0)
    end = start + timedelta(days=1)
    
    # Get data from first page
    url = "https://api.dc01.gamelockerapp.com/shards/global/matches"
    data = get_data(url, start, end)
    temp = extract(data)
    df = pd.concat([df, temp], ignore_index=True)
    
    # Get data from next pages
    for i in range(9):
        # Get the next page, stopping early if there are no more pages
        url = data.get('links', {}).get('next')
        if not url:
            break
        data = get_data(url)
        
        # Create DataFrame
        temp = extract(data)

        # Concatenate DataFrames
        df = pd.concat([df, temp], ignore_index=True)
        
    # Delay requests
    time.sleep(53)
    
    return df
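
The paging logic can be tried out offline. The sketch below follows the `links` / `next` chain over a few fake pages; it assumes the last page has no `next` link, which is an assumption about the API rather than something confirmed here. `collect_pages` and `fake_api` are hypothetical names.

```python
def collect_pages(fetch, first_url, max_pages=10):
    """Follow 'links' -> 'next' until it runs out or max_pages is reached.

    fetch is any callable that takes a URL and returns the decoded JSON
    page (for example get_data). collect_pages is a hypothetical helper.
    """
    pages = []
    url = first_url
    while url and len(pages) < max_pages:
        page = fetch(url)
        pages.append(page)
        # Assumption: the final page omits the 'next' link.
        url = page.get("links", {}).get("next")
    return pages


# Stand-in for the API: three fake pages keyed by placeholder URLs.
fake_api = {
    "page1": {"data": [1, 2], "links": {"next": "page2"}},
    "page2": {"data": [3], "links": {"next": "page3"}},
    "page3": {"data": [4], "links": {}},
}

pages = collect_pages(fake_api.__getitem__, "page1")
print(len(pages))
```

Passing the fetch function in as an argument keeps the loop testable without an API key or network access.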

##### 3. Collect data

In [None]:
'''First Run'''
# Initialize empty DataFrame
match_df = pd.DataFrame()

# Cycle through Jan-March
for month in range(1,4):
    # Cycle through various days of the month
    for day in range(1,29,3):
        # Update DataFrame
        match_df = get_chunk(match_df, 2018, month, day)

# Cycle through first week of April
for day in range(1,8):
    # Update DataFrame
    match_df = get_chunk(match_df, 2018, 4, day)

# Save DataFrame
#match_df.to_csv('compiled_data\match_data', index=False)

In [None]:
'''Second Run'''
# Initialize empty DataFrame
match_df = pd.DataFrame()

# Cycle through Jan-March
for month in range(1,4):
    # Cycle through various days of the month
    for day in range(1,29,3):
        # Cycle through various hours of the day
        for hour in [0, 18, 19, 20, 21, 22, 23]:
            # Cycle through various minutes of the day
            for minute in range(0,60,10):
                # Update DataFrame
                match_df = get_chunk(match_df, 2018, month, day, hour, minute)

# Cycle through first week of April
for day in range(1,8):
    # Cycle through various hours of the day
    for hour in [0, 18, 19, 20, 21, 22, 23]:
        # Cycle through various minutes of the hour
        for minute in range(0,60,10):
            # Update DataFrame
            match_df = get_chunk(match_df, 2018, 4, day, hour, minute)

# Save DataFrame
#match_df.to_csv('compiled_data\match_data2', index=False)
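
Before launching either run, it can help to count how many chunks the loops above will request. The sketch below rebuilds the same schedule without calling the API; the 53-second figure is the delay used in `get_chunk`, and the variable names are illustrative.

```python
from datetime import datetime

# Days sampled in January to March, plus the first week of April.
days = [(month, day) for month in range(1, 4) for day in range(1, 29, 3)]
days += [(4, day) for day in range(1, 8)]

hours = [0, 18, 19, 20, 21, 22, 23]
minutes = range(0, 60, 10)

# Start time of every chunk in each run.
first_run = [datetime(2018, m, d) for m, d in days]
second_run = [datetime(2018, m, d, h, mi)
              for m, d in days for h in hours for mi in minutes]

SLEEP_SECONDS = 53  # delay used in get_chunk
sleep_hours = len(second_run) * SLEEP_SECONDS / 3600

print(len(first_run), len(second_run))
print(round(sleep_hours, 1))
```

This gives 37 start times for the first run and 1,554 for the second. At 53 seconds of sleep per chunk, the second run spends roughly 22.9 hours waiting, not counting the time taken by the requests themselves.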

<a id='characters'></a>
<div class="alert alert-block alert-warning"> 
<h1 align="center">Character Mappings</h1>
</div>

<a id='character1'></a>
### Build character mappings
#### Description:
> Now that I have collected my data, I must map each character ID to its corresponding name. To do this, I will go through the "gameplay.json" and "English.ini" files and link the corresponding data.

> The "gameplay.json" file includes information on each character.
<br> For each character, there are two IDs, a "typeID" and a "name". </br>
<br> The "typeID" corresponds to the ID stored in my collected data, while the "name" is an ID in the "English.ini" file that can be used to find each character's name. </br>

#### Procedure:
> I will first go through the "gameplay.json" file to create a DataFrame to store each character's "typeID" and "name" ID, then go through the "English.ini" file to create a separate DataFrame to store the mappings of each "name" ID and its value. I can then merge these two DataFrames using an inner join to map each "typeID" with the correct name.
1. Build DataFrame using "gameplay.json"
2. Build DataFrame using "English.ini"
3. Build character mappings
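
The same two-step lookup can be sketched with plain dictionaries before bringing in DataFrames. The inputs below are made-up stand-ins for the two files: the "typeID" and "name" keys come from the description above, the name IDs are placeholders, and the sketch assumes "English.ini" stores one `key=value` pair per line.

```python
import json

# Made-up stand-ins for the two files; the name IDs are placeholders.
gameplay_text = json.dumps({
    "characters": [
        {"typeID": 467463015, "name": "name_id_a"},
        {"typeID": 259914044, "name": "name_id_b"},
        {"typeID": 123, "name": "name_id_missing"},
    ]
})
ini_lines = ["name_id_a=Lucie\n", "name_id_b=Sirius\n", "other_key=Some text\n"]

# "name" ID -> display name, split on the first '=' only.
localization = dict(line.rstrip("\n").split("=", 1) for line in ini_lines)

# typeID -> display name; unmatched IDs are dropped, like an inner join.
character_names = {
    character["typeID"]: localization[character["name"]]
    for character in json.loads(gameplay_text)["characters"]
    if character["name"] in localization
}
print(character_names)
```

Characters whose name ID is missing from the localization file are dropped, which mirrors what an inner join does.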

##### 1. Build DataFrame using "gameplay.json"

In [9]:
# Open 'gameplay.json' and load as a json file
with open('mappings/gameplay.json', 'r') as f:
    character_mappings = json.load(f)['characters']

# Loop through each character and store the 'typeID' and 'name' attributes
character_names = defaultdict(list)
for character in character_mappings:
    character_names['characterID'].append(character['typeID'])
    character_names['nameID'].append(character['name'])

# Turn character_names into a DataFrame
character_names = pd.DataFrame(character_names)

##### 2. Build DataFrame using "English.ini"

In [10]:
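# Open 'English.ini' file and store
filename = 'mappings/English.ini'
localization_mapping = {}
with open(filename, encoding='utf8') as text:
    for line in text:
        # Skip blank lines and lines without a key=value pair
        if '=' not in line:
            continue
        key, value = line.split('=', 1)
        # Strip only the trailing newline, if there is one
        localization_mapping[key] = value.rstrip('\n')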

# Turn localization_mapping into a DataFrame
localization_mapping = pd.DataFrame.from_dict(localization_mapping, orient='index').reset_index().rename(columns={0:'name'})

##### 3. Build character mappings

In [11]:
# Merge the two DataFrames to match 'characterID' to 'name'
character_names = character_names.merge(localization_mapping, left_on='nameID', right_on='index')
# Drop unnecessary columns
character_names = character_names.drop(columns=['nameID', 'index'])

# Save data
character_names.to_csv('compiled_data/character_names', index=False)
character_names.head()

   characterID      name
0    467463015     Lucie
1    259914044    Sirius
2   1649551456  Pestilus
3    543520739   Blossom
4    842211418       Iva
