# Capstone Project - RL



## Problem Statement


Rocket League is regarded as a video game in which players improve very slowly. Whilst easy to pick up and enjoy, the highly mechanical and precise nature of the game makes progression at high ranks very slow. Professional e-sports players constantly push the skill ceiling in mechanical talent and tactical mastery, so they can serve both as examples for improvement and as play-style case studies for developing players, particularly at mid to high ranks.
This project uses 3 pro players, selected as top-level examples of 3 distinct play-styles:
- MonkeyMoon: 
    - Highly efficient with boost, so he is never out of the game for lack of it
    - Very strong decision making - only uses the amount of resources required to make a favourable play for his team
- Vatira: 
    - Plays the ‘3rd man’ position, largely with a high amount of boost, which lets him capitalise on space created by teammates
    - Plays close to teammates and pressures the opposition collectively
- Oski: 
    - ‘Ballchaser’, uses a large amount of boost to play as fast as possible, aiming to beat the opposition to the ball or take out multiple players in one play
    - Often leaves teammates exposed, but looks to create more opportunities than he concedes over a game
---
An ExtraTrees classification model has been trained on replay data from professional tournaments for each of these players, and can be used to predict which play-style a player most closely fits from a submitted 3v3 replay ID and player ID.
- Visuals highlighting the key differences between the play-styles can also be viewed, along with the predicted probability of each play-style.
---
Furthermore, a neural network regression model has been produced to estimate the rank of a player from the stats of a submitted 3v3 replay ID and player ID.

### Imports

In [3]:
import requests
import json
import pandas as pd

import time
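
The API calls in this notebook need a personal ballchasing authorisation key. As a minimal sketch, the key could be read from an environment variable rather than typed into each cell. The helper name `load_auth_key` and the variable name `BALLCHASING_API_KEY` are assumptions made for this illustration, not part of the original notebook or of the API.

```python
import os


def load_auth_key(env_var="BALLCHASING_API_KEY"):
    """Return the API key stored in the given environment variable.

    The default variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key")
    return key
```

The returned string could then be passed as `auth_key` to the collection functions, which keeps the key out of any saved copy of the notebook.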

## Oski Data Collection

In [1]:
# params for the function below

params = {'player-name': 'Oski',
          'playlist': ['private'],
          'replay-date-after': '2022-01-02T15:00:05+01:00',
          'replay-date-before': '2023-04-19T15:00:05+01:00',
          'pro': 'true'}

In [4]:
def get_replays(num_replays, params, auth_key, collected_replays = 0, id_list = None):
    '''
    This function takes a specified number of replays required, designed params (check the ballchasing api), a personal authorisation api key
    and returns a list of replay ids for use in the next function
    '''
    
    # a fresh list for every top-level call; a mutable default ([]) would be shared between calls,
    # so ids collected for one player would leak into the list returned for the next
    if id_list is None:
        id_list = []
    
    # prevents exceeding the rate limit
    time.sleep(0.3)
    
    res = requests.get('https://ballchasing.com/api/replays/',
                       headers = {'Authorization': auth_key},
                       params = params
                       )
    data = res.json()
    replays = data.get('list', [])
    
    # collects 50 (default) replays at a time, and prints the number that have been collected
    collected_replays += len(replays)
    print(collected_replays, num_replays)
    
    # recursively calls the function until the collected replays exceed the number specified in the function params
    # (an empty page means there is nothing left to fetch, so the recursion stops there too)
    if replays and collected_replays <= num_replays:
        id_list.extend(replay['id'] for replay in replays)
        
        # updates the end date filter in a copy of the params, in order to get new replays
        # without modifying the caller's dictionary
        next_params = dict(params)
        next_params['replay-date-before'] = replays[-1]['date']
        
        # call the function with new params
        get_replays(num_replays = num_replays, 
                    params = next_params, 
                    auth_key = auth_key,
                    collected_replays = collected_replays,
                    id_list = id_list
                   )
                
    return id_list



def flatten_stats(stats):
    '''
    Unpacks the stats subdictionaries (core, boost, movement, positioning, demo) into one flat dictionary
    '''
    flat = dict(stats)
    for group in ['core', 'boost', 'movement', 'positioning', 'demo']:
        flat.update(flat.pop(group))
    return flat


def get_gamedata(id_list, auth_key, player_name = None):
    '''
    This function takes the id_list from the previous function, uses the personal auth key and a player name
    to retrieve the in game stats for the player_name from each replay id in id_list
    (no rows are collected if player_name is None)
    
    It creates a new dataframe and unpacks the subdictionaries into a full dataframe
    '''
    errors = 0
    headers = {'Authorization': auth_key}
    
    res = requests.get(f'https://ballchasing.com/api/replays/{id_list[0]}', headers = headers)
    data = res.json()
    
    # Formation of the dataframe skeleton
    player_data = flatten_stats(data['orange']['players'][0]['stats'])
    player_data['player_name'] = player_name
    
    df = pd.DataFrame(columns = player_data.keys())
    
    for i, s in enumerate(id_list):
        # prevents the function exceeding the rate limit
        time.sleep(0.3)
        
        res = requests.get(f'https://ballchasing.com/api/replays/{s}', headers = headers)
        data = res.json()
        
        # error handling, if the call is unsuccessful, there will be no key 'orange'
        if 'orange' not in data:
            print(f"match {i} failed to retrieve data, {df.shape[0]} games collected")
            errors += 1
            continue
        
        # only pulls matches that are 3v3
        try:
            if len(data['orange']['players']) != 3:
                print(f"match {i} not 3V3")
                errors += 1
                continue
        except KeyError:
            errors += 1
            print(f"KeyError at match {i}, errors = {errors}")
            continue
        
        # verbose, updates the user whenever the number of games collected is a multiple of 20
        if df.shape[0] % 20 == 0:
            print(f"{df.shape[0]} games collected, {errors} errors")
        
        # searches both teams for player_name, then retrieves the stats and unpacks the subdictionaries
        try:
            if player_name is not None:
                # zip stops at the shorter team, so both teams are searched over the same positions
                for orange_player, blue_player in zip(data['orange']['players'], data['blue']['players']):
                    for player in (orange_player, blue_player):
                        if player['name'].lower() == player_name.lower():
                            player_data = flatten_stats(player['stats'])
                            player_data['title'] = data['title']
                            player_data['player_name'] = player_name
                            df.loc[player_data['title'], :] = player_data
                            break
        
        # if one of the keys is not present, continue and ignore this replay file
        except KeyError:
            errors += 1
            print(f'KeyError in match {i}, errors = {errors}')
    
    # the result is a tuple, so the dataframe is its first element
    print(f"{df.shape[0]} games collected, errors = {errors}")
    return df, None
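
To make the per-replay step of `get_gamedata` easier to follow, here is a small offline sketch that runs the same idea on a made-up payload: find the player on either team, then merge the stat groups into one flat row. The helper name `extract_player_row`, the payload values, and the `avg_speed` key are placeholders invented for this illustration; only the nesting (team, `players`, `stats`, then the five stat groups) is taken from the function above.

```python
STAT_GROUPS = ("core", "boost", "movement", "positioning", "demo")


def extract_player_row(replay, player_name):
    """Return one flat dict of stats for `player_name`, or None if absent."""
    for colour in ("orange", "blue"):
        for player in replay.get(colour, {}).get("players", []):
            if player["name"].lower() == player_name.lower():
                row = {}
                for group in STAT_GROUPS:
                    row.update(player["stats"][group])
                row["title"] = replay["title"]
                row["player_name"] = player_name
                return row
    return None


# Made-up payload; a real replay carries many more fields per group.
fake_replay = {
    "title": "Example G1",
    "orange": {"players": [
        {"name": "Someone", "stats": {group: {} for group in STAT_GROUPS}},
    ]},
    "blue": {"players": [
        {"name": "OSKI", "stats": {
            "core": {"shots": 3, "goals": 1},
            "boost": {"bpm": 450},
            "movement": {"avg_speed": 1600},  # placeholder key
            "positioning": {"percent_behind_ball": 70.0},
            "demo": {"inflicted": 1, "taken": 0},
        }},
    ]},
}

row = extract_player_row(fake_replay, "Oski")
print(row)
```

Collecting such rows in a list and building the dataframe once at the end would be an alternative to assigning rows one at a time, at the cost of changing how duplicate titles are handled.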

In [6]:
id_list = get_replays(num_replays=500, 
                      params = params, 
                      auth_key='<YOUR_API_KEY>')

50 500
100 500
150 500
200 500
250 500
300 500
350 500
400 500
450 500
500 500
550 500
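
The paging done by `get_replays` can also be written as a loop. The sketch below is one possible version under stated assumptions, not the notebook's method: `collect_replay_ids`, `fetch_page` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for the HTTP request so the example runs offline. Like the original function, it assumes each page is ordered newest first, and it skips ids it has already seen in case the boundary replay is returned again.

```python
def collect_replay_ids(fetch_page, params, num_replays):
    """Collect up to `num_replays` unique replay ids, paging backwards by date.

    `fetch_page(params)` is a hypothetical callable that returns the decoded
    JSON of one listing request, i.e. a dict with a 'list' of replays.
    """
    params = dict(params)  # leave the caller's dict untouched
    ids, seen = [], set()
    while len(ids) < num_replays:
        replays = fetch_page(params).get("list", [])
        new = [r for r in replays if r["id"] not in seen]
        if not new:  # empty page, or only repeats: nothing left to fetch
            break
        for replay in new:
            seen.add(replay["id"])
            ids.append(replay["id"])
        # move the end-date filter back to the last replay on this page
        params["replay-date-before"] = replays[-1]["date"]
    return ids[:num_replays]


# Made-up listing, newest first, served two replays per page.
ALL_REPLAYS = [
    {"id": "e", "date": "2023-01-05"},
    {"id": "d", "date": "2023-01-04"},
    {"id": "c", "date": "2023-01-03"},
    {"id": "b", "date": "2023-01-02"},
    {"id": "a", "date": "2023-01-01"},
]


def fake_fetch(params, page_size=2):
    before = params["replay-date-before"]
    matching = [r for r in ALL_REPLAYS if r["date"] <= before]
    return {"list": matching[:page_size]}


print(collect_replay_ids(fake_fetch, {"replay-date-before": "2023-01-06"}, 4))
# ['e', 'd', 'c', 'b']
```

A loop avoids deep recursion for large requests and returns exactly the number of ids asked for, so no slicing is needed afterwards.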


In [7]:
oski = get_gamedata(id_list = id_list, 
             auth_key='<YOUR_API_KEY>',
             player_name = 'Oski')

0 games collected, 0 errors
20 games collected, 0 errors
match 31 not 3V3
match 32 not 3V3
40 games collected, 2 errors
match 73 not 3V3
match 74 not 3V3
match 78 not 3V3
match 91 not 3V3
match 92 not 3V3
60 games collected, 7 errors
match 102 not 3V3
match 103 not 3V3
match 112 not 3V3
match 113 not 3V3
match 122 not 3V3
match 123 not 3V3
80 games collected, 13 errors
match 149 not 3V3
match 150 not 3V3
match 151 not 3V3
100 games collected, 16 errors
120 games collected, 16 errors
match 186 not 3V3
match 187 not 3V3
match 188 not 3V3
match 189 not 3V3
match 191 not 3V3
KeyError in match 192, errors = 22
match 193 not 3V3
match 194 not 3V3
match 195 not 3V3
match 196 not 3V3
match 214 not 3V3
match 215 not 3V3
140 games collected, 28 errors
match 226 not 3V3
match 227 not 3V3
match 238 not 3V3
match 239 not 3V3
match 255 not 3V3
match 256 not 3V3
160 games collected, 34 errors
match 278 not 3V3
match 279 not 3V3
180 games collected, 36 errors
match 303 not 3V3
200 games collected, 37 
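
Both collection functions pause for a fixed 0.3 seconds before every request. As a hedged alternative, the sketch below only sleeps for whatever part of the interval has not already elapsed. The `Throttle` class is a hypothetical helper written for this illustration; the 0.3 second interval is taken from the functions above and is not a statement about the API's actual limits.

```python
import time


class Throttle:
    """Ensure at least `min_interval` seconds pass between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


throttle = Throttle(0.3)
for _ in range(3):
    throttle.wait()  # a request would be sent here
```

The `clock` and `sleep` parameters exist only so the behaviour can be checked without real waiting.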

In [8]:
# get_gamedata returns a tuple, so keep only its first element, the dataframe
oski = oski[0]

---
## Vatira Data Collection

In [77]:
params = {'player-name': 'Vatira',
          'playlist': ['private'],
          'replay-date-after': '2022-01-02T15:00:05+01:00',
          'replay-date-before': '2023-04-19T15:00:05+01:00',
          'pro': 'true'}

In [78]:
vati_id_list = get_replays(num_replays=500,
                           params = params,
                           auth_key = '<YOUR_API_KEY>'
                          )

50 500
100 500
150 500
200 500
250 500
300 500
350 500
400 500
450 500
500 500
550 500


In [120]:
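# Vatira plays under several names, such as 'Vatira', 'Vati' and 'Vatigoat'; 'Vati' is used here
# note: get_gamedata compares names for equality (ignoring case), so only replays where the name is exactly 'Vati' are matched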
vati = get_gamedata(id_list = vati_id_list,
                    auth_key='<YOUR_API_KEY>',
                    player_name='Vati'
                   )

0 games collected, 0 errors
20 games collected, 0 errors
match 38 not 3V3
match 42 not 3V3
40 games collected, 2 errors
40 games collected, 2 errors
match 68 not 3V3
match 69 not 3V3
60 games collected, 4 errors
80 games collected, 4 errors
match 112 not 3V3
match 113 not 3V3
match 114 not 3V3
100 games collected, 7 errors
match 135 not 3V3
match 136 not 3V3
120 games collected, 9 errors
140 games collected, 9 errors
match 173 not 3V3
match 180 not 3V3
match 181 not 3V3
match 182 not 3V3
match 183 not 3V3
160 games collected, 14 errors
match 222 not 3V3
match 223 not 3V3
match 236 not 3V3
match 237 not 3V3
180 games collected, 18 errors
match 245 not 3V3
match 246 not 3V3
200 games collected, 20 errors
match 267 not 3V3
match 268 not 3V3
match 269 not 3V3
match 293 not 3V3
match 294 not 3V3
220 games collected, 25 errors
240 games collected, 25 errors
260 games collected, 25 errors
match 341 not 3V3
match 363 not 3V3
match 364 not 3V3
280 games collected, 28 errors
match 387 not 3V3
30
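
The comment on the call above lists several in-game names for the same player. As a sketch of how such aliases might be handled, the two helpers below match a name either against an explicit alias list or by prefix. `matches_alias` and `matches_prefix` are hypothetical helpers written for this illustration and are not used by `get_gamedata` as shown.

```python
def matches_alias(name, aliases):
    """True if `name` equals any alias, ignoring case and surrounding spaces."""
    cleaned = name.strip().lower()
    return any(cleaned == alias.lower() for alias in aliases)


def matches_prefix(name, prefix):
    """True if `name` starts with `prefix`, ignoring case and leading spaces."""
    return name.strip().lower().startswith(prefix.lower())


aliases = ("Vatira", "Vati", "Vatigoat")
print(matches_alias("VATIRA", aliases))    # True
print(matches_alias("Oski", aliases))      # False
print(matches_prefix("Vatigoat", "Vati"))  # True
```

An explicit alias list is stricter, while a prefix match catches unseen variants at the risk of also matching unrelated players whose names share the prefix.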

In [121]:
vati = vati[0]

| title | shots | shots_against | goals | goals_against | saves | assists | score | mvp | shooting_percentage | bpm | ... | percent_offensive_half | percent_behind_ball | percent_infront_ball | percent_most_back | percent_most_forward | percent_closest_to_ball | percent_farthest_from_ball | inflicted | taken | player_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EU P-K KC vs TL G7 2023-02-26.19.11 | 1 | 6 | 1 | 3 | 1 | 0 | 325 | False | 100 | 410 | ... | 33.170353 | 76.4848 | 23.5152 | 35.274002 | 28.615133 | 31.104712 | 36.4438 | 2 | 0 | Vati |
| EU P-K KC vs TL G6 2023-02-26.19.02 | 5 | 6 | 0 | 1 | 1 | 1 | 258 | False | 0 | 422 | ... | 40.89332 | 74.454704 | 25.545294 | 27.57155 | 31.740908 | 30.342155 | 33.731438 | 0 | 1 | Vati |
| EU P-K KC vs TL G5 2023-02-26.18.51 | 2 | 9 | 0 | 3 | 2 | 0 | 317 | False | 0 | 458 | ... | 41.98585 | 64.13079 | 35.869217 | 28.30217 | 42.784454 | 31.614126 | 37.756298 | 4 | 1 | Vati |
| EU P-K KC vs TL G4 2023-02-26.18.44 | 6 | 6 | 1 | 4 | 1 | 0 | 418 | False | 16.666666 | 403 | ... | 40.977737 | 73.11973 | 26.880268 | 34.678318 | 33.40298 | 34.434105 | 34.86826 | 0 | 2 | Vati |
| EU P-K KC vs TL G3 2023-02-26.18.35 | 6 | 4 | 2 | 2 | 1 | 1 | 602 | True | 33.333332 | 444 | ... | 40.779434 | 81.296974 | 18.70303 | 32.3866 | 35.532387 | 42.872555 | 30.82869 | 1 | 0 | Vati |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| MST vs END G4 2022-05-21.13.58 | 4 | 8 | 2 | 0 | 1 | 0 | 482 | False | 50 | 427 | ... | 40.45441 | 69.7964 | 30.2036 | 32.54293 | 32.75683 | 39.173744 | 26.676037 | 1 | 1 | Vati |
| MST vs SLY G3 2022-05-08.12.39 | 2 | 2 | 0 | 0 | 0 | 0 | 190 | False | 0 | 437 | ... | 50.060135 | 71.02481 | 28.975185 | 30.937479 | 34.937737 | 33.42151 | 34.969997 | 1 | 0 | Vati |
| MST vs END G7 2022-05-21.14.22 | 3 | 8 | 0 | 0 | 3 | 0 | 285 | False | 0 | 417 | ... | 35.530567 | 75.80931 | 24.190687 | 40.863754 | 33.73035 | 30.341177 | 32.40696 | 1 | 0 | Vati |
| MST vs KC G5 2022-05-22.17.35 | 5 | 6 | 0 | 1 | 2 | 2 | 492 | True | 0 | 500 | ... | 40.49754 | 73.076126 | 26.92388 | 33.21212 | 36.515152 | 37.333332 | 30.727272 | 3 | 1 | Vati |


---
## M0nkey M00n Data Collection

In [11]:
params = {'player-name': 'M0nkey M00n',
          'playlist': ['private'],
          'replay-date-after': '2021-10-02T15:00:05+01:00',
          'replay-date-before': '2022-10-01T15:00:05+01:00',
          'pro': 'true'}

In [12]:
mm_id_list = get_replays(num_replays=500,
                        params = params,
                        auth_key = '<YOUR_API_KEY>'
                        )

50 500
100 500
150 500
200 500
250 500
300 500
350 500
400 500
450 500
500 500
520 500


In [15]:
mm_id_list = mm_id_list[:500]

In [16]:
monkeymoon = get_gamedata(id_list=mm_id_list,
                          auth_key = '<YOUR_API_KEY>',
                          player_name = 'M0nkey M00n')

0 games collected, 0 errors
0 games collected, 0 errors
0 games collected, 0 errors
0 games collected, 0 errors
match 35 not 3V3
20 games collected, 1 errors
match 45 not 3V3
match 63 not 3V3
match 64 not 3V3
40 games collected, 4 errors
match 73 not 3V3
match 80 not 3V3
match 81 not 3V3
match 90 not 3V3
match 91 not 3V3
60 games collected, 9 errors
80 games collected, 9 errors
match 131 not 3V3
match 132 not 3V3
100 games collected, 11 errors
100 games collected, 11 errors
100 games collected, 11 errors
100 games collected, 11 errors
100 games collected, 11 errors
100 games collected, 11 errors
100 games collected, 11 errors
100 games collected, 11 errors
100 games collected, 11 errors
120 games collected, 11 errors
140 games collected, 11 errors
match 226 not 3V3
match 227 not 3V3
160 games collected, 13 errors
match 242 not 3V3
180 games collected, 14 errors
200 games collected, 14 errors
match 296 not 3V3
match 297 not 3V3
match 316 not 3V3
match 317 not 3V3
220 games collected, 18

In [17]:
monkeymoon = monkeymoon[0]

--- 
## Saving Datasets to CSVs

In [9]:
oski.to_csv('../data/oski.csv')

In [None]:
vati.to_csv('../data/vati.csv')

In [18]:
monkeymoon.to_csv('../data/monkeymoon.csv')