# Kaggle League of Legends competition - Data Exploration

## Team: Elden Ring

<img src="https://eldenring.wiki.fextralife.com/file/Elden-Ring/mirel_pastor_of_vow.jpg" alt="PRAISE DOG" style="width:806px;height:600px;"/>

#### PRAISE THE DOG!

In [1]:
import pandas as pd
import numpy as np
import json

First off, load all the .csv and .json data that come as single files.

In [2]:
X_train = pd.read_csv('../data/participants_train.csv')
X_test = pd.read_csv('../data/participants_test.csv')
y_train = pd.read_csv('../data/train_winners.csv')

champion_mastery = pd.read_csv('../data/champion_mastery.csv')
champion = pd.read_json('../data/champion.json')
team_positions = pd.read_csv('../data/teamPositions.csv')

#### Champion Data

Based on my knowledge of the game, the only thing that could potentially be useful is the champion types.

The other information (like health, etc...) is contained and updated in the Timeline files.

In [3]:
# to unpack the relevant information
champion_data = pd.json_normalize(champion['data'])

list_classes = []

for i in champion_data['tags']:
    for j in range(len(i)):
        if i[j] not in list_classes:
            list_classes.append(i[j])

list_classes

['Fighter', 'Tank', 'Mage', 'Assassin', 'Marksman', 'Support']

> NOTE: only 6 champion types exist, which could be useful for predictions
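The nested loop that collects the tags can also be written with pandas alone. This is a minimal sketch on a made-up three-champion frame (`toy_champions` is a stand-in, not the real `champion_data`), assuming the `tags` column holds Python lists as it does after `json_normalize`:

```python
import pandas as pd

# Toy stand-in for champion_data; the real frame comes from champion.json.
toy_champions = pd.DataFrame({
    'id': ['Annie', 'Tryndamere', 'Galio'],
    'tags': [['Mage'], ['Fighter', 'Assassin'], ['Tank', 'Mage']],
})

# explode() yields one row per (champion, tag); unique() keeps first-seen order.
unique_classes = toy_champions['tags'].explode().unique().tolist()
print(unique_classes)
```

The result is the same kind of list as `list_classes`, without the explicit membership check.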

In [4]:
# this is to get the champion types in a way that can be used for further analysis
champion_types = champion_data.explode('tags').pivot_table(values='id', index='key', columns='tags', aggfunc='count').fillna(0).reset_index()

champion_types.head()

tags,key,Assassin,Fighter,Mage,Marksman,Support,Tank
0,1,0.0,0.0,1.0,0.0,0.0,0.0
1,10,0.0,1.0,0.0,0.0,1.0,0.0
2,101,0.0,0.0,1.0,0.0,0.0,0.0
3,102,0.0,1.0,0.0,0.0,0.0,1.0
4,103,1.0,0.0,1.0,0.0,0.0,0.0


Initially I also used the info from the champion data (as below) for predictions. Again, this info is essentially encoded in the timelines.

In [5]:
champion_data['key'] = champion_data['key'].astype(int)
X_train_champion = pd.merge(X_train, champion_data, how='inner', left_on='championId', right_on='key')
X_train_champion = X_train_champion.sort_values(['matchId', 'participantId'], ascending = [True, True]).reset_index(drop=True)

In [6]:
X_train_champion[['info.attack', 'info.defense', 'info.magic', 'info.difficulty']]

Unnamed: 0,info.attack,info.defense,info.magic,info.difficulty
0,4,6,7,4
1,3,4,8,5
2,2,5,8,6
3,8,2,2,6
4,1,6,8,1
...,...,...,...,...
79995,6,6,4,6
79996,7,7,4,3
79997,1,4,10,10
79998,9,2,3,6


#### Champion Mastery

Right away the idea here was to use how "skilled" someone is with a champion. Note, however, that players get champion points even when they lose the match. From predictions later on, it turned out that champion level is a better predictor than champion points.

In [7]:
X_train_mastery = pd.merge(X_train, champion_mastery, how='left', on=['summonerId', 'championId'], indicator=True)
X_train_mastery.loc[X_train_mastery['_merge'] != 'both']

Unnamed: 0,matchId,teamId,participantId,summonerId,summonerLevel,championName,championId,championLevel,championPoints,chestGranted,tokensEarned,_merge
15475,1547,200,6,4039,67,Malphite,54,,,,,left_only
17267,1726,200,8,13584,43,Zed,238,,,,,left_only
20142,2014,100,3,15354,50,Corki,42,,,,,left_only
20609,2060,200,10,15627,97,Senna,235,,,,,left_only
28330,2833,100,1,19836,34,Nasus,75,,,,,left_only
35860,3586,100,1,23525,52,KSante,897,,,,,left_only
47090,4709,100,1,28312,42,Gwen,887,,,,,left_only
53017,5301,200,8,30650,53,Cassiopeia,69,,,,,left_only
56373,5637,100,4,31856,44,Samira,360,,,,,left_only
58253,5825,100,4,8619,47,Kaisa,145,,,,,left_only


In [8]:
X_test_mastery = pd.merge(X_test, champion_mastery, how='left', on=['summonerId', 'championId'], indicator=True)
X_test_mastery.loc[X_test_mastery['_merge'] != 'both']

Unnamed: 0,matchId,teamId,participantId,summonerId,summonerLevel,championName,championId,championLevel,championPoints,chestGranted,tokensEarned,_merge
5513,8551,100,4,41371,35,Kaisa,145,,,,,left_only
10156,9015,200,7,42698,43,Kayn,141,,,,,left_only


> NOTE on the above two: there are player-champion combinations with no mastery entry, likely because too few games were played. For the model, I decided to set champion level and/or points to 0 for those.
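Setting the unmatched combinations to 0 could look roughly like this. It is only a sketch on toy frames (`toy_participants` and `toy_mastery` are invented stand-ins), and the extra `mastery_missing` flag is my own addition so that a filled-in 0 stays distinguishable from a real value:

```python
import pandas as pd

# Toy stand-ins for X_train and champion_mastery.
toy_participants = pd.DataFrame({
    'summonerId': [1, 2, 3],
    'championId': [54, 238, 42],
})
toy_mastery = pd.DataFrame({
    'summonerId': [1, 3],
    'championId': [54, 42],
    'championLevel': [7, 4],
    'championPoints': [120000, 15000],
})

merged = toy_participants.merge(
    toy_mastery, how='left', on=['summonerId', 'championId'], indicator=True
)

# Hypothetical flag: 1 where no mastery row was found for the pair.
merged['mastery_missing'] = (merged['_merge'] == 'left_only').astype(int)

# Fill the missing mastery values with 0, as described in the note above.
mastery_cols = ['championLevel', 'championPoints']
merged[mastery_cols] = merged[mastery_cols].fillna(0).astype(int)
merged = merged.drop(columns='_merge')
print(merged)
```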

In [9]:
X_test_mastery['championLevel'].value_counts()

7.0    10158
5.0     3659
6.0     1662
4.0     1534
3.0     1387
2.0     1101
1.0      497
Name: championLevel, dtype: int64

#### Team Positions

Not much to say here. The idea is to compare lane by lane (or rather position by position) instead of comparing the whole teams.
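One way the position-by-position comparison might be set up is sketched below. The frames and the `gold` column are made up for illustration (any per-player metric would do), and only two positions are shown:

```python
import pandas as pd

# Toy stand-ins: positions per participant and one hypothetical per-player metric.
toy_positions = pd.DataFrame({
    'matchId': [0, 0, 0, 0],
    'participantId': [1, 2, 6, 7],
    'teamPosition': ['TOP', 'JUNGLE', 'TOP', 'JUNGLE'],
})
toy_players = pd.DataFrame({
    'matchId': [0, 0, 0, 0],
    'participantId': [1, 2, 6, 7],
    'teamId': [100, 100, 200, 200],
    'gold': [9000, 7500, 8200, 8000],
})

# Attach the position to every player, then put the two teams side by side.
lanes = toy_players.merge(toy_positions, on=['matchId', 'participantId'])
wide = lanes.pivot_table(index=['matchId', 'teamPosition'], columns='teamId', values='gold')

# Positive values mean team 100 is ahead in that position.
wide['gold_diff'] = wide[100] - wide[200]

# One row per match, one column per position.
lane_diff = wide['gold_diff'].unstack('teamPosition')
print(lane_diff)
```

Each match then contributes one difference per position rather than ten separate player rows.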

In [10]:
team_positions.head()

Unnamed: 0,matchId,participantId,teamPosition
0,0,1,TOP
1,0,2,JUNGLE
2,0,3,MIDDLE
3,0,4,BOTTOM
4,0,5,UTILITY


In [11]:
X_train.tail()

Unnamed: 0,matchId,teamId,participantId,summonerId,summonerLevel,championName,championId
79995,7999,200,6,13979,595,Yorick,83
79996,7999,200,7,39643,38,Volibear,106
79997,7999,200,8,5570,498,Anivia,34
79998,7999,200,9,10228,733,Twitch,29
79999,7999,200,10,1684,574,Zilean,26


### Timeline Files - TRAINING

Initially I did some tests to see how to grab the information. Below is just the cleaned up and compiled code.

> NOTE: it takes a few minutes (about 5) to run over all 8000 files...

For this first set of information, it's just enough to take the last frame (and therefore the highest / last updated values).

> NOTE: once we switched from logreg to neural networks I decided to just use all columns; the ones marked with ## were added because of the NN approach.

In [12]:
training_data = pd.DataFrame()


for i in range(0,8000) :

    temp_data = pd.read_json(f'../data/train_timelines/train_timelines/timeline_{i}.json')
    temp_data = pd.json_normalize(temp_data['frames'])

    max_frame = len(temp_data) - 1

    for j in range(1, 11):
        temp_df = pd.DataFrame({'matchId': [i],
                   'participantId': [j],
                   'final_gold': temp_data[f'participantFrames.{j}.totalGold'][max_frame],
                   'final_xp': temp_data[f'participantFrames.{j}.xp'][max_frame],
                   'final_abilityhaste': temp_data[f'participantFrames.{j}.championStats.abilityHaste'][max_frame], ##
                   'final_abilitypower': temp_data[f'participantFrames.{j}.championStats.abilityPower'][max_frame], ##
                   'final_armor': temp_data[f'participantFrames.{j}.championStats.armor'][max_frame],
                   'final_armorpen': temp_data[f'participantFrames.{j}.championStats.armorPen'][max_frame], ##
                   'final_armorpenpercent': temp_data[f'participantFrames.{j}.championStats.armorPenPercent'][max_frame], ##
                   'final_atkdmg': temp_data[f'participantFrames.{j}.championStats.attackDamage'][max_frame],
                   'final_bns_armorpenpercent': temp_data[f'participantFrames.{j}.championStats.bonusArmorPenPercent'][max_frame], ##
                   'final_bns_magicpenpercent': temp_data[f'participantFrames.{j}.championStats.bonusMagicPenPercent'][max_frame], ##
                   'final_ccreduction': temp_data[f'participantFrames.{j}.championStats.ccReduction'][max_frame], ##
                   'final_cdreduction': temp_data[f'participantFrames.{j}.championStats.cooldownReduction'][max_frame], ##
                   'final_remaining_health': temp_data[f'participantFrames.{j}.championStats.health'][max_frame], ##                   
                   'final_health': temp_data[f'participantFrames.{j}.championStats.healthMax'][max_frame],
                   'final_healthrgn': temp_data[f'participantFrames.{j}.championStats.healthRegen'][max_frame],
                   'final_lifesteal': temp_data[f'participantFrames.{j}.championStats.lifesteal'][max_frame],
                   'final_mppen': temp_data[f'participantFrames.{j}.championStats.magicPen'][max_frame], ##
                   'final_mgpenpercent': temp_data[f'participantFrames.{j}.championStats.magicPenPercent'][max_frame],
                   'final_mgres': temp_data[f'participantFrames.{j}.championStats.magicResist'][max_frame],
                   'final_ms': temp_data[f'participantFrames.{j}.championStats.movementSpeed'][max_frame],
                   'final_omnivamp': temp_data[f'participantFrames.{j}.championStats.omnivamp'][max_frame], ##
                   'final_physicalvamp': temp_data[f'participantFrames.{j}.championStats.physicalVamp'][max_frame], ##
                   'final_power': temp_data[f'participantFrames.{j}.championStats.power'][max_frame], ##
                   'final_powermax': temp_data[f'participantFrames.{j}.championStats.powerMax'][max_frame], ##
                   'final_powerregen': temp_data[f'participantFrames.{j}.championStats.powerRegen'][max_frame], ##
                   'final_spellvamp': temp_data[f'participantFrames.{j}.championStats.spellVamp'][max_frame], ##
                   'final_currentgold': temp_data[f'participantFrames.{j}.currentGold'][max_frame], ##
                   'final_magicdmgdone': temp_data[f'participantFrames.{j}.damageStats.magicDamageDone'][max_frame], ##
                   'final_magicdmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.magicDamageDoneToChampions'][max_frame], ##
                   'final_magicdmgtaken': temp_data[f'participantFrames.{j}.damageStats.magicDamageTaken'][max_frame], ##
                   'final_physdmgdone': temp_data[f'participantFrames.{j}.damageStats.physicalDamageDone'][max_frame], ##
                   'final_physdmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.physicalDamageDoneToChampions'][max_frame], ##
                   'final_physdmgtaken': temp_data[f'participantFrames.{j}.damageStats.physicalDamageTaken'][max_frame], ##
                   'final_dmgdone': temp_data[f'participantFrames.{j}.damageStats.totalDamageDone'][max_frame], ##
                   'final_dmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.totalDamageDoneToChampions'][max_frame],
                   'final_dmgtaken': temp_data[f'participantFrames.{j}.damageStats.totalDamageTaken'][max_frame],
                   'final_truedmgdone': temp_data[f'participantFrames.{j}.damageStats.trueDamageDone'][max_frame], ##
                   'final_truedmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.trueDamageDoneToChampions'][max_frame],
                   'final_truedmgtaken': temp_data[f'participantFrames.{j}.damageStats.trueDamageTaken'][max_frame],
                   'final_goldpersec': temp_data[f'participantFrames.{j}.goldPerSecond'][max_frame], ##
                   'final_jungleminionskilled': temp_data[f'participantFrames.{j}.jungleMinionsKilled'][max_frame], ##
                   'final_lvl': temp_data[f'participantFrames.{j}.level'][max_frame],
                   'final_minionskilled': temp_data[f'participantFrames.{j}.minionsKilled'][max_frame], ##
                   'final_enemycontrolled': temp_data[f'participantFrames.{j}.timeEnemySpentControlled'][max_frame]
                   })
        

        training_data = pd.concat([training_data, temp_df], ignore_index = True)



In [13]:
# # saving that information to use later
training_data.to_csv('../data/train_last_frame_values.csv', index=False)
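A possible variant of the last-frame extraction is sketched here. Calling `pd.concat` inside the loop copies the accumulated frame on every iteration, so this version collects plain dicts and builds the DataFrame once at the end. The `flatten` and `last_frame_rows` helpers are hypothetical names, the toy timeline only mimics the nesting of the real files, and the resulting columns keep the raw dotted names instead of the `final_*` names used above:

```python
import pandas as pd

# Toy timeline with two frames and two participants (real files have ten).
toy_timeline = {
    'frames': [
        {'participantFrames': {
            '1': {'totalGold': 500, 'xp': 0, 'championStats': {'armor': 30}},
            '2': {'totalGold': 500, 'xp': 0, 'championStats': {'armor': 25}},
        }},
        {'participantFrames': {
            '1': {'totalGold': 900, 'xp': 350, 'championStats': {'armor': 34}},
            '2': {'totalGold': 780, 'xp': 300, 'championStats': {'armor': 28}},
        }},
    ]
}

def flatten(nested, prefix=''):
    """Flatten nested dicts into a single dict with dotted keys."""
    flat = {}
    for key, value in nested.items():
        if isinstance(value, dict):
            flat.update(flatten(value, f'{prefix}{key}.'))
        else:
            flat[f'{prefix}{key}'] = value
    return flat

def last_frame_rows(match_id, timeline):
    """Return one flat dict per participant, taken from the final frame."""
    last = timeline['frames'][-1]['participantFrames']
    return [
        {'matchId': match_id, 'participantId': int(pid), **flatten(frame)}
        for pid, frame in last.items()
    ]

rows = []
for match_id, timeline in [(0, toy_timeline)]:  # the real loop would load each file
    rows.extend(last_frame_rows(match_id, timeline))

last_frames = pd.DataFrame(rows)  # built once, after the loop
print(last_frames)
```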

All potential columns (in case I decide to use more):

```
['participantFrames.{j}.championStats.abilityHaste',
 'participantFrames.{j}.championStats.abilityPower',
 'participantFrames.{j}.championStats.armor',
 'participantFrames.{j}.championStats.armorPen',
 'participantFrames.{j}.championStats.armorPenPercent',
 'participantFrames.{j}.championStats.attackDamage',
 'participantFrames.{j}.championStats.attackSpeed',
 'participantFrames.{j}.championStats.bonusArmorPenPercent',
 'participantFrames.{j}.championStats.bonusMagicPenPercent',
 'participantFrames.{j}.championStats.ccReduction',
 'participantFrames.{j}.championStats.cooldownReduction',
 'participantFrames.{j}.championStats.health',
 'participantFrames.{j}.championStats.healthMax',
 'participantFrames.{j}.championStats.healthRegen',
 'participantFrames.{j}.championStats.lifesteal',
 'participantFrames.{j}.championStats.magicPen',
 'participantFrames.{j}.championStats.magicPenPercent',
 'participantFrames.{j}.championStats.magicResist',
 'participantFrames.{j}.championStats.movementSpeed',
 'participantFrames.{j}.championStats.omnivamp',
 'participantFrames.{j}.championStats.physicalVamp',
 'participantFrames.{j}.championStats.power',
 'participantFrames.{j}.championStats.powerMax',
 'participantFrames.{j}.championStats.powerRegen',
 'participantFrames.{j}.championStats.spellVamp',
 'participantFrames.{j}.currentGold',
 'participantFrames.{j}.damageStats.magicDamageDone',
 'participantFrames.{j}.damageStats.magicDamageDoneToChampions',
 'participantFrames.{j}.damageStats.magicDamageTaken',
 'participantFrames.{j}.damageStats.physicalDamageDone',
 'participantFrames.{j}.damageStats.physicalDamageDoneToChampions',
 'participantFrames.{j}.damageStats.physicalDamageTaken',
 'participantFrames.{j}.damageStats.totalDamageDone',
 'participantFrames.{j}.damageStats.totalDamageDoneToChampions',
 'participantFrames.{j}.damageStats.totalDamageTaken',
 'participantFrames.{j}.damageStats.trueDamageDone',
 'participantFrames.{j}.damageStats.trueDamageDoneToChampions',
 'participantFrames.{j}.damageStats.trueDamageTaken',
 'participantFrames.{j}.goldPerSecond',
 'participantFrames.{j}.jungleMinionsKilled',
 'participantFrames.{j}.level',
 'participantFrames.{j}.minionsKilled',
 'participantFrames.{j}.participantId',
 'participantFrames.{j}.position.x',
 'participantFrames.{j}.position.y',
 'participantFrames.{j}.timeEnemySpentControlled',
 'participantFrames.{j}.totalGold',
 'participantFrames.{j}.xp']
 ```

For the "Events", instead, I have to loop through all of the frames (see below). Here I thought it would be useful to extract, for each team:
- wards placed
- ward kills
- turret plates destroyed
- elite monsters killed

In [14]:
training_events = pd.DataFrame()


for i in range(0,8000) :

    with open(f'../data/train_timelines/train_timelines/timeline_{i}.json') as training_timeline:
        timeline_contents = json.load(training_timeline)
    
    team100_wards = 0
    team100_ward_kills = 0
    team100_turretplates = 0
    team100_elitemonsters = 0
    team200_wards = 0
    team200_ward_kills = 0
    team200_turretplates = 0
    team200_elitemonsters = 0

    max_frame = len(timeline_contents['frames'])

    for frame in range(max_frame):
        for event in range(len(timeline_contents['frames'][frame]['events'])):
            if (timeline_contents['frames'][frame]['events'][event]['type'] == 'WARD_PLACED'):
                if (timeline_contents['frames'][frame]['events'][event]['creatorId']) in (1,2,3,4,5):
                    team100_wards = team100_wards + 1
                else:
                    team200_wards = team200_wards + 1
            
            elif (timeline_contents['frames'][frame]['events'][event]['type'] == 'WARD_KILL'):
                if (timeline_contents['frames'][frame]['events'][event]['killerId']) in (1,2,3,4,5):
                    team100_ward_kills = team100_ward_kills + 1
                else:
                    team200_ward_kills = team200_ward_kills + 1

            elif (timeline_contents['frames'][frame]['events'][event]['type'] == 'TURRET_PLATE_DESTROYED'):
                if (timeline_contents['frames'][frame]['events'][event]['killerId']) in (1,2,3,4,5):
                    team100_turretplates = team100_turretplates + 1
                else:
                    team200_turretplates = team200_turretplates + 1

            elif (timeline_contents['frames'][frame]['events'][event]['type'] == 'ELITE_MONSTER_KILL'):
                if timeline_contents['frames'][frame]['events'][event]['killerId'] in (1,2,3,4,5):
                    team100_elitemonsters = team100_elitemonsters + 1
                else:
                    team200_elitemonsters = team200_elitemonsters + 1

    temp_df1 = pd.DataFrame({'matchId': [i],
                'teamId': 100,
                'wards_placed': team100_wards,
                'wards_killed': team100_ward_kills,
                'turretplates_destroyed': team100_turretplates,
                'elite_monsters_killed': team100_elitemonsters
                })
    
    temp_df2 = pd.DataFrame({'matchId': [i],
                'teamId': 200,
                'wards_placed': team200_wards,
                'wards_killed': team200_ward_kills,
                'turretplates_destroyed': team200_turretplates,
                'elite_monsters_killed': team200_elitemonsters
                })
        

    training_events = pd.concat([training_events, temp_df1], ignore_index = True)
    training_events = pd.concat([training_events, temp_df2], ignore_index = True)

In [15]:
# save to file
training_events.to_csv('../data/training_events.csv', index=False)
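A more compact way to do the per-team counting is sketched below. One difference from the loop above: there, any id outside 1 to 5 falls into the `else` branch and is counted for team 200. This sketch skips ids outside 1 to 10 instead, which is an assumption about how such events (for example an id of 0, if the data contains any) should be treated. `EVENT_ID_FIELD`, `team_of` and `count_team_events` are hypothetical names, and the toy timeline is invented:

```python
from collections import Counter

# Which field identifies the acting participant for each event type of interest.
EVENT_ID_FIELD = {
    'WARD_PLACED': 'creatorId',
    'WARD_KILL': 'killerId',
    'TURRET_PLATE_DESTROYED': 'killerId',
    'ELITE_MONSTER_KILL': 'killerId',
}

def team_of(participant_id):
    """Map participant ids 1-5 to team 100 and 6-10 to team 200, else None."""
    if 1 <= participant_id <= 5:
        return 100
    if 6 <= participant_id <= 10:
        return 200
    return None

def count_team_events(timeline):
    """Count the selected event types per (team, event type)."""
    counts = Counter()
    for frame in timeline['frames']:
        for event in frame['events']:
            id_field = EVENT_ID_FIELD.get(event['type'])
            if id_field is None:
                continue  # event type we do not track
            team = team_of(event.get(id_field, 0))
            if team is not None:
                counts[(team, event['type'])] += 1
    return counts

# Toy timeline: one tracked event with an id of 0 and one untracked event type.
toy_timeline = {'frames': [
    {'events': [{'type': 'WARD_PLACED', 'creatorId': 3},
                {'type': 'ITEM_PURCHASED', 'participantId': 1}]},
    {'events': [{'type': 'WARD_KILL', 'killerId': 8},
                {'type': 'ELITE_MONSTER_KILL', 'killerId': 7},
                {'type': 'TURRET_PLATE_DESTROYED', 'killerId': 0}]},
]}

counts = count_team_events(toy_timeline)
print(counts)
```

Whether skipping such ids is right depends on the data; comparing both variants on a few matches would show how much it matters.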

### Timeline Files - TEST

Exactly the same extraction and saving steps are performed on the test data below.

In [16]:
testing_data = pd.DataFrame()


for i in range(8000,10000) :

    temp_data = pd.read_json(f'../data/test_timelines/test_timelines/timeline_{i}.json')
    temp_data = pd.json_normalize(temp_data['frames'])

    max_frame = len(temp_data) - 1

    for j in range(1, 11):
        temp_df = pd.DataFrame({'matchId': [i],
                   'participantId': [j],
                   'final_gold': temp_data[f'participantFrames.{j}.totalGold'][max_frame],
                   'final_xp': temp_data[f'participantFrames.{j}.xp'][max_frame],
                   'final_abilityhaste': temp_data[f'participantFrames.{j}.championStats.abilityHaste'][max_frame], ##
                   'final_abilitypower': temp_data[f'participantFrames.{j}.championStats.abilityPower'][max_frame], ##
                   'final_armor': temp_data[f'participantFrames.{j}.championStats.armor'][max_frame],
                   'final_armorpen': temp_data[f'participantFrames.{j}.championStats.armorPen'][max_frame], ##
                   'final_armorpenpercent': temp_data[f'participantFrames.{j}.championStats.armorPenPercent'][max_frame], ##
                   'final_atkdmg': temp_data[f'participantFrames.{j}.championStats.attackDamage'][max_frame],
                   'final_bns_armorpenpercent': temp_data[f'participantFrames.{j}.championStats.bonusArmorPenPercent'][max_frame], ##
                   'final_bns_magicpenpercent': temp_data[f'participantFrames.{j}.championStats.bonusMagicPenPercent'][max_frame], ##
                   'final_ccreduction': temp_data[f'participantFrames.{j}.championStats.ccReduction'][max_frame], ##
                   'final_cdreduction': temp_data[f'participantFrames.{j}.championStats.cooldownReduction'][max_frame], ##
                   'final_remaining_health': temp_data[f'participantFrames.{j}.championStats.health'][max_frame], ##                   
                   'final_health': temp_data[f'participantFrames.{j}.championStats.healthMax'][max_frame],
                   'final_healthrgn': temp_data[f'participantFrames.{j}.championStats.healthRegen'][max_frame],
                   'final_lifesteal': temp_data[f'participantFrames.{j}.championStats.lifesteal'][max_frame],
                   'final_mppen': temp_data[f'participantFrames.{j}.championStats.magicPen'][max_frame], ##
                   'final_mgpenpercent': temp_data[f'participantFrames.{j}.championStats.magicPenPercent'][max_frame],
                   'final_mgres': temp_data[f'participantFrames.{j}.championStats.magicResist'][max_frame],
                   'final_ms': temp_data[f'participantFrames.{j}.championStats.movementSpeed'][max_frame],
                   'final_omnivamp': temp_data[f'participantFrames.{j}.championStats.omnivamp'][max_frame], ##
                   'final_physicalvamp': temp_data[f'participantFrames.{j}.championStats.physicalVamp'][max_frame], ##
                   'final_power': temp_data[f'participantFrames.{j}.championStats.power'][max_frame], ##
                   'final_powermax': temp_data[f'participantFrames.{j}.championStats.powerMax'][max_frame], ##
                   'final_powerregen': temp_data[f'participantFrames.{j}.championStats.powerRegen'][max_frame], ##
                   'final_spellvamp': temp_data[f'participantFrames.{j}.championStats.spellVamp'][max_frame], ##
                   'final_currentgold': temp_data[f'participantFrames.{j}.currentGold'][max_frame], ##
                   'final_magicdmgdone': temp_data[f'participantFrames.{j}.damageStats.magicDamageDone'][max_frame], ##
                   'final_magicdmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.magicDamageDoneToChampions'][max_frame], ##
                   'final_magicdmgtaken': temp_data[f'participantFrames.{j}.damageStats.magicDamageTaken'][max_frame], ##
                   'final_physdmgdone': temp_data[f'participantFrames.{j}.damageStats.physicalDamageDone'][max_frame], ##
                   'final_physdmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.physicalDamageDoneToChampions'][max_frame], ##
                   'final_physdmgtaken': temp_data[f'participantFrames.{j}.damageStats.physicalDamageTaken'][max_frame], ##
                   'final_dmgdone': temp_data[f'participantFrames.{j}.damageStats.totalDamageDone'][max_frame], ##
                   'final_dmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.totalDamageDoneToChampions'][max_frame],
                   'final_dmgtaken': temp_data[f'participantFrames.{j}.damageStats.totalDamageTaken'][max_frame],
                   'final_truedmgdone': temp_data[f'participantFrames.{j}.damageStats.trueDamageDone'][max_frame], ##
                   'final_truedmgdonetochamps': temp_data[f'participantFrames.{j}.damageStats.trueDamageDoneToChampions'][max_frame],
                   'final_truedmgtaken': temp_data[f'participantFrames.{j}.damageStats.trueDamageTaken'][max_frame],
                   'final_goldpersec': temp_data[f'participantFrames.{j}.goldPerSecond'][max_frame], ##
                   'final_jungleminionskilled': temp_data[f'participantFrames.{j}.jungleMinionsKilled'][max_frame], ##
                   'final_lvl': temp_data[f'participantFrames.{j}.level'][max_frame],
                   'final_minionskilled': temp_data[f'participantFrames.{j}.minionsKilled'][max_frame], ##
                   'final_enemycontrolled': temp_data[f'participantFrames.{j}.timeEnemySpentControlled'][max_frame]
                   })
        

        testing_data = pd.concat([testing_data, temp_df], ignore_index = True)



In [17]:
testing_data.to_csv('../data/test_last_frame_values.csv', index=False)

In [18]:
testing_events = pd.DataFrame()


for i in range(8000,10000) :

    with open(f'../data/test_timelines/test_timelines/timeline_{i}.json') as testing_timeline:
        timeline_contents = json.load(testing_timeline)
    
    team100_wards = 0
    team100_ward_kills = 0
    team100_turretplates = 0
    team100_elitemonsters = 0
    team200_wards = 0
    team200_ward_kills = 0
    team200_turretplates = 0
    team200_elitemonsters = 0

    max_frame = len(timeline_contents['frames'])

    for frame in range(max_frame):
        for event in range(len(timeline_contents['frames'][frame]['events'])):
            if (timeline_contents['frames'][frame]['events'][event]['type'] == 'WARD_PLACED'):
                if (timeline_contents['frames'][frame]['events'][event]['creatorId']) in (1,2,3,4,5):
                    team100_wards = team100_wards + 1
                else:
                    team200_wards = team200_wards + 1
            
            elif (timeline_contents['frames'][frame]['events'][event]['type'] == 'WARD_KILL'):
                if (timeline_contents['frames'][frame]['events'][event]['killerId']) in (1,2,3,4,5):
                    team100_ward_kills = team100_ward_kills + 1
                else:
                    team200_ward_kills = team200_ward_kills + 1

            elif (timeline_contents['frames'][frame]['events'][event]['type'] == 'TURRET_PLATE_DESTROYED'):
                if (timeline_contents['frames'][frame]['events'][event]['killerId']) in (1,2,3,4,5):
                    team100_turretplates = team100_turretplates + 1
                else:
                    team200_turretplates = team200_turretplates + 1

            elif (timeline_contents['frames'][frame]['events'][event]['type'] == 'ELITE_MONSTER_KILL'):
                if timeline_contents['frames'][frame]['events'][event]['killerId'] in (1,2,3,4,5):
                    team100_elitemonsters = team100_elitemonsters + 1
                else:
                    team200_elitemonsters = team200_elitemonsters + 1

    temp_df1 = pd.DataFrame({'matchId': [i],
                'teamId': 100,
                'wards_placed': team100_wards,
                'wards_killed': team100_ward_kills,
                'turretplates_destroyed': team100_turretplates,
                'elite_monsters_killed': team100_elitemonsters
                })
    
    temp_df2 = pd.DataFrame({'matchId': [i],
                'teamId': 200,
                'wards_placed': team200_wards,
                'wards_killed': team200_ward_kills,
                'turretplates_destroyed': team200_turretplates,
                'elite_monsters_killed': team200_elitemonsters
                })
        

    testing_events = pd.concat([testing_events, temp_df1], ignore_index = True)
    testing_events = pd.concat([testing_events, temp_df2], ignore_index = True)

In [19]:
testing_events.to_csv('../data/testing_events.csv', index=False)

## Incorporating LoL patches

After the competition was done, I talked to Hayden Greer from the other group and he told me where to find win probabilities of champions (I had no idea which website to go to or where to find them).

In [20]:
from datetime import datetime

In [21]:
matchdates = pd.read_csv('../data/matchdates.csv', parse_dates=['matchcreationDate'])

In [22]:
print(matchdates['matchcreationDate'].min())
print(matchdates['matchcreationDate'].max())


2023-01-18 00:00:00
2023-03-13 00:00:00


In [84]:
conditions = [
    (matchdates['matchcreationDate'] >= '2023-01-11') & (matchdates['matchcreationDate'] < '2023-02-09'),
    (matchdates['matchcreationDate'] >= '2023-02-09') & (matchdates['matchcreationDate'] < '2023-02-23'),
    (matchdates['matchcreationDate'] >= '2023-02-23') & (matchdates['matchcreationDate'] < '2023-03-08'),
    matchdates['matchcreationDate'] >= '2023-03-08'
]

# LoL Patch 13.1: Wednesday, Jan 11, 2023
# LoL Patch 13.3: Thursday, Feb 9, 2023
# LoL Patch 13.4: Thursday, Feb 23, 2023
# LoL Patch 13.5: Wednesday, March 8, 2023

patch_values = ['13.1', '13.3', '13.4', '13.5']

matchdates['patch_version'] = np.select(conditions, patch_values)

In [85]:
matchdates.loc[matchdates['matchId'] < 8000]['patch_version'].value_counts()

13.5    5827
13.4    1665
13.3     445
13.1      63
Name: patch_version, dtype: int64

In [86]:
matchdates.loc[matchdates['matchId'] >= 8000]['patch_version'].value_counts()

13.5    1487
13.4     397
13.3     102
13.1      14
Name: patch_version, dtype: int64

In [98]:
# save this directly for use later; no need to bring the dates along
matchdates[['matchId', 'patch_version']].to_csv('../data/match_patch.csv', index=False)
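The date-to-patch mapping can also be expressed as a lookup against the sorted patch start dates, which avoids writing one condition per patch. This is a sketch on a few toy dates, using the same start dates as the cell above. The `'unknown'` label for dates before the first patch is my own assumption:

```python
import numpy as np
import pandas as pd

# Patch start dates (same as in the cell above), sorted ascending.
patch_starts = pd.to_datetime(['2023-01-11', '2023-02-09', '2023-02-23', '2023-03-08'])
patch_labels = np.array(['13.1', '13.3', '13.4', '13.5'])

# Toy stand-in for matchdates['matchcreationDate'].
toy_dates = pd.Series(pd.to_datetime(['2023-01-18', '2023-02-09', '2023-03-07', '2023-03-13']))

# For each date, find the last patch start that is <= the date.
idx = np.searchsorted(patch_starts.values, toy_dates.values, side='right') - 1

# idx == -1 means the date is before the first listed patch.
toy_patch = pd.Series(np.where(idx >= 0, patch_labels[idx], 'unknown'), index=toy_dates.index)
print(toy_patch.tolist())
```

Adding another patch then only means appending one date and one label.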

In [38]:
pd.merge(X_train, matchdates, how='inner', on='matchId', indicator=True)
#matchdates X_train

Unnamed: 0,matchId,teamId,participantId,summonerId,summonerLevel,championName,championId,matchcreationDate,patch_version,_merge
0,0,100,1,0,303,Mordekaiser,82,2023-03-11,13.5,both
1,0,100,2,1,616,Sylas,517,2023-03-11,13.5,both
2,0,100,3,2,667,Lissandra,127,2023-03-11,13.5,both
3,0,100,4,3,860,Caitlyn,51,2023-03-11,13.5,both
4,0,100,5,4,325,Morgana,25,2023-03-11,13.5,both
...,...,...,...,...,...,...,...,...,...,...
79995,7999,200,6,13979,595,Yorick,83,2023-03-06,13.4,both
79996,7999,200,7,39643,38,Volibear,106,2023-03-06,13.4,both
79997,7999,200,8,5570,498,Anivia,34,2023-03-06,13.4,both
79998,7999,200,9,10228,733,Twitch,29,2023-03-06,13.4,both


In [43]:
import requests
from bs4 import BeautifulSoup as BS

In [82]:
# this first attempt didn't work, because the website blocked my attempts
# import time
# champion='Mordekaiser'
# patch_version=13_5
# endpoint = f'https://u.gg/lol/champions/{champion}/build?rank=diamond_plus&patch={patch_version}'

# params = {
# 'rank' : 'diamond_plus',
# 'patch_version' : 13_5
# }

# tm = 0

# while response.status_code != 200:
#     print('waiting for ' + str(champion))
#     time.sleep(10)
#     response = requests.get(endpoint, params = params)
#     tm = tm + 1
#     if tm == 10:
#         break

# print(endpoint)
# print(response.status_code)

# soup = BS(response.text)

Note: the table has more rows than there are champions, because it lists separate stats for each role a champion was played in.

Furthermore, I need to:
- Add the patch each portion of the table belongs to (it will be used in the join later)
- Halve the names, because for some reason each name appears doubled in the table
- Change the Role values to match the positions in the other table. Note: it turned out there are lots of cases where people did not play the role captured in the winrate tables; in fact, the WR is missing for some champions
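The join on name, role and patch might look like the sketch below. The frames and win rates are made up. The lowercase `name_key` is my own suggestion, not something done in this notebook: the participant data spells some names like `Kaisa`, while a display name such as `Kai'Sa` becomes `KaiSa` once the apostrophe is removed, so a case-sensitive match would miss it. Whether this explains the missing WRs is an assumption I have not checked against the real table:

```python
import pandas as pd

def name_key(name):
    """Hypothetical join key: drop spaces and apostrophes, ignore case."""
    return name.replace(' ', '').replace("'", '').lower()

# Toy winrate table (values are invented).
toy_wr = pd.DataFrame({
    'Name': ["Kai'Sa", 'Lee Sin', 'Ahri'],
    'Role': ['ADC', 'JUNGLE', 'MID'],
    'Win %': [50.1, 49.2, 51.3],
    'Patch_Version': ['13.5', '13.5', '13.5'],
})
toy_wr['Role'] = toy_wr['Role'].replace({'MID': 'MIDDLE', 'SUPPORT': 'UTILITY', 'ADC': 'BOTTOM'})
toy_wr['name_key'] = toy_wr['Name'].map(name_key)

# Toy participants; the last one plays a role that has no winrate row.
toy_players = pd.DataFrame({
    'championName': ['Kaisa', 'LeeSin', 'Ahri'],
    'teamPosition': ['BOTTOM', 'JUNGLE', 'TOP'],
    'patch_version': ['13.5', '13.5', '13.5'],
})
toy_players['name_key'] = toy_players['championName'].map(name_key)

joined = toy_players.merge(
    toy_wr[['name_key', 'Role', 'Patch_Version', 'Win %']],
    how='left',
    left_on=['name_key', 'teamPosition', 'patch_version'],
    right_on=['name_key', 'Role', 'Patch_Version'],
)

# Rows without a winrate: off-role picks or names that still do not match.
missing_wr = joined['Win %'].isna()
print(joined[['championName', 'teamPosition', 'Win %']])
```

A left join keeps every participant, so the off-role cases show up as NaN and can be counted or filled afterwards.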

In [110]:
champion_wr_stats = pd.DataFrame()

for patch in patch_values:
    endpoint = f'https://www.metasrc.com/5v5/br/{patch}/stats?ranks=diamond'

    response = requests.get(endpoint)

    soup = BS(response.text, 'html.parser')

    # read the table with the stats, relative to the patch 
    temp_df = pd.read_html(str(soup.find('table', attrs={'class' : 'stats-table'})))[0]
    temp_df['Patch_Version'] = patch

    champion_wr_stats = pd.concat([champion_wr_stats, temp_df])

In [111]:
# fix the names
champion_wr_stats['Name'] = [x[0:int(len(x)/2)] for x in champion_wr_stats['Name']]

# additional fixes to names due to how they're in the other table; no spaces or '
champion_wr_stats['Name'] = champion_wr_stats['Name'].str.replace(' ','')
champion_wr_stats['Name'] = champion_wr_stats['Name'].str.replace("'","")

# fix the roles to align with other table
champion_wr_stats['Role'] = champion_wr_stats['Role'].replace({'MID': 'MIDDLE', 'SUPPORT': 'UTILITY', 'ADC': 'BOTTOM'})

In [112]:
# strip the % sign so the columns can later be parsed as numbers
champion_wr_stats['Win %'] = champion_wr_stats['Win %'].str.strip('%')
champion_wr_stats['Role %'] = champion_wr_stats['Role %'].str.strip('%')
champion_wr_stats['Pick %'] = champion_wr_stats['Pick %'].str.strip('%')
champion_wr_stats['Ban %'] = champion_wr_stats['Ban %'].str.strip('%')

In [113]:
champion_wr_stats.to_csv('../data/champion_wr_stats.csv', index=False)
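Stripping the `%` leaves the columns as strings until the CSV is read back. If numeric values are needed right away, a conversion along these lines could be used. It is a sketch on a toy frame, and treating unparseable entries (here a `-`) as NaN via `errors='coerce'` is an assumption about what the scraped table may contain:

```python
import pandas as pd

# Toy stand-in for champion_wr_stats with percentage strings.
toy_stats = pd.DataFrame({
    'Win %': ['51.2%', '48.75%', '-'],
    'Pick %': ['3.1%', '0.4%', '12%'],
})

pct_cols = ['Win %', 'Pick %']
for col in pct_cols:
    # Remove the trailing % and convert; anything unparseable becomes NaN.
    toy_stats[col] = pd.to_numeric(toy_stats[col].str.rstrip('%'), errors='coerce')

print(toy_stats.dtypes)
```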

In [30]:
X_train.head()

Unnamed: 0,matchId,teamId,participantId,summonerId,summonerLevel,championName,championId
0,0,100,1,0,303,Mordekaiser,82
1,0,100,2,1,616,Sylas,517
2,0,100,3,2,667,Lissandra,127
3,0,100,4,3,860,Caitlyn,51
4,0,100,5,4,325,Morgana,25


In [31]:
X_train_champsencoded = pd.get_dummies(X_train, prefix='Champion')

In [32]:
sdf =  ['asd', 'fff']

In [33]:
aaa = list(X_train_champsencoded.drop(columns=['matchId', 'teamId', 'participantId', 'summonerId', 'summonerLevel', 'championId']).columns.values)

In [34]:
sdf + aaa

['asd',
 'fff',
 'Champion_Aatrox',
 'Champion_Ahri',
 'Champion_Akali',
 'Champion_Akshan',
 'Champion_Alistar',
 'Champion_Amumu',
 'Champion_Anivia',
 'Champion_Annie',
 'Champion_Aphelios',
 'Champion_Ashe',
 'Champion_AurelionSol',
 'Champion_Azir',
 'Champion_Bard',
 'Champion_Belveth',
 'Champion_Blitzcrank',
 'Champion_Brand',
 'Champion_Braum',
 'Champion_Caitlyn',
 'Champion_Camille',
 'Champion_Cassiopeia',
 'Champion_Chogath',
 'Champion_Corki',
 'Champion_Darius',
 'Champion_Diana',
 'Champion_DrMundo',
 'Champion_Draven',
 'Champion_Ekko',
 'Champion_Elise',
 'Champion_Evelynn',
 'Champion_Ezreal',
 'Champion_FiddleSticks',
 'Champion_Fiora',
 'Champion_Fizz',
 'Champion_Galio',
 'Champion_Gangplank',
 'Champion_Garen',
 'Champion_Gnar',
 'Champion_Gragas',
 'Champion_Graves',
 'Champion_Gwen',
 'Champion_Hecarim',
 'Champion_Heimerdinger',
 'Champion_Illaoi',
 'Champion_Irelia',
 'Champion_Ivern',
 'Champion_Janna',
 'Champion_JarvanIV',
 'Champion_Jax',
 'Champion_Jayce