<img src="https://i.postimg.cc/1R2d17XR/logo-800-47024e2aeaaa8651c172ba883264dd43.png" alt="League of Legends logo">


**WHAT IS LEAGUE OF LEGENDS?**

League of Legends is a team-based strategy game where two teams of five powerful champions face off to destroy the other’s base. Choose from over 140 champions to make epic plays, secure kills, and take down towers as you battle your way to victory.

**DESTROY THE BASE**

The Nexus is the heart of both teams’ bases. Destroy the enemy’s Nexus first to win the game.

**CLEAR THE PATH**

Your team needs to clear at least one lane to get to the enemy Nexus. Blocking your path are defense structures called turrets and inhibitors. Each lane has three turrets and one inhibitor, and each Nexus is guarded by two turrets.

**TAKE ON THE JUNGLE**

In between the lanes is the jungle, where neutral monsters and jungle plants reside. The two most important monsters are Baron Nashor and the Drakes. Killing these units grants unique buffs for your team and can also turn the tide of the game.

**CHOOSE THE LANE**

There are five positions that make up the recommended team comp for the game. Each lane lends itself to certain kinds of champions and roles—try them all or lock in to the lane that calls you.

**POWER THE CHAMP**

Champions get stronger by earning experience to level up and gold to buy more powerful items as the game progresses. Staying on top of these two factors is crucial to overpowering the enemy team and destroying their base.

**UNLOCK ABILITIES**

Champions have five core abilities, two special spells, and up to seven items at a time. Figuring out the optimal ability order, summoner spells, and item build for your champion will help you succeed as a team.

<a href="https://na.leagueoflegends.com/en-us/how-to-play/">League of Legends Official Website</a>


<iframe width="720" height="480" src="https://www.youtube.com/embed/BGtROJeMPeE" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

This dataset contains statistics from the first 10 minutes of approximately 10k ranked games (SOLO QUEUE) at high ELO (DIAMOND I to MASTER). Players have roughly the same skill level.

Each game is unique. The gameId can help you to fetch more attributes from the Riot API.

There are 19 features per team (38 in total) collected after 10min in-game. This includes kills, deaths, gold, experience, level…

The column blueWins is the target value (the value we are trying to predict). A value of 1 means the blue team has won. 0 otherwise.

There are no missing values.
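
Claims such as "each game is unique" and "no missing values" are easy to verify once the CSV is loaded. The sketch below runs the checks on a tiny hand-made frame (the values are invented for illustration, not taken from the dataset); on the real data the same calls would be applied to `df`.

```python
import pandas as pd

# Toy stand-in for the real CSV; the numbers are made up for illustration.
toy = pd.DataFrame({
    "gameId": [1, 2, 3, 4],
    "blueWins": [1, 0, 0, 1],
    "blueKills": [6, 4, 7, 5],
    "redKills": [5, 8, 3, 5],
})

# Count missing cells per column, then overall.
missing_per_column = toy.isna().sum()
total_missing = int(missing_per_column.sum())

# Each game should appear only once.
duplicate_games = int(toy["gameId"].duplicated().sum())

print(missing_per_column)
print("total missing:", total_missing, "| duplicate gameIds:", duplicate_games)
```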

<a href="https://www.kaggle.com/bobbyscience/league-of-legends-diamond-ranked-games-10-min">Download Here</a>

In this notebook I compare all blue-side and red-side features to see whether any of them gives insight into what decides a win or a loss for each side.

I define and examine each objective separately for both sides, then finish by training a model to predict the winner from the first 10 minutes of data and assessing its accuracy.

LET'S GO!

<img src="https://i.postimg.cc/fyqGwySV/Banni-re-lol-fi19533912x700.jpg">

# Contents

* [<font size=4>1. Libraries for Fun</font>](#1)
* [<font size=4>2. First Look at the Data</font>](#2)
* [<font size=4>3. Data Report</font>](#3)
* [<font size=4>4. Comparison of Blue Side and Red Side Features</font>](#4)
 *     [4.1 Wins and Losses](#4.1)
 *     [4.2 Wards](#4.2)
        *     [4.2.1 Wards Placed](#4.2.1)
        *     [4.2.2 Wards Destroyed](#4.2.2)
 *     [4.3 First Bloods](#4.3)
 *     [4.4 Kills, Deaths and Assists](#4.4)
 *     [4.5 Neutral Goals](#4.5)
 *     [4.6 Turrets](#4.6)
 *     [4.7 Gold](#4.7)
 *     [4.8 Leveling and Experience](#4.8)
 *     [4.9 Minion Farming](#4.9)
 *     [4.10 Gold Difference](#4.10)
 *     [4.11 Experience Difference](#4.11)
 *     [4.12 Creep Score (CS)](#4.12)
 *     [4.13 Gold Per Minute](#4.13)
* [<font size=4>5. Winning Condition</font>](#5)
 *     [5.1 Blue Side Heatmap](#5.1)
 *     [5.2 Red Side Heatmap](#5.2)
 *     [5.3 Wins and Losses Correlation](#5.3)
* [<font size=4>6. Modeling</font>](#6)
* [<font size=4>7. Prediction and Reality</font>](#7)
* [<font size=4>8. Conclusion</font>](#8)

# 1. Libraries for Fun <a id="1"></a>

In [None]:
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
# Required so that Plotly figures render inside the notebook while the kernel is running.
init_notebook_mode(connected=True)

import numpy as np
import pandas as pd
import plotly as py
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import seaborn as sns
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import lightgbm as lgb
from bayes_opt import BayesianOptimization
from sklearn.metrics import roc_auc_score
import shap
from pandas_profiling import ProfileReport 

import warnings  
warnings.filterwarnings('ignore')

# 2. First Look at the Data <a id="2"></a>

In [None]:
df = pd.read_csv('../input/league-of-legends-diamond-ranked-games-10-min/high_diamond_ranked_10min.csv')
df['blueWins'] = df['blueWins'].map({1: 'Blue Side', 0:'Red Side'})

# Flag which side is ahead at 10 minutes: 1 if the difference is >= 0, else 0.
df['blueGoldDiffSituation'] = (df['blueGoldDiff'] >= 0).astype(int)
df['redGoldDiffSituation'] = (df['redGoldDiff'] >= 0).astype(int)

df['blueExperienceDiffSituation'] = (df['blueExperienceDiff'] >= 0).astype(int)
df['redExperienceDiffSituation'] = (df['redExperienceDiff'] >= 0).astype(int)
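
One detail of these flags is worth a quick check: because a side counts as "ahead" when its difference is `>= 0`, a game with a difference of exactly 0 is flagged for both sides. The sketch below shows this on invented values, assuming (as the column names suggest, but not verified here) that the red difference is the negative of the blue one.

```python
import pandas as pd

# Invented gold differences; red is assumed to be the mirror image of blue.
toy = pd.DataFrame({"blueGoldDiff": [250, -1200, 0, 40]})
toy["redGoldDiff"] = -toy["blueGoldDiff"]

# Same rule as in the notebook: 1 when the difference is non-negative.
toy["blueGoldDiffSituation"] = (toy["blueGoldDiff"] >= 0).astype(int)
toy["redGoldDiffSituation"] = (toy["redGoldDiff"] >= 0).astype(int)

# Games where both flags are 1 are exact ties.
ties = toy[(toy["blueGoldDiffSituation"] == 1) & (toy["redGoldDiffSituation"] == 1)]
print(toy)
print("tied games:", len(ties))
```

If ties should not count for either side, a strict `> 0` comparison would be the alternative.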

In [None]:
df.info()  # info() prints its summary itself and returns None, so no display() is needed
df.head()

# 3. Data Report <a id="3"></a>

In [None]:
report_data = ProfileReport(df.sample(2000))
report_data

# 4. Comparison of Blue Side and Red Side Features <a id="4"></a>

## 4.1 Wins and Losses <a id="4.1"></a>

<img src="https://i.postimg.cc/GpkNNqcg/Dead-Fabulous-Europeanfiresalamander-size-restricted.gif" align="right" width="300" height="200">
To win in League of Legends, you have to destroy the opposing Nexus by gradually destroying the opposing turrets and inhibitors.
<BR CLEAR="left" />


In [None]:
color_discrete_map = {'Blue Side': 'rgb(122, 148, 231)', 'Red Side': 'rgb(255, 105, 97)'}

fig = px.histogram(df, x="blueWins",color = 'blueWins', color_discrete_map=color_discrete_map,
                  labels={
                     "blueWins": "Sides","count": "Wins",
                 },
                title="Total Wins per Side",
                hover_name="blueWins",       
                  )
# fig.show()
py.offline.iplot(fig)

Wins and losses are balanced between the blue and red sides, which is good news: neither outcome is over-represented.
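
The balance visible in the histogram can also be read as a share. A minimal sketch on invented labels (not the real outcomes): `value_counts(normalize=True)` returns each side's fraction of wins.

```python
import pandas as pd

# Invented outcomes, labelled the same way as the mapped `blueWins` column.
wins = pd.Series(["Blue Side", "Red Side", "Red Side", "Blue Side",
                  "Blue Side", "Red Side", "Blue Side", "Red Side"])

# Fraction of games won by each side.
share = wins.value_counts(normalize=True)
print(share)

# A simple imbalance measure: gap between the largest and smallest share.
imbalance = float(share.max() - share.min())
print("imbalance:", imbalance)
```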

## 4.2 Wards <a id="4.2"></a>

<img src="https://i.postimg.cc/nL28r06j/Warding-Totem-screenshot.png" align="right" width="300" height="200">
A ward is a deployable unit that removes the fog of war in a certain area of the map. Vision is not only about buying and placing wards: it does not always come from a unit, and it can also be granted by champions and by the items they purchase.
<BR CLEAR="left" />

<a href="https://mobalytics.gg/blog/warding-guide/#:~:text=A%20ward%20is%20a%20deployable,two%20different%20types%20of%20trinkets.">Vision in League of Legends</a>

## 4.2.1 Wards Placed <a id="4.2.1"></a>

In [None]:
color_discrete_map = {'Blue': 'rgb(122, 148, 231)', 'Red': 'rgb(255, 105, 97)'}

tmp1 = df[['blueWardsPlaced', 'blueWardsDestroyed']].copy()
tmp1.columns = ['WardsPlaced','WardsDestroyed']
tmp1 = tmp1.astype(float)
tmp1['Side'] = 'Blue'
tmp2 = df[['redWardsPlaced', 'redWardsDestroyed']].copy()
tmp2.columns = ['WardsPlaced','WardsDestroyed']
tmp2 = tmp2.astype(float)
tmp2['Side'] = 'Red'
data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="WardsPlaced", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Wards Placed per Side')
fig2 = px.violin(data, y="WardsPlaced", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Wards Placed per Side Zoomed')
fig2.update_layout(
    yaxis=dict(
        range=[5, 45]
    )
)
fig1.show()
fig2.show()

Both sides most commonly place about 16 wards in the first 10 minutes, with typical values between 10 and 25.

However, some teams place up to 40 wards in 10 minutes, which is excessive.

Outliers remain; they probably come from tilted players who drop wards in their own base while waiting for a quick defeat.

The distribution of the number of wards placed is equivalent for the two sides.
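
To put a number on "outlier", one common convention (an assumption here, not a rule the notebook applies) is Tukey's 1.5 × IQR fence, similar to what a box plot uses for its whiskers. A sketch on invented ward counts:

```python
import pandas as pd

# Invented ward counts for ten games; 40 and 110 are deliberately extreme.
wards = pd.Series([12, 14, 15, 16, 16, 17, 18, 20, 40, 110])

q1 = wards.quantile(0.25)
q3 = wards.quantile(0.75)
iqr = q3 - q1

# Tukey's rule: anything beyond 1.5 * IQR from the quartiles is an outlier.
upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr
outliers = wards[(wards > upper_fence) | (wards < lower_fence)]
print(outliers.tolist())
```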

## 4.2.2 Wards Destroyed <a id="4.2.2"></a>

In [None]:
fig1 = px.violin(data, y="WardsDestroyed", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Wards Destroyed per Side')
fig2 = px.violin(data, y="WardsDestroyed", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Wards Destroyed per Side Zoomed')
fig2.update_layout(
    yaxis=dict(
        range=[0, 10]
    )
)
fig1.show()
fig2.show()

On average, each side destroys 2 wards, with values ranging between 0 and 6.

Zooming in, we can see that the blue side destroys slightly more wards than the red side, but the difference is insignificant.
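
Both ward plots, and most of the cells in this section, rely on the same reshaping step: take a blue column and a red column, give them a common name, and stack them with a `Side` label. A small helper could capture that pattern. Note that `side_long` is a hypothetical name introduced here, not part of the notebook, and the values are invented.

```python
import pandas as pd

def side_long(frame, features):
    """Stack blue*/red* columns into long format with a 'Side' column.

    `features` are column suffixes such as 'WardsPlaced'; the frame is
    expected to contain 'blue<feature>' and 'red<feature>' for each one.
    """
    parts = []
    for side in ("blue", "red"):
        part = frame[[side + f for f in features]].copy()
        part.columns = features            # drop the side prefix
        part["Side"] = side.capitalize()   # 'Blue' / 'Red'
        parts.append(part)
    return pd.concat(parts, ignore_index=True)

# Invented values, only to show the shape of the result.
toy = pd.DataFrame({
    "blueWardsPlaced": [15, 20], "redWardsPlaced": [17, 13],
    "blueWardsDestroyed": [2, 3], "redWardsDestroyed": [1, 4],
})
long_df = side_long(toy, ["WardsPlaced", "WardsDestroyed"])
print(long_df)
print(long_df.groupby("Side").mean())
```

The long frame can then be passed to `px.violin` or aggregated with `groupby('Side')`, as the cells do by hand.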

## 4.3 First Bloods <a id="4.3"></a>

First blood is the first champion kill of the game.

In [None]:
color_discrete_map = {'Blue': 'rgb(122, 148, 231)', 'Red': 'rgb(255, 105, 97)'}

tmp1 = df[['blueFirstBlood']].copy()
tmp1.columns = ['FirstBloods']
tmp1 = tmp1.astype(float)
tmp1['Side'] = 'Blue'
tmp2 = df[['redFirstBlood']].copy()
tmp2.columns = ['FirstBloods']
tmp2 = tmp2.astype(float)
tmp2['Side'] = 'Red'
data = pd.concat([tmp1, tmp2])
data = data.groupby('Side').mean().reset_index()

fig = px.bar(data, x='Side', y='FirstBloods',color = 'Side', color_discrete_map=color_discrete_map, title = 'Mean First Bloods per Side')
fig.show()

A very slight advantage for the blue side (95 more first bloods), but the number of first bloods is roughly equal between the two sides.
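
Assuming `blueFirstBlood` and `redFirstBlood` are 0/1 flags, their mean is the share of games in which that side drew first blood, and their sum is the count. A sketch on invented flags shows how a share and a count gap relate:

```python
import pandas as pd

# Invented flags: exactly one side gets first blood in each game.
toy = pd.DataFrame({"blueFirstBlood": [1, 0, 1, 1, 0, 1, 0, 1]})
toy["redFirstBlood"] = 1 - toy["blueFirstBlood"]

share = toy[["blueFirstBlood", "redFirstBlood"]].mean()   # fraction of games
count = toy[["blueFirstBlood", "redFirstBlood"]].sum()    # number of games
gap = int(count["blueFirstBlood"] - count["redFirstBlood"])

print(share)
print("blue minus red first bloods:", gap)
```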

## 4.4 Kills, Deaths and Assists (KDA) <a id="4.4"></a>

* A kill is counted each time a player kills another player in the game.

* A death is counted each time a player is killed by another player in the game.

* An assist is counted each time a player helps a teammate kill another player in the game.

In [None]:
col = ['blueKills','blueDeaths','blueAssists','redKills','redDeaths','redAssists']
tmp1 = df[col[0:3]].copy()
tmp1.columns = ['Kills','Deaths','Assists']
tmp1['Side'] = 'Blue'
tmp2 = df[col[3:6]].copy()
tmp2.columns = ['Kills','Deaths','Assists']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()
data = pd.melt(data, id_vars=['Side'], value_vars=['Kills','Deaths','Assists'])
data.columns = ['Side','KDA','Mean']

fig = px.bar(data, x="Side", y="Mean", color="Side", color_discrete_map=color_discrete_map,
             facet_col="KDA", title = 'Mean KDA per Side'
            )
fig.show()

In [None]:
col = ['blueKills','blueDeaths','blueAssists','redKills','redDeaths','redAssists']
tmp1 = df[col[0:3]].copy()
tmp1.columns = ['Kills','Deaths','Assists']
tmp1['Side'] = 'Blue'
tmp2 = df[col[3:6]].copy()
tmp2.columns = ['Kills','Deaths','Assists']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
# data = data.groupby('Side').sum().reset_index()
data = pd.melt(data, id_vars=['Side'], value_vars=['Kills','Deaths','Assists'])
data.columns = ['Side','KDA','Total']

fig = make_subplots(rows=1, cols=2, specs=[[{'type':'domain'}, {'type':'domain'}]])
fig.add_trace(go.Pie(labels=data[data['Side'] == 'Red']['KDA'], values=data[data['Side'] == 'Red']['Total'], name="Red Side"),
              1, 1)
fig.add_trace(go.Pie(labels=data[data['Side'] == 'Blue']['KDA'], values=data[data['Side'] == 'Blue']['Total'], name="Blue Side"),
              1, 2)

fig.update_traces(hole=.4, hoverinfo="label+percent+name+value")

fig.update_layout(
    title_text="KDA Proportion per Side",
    annotations=[dict(text='Red', x=0.18, y=0.5, font_size=20, showarrow=False),
                 dict(text='Blue', x=0.82, y=0.5, font_size=20, showarrow=False)])
fig.show()

In [None]:
fig = go.Figure()

fig.add_trace(go.Violin(x=data['KDA'][ data['Side'] == 'Blue' ],
                        y=data['Total'][ data['Side'] == 'Blue' ],
                        legendgroup='Blue', scalegroup='Blue', name='Blue',
                        line_color='rgb(122, 148, 231)')
             )
fig.add_trace(go.Violin(x=data['KDA'][ data['Side'] == 'Red' ],
                        y=data['Total'][ data['Side'] == 'Red' ],
                        legendgroup='Red', scalegroup='Red', name='Red',
                        line_color='rgb(255, 105, 97)')
             )

fig.update_traces(box_visible=True, meanline_visible=True)
fig.update_layout(violinmode='group', title = 'KDA Distribution per Side')
fig.show()

The red side has slightly fewer kills and slightly more deaths and assists, and vice versa for the blue side.

Overall, however, the two sides remain equivalent.

The share of red-side kills equals the share of blue-side deaths, which is expected because 1 blue kill = 1 red death and 1 red kill = 1 blue death. This means the blue side has more kills, while assists remain equivalent.

Finally, the variance is large for both sides.
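
The kill/death symmetry can be checked directly. The sketch below uses invented numbers and assumes every death is credited to the enemy team; if that does not hold on the real data (for example, deaths not caused by an enemy champion), the equality would break for those games.

```python
import pandas as pd

# Invented per-game totals for three games.
toy = pd.DataFrame({
    "blueKills": [6, 4, 7], "blueDeaths": [5, 8, 3],
    "redKills":  [5, 8, 3], "redDeaths":  [6, 4, 7],
})

# Every blue kill should be a red death, and every red kill a blue death.
blue_kills_match = bool((toy["blueKills"] == toy["redDeaths"]).all())
red_kills_match = bool((toy["redKills"] == toy["blueDeaths"]).all())
print(blue_kills_match, red_kills_match)
```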

## 4.5 Neutral Goals <a id="4.5"></a>

Neutral objectives are non-playable characters that belong to neither side, are hostile to both, and offer rewards when killed.

There are 4 categories of neutral monsters:

<img src="https://i.postimg.cc/PJrRCwgt/article-header-monstericon-en.jpg" align="right" width="500" height="300">

1. **Elite Monsters :**

    * **Gromp** begins fights with 100% bonus attack speed and full attack damage, decaying over 5 attacks to 0% bonus attack speed and 71.43% attack damage. Gromp's basic attacks deal magic damage.
    * The **Raptor** camp consists of one large Crimson Raptor and five lesser Raptors. Killing this camp awards gold and experience.
    * The **Murk Wolf** camp consists of one large Greater Murk Wolf and two lesser Murk Wolves. Killing this camp awards gold and experience.
    * The **Blue Sentinel** is a neutral monster on Summoner's Rift. It has negative magic resistance. Kill the Blue Sentinel to receive the Crest of Insight, a buff which grants increased mana regeneration (or energy regeneration) as well as cooldown reduction.
    * The **Red Brambleback** is a neutral monster on Summoner's Rift. It has negative armor. Kill the Red Brambleback to receive the Crest of Cinders, a buff which grants health regeneration and causes your basic attacks to slow the enemy while also dealing true damage over time.
    * The **Krug** camp consists of one large Ancient Krug and one medium Krug. Killing the Ancient Krug will spawn two medium Krugs, and killing medium Krugs will spawn two lesser Krugs, for a total of 10 units. Killing this camp awards gold and experience. All additional units spawned will spawn with one lower level (capped at level 1) and will despawn after being out of combat for some time.
    

<BR CLEAR="left" />


<img src="https://i.postimg.cc/XJGvTsN5/unnamed.jpg" align="right" width="500" height="300">

2. **Dragons :**

    * The buff provided by **Cloud Drake** is useful for both teams regardless of their gold lead or deficit; leading teams can create pressure, split push, and rotate around the map faster for positioning, while trailing teams can still use it to expand vision, respond to split pushing, and position defensively. Even with a single stack, Cloud Drake can be very powerful in split push compositions, especially if it's denied to the enemy team.
    * The buff provided by **Infernal Drake** is considered by many players the strongest, due to its raw power boost. If viable, contest the dragon at all costs. In the early game it doesn't make much difference, but its power in late game is massive.
    * The buff provided by **Mountain Drake** allows teams more durability for extended fights, and its effectiveness increases the more resistances the team builds - it can be useful to build more armor and magic resistance when you have multiple of this buff.
    * The buff provided by **Ocean Drake** is deceptively powerful during the laning phase: the healing during downtime allows less assertive laners to extend their presence. Another benefit is its ability to mitigate damage against poke teams, forcing the enemy team to spend more mana and delaying or even preventing potential flanks.
    * **Elder Dragon** is a much more powerful caliber of dragon, requiring multiple champions to secure quickly.
    * Due to its durability and the strength of the reward, attempting to kill an Elder Dragon usually prompts the enemy team to contest the kill. Therefore, teams wishing to slay it must take great caution when doing so while the majority of the enemy team is alive, much like Baron Nashor.
    
<BR CLEAR="left" />


<img src="https://i.postimg.cc/rw1xHFZT/image.jpg" align="right" width="250" height="150">

3. **Herald :**

    * The Eye of the Herald is a neutral buff dropped by the Rift Herald when it dies. It is available only on Summoner's Rift. Only members of the team that killed the Rift Herald can pick it up by walking on it. If nobody picks it up, it disappears after 20 seconds. The Eye also replaces the trinket item in your trinket slot. You can break the Eye to summon the Rift Herald to aid your team. Enemies hear the Rift Herald's cry as an audio cue when the Rift Herald is summoned.

<BR CLEAR="left" />

<br><br><br><br><br><br>

<img src="https://i.postimg.cc/FsBhQjqg/17f85de0f06f604ceaee89899d76e4ea.jpg" align="right" width="400" height="300">

4. **Baron Nashor :**

    * Baron Nashor is the most powerful neutral monster on Summoner's Rift in League of Legends. Killing Baron Nashor grants the Hand of Baron buff to all living teammates for 180 seconds. The buff gives attack damage, ability power, Empowered Recall, and an aura that greatly increases the power of nearby minions.

PS: Baron Nashor only appears from 20 minutes into the game, so we won't see him in this 10-minute dataset =/

<BR CLEAR="left" />

<a href="https://leagueoflegends.fandom.com/wiki/League_of_Legends_Wiki">League of Legends Wiki</a>

<br><br><br><br><br><br>

<img src="https://i.postimg.cc/ZKVxYqFL/Jungle-camps-SR.jpg" width="600" height="400">


In [None]:
col = ['blueEliteMonsters','blueDragons','blueHeralds','redEliteMonsters','redDragons','redHeralds']
tmp1 = df[col[0:3]].copy()
tmp1.columns = ['EliteMonsters','Dragons','Heralds']
tmp1['Side'] = 'Blue'
tmp2 = df[col[3:6]].copy()
tmp2.columns = ['EliteMonsters','Dragons','Heralds']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()
data=pd.melt(data, id_vars=['Side'], value_vars=['EliteMonsters','Dragons','Heralds'])
data.columns = ['Side','NeutralGoals','Mean']

fig = px.bar(data, x="Side", y="Mean", color="Side", color_discrete_map=color_discrete_map,
             facet_col="NeutralGoals", title = 'Mean Neutral Goals per Side'
            )
fig.show()

The red side kills more elite monsters and dragons, while the blue side kills more heralds.

The pits of some monsters do not face both sides symmetrically on the map, which benefits one side more than the other.

## 4.6 Turrets <a id="4.6"></a>


<img src="https://i.postimg.cc/Z0tn4psq/Summoner-s-Rift-Outer-turret.png" align="right" width="200" height="400">
Turrets, also called towers, are heavy fortifications that attack enemy units on sight. Turrets are a core component of League of Legends. They deal damage to enemies and provide vision to their team, allowing them to better control the battlefield. Turrets target one unit at a time and deal heavy damage. Teams must destroy enemy turrets to push their assault into enemy territory.

<BR CLEAR="left" />

<a href="https://leagueoflegends.fandom.com/wiki/League_of_Legends_Wiki">League of Legends Wiki</a>

In [None]:
col = ['blueTowersDestroyed', 'redTowersDestroyed']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['TowersDestroyed']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['TowersDestroyed']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()
data

fig = px.bar(data, x="Side", y="TowersDestroyed", color="Side", color_discrete_map=color_discrete_map,title = 'Mean Turrets Destroyed per Side'
            )
fig.show()

The blue side destroys many more turrets than the red side, almost 100 more.

This may be due to the red side's focus on neutral objectives.

## 4.7 Gold <a id="4.7"></a>

<img src="https://i.postimg.cc/Z5zHDQcG/4e0c55dcd8468bb3076e149cd084ed99.jpg" align="right" width="300" height="200">

Gold is the in-game currency of League of Legends. It is used to buy items in the shop that provide champions with bonus stats and abilities, which in turn is one of the main ways for champions to increase their power over the course of a game.

At the beginning of the game, champions are given starting gold based on the map being played on, and can receive more gold through various means.

<BR CLEAR="left" />

<a href="https://leagueoflegends.fandom.com/wiki/League_of_Legends_Wiki">League of Legends Wiki</a>



In [None]:
col = ['blueTotalGold', 'redTotalGold']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['TotalGold']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['TotalGold']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()


fig = px.bar(data, x="Side", y="TotalGold", color="Side", color_discrete_map=color_discrete_map,title = 'Mean Gold per Side'
            )
fig.show()

data = pd.concat([tmp1, tmp2], ignore_index = True)
data

fig1 = px.violin(data, y="TotalGold", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Total Gold per Side Distribution')
fig1.show()

The amount of gold earned is equivalent for both sides; however, the blue side shows slightly more variance.

## 4.8 Leveling and Experience <a id="4.8"></a>

Champion experience (XP) is a game mechanic that allows champions to level up after reaching certain amounts of experience. Leveling up gives them access to new abilities or higher ranks of existing abilities. Many base stats and some items and runes scale with a champion's level.

Experience isn't gained passively over time; it has to be earned in various ways.

<a href="https://leagueoflegends.fandom.com/wiki/League_of_Legends_Wiki">League of Legends Wiki</a>

In [None]:
col = ['blueAvgLevel', 'redAvgLevel']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['AvgLevel']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['AvgLevel']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()


fig = px.bar(data, x="Side", y="AvgLevel", color="Side", color_discrete_map=color_discrete_map,title = 'Mean Level per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="AvgLevel", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Level per Side Distribution')
fig1.show()

In [None]:
col = ['blueTotalExperience', 'redTotalExperience']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['TotalExperience']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['TotalExperience']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()


fig = px.bar(data, x="Side", y="TotalExperience", color="Side", color_discrete_map=color_discrete_map,title = 'Mean Experience per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="TotalExperience", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Total Experience per Side Distribution')
fig1.show()

Apart from some difference in variance, leveling is practically the same for both sides.

For experience gained, however, the blue side has a slight advantage.

## 4.9 Minion Farming <a id="4.9"></a>

<img src="https://i.postimg.cc/pLm1f5Ps/Cs6-TGmc-UIAAc7-VQ-0.jpg" align="right" width="400" height="200">

Minions are units that comprise the main force sent by the Nexus. They spawn periodically from their nexus and advance along a lane towards the enemy nexus, automatically engaging any enemy unit or structure they encounter. They are controlled by artificial intelligence, and only use basic attacks.

There are four types of minions: melee, caster, siege and super minions.

<BR CLEAR="left" />

<a href="https://leagueoflegends.fandom.com/wiki/League_of_Legends_Wiki">League of Legends Wiki</a>



In [None]:
# blueTotalMinionsKilled

col = ['blueTotalMinionsKilled', 'redTotalMinionsKilled']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['TotalMinionsKilled']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['TotalMinionsKilled']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()


fig = px.bar(data, x="Side", y="TotalMinionsKilled", color="Side", color_discrete_map=color_discrete_map,title = 'Mean Minions Killed per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="TotalMinionsKilled", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Total Minions Killed per Side Distribution')
fig1.show()

In [None]:
# blueTotalJungleMinionsKilled

col = ['blueTotalJungleMinionsKilled', 'redTotalJungleMinionsKilled']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['TotalJungleMinionsKilled']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['TotalJungleMinionsKilled']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()


fig = px.bar(data, x="Side", y="TotalJungleMinionsKilled", color="Side", color_discrete_map=color_discrete_map,title = 'Mean Jungle Minions Killed per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="TotalJungleMinionsKilled", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Total Jungle Minions Killed per Side Distribution')
fig1.show()

Farming is generally even between the two sides; however, the red side has a small advantage in farming minions.

## 4.10 Gold Difference <a id="4.10"></a>

This is the difference in total gold between the two sides.

In [None]:
# blueGoldDiff

col = ['blueGoldDiff', 'redGoldDiff']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['GoldDiff']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['GoldDiff']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()

fig = px.bar(data, x="Side", y="GoldDiff", color="Side", color_discrete_map=color_discrete_map, title = 'Mean Gold Diff per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="GoldDiff", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Gold Diff per Side Distribution')
fig1.show()

On average, the blue side is ahead by 15 gold. This is very small compared with the roughly 16k gold each side earns.

It is not negligible, however, because it can have a snowball effect on the rest of the game. We will come back to this a little further on.

The distribution seems to be the same for both sides.
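
To see how small that lead is, it helps to express it as a fraction of a team's gold, using only the two rounded figures quoted above (15 gold and 16k gold):

```python
mean_gold_diff = 15        # average blue lead at 10 minutes, as quoted above
mean_total_gold = 16_000   # approximate gold total at 10 minutes, as quoted above

# Lead expressed as a fraction of the total.
relative_lead = mean_gold_diff / mean_total_gold
print(f"{relative_lead:.4%}")
```

That is a lead of less than a tenth of a percent.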

In [None]:
col = ['blueGoldDiffSituation', 'redGoldDiffSituation']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['GoldDiffSituation']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['GoldDiffSituation']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()

fig = px.bar(data, x="Side", y="GoldDiffSituation", color="Side", color_discrete_map=color_discrete_map, title = 'Gold Difference Situation per Side'
            )
fig.show()


The gold difference situation indicates which side has the gold advantage.

It looks like 50/50, but if we zoom in we can see that the blue side has a small advantage.

## 4.11 Experience Difference <a id="4.11"></a>

This is the difference in total experience between the two sides.

In [None]:
# blueExperienceDiff

col = ['blueExperienceDiff', 'redExperienceDiff']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['ExperienceDiff']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['ExperienceDiff']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()

fig = px.bar(data, x="Side", y="ExperienceDiff", color="Side", color_discrete_map=color_discrete_map, title = 'Mean Experience Difference per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="ExperienceDiff", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Experience Difference per Side Distribution')
fig1.show()

Very interesting: this looks like the exact opposite of the gold difference. The blue side has more gold and the red side has more experience.

In [None]:
col = ['blueExperienceDiffSituation', 'redExperienceDiffSituation']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['ExperienceDiffSituation']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['ExperienceDiffSituation']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()

fig = px.bar(data, x="Side", y="ExperienceDiffSituation", color="Side", color_discrete_map=color_discrete_map, title = 'Mean Experience Difference Situation per Side'
            )
fig.show()


As with the gold difference, the experience difference situation indicates which side has the experience advantage.

It looks like 50/50, but if we zoom in we can see that the red side has a small advantage, the opposite of gold.

## 4.12 Creep Score (CS) <a id="4.12"></a>

Minions (creeps) killed per minute for both sides.
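
With a 10-minute window, a per-minute rate is just the total divided by 10. The sketch below assumes that `blueCSPerMin` is derived from `blueTotalMinionsKilled` in exactly this way, which is plausible but not confirmed by the dataset description; the values are invented.

```python
import pandas as pd

MINUTES = 10  # the dataset covers the first 10 minutes of each game

# Invented minion totals for three games.
toy = pd.DataFrame({"blueTotalMinionsKilled": [195, 230, 212]})

# Creep score per minute under the assumption stated above.
toy["blueCSPerMin"] = toy["blueTotalMinionsKilled"] / MINUTES
print(toy)
```

If the assumption holds, the per-minute column carries no information beyond the total, which is worth remembering when reading correlations between the two.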

In [None]:
# blueCSPerMin	

col = ['blueCSPerMin', 'redCSPerMin']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['CSPerMin']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['CSPerMin']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()

fig = px.bar(data, x="Side", y="CSPerMin", color="Side", color_discrete_map=color_discrete_map, title = 'Mean CS Per Min per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="CSPerMin", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'CS Per Min per Side Distribution')
fig1.show()

The red side has a very small advantage in minion kills.

## 4.13 Gold Per Minute <a id="4.13"></a>

Total gold collected per minute.

In [None]:
# blueGoldPerMin

col = ['blueGoldPerMin', 'redGoldPerMin']
tmp1 = df[col[0:1]].copy()
tmp1.columns = ['GoldPerMin']
tmp1['Side'] = 'Blue'
tmp2 = df[col[1:2]].copy()
tmp2.columns = ['GoldPerMin']
tmp2['Side'] = 'Red'

data = pd.concat([tmp1, tmp2], ignore_index = True)
data = data.groupby('Side').mean().reset_index()

fig = px.bar(data, x="Side", y="GoldPerMin", color="Side", color_discrete_map=color_discrete_map, title = 'Mean Gold Per Min per Side'
            )
fig.show()


data = pd.concat([tmp1, tmp2], ignore_index = True).sample(2000)
data

fig1 = px.violin(data, y="GoldPerMin", color = 'Side',  box=True, points='all', color_discrete_map=color_discrete_map, title = 'Gold Per Min per Side Distribution')
fig1.show()

The averages are the same for both sides; however, there is more variance on the blue side.
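
The "same mean, different spread" observation can also be checked numerically rather than by eye. The sketch below wraps the `tmp1`/`tmp2` reshaping used in this section in a helper; `side_long` is a hypothetical name, and the toy values are invented, not taken from the dataset.

```python
import pandas as pd

def side_long(df, blue_col, red_col, value_name):
    """Stack a blue/red column pair into one long frame with a 'Side' label."""
    blue = df[[blue_col]].rename(columns={blue_col: value_name}).assign(Side="Blue")
    red = df[[red_col]].rename(columns={red_col: value_name}).assign(Side="Red")
    return pd.concat([blue, red], ignore_index=True)

# Hypothetical toy values: same mean on both sides, wider spread for blue.
toy = pd.DataFrame({
    "blueGoldPerMin": [1500.0, 1700.0, 1900.0],
    "redGoldPerMin": [1650.0, 1700.0, 1750.0],
})

long_df = side_long(toy, "blueGoldPerMin", "redGoldPerMin", "GoldPerMin")

# Mean and standard deviation per side in one table.
summary = long_df.groupby("Side")["GoldPerMin"].agg(["mean", "std"])
print(summary)
```

On the real data, the same call would be `side_long(df, 'blueGoldPerMin', 'redGoldPerMin', 'GoldPerMin')`, and the helper could replace the repeated copy/rename cells for the other metrics as well.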

# 5. Winning Condition <a id="5"></a>

After seeing which side achieved its objectives best, let's now look at how these objectives influence wins and losses.

## 5.1 Blue Side Heatmap <a id="5.1"></a>

By computing the correlation between the features, we can already see which ones correlate positively, negatively, or not at all.

Don't forget that we are talking about the first 10 minutes of the game only =).

In [None]:
corr = df[[col for col in df.columns if 'blue' in col and col != 'blueWins']].corr()
f,ax = plt.subplots(figsize=(20, 20))
p = sns.heatmap(corr,
                cmap='coolwarm',
                annot=True,
                fmt=".1f",
                annot_kws={'size':10},
                cbar=False,
                ax=ax)
p.set_title('Blue Side Features Correlation')

* Total gold correlates positively with gold per minute, gold difference, total experience, assists and kills, and negatively with deaths.
* Positive correlation between elite monsters and dragons.
* Positive correlation between gold per minute and kills.
* Negative correlation between deaths and both gold difference and experience difference.
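
With a 20x20 heatmap it is easy to miss a pair, so the strongest relationships can also be listed directly from the correlation matrix. This is a sketch on synthetic data; `top_correlated_pairs` is a hypothetical helper, and the toy columns only borrow the naming style of the dataset.

```python
import numpy as np
import pandas as pd

def top_correlated_pairs(corr, n=5):
    """Return the n feature pairs with the largest absolute correlation."""
    # Keep only the strict upper triangle so each pair appears once
    # and the diagonal (always 1.0) is dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    # Sort by absolute value but keep the sign in the result.
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)

# Hypothetical toy features: gold depends on kills, wards are independent.
rng = np.random.default_rng(0)
kills = rng.poisson(6, size=200).astype(float)
toy = pd.DataFrame({
    "blueKills": kills,
    "blueTotalGold": 15000 + 300 * kills + rng.normal(0, 200, size=200),
    "blueWardsPlaced": rng.poisson(20, size=200).astype(float),
})

print(top_correlated_pairs(toy.corr(), n=2))
```

Applied to the notebook's `corr` frame, `top_correlated_pairs(corr, n=10)` would give a ranked version of the bullet list above.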

## 5.2 Red Side Heatmap <a id="5.2"></a>

In [None]:
corr = df[[col for col in df.columns if 'red' in col and col != 'blueWins']].corr()
f,ax = plt.subplots(figsize=(20, 20))
p = sns.heatmap(corr,
                cmap='coolwarm',
                annot=True,
                fmt=".1f",
                annot_kws={'size':10},
                cbar=False,
                ax=ax)

Globally the same as the blue side.

## 5.3 Wins and Losses Correlation <a id="5.3"></a>

Let's see what drives wins and losses on both sides.

In [None]:
df['blueWins'] = df['blueWins'].map({'Blue Side': 1, 'Red Side': 0})

In [None]:
blue_win = df[[col for col in df.columns if col != 'blueWins']].corrwith(df['blueWins']).to_frame().sort_values(by = 0, ascending = False)
blue_win = pd.concat([blue_win.head(5), blue_win.tail(5)])
blue_win.columns = ['Blue Win Correlation']
blue_win

red_win = df[[col for col in df.columns if col != 'blueWins']].corrwith(df['blueWins'].map({0:1, 1:0})).to_frame().sort_values(by = 0, ascending = False)
red_win = pd.concat([red_win.head(5), red_win.tail(5)])
red_win.columns = ['Red Win Correlation']
red_win

# One figure with two side-by-side axes, one heatmap per side.
# (The extra plt.figure(figsize=(6,6)) call was dropped: it only created an empty figure.)
fig = plt.figure(figsize=(25,10))
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)

sns.heatmap(blue_win,
            vmin=-1,
            cmap='coolwarm',
            annot=True,
            ax=ax1);

sns.heatmap(red_win,
            vmin=-1,
            cmap='coolwarm',
            annot=True,
            ax=ax2);

To sum up, the further ahead a team is in experience and gold, the more likely it is to win.
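
One detail about the two tables above: a red win is just `1 - blueWins`, so every red-side correlation is the blue-side correlation with the sign flipped. The sketch below checks this on made-up data; the column names follow the dataset's style but the values are synthetic.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a gold lead that raises blue's chance of winning.
rng = np.random.default_rng(1)
gold_diff = rng.normal(0, 1500, size=500)
blue_wins = (gold_diff + rng.normal(0, 2000, size=500) > 0).astype(int)
toy = pd.DataFrame({
    "blueGoldDiff": gold_diff,
    "redGoldDiff": -gold_diff,
    "blueWins": blue_wins,
})

features = toy.drop(columns="blueWins")
blue_corr = features.corrwith(toy["blueWins"])
# Same inversion as map({0: 1, 1: 0}) in the cell above.
red_corr = features.corrwith(1 - toy["blueWins"])
print(pd.DataFrame({"blue": blue_corr, "red": red_corr}))
```

In other words, the top five rows of one table are the bottom five of the other, so a single table already carries all the information.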

# 6. Modeling <a id="6"></a>

Let's train LightGBM on this dataset and see how well it can distinguish wins from losses.

I will use Bayesian optimisation to find good hyperparameters and improve performance.

In [None]:
X = df.drop(['blueWins', 'gameId'], axis=1)
y = df.blueWins

In [None]:
# X = MinMaxScaler().fit_transform(df.drop(['blueWins', 'gameId'], axis=1))

def bayes_parameter_opt_lgb(X = X, y = y, init_round=15, opt_round=25, n_folds=5, random_seed=6, n_estimators=10000, learning_rate=0.05, output_process=False):
    # prepare data
    train_data = lgb.Dataset(data=X, label=y, free_raw_data=False)
    # parameters
    def lgb_eval(num_leaves, feature_fraction, bagging_fraction, max_depth, lambda_l1, lambda_l2, min_split_gain, min_child_weight):
        params = {'application':'binary','num_iterations': n_estimators, 'learning_rate':learning_rate, 'early_stopping_round':100, 'metric':'auc'}
        params["num_leaves"] = int(round(num_leaves))
        params['feature_fraction'] = max(min(feature_fraction, 1), 0)
        params['bagging_fraction'] = max(min(bagging_fraction, 1), 0)
        params['max_depth'] = int(round(max_depth))
        params['lambda_l1'] = max(lambda_l1, 0)
        params['lambda_l2'] = max(lambda_l2, 0)
        params['min_split_gain'] = min_split_gain
        params['min_child_weight'] = min_child_weight
        cv_result = lgb.cv(params, train_data, nfold=n_folds, seed=random_seed, stratified=True, verbose_eval =200, metrics=['auc'])
        return max(cv_result['auc-mean'])
    # range 
    lgbBO = BayesianOptimization(lgb_eval, {'num_leaves': (24, 45),
                                            'feature_fraction': (0.1, 0.9),
                                            'bagging_fraction': (0.8, 1),
                                            'max_depth': (5, 8.99),
                                            'lambda_l1': (0, 5),
                                            'lambda_l2': (0, 3),
                                            'min_split_gain': (0.001, 0.1),
                                            'min_child_weight': (5, 50)}, random_state=0)
    # optimize
    lgbBO.maximize(init_points=init_round, n_iter=opt_round)
    
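    # output optimization process
    # (points_to_csv is not available in bayes_opt versions that expose .max, so lgbBO.res is written with pandas instead)
    if output_process: pd.json_normalize(lgbBO.res).to_csv("bayes_opt_result.csv", index=False)
    
    # return the optimizer object; the best parameters are in lgbBO.max['params']
    return lgbBO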

opt_params = bayes_parameter_opt_lgb(X, y, init_round=5, opt_round=10, n_folds=3, random_seed=6, n_estimators=100, learning_rate=0.05)

The best cross-validated AUC is about 0.8, not bad =).
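
Note that the optimisation above scores each trial by cross-validated AUC (`'metric':'auc'`), not by plain accuracy. As a rough illustration of what that number means, here is a small sketch that computes AUC as the share of (win, loss) pairs the model ranks in the right order. `pairwise_auc` is a hypothetical helper and the scores are invented.

```python
import numpy as np

def pairwise_auc(y_true, scores):
    """AUC as the share of (win, loss) pairs ranked correctly; ties count half."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

# Hypothetical predicted probabilities for six games (1 = blue win).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
print(pairwise_auc(y_true, scores))
```

Here 8 of the 9 win/loss pairs are ordered correctly. An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect one.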

Let's train LightGBM with those parameters and check which features matter most for the model output.

In [None]:
params = opt_params.max['params']
params["num_leaves"] = int(round(params['num_leaves']))
params['feature_fraction'] = max(min(params['feature_fraction'], 1), 0)
params['bagging_fraction'] = max(min(params['bagging_fraction'], 1), 0)
params['max_depth'] = int(round(params['max_depth']))
params['lambda_l1'] = max(params['lambda_l1'], 0)
params['lambda_l2'] = max(params['lambda_l2'], 0)
params['min_split_gain'] = max(params['min_split_gain'], 0)
params['min_child_weight'] = max(params['min_child_weight'], 0)
params['num_iterations'] = 10000
params['learning_rate'] = 0.05
params['metric'] = 'auc'
params['objective'] = 'binary'  # 'application' is an alias of 'objective', so it is set only once
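
The rounding and clipping above repeats what `lgb_eval` does inside the optimisation loop. One possible way to keep the two in sync is a small shared helper; `clean_lgb_params` is a hypothetical name, and the raw values below are invented to resemble `opt_params.max['params']`.

```python
def clean_lgb_params(raw):
    """Round and clip raw optimizer suggestions, mirroring the cell above."""
    params = dict(raw)  # work on a copy so the optimizer's result is untouched
    # Tree-shape parameters are searched as floats but must be integers.
    params["num_leaves"] = int(round(params["num_leaves"]))
    params["max_depth"] = int(round(params["max_depth"]))
    # Sampling fractions are clipped to [0, 1].
    for key in ("feature_fraction", "bagging_fraction"):
        params[key] = max(min(params[key], 1), 0)
    # Regularisation and split thresholds are kept non-negative.
    for key in ("lambda_l1", "lambda_l2", "min_split_gain", "min_child_weight"):
        params[key] = max(params[key], 0)
    return params

# Hypothetical raw suggestion from the optimizer.
raw = {
    "num_leaves": 31.6, "max_depth": 6.4,
    "feature_fraction": 0.45, "bagging_fraction": 1.02,
    "lambda_l1": 2.3, "lambda_l2": -0.1,
    "min_split_gain": 0.05, "min_child_weight": 12.7,
}
cleaned = clean_lgb_params(raw)
print(cleaned)
```

The fixed settings (`num_iterations`, `learning_rate`, `metric`, `objective`) would then be added on top of the cleaned dictionary.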

In [None]:
train_data = lgb.Dataset(data=X, label=y, free_raw_data=True, feature_name = [col for col in df.columns if col!= 'gameId' and col!= 'blueWins'])
model = lgb.train(params, train_data)

In [None]:
ax = lgb.plot_importance(model, max_num_features=10, figsize = (15,15))
plt.show()

Hmm, this differs from what the correlations showed: for LightGBM, the most important features are total gold, total experience and minions killed.

# 7. Prediction and Reality <a id="7"></a>

Let's compare how the model separates wins and losses with how they are separated in reality.
I will do that by reducing the data to three dimensions with t-SNE, plotting it as a 3D scatter with wins in blue and losses in red, and checking whether the predicted and real colourings look alike.

In [None]:
pred = np.round(model.predict(X)).astype(int)
data_test = TSNE(n_components=3).fit_transform(X)

In [None]:
predicted = pd.DataFrame(data_test)
predicted['output'] = pred
predicted['output'] = predicted['output'].astype(object)
predicted['output'] = predicted['output'].map({0: "Red Wins",1: "Blue Wins"})
predicted['flag'] = 'predicted'
predicted.columns = ['dim_1','dim_2','dim_3','output', 'flag']

real = pd.DataFrame(data_test)
real['output'] = y.values
real['output'] = real['output'].astype(object)
real['output'] = real['output'].map({0: "Red Wins",1: "Blue Wins"})
real['flag'] = 'real'
real.columns = ['dim_1','dim_2','dim_3','output', 'flag']

data_3d = pd.concat([predicted,real], ignore_index = True).sample(4000)

In [None]:
color_discrete_map = {"Blue Wins": 'rgb(122, 148, 231)', "Red Wins": 'rgb(255, 105, 97)'}

fig1 = px.scatter_3d(predicted, x='dim_1', y='dim_2', z='dim_3', color_discrete_map=color_discrete_map,color='output', title = 'How Model Distinct Wins and Loses')
fig2 = px.scatter_3d(real, x='dim_1', y='dim_2', z='dim_3',color_discrete_map=color_discrete_map,color='output', title = 'Real Distinction Between Wins and Loses')

fig1.show()
fig2.show()

Not that bad. In reality the points are slightly intertwined, but we can see more blue on one side and more red on the other.

The model separates these points well, with a fairly marked split, which is very good. An AUC of about 0.8 is not bad =)
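
The visual comparison can be complemented with a small table of predicted versus real outcomes. The sketch below uses invented labels and probabilities. Keep in mind that in this notebook `model.predict(X)` is run on the same rows the model was trained on, so the equivalent table on `pred` and `y` would likely look more optimistic than on unseen games.

```python
import numpy as np
import pandas as pd

# Hypothetical labels and predicted probabilities for eight games (1 = blue win).
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
proba = np.array([0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1])
y_pred = np.round(proba).astype(int)  # same thresholding as `pred` above

labels = {0: "Red Wins", 1: "Blue Wins"}
# Rows are the real outcome, columns the predicted one.
table = pd.crosstab(
    pd.Series(y_true, name="real").map(labels),
    pd.Series(y_pred, name="predicted").map(labels),
)
accuracy = (y_true == y_pred).mean()
print(table)
print(f"accuracy: {accuracy:.2f}")
```

The off-diagonal cells count the games where the two 3D plots would show a point in different colours.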

# 8. Conclusion <a id="8"></a>

The playing conditions are fairly even between the two sides. The blue side benefits from more gold and kills, while the red side benefits from more neutral objectives, experience and CS.

We were able to model the outcome with an AUC of about 0.8 using only data from the first 10 minutes of play. With more data, it should be possible to predict which side will win in real time.

I played a lot of League of Legends three years ago. I don't have that much time now, but I still watch international competitions and the esports scene. I like this game.

I really enjoyed making this notebook; for once I am not doing EDA for something serious or for work =).

Do not hesitate to drop an upvote. Goodbye.

<img src="https://i.postimg.cc/qv9Q8kBn/Banner-T2-Image-tnp3w61gzna8r2n3rojp.jpg" alt="Bye">