# League of Legends: Exploring the Relationship Between Kills and Win Rate, Part 2: Linear Regression

## Introduction to League of Legends

In the previous article, we prepared a dataset of historical player statistics covering the League of Legends World Championships from 2011 to 2022.
Our primary objective is to answer the following question:

> What is the connection between a player's average number of kills per game and their corresponding win rate?

The correlation heatmap from the previous article showed that a few variables have moderate to strong correlations with win rate, one of which is **`kills`**.
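To make the question concrete, here is a minimal sketch of how a correlation and a straight-line fit between `kills` and `win_rate` could be computed. The five rows below are made-up toy values, not rows from the Worlds dataset, and the names `toy`, `r`, `slope`, and `intercept` are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (NOT taken from the real dataset):
# average kills per game and win rate (%) for five imaginary players.
toy = pd.DataFrame({
    "kills": [1.0, 2.0, 3.0, 4.0, 5.0],
    "win_rate": [42.0, 44.0, 51.0, 54.0, 59.0],
})

# Pearson correlation coefficient between the two columns.
r = toy["kills"].corr(toy["win_rate"])

# Least-squares line: win_rate ~ slope * kills + intercept.
slope, intercept = np.polyfit(toy["kills"], toy["win_rate"], deg=1)

print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

The slope reads as the change in win rate (in percentage points) associated with one extra kill per game in the toy data. It describes an association, not a causal effect.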

## Linear Regression

In [1]:
# Package importing

import numpy as np
import pandas as pd
import seaborn as sns
import missingno as msno
import plotly.express as px


In [2]:
# Loading the Kaggle dataset
# (raw string so the backslashes in the Windows path are not treated as escape sequences)
df = pd.read_csv(r"E:\ModelDiversity\data\players_stats.csv")

In [3]:
# Subset the rows with missing values (seasons 1 to 3)
missing_values1 = df[df['season']<4]

# calculating total team gold (summed over each team's players) for each season
team_gold_1_3 = missing_values1.groupby(['season','team'])['gold'].sum().rename("team_gold")

# calculating total team kills (summed over each team's players) for each season
team_kills_1_3 = missing_values1.groupby(['season','team'])['kills'].sum().rename("team_kills")

# merging team_gold and team_kills back into the dataframe
# (a left join keeps the original row order; every key is present on both sides)
missing_values2 = missing_values1.merge(team_gold_1_3,how='left',on=['season','team']).merge(team_kills_1_3,how='left',on=['season','team'])

# filling in the gold_share using the calculation above
missing_values2['gold_share'] = round(missing_values2['gold']/missing_values2['team_gold'] * 100,2)

# filling in the kill_share using the calculation above
missing_values2['kill_share'] = round(missing_values2['kills']/missing_values2['team_kills'] * 100,2)

# filling in the kill_participation using the calculation above
missing_values2['kill_participation'] = round((missing_values2['kills']+missing_values2["assists"])/missing_values2['team_kills'] *100,2)
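As an alternative to the merge-based cell above, the same three ratios can be sketched with `groupby(...).transform("sum")`, which returns team totals already aligned to the original rows and so leaves the index untouched. This is only a sketch under stated assumptions: the `toy` frame and its values are made up for illustration, and the custom index is there to show that it survives.

```python
import pandas as pd

# Hypothetical toy rows: two teams in one season (values are made up).
toy = pd.DataFrame(
    {
        "season": [1, 1, 1, 1],
        "team": ["A", "A", "B", "B"],
        "kills": [3.0, 1.0, 2.0, 2.0],
        "assists": [1.0, 3.0, 2.0, 1.0],
        "gold": [12.0, 8.0, 10.0, 10.0],
    },
    index=[10, 11, 12, 13],  # non-default index on purpose
)

# transform("sum") broadcasts each team's total back onto its rows.
grouped = toy.groupby(["season", "team"])
team_gold = grouped["gold"].transform("sum")
team_kills = grouped["kills"].transform("sum")

# Same formulas as in the cell above, expressed as percentages.
toy["gold_share"] = (toy["gold"] / team_gold * 100).round(2)
toy["kill_share"] = (toy["kills"] / team_kills * 100).round(2)
toy["kill_participation"] = (
    (toy["kills"] + toy["assists"]) / team_kills * 100
).round(2)

print(toy)
```

Because the index is preserved, a frame built this way lines up row for row with the frame it was derived from.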


In [4]:
# Write the filled-in values back into df (update works in place and aligns on the index)
df.update(missing_values2)
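Since `DataFrame.update` does a lot of work in one line, a small sketch of its behaviour may help. The frames `full` and `patch` below are hypothetical toy data. The point is that `update` matches rows by index label, overwrites only where the other frame has a non-missing value, and ignores columns that the target frame does not have.

```python
import numpy as np
import pandas as pd

# Hypothetical target frame with gaps in kill_share.
full = pd.DataFrame({
    "kills": [1.0, 2.0, 3.0],
    "kill_share": [np.nan, np.nan, 30.0],
})

# Hypothetical patch: one real value, one NaN, plus an extra helper column.
patch = pd.DataFrame(
    {"kill_share": [10.0, np.nan], "team_kills": [10.0, 10.0]},
    index=[0, 1],
)

# update() modifies `full` in place and returns None.
result = full.update(patch)

print(full)
```

Row 0 receives the patched value, row 1 stays missing because the patch holds `NaN` there, and row 2 is untouched because the patch has no row with that index label.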

In [5]:
# Drop the damage columns so that the remaining dataset has no missing values
df2 = df.drop(['damage','damage/min'],axis=1)
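A quick way to confirm that a drop like this leaves no gaps is to count the remaining missing values per column. The sketch below uses a hypothetical two-row frame rather than the real data; `trimmed` and `remaining` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: only the damage columns contain NaN.
toy = pd.DataFrame({
    "kills": [4.0, 1.5],
    "win_rate": [58.3, 50.0],
    "damage": [np.nan, 12.1],
    "damage/min": [np.nan, 410.0],
})

# Drop the incomplete columns, then count missing values per column.
trimmed = toy.drop(columns=["damage", "damage/min"])
remaining = trimmed.isna().sum()

print(remaining)
```

On the real data, the equivalent check would be `df2.isna().sum()`, where every count should be zero if the fill-in step covered all the gaps.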

In [6]:
df2

| Unnamed: 0 | season | event | team | player | games_played | wins | loses | win_rate | kills | deaths | assists | kill_death_assist_ratio | creep_score | cs/min | gold | gold/min | kill_participation | kill_share | gold_share |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Main | Against_All_authority | Kujaa | 12 | 7 | 5 | 58.3 | 0.25 | 2.58 | 8.33 | 3.32 | 13.58 | 0.34 | 7.8 | 198 | 61.64 | 1.80 | 14.18 |
| 1 | 1 | Main | Against_All_authority | Linak | 12 | 7 | 5 | 58.3 | 1.75 | 3.58 | 7.67 | 2.63 | 113.17 | 2.86 | 10.3 | 259 | 67.67 | 12.57 | 18.73 |
| 2 | 1 | Main | Against_All_authority | MoMa | 12 | 7 | 5 | 58.3 | 4.17 | 2.75 | 5.58 | 3.55 | 242.25 | 6.13 | 12.1 | 307 | 70.04 | 29.96 | 22.00 |
| 3 | 1 | Main | Against_All_authority | sOAZ | 12 | 7 | 5 | 58.3 | 4.00 | 2.92 | 7.08 | 3.80 | 214.67 | 5.43 | 11.6 | 293 | 79.60 | 28.74 | 21.09 |
| 4 | 1 | Main | Against_All_authority | YellOwStaR | 12 | 7 | 5 | 58.3 | 3.75 | 3.25 | 5.17 | 2.74 | 276.33 | 6.99 | 13.2 | 333 | 64.08 | 26.94 | 24.00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1278 | 12 | Main | Top_Esports | JackeyLove | 6 | 3 | 3 | 50.0 | 5.33 | 2.33 | 5.17 | 4.50 | 295.33 | 9.37 | 15.3 | 486 | 76.80 | 39.00 | 26.10 |
| 1279 | 12 | Main | Top_Esports | knight | 6 | 3 | 3 | 50.0 | 4.33 | 1.17 | 5.50 | 8.43 | 283.50 | 8.99 | 13.9 | 442 | 72.00 | 31.70 | 23.70 |
| 1280 | 12 | Main | Top_Esports | Mark | 6 | 3 | 3 | 50.0 | 1.17 | 2.00 | 9.33 | 5.25 | 24.00 | 0.76 | 7.7 | 245 | 76.80 | 8.50 | 13.10 |
| 1281 | 12 | Main | Top_Esports | Tian | 6 | 3 | 3 | 50.0 | 1.83 | 2.33 | 7.33 | 3.93 | 174.00 | 5.52 | 10.5 | 332 | 67.10 | 13.40 | 17.80 |
