# League of Legends: Exploring the Relationship Between Kills and Win Rate, Part 2: Linear Regression

## Introduction to League of Legends

In our previous article, we prepared a dataset of historical player statistics spanning the 2011 to 2022 League of Legends World Championships. In this part, we build on that dataset and move toward a linear regression.

## Goal of the Project

Our project's goal is to discover meaningful relationships that can help us predict match outcomes in League of Legends. Specifically, we aim to answer the question:

> What is the connection between a player's average number of kills per game and their corresponding win rate?
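
Since this part focuses on linear regression, one way to make the question concrete is a simple one-predictor model. The symbols below are our own notation rather than column names from the dataset: $w_i$ is player $i$'s win rate and $k_i$ is their average number of kills per game.

$$
w_i = \beta_0 + \beta_1 k_i + \varepsilon_i
$$

Under this sketch, $\beta_1$ is the expected change in win rate for one additional kill per game, and $\varepsilon_i$ collects everything the model does not explain.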

In [None]:
# Import the packages used in this notebook

import numpy as np
import pandas as pd
import seaborn as sns
import missingno as msno
import plotly.express as px


In [None]:
# Load the Kaggle dataset (raw string so the backslashes in the Windows path are not read as escape sequences)
df = pd.read_csv(r"E:\ModelDiversity\data\players_stats.csv")
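
After loading, a quick look at where values are missing can guide the cleaning steps. The sketch below uses a small made-up frame (`toy`, with invented numbers) instead of the real file, and assumes column names like `season` and `gold_share` as they appear elsewhere in this notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data; the values are invented for illustration only.
toy = pd.DataFrame({
    "season": [1, 1, 4, 4],
    "kills": [10, 5, 8, 7],
    "gold_share": [np.nan, np.nan, 24.5, 21.0],
})

# Fraction of missing values per column, broken down by season.
missing_by_season = (
    toy.drop(columns="season")
    .isna()
    .groupby(toy["season"])
    .mean()
)
print(missing_by_season)
```

On the real data, the same expression with `df` in place of `toy` would show which seasons lack which columns.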

In [None]:
# Subset the rows from seasons 1-3 (season < 4), whose share columns are filled in below
missing_values1 = df[df['season']<4]

# total gold per team for each season (a sum, not an average)
team_gold_1_3 = missing_values1.groupby(['season','team'])['gold'].sum().rename("team_gold")

# total kills per team for each season (a sum, not an average)
team_kills_1_3 = missing_values1.groupby(['season','team'])['kills'].sum().rename("team_kills")

# merging team_gold and team_kills back into dataframe
missing_values2 = (
    missing_values1
    .merge(team_gold_1_3, how='outer', on=['season', 'team'])
    .merge(team_kills_1_3, how='outer', on=['season', 'team'])
)

# gold_share: the player's gold as a percentage of the team's total gold
missing_values2['gold_share'] = round(missing_values2['gold']/missing_values2['team_gold'] * 100,2)

# kill_share: the player's kills as a percentage of the team's total kills
missing_values2['kill_share'] = round(missing_values2['kills']/missing_values2['team_kills'] * 100,2)

# kill_participation: the player's kills plus assists as a percentage of the team's total kills
missing_values2['kill_participation'] = round((missing_values2['kills']+missing_values2["assists"])/missing_values2['team_kills'] *100,2)
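
The three formulas above can also be checked on a tiny example. The sketch below is an alternative formulation, not the code used in this notebook: it uses `groupby(...).transform("sum")` to broadcast the team totals back onto each player row, which keeps the original row labels. The frame `toy` and its numbers are invented for illustration.

```python
import pandas as pd

# Toy data for one team in one season; the numbers are invented for illustration.
toy = pd.DataFrame(
    {
        "season": [1, 1, 1],
        "team": ["A", "A", "A"],
        "kills": [10, 6, 4],
        "assists": [5, 10, 12],
        "gold": [12000, 10000, 8000],
    },
    index=[7, 3, 9],  # deliberately non-sequential row labels
)

# Team totals repeated on every player row, aligned to the original index.
grouped = toy.groupby(["season", "team"])
team_gold = grouped["gold"].transform("sum")
team_kills = grouped["kills"].transform("sum")

# Same formulas as above, expressed as percentages rounded to 2 decimals.
toy["gold_share"] = (toy["gold"] / team_gold * 100).round(2)
toy["kill_share"] = (toy["kills"] / team_kills * 100).round(2)
toy["kill_participation"] = (
    (toy["kills"] + toy["assists"]) / team_kills * 100
).round(2)

print(toy)
```

With 20 team kills and 30,000 team gold, the first player (10 kills, 5 assists, 12,000 gold) ends up with a 50% kill share, 75% kill participation and a 40% gold share.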


In [None]:
# Write the filled-in values back into df (update matches rows by index label)
df.update(missing_values2)
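
`DataFrame.update` matches rows by index label, not by position, so it is worth confirming that the frame passed in still carries the row labels of the frame being updated. A merge typically returns a fresh 0..n-1 index, which can break that correspondence. The sketch below illustrates the behaviour on invented data; `full`, `aligned` and `misaligned` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Toy frame with non-sequential row labels and a column that needs filling.
full = pd.DataFrame(
    {"player": ["a", "b", "c"], "kill_share": [np.nan, np.nan, 20.0]},
    index=[10, 11, 12],
)

# Filled-in values whose labels match the rows they belong to.
aligned = pd.DataFrame({"kill_share": [50.0, 30.0]}, index=[10, 11])

# The same values after the index was reset to 0..n-1.
misaligned = aligned.reset_index(drop=True)

ok = full.copy()
ok.update(aligned)  # labels 10 and 11 exist in `ok`, so both rows are filled

unchanged = full.copy()
unchanged.update(misaligned)  # labels 0 and 1 do not exist, so nothing changes

print(ok)
print(unchanged)
```

If the labels do overlap but refer to different rows, the values are written to whichever rows share those labels. A spot check of a few players after the update is a cheap safeguard.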

In [None]:
# Drop the damage columns so that the remaining dataset has no missing values
df2 = df.drop(['damage','damage/min'],axis=1)
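
To confirm that dropping these columns really leaves no gaps, we can count the remaining missing values. This is a minimal sketch on an invented frame (`toy`); on the real data the same check would be applied to `df2`.

```python
import numpy as np
import pandas as pd

# Toy frame: the damage columns contain gaps, the other columns do not.
toy = pd.DataFrame({
    "kills": [10, 6],
    "kill_share": [50.0, 30.0],
    "damage": [np.nan, np.nan],
    "damage/min": [np.nan, 410.0],
})

toy2 = toy.drop(columns=["damage", "damage/min"])

# Count missing values per column and list only the columns that still have any.
remaining = toy2.isna().sum()
print(remaining[remaining > 0])

# True when no cell in the frame is missing.
no_missing = not toy2.isna().any().any()
print(no_missing)
```

If any column still shows up in `remaining`, it needs to be filled or dropped before fitting a regression.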