![https://images.contentstack.io/v3/assets/blt731acb42bb3d1659/bltcfa4652c8d383f56/5e21837f63d1b6503160d39b/Home-page.jpg](https://images.contentstack.io/v3/assets/blt731acb42bb3d1659/bltcfa4652c8d383f56/5e21837f63d1b6503160d39b/Home-page.jpg)

League of Legends is a team-based strategy game in which two teams of five champions face off to destroy the other's base. Players choose from over 140 champions to make plays, secure kills, and take down towers on the way to victory. **The objective of this study is to try different models to predict, from the game data of the first 10 minutes, whether the blue team wins.** The dataset contains the first-10-minute stats of approximately 10k ranked games (SOLO QUEUE) at high ELO (DIAMOND I to MASTER), so players are at roughly the same level.

### Glossary

* Warding totem: An item that a player can put on the map to reveal the nearby area. Very useful for map/objectives control.
* Minions: NPCs that belong to both teams. They give gold when killed by players.
* Jungle minions: NPCs that belong to NO TEAM. They give gold and buffs when killed by players.
* Elite monsters: Monsters with high hp/damage that give a massive bonus (gold/XP/stats) when killed by a team.
* Dragons: Elite monsters that give a team-wide bonus when killed. The 4th dragon killed by a team gives a massive stats bonus. The 5th dragon (Elder Dragon) offers a huge advantage to the team.
* Herald: Elite monster that gives a stats bonus when killed by a player. It helps push a lane and destroy structures.
* Towers: Structures you have to destroy to reach the enemy Nexus. They give gold.
* Level: Champion level. Starts at 1. Max is 18.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn import svm
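from sklearn.metrics import accuracy_score, confusion_matrix, classification_report, ConfusionMatrixDisplay
try:
    from sklearn.metrics import plot_roc_curve  # available in scikit-learn < 1.2
except ImportError:
    # plot_roc_curve was removed in scikit-learn 1.2;
    # RocCurveDisplay.from_estimator takes the same (estimator, X, y) arguments
    from sklearn.metrics import RocCurveDisplay
    plot_roc_curve = RocCurveDisplay.from_estimator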
from sklearn.neighbors import KNeighborsClassifier

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
# First look at data
url = "/kaggle/input/league-of-legends-diamond-ranked-games-10-min/high_diamond_ranked_10min.csv"
data = pd.read_csv(url)
data.head()

In [None]:
# No missing data
data.info()

In [None]:
data.columns

In [None]:
data.columns.size

Let's look at the blue and red columns separately. I will skip the "gameId" and "blueWins" columns; "blueWins" will be the target later.

In [None]:
blue = data.iloc[:,2:-19]
blue.columns

In [None]:
red = data.iloc[:,21:]
red.columns

In [None]:
def cols_check(a,b):
    print(a.columns.size,b.columns.size)
    if a.columns.size == b.columns.size:  # compare the arguments, not the global frames
        print("Number of columns equal")
    else:
        print("Not equal")

In [None]:
cols_check(blue,red)

In [None]:
# The blue and red column names must match before the two frames can be subtracted.
# Build a list of positional labels (0..18) to use as temporary column names for both.
temp_list = list(range(19))

temp_list

In [None]:
difference = blue.set_axis(temp_list, axis = 1) - red.set_axis(temp_list, axis = 1)
difference.head(15)

In [None]:
# Strip the "blue" prefix (4 characters) to get team-neutral column names.
diff_cols = [blue.columns.to_list()[i][4:] for i in temp_list]


In [None]:
diff_cols

In [None]:
# Assign the names directly; set_axis(..., inplace=True) is no longer accepted by pandas 2.x
difference.columns = diff_cols
difference.head()
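
The relabel, subtract, and rename steps above can also be sketched in one pass by stripping the team prefix from each column name and letting pandas align the columns. This is only an illustrative alternative on made-up numbers; the `toy` frame and its values are hypothetical and not taken from the dataset.

```python
import pandas as pd

# Hypothetical two-game frame that follows the dataset's blue*/red* naming scheme.
toy = pd.DataFrame({
    "blueKills": [5, 3], "blueTotalGold": [17000, 15500],
    "redKills": [2, 6], "redTotalGold": [16000, 16500],
})

# Select each team's columns and drop the "blue"/"red" prefix.
blue_part = toy.filter(regex="^blue").rename(columns=lambda c: c[len("blue"):])
red_part = toy.filter(regex="^red").rename(columns=lambda c: c[len("red"):])

# Subtraction aligns on the now-identical column names.
toy_difference = blue_part - red_part
print(toy_difference)
```

On the real data, "blueWins" also starts with "blue", so it would have to be excluded before applying this approach.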

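We have to drop the "GoldDiff" and "ExperienceDiff" columns. Each red value is the negative of the corresponding blue value (see the output below), so subtracting red from blue just doubles the blue value instead of producing a meaningful new feature.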
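
A toy check of this redundancy, using made-up numbers rather than the real dataset. It assumes that `blueGoldDiff` is defined as `blueTotalGold - redTotalGold` and that `redGoldDiff` is its negative; both definitions are assumptions made for the illustration.

```python
import pandas as pd

# Made-up values for two games; column names follow the dataset's naming scheme.
toy = pd.DataFrame({
    "blueTotalGold": [17000, 15500],
    "redTotalGold": [16000, 16500],
})
# Assumed definitions of the two "Diff" columns.
toy["blueGoldDiff"] = toy["blueTotalGold"] - toy["redTotalGold"]
toy["redGoldDiff"] = -toy["blueGoldDiff"]

# Blue minus red, as done for every other stat.
gold_diff_of_diffs = toy["blueGoldDiff"] - toy["redGoldDiff"]
total_gold_diff = toy["blueTotalGold"] - toy["redTotalGold"]

# Under these assumptions the "difference of the diffs" is twice the total gold difference.
print((gold_diff_of_diffs == 2 * total_gold_diff).all())
```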

In [None]:
print(data[["blueGoldDiff","blueExperienceDiff"]],"\n\n", data[["redGoldDiff","redExperienceDiff"]])

In [None]:
difference.drop(columns = ["GoldDiff","ExperienceDiff"], inplace = True)

In [None]:
# Kills and Deaths mirror each other (one team's deaths are the other team's kills),
# so only one of them is needed. I will drop the Deaths column.
difference.drop(columns = "Deaths", inplace = True)

In [None]:
difference.head()

In [None]:
# We can also drop first blood column
difference.drop(columns = "FirstBlood", inplace = True)

In [None]:
fig, ax = plt.subplots(figsize=(12,10)) 
sns.heatmap(difference.corr(), annot = True, linewidths=.3, cmap ="binary")

In [None]:
# At 10 minutes, GoldPerMin is TotalGold / 10 and CSPerMin is TotalMinionsKilled / 10, so both are redundant
difference.drop(columns = ["GoldPerMin","CSPerMin"], inplace = True)
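
Redundant pairs like the ones read off the heatmap can also be listed programmatically. The helper below is a hypothetical sketch (`correlated_pairs` and its `threshold` are not part of the original notebook), run here on synthetic data in which `GoldPerMin` is `TotalGold / 10` by construction.

```python
import numpy as np
import pandas as pd

def correlated_pairs(df, threshold=0.95):
    """Return (col_a, col_b, corr) for column pairs with |corr| >= threshold."""
    corr = df.corr()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only, skip the diagonal
            value = corr.iloc[i, j]
            if abs(value) >= threshold:
                pairs.append((cols[i], cols[j], round(float(value), 3)))
    return pairs

# Synthetic stand-in for `difference`.
rng = np.random.default_rng(0)
demo = pd.DataFrame({"TotalGold": rng.normal(0, 1500, 200)})
demo["GoldPerMin"] = demo["TotalGold"] / 10
demo["WardsPlaced"] = rng.normal(0, 5, 200)

pairs = correlated_pairs(demo)
print(pairs)
```

Calling `correlated_pairs(difference)` before the drop would be one way to confirm which columns carry duplicate information; the 0.95 threshold is an arbitrary choice.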

In [None]:
#Train and Test Split

X = difference
y = data[["blueWins"]]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.4, random_state = 7)

In [None]:
print("X_train shape is : ", X_train.shape)
print("X_test shape  is : ", X_test.shape)
print("y_train shape is : ", y_train.shape)
print("y_test shape is : ", y_test.shape)
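
`StandardScaler` is imported at the top of the notebook but is not used by the models that follow. Distance- and margin-based models such as SVM, SGD, and KNN can be sensitive to feature scale, so a scaling step may be worth trying. The sketch below shows one way to wire it in with a pipeline; it runs on synthetic data, and all names in it (`X_demo`, `scaled_svc`, and so on) are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the (X, y) built above.
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=7)
# Blow up one feature's scale to mimic gold (thousands) sitting next to wards (tens).
X_demo[:, 0] *= 1000

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.4, random_state=7
)

# Inside the pipeline the scaler is fitted on the training data only,
# then the same transformation is applied to the test data.
scaled_svc = make_pipeline(StandardScaler(), SVC())
scaled_svc.fit(X_tr, y_tr)
demo_accuracy = scaled_svc.score(X_te, y_te)
print("Accuracy with scaling:", demo_accuracy)
```

Whether scaling actually helps on this dataset would need to be checked by comparing accuracies with and without it.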

### SVM Classification

In [None]:
clf = svm.SVC()
clf.fit(X_train,y_train.values.ravel())

y_pred = clf.predict(X_test)

print("Accuracy:",accuracy_score(y_test, y_pred))

svc_disp = plot_roc_curve(clf, X_test, y_test)

### Stochastic Gradient Descent

In [None]:
clfs = SGDClassifier(loss="hinge", penalty="l2", max_iter=165)
clfs.fit(X_train, y_train.values.ravel())

y_pred = clfs.predict(X_test)  # predict with the SGD model, not the SVM fitted earlier

print("Accuracy:",accuracy_score(y_test, y_pred))

sgd_disp = plot_roc_curve(clfs, X_test, y_test)

### KNeighborsClassifier

In [None]:
nc = KNeighborsClassifier(n_neighbors=17)
nc.fit(X_train, y_train.values.ravel())

y_pred = nc.predict(X_test)

print("Accuracy:",accuracy_score(y_test, y_pred))

knn_disp = plot_roc_curve(nc, X_test, y_test)

### LogisticRegression

In [None]:
clf = LogisticRegression(random_state=5, max_iter = 1000).fit(X_train, y_train.values.ravel())

y_pred = clf.predict(X_test)

print("Accuracy:",accuracy_score(y_test, y_pred))

lr_disp = plot_roc_curve(clf, X_test, y_test)

### Confusion Matrix

In [None]:
matrix = confusion_matrix(y_test, y_pred)
cm_display = ConfusionMatrixDisplay(matrix).plot()

### Classification Report

In [None]:
report = classification_report(y_test, y_pred)
print(report)

### Conclusion

The models I have created reach an accuracy of about 70-73%. However, the data can be examined in more detail to increase accuracy. As a former League of Legends player, I know that many different parameters determine the course of a game, and better results might be obtained by studying the weights of these parameters.
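
One possible starting point for studying feature weights is to fit a logistic regression on standardized features and rank the coefficients by magnitude. The sketch below uses synthetic data in which the target depends on gold only; the data, the names, and the choice of logistic regression as the weighting tool are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
n = 1000
demo_X = pd.DataFrame({
    "TotalGold": rng.normal(0, 1500, n),
    "WardsPlaced": rng.normal(0, 5, n),
})
# Synthetic target driven by gold only, so gold should receive the larger weight.
demo_y = (demo_X["TotalGold"] + rng.normal(0, 1000, n) > 0).astype(int)

# Standardizing first makes the coefficient magnitudes comparable across features.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(demo_X, demo_y)

weights = pd.Series(
    model.named_steps["logisticregression"].coef_[0],
    index=demo_X.columns,
).sort_values(key=abs, ascending=False)
print(weights)
```

Applied to `X_train` and `y_train`, the same ranking would give a first impression of which early-game differences the linear model relies on most.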