**In League of Legends, the winner of a game is often determined by two main factors: a gold advantage and control of objectives. Both are closely tied to experience and to the gold difference, so many of the recorded parameters depend on one another. Using this dataset, let's try to predict the winner from the first 10 minutes of the game. First, though, the data needs preprocessing, since it most likely contains many mutually dependent parameters.**

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.linear_model import LinearRegression, LogisticRegression
import seaborn as sns

In [None]:
df = pd.read_csv('/kaggle/input/league-of-legends-diamond-ranked-games-10-min/high_diamond_ranked_10min.csv')
time = 10 #Mins
df.info()

In [None]:
df.head()

In [None]:
plt.figure()
# Semi-transparent bars keep both overlapping histograms visible
df['blueTotalGold'].hist(alpha = 0.5)
df['redTotalGold'].hist(alpha = 0.5)
plt.legend(['blue', 'red'])

In [None]:
plt.figure()
# Semi-transparent bars keep both overlapping histograms visible
df['blueGoldDiff'].hist(alpha = 0.5)
df['redGoldDiff'].hist(alpha = 0.5)
plt.legend(['blue', 'red'])

# **As we can see, some variables in this dataset are very similar to each other in distribution and values. Let's preprocess the data to reduce the number of such features, then compare the result with the original.**
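One way to make this kind of redundancy concrete is to look for column pairs whose absolute Pearson correlation is essentially 1. The sketch below runs on a small made-up frame that only borrows the dataset's column names; the helper `find_redundant_pairs` and the 0.999 threshold are illustrative assumptions, not part of the original analysis.

```python
import numpy as np
import pandas as pd

def find_redundant_pairs(frame, threshold=0.999):
    """Return (col_a, col_b) pairs whose absolute correlation is >= threshold."""
    corr = frame.corr().abs()
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if corr.loc[a, b] >= threshold:
                pairs.append((a, b))
    return pairs

# Toy data with hypothetical values; only the column names mirror the dataset
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    'blueTotalGold': rng.integers(14000, 20000, size=200),
    'redTotalGold': rng.integers(14000, 20000, size=200),
})
toy['blueGoldDiff'] = toy['blueTotalGold'] - toy['redTotalGold']
toy['redGoldDiff'] = -toy['blueGoldDiff']
toy['blueGoldPerMin'] = toy['blueTotalGold'] / 10

redundant = find_redundant_pairs(toy)
print(redundant)
```

Pairwise correlation only catches columns that are scaled or mirrored copies of each other. It will not flag `blueGoldDiff` against the two totals it is derived from, so relations of that kind still need an explicit check.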

In [None]:
# Keep an independent copy so later in-place drops cannot affect the "before" snapshot
df_before_preproc = df.copy()
y = df['blueWins']
df = df.drop('blueWins', axis = 1)

In [None]:
df.drop('gameId', axis = 1, inplace = True)
# A diff column is redundant if it equals blue total minus red total in every row
GoldDiff = df['blueTotalGold'] - df['redTotalGold']
# Compare against the total-experience columns, mirroring the gold check
ExpDiff = df['blueTotalExperience'] - df['redTotalExperience']
if (df['blueGoldDiff'] == GoldDiff).all():
    df.drop(['blueGoldDiff', 'redGoldDiff'], axis = 1, inplace = True)

In [None]:
if (df['blueExperienceDiff'] == ExpDiff).all():
    df.drop(['blueExperienceDiff', 'redExperienceDiff'], axis = 1, inplace = True)

In [None]:
# GoldPerMin is redundant if it is just TotalGold / time; compare row by row with a float tolerance
GoldPerMin = np.allclose(time * df['blueGoldPerMin'], df['blueTotalGold'])
if GoldPerMin:
    df.drop(['blueGoldPerMin', 'redGoldPerMin'], axis = 1, inplace = True)

In [None]:
# Same row-wise, tolerance-based check for CS per minute against total minions killed
CSPerMin = np.allclose(df['blueCSPerMin'] * time, df['blueTotalMinionsKilled'])
if CSPerMin:
    df.drop(['blueCSPerMin', 'redCSPerMin'], axis = 1, inplace = True)

# Deaths should mirror the other team's kills; total experience is dropped as well
df.drop(['blueDeaths', 'redDeaths', 'blueTotalExperience', 'redTotalExperience'], axis = 1, inplace = True)

In [None]:
plt.figure(figsize = (10,10))
sns.heatmap(df_before_preproc.corr())

In [None]:
plt.figure(figsize = (10,10))
sns.heatmap(df.corr())

# As we can see, the preprocessing has paid off: the remaining features are less correlated with each other, so we can start building the initial model.
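The heatmap comparison can also be summarized as a single number, for instance the mean absolute off-diagonal correlation before and after dropping a redundant column. The sketch below uses made-up data; the helper `mean_abs_offdiag_corr` and the toy values are assumptions for illustration, and the same function could be applied to `df_before_preproc` and `df`.

```python
import numpy as np
import pandas as pd

def mean_abs_offdiag_corr(frame):
    """Average |correlation| over all distinct column pairs (diagonal excluded)."""
    corr = frame.corr().abs().to_numpy()
    mask = ~np.eye(corr.shape[0], dtype=bool)
    return corr[mask].mean()

# Toy data with hypothetical values; blueGoldPerMin is an exact rescaling of blueTotalGold
rng = np.random.default_rng(1)
before = pd.DataFrame({
    'blueTotalGold': rng.normal(16500, 1500, size=300),
    'blueKills': rng.poisson(6, size=300).astype(float),
})
before['blueGoldPerMin'] = before['blueTotalGold'] / 10
after = before.drop(columns=['blueGoldPerMin'])

score_before = mean_abs_offdiag_corr(before)
score_after = mean_abs_offdiag_corr(after)
print(score_before, score_after)
```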

In [None]:
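X = df
X_train,X_test,y_train,y_test = train_test_split(X, y, test_size = 0.3, shuffle = True, random_state = 4)

In [None]:
# Fit the scaler on the training split only, so test-set statistics do not leak into training
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)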

In [None]:
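# The default lbfgs solver does not support the 'l1' penalty; liblinear handles both 'l1' and 'l2'
Estimator = LogisticRegression(solver = 'liblinear')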
cv = 3
param_grid = {'C' : [0.0005, 0.005, 0.05, 0.5, 1], 'penalty' : ['l1','l2']}
Optimizer = GridSearchCV(Estimator, param_grid = param_grid, cv = cv)

In [None]:
Optimizer.fit(X_train, y_train)

In [None]:
predsTest = Optimizer.predict(X_test)
acc_test = accuracy_score(y_test, predsTest)
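Because `GridSearchCV` refits the model on every fold, one way to keep the scaler from ever seeing validation rows is to wrap scaling and classification in a `Pipeline`. The sketch below is a minimal illustration on synthetic data from `make_classification`, not the notebook's dataset. The step names `scale` and `clf`, the reduced `C` grid, and the choice of `solver='liblinear'` (which accepts both `l1` and `l2`) are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the game features (hypothetical data)
X_demo, y_demo = make_classification(n_samples=600, n_features=10,
                                     n_informative=5, random_state=4)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo,
                                          test_size=0.3, random_state=4)

# The scaler is re-fitted on the training part of each CV fold
pipe = Pipeline([
    ('scale', MinMaxScaler()),
    ('clf', LogisticRegression(solver='liblinear')),
])
grid = {'clf__C': [0.05, 0.5, 1], 'clf__penalty': ['l1', 'l2']}
search = GridSearchCV(pipe, param_grid=grid, cv=3)
search.fit(X_tr, y_tr)

test_acc = search.score(X_te, y_te)
print(search.best_params_, round(100 * test_acc, 2))
```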

In [None]:
param_grid = {'max_depth' : [None, 1, 2, 3, 4], 'min_samples_leaf' : [5, 10, 20, 100]}
Estimator = DecisionTreeClassifier()
Optimizer = GridSearchCV(Estimator, param_grid = param_grid, cv = cv)
Optimizer.fit(X_train,y_train)

In [None]:
predsTestT = Optimizer.predict(X_test)
acc_testT = accuracy_score(y_test, predsTestT)

'Best accuracy scores for LogReg and DecTree with GridSearch: {} and {}'.format(round(100*acc_test,2), round(100*acc_testT,2))
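Accuracy alone does not show which side the errors fall on. The notebook imports `confusion_matrix` without using it, so here is a minimal sketch of how it could complement the accuracy figures. The labels below are made up for illustration; with the real models, `y_test` and the corresponding predictions would take their place.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels: 1 = blue wins, 0 = red wins
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
acc = accuracy_score(y_true, y_pred)

print(cm)
print('accuracy:', acc)
```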