## PROJECT 3 - DATA SCIENCE

by Victor de Almeida Cunha

In [196]:
%matplotlib inline

import pandas as pd
import numpy as np
from scipy.stats import norm, probplot
import statsmodels.api as sm
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt


from sklearn.metrics import (confusion_matrix,  
                           accuracy_score)

from mpl_toolkits.mplot3d import Axes3D

# For nicer output display
from IPython.display import display

# OBJECTIVE

One of the most popular competitive online games today is League of Legends, in which you and four teammates fight a team of five other players to destroy the enemy Nexus.

To do so, you make use of your hero's abilities and of the various enemies scattered across the map.

The path to victory is simple: destroy the towers in a lane, destroy that lane's inhibitor, and then destroy the enemy Nexus.

Because it is such a popular competitive game, many teams and analysts use per-match statistics to find where a team went wrong and even to predict which team will win, which is what this report attempts to do.

For this we have two datasets: one with statistics from the first 10 minutes of each match and another with statistics from the first 15 minutes, all from matches played by the best players in the world (the Challenger tier).

In [197]:
partidas_10min = pd.read_csv('data/Challenger_Ranked_Games_10minute.csv')
partidas_10min

Unnamed: 0,gameId,blueWins,blueTotalGolds,blueCurrentGolds,blueTotalLevel,blueAvgLevel,blueTotalMinionKills,blueTotalJungleMinionKills,blueFirstBlood,blueKill,...,redFirstTowerLane,redTowerKills,redMidTowerKills,redTopTowerKills,redBotTowerKills,redInhibitor,redFirstDragon,redDragnoType,redDragon,redRiftHeralds
0,4247263043,0,14870,2889,32,6.4,199,53,0,3,...,[],0,0,0,0,0,1,['WATER_DRAGON'],1,0
1,4247155821,1,14497,2617,33,6.6,229,44,0,2,...,[],0,0,0,0,0,0,[],0,0
2,4243963257,0,15617,1757,34,6.8,223,39,0,3,...,['BOT_LANE'],1,0,0,1,0,1,['FIRE_DRAGON'],1,1
3,4241678498,0,15684,1439,35,7.0,251,64,0,3,...,[],0,0,0,0,0,0,[],0,0
4,4241538868,1,17472,3512,35,7.0,257,46,0,7,...,[],0,0,0,0,0,0,[],0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26404,4143231833,0,16762,5072,36,7.2,206,52,0,7,...,[],0,0,0,0,0,1,['WATER_DRAGON'],1,1
26405,4157911901,0,35765,1440,39,7.8,191,16,0,18,...,[],3,1,0,2,0,0,[],0,0
26406,3764171638,0,15712,4137,34,6.8,218,55,0,3,...,[],0,0,0,0,0,0,[],0,0
26407,4110201724,1,15850,3220,33,6.6,193,48,0,6,...,[],0,0,0,0,0,0,[],0,0


This dataset contains many match statistics, including:
- The amount of gold each team has;
- The combined level of all heroes on a team;
- How many enemy players and minions (AI-controlled enemies) were killed;
- Which team destroyed the first tower on the map.

To start the analysis, we first remove the columns that are not relevant to it:
- The match ID;
- The lane of the first tower each team destroyed;
- The types of dragons slain.

In [198]:
partidas_10min = partidas_10min.drop(columns=['gameId','blueFirstTowerLane','blueDragnoType','redFirstTowerLane','redDragnoType'])
partidas_10min

Unnamed: 0,blueWins,blueTotalGolds,blueCurrentGolds,blueTotalLevel,blueAvgLevel,blueTotalMinionKills,blueTotalJungleMinionKills,blueFirstBlood,blueKill,blueDeath,...,redFirstTower,redFirstInhibitor,redTowerKills,redMidTowerKills,redTopTowerKills,redBotTowerKills,redInhibitor,redFirstDragon,redDragon,redRiftHeralds
0,0,14870,2889,32,6.4,199,53,0,3,9,...,0,0,0,0,0,0,0,1,1,0
1,1,14497,2617,33,6.6,229,44,0,2,3,...,0,0,0,0,0,0,0,0,0,0
2,0,15617,1757,34,6.8,223,39,0,3,11,...,1,0,1,0,0,1,0,1,1,1
3,0,15684,1439,35,7.0,251,64,0,3,4,...,0,0,0,0,0,0,0,0,0,0
4,1,17472,3512,35,7.0,257,46,0,7,5,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26404,0,16762,5072,36,7.2,206,52,0,7,5,...,0,0,0,0,0,0,0,1,1,1
26405,0,35765,1440,39,7.8,191,16,0,18,27,...,0,0,3,1,0,2,0,0,0,0
26406,0,15712,4137,34,6.8,218,55,0,3,0,...,0,0,0,0,0,0,0,0,0,0
26407,1,15850,3220,33,6.6,193,48,0,6,4,...,0,0,0,0,0,0,0,0,0,0


Now let's relabel our qualitative (categorical) variables.

The "First Blood" variable indicates which team scored the first kill of the match, that is, whether the blue team or the red team got the first elimination.

In [199]:
partidas_10min['blueFirstBlood'] = partidas_10min['blueFirstBlood'].astype('category')
partidas_10min['blueFirstBlood'] = partidas_10min['blueFirstBlood'].cat.rename_categories({0:'No',1:'Yes'})

partidas_10min['redFirstBlood'] = partidas_10min['redFirstBlood'].astype('category')
partidas_10min['redFirstBlood'] = partidas_10min['redFirstBlood'].cat.rename_categories({0:'No',1:'Yes'})

The "First Tower" variable indicates whether the team in question destroyed the first tower of the match, much like "First Blood".

In [200]:
partidas_10min['blueFirstTower'] = partidas_10min['blueFirstTower'].astype('category')
partidas_10min['blueFirstTower'] = partidas_10min['blueFirstTower'].cat.rename_categories({0:'No',1:'Yes'})

partidas_10min['redFirstTower'] = partidas_10min['redFirstTower'].astype('category')
partidas_10min['redFirstTower'] = partidas_10min['redFirstTower'].cat.rename_categories({0:'No',1:'Yes'})

Similarly, the "First Inhibitor" variable indicates which team destroyed the first inhibitor.

In [201]:
partidas_10min['blueFirstInhibitor'] = partidas_10min['blueFirstInhibitor'].astype('category')
partidas_10min['blueFirstInhibitor'] = partidas_10min['blueFirstInhibitor'].cat.rename_categories({0:'No',1:'Yes'})

partidas_10min['redFirstInhibitor'] = partidas_10min['redFirstInhibitor'].astype('category')
partidas_10min['redFirstInhibitor'] = partidas_10min['redFirstInhibitor'].cat.rename_categories({0:'No',1:'Yes'})
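
The three cells above apply the same two steps to six columns. As a possible shortening, the conversion could be written as a loop. This is only a sketch: `demo` and `flag_columns` are hypothetical names, and the small table stands in for the real match data.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the match table
demo = pd.DataFrame({
    "blueFirstBlood": [0, 1, 0],
    "redFirstBlood": [1, 0, 1],
    "blueKill": [3, 2, 7],
})

flag_columns = ["blueFirstBlood", "redFirstBlood"]  # columns holding 0/1 flags

for col in flag_columns:
    # Same two steps as the cells above: cast to category, then relabel 0/1
    demo[col] = demo[col].astype("category").cat.rename_categories({0: "No", 1: "Yes"})

print(demo["blueFirstBlood"].tolist())
```

Numeric columns such as `blueKill` are left untouched because they are not listed in `flag_columns`.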

Now let's split the dataset into the blue team's columns and the red team's columns.

In [202]:
partidas_10min_blue = partidas_10min.iloc[:,:23]
# Note: 24:45 selects columns 24 to 44, so the last column (redRiftHeralds) is left out
partidas_10min_red = partidas_10min.iloc[:,24:45]

In [203]:
partidas_10min_blue

Unnamed: 0,blueWins,blueTotalGolds,blueCurrentGolds,blueTotalLevel,blueAvgLevel,blueTotalMinionKills,blueTotalJungleMinionKills,blueFirstBlood,blueKill,blueDeath,...,blueFirstTower,blueFirstInhibitor,blueTowerKills,blueMidTowerKills,blueTopTowerKills,blueBotTowerKills,blueInhibitor,blueFirstDragon,blueDragon,blueRiftHeralds
0,0,14870,2889,32,6.4,199,53,No,3,9,...,No,No,0,0,0,0,0,0,0,0
1,1,14497,2617,33,6.6,229,44,No,2,3,...,No,No,0,0,0,0,0,0,0,1
2,0,15617,1757,34,6.8,223,39,No,3,11,...,No,No,0,0,0,0,0,0,0,0
3,0,15684,1439,35,7.0,251,64,No,3,4,...,No,No,0,0,0,0,0,0,0,0
4,1,17472,3512,35,7.0,257,46,No,7,5,...,No,No,0,0,0,0,0,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26404,0,16762,5072,36,7.2,206,52,No,7,5,...,No,No,0,0,0,0,0,0,0,0
26405,0,35765,1440,39,7.8,191,16,No,18,27,...,Yes,No,2,0,2,0,0,1,1,0
26406,0,15712,4137,34,6.8,218,55,No,3,0,...,No,No,0,0,0,0,0,0,0,0
26407,1,15850,3220,33,6.6,193,48,No,6,4,...,No,No,0,0,0,0,0,1,1,0


In [204]:
partidas_10min_red

Unnamed: 0,redTotalGolds,redCurrentGolds,redTotalLevel,redAvgLevel,redTotalMinionKills,redTotalJungleMinionKills,redFirstBlood,redKill,redDeath,redAssist,...,redWardKills,redFirstTower,redFirstInhibitor,redTowerKills,redMidTowerKills,redTopTowerKills,redBotTowerKills,redInhibitor,redFirstDragon,redDragon
0,18397,3297,36,7.2,229,32,No,9,3,22,...,8,No,No,0,0,0,0,0,1,1
1,15893,4778,36,7.2,234,57,No,3,2,2,...,5,No,No,0,0,0,0,0,0,0
2,20409,4324,37,7.4,236,45,No,11,3,11,...,4,Yes,No,1,0,0,1,0,1,1
3,16150,3633,35,7.0,256,48,No,4,3,5,...,7,No,No,0,0,0,0,0,0,0
4,15588,3323,34,6.8,194,56,No,5,7,7,...,3,No,No,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26404,14779,3709,36,7.2,205,67,No,5,7,1,...,4,No,No,0,0,0,0,0,1,1
26405,42001,9568,44,8.8,279,73,No,27,18,20,...,0,No,No,3,1,0,2,0,0,0
26406,14845,1345,34,6.8,239,60,No,0,3,0,...,3,No,No,0,0,0,0,0,0,0
26407,14773,3673,33,6.6,204,52,No,4,6,8,...,4,No,No,0,0,0,0,0,0,0


To make our prediction, we need some form of regression.

## FITTING THE LOGIT REGRESSION

In a regression, we use several explanatory variables to explain a dependent variable (the target). For our dataset, we want to predict whether a team wins from the data provided. Because the target `blueWins` is binary (0 or 1), we use a logistic (logit) regression.
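
As a rough illustration of what a logit model computes: it forms a linear combination of the explanatory variables and passes it through the logistic function to obtain a win probability. The intercept and coefficient below are made-up values chosen for illustration, not fitted results.

```python
import math


def logistic(z):
    """Map a linear predictor z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical model with a single feature: z = intercept + coef * kills
intercept = -1.0   # assumed value, for illustration only
coef_kills = 0.2   # assumed value, for illustration only

for kills in (0, 5, 10):
    z = intercept + coef_kills * kills
    print(kills, round(logistic(z), 3))
```

A linear predictor of zero corresponds to a probability of exactly 0.5, which is the cut-off used when predicted probabilities are rounded to 0 or 1.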

In [205]:
def regress(Y):
    '''
    Fits a logit model using the first column of Y as the response.

    Y: DataFrame whose first column is the response variable (TARGET)
       and whose remaining columns are the explanatory variables (FEATURES)
    '''
    columns = list(Y.columns)
    # Build a formula such as "blueWins ~ blueKill + blueDeath"
    formula_string = columns[0] + ' ~ ' + ' + '.join(columns[1:])

    model = smf.logit(formula_string, data=Y)
    results = model.fit()
    
    return results

In [206]:
results = regress(partidas_10min_blue)
results.summary()

         Current function value: 0.537835
         Iterations: 35




0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26389.0
Method:,MLE,Df Model:,19.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2241
Time:,17:37:12,Log-Likelihood:,-14204.0
converged:,False,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-5.0780,0.275,-18.472,0.000,-5.617,-4.539
blueFirstTower[T.Yes],0.3986,0.108,3.700,0.000,0.187,0.610
blueFirstInhibitor[T.Yes],7.33e+04,3.28e+07,0.002,0.998,-6.43e+07,6.44e+07
blueTotalGolds,6.274e-05,1.49e-05,4.223,0.000,3.36e-05,9.19e-05
blueCurrentGolds,7.707e-07,1.23e-05,0.063,0.950,-2.34e-05,2.49e-05
blueTotalLevel,0.0463,,,,,
blueAvgLevel,0.0090,,,,,
blueTotalMinionKills,0.0076,0.001,10.890,0.000,0.006,0.009
blueTotalJungleMinionKills,0.0106,0.001,7.197,0.000,0.008,0.014
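
The missing standard errors for `blueTotalLevel` and `blueAvgLevel`, together with `converged: False`, are consistent with perfectly collinear predictors: in the rows displayed earlier, `blueAvgLevel` is always `blueTotalLevel / 5`. A minimal sketch of how such redundancy could be checked, using the level values from the first five rows of the 10-minute table (the variable names are illustrative):

```python
import numpy as np

# blueTotalLevel and blueAvgLevel from the first five rows of the 10-minute table
total_level = np.array([32, 33, 34, 35, 35], dtype=float)
avg_level = np.array([6.4, 6.6, 6.8, 7.0, 7.0])

# Design matrix with an intercept column plus the two level columns
X = np.column_stack([np.ones_like(total_level), total_level, avg_level])

# A rank below the number of columns means one column is a
# linear combination of the others
rank = np.linalg.matrix_rank(X)
print(X.shape[1], rank)
```

When the rank is lower than the number of columns, the individual coefficients of the redundant columns cannot be identified, so keeping only one of them is a reasonable remedy.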


In [207]:
partidas_10min_blue1 = partidas_10min_blue.drop(columns=['blueInhibitor'])
results = regress(partidas_10min_blue1)
results.summary()

Optimization terminated successfully.
         Current function value: 0.537838
         Iterations 11


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26390.0
Method:,MLE,Df Model:,18.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2241
Time:,17:37:13,Log-Likelihood:,-14204.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-5.0793,0.275,-18.479,0.000,-5.618,-4.541
blueFirstTower[T.Yes],0.4004,0.107,3.725,0.000,0.190,0.611
blueFirstInhibitor[T.Yes],0.6153,0.318,1.934,0.053,-0.008,1.239
blueTotalGolds,6.28e-05,1.49e-05,4.228,0.000,3.37e-05,9.19e-05
blueCurrentGolds,7.851e-07,1.23e-05,0.064,0.949,-2.34e-05,2.49e-05
blueTotalLevel,0.0462,2.15e+04,2.15e-06,1.000,-4.21e+04,4.21e+04
blueAvgLevel,0.0092,1.07e+05,8.6e-08,1.000,-2.11e+05,2.11e+05
blueTotalMinionKills,0.0076,0.001,10.895,0.000,0.006,0.009
blueTotalJungleMinionKills,0.0107,0.001,7.199,0.000,0.008,0.014


In [208]:
partidas_10min_blue2 = partidas_10min_blue1.drop(columns=['blueTotalLevel','blueAvgLevel','blueTowerKills','blueMidTowerKills','blueTopTowerKills','blueBotTowerKills'])
results = regress(partidas_10min_blue2)
results.summary()

Optimization terminated successfully.
         Current function value: 0.538857
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26394.0
Method:,MLE,Df Model:,14.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2226
Time:,17:37:13,Log-Likelihood:,-14231.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-3.8603,0.192,-20.129,0.000,-4.236,-3.484
blueFirstTower[T.Yes],0.0535,0.071,0.749,0.454,-0.086,0.193
blueFirstInhibitor[T.Yes],0.4224,0.295,1.431,0.152,-0.156,1.001
blueTotalGolds,0.0001,9.7e-06,11.201,0.000,8.97e-05,0.000
blueCurrentGolds,5.806e-07,1.23e-05,0.047,0.962,-2.35e-05,2.47e-05
blueTotalMinionKills,0.0068,0.001,9.933,0.000,0.005,0.008
blueTotalJungleMinionKills,0.0105,0.001,7.118,0.000,0.008,0.013
blueKill,0.2099,0.008,27.871,0.000,0.195,0.225
blueDeath,-0.2451,0.005,-47.923,0.000,-0.255,-0.235


In [209]:
partidas_10min_blue3 = partidas_10min_blue2.drop(columns=['blueCurrentGolds'])
results = regress(partidas_10min_blue3)
results.summary()

Optimization terminated successfully.
         Current function value: 0.538857
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26395.0
Method:,MLE,Df Model:,13.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2226
Time:,17:37:13,Log-Likelihood:,-14231.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-3.8599,0.192,-20.146,0.000,-4.235,-3.484
blueFirstTower[T.Yes],0.0536,0.071,0.751,0.453,-0.086,0.193
blueFirstInhibitor[T.Yes],0.4222,0.295,1.430,0.153,-0.156,1.001
blueTotalGolds,0.0001,9.46e-06,11.498,0.000,9.02e-05,0.000
blueTotalMinionKills,0.0068,0.001,9.933,0.000,0.005,0.008
blueTotalJungleMinionKills,0.0105,0.001,7.118,0.000,0.008,0.013
blueKill,0.2099,0.008,27.898,0.000,0.195,0.225
blueDeath,-0.2452,0.005,-48.815,0.000,-0.255,-0.235
blueAssist,0.0044,0.003,1.349,0.177,-0.002,0.011
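
The cells in this part of the notebook repeat one pattern: refit the model, then drop a remaining variable with a high p-value (for example, `blueCurrentGolds` at p = 0.962 above). The loop below is a sketch of how that backward elimination might be automated. It makes several assumptions that are not in the notebook: the 0.05 threshold, the helper name `backward_eliminate`, numeric-only features (categorical terms such as `blueFirstTower[T.Yes]` would need their names mapped back to columns), and a synthetic table in place of the match data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def backward_eliminate(df, target, alpha=0.05):
    """Refit a logit model, dropping the least significant feature each round."""
    features = [c for c in df.columns if c != target]
    while features:
        formula = f"{target} ~ {' + '.join(features)}"
        fit = smf.logit(formula, data=df).fit(disp=0)
        pvalues = fit.pvalues.drop("Intercept")
        worst = pvalues.idxmax()          # feature with the largest p-value
        if pvalues[worst] < alpha:        # everything left is significant
            return features, fit
        features.remove(worst)
    return features, None


# Synthetic data: "kills" drives the outcome, "noise" does not
rng = np.random.default_rng(0)
n = 2000
demo = pd.DataFrame({"kills": rng.normal(size=n), "noise": rng.normal(size=n)})
win_probability = 1 / (1 + np.exp(-2.0 * demo["kills"]))
demo["win"] = (rng.random(n) < win_probability).astype(int)

kept, fit = backward_eliminate(demo, "win")
print(kept)
```

Automated elimination on p-values alone can discard variables that matter in combination, so the manual, inspect-each-step approach used in the notebook remains a sensible cross-check.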


In [210]:
partidas_10min_blue4 = partidas_10min_blue3.drop(columns=['blueWardPlaced'])
results = regress(partidas_10min_blue4)
results.summary()

Optimization terminated successfully.
         Current function value: 0.538865
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26396.0
Method:,MLE,Df Model:,12.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2226
Time:,17:37:13,Log-Likelihood:,-14231.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-3.8618,0.192,-20.166,0.000,-4.237,-3.487
blueFirstTower[T.Yes],0.0536,0.071,0.751,0.453,-0.086,0.193
blueFirstInhibitor[T.Yes],0.4222,0.295,1.431,0.152,-0.156,1.001
blueTotalGolds,0.0001,9.45e-06,11.482,0.000,9e-05,0.000
blueTotalMinionKills,0.0068,0.001,9.923,0.000,0.005,0.008
blueTotalJungleMinionKills,0.0104,0.001,7.091,0.000,0.008,0.013
blueKill,0.2097,0.008,27.892,0.000,0.195,0.224
blueDeath,-0.2452,0.005,-48.822,0.000,-0.255,-0.235
blueAssist,0.0046,0.003,1.398,0.162,-0.002,0.011


In [211]:
partidas_10min_blue5 = partidas_10min_blue4.drop(columns=['blueWardKills'])
results = regress(partidas_10min_blue5)
results.summary()

Optimization terminated successfully.
         Current function value: 0.538873
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26397.0
Method:,MLE,Df Model:,11.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2226
Time:,17:37:13,Log-Likelihood:,-14231.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-3.8602,0.191,-20.159,0.000,-4.235,-3.485
blueFirstTower[T.Yes],0.0517,0.071,0.725,0.468,-0.088,0.191
blueFirstInhibitor[T.Yes],0.4202,0.295,1.424,0.154,-0.158,0.999
blueTotalGolds,0.0001,9.42e-06,11.466,0.000,8.95e-05,0.000
blueTotalMinionKills,0.0069,0.001,10.334,0.000,0.006,0.008
blueTotalJungleMinionKills,0.0104,0.001,7.084,0.000,0.008,0.013
blueKill,0.2099,0.008,27.915,0.000,0.195,0.225
blueDeath,-0.2453,0.005,-48.840,0.000,-0.255,-0.235
blueAssist,0.0047,0.003,1.438,0.150,-0.002,0.011


In [212]:
partidas_10min_blue6 = partidas_10min_blue5.drop(columns=['blueFirstTower'])
results = regress(partidas_10min_blue6)
results.summary()

Optimization terminated successfully.
         Current function value: 0.538883
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26398.0
Method:,MLE,Df Model:,10.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2226
Time:,17:37:13,Log-Likelihood:,-14231.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-3.8819,0.189,-20.525,0.000,-4.253,-3.511
blueFirstInhibitor[T.Yes],0.4390,0.294,1.494,0.135,-0.137,1.015
blueTotalGolds,0.0001,9.13e-06,12.018,0.000,9.18e-05,0.000
blueTotalMinionKills,0.0069,0.001,10.339,0.000,0.006,0.008
blueTotalJungleMinionKills,0.0104,0.001,7.067,0.000,0.008,0.013
blueKill,0.2094,0.007,27.954,0.000,0.195,0.224
blueDeath,-0.2456,0.005,-49.154,0.000,-0.255,-0.236
blueAssist,0.0048,0.003,1.450,0.147,-0.002,0.011
blueFirstDragon,1.5235,0.366,4.159,0.000,0.806,2.241


In [213]:
partidas_10min_blue7 = partidas_10min_blue6.drop(columns=['blueAssist'])
results = regress(partidas_10min_blue7)
results.summary()

Optimization terminated successfully.
         Current function value: 0.538923
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26399.0
Method:,MLE,Df Model:,9.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2225
Time:,17:37:13,Log-Likelihood:,-14232.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-3.8519,0.188,-20.487,0.000,-4.220,-3.483
blueFirstInhibitor[T.Yes],0.4661,0.293,1.593,0.111,-0.107,1.039
blueTotalGolds,0.0001,8.41e-06,13.644,0.000,9.83e-05,0.000
blueTotalMinionKills,0.0065,0.001,10.564,0.000,0.005,0.008
blueTotalJungleMinionKills,0.0100,0.001,6.918,0.000,0.007,0.013
blueKill,0.2138,0.007,31.137,0.000,0.200,0.227
blueDeath,-0.2469,0.005,-50.218,0.000,-0.257,-0.237
blueFirstDragon,1.6256,0.361,4.506,0.000,0.919,2.333
blueDragon,-1.1461,0.359,-3.192,0.001,-1.850,-0.442


In [214]:
partidas_10min_blue8 = partidas_10min_blue7.drop(columns=['blueFirstInhibitor'])
results = regress(partidas_10min_blue8)
results.summary()

Optimization terminated successfully.
         Current function value: 0.538975
         Iterations 6


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26409.0
Model:,Logit,Df Residuals:,26400.0
Method:,MLE,Df Model:,8.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.2224
Time:,17:37:13,Log-Likelihood:,-14234.0
converged:,True,LL-Null:,-18305.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-3.8630,0.188,-20.558,0.000,-4.231,-3.495
blueTotalGolds,0.0001,8.38e-06,13.822,0.000,9.94e-05,0.000
blueTotalMinionKills,0.0065,0.001,10.535,0.000,0.005,0.008
blueTotalJungleMinionKills,0.0100,0.001,6.909,0.000,0.007,0.013
blueKill,0.2139,0.007,31.186,0.000,0.200,0.227
blueDeath,-0.2472,0.005,-50.357,0.000,-0.257,-0.238
blueFirstDragon,1.6322,0.360,4.536,0.000,0.927,2.337
blueDragon,-1.1532,0.358,-3.221,0.001,-1.855,-0.451
blueRiftHeralds,0.1552,0.038,4.136,0.000,0.082,0.229
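
Logit coefficients are on the log-odds scale, which makes them hard to read directly. One common way to interpret them is to exponentiate each coefficient into an odds ratio. The sketch below does this for two coefficients copied from the summary table above; the dictionary names are illustrative.

```python
import math

# Coefficients copied from the final 10-minute summary table
coefficients = {"blueKill": 0.2139, "blueDeath": -0.2472}

# exp(coef) is the multiplicative change in the odds of winning
# for a one-unit increase, holding the other variables fixed
odds_ratios = {name: math.exp(coef) for name, coef in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.3f}")
```

Under this reading, each additional kill in the first 10 minutes multiplies the odds of a blue win by roughly 1.24, and each additional death multiplies them by roughly 0.78.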


In [215]:
Xtest = partidas_10min_blue8.iloc[:,1:]

yhat = results.predict(Xtest)
prediction = list(map(round,yhat))
print('Real:',list(partidas_10min_blue8['blueWins'])[:10])
print('Prediction:',prediction[:10])

Real: [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
Prediction: [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]


In [229]:
cm = confusion_matrix(list(partidas_10min_blue8['blueWins']), prediction)

# Note: this accuracy is measured on the same rows used to fit the model, not on held-out data
print('Test accuracy =', round(accuracy_score(list(partidas_10min_blue8['blueWins']), prediction), 3))

Test accuracy = 0.728
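
The confusion matrix `cm` is computed above but never displayed. To illustrate what it contains, this sketch tallies the four cells by hand for the ten rows printed earlier, using the same layout as scikit-learn (rows are actual classes, columns are predicted classes). It covers only those ten rows, not the full dataset.

```python
# First ten actual and predicted labels, copied from the output above
real = [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
predicted = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]

pairs = list(zip(real, predicted))

# Tally the four cells of a binary confusion matrix
tn = sum(1 for r, p in pairs if r == 0 and p == 0)  # true negatives
fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives

accuracy = (tp + tn) / len(real)
print([[tn, fp], [fn, tp]], accuracy)
# → [[4, 0], [4, 2]] 0.6
```

In this small sample every error is a missed blue win (a false negative), which is the kind of pattern the full matrix would reveal and a single accuracy figure would hide.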


In [217]:
partidas_15min = pd.read_csv('data/Challenger_Ranked_Games_15minute.csv')
partidas_15min

Unnamed: 0,gameId,blueWins,blueTotalGolds,blueCurrentGolds,blueTotalLevel,blueAvgLevel,blueTotalMinionKills,blueTotalJungleMinionKills,blueFirstBlood,blueKill,...,redFirstTowerLane,redTowerKills,redMidTowerKills,redTopTowerKills,redBotTowerKills,redInhibitor,redFirstDragon,redDragnoType,redDragon,redRiftHeralds
0,4247263043,0,24081,1190,44,8.8,309,74,0,8,...,['MID_LANE'],2,1,0,1,0,1,"['WATER_DRAGON', 'EARTH_DRAGON']",2,1
1,4247155821,1,24162,2212,46,9.2,393,64,0,5,...,['TOP_LANE'],1,0,1,0,0,0,[],0,0
2,4243963257,0,22413,1563,41,8.2,300,62,0,5,...,['BOT_LANE'],4,2,1,1,0,1,"['FIRE_DRAGON', 'EARTH_DRAGON']",2,1
3,4241678498,0,23837,3197,46,9.2,370,96,0,6,...,['TOP_LANE'],1,0,1,0,0,0,[],0,1
4,4241538868,1,27688,3663,44,8.8,381,66,0,9,...,['BOT_LANE'],2,1,0,1,0,0,[],0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26829,4143231833,0,26110,1535,46,9.2,289,73,0,15,...,['BOT_LANE'],1,0,0,1,0,1,['WATER_DRAGON'],1,1
26830,4157911901,0,57503,3293,62,12.4,329,28,0,27,...,[],6,2,1,3,1,0,[],0,0
26831,3764171638,0,26091,2986,47,9.4,338,86,0,7,...,[],0,0,0,0,0,0,[],0,0
26832,4110201724,1,24734,4289,45,9.0,328,64,0,11,...,['BOT_LANE'],1,0,0,1,0,0,['AIR_DRAGON'],1,0


In [218]:
partidas_15min = partidas_15min.drop(columns=['gameId','blueFirstTowerLane','blueDragnoType','redFirstTowerLane','redDragnoType'])
partidas_15min

Unnamed: 0,blueWins,blueTotalGolds,blueCurrentGolds,blueTotalLevel,blueAvgLevel,blueTotalMinionKills,blueTotalJungleMinionKills,blueFirstBlood,blueKill,blueDeath,...,redFirstTower,redFirstInhibitor,redTowerKills,redMidTowerKills,redTopTowerKills,redBotTowerKills,redInhibitor,redFirstDragon,redDragon,redRiftHeralds
0,0,24081,1190,44,8.8,309,74,0,8,14,...,1,0,2,1,0,1,0,1,2,1
1,1,24162,2212,46,9.2,393,64,0,5,6,...,1,0,1,0,1,0,0,0,0,0
2,0,22413,1563,41,8.2,300,62,0,5,20,...,1,0,4,2,1,1,0,1,2,1
3,0,23837,3197,46,9.2,370,96,0,6,13,...,1,0,1,0,1,0,0,0,0,1
4,1,27688,3663,44,8.8,381,66,0,9,10,...,1,0,2,1,0,1,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26829,0,26110,1535,46,9.2,289,73,0,15,14,...,1,0,1,0,0,1,0,1,1,1
26830,0,57503,3293,62,12.4,329,28,0,27,37,...,0,1,6,2,1,3,1,0,0,0
26831,0,26091,2986,47,9.4,338,86,0,7,6,...,0,0,0,0,0,0,0,0,0,0
26832,1,24734,4289,45,9.0,328,64,0,11,7,...,1,0,1,0,0,1,0,0,1,0


In [219]:
partidas_15min['blueFirstBlood'] = partidas_15min['blueFirstBlood'].astype('category')
partidas_15min['blueFirstBlood'] = partidas_15min['blueFirstBlood'].cat.rename_categories({0:'No',1:'Yes'})

partidas_15min['redFirstBlood'] = partidas_15min['redFirstBlood'].astype('category')
partidas_15min['redFirstBlood'] = partidas_15min['redFirstBlood'].cat.rename_categories({0:'No',1:'Yes'})

partidas_15min['blueFirstTower'] = partidas_15min['blueFirstTower'].astype('category')
partidas_15min['blueFirstTower'] = partidas_15min['blueFirstTower'].cat.rename_categories({0:'No',1:'Yes'})

partidas_15min['redFirstTower'] = partidas_15min['redFirstTower'].astype('category')
partidas_15min['redFirstTower'] = partidas_15min['redFirstTower'].cat.rename_categories({0:'No',1:'Yes'})

partidas_15min['blueFirstInhibitor'] = partidas_15min['blueFirstInhibitor'].astype('category')
partidas_15min['blueFirstInhibitor'] = partidas_15min['blueFirstInhibitor'].cat.rename_categories({0:'No',1:'Yes'})

partidas_15min['redFirstInhibitor'] = partidas_15min['redFirstInhibitor'].astype('category')
partidas_15min['redFirstInhibitor'] = partidas_15min['redFirstInhibitor'].cat.rename_categories({0:'No',1:'Yes'})

In [220]:
partidas_15min_blue = partidas_15min.iloc[:,:23]
# Note: 24:45 selects columns 24 to 44, so the last column (redRiftHeralds) is left out
partidas_15min_red = partidas_15min.iloc[:,24:45]

In [221]:
partidas_15min_blue

Unnamed: 0,blueWins,blueTotalGolds,blueCurrentGolds,blueTotalLevel,blueAvgLevel,blueTotalMinionKills,blueTotalJungleMinionKills,blueFirstBlood,blueKill,blueDeath,...,blueFirstTower,blueFirstInhibitor,blueTowerKills,blueMidTowerKills,blueTopTowerKills,blueBotTowerKills,blueInhibitor,blueFirstDragon,blueDragon,blueRiftHeralds
0,0,24081,1190,44,8.8,309,74,No,8,14,...,No,No,0,0,0,0,0,0,0,0
1,1,24162,2212,46,9.2,393,64,No,5,6,...,No,No,0,0,0,0,0,0,0,1
2,0,22413,1563,41,8.2,300,62,No,5,20,...,No,No,0,0,0,0,0,0,0,0
3,0,23837,3197,46,9.2,370,96,No,6,13,...,No,No,0,0,0,0,0,1,1,0
4,1,27688,3663,44,8.8,381,66,No,9,10,...,No,No,1,0,1,0,0,1,1,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26829,0,26110,1535,46,9.2,289,73,No,15,14,...,No,No,0,0,0,0,0,0,1,0
26830,0,57503,3293,62,12.4,329,28,No,27,37,...,Yes,No,5,3,2,0,1,1,3,0
26831,0,26091,2986,47,9.4,338,86,No,7,6,...,Yes,No,1,0,1,0,0,0,0,0
26832,1,24734,4289,45,9.0,328,64,No,11,7,...,No,No,0,0,0,0,0,1,1,0


In [222]:
partidas_15min_red

Unnamed: 0,redTotalGolds,redCurrentGolds,redTotalLevel,redAvgLevel,redTotalMinionKills,redTotalJungleMinionKills,redFirstBlood,redKill,redDeath,redAssist,...,redWardKills,redFirstTower,redFirstInhibitor,redTowerKills,redMidTowerKills,redTopTowerKills,redBotTowerKills,redInhibitor,redFirstDragon,redDragon
0,30099,6073,48,9.6,366,76,No,14,8,34,...,12,Yes,No,2,1,0,1,0,1,2
1,26015,3900,48,9.6,394,89,No,6,5,5,...,11,Yes,No,1,0,1,0,0,0,0
2,34296,5496,50,10.0,388,85,No,20,5,20,...,9,Yes,No,4,2,1,1,0,1,2
3,27824,5223,48,9.6,405,72,No,13,6,24,...,13,Yes,No,1,0,1,0,0,0,0
4,25826,2909,47,9.4,329,87,No,10,9,17,...,9,Yes,No,2,1,0,1,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
26829,27301,3931,47,9.4,323,84,No,14,15,17,...,11,Yes,No,1,0,0,1,0,1,1
26830,62919,6253,64,12.8,428,120,No,37,27,29,...,0,No,Yes,6,2,1,3,1,0,0
26831,25579,1897,46,9.2,377,88,No,6,7,8,...,8,No,No,0,0,0,0,0,0,0
26832,23593,4668,43,8.6,343,67,No,7,11,11,...,13,Yes,No,1,0,0,1,0,0,1


In [223]:
results = regress(partidas_15min_blue)
results.summary()

         Current function value: 0.445911
         Iterations: 35




0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26834.0
Model:,Logit,Df Residuals:,26814.0
Method:,MLE,Df Model:,19.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.3567
Time:,17:37:14,Log-Likelihood:,-11966.0
converged:,False,LL-Null:,-18600.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-1.5276,0.152,-10.042,0.000,-1.826,-1.229
blueFirstTower[T.Yes],0.2257,0.045,5.004,0.000,0.137,0.314
blueFirstInhibitor[T.Yes],1.0435,0.297,3.515,0.000,0.462,1.625
blueTotalGolds,6.306e-05,1.25e-05,5.033,0.000,3.85e-05,8.76e-05
blueCurrentGolds,8.535e-05,1.25e-05,6.834,0.000,6.09e-05,0.000
blueTotalLevel,-0.0196,2.03e+04,-9.64e-07,1.000,-3.98e+04,3.98e+04
blueAvgLevel,-0.0039,1.02e+05,-3.81e-08,1.000,-1.99e+05,1.99e+05
blueTotalMinionKills,0.0005,0.000,0.970,0.332,-0.000,0.001
blueTotalJungleMinionKills,0.0046,0.001,4.078,0.000,0.002,0.007


In [224]:
partidas_15min_blue1 = partidas_15min_blue.drop(columns=['blueTotalLevel','blueAvgLevel','blueTowerKills','blueMidTowerKills','blueTopTowerKills','blueBotTowerKills'])
results = regress(partidas_15min_blue1)
results.summary()

Optimization terminated successfully.
         Current function value: 0.447274
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26834.0
Model:,Logit,Df Residuals:,26818.0
Method:,MLE,Df Model:,15.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.3547
Time:,17:39:00,Log-Likelihood:,-12002.0
converged:,True,LL-Null:,-18600.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-1.9463,0.132,-14.796,0.000,-2.204,-1.689
blueFirstTower[T.Yes],0.3896,0.037,10.585,0.000,0.317,0.462
blueFirstInhibitor[T.Yes],1.0613,0.298,3.557,0.000,0.476,1.646
blueTotalGolds,5.763e-05,7.77e-06,7.417,0.000,4.24e-05,7.29e-05
blueCurrentGolds,9.155e-05,1.24e-05,7.400,0.000,6.73e-05,0.000
blueTotalMinionKills,-0.0003,0.000,-0.694,0.488,-0.001,0.001
blueTotalJungleMinionKills,0.0044,0.001,3.916,0.000,0.002,0.007
blueKill,0.1790,0.006,27.938,0.000,0.166,0.192
blueDeath,-0.2232,0.004,-52.638,0.000,-0.231,-0.215


In [225]:
partidas_15min_blue2 = partidas_15min_blue1.drop(columns=['blueInhibitor'])
results = regress(partidas_15min_blue2)
results.summary()

Optimization terminated successfully.
         Current function value: 0.447275
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26834.0
Model:,Logit,Df Residuals:,26819.0
Method:,MLE,Df Model:,14.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.3547
Time:,17:40:04,Log-Likelihood:,-12002.0
converged:,True,LL-Null:,-18600.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-1.9449,0.131,-14.805,0.000,-2.202,-1.687
blueFirstTower[T.Yes],0.3897,0.037,10.589,0.000,0.318,0.462
blueFirstInhibitor[T.Yes],1.0002,0.131,7.658,0.000,0.744,1.256
blueTotalGolds,5.752e-05,7.76e-06,7.417,0.000,4.23e-05,7.27e-05
blueCurrentGolds,9.157e-05,1.24e-05,7.401,0.000,6.73e-05,0.000
blueTotalMinionKills,-0.0003,0.000,-0.688,0.492,-0.001,0.001
blueTotalJungleMinionKills,0.0044,0.001,3.916,0.000,0.002,0.007
blueKill,0.1790,0.006,27.947,0.000,0.166,0.192
blueDeath,-0.2232,0.004,-52.636,0.000,-0.231,-0.215


In [226]:
partidas_15min_blue3 = partidas_15min_blue2.drop(columns=['blueAssist'])
results = regress(partidas_15min_blue3)
results.summary()

Optimization terminated successfully.
         Current function value: 0.447276
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26834.0
Model:,Logit,Df Residuals:,26820.0
Method:,MLE,Df Model:,13.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.3547
Time:,17:40:36,Log-Likelihood:,-12002.0
converged:,True,LL-Null:,-18600.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-1.9484,0.131,-14.883,0.000,-2.205,-1.692
blueFirstTower[T.Yes],0.3905,0.037,10.637,0.000,0.319,0.462
blueFirstInhibitor[T.Yes],0.9954,0.130,7.680,0.000,0.741,1.249
blueTotalGolds,5.669e-05,7.23e-06,7.845,0.000,4.25e-05,7.09e-05
blueCurrentGolds,9.172e-05,1.24e-05,7.420,0.000,6.75e-05,0.000
blueTotalMinionKills,-0.0003,0.000,-0.627,0.531,-0.001,0.001
blueTotalJungleMinionKills,0.0045,0.001,4.075,0.000,0.002,0.007
blueKill,0.1783,0.006,29.910,0.000,0.167,0.190
blueDeath,-0.2230,0.004,-53.179,0.000,-0.231,-0.215


In [227]:
partidas_15min_blue4 = partidas_15min_blue3.drop(columns=['blueTotalMinionKills'])
results = regress(partidas_15min_blue4)
results.summary()

Optimization terminated successfully.
         Current function value: 0.447284
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26834.0
Model:,Logit,Df Residuals:,26821.0
Method:,MLE,Df Model:,12.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.3547
Time:,17:41:11,Log-Likelihood:,-12002.0
converged:,True,LL-Null:,-18600.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-1.9755,0.124,-15.911,0.000,-2.219,-1.732
blueFirstTower[T.Yes],0.3904,0.037,10.636,0.000,0.318,0.462
blueFirstInhibitor[T.Yes],1.0062,0.129,7.830,0.000,0.754,1.258
blueTotalGolds,5.459e-05,6.41e-06,8.518,0.000,4.2e-05,6.72e-05
blueCurrentGolds,9.198e-05,1.24e-05,7.445,0.000,6.78e-05,0.000
blueTotalJungleMinionKills,0.0041,0.001,4.349,0.000,0.002,0.006
blueKill,0.1797,0.006,32.158,0.000,0.169,0.191
blueDeath,-0.2223,0.004,-55.057,0.000,-0.230,-0.214
blueWardPlaced,-0.0018,0.001,-2.818,0.005,-0.003,-0.001


In [228]:
partidas_15min_blue5 = partidas_15min_blue4.drop(columns=['blueFirstDragon'])
results = regress(partidas_15min_blue5)
results.summary()

Optimization terminated successfully.
         Current function value: 0.447300
         Iterations 7


0,1,2,3
Dep. Variable:,blueWins,No. Observations:,26834.0
Model:,Logit,Df Residuals:,26822.0
Method:,MLE,Df Model:,11.0
Date:,"Tue, 14 May 2024",Pseudo R-squ.:,0.3547
Time:,17:41:35,Log-Likelihood:,-12003.0
converged:,True,LL-Null:,-18600.0
Covariance Type:,nonrobust,LLR p-value:,0.0

0,1,2,3,4,5,6
,coef,std err,z,P>|z|,[0.025,0.975]
Intercept,-1.9676,0.124,-15.899,0.000,-2.210,-1.725
blueFirstTower[T.Yes],0.3906,0.037,10.640,0.000,0.319,0.463
blueFirstInhibitor[T.Yes],1.0072,0.128,7.841,0.000,0.755,1.259
blueTotalGolds,5.437e-05,6.4e-06,8.492,0.000,4.18e-05,6.69e-05
blueCurrentGolds,9.188e-05,1.24e-05,7.438,0.000,6.77e-05,0.000
blueTotalJungleMinionKills,0.0041,0.001,4.352,0.000,0.002,0.006
blueKill,0.1797,0.006,32.174,0.000,0.169,0.191
blueDeath,-0.2223,0.004,-55.056,0.000,-0.230,-0.214
blueWardPlaced,-0.0018,0.001,-2.817,0.005,-0.003,-0.001
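
The `Pseudo R-squ.` entry in these summaries is McFadden's pseudo R-squared, which can be reproduced from the two log-likelihoods as 1 - LL / LL-Null. The sketch below plugs in the rounded log-likelihoods from the final 10-minute and 15-minute summaries; the function name is illustrative, and small differences from the reported values can come from that rounding.

```python
def mcfadden_r2(log_likelihood, null_log_likelihood):
    """McFadden's pseudo R-squared: 1 - LL(model) / LL(intercept-only model)."""
    return 1.0 - log_likelihood / null_log_likelihood


# Log-Likelihood and LL-Null copied from the final summaries of each dataset
r2_10min = mcfadden_r2(-14234.0, -18305.0)
r2_15min = mcfadden_r2(-12003.0, -18600.0)

print(round(r2_10min, 4), round(r2_15min, 4))
```

By this measure the 15-minute model fits noticeably better than the 10-minute one, which matches the intuition that five more minutes of play carry more information about the outcome.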


In [231]:
Xtest = partidas_15min_blue5.iloc[:,1:]

yhat = results.predict(Xtest)
prediction = list(map(round,yhat))
print('Real:',list(partidas_15min_blue5['blueWins'])[:10])
print('Prediction:',prediction[:10])

cm = confusion_matrix(list(partidas_15min_blue5['blueWins']), prediction)

# Note: this accuracy is measured on the same rows used to fit the model, not on held-out data
print('Test accuracy =', round(accuracy_score(list(partidas_15min_blue5['blueWins']), prediction), 3))

Real: [0, 1, 0, 0, 1, 1, 0, 1, 1, 1]
Prediction: [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]
Test accuracy = 0.791
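
Both accuracy figures are computed on the same rows that were used to fit the models, so they may overstate how well the models would do on unseen matches. One way to get a held-out estimate would be to set aside part of the rows before fitting. The helper below is a sketch of such a split using only the standard library; `split_indices`, the 20% test fraction, and the seed are assumptions, not part of the notebook.

```python
import random


def split_indices(n_rows, test_fraction=0.2, seed=0):
    """Return (train, test) lists of row positions with no overlap."""
    positions = list(range(n_rows))
    random.Random(seed).shuffle(positions)   # reproducible shuffle
    n_test = int(n_rows * test_fraction)
    return positions[n_test:], positions[:n_test]


# 26834 is the number of observations in the 15-minute dataset
train_idx, test_idx = split_indices(26834)
print(len(train_idx), len(test_idx))
```

With the row positions in hand, the training rows (selected with `.iloc[train_idx]`) could be passed to `regress`, and the remaining rows (selected with `.iloc[test_idx]`) to `results.predict`, so that the reported accuracy reflects matches the model has not seen.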
