# Coursework for the Machine Learning Applications Course

Author: Fábio Demo da Rosa

## Nearest Neighbors

The principle behind nearest neighbors is to find a predefined number of training samples closest in distance to a new point and to predict its value from them. The number of samples can be a user-defined constant (k-nearest neighbor learning) or vary with the local density of points (radius-based neighbor learning).

The distance can be any metric, for example the Euclidean distance.

Neighbor-based methods are known as non-generalizing machine learning methods, since they simply remember all of their training data instead of building a general internal model.

Despite their simplicity, nearest neighbors have been successful in a wide range of classification and regression problems, such as handwritten digit recognition and satellite image classification.
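
To make the idea concrete, here is a minimal brute-force sketch of a k-nearest-neighbor lookup with the Euclidean distance. The helper `k_nearest` and the toy arrays are made up for illustration; this is not how scikit-learn implements the search internally.

```python
import numpy as np

def k_nearest(X_train, query, k):
    """Return indices and distances of the k training rows closest to `query`."""
    # Euclidean distance from the query to every stored sample
    distances = np.linalg.norm(X_train - query, axis=1)
    # Stable argsort orders the indices from nearest to farthest
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]

# Toy data (illustrative only): four 2-D points with numeric targets
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]])
y_train = np.array([10.0, 20.0, 30.0, 40.0])

idx, dist = k_nearest(X_train, np.array([0.0, 0.0]), k=2)
# k-NN regression predicts the mean target of the retrieved neighbors
prediction = y_train[idx].mean()
```

Nothing is learned ahead of time: all the work happens at query time, which is why these methods are described as simply remembering the training data.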

In [6]:
from IPython.display import Image

Image(url='https://scikit-learn.org/stable/_images/sphx_glr_plot_classification_002.png')

## Case-Based Reasoning
It is a form of Artificial Intelligence (AI) that solves new problems based on similar past problems.

When a new problem reminds us of an earlier one, we tend to reuse the old solution as is or, in some cases, adapt it to the new problem using the knowledge we retained.

It consists of four main processes (the exact breakdown depends on the literature):
1. Retrieval - Retrieve from memory the cases (problem descriptions) relevant to the new problem;
2. Reuse - Apply the solution of the retrieved cases to the new problem;
3. Revision - Once the solution has been mapped to the new problem, test it and revise it if necessary;
4. Retention - Once the solution has been adapted to the new problem, store it in memory as a new experience.
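
The four processes above can be sketched as a tiny loop. Everything in this example is hypothetical: the `Case` class, the action labels `"play_low"` and `"play_high"`, and the squared-distance similarity are assumptions chosen for brevity, not part of the case base used later in this notebook.

```python
from dataclasses import dataclass

@dataclass
class Case:
    problem: tuple   # numeric description of the situation
    solution: str    # action taken in that situation

def retrieve(case_base, problem):
    """Retrieval: return the stored case whose problem is closest to the new one."""
    def squared_distance(case):
        return sum((a - b) ** 2 for a, b in zip(case.problem, problem))
    return min(case_base, key=squared_distance)

def reuse(retrieved):
    """Reuse: propose the retrieved solution for the new problem."""
    return retrieved.solution

def revise(proposed, feedback=None):
    """Revision: keep the proposal unless an evaluation supplies a correction."""
    return feedback if feedback is not None else proposed

def retain(case_base, problem, solution):
    """Retention: store the solved problem as a new case."""
    case_base.append(Case(problem, solution))

# Hypothetical case base with two stored experiences
case_base = [Case((52, 8, 2), "play_low"), Case((12, 7, 1), "play_high")]
new_problem = (50, 12, 3)

proposed = reuse(retrieve(case_base, new_problem))
final = revise(proposed)
retain(case_base, new_problem, final)
```

In this sketch the retrieval step is exactly a 1-nearest-neighbor search, which is the link between Case-Based Reasoning and the neighbor methods described earlier.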

In [9]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing

Encoding the suits as numbers so that they can be used in the nearest-neighbor search

In [10]:
# Integer code assigned to each suit name
NAIPES = {'ESPADAS': 1, 'OURO': 2, 'BASTOS': 3, 'COPAS': 4}

def codificar_naipes(x):
    """Return the integer code of a suit name, or None for any other value."""
    return NAIPES.get(x)

DataFrame preprocessing

In [11]:
# Load the case base; missing values become 0
df = pd.read_csv('dbtrucoimitacao_maos.csv', index_col='idMao').fillna(0)
colunas_string = [
    'naipeCartaAltaRobo', 'naipeCartaMediaRobo', 'naipeCartaBaixaRobo',
    'naipeCartaAltaHumano', 'naipeCartaMediaHumano', 'naipeCartaBaixaHumano',
    'naipePrimeiraCartaRobo', 'naipePrimeiraCartaHumano',
    'naipeSegundaCartaRobo', 'naipeSegundaCartaHumano',
    'naipeTerceiraCartaRobo', 'naipeTerceiraCartaHumano',
]
colunas_int = [col for col in df.columns if col not in colunas_string]
# Numeric columns: cast to int and take absolute values so no negative codes remain
df[colunas_int] = df[colunas_int].astype('int').apply(abs)
# Suit columns: replace each suit name with its code, then cast to int
df.replace({'ESPADAS': '1', 'OURO': '2', 'BASTOS': '3', 'COPAS': '4'}, inplace=True)
df[colunas_string] = df[colunas_string].astype('int')
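
A side note on the encoding, offered as a sketch rather than as a correction: integer codes make some suits look closer to each other than others once a distance is computed. The helpers below are hypothetical and only compare the distances produced by the integer codes with those produced by a one-hot encoding.

```python
import numpy as np

SUIT_CODES = {"ESPADAS": 1, "OURO": 2, "BASTOS": 3, "COPAS": 4}

def ordinal_distance(a, b):
    """Distance between two suits under the integer encoding."""
    return abs(SUIT_CODES[a] - SUIT_CODES[b])

def one_hot(suit):
    """Indicator vector with a single 1 in the position of the suit."""
    vector = np.zeros(len(SUIT_CODES))
    vector[SUIT_CODES[suit] - 1] = 1.0
    return vector

def one_hot_distance(a, b):
    """Euclidean distance between the one-hot vectors of two suits."""
    return float(np.linalg.norm(one_hot(a) - one_hot(b)))

# With integer codes, ESPADAS is three times farther from COPAS than from OURO
d_ord_near = ordinal_distance("ESPADAS", "OURO")
d_ord_far = ordinal_distance("ESPADAS", "COPAS")

# With one-hot vectors, every pair of distinct suits is equally far apart
d_hot_near = one_hot_distance("ESPADAS", "OURO")
d_hot_far = one_hot_distance("ESPADAS", "COPAS")
```

In pandas, `pd.get_dummies` produces this kind of indicator column. Whether the extra columns are worth it depends on how much the suit should weigh in the similarity.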

Checking the suit encoding

In [12]:
df

idMao,jogadorMao,cartaAltaRobo,cartaMediaRobo,cartaBaixaRobo,cartaAltaHumano,cartaMediaHumano,cartaBaixaHumano,primeiraCartaRobo,primeiraCartaHumano,segundaCartaRobo,...,naipeTerceiraCartaHumano,quantidadeChamadasHumano,quantidadeChamadasRobo,qualidadeMaoRobo,qualidadeMaoHumano,quantidadeChamadasEnvidoRobo,quantidadeChamadasEnvidoHumano,saldoTruco,saldoEnvido,saldoFlor
1,2,52,8,2,50,16,16,52,16,2,...,0,1,0,62,82,1,0,1,1,0
2,1,24,16,4,7,6,4,16,4,4,...,1,0,1,44,17,1,0,1,1,0
3,2,12,8,4,42,16,1,4,1,8,...,0,1,0,24,59,0,1,1,5,0
4,1,24,4,1,24,8,6,24,6,1,...,0,1,0,29,38,1,0,1,1,0
5,2,50,12,3,7,6,4,12,4,3,...,1,0,1,65,17,0,1,1,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7713,1,16,12,8,42,6,4,46,46,46,...,3,0,0,0,0,0,0,3,0,0
7714,2,12,7,1,0,0,0,46,46,46,...,0,0,0,0,0,0,0,1,0,0
7716,2,16,8,3,0,0,0,3,16,46,...,0,0,0,0,0,0,0,1,1,0
7717,1,52,24,8,42,24,12,8,42,52,...,2,0,0,0,0,0,0,3,1,0


In [13]:
for column in df.columns:
    print(column)

jogadorMao
cartaAltaRobo
cartaMediaRobo
cartaBaixaRobo
cartaAltaHumano
cartaMediaHumano
cartaBaixaHumano
primeiraCartaRobo
primeiraCartaHumano
segundaCartaRobo
segundaCartaHumano
terceiraCartaRobo
terceiraCartaHumano
ganhadorPrimeiraRodada
ganhadorSegundaRodada
ganhadorTerceiraRodada
quemPediuEnvido
quemPediuFaltaEnvido
quemPediuRealEnvido
pontosEnvidoRobo
pontosEnvidoHumano
quemNegouEnvido
quemGanhouEnvido
tentosEnvido
quemFlor
quemContraFlor
quemContraFlorResto
quemNegouFlor
pontosFlorRobo
pontosFlorHumano
quemGanhouFlor
tentosFlor
quemEscondeuPontosEnvido
quemEscondeuPontosFlor
quemTruco
quandoTruco
quemRetruco
quandoRetruco
quemValeQuatro
quandoValeQuatro
quemNegouTruco
quemGanhouTruco
tentosTruco
tentosAnterioresRobo
tentosAnterioresHumano
tentosPosterioresRobo
tentosPosterioresHumano
roboMentiuEnvido
humanoMentiuEnvido
roboMentiuFlor
humanoMentiuFlor
roboMentiuTruco
humanoMentiuTruco
quemBaralho
quandoBaralho
quemContraFlorFalta
quemEnvidoEnvido
quemFlorFlor
quandoCartaVirada
nai

In [14]:
df.primeiraCartaRobo

idMao
1       52
2       16
3        4
4       24
5       12
        ..
7713    46
7714    46
7716     3
7717     8
7718    46
Name: primeiraCartaRobo, Length: 27515, dtype: int64

Summary statistics for all columns of the case base

In [15]:
df.describe().T

,count,mean,std,min,25%,50%,75%,max
jogadorMao,27515.0,1.496893,0.499999,1.0,1.0,1.0,2.0,2.0
cartaAltaRobo,27515.0,23.592913,15.707426,1.0,8.0,24.0,40.0,52.0
cartaMediaRobo,27515.0,9.255642,7.850880,1.0,4.0,7.0,12.0,50.0
cartaBaixaRobo,27515.0,3.603453,3.323414,1.0,1.0,3.0,6.0,42.0
cartaAltaHumano,27515.0,9.995602,15.792326,0.0,0.0,0.0,16.0,52.0
...,...,...,...,...,...,...,...,...
quantidadeChamadasEnvidoRobo,27515.0,0.004870,0.069617,0.0,0.0,0.0,0.0,1.0
quantidadeChamadasEnvidoHumano,27515.0,0.005488,0.073878,0.0,0.0,0.0,0.0,1.0
saldoTruco,27515.0,1.646338,0.861692,0.0,1.0,1.0,2.0,4.0
saldoEnvido,27515.0,1.019444,1.465042,0.0,0.0,1.0,1.0,24.0


## sklearn
Evaluating regression metrics on the dataset

In [16]:
from sklearn.model_selection import train_test_split

In [17]:
y = df['primeiraCartaRobo']
X = df.drop(['primeiraCartaRobo'], axis=1)

SEED = 42
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=SEED)

In [18]:
len(X), len(X_train), len(X_test)

(27515, 20636, 6879)

In [19]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(X_train)

X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

In [20]:
scaled_df = pd.DataFrame(X_train, columns=X.columns)
scaled_df.describe().T

,count,mean,std,min,25%,50%,75%,max
jogadorMao,20636.0,1.909265e-16,1.000024,-0.991507,-0.991507,-0.991507,1.008565,1.008565
cartaAltaRobo,20636.0,4.992668e-18,1.000024,-1.437715,-0.992285,0.025840,1.043966,1.807560
cartaMediaRobo,20636.0,-7.919404e-18,1.000024,-1.045525,-0.665515,-0.285505,0.347845,5.161306
cartaBaixaRobo,20636.0,5.440287e-17,1.000024,-0.780186,-0.780186,-0.182942,0.712926,11.463332
cartaAltaHumano,20636.0,3.133330e-17,1.000024,-0.632991,-0.632991,-0.632991,0.380787,2.661786
...,...,...,...,...,...,...,...,...
quantidadeChamadasEnvidoRobo,20636.0,-3.305491e-17,1.000024,-0.070132,-0.070132,-0.070132,-0.070132,14.258921
quantidadeChamadasEnvidoHumano,20636.0,2.892304e-17,1.000024,-0.075187,-0.075187,-0.075187,-0.075187,13.300246
saldoTruco,20636.0,-7.213544e-17,1.000024,-1.913829,-0.752111,-0.752111,0.409608,2.733045
saldoEnvido,20636.0,6.335524e-17,1.000024,-0.692092,-0.692092,-0.016628,-0.016628,15.519033


In [21]:
from sklearn.neighbors import KNeighborsRegressor

In [22]:
regressor = KNeighborsRegressor(n_neighbors=100)
regressor.fit(X_train, y_train)

KNeighborsRegressor(n_neighbors=100)

In [23]:
y_pred = regressor.predict(X_test)

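Checking the Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE):
* MAE is the mean of the absolute differences between predicted and observed values, expressed in the units of the target;
* MSE is the mean of the squared differences, so it penalizes large errors more heavily and reflects both the bias and the variance of the model;
* RMSE is the square root of the MSE, which brings the error back to the units of the target and makes it easier to interpret.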
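
The three definitions above can be checked by hand on a small example. The arrays below are made-up values, not taken from the case base.

```python
import numpy as np

y_true = np.array([3.0, 7.0, 16.0, 52.0])   # made-up observed values
y_hat = np.array([4.0, 7.0, 12.0, 50.0])    # made-up predictions

errors = y_true - y_hat                     # [-1, 0, 4, 2]
mae = np.mean(np.abs(errors))               # (1 + 0 + 4 + 2) / 4
mse = np.mean(errors ** 2)                  # (1 + 0 + 16 + 4) / 4
rmse = np.sqrt(mse)                         # back in the units of the target
```

Because the errors are squared before averaging, the single error of 4 contributes 16 of the 21 units in the MSE sum, which is why the RMSE comes out larger than the MAE here.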

In [24]:
from sklearn.metrics import mean_absolute_error, mean_squared_error

mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)  # RMSE is the square root of the MSE; the squared=False argument is deprecated in recent scikit-learn releases

print(f'mae: {mae}')
print(f'mse: {mse}')
print(f'rmse: {rmse}')

mae: 5.0167030091583085
mse: 59.356178107283036
rmse: 7.7042960812317585


Coefficient of determination (R²) of the prediction

In [25]:
regressor.score(X_test, y_test)

0.7525823904494113
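
For reference, the coefficient of determination compares the squared error of the model with the squared error of always predicting the mean. The function below is a hand-written sketch of that definition (the name `r2_manual` is hypothetical), not scikit-learn's code.

```python
import numpy as np

def r2_manual(y_true, y_hat):
    """R² = 1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = np.sum((y_true - y_hat) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

y_true = np.array([1.0, 2.0, 3.0, 4.0])

# Perfect predictions leave no residual error
perfect = r2_manual(y_true, y_true)
# Always predicting the mean explains none of the variance
baseline = r2_manual(y_true, np.full(4, y_true.mean()))
```

As a rough sanity check on the numbers in this notebook: the MSE is about 59.36 and the standard deviation of the target over the whole dataset is about 15.46, so the variance is about 238.9 and 1 - 59.36 / 238.9 is approximately 0.75, in line with the reported score. The match is only approximate because the score is computed on the test split alone.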

In [26]:
y.describe()

count    27515.000000
mean        14.115900
std         15.455371
min          1.000000
25%          3.000000
50%          7.000000
75%         16.000000
max         52.000000
Name: primeiraCartaRobo, dtype: float64

## Testing with cases from the case base itself

In [27]:
X

idMao,jogadorMao,cartaAltaRobo,cartaMediaRobo,cartaBaixaRobo,cartaAltaHumano,cartaMediaHumano,cartaBaixaHumano,primeiraCartaHumano,segundaCartaRobo,segundaCartaHumano,...,naipeTerceiraCartaHumano,quantidadeChamadasHumano,quantidadeChamadasRobo,qualidadeMaoRobo,qualidadeMaoHumano,quantidadeChamadasEnvidoRobo,quantidadeChamadasEnvidoHumano,saldoTruco,saldoEnvido,saldoFlor
1,2,52,8,2,50,16,16,16,2,16,...,0,1,0,62,82,1,0,1,1,0
2,1,24,16,4,7,6,4,4,4,7,...,1,0,1,44,17,1,0,1,1,0
3,2,12,8,4,42,16,1,1,8,16,...,0,1,0,24,59,0,1,1,5,0
4,1,24,4,1,24,8,6,6,1,8,...,0,1,0,29,38,1,0,1,1,0
5,2,50,12,3,7,6,4,4,3,6,...,1,0,1,65,17,0,1,1,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
7713,1,16,12,8,42,6,4,46,46,46,...,3,0,0,0,0,0,0,3,0,0
7714,2,12,7,1,0,0,0,46,46,46,...,0,0,0,0,0,0,0,1,0,0
7716,2,16,8,3,0,0,0,16,46,46,...,0,0,0,0,0,0,0,1,1,0
7717,1,52,24,8,42,24,12,42,52,24,...,2,0,0,0,0,0,0,3,1,0


In [28]:
X.iloc[0]

jogadorMao                         2
cartaAltaRobo                     52
cartaMediaRobo                     8
cartaBaixaRobo                     2
cartaAltaHumano                   50
                                  ..
quantidadeChamadasEnvidoRobo       1
quantidadeChamadasEnvidoHumano     0
saldoTruco                         1
saldoEnvido                        1
saldoFlor                          0
Name: 1, Length: 79, dtype: int64

Converting to a NumPy array and reshaping so that it can be passed to Nearest Neighbors

In [29]:
X.iloc[0].to_numpy()

array([  2,  52,   8,   2,  50,  16,  16,  16,   2,  16,  15,   1,   1,
         2, 130,   1,   0,   0,   5,   2,   2,   1,   1,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   2,   2,   0,   0,   0,   0,
         1,   2,   1,   2,   0,   3,   1,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   1,   3,   4,   3,   2,   3,   1,
         4,   4,   2,   0,   0,   1,   0,  62,  82,   1,   0,   1,   1,
         0])

In [30]:
from sklearn.neighbors import NearestNeighbors
neigh = NearestNeighbors(n_neighbors=20)
neigh.fit(X.values)
print(neigh.kneighbors([X.iloc[0]]))

(array([[  0.        ,  68.9710084 ,  69.75672011,  75.07329752,
         75.07995738,  76.28892449,  76.39371702,  78.83527129,
         79.7684148 ,  80.75270893,  81.38181615,  85.31705574,
         86.43494664,  90.76342876,  93.42376571,  98.07140256,
         99.3680029 ,  99.74968672, 100.44899203, 103.82677882]]), array([[  0,   2,  24,  19,   3, 234, 398, 246,  37, 258,   6, 254, 252,
        294, 305, 366, 411, 391, 224,  50]]))


In [31]:
# Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
teste = X.iloc[0].to_numpy().reshape(1, -1)
teste
neigh.kneighbors(teste, return_distance=False)

array([[  0,   2,  24,  19,   3, 234, 398, 246,  37, 258,   6, 254, 252,
        294, 305, 366, 411, 391, 224,  50]])
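
The reshape matters because scikit-learn estimators expect a 2-D array with one row per sample. A minimal sketch with made-up values shows the two shapes mentioned in the comment above:

```python
import numpy as np

row = np.array([2, 52, 8, 2])      # one sample with four features (made-up values)

as_sample = row.reshape(1, -1)     # shape (1, 4): one sample, four features
as_feature = row.reshape(-1, 1)    # shape (4, 1): four samples, one feature
```

A single case from the base is one sample, so `reshape(1, -1)` is the form to use for a query. Wrapping the row in a list, as in `[X.iloc[0]]`, gives the same 2-D layout.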

In [32]:
teste

array([[  2,  52,   8,   2,  50,  16,  16,  16,   2,  16,  15,   1,   1,
          2, 130,   1,   0,   0,   5,   2,   2,   1,   1,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   2,   2,   0,   0,   0,   0,
          1,   2,   1,   2,   0,   3,   1,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   1,   3,   4,   3,   2,   3,   1,
          4,   4,   2,   0,   0,   1,   0,  62,  82,   1,   0,   1,   1,
          0]])

In [33]:
X.iloc[21].to_numpy().reshape(1, -1)

array([[ 2, 50,  6,  3, 42,  6,  1,  1,  6, 42, 46,  1,  1,  1,  1,  2,
         0,  0,  6,  7,  1,  2,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  2,  2,  0,  0,  0,  0,  0,  1,  2,  6,  7,  8,  8,  0,  0,
         0,  0,  0,  0,  2,  2,  0,  0,  0,  0,  3,  2,  1,  1,  4,  1,
         1,  1,  2,  1,  0,  0,  1,  0, 59, 49,  0,  1,  2,  1,  0]])

In [34]:
X.iloc[36].to_numpy().reshape(1, -1)

array([[  2,  16,   3,   2,  50,   8,   3,   3,  46,   1,  15,   1,   0,
        130, 130,   1,   0,   0,  31,   6,   0,   1,   2,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   2,   1,   0,   0,   0,   0,
          1,   2,   1,   6,   9,   8,  10,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   1,   4,   4,   3,   4,   3,   4,
          3,   0,   0,   0,   0,   1,   0,  21,  61,   1,   0,   1,   2,
          0]])

In [35]:
X.iloc[90].to_numpy().reshape(1, -1)

array([[ 1, 40, 16,  8,  8,  3,  2,  3, 16,  1, 46,  1,  1,  1,  1,  1,
         0,  2, 29, 31,  0,  1,  5,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  1,  1, 10,  7, 16,  7,  0,  0,
         0,  0,  0,  0,  2,  1,  0,  0,  0,  0,  2,  2,  1,  3,  3,  3,
         1,  3,  2,  0,  0,  0,  0,  0, 64, 13,  1,  0,  1,  5,  0]])

Selecting the truco columns of interest

In [36]:
colunas_truco = ['jogadorMao', 'cartaAltaRobo', 'cartaMediaRobo', 'cartaBaixaRobo', 'ganhadorPrimeiraRodada', 'ganhadorSegundaRodada', 'ganhadorTerceiraRodada', 'quemFlor', 'quemContraFlor', 'pontosFlorRobo', 'pontosFlorHumano', 'quemTruco', 'quandoTruco', 'naipeCartaAltaRobo', 'naipeCartaMediaRobo', 'naipeCartaBaixaRobo', 'qualidadeMaoRobo', 'ganhadorPrimeiraRodada', 'ganhadorSegundaRodada', 'ganhadorTerceiraRodada']
len(colunas_truco)

20

A ball tree recursively divides the data into nodes defined by a centroid $C$ and radius $r$, such that each point in the node lies within the hyper-sphere defined by $r$ and $C$. The number of candidate points for a neighbor search is reduced through use of the triangle inequality:

$$|x + y| \leq |x| + |y|$$
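
One way to read the triangle inequality in this setting: for any point $p$ inside a ball with centroid $C$ and radius $r$, the distance from a query $q$ to $p$ is at least $d(q, C) - r$. The sketch below uses that bound to decide whether a whole node can be skipped. The function names are hypothetical and this is a simplified illustration, not scikit-learn's ball tree code.

```python
import numpy as np

def min_distance_to_ball(query, centroid, radius):
    """Lower bound on the distance from `query` to any point inside the ball."""
    return max(0.0, float(np.linalg.norm(query - centroid)) - radius)

def can_prune(query, centroid, radius, best_so_far):
    """A node can be skipped when even its closest possible point is no better."""
    return min_distance_to_ball(query, centroid, radius) >= best_so_far

query = np.array([0.0, 0.0])
centroid = np.array([10.0, 0.0])

# Bound is 10 - 2 = 8, which cannot beat a neighbor already found at distance 5
far_node = can_prune(query, centroid, radius=2.0, best_so_far=5.0)
# Bound is 10 - 7 = 3, so the node may still contain a closer point
near_node = can_prune(query, centroid, radius=7.0, best_so_far=5.0)
```

A single distance computation to the centroid can therefore rule out every point in the node, which is where the savings over a brute-force search come from.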

In [37]:
nbrs = NearestNeighbors(n_neighbors=100, algorithm='ball_tree').fit(df)
# distances, indices = nbrs.kneighbors(df)

In [38]:
# indices

In [39]:
# distances

In [40]:
teste_distancia, teste_indices = nbrs.kneighbors(df.iloc[0].to_numpy().reshape(1, -1))



In [41]:
teste_distancia

array([[  0.        ,  78.49840763,  80.13114251,  81.26499862,
         83.6600263 ,  84.02975663,  84.53993139,  86.06392973,
         87.52713865,  88.15894736,  92.93546148,  96.45724441,
         97.44742172, 100.11992809, 101.75460678, 107.48953437,
        109.43034314, 109.66312051, 110.79259903, 111.26544837,
        118.77289253, 128.63514294, 141.36477638, 142.88806808,
        144.70659971, 146.28055236, 146.47866739, 147.6854766 ,
        147.8140724 , 147.92227689, 148.3981132 , 148.74810923,
        148.81532179, 148.85563476, 148.95301273, 149.08051516,
        149.12075644, 149.57941035, 149.71305888, 150.07331542,
        150.38949431, 150.43935655, 150.46261994, 150.48920227,
        151.01655538, 151.13570061, 151.49257408, 151.55856954,
        151.75308893, 151.79920948, 151.84202317, 151.85190154,
        152.26293049, 152.49590158, 152.53851973, 152.64009958,
        152.82669924, 152.95097254, 152.97058541, 153.03594349,
        153.29709717, 153.30035877, 153.

In [42]:
teste_indices

array([[  0,  24,   3, 234, 246,   2,  37,   6,  19, 398, 258, 254, 252,
        305, 294, 366, 391, 224, 411,  50, 245, 142, 337, 312, 110, 265,
        147, 394, 392, 400, 395, 180, 317,  59,  43, 335, 413, 362, 401,
         76, 233, 226,  21,  55, 299,  35,  11, 403,  86, 427, 278, 420,
         15, 397,  22, 225,  18, 105, 262, 396, 185, 170, 289,  67, 140,
        339, 318, 311, 189, 220,  69,  38,  66, 173, 404, 422, 347, 277,
        333, 139, 178,  40, 443, 216, 124, 353, 431,  33, 163,  68, 283,
        250, 308,  32, 118, 203,  97, 399, 344,  62]])

In [43]:
# Build a query case with every column set to 0, then fill in the known fields
teste_registro = pd.Series(0, index=df.columns)
teste_registro['jogadorMao'] = 1
teste_registro['cartaAltaRobo'] = 52
teste_registro['cartaMediaRobo'] = 24
teste_registro['cartaBaixaRobo'] = 12
teste_registro['ganhadorPrimeiraRodada'] = 2
teste_registro['ganhadorSegundaRodada'] = 2
teste_registro['ganhadorTerceiraRodada'] = 2
# Retrieve the 100 cases closest to the query
teste_distancia, teste_indices = nbrs.kneighbors(teste_registro.to_numpy().reshape(1, -1))



In [44]:
teste_distancia

array([[28.75760769, 29.94995826, 32.17141588, 32.68026928, 32.71085447,
        33.04542328, 33.21144381, 33.28663395, 33.28663395, 34.35112807,
        34.64101615, 34.78505426, 34.89985673, 35.        , 35.19943181,
        35.34119409, 36.05551275, 36.41428291, 37.        , 37.21558813,
        37.21558813, 37.42993454, 37.42993454, 37.69615365, 37.94733192,
        38.23610859, 38.41874542, 38.41874542, 38.52272057, 38.78143886,
        38.88444419, 38.91015292, 38.98717738, 39.        , 39.0256326 ,
        39.05124838, 39.17907605, 39.33192088, 39.34463115, 39.66106403,
        39.74921383, 39.83716857, 39.96248241, 40.        , 40.03748244,
        40.04996879, 40.22437072, 40.22437072, 40.42276586, 40.43513324,
        40.58324778, 40.6571027 , 40.73082371, 40.77989701, 40.91454509,
        41.        , 41.0487515 , 41.14608122, 41.1703777 , 41.18252056,
        41.24318125, 41.25530269, 41.32795664, 41.37632173, 41.40048309,
        41.44876355, 41.6533312 , 41.66533331, 41.6

In [45]:
teste_indices

array([[24184, 22023,  7103,  1703, 13568,  5241, 22890,  5814,  1596,
        15072, 18359, 15230, 10280,  9878, 22727, 18968, 26532,  9236,
        20352, 21884,  9527, 15188, 24391,  3812,  3604, 18259, 18854,
         1105,  3294, 16934,  6495, 19854,  3856,  6193,  4554, 22370,
        17633, 16229, 13322,  7333, 11139, 26016, 24048, 15528, 21871,
         3343,  4161, 23130, 15181,  8280,  8357, 14773, 23859,  6689,
         4096, 13711, 10557, 14826,  1309,  4593, 18550, 12419, 14771,
         7276,  3883,  7832,  7655,  1684, 17402, 21156,  9764,  4059,
        17458, 16988, 14284, 19425,  7962, 20520,  3210, 10893, 15630,
         5832, 13326,  1640,  4189,  8307,  9443, 13123,  6191,  7694,
        22951, 10139, 12460, 13645, 20415, 20991, 13164, 11382, 18301,
        24837]])

In [46]:
teste_registro.to_numpy().reshape(1, -1)

array([[ 1, 52, 24, 12,  0,  0,  0,  0,  0,  0,  0,  0,  0,  2,  2,  2,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0]])

In [47]:
df.iloc[2380].to_numpy().reshape(1, -1)

array([[ 2, 52,  6,  1, 24,  8,  1,  1,  8, 52, 24,  6,  1,  2,  1,  1,
         0,  0,  0, 25,  4,  0,  0,  0,  1,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  2,  2,  0,  0,  0,  0,  0,  1,  2, 10, 17, 15, 17,  0,
         0,  0,  0,  0,  0,  2,  3,  0,  0,  0,  0,  1,  1,  1,  2,  1,
         4,  1,  1,  1,  2,  1,  0,  0,  0,  0,  0,  0,  0,  2,  0,  0]])

In [48]:
df['ganhadorPrimeiraRodada'].iloc[553]

2

In [49]:
pd.DataFrame(teste_registro).T

,jogadorMao,cartaAltaRobo,cartaMediaRobo,cartaBaixaRobo,cartaAltaHumano,cartaMediaHumano,cartaBaixaHumano,primeiraCartaRobo,primeiraCartaHumano,segundaCartaRobo,...,naipeTerceiraCartaHumano,quantidadeChamadasHumano,quantidadeChamadasRobo,qualidadeMaoRobo,qualidadeMaoHumano,quantidadeChamadasEnvidoRobo,quantidadeChamadasEnvidoHumano,saldoTruco,saldoEnvido,saldoFlor
0,1,52,24,12,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [50]:
df.iloc[teste_indices.tolist()[0]]

idMao,jogadorMao,cartaAltaRobo,cartaMediaRobo,cartaBaixaRobo,cartaAltaHumano,cartaMediaHumano,cartaBaixaHumano,primeiraCartaRobo,primeiraCartaHumano,segundaCartaRobo,...,naipeTerceiraCartaHumano,quantidadeChamadasHumano,quantidadeChamadasRobo,qualidadeMaoRobo,qualidadeMaoHumano,quantidadeChamadasEnvidoRobo,quantidadeChamadasEnvidoHumano,saldoTruco,saldoEnvido,saldoFlor
4223,1,52,16,2,0,0,0,2,0,15,...,0,0,0,0,0,0,0,1,0,0
23126,1,42,16,3,0,0,0,3,0,15,...,0,0,0,0,0,0,0,1,1,0
7275,1,42,12,2,0,0,0,12,0,15,...,0,0,0,0,0,0,0,1,0,0
1714,2,52,8,8,4,1,1,8,1,8,...,0,0,0,0,0,0,0,1,1,0
14080,1,40,8,3,0,0,0,3,1,10,...,0,0,0,0,0,0,0,1,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22032,1,52,6,3,0,0,0,3,3,6,...,0,0,0,0,0,0,0,2,1,0
13654,1,24,6,1,0,0,0,6,0,15,...,0,0,0,0,0,0,0,1,0,0
11777,1,24,7,2,0,0,0,7,0,15,...,0,0,0,0,0,0,0,1,1,0
19137,1,42,12,6,0,0,0,15,0,15,...,0,0,0,0,0,0,0,2,0,0


In [51]:
df.ganhadorSegundaRodada.iloc[teste_indices.tolist()[0]]

idMao
4223     2
23126    2
7275     2
1714     1
14080    2
        ..
22032    2
13654    2
11777    2
19137    2
4900     2
Name: ganhadorSegundaRodada, Length: 100, dtype: int64

In [52]:
df.iloc[teste_indices.tolist()[0]].ganhadorSegundaRodada == 2

idMao
4223      True
23126     True
7275      True
1714     False
14080     True
         ...  
22032     True
13654     True
11777     True
19137     True
4900      True
Name: ganhadorSegundaRodada, Length: 100, dtype: bool

In [53]:
df.iloc[teste_indices.tolist()[0]].ganhadorPrimeiraRodada == 2

idMao
4223     False
23126    False
7275     False
1714     False
14080    False
         ...  
22032    False
13654    False
11777    False
19137     True
4900      True
Name: ganhadorPrimeiraRodada, Length: 100, dtype: bool

In [56]:
df.iloc[teste_indices.tolist()[0]].ganhadorTerceiraRodada == 2

idMao
4223      True
23126     True
7275      True
1714     False
14080     True
         ...  
22032     True
13654     True
11777     True
19137     True
4900      True
Name: ganhadorTerceiraRodada, Length: 100, dtype: bool

In [57]:
teste = df.iloc[teste_indices.tolist()[0]]
teste.ganhadorPrimeiraRodada

idMao
4223     1
23126    1
7275     1
1714     1
14080    1
        ..
22032    0
13654    1
11777    1
19137    2
4900     2
Name: ganhadorPrimeiraRodada, Length: 100, dtype: int64

In [58]:
# Keep the cases in which winner code 2 appears in at least two of the three rounds
rodadas = ['ganhadorPrimeiraRodada', 'ganhadorSegundaRodada', 'ganhadorTerceiraRodada']
teste[(teste[rodadas] == 2).sum(axis=1) >= 2]

idMao,jogadorMao,cartaAltaRobo,cartaMediaRobo,cartaBaixaRobo,cartaAltaHumano,cartaMediaHumano,cartaBaixaHumano,primeiraCartaRobo,primeiraCartaHumano,segundaCartaRobo,...,naipeTerceiraCartaHumano,quantidadeChamadasHumano,quantidadeChamadasRobo,qualidadeMaoRobo,qualidadeMaoHumano,quantidadeChamadasEnvidoRobo,quantidadeChamadasEnvidoHumano,saldoTruco,saldoEnvido,saldoFlor
4223,1,52,16,2,0,0,0,2,0,15,...,0,0,0,0,0,0,0,1,0,0
23126,1,42,16,3,0,0,0,3,0,15,...,0,0,0,0,0,0,0,1,1,0
7275,1,42,12,2,0,0,0,12,0,15,...,0,0,0,0,0,0,0,1,0,0
14080,1,40,8,3,0,0,0,3,1,10,...,0,0,0,0,0,0,0,1,0,0
5328,1,50,3,3,0,0,0,3,0,15,...,0,0,0,0,0,0,0,1,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
22032,1,52,6,3,0,0,0,3,3,6,...,0,0,0,0,0,0,0,2,1,0
13654,1,24,6,1,0,0,0,6,0,15,...,0,0,0,0,0,0,0,1,0,0
11777,1,24,7,2,0,0,0,7,0,15,...,0,0,0,0,0,0,0,1,1,0
19137,1,42,12,6,0,0,0,15,0,15,...,0,0,0,0,0,0,0,2,0,0


In [146]:
teste[teste.primeiraCartaRobo == 3]

idMao,jogadorMao,cartaAltaRobo,cartaMediaRobo,cartaBaixaRobo,cartaAltaHumano,cartaMediaHumano,cartaBaixaHumano,primeiraCartaRobo,primeiraCartaHumano,segundaCartaRobo,...,naipeTerceiraCartaHumano,quantidadeChamadasHumano,quantidadeChamadasRobo,qualidadeMaoRobo,qualidadeMaoHumano,quantidadeChamadasEnvidoRobo,quantidadeChamadasEnvidoHumano,saldoTruco,saldoEnvido,saldoFlor
23126,1,42,16,3,0,0,0,3,0,15,...,0,0,0,0,0,0,0,1,1,0
14080,1,40,8,3,0,0,0,3,1,10,...,0,0,0,0,0,0,0,1,0,0
5328,1,50,3,3,0,0,0,3,0,15,...,0,0,0,0,0,0,0,1,1,0
24069,1,50,3,3,0,0,0,3,3,10,...,0,0,0,0,0,0,0,2,1,0
10190,1,52,4,3,0,0,0,3,8,4,...,0,0,0,0,0,0,0,2,1,0
23891,1,50,8,3,0,0,0,3,2,10,...,0,0,0,0,0,0,0,2,1,0
3844,1,50,6,3,0,0,0,3,16,6,...,0,0,0,0,0,0,0,2,0,0
3306,1,40,6,3,0,0,0,3,3,6,...,0,0,0,0,0,0,0,2,0,0
4614,1,50,6,3,0,0,0,3,8,15,...,0,0,0,0,0,0,0,3,1,0
7510,1,52,4,3,0,0,0,3,7,4,...,0,0,0,0,0,0,0,3,0,0


In [145]:
teste.primeiraCartaRobo.value_counts()

3     20
2     14
1     13
15    12
4     12
6      9
8      8
7      6
16     5
12     1
Name: primeiraCartaRobo, dtype: int64

In [147]:
teste.segundaCartaRobo.value_counts()

15    57
10    14
7      8
8      5
6      5
3      3
12     3
16     3
4      2
Name: segundaCartaRobo, dtype: int64

In [122]:
teste.primeiraCartaRobo.value_counts().index.to_list()[0]

3
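
The last cell picks the most frequent first card among the retrieved neighbors. Read in Case-Based Reasoning terms, this looks like a reuse step by majority vote. The sketch below reproduces that selection with the standard library on a made-up list of neighbor actions. The function name is hypothetical.

```python
from collections import Counter

def most_common_action(actions):
    """Return the action that appears most often among the retrieved cases."""
    return Counter(actions).most_common(1)[0][0]

# Made-up first cards played in six retrieved cases
neighbor_first_cards = [3, 2, 3, 1, 3, 2]
chosen = most_common_action(neighbor_first_cards)
```

A majority vote ignores how close each neighbor is. Weighting the votes by the inverse of the distance would be one possible refinement, at the cost of one more parameter to tune.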