## Builda-v1

Example classification algorithms using the [Dota2 Games Results](https://archive.ics.uci.edu/ml/datasets/Dota2+Games+Results) dataset from the UCI Machine Learning Repository.

### Dataset Information
> https://archive.ics.uci.edu/ml/datasets/Dota2+Games+Results

Data Set Information:

Dota 2 is a popular computer game with two teams of 5 players. At the start of the game each player chooses a unique hero with different strengths and weaknesses. The dataset is reasonably sparse, as only 10 of 113 possible heroes are chosen in a given game. All games were played in a space of 2 hours on the 13th of August, 2016.

The data was collected using: [dota2ApiLoader.py](https://gist.github.com/da-steve101/1a7ae319448db431715bd75391a66e1b)

Attribute Information:

Each row of the dataset is a single game with the following features (in the order in the vector):
1. Team that won the game (1 or -1)
2. Cluster ID (related to location)
3. Game mode (e.g. All Pick)
4. Game type (e.g. Ranked)
5. Columns 5 to end: each element is an indicator for a hero. A value of 1 means that a player from team '1' played that hero, and -1 means that a player from the other team did. A hero can be selected by only one player in each game, so each row has five 1 values and five -1 values.
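
The row layout described above can be checked with a short sketch. The helper name `split_game_row` and the constants `N_META` and `N_HEROES` are hypothetical, the count of 113 hero columns is taken from the dataset description, and the example game is synthetic rather than a row from the dataset.

```python
import numpy as np

N_HEROES = 113  # hero indicator columns, per the dataset description
N_META = 4      # winner, cluster ID, game mode, game type


def split_game_row(row):
    """Split one game vector into labelled parts and check the hero picks."""
    row = np.asarray(row)
    if row.shape != (N_META + N_HEROES,):
        raise ValueError(f"expected {N_META + N_HEROES} values, got {row.shape}")
    winner, cluster_id, game_mode, game_type = (int(v) for v in row[:N_META])
    heroes = row[N_META:]
    if winner not in (1, -1):
        raise ValueError("winner must be 1 or -1")
    # Each team fields five distinct heroes: five +1 and five -1 entries.
    if (heroes == 1).sum() != 5 or (heroes == -1).sum() != 5:
        raise ValueError("each team must have exactly five heroes")
    return {
        "winner": winner,
        "cluster_id": cluster_id,
        "game_mode": game_mode,
        "game_type": game_type,
        # 0-based positions within the hero block, not hero IDs
        "team_pos": np.flatnonzero(heroes == 1),
        "team_neg": np.flatnonzero(heroes == -1),
    }


# Synthetic game for illustration only (not taken from the dataset).
example = np.zeros(N_META + N_HEROES, dtype=int)
example[:N_META] = [1, 152, 2, 2]
example[N_META + np.array([0, 10, 20, 30, 40])] = 1
example[N_META + np.array([5, 15, 25, 35, 45])] = -1

game = split_game_row(example)
print(game["winner"], game["team_pos"], game["team_neg"])
```

With 10 non-zero entries out of 113 hero columns, fewer than 9% of the hero indicators in any row are set, which is the sparsity the description refers to.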

The hero to id mapping can be found here: [heroes.json](https://github.com/kronusme/dota2-api/blob/master/data/heroes.json)
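
As a rough sketch of how that mapping could be used to label hero columns: the code below assumes that `heroes.json` has a top-level `heroes` list whose entries carry `id`, `name` and `localized_name`, and that the hero with id `k` occupies 0-based feature position `4 + k - 1`. Both points are assumptions to verify against the actual file and data. The helper `hero_names_by_column` is hypothetical, and only three entries are reproduced inline.

```python
import json

# Assumed excerpt of heroes.json (schema is an assumption, see above).
HEROES_JSON = """
{"heroes": [
  {"name": "antimage", "id": 1, "localized_name": "Anti-Mage"},
  {"name": "axe", "id": 2, "localized_name": "Axe"},
  {"name": "bane", "id": 3, "localized_name": "Bane"}
]}
"""

N_META = 4  # winner, cluster ID, game mode, game type precede the heroes


def hero_names_by_column(heroes_json):
    """Map 0-based column positions in a game row to hero display names.

    Assumption: the hero with id k sits at feature position N_META + k - 1.
    """
    heroes = json.loads(heroes_json)["heroes"]
    return {N_META + h["id"] - 1: h["localized_name"] for h in heroes}


column_to_hero = hero_names_by_column(HEROES_JSON)
print(column_to_hero)
```

A dictionary like this could be passed to `DataFrame.rename(columns=...)` once the CSV is loaded with integer column labels.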

```python
import pandas as pd
import numpy as np
import seaborn as sbn  # conventionally aliased as sns
import matplotlib.pyplot as mppp  # conventionally aliased as plt

from pandas.plotting import parallel_coordinates  # parallel-coordinates plots
from sklearn.model_selection import train_test_split  # splits data into training and test sets
from sklearn.tree import DecisionTreeClassifier, plot_tree  # decision trees
from sklearn import metrics

from sklearn.naive_bayes import GaussianNB

# Linear Discriminant Analysis (LinearDiscriminantAnalysis) and Quadratic Discriminant Analysis
# (QuadraticDiscriminantAnalysis) are two classic classifiers, with, as their names suggest,
# a linear and a quadratic decision surface, respectively.
#
# These classifiers are attractive because they have closed-form solutions that can be easily
# computed, are inherently multiclass, have proven to work well in practice, and have no
# hyperparameters to tune.
# > https://scikit-learn.org/stable/modules/lda_qda.html#lda-qda
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis

# sklearn.neighbors provides functionality for unsupervised and supervised neighbors-based learning methods.
# > https://scikit-learn.org/stable/modules/neighbors.html
from sklearn.neighbors import KNeighborsClassifier

# C-Support Vector Classification
# > https://scikit-learn.org/stable/modules/svm.html#svm-classification
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
```

```python
# Load the training CSV. The file has no header row, so without header=None
# pandas uses the first game as column labels (visible in the output below).
data = pd.read_csv('../../data/dota2Train.csv')
data.head(5)
```

| | -1 | 223 | 2 | 2.1 | 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | ... | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 0.100 | 0.101 | 0.102 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 152 | 2 | 2 | 0 | 0 | 0 | 1 | 0 | -1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 131 | 2 | 2 | 0 | 0 | 0 | 1 | 0 | -1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1 | 154 | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | ... | -1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | -1 | 171 | 2 | 3 | 0 | 0 | 0 | 0 | 0 | -1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1 | 122 | 2 | 3 | 0 | 1 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | -1 |

```python
data.describe()
```

| | -1 | 223 | 2 | 2.1 | 0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | ... | 0.93 | 0.94 | 0.95 | 0.96 | 0.97 | 0.98 | 0.99 | 0.100 | 0.101 | 0.102 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | ... | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 | 92649.0 |
| mean | 0.05305 | 175.863636 | 3.317586 | 2.384591 | -0.00163 | -0.000971 | 0.000691 | -0.000799 | -0.002008 | 0.003173 | ... | -0.001371 | -0.00095 | 0.000885 | 0.000594 | 0.0 | 0.001025 | 0.000648 | -0.000227 | -4.3e-05 | 0.000896 |
| std | 0.998597 | 35.65807 | 2.633081 | 0.486834 | 0.402006 | 0.467674 | 0.165053 | 0.355395 | 0.329349 | 0.483952 | ... | 0.535027 | 0.206113 | 0.283987 | 0.155941 | 0.0 | 0.220704 | 0.204167 | 0.168708 | 0.189869 | 0.139034 |
| min | -1.0 | 111.0 | 1.0 | 1.0 | -1.0 | -1.0 | -1.0 | -1.0 | -1.0 | -1.0 | ... | -1.0 | -1.0 | -1.0 | -1.0 | 0.0 | -1.0 | -1.0 | -1.0 | -1.0 | -1.0 |
| 25% | -1.0 | 152.0 | 2.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 50% | 1.0 | 156.0 | 2.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 75% | 1.0 | 223.0 | 2.0 | 3.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| max | 1.0 | 261.0 | 9.0 | 3.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | ... | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |

```python
# Count games per winning team; the first column holds the winner (1 or -1).
# groupby() needs a key, so the first column is passed explicitly.
data.groupby(data.columns[0]).size()
```