# Estimating the League of Starcraft 2 Players

## Overview:
StarCraft II is a military science fiction real-time strategy video game published by Blizzard Entertainment. In its ranked multiplayer mode, two players compete against one another in an online match for tactical and strategic dominance. Players are placed into one of seven leagues according to their skill level. Given gameplay data from a StarCraft II match, the goal is to predict the league of its players.
The target is the league; all other columns are the features used to predict it.

## Project Team:

### Akshay Gupte A20360699
### Arshad Shaik A20328656

## LOADING DATA (Preprocessed Version):

In [32]:
import numpy as np
import pandas as pd

df = pd.read_csv('./data/data-ml.csv')

## PRINTING DATA SHAPE:

In [33]:
Data = df.values
X = Data[:, 1:13]  # columns 1 to 12: the 12 input features
Y = Data[:, 0]     # column 0: the target league
print(X.shape)

(5339, 12)


### Scaling Input

In [34]:
from sklearn.preprocessing import scale
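# Standardize each feature column to zero mean and unit variance
X_scale = scale(X)
# These are the mean and std over all entries of the scaled matrix
print(X_scale.mean())
print(X_scale.std())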

-1.28649181102e-17
1.0
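Scaling the full matrix before cross-validation lets each held-out fold influence the scaling statistics. A common alternative is to put the scaler inside a pipeline so it is refit on the training part of every fold. The block below is a minimal sketch of that idea, not the notebook's method: it runs on synthetic data from `make_classification` as a stand-in for `X` and `Y`, and the names `X_demo`, `y_demo`, and `pipeline` are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the notebook's X and Y (assumption: 12 features, 7 classes).
X_demo, y_demo = make_classification(
    n_samples=700, n_features=12, n_informative=8,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)

# The scaler is part of the estimator, so it is refit on each training fold only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
fold_scores = cross_val_score(pipeline, X_demo, y_demo, scoring="neg_log_loss")

# The scorer returns negated log-loss, so flip the sign for reporting.
mean_log_loss = -fold_scores.mean()
print(mean_log_loss, fold_scores.std())
```

With the real data, `X_demo` and `y_demo` would be replaced by the unscaled `X` and `Y`.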


## PERFORMANCE:

### Log-Loss

### CLASSIFICATION BASELINE:
#### Total Instances = 5339
#### Prob. for League 1 = 0.026
#### Prob. for League 2 = 0.076
#### Prob. for League 3 = 0.100
#### Prob. for League 4 = 0.136
#### Prob. for League 5 = 0.227
#### Prob. for League 6 = 0.361
#### Prob. for League 7 = 0.073
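To turn these probabilities into a reference number on the same scale as the model scores, one option is to compute the log-loss of a predictor that always outputs them. Assuming the listed values are the class frequencies in the data, that expected log-loss equals the entropy of the distribution (natural logarithm, as in scikit-learn's `log_loss`). The sketch below computes it; the variable names are illustrative.

```python
import numpy as np

# Class probabilities as listed above for leagues 1 to 7.
priors = np.array([0.026, 0.076, 0.100, 0.136, 0.227, 0.361, 0.073])
# The rounded values sum to 0.999, so renormalize before use.
priors = priors / priors.sum()

# Expected log-loss of always predicting the priors = entropy of the priors.
baseline_log_loss = -np.sum(priors * np.log(priors))
print(round(baseline_log_loss, 2))
```

This comes out at about 1.69, so under this assumption a model needs a cross-validated log-loss below roughly that value to improve on the baseline.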

### Naive Bayes Evaluation:

In [35]:
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

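model_evaluate = GaussianNB()

# Cross-validated log-loss; the scorer returns negated values (higher is better)
scoring = cross_val_score(model_evaluate, X_scale, Y, scoring='neg_log_loss')

print(-scoring.mean())  # mean log-loss across folds
print(scoring.std())    # spread across folds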

1.75173698171
0.100518898061


### Logistic Regression Evaluation:

In [36]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

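# Multinomial (softmax) logistic regression.
# Note: scikit-learn 1.5 and later deprecate `multi_class`; with lbfgs the multinomial form is the default there.
model_evaluate = LogisticRegression(solver='lbfgs', multi_class='multinomial')

scoring = cross_val_score(model_evaluate, X_scale, Y, scoring='neg_log_loss')

print(-scoring.mean())  # mean log-loss across folds
print(scoring.std())    # spread across folds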

1.32161034857
0.0253107449423


### Decision Tree Evaluation:

In [37]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

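# Shallow tree; max_features=5 samples features at random for each split,
# so results vary between runs unless random_state is set
model_evaluate = DecisionTreeClassifier(criterion='entropy', max_depth=5, max_features=5)

scoring = cross_val_score(model_evaluate, X_scale, Y, scoring='neg_log_loss')

print(-scoring.mean())  # mean log-loss across folds
print(scoring.std())    # spread across folds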

1.88986911811
0.188658239094


### Neural Network Evaluation:

In [38]:
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

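# Two hidden layers with 8 and 4 units, trained with stochastic gradient descent
model_evaluate = MLPClassifier(hidden_layer_sizes=(8, 4), alpha=1e-05, max_iter=1000, solver='sgd')

scoring = cross_val_score(model_evaluate, X_scale, Y, scoring='neg_log_loss')

print(-scoring.mean())  # mean log-loss across folds
print(scoring.std())    # spread across folds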

1.34593909381
0.0229529108908


### Support Vector Machine Evaluation:

#### 1. SVC

In [39]:
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

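# probability=True enables predict_proba, which the log-loss scorer requires
model_evaluate = SVC(probability=True)

scoring = cross_val_score(model_evaluate, X_scale, Y, scoring='neg_log_loss')

print(-scoring.mean())  # mean log-loss across folds
print(scoring.std())    # spread across folds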

1.34113349899
0.0230110376608


#### 2. NuSVC

In [40]:
from sklearn.svm import NuSVC
from sklearn.model_selection import cross_val_score

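# probability=True enables predict_proba, which the log-loss scorer requires
model_evaluate = NuSVC(nu=0.1, probability=True)

scoring = cross_val_score(model_evaluate, X_scale, Y, scoring='neg_log_loss')

print(-scoring.mean())  # mean log-loss across folds
print(scoring.std())    # spread across folds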

1.41552524336
0.0203313963486


### Report Scoring:

#### Best Model: Logistic Regression
#### Log-Loss: ~1.3216
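The evaluation cells above all follow the same pattern, so the comparison can also be run in one loop. The block below is a minimal sketch of that refactoring under stated assumptions: `mean_log_loss` is a hypothetical helper, only three of the models are included to keep it short, and it runs on synthetic data rather than `X_scale` and `Y`, so its numbers are not comparable to the ones reported here.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import scale
from sklearn.tree import DecisionTreeClassifier


def mean_log_loss(model, X, y):
    """Return (mean, std) of the cross-validated log-loss for one model."""
    scores = cross_val_score(model, X, y, scoring="neg_log_loss")
    return -scores.mean(), scores.std()


# Synthetic stand-in for X_scale and Y (assumption: 12 features, 7 classes).
X_demo, y_demo = make_classification(
    n_samples=700, n_features=12, n_informative=8,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)
X_demo = scale(X_demo)

candidates = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(
        criterion="entropy", max_depth=5, max_features=5, random_state=0
    ),
}

results = {
    name: mean_log_loss(model, X_demo, y_demo)
    for name, model in candidates.items()
}

# Print the models from lowest (best) to highest log-loss.
for name, (mean, std) in sorted(results.items(), key=lambda item: item[1][0]):
    print(f"{name}: {mean:.4f} (+/- {std:.4f})")
```

Keeping the scoring call in one function means every model is evaluated with the same metric and fold settings.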

### Important Features:
#### 1. APM
#### 2. AUM
#### 3. AUV
#### 4. AMI
#### 5. AVI
#### 6. WORKERS
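The notebook does not show how this ranking was obtained. One possible way to produce a ranking of this kind is permutation importance, which measures how much a fitted model's score degrades when a single feature column is shuffled. The sketch below illustrates the technique on synthetic data; it is not necessarily the method used for the list above, the feature names are hypothetical placeholders, and the importance is measured on the training data for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import scale

# Synthetic stand-in for X_scale and Y (assumption: 12 features, 7 classes).
X_demo, y_demo = make_classification(
    n_samples=700, n_features=12, n_informative=8,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)
X_demo = scale(X_demo)
# Hypothetical placeholder names; the real column names would come from the CSV.
feature_names = [f"feature_{i}" for i in range(X_demo.shape[1])]

model = LogisticRegression(max_iter=1000).fit(X_demo, y_demo)

# Shuffle each column in turn and record the drop in (negated) log-loss.
result = permutation_importance(
    model, X_demo, y_demo, scoring="neg_log_loss", n_repeats=5, random_state=0
)

# Largest mean score drop first.
ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking[:6]:
    print(f"{name}: {importance:.4f}")
```

For a less optimistic estimate, the same call can be made on a held-out split instead of the training data.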