# Pokemon Primary Type Classification

We are going to train a neural network to classify a Pokemon's primary type based on its base stats.

### Base Stats
* HP
* Attack
* Defense
* Special Attack
* Special Defense
* Speed

### Types
* NORMAL
* FIRE
* WATER
* ELECTRIC
* GRASS
* ICE
* FIGHTING
* POISON
* GROUND
* FLYING
* PSYCHIC
* BUG
* ROCK
* GHOST
* DRAGON
* DARK
* STEEL
* FAIRY

### Goal
Determine the type of a new Pokemon using a neural network classifier:

| Pokemon | HP | Atk | Def | Sp. Atk | Sp. Def | Speed | Type 1 |
|---------|----|-----|-----|---------|---------|-------|--------|
| Necrozma| ...| ... | ... |  ...    |  ...    |  ...  | ???    |

## Step 1: Importing the dataset into a Pandas dataframe

In [1]:
import pandas as pd

data = pd.read_csv('pokemon.csv')
data

Unnamed: 0,abilities,against_bug,against_dark,against_dragon,against_electric,against_fairy,against_fight,against_fire,against_flying,against_ghost,...,percentage_male,pokedex_number,sp_attack,sp_defense,speed,type1,type2,weight_kg,generation,is_legendary
0,"['Overgrow', 'Chlorophyll']",1.00,1.0,1.0,0.5,0.5,0.50,2.00,2.00,1.0,...,88.1,1,65,65,45,grass,poison,6.9,1,0
1,"['Overgrow', 'Chlorophyll']",1.00,1.0,1.0,0.5,0.5,0.50,2.00,2.00,1.0,...,88.1,2,80,80,60,grass,poison,13.0,1,0
2,"['Overgrow', 'Chlorophyll']",1.00,1.0,1.0,0.5,0.5,0.50,2.00,2.00,1.0,...,88.1,3,122,120,80,grass,poison,100.0,1,0
3,"['Blaze', 'Solar Power']",0.50,1.0,1.0,1.0,0.5,1.00,0.50,1.00,1.0,...,88.1,4,60,50,65,fire,,8.5,1,0
4,"['Blaze', 'Solar Power']",0.50,1.0,1.0,1.0,0.5,1.00,0.50,1.00,1.0,...,88.1,5,80,65,80,fire,,19.0,1,0
5,"['Blaze', 'Solar Power']",0.25,1.0,1.0,2.0,0.5,0.50,0.50,1.00,1.0,...,88.1,6,159,115,100,fire,flying,90.5,1,0
6,"['Torrent', 'Rain Dish']",1.00,1.0,1.0,2.0,1.0,1.00,0.50,1.00,1.0,...,88.1,7,50,64,43,water,,9.0,1,0
7,"['Torrent', 'Rain Dish']",1.00,1.0,1.0,2.0,1.0,1.00,0.50,1.00,1.0,...,88.1,8,65,80,58,water,,22.5,1,0
8,"['Torrent', 'Rain Dish']",1.00,1.0,1.0,2.0,1.0,1.00,0.50,1.00,1.0,...,88.1,9,135,115,78,water,,85.5,1,0
9,"['Shield Dust', 'Run Away']",1.00,1.0,1.0,1.0,1.0,0.50,2.00,2.00,1.0,...,50.0,10,20,20,45,bug,,2.9,1,0


#### The stats dictating the primary type are stored in d1

In [2]:
d1 = data[['hp','attack','defense','sp_attack','sp_defense','speed', 'type1']]
d1

Unnamed: 0,hp,attack,defense,sp_attack,sp_defense,speed,type1
0,45,49,49,65,65,45,grass
1,60,62,63,80,80,60,grass
2,80,100,123,122,120,80,grass
3,39,52,43,60,50,65,fire
4,58,64,58,80,65,80,fire
5,78,104,78,159,115,100,fire
6,44,48,65,50,64,43,water
7,59,63,80,65,80,58,water
8,79,103,120,135,115,78,water
9,45,30,35,20,20,45,bug


#### The stats dictating the secondary type are stored in d2  
Rows without a secondary type (null <code>type2</code>) are dropped.

In [3]:
d2 = data[['hp','attack','defense','sp_attack','sp_defense','speed', 'type2']]
d2 = d2.dropna()
d2

Unnamed: 0,hp,attack,defense,sp_attack,sp_defense,speed,type2
0,45,49,49,65,65,45,poison
1,60,62,63,80,80,60,poison
2,80,100,123,122,120,80,poison
5,78,104,78,159,115,100,flying
11,60,45,50,90,80,70,flying
12,40,35,30,20,20,50,poison
13,45,25,50,25,25,35,poison
14,65,150,40,15,80,145,poison
15,40,45,40,35,35,56,flying
16,63,60,55,50,50,71,flying


#### Both d1 and d2 are merged  
The primary and secondary types of each Pokemon are treated as separate rows.  
This means two rows can have identical stats but different type labels when they come from the same dual-type Pokemon.

In [4]:
d1 = d1.rename(columns={'type1': 'type'})
d2 = d2.rename(columns={'type2': 'type'})
mergeData = pd.concat([d1, d2], ignore_index=True)
mergeData

Unnamed: 0,hp,attack,defense,sp_attack,sp_defense,speed,type
0,45,49,49,65,65,45,grass
1,60,62,63,80,80,60,grass
2,80,100,123,122,120,80,grass
3,39,52,43,60,50,65,fire
4,58,64,58,80,65,80,fire
5,78,104,78,159,115,100,fire
6,44,48,65,50,64,43,water
7,59,63,80,65,80,58,water
8,79,103,120,135,115,78,water
9,45,30,35,20,20,45,bug
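To make the effect of the merge concrete, here is a minimal sketch on a toy frame. It reuses three rows from the previews above but keeps only two stat columns for brevity; the names `toy`, `primary`, `secondary`, `merged` and `conflicts` are illustrative and not part of the notebook.

```python
import pandas as pd

# Toy frame: three Pokemon taken from the previews above, two stats only.
toy = pd.DataFrame({
    "hp": [45, 39, 78],
    "attack": [49, 52, 104],
    "type1": ["grass", "fire", "fire"],
    "type2": ["poison", None, "flying"],
})

# Same steps as the notebook: split by type column, drop missing secondary types, stack.
primary = toy[["hp", "attack", "type1"]].rename(columns={"type1": "type"})
secondary = toy[["hp", "attack", "type2"]].dropna().rename(columns={"type2": "type"})
merged = pd.concat([primary, secondary], ignore_index=True)

# Rows whose stats occur more than once carry different labels for the same input.
conflicts = merged[merged.duplicated(subset=["hp", "attack"], keep=False)]
print(len(merged))                     # 3 primary rows + 2 secondary rows
print(conflicts.sort_values("hp"))
```

Every dual-type Pokemon contributes two rows with the same features and different labels, so a classifier that outputs a single label per input cannot get both rows right.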


## Step 2: Preparing testing and training data

The training set and testing set are built from the 6 base stats alone. Only these columns are extracted from the data as features.  

The stat distribution is obtained from the base stats by dividing each stat by the sum of all six stats of that Pokemon.  
This way, for every Pokemon:
$$ s_i' = \frac{s_i}{\sum_{j \in \text{stats}} s_j} \quad\Rightarrow\quad \sum_{i \in \text{stats}} s_i' = 1$$


In [5]:
x = mergeData[['hp','attack','defense','sp_attack','sp_defense','speed']]

# normalize stats to get distribution such that all stats add up to 1
x = x.div(x.sum(axis=1), axis=0)
x

Unnamed: 0,hp,attack,defense,sp_attack,sp_defense,speed
0,0.141509,0.154088,0.154088,0.204403,0.204403,0.141509
1,0.148148,0.153086,0.155556,0.197531,0.197531,0.148148
2,0.128000,0.160000,0.196800,0.195200,0.192000,0.128000
3,0.126214,0.168285,0.139159,0.194175,0.161812,0.210356
4,0.143210,0.158025,0.143210,0.197531,0.160494,0.197531
5,0.123028,0.164038,0.123028,0.250789,0.181388,0.157729
6,0.140127,0.152866,0.207006,0.159236,0.203822,0.136943
7,0.145679,0.155556,0.197531,0.160494,0.197531,0.143210
8,0.125397,0.163492,0.190476,0.214286,0.182540,0.123810
9,0.230769,0.153846,0.179487,0.102564,0.102564,0.230769
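As a quick check of the normalization, the sketch below reproduces row 0 of the output above by hand. The variable names `stats`, `total` and `shares` are illustrative.

```python
import pandas as pd

# Row 0 of the d1 preview (hp, attack, defense, sp_attack, sp_defense, speed).
stats = pd.DataFrame(
    {"hp": [45], "attack": [49], "defense": [49],
     "sp_attack": [65], "sp_defense": [65], "speed": [45]}
)

total = stats.sum(axis=1)           # 45 + 49 + 49 + 65 + 65 + 45 = 318
shares = stats.div(total, axis=0)   # each stat divided by the row total

print(shares.round(6))              # hp share is 45 / 318
print(shares.sum(axis=1))           # the shares of a row add up to 1
```

Note that this step keeps only the proportions: two Pokemon with the same stat shares but very different totals end up with identical feature vectors.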


#### The Pokemon types will serve as class labels for training and testing

In [6]:
y = mergeData['type']
y

0          grass
1          grass
2          grass
3           fire
4           fire
5           fire
6          water
7          water
8          water
9            bug
10           bug
11           bug
12           bug
13           bug
14           bug
15        normal
16        normal
17        normal
18        normal
19        normal
20        normal
21        normal
22        poison
23        poison
24      electric
25      electric
26        ground
27        ground
28        poison
29        poison
          ...   
1188        fire
1189    fighting
1190    fighting
1191     psychic
1192       water
1193       water
1194      ground
1195      ground
1196      flying
1197      dragon
1198       steel
1199       fairy
1200     psychic
1201      dragon
1202       grass
1203    fighting
1204    fighting
1205       fairy
1206       fairy
1207       fairy
1208       fairy
1209       steel
1210       ghost
1211      poison
1212    fighting
1213    fighting
1214      flying
1215       ste

### We will use Scikit's <code>train_test_split()</code> function to split our data set into a training set and a test set automatically

The following parameters are set:
* <code>train_size=0.8</code>: 80% of the rows go to training and the remaining 20% go to testing, so <code>test_size</code> does not need to be given
* <code>stratify=y</code>: each type keeps roughly the same proportion in both sets

In [7]:
from sklearn.model_selection import train_test_split

# 80% training and 20% testing, stratified by type
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=0.8, stratify=y)
x_train




Unnamed: 0,hp,attack,defense,sp_attack,sp_defense,speed
892,0.135802,0.098765,0.209877,0.197531,0.259259,0.098765
257,0.161290,0.225806,0.161290,0.161290,0.161290,0.129032
1016,0.198020,0.247525,0.102970,0.207921,0.102970,0.140594
121,0.086957,0.097826,0.141304,0.217391,0.260870,0.195652
140,0.121212,0.232323,0.212121,0.131313,0.141414,0.161616
1192,0.108696,0.152174,0.173913,0.086957,0.130435,0.347826
1184,0.149780,0.154185,0.202643,0.110132,0.290749,0.092511
1096,0.151329,0.192229,0.267894,0.110429,0.237219,0.040900
698,0.236084,0.147793,0.138196,0.190019,0.176583,0.111324
661,0.162304,0.191099,0.143979,0.146597,0.136126,0.219895
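The following sketch shows what a stratified split does, using made-up data rather than the Pokemon set. The class names, sample counts and `random_state=0` are assumptions chosen only to make the example small and repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples, 6 features, two imbalanced classes.
rng = np.random.default_rng(0)
X = rng.random((100, 6))
y = np.array(["water"] * 80 + ["ice"] * 20)

# stratify=y keeps the 80/20 class ratio in both the training and the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.8, stratify=y, random_state=0
)

print(X_train.shape, X_test.shape)
print(np.unique(y_test, return_counts=True))
```

Without `stratify`, a rare type could by chance be missing from the test set entirely; passing `random_state` makes the split repeatable between runs.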


### The final preparation step is scaling the data.
An MLP generally trains better when its inputs are scaled to a range such as [0,1] or [-1,1], so we will use Scikit's MinMaxScaler to scale the data accordingly.  
We will scale it to [-1,1] to match the output range of the Tanh activation. The scaler is fitted on the training set only and then applied unchanged to the test set.

In [8]:
from sklearn.preprocessing import MinMaxScaler

# Sigmoid activation
# scaler = MinMaxScaler(feature_range=(0,1))

#TANH activation
scaler = MinMaxScaler(feature_range=(-1,1))

x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)
x_train

array([[-0.52272512, -0.52650616, -0.1053341 ,  0.03853945,  0.02166015,
        -0.54440058],
       [-0.43026363,  0.15974797, -0.32402623, -0.17322822, -0.46689039,
        -0.38922548],
       [-0.29702125,  0.27706657, -0.58653132,  0.09925187, -0.75772013,
        -0.32994924],
       ...,
       [-0.56986831,  0.02982278, -0.3787936 , -0.14181214, -0.61505479,
         0.11853237],
       [-0.36512898,  0.01015375, -0.32813558,  0.26244765, -0.56553353,
        -0.61545829],
       [-0.4206717 , -0.35158348,  0.20439648, -0.34936429, -0.20845282,
        -0.63052343]])
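Here is a small sketch of how MinMaxScaler behaves when it is fitted on training data and then reused on test data. The single-column toy values are assumptions picked so the arithmetic is easy to follow.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy one-feature data; 1.0 in the test set lies outside the training range.
train = np.array([[0.25], [0.50], [0.75]])
test = np.array([[0.375], [1.0]])

scaler = MinMaxScaler(feature_range=(-1, 1))
train_scaled = scaler.fit_transform(train)  # learns min=0.25 and max=0.75
test_scaled = scaler.transform(test)        # reuses the training min and max

print(train_scaled.ravel())  # [-1.  0.  1.]
print(test_scaled.ravel())   # [-0.5  2. ]
```

Because the minimum and maximum come from the training set, test values outside that range fall outside [-1,1]. This is expected and avoids leaking information from the test set into the preprocessing.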

## Step 3: Training

Just one line of code baby!!!  
We will be using Scikit's MLPClassifier class.  
Some of the parameters are:
* <code>hidden_layer_sizes=(x,y,z...)</code>: each number in the tuple indicates the number of hidden nodes in one hidden layer
* <code>activation</code>: one of 'identity', 'logistic', 'tanh', 'relu'
* <code>max_iter</code>: an integer value representing the maximum number of training iterations
* <code>learning_rate_init</code>: a float value representing the initial learning rate

In [9]:
from sklearn.neural_network import MLPClassifier

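# activation='logistic'
activation='tanh'
# Note: (15) is the plain integer 15, not a tuple; scikit-learn treats it as one hidden layer with 15 nodes
mlp = MLPClassifier(hidden_layer_sizes=(15), max_iter=10000, learning_rate_init=0.001, activation=activation)
mlp.fit(x_train, y_train)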

MLPClassifier(activation='tanh', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=15, learning_rate='constant',
       learning_rate_init=0.001, max_iter=10000, momentum=0.9,
       n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
       random_state=None, shuffle=True, solver='adam', tol=0.0001,
       validation_fraction=0.1, verbose=False, warm_start=False)
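To see how <code>hidden_layer_sizes</code> maps to the shape of the network, here is a minimal sketch on synthetic data. The `make_blobs` data, `max_iter=2000` and `random_state=0` are assumptions for the example, not the settings used above.

```python
from sklearn.datasets import make_blobs
from sklearn.neural_network import MLPClassifier

# Hypothetical toy data: 3 clusters in 6 dimensions (6 features, 3 classes).
X, y = make_blobs(n_samples=150, centers=3, n_features=6, random_state=0)

# (15,) is one hidden layer with 15 nodes; (15, 10) would be two hidden layers.
clf = MLPClassifier(hidden_layer_sizes=(15,), activation="tanh",
                    learning_rate_init=0.001, max_iter=2000, random_state=0)
clf.fit(X, y)

print(clf.n_layers_)        # input layer + 1 hidden layer + output layer
print(clf.coefs_[0].shape)  # weights from 6 inputs to 15 hidden nodes
print(clf.coefs_[1].shape)  # weights from 15 hidden nodes to 3 outputs
```

Setting `random_state` fixes the weight initialization, which makes runs comparable when trying different layer sizes or activations.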

## Step 4: Testing
We will be using the predict function to make predictions for the test set (<code>x_test</code>).  
We will compare these predictions with the true test labels (<code>y_test</code>) using Scikit's confusion_matrix and classification_report.

In [10]:
predictions = mlp.predict(x_test)
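To classify a single new Pokemon, as in the Goal table, its raw stats need the same two preprocessing steps as the training data before calling predict. Below is a rough sketch of that idea. The helper `predict_type` is hypothetical, and the model is trained only on the nine rows shown in the d1 preview, so its guesses carry no real meaning.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler

def predict_type(base_stats, scaler, model):
    """Hypothetical helper: preprocess one Pokemon like the training data, then predict."""
    row = np.asarray(base_stats, dtype=float).reshape(1, -1)
    row = row / row.sum(axis=1, keepdims=True)   # stat shares that sum to 1
    return model.predict(scaler.transform(row))[0]

# The nine rows of the d1 preview (hp, attack, defense, sp_attack, sp_defense, speed).
raw = np.array([
    [45, 49, 49, 65, 65, 45], [60, 62, 63, 80, 80, 60], [80, 100, 123, 122, 120, 80],
    [39, 52, 43, 60, 50, 65], [58, 64, 58, 80, 65, 80], [78, 104, 78, 159, 115, 100],
    [44, 48, 65, 50, 64, 43], [59, 63, 80, 65, 80, 58], [79, 103, 120, 135, 115, 78],
], dtype=float)
labels = ["grass"] * 3 + ["fire"] * 3 + ["water"] * 3

# Same pipeline as the notebook: shares, then scaling to [-1, 1], then the MLP.
shares = raw / raw.sum(axis=1, keepdims=True)
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(shares)
model = MLPClassifier(hidden_layer_sizes=(15,), activation="tanh",
                      max_iter=2000, random_state=0)
model.fit(scaler.transform(shares), labels)

guess = predict_type([45, 49, 49, 65, 65, 45], scaler, model)
print(guess)
```

The important design point is that the fitted scaler is reused for new inputs instead of being refitted, so a new Pokemon is mapped into the same feature space the network was trained on.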

In [11]:
from sklearn.metrics import classification_report,confusion_matrix

print(confusion_matrix(y_test, predictions))

[[ 1  1  0  1  0  0  0  1  0  2  1  0  1  0  1  2  3  1]
 [ 2  1  0  0  0  0  0  0  1  2  0  0  3  0  0  1  0  0]
 [ 0  0  0  0  0  0  1  0  0  2  0  0  1  0  0  2  0  3]
 [ 0  0  0  0  0  0  2  2  0  0  0  0  1  0  3  0  1  1]
 [ 0  0  0  0  0  0  0  0  0  2  0  0  0  0  2  0  1  4]
 [ 1  1  0  0  0  2  0  0  0  0  1  0  3  0  0  0  2  1]
 [ 0  0  0  0  0  0  7  0  1  1  0  0  0  0  2  0  0  2]
 [ 1  0  1  0  0  0  1  4  0  1  1  0  4  0  1  0  0  6]
 [ 0  0  0  0  0  1  0  1  2  1  0  0  1  0  0  0  1  1]
 [ 1  0  0  1  0  0  1  1  0  5  0  0  1  0  3  1  1  5]
 [ 0  0  0  0  0  2  0  0  0  3  2  0  4  0  0  1  0  1]
 [ 0  0  0  0  1  0  0  1  0  3  0  0  1  0  1  0  0  1]
 [ 0  0  0  0  1  2  0  3  0  2  0  0 11  0  2  0  0  1]
 [ 2  0  0  0  0  0  1  1  1  1  0  0  3  0  1  0  0  3]
 [ 1  0  0  0  2  0  1  0  0  1  0  0  1  0  6  0  1  3]
 [ 0  0  0  0  0  0  0  0  1  0  3  0  2  0  0  2  3  1]
 [ 0  0  0  0  0  0  0  0  0  1  0  0  1  0  0  3  3  1]
 [ 1  0  0  2  1  0  1  3  0  4

In [12]:
print(classification_report(y_test,predictions))

              precision    recall  f1-score   support

         bug       0.10      0.07      0.08        15
        dark       0.33      0.10      0.15        10
      dragon       0.00      0.00      0.00         9
    electric       0.00      0.00      0.00        10
       fairy       0.00      0.00      0.00         9
    fighting       0.29      0.18      0.22        11
        fire       0.47      0.54      0.50        13
      flying       0.24      0.20      0.22        20
       ghost       0.33      0.25      0.29         8
       grass       0.16      0.25      0.20        20
      ground       0.20      0.15      0.17        13
         ice       0.00      0.00      0.00         8
      normal       0.28      0.50      0.35        22
      poison       0.00      0.00      0.00        13
     psychic       0.27      0.38      0.32        16
        rock       0.13      0.17      0.15        12
       steel       0.19      0.33      0.24         9
       water       0.17    

  'precision', 'predicted', average, warn_for)
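The rows of 0.00 in the report belong to types that the model never predicted, which is also the situation in which scikit-learn warns about undefined precision. The sketch below reproduces this on made-up labels. The `zero_division` argument is assumed to be available, which requires scikit-learn 0.22 or newer.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Made-up labels: "ice" occurs in the truth but is never predicted.
y_true = ["fire", "fire", "water", "water", "ice", "ice"]
y_pred = ["fire", "water", "water", "water", "fire", "water"]

# Fixing the label order makes the rows and columns of the matrix easy to read.
labels = ["fire", "ice", "water"]
cm = confusion_matrix(y_true, y_pred, labels=labels)  # rows: true, columns: predicted
acc = accuracy_score(y_true, y_pred)

print(cm)
print(acc)
# zero_division=0 reports 0 for undefined precision instead of emitting a warning.
print(classification_report(y_true, y_pred, labels=labels, zero_division=0))
```

An all-zero column in the confusion matrix marks a class that was never predicted. As a point of reference when reading the scores, uniform random guessing over 18 types would be right about 1/18 ≈ 5.6% of the time.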
