### Predicting Legendary Pokemon using Neural Networks
#### Sam Berkson
#### CPSC 323

Today, I'll use a neural network to predict whether a Pokemon is legendary based on its:
* HP
* Attack
* Defense
* Special Attack
* Special Defense
* Speed

First, we need to import our libraries and read in our data. Then we split it into training, testing, and validation sets.

In [13]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import sklearn 
from sklearn import preprocessing
from sklearn import utils
from sklearn import model_selection
from sklearn import metrics
from sklearn.metrics import confusion_matrix, classification_report

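df = pd.read_csv('Pokemon.csv')
df.head()
# numeric_only=True (pandas 1.5+) skips non-numeric columns, which newer pandas would otherwise reject
df.corr(numeric_only=True)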

# Split data into train, test, and validation sets
X_train, x_test, y_train, y_test = model_selection.train_test_split(df[['HP', 'Attack', 'Defense', 'Sp. Atk', 'Sp. Def', 'Speed']], df['Legendary'], test_size=0.3, random_state=1)
X_train, X_val, y_train, y_val = model_selection.train_test_split(X_train, y_train, test_size=0.1, random_state=1)
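
Because the second call splits only what is left after the first, `test_size=0.1` there means 10% of the remaining 70%, not 10% of the full dataset. The helper below is a minimal sketch of that arithmetic; `split_fractions` is a hypothetical name used for illustration, and the actual row counts are rounded by scikit-learn, so they can differ slightly from these fractions.

```python
def split_fractions(test_size, val_size):
    """Return (train, val, test) fractions of the full dataset for a two-stage split."""
    remaining = 1.0 - test_size   # share left over after the first split
    val = remaining * val_size    # the second split takes val_size of what remains
    train = remaining - val
    return train, val, test_size


train_frac, val_frac, test_frac = split_fractions(test_size=0.3, val_size=0.1)
print(f"train={train_frac:.2f}, val={val_frac:.2f}, test={test_frac:.2f}")
# prints: train=0.63, val=0.07, test=0.30
```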

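Now that we've split our data, we can implement our neural network. I first defined a 5-layer network with relu activations, but the cell below then redefines the model as a smaller network with a single 30-unit hidden layer, and that smaller network is the one that actually gets trained (its summary shows 241 parameters). I compile the model with the adam optimizer, using mean squared error (MSE) as the loss and tracking accuracy and MSE as metrics. I then trained the model with different numbers of epochs to see which yielded the best results; the run shown here uses 100.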

In [14]:

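# First attempt: a 5-layer network. This definition is overwritten by the
# assignment below, so it is never compiled or trained.
model = keras.Sequential([
    layers.Dense(30, activation='relu', input_shape=[6]),
    layers.Dense(30, activation='relu'),
    layers.Dense(30, activation='relu'),
    layers.Dense(6, activation='relu'),
    layers.Dense(1, activation='relu')
])

# Model actually trained: one 30-unit hidden layer and a single relu output unit.
model = keras.Sequential([
    layers.Dense(30, activation='relu', input_shape=[6]),
    layers.Dense(1, activation='relu')
])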

model.compile(optimizer='adam',
                loss='mse',
                metrics=['accuracy', 'mse'])

model.summary()

history = model.fit(X_train, y_train, epochs=100, batch_size=26, validation_data=(X_val, y_val))
model.evaluate(x_test, y_test)
y_pred = model.predict(x_test)
model.save('PokemonNeuralNet.h5')

Model: "sequential_12"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_43 (Dense)            (None, 30)                210       
                                                                 
 dense_44 (Dense)            (None, 1)                 31        
                                                                 
Total params: 241
Trainable params: 241
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
[per-epoch training log truncated]

#### Epoch Results
* 100 Epochs
* Loss (MSE): 0.0875
* Accuracy: 0.9125
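
The loss and accuracy above add up to exactly 1, which is what a model that outputs 0 for every input would produce on 0/1 labels. One possible cause, offered here as a hypothesis rather than a diagnosis, is the relu activation on the output unit: when its pre-activation is negative, relu outputs 0 and passes back a zero gradient, so the unit gets no signal to recover. The sketch below compares the gradient of the squared error for a relu output and a sigmoid output. The pre-activation value `z = -1.5` and all function names are illustrative assumptions, not values taken from the trained model.

```python
import math


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    # Derivative of relu: 0 for negative inputs, 1 for positive inputs.
    return 1.0 if z > 0 else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def squared_error_grad(z, target, activation, activation_grad):
    # d/dz of (activation(z) - target)^2, by the chain rule.
    return 2.0 * (activation(z) - target) * activation_grad(z)


z = -1.5      # hypothetical pre-activation of the output unit
target = 1.0  # label of a legendary Pokemon

relu_signal = squared_error_grad(z, target, relu, relu_grad)
sigmoid_signal = squared_error_grad(z, target, sigmoid, sigmoid_grad)

print("relu gradient magnitude:   ", abs(relu_signal))
print("sigmoid gradient magnitude:", round(abs(sigmoid_signal), 4))
```

With a negative pre-activation the relu output contributes no gradient at all, while the sigmoid output still pushes the prediction toward the target. For a 0/1 target, a sigmoid output (often paired with a binary cross-entropy loss) is a common alternative worth trying.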

Now that we've got our predictions, let's discretize them to binary values for legendary or not.

In [15]:
# Threshold y_pred at 0.5 to get 0/1 labels, and convert y_test to 0/1 as well
y_pred = np.where(y_pred > 0.5, 1, 0)
y_test = np.where(y_test > 0.5, 1, 0)

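# Confusion matrix with labels. sklearn orders classes by label value, so
# row/column 0 is non-legendary (0) and row/column 1 is legendary (1).
cm = confusion_matrix(y_test, y_pred)
print("Confusion Matrix     :  Predicted Non-Legendary | Predicted Legendary")
print("Actual Non-Legendary :          ", cm[0][0], "                  ", cm[0][1])
print("Actual Legendary     :          ", cm[1][0], "                   ", cm[1][1])


Confusion Matrix     :  Predicted Non-Legendary | Predicted Legendary
Actual Non-Legendary :           219                    0
Actual Legendary     :           21                     0


After discretizing, we can analyze our confusion matrix. It's odd that the model predicts only one class, non-legendary, which last week's model did not do. All 21 legendary Pokemon in the test set are false negatives, and the 0.9125 accuracy is simply the share of non-legendary Pokemon in the test set (219 of 240). Since I would take a false positive over a false negative, this is the wrong kind of error for this task.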
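
As a sanity check on these numbers, the sketch below rebuilds a confusion matrix in the row = actual, column = predicted order that scikit-learn uses (class 0 first) and recomputes the metrics by hand. It assumes, based on the counts printed above, that the test set holds 219 non-legendary and 21 legendary Pokemon and that every prediction was class 0; `confusion_counts` is a hypothetical helper written for illustration, not a library function.

```python
# Rebuild labels and predictions from the reported counts (an assumption:
# 219 non-legendary, 21 legendary, and every prediction equal to 0).
y_true = [0] * 219 + [1] * 21
y_hat = [0] * 240


def confusion_counts(actual, predicted):
    """2x2 matrix with rows = actual class, columns = predicted class (class 0 first)."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


cm = confusion_counts(y_true, y_hat)
tn, fp = cm[0]  # actual non-legendary row
fn, tp = cm[1]  # actual legendary row

accuracy = (tp + tn) / len(y_true)
mse = sum((a - p) ** 2 for a, p in zip(y_true, y_hat)) / len(y_true)
recall = tp / (tp + fn)  # share of legendary Pokemon that were found

print(cm)
print("accuracy:", accuracy, "mse:", mse, "recall:", recall)
```

Under these assumptions the recomputed accuracy and MSE match the 0.9125 and 0.0875 reported for the 100-epoch run, while recall on the legendary class is 0. That makes per-class recall a more informative number than accuracy for a dataset this imbalanced.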