# Pokemon Classifier Algorithm

Team Starter Pokemon<br>
COGS118B FA21

Alexa Acosta<br>
Lian Lumada<br>
Ramzy Oncy-Avila<br>

[SLIDES](https://docs.google.com/presentation/d/19h5fS61qQCrhrNX4Yu0s71IPTsAPd-4Paam3E0TgaXg/edit?usp=sharing) / VIDEO

## Loading packages

In [1]:
import zipfile
zipfile.ZipFile('data_files.zip').extractall()

In [2]:
import imageio as iio
import matplotlib.pyplot as plt
import os
import pandas as pd
import numpy as np
import random
import pickle

## Data Import

In [3]:
# setting some helper variables
directory = "poke_photos/"
pokemon_names = []
pokemon_photos = []

# Uploading our Pokemon images
for filename in os.listdir(directory):
    # Stores names of each pokemon
    name = os.path.splitext(filename)[0]
    pokemon_names.append(name)
    
    # Stores the images of each pokemon - matrix representation
    im = iio.imread(directory + filename, pilmode = "RGBA")[:, :, 0:3] # drop alpha -> (120, 120, 3)
    im_final = np.asarray(im).reshape(14400, 3) # one row per pixel, columns are R, G, B
    pokemon_photos.append(im_final)
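
Each image ends up as a 14400 x 3 matrix with one row per pixel. As a minimal sketch on a synthetic array (the random `im` below is a stand-in, not a real sprite), the following checks that flattening the 120 x 120 x 3 array element by element and reshaping it directly give the same matrix:

```python
import numpy as np

# Synthetic stand-in for one decoded 120x120 RGB image (not a real sprite)
rng = np.random.default_rng(0)
im = rng.integers(0, 256, size=(120, 120, 3), dtype=np.uint8)

# Element-by-element flattening: rows -> pixels -> channel values
flat = [value for row in im for pixel in row for value in pixel]
via_lists = np.reshape(flat, (14400, 3))

# Direct reshape: one row per pixel, columns are R, G, B
via_reshape = im.reshape(14400, 3)

print(via_reshape.shape)
print(np.array_equal(via_lists, via_reshape))
```

Keeping the explicit `14400` in the reshape also acts as a size check: an image that is not 120 x 120 raises an error instead of passing through silently.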

In [4]:
# Data frame with corresponding names and image matrix
images = pd.DataFrame(columns = ('Name', 'Image_Mat'))
images['Name'] = pokemon_names
images['Image_Mat'] = pokemon_photos

In [5]:
# Pokemon info data set
pokemon_data = pd.read_csv('pokemon_to_photos.csv')
pokemon_data = pokemon_data.sort_values("Name", ascending = True).reset_index(drop = True)

## Data Wrangling

In [6]:
# Sort based on names and reset index
images = images.sort_values("Name", ascending = True).reset_index(drop = True)

In [7]:
# Combine the two datasets by row position
# (assumes both frames list the same names in the same sorted order)
pokemon_data['Image_Mat'] = images['Image_Mat']
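
Assigning the column directly depends on both frames having the same row order after sorting. As a hedged alternative, merging on `Name` aligns rows by key instead of by position. The toy frames below are made up for illustration and are not part of the dataset:

```python
import numpy as np
import pandas as pd

# Toy stand-ins for pokemon_data and images (values are illustrative only)
info = pd.DataFrame({"Name": ["abra", "absol"], "Type1": ["Psychic", "Dark"]})
imgs = pd.DataFrame({
    "Name": ["absol", "abra"],  # deliberately in a different order
    "Image_Mat": [np.ones((4, 3)), np.zeros((4, 3))],
})

# Align on the key column; validate raises if a name is duplicated on either side
merged = info.merge(imgs, on="Name", how="left", validate="one_to_one")
print(merged[["Name", "Type1"]])
```

With `how="left"`, a Pokemon that has no matching image would show up with a missing `Image_Mat` rather than silently receiving another Pokemon's pixels.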

In [8]:
pokemon_data.head(5)

| | Name | Type1 | Type2 | Image_Mat |
|---|---|---|---|---|
| 0 | abomasnow | Grass | Ice | [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [... |
| 1 | abra | Psychic | | [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [... |
| 2 | absol | Dark | | [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [... |
| 3 | accelgor | Bug | | [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [... |
| 4 | aegislash-blade | Steel | Ghost | [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [... |


Each row of `pokemon_data` corresponds to one unique Pokemon, described by four columns:
- `Name`: the name of the Pokemon in lowercase
- `Type1`: the primary type of the Pokemon
- `Type2`: the secondary type of the Pokemon; missing for single-type Pokemon
- `Image_Mat`: a 14400 x 3 array storing the RGB values of each pixel in the 120x120 Pokemon image, in row-major order (left to right within a row, rows from top to bottom).
    - Each R, G, and B value is an integer from 0 to 255
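
To make the pixel ordering concrete, the sketch below uses a synthetic image (not dataset content) to check that the pixel at row `r`, column `c` of the 120x120 image is stored at row `r * 120 + c` of the flattened matrix:

```python
import numpy as np

side = 120
img = np.zeros((side, side, 3), dtype=np.uint8)
img[5, 7] = [255, 128, 0]  # mark one pixel: image row 5, column 7

# Same layout as Image_Mat: one row per pixel, columns are R, G, B
mat = img.reshape(side * side, 3)

r, c = 5, 7
print(mat[r * side + c])
```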

## Saving / Loading data

In [9]:
# Save our merged data into a file for easy access later on
#with open('data.pickle', 'wb') as f:
#    pickle.dump(pokemon_data, f)

# Load our merged data into a python variable
with open('data.pickle', 'rb') as f:
    pokemon_data = pickle.load(f)
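
As a small sketch of the save/load round trip, the example below pickles a toy data frame to a temporary file and reads it back. The frame and the file name `toy.pickle` are hypothetical stand-ins for `pokemon_data` and `data.pickle`:

```python
import os
import pickle
import tempfile

import numpy as np
import pandas as pd

# Toy frame standing in for pokemon_data
df = pd.DataFrame({
    "Name": ["abra"],
    "Image_Mat": [np.zeros((4, 3), dtype=np.uint8)],
})

# Write to a temporary path (hypothetical file name) and read it back
path = os.path.join(tempfile.mkdtemp(), "toy.pickle")
with open(path, "wb") as f:
    pickle.dump(df, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored["Name"].tolist())
print(np.array_equal(restored["Image_Mat"][0], df["Image_Mat"][0]))
```

Pickle files can execute arbitrary code when loaded, so only load a pickle that you created yourself or that comes from a trusted source.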

## Custom Functions

In [10]:
# Function to view the image of the Pokemon
def view_pokemon(val):
    print("Pokemon: " + pokemon_data['Name'][val])
    pkmn_reverted = np.reshape(pokemon_data['Image_Mat'][val], (120, 120, 3))
    plt.imshow(pkmn_reverted)

In [14]:
# Function to find average RGB value of a Pokemon
def rgb_pkmn(img_mat):
    return np.mean(img_mat, axis = 0)
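
As a quick worked example, `rgb_pkmn` averages down the rows of the pixel matrix, so it returns one mean per color channel. The four-pixel `toy` matrix below is made up for illustration:

```python
import numpy as np

def rgb_pkmn(img_mat):
    # Mean over axis 0 collapses the pixel rows, leaving one value per channel
    return np.mean(img_mat, axis=0)

# Four pixels: black, red, yellow, green
toy = np.array([[0, 0, 0], [255, 0, 0], [255, 255, 0], [0, 255, 0]])

mean_rgb = rgb_pkmn(toy)
print(mean_rgb)  # one mean per channel: R = 127.5, G = 127.5, B = 0.0
```

Every pixel counts equally, so background pixels (stored as `[0, 0, 0]` in the preview above) pull the average toward black.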

## Training

In [11]:
# get the mean RGB values of all 809 Pokemon in the dataset
mu_pkmn = []

for i in pokemon_data['Image_Mat']:
    mu_pkmn.append(rgb_pkmn(i)) # average RGB values (alpha was dropped on import)

In [12]:
# select 18 random mus, one for each type cluster we have
kn = 18 # we have 18 base types
mu_k = random.sample(mu_pkmn, k = kn) # sample without replacement so no Pokemon is picked twice

In [13]:
# split data into 80/20 training and testing data sets
training_set = pokemon_data.sample(frac = 0.8, random_state = 500) # set seed to 500 for reproducible results
test_set = pokemon_data.drop(training_set.index)
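
The split works because `sample` keeps the original index labels and `drop` then removes exactly those rows. As a minimal sketch on a made-up 10-row frame (not the real dataset), the two sets are disjoint and together cover every row:

```python
import pandas as pd

# Toy frame standing in for pokemon_data (10 rows)
toy = pd.DataFrame({"Name": [f"pkmn_{i}" for i in range(10)]})

train = toy.sample(frac=0.8, random_state=500)
test = toy.drop(train.index)

print(len(train), len(test))
print(train.index.intersection(test.index).empty)
```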