# Pokemon Classifier Algorithm

Team Starter Pokemon<br>
COGS118B FA21

Alexa Acosta<br>
Lian Lumada<br>
Ramzy Oncy-Avila<br>

[SLIDES](https://docs.google.com/presentation/d/19h5fS61qQCrhrNX4Yu0s71IPTsAPd-4Paam3E0TgaXg/edit?usp=sharing) / VIDEO

## Loading packages

In [1]:
import zipfile
zipfile.ZipFile('data_files.zip').extractall()

In [2]:
import imageio as iio
import matplotlib.pyplot as plt
import os
import pandas as pd
import numpy as np
import random
import pickle

## Data Import

In [4]:
# setting some helper variables
directory = "poke_photos/"
pokemon_names = []
pokemon_photos = []

# Load our Pokemon images from disk
for filename in os.listdir(directory):
    # Store the name of each Pokemon (the filename without its extension)
    name = os.path.splitext(filename)[0]
    pokemon_names.append(name)

    # Store the image of each Pokemon as a pixel array
    im = iio.imread(os.path.join(directory, filename))
    pokemon_photos.append(im)
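
The loop above reads every entry that `os.listdir` returns, in whatever order the file system reports them. A minimal sketch of a more defensive listing step follows. The helper `list_image_files` and the `.png` extension are assumptions made for illustration and are not part of the original notebook; the demo runs on empty placeholder files rather than real images.

```python
import os
import tempfile

def list_image_files(directory, extensions=(".png",)):
    """Return sorted (name, path) pairs for files with an allowed extension.

    Hypothetical helper: the extension filter is an assumption about the data.
    """
    pairs = []
    for filename in sorted(os.listdir(directory)):
        stem, ext = os.path.splitext(filename)
        if ext.lower() in extensions:
            pairs.append((stem, os.path.join(directory, filename)))
    return pairs

# Demo on a temporary folder containing two placeholder files and one stray file
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("bulbasaur.png", "abra.png", ".DS_Store"):
        open(os.path.join(tmp, fname), "wb").close()
    pairs = list_image_files(tmp)
    names = [name for name, _ in pairs]

print(names)
```

Each returned path could then be passed to `iio.imread`, which keeps hidden or unrelated files out of `pokemon_photos`.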

In [None]:
# Data frame with corresponding names and image matrix
images = pd.DataFrame(columns = ('Name', 'Image_Mat'))
images['Name'] = pokemon_names
images['Image_Mat'] = pokemon_photos

In [None]:
# Pokemon info data set
pokemon_data = pd.read_csv('pokemon_to_photos.csv')
pokemon_data = pokemon_data.sort_values("Name", ascending = True).reset_index(drop = True)

pokemon_data

## Data Wrangling

In [None]:
# Sort based on names and reset index
images = images.sort_values("Name", ascending = True).reset_index(drop = True)

images

In [None]:
# Combine the two datasets; both are sorted by Name, so their rows line up by index
pokemon_data['Image_Mat'] = images['Image_Mat']

pokemon_data
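
The column assignment above pairs rows purely by position, so it silently mismatches images if the two tables ever differ in their names. As a hedged alternative, the sketch below joins on `Name` with `DataFrame.merge` and lists any Pokemon left without an image. The toy tables and the string stand-ins for image arrays are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-ins for pokemon_data and images (values are illustrative only)
info = pd.DataFrame({
    "Name": ["abra", "bulbasaur", "charmander"],
    "Type1": ["psychic", "grass", "fire"],
})
photos = pd.DataFrame({
    "Name": ["charmander", "abra"],
    "Image_Mat": ["img_c", "img_a"],  # placeholders for the real pixel arrays
})

# Join on the shared key; validate raises if a name is duplicated on either side
merged = info.merge(photos, on="Name", how="left", validate="one_to_one")

# Rows with no matching image show up as missing values
missing = merged.loc[merged["Image_Mat"].isna(), "Name"].tolist()
print(missing)
```

A left join keeps every row of the info table, which makes gaps in the image folder visible instead of shifting later rows.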

Each row of `pokemon_data` corresponds to a unique Pokemon, described by four columns:
- `Name`: the name of the Pokemon in lowercase
- `Type1`: the primary type of the Pokemon
- `Type2`: the secondary type of the Pokemon; can be NULL
- `Image_Mat`: the array storing the RGBA values of each pixel in the 120x120 Pokemon image.
     - first index: specifies the pixel row
     - second index: specifies the pixel column
     - third index: selects one of the 4 RGBA values of the pixel
         - RGB: a value from 0 to 255
         - A: the opacity level of the pixel, 0 to 1 (transparent to opaque)
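
The row, column, and channel layout described above can be explored with plain NumPy indexing. The sketch below uses a synthetic 120x120x4 array; its `uint8` dtype and pixel values are assumptions, so it is worth checking `.dtype` and the min/max of the real images to confirm the actual value ranges.

```python
import numpy as np

# Synthetic stand-in for one Image_Mat entry (dtype and values are assumptions)
image = np.zeros((120, 120, 4), dtype=np.uint8)
image[10, 20] = [255, 128, 0, 255]  # set one pixel at row 10, column 20

pixel = image[10, 20]           # all four channel values of that pixel
red = image[10, 20, 0]          # a single channel of that pixel
alpha_channel = image[:, :, 3]  # the last channel for every pixel

# Inspecting dtype and range shows how the channels are actually encoded
print(image.shape, image.dtype, alpha_channel.min(), alpha_channel.max())
```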

## Custom Functions

In [None]:
# Function to view the image of the Pokemon at row index `val`
def view_pokemon(val):
    print("Pokemon: " + pokemon_data['Name'][val])
    plt.imshow(pokemon_data['Image_Mat'][val])

view_pokemon(0)

## Saving / Loading data

In [6]:
# Save our merged data into a file for easy access later on
#with open('data.pickle', 'wb') as f:
#    pickle.dump(pokemon_data, f)

# Load our merged data into a python variable
with open('data.pickle', 'rb') as f:
    pokemon_data = pickle.load(f)
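
The save and load steps above form a round trip: whatever is dumped should come back unchanged. A minimal sketch of that round trip on a small dictionary follows. The record and the temporary file name are illustrative assumptions, not the project's real data. Note that `pickle.load` can execute arbitrary code, so it should only be used on files from a trusted source.

```python
import os
import pickle
import tempfile

# Illustrative record; the real notebook pickles the full pokemon_data frame
record = {"Name": "abra", "Type1": "psychic", "Type2": None}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.pickle")

    # Write the object in binary mode
    with open(path, "wb") as f:
        pickle.dump(record, f)

    # Read it back in binary mode
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == record)
```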

## Training

In [None]:
# get the mean RGBA values of all 809 Pokemon in the dataset
mu_pkmn = []

for i in pokemon_data['Image_Mat']:
    mu_pkmn.append(np.mean(i, axis = (0,1))) # average over rows and columns: one mean per RGBA channel
    
mu_pkmn
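
To make the `axis = (0,1)` reduction concrete, here is a small worked example on a synthetic 2x2 image. The pixel values are made up for illustration. Averaging over both spatial axes collapses the image to a single 4-element vector, and fully transparent (all-zero) pixels still count toward that average.

```python
import numpy as np

# Synthetic 2x2 RGBA image: two coloured pixels, two all-zero pixels
image = np.zeros((2, 2, 4))
image[0, 0] = [200, 100, 0, 1]
image[1, 1] = [0, 100, 200, 1]

# Average over rows (axis 0) and columns (axis 1), keeping the channel axis
mean_rgba = np.mean(image, axis=(0, 1))
print(mean_rgba)  # one value per channel: R, G, B, A
```

Here each colour channel sums to 200 across four pixels, giving a mean of 50, and the last channel averages to 0.5.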

In [None]:
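# select 18 distinct Pokemon means as the initial mu of each of the 18 type clusters
# (random.sample draws without replacement, so no Pokemon is picked twice)
mu_k = random.sample(mu_pkmn, k = 18)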
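
The notebook stops at picking the initial cluster means. Assuming the next step follows a k-means-style procedure (an assumption, since the source does not show it), each Pokemon's mean RGBA vector would be assigned to its closest `mu_k` entry. The sketch below shows that assignment step with Euclidean distance; the helper `assign_to_nearest` and the toy vectors are hypothetical.

```python
import numpy as np

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of the closest centroid (Euclidean).

    Hypothetical helper illustrating a k-means-style assignment step.
    """
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Broadcast to a (n_points, n_centroids) matrix of distances
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

# Toy mean-RGBA vectors and two toy centroids
points = [[0, 0, 0, 1], [10, 10, 10, 1], [250, 250, 250, 1]]
centroids = [[5, 5, 5, 1], [255, 255, 255, 1]]

labels = assign_to_nearest(points, centroids)
print(labels)
```

With the real data, `points` would be `mu_pkmn` and `centroids` would be `mu_k`.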

In [None]:
# split data into 80/20 training and testing data sets
training_set = pokemon_data.sample(frac = 0.8, random_state = 500) # set seed to 500 for reproducible results
test_set = pokemon_data.drop(training_set.index)
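
A quick sanity check on a split like the one above is that the two sets are disjoint and together cover every row. The sketch below runs that check on a 10-row toy frame; the frame and its placeholder names are assumptions for illustration.

```python
import pandas as pd

# Toy frame standing in for pokemon_data
df = pd.DataFrame({"Name": [f"pkmn_{i}" for i in range(10)]})

# Same pattern as above: sample 80% for training, drop those rows for testing
train = df.sample(frac=0.8, random_state=500)
test = df.drop(train.index)

# No row appears in both sets, and no row is lost
no_overlap = set(train.index).isdisjoint(test.index)
full_cover = sorted(list(train.index) + list(test.index)) == list(df.index)
print(len(train), len(test), no_overlap, full_cover)
```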