# Factory Finder

A searchable list of the Pokemon available in the Battle Factory game mode of Pokemon Platinum.

More info:
http://www.psypokes.com/platinum/frontier_pokemon.php?region=johto

## Notes

- A search function that makes it easy to filter Pokemon by moveset

- Create a new column list_of_moves that aggregates move1,...,move4 into a list?
    - This would make it possible to check that no moves are duplicated within a set
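
A minimal sketch of the two ideas above, assuming the column names used later in this notebook (move1 to move4). The helper `find_by_moves` and the three sample rows are illustrative stand-ins, not part of the notebook's data pipeline.

```python
import pandas as pd

# Tiny stand-in for the factory list; the real rows come from HGSS_Factory_List.txt
sample = pd.DataFrame({
    'name':  ['Abomasnow', 'Abomasnow', 'Absol'],
    'move1': ['Giga Drain', 'Giga Drain', 'Sucker Punch'],
    'move2': ['Ice Beam', 'Sheer Cold', 'Facade'],
    'move3': ['Water Pulse', 'Ingrain', 'Double Team'],
    'move4': ['GrassWhistle', 'Leech Seed', 'Taunt'],
})

move_cols = ['move1', 'move2', 'move3', 'move4']

# Aggregate the four move columns into one list per row
sample['list_of_moves'] = sample[move_cols].to_numpy().tolist()

# Flag rows where the same move appears more than once
sample['has_duplicate_move'] = sample['list_of_moves'].apply(
    lambda moves: len(set(moves)) != len(moves)
)

def find_by_moves(df, *moves):
    """Return rows whose list_of_moves contains every requested move (hypothetical helper)."""
    wanted = {m.lower() for m in moves}
    mask = df['list_of_moves'].apply(
        lambda row_moves: wanted <= {m.lower() for m in row_moves}
    )
    return df[mask]

# Case-insensitive search for sets that know both moves
matches = find_by_moves(sample, 'giga drain', 'ice beam')
```

Keeping the moves as a list trades some query speed for convenience; for a table of this size that should not matter.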

## Import libraries

In [1]:
import pandas as pd

## Import battle factory list data

In [2]:
# Show all files in directory
!ls

HGSS_Factory_List.txt factory_finder.ipynb
HGSS_Speed_Tiers.txt  notes.txt


In [3]:
# Path to file
textfile = './HGSS_Factory_List.txt'

# List of column names
columns_list = ['number', 'name', 'nature', 'held_item', 'move1', 'move2', 'move3', 'move4', 'ivs']

In [4]:
# Import pipe-separated data
data = pd.read_csv(textfile, sep='|', names = columns_list)

In [5]:
# Preview data
data.head()

Unnamed: 0,number,name,nature,held_item,move1,move2,move3,move4,ivs
0,406,Abomasnow,Calm,Lum Berry,Giga Drain,Ice Beam,Water Pulse,GrassWhistle,HP/SpA
1,542,Abomasnow,Bold,Big Root,Giga Drain,Sheer Cold,Ingrain,Leech Seed,HP/Def/SpD
2,678,Abomasnow,Quiet,Occa Berry,Energy Ball,Blizzard,Shadow Ball,Focus Blast,HP/SpA
3,814,Abomasnow,Brave,Shell Bell,Wood Hammer,Avalanche,Earthquake,Rock Slide,HP/Atk
4,374,Absol,Jolly,Chople Berry,Sucker Punch,Facade,Double Team,Taunt,Atk/Spe


## Clean battle factory data

In [6]:
# Remove any excess trailing whitespace
data_trimmed = data.apply(lambda x: x.str.strip() if x.dtype == "object" else x)

In [7]:
# Preview data
data_trimmed.head()

Unnamed: 0,number,name,nature,held_item,move1,move2,move3,move4,ivs
0,406,Abomasnow,Calm,Lum Berry,Giga Drain,Ice Beam,Water Pulse,GrassWhistle,HP/SpA
1,542,Abomasnow,Bold,Big Root,Giga Drain,Sheer Cold,Ingrain,Leech Seed,HP/Def/SpD
2,678,Abomasnow,Quiet,Occa Berry,Energy Ball,Blizzard,Shadow Ball,Focus Blast,HP/SpA
3,814,Abomasnow,Brave,Shell Bell,Wood Hammer,Avalanche,Earthquake,Rock Slide,HP/Atk
4,374,Absol,Jolly,Chople Berry,Sucker Punch,Facade,Double Team,Taunt,Atk/Spe


## Add additional columns

### Groups
Entry numbers map to sets as follows (ranges are inclusive):

- 351-486 is set1
- 487-622 is set2
- 623-758 is set3
- 759-894 is set4

In [8]:
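# Entry-number ranges per group; range() excludes its end value
poke_group_a = range(1,351)    # entries 1-350
poke_group_b = range(351,895)  # entries 351-894, i.e. set1 through set4 above
poke_group_c = range(895,950)  # entries after set4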
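
The set ranges listed under Groups are each 136 entries wide, so a set number can be computed from the entry number directly. This is only a sketch: `set_from_number` is a hypothetical helper, and it assumes the 351-894 boundaries stated above are correct.

```python
def set_from_number(number):
    """Map a factory entry number to its set (1-4); return None outside 351-894."""
    first, last, size = 351, 894, 136  # boundaries taken from the Groups list
    if not first <= number <= last:
        return None
    return (number - first) // size + 1

# The four Abomasnow entries from the preview fall into sets 1 through 4
abomasnow_sets = [set_from_number(n) for n in (406, 542, 678, 814)]
```

With the factory list loaded, something like `data_trimmed['number'].apply(set_from_number)` would add the result as a column.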

## Import speed tier data

In [9]:
# Path to speed tier file
speed_tiers_path = './HGSS_Speed_Tiers.txt'

# Import newline-separated speedtiers
speed = pd.read_csv(speed_tiers_path, sep='\n',header=None)
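
Some pandas versions may reject `sep='\n'` with a ValueError. If that happens, one alternative is to read the lines with plain Python and build the single-column dataframe by hand. The sketch below uses an in-memory stand-in for the file (three lines copied from the preview) and a hypothetical name `speed_alt`.

```python
import io
import pandas as pd

# Stand-in for open('./HGSS_Speed_Tiers.txt'); swap in the real file handle when running the notebook
fake_file = io.StringIO(
    "250 Linoone 2 (Choice Scarf)\n"
    "200 Aerodactyl 12, Crobat 1, Jolteon 13\n"
    "194 Weavile 24\n"
)

# Keep one entry per non-empty line, with surrounding whitespace removed
lines = [line.strip() for line in fake_file if line.strip()]

# Single column labelled 0, matching the layout the later cells expect
speed_alt = pd.DataFrame({0: lines})
```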

In [10]:
# Preview data
speed.head()

Unnamed: 0,0
0,250 Linoone 2 (Choice Scarf)
1,234 Porygon-Z 2 (Choice Scarf)
2,211 Electrode 23
3,"200 Aerodactyl 12, Crobat 1, Jolteon 13"
4,194 Weavile 24


In [11]:
# Separate file into columns after importing to deal with inconsistent number of columns per row
speed = speed[0].str.split(',', expand=True)

In [12]:
# Preview data
speed.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
0,250 Linoone 2 (Choice Scarf),,,,,,,,,,,,,,,,,,
1,234 Porygon-Z 2 (Choice Scarf),,,,,,,,,,,,,,,,,,
2,211 Electrode 23,,,,,,,,,,,,,,,,,,
3,200 Aerodactyl 12,Crobat 1,Jolteon 13,,,,,,,,,,,,,,,,
4,194 Weavile 24,,,,,,,,,,,,,,,,,,


In [13]:
# Take the first three characters of column 0 as the new 'speed' column and keep the remainder in column 0
speed['speed'] = speed[0].str[:3]
speed[0] = speed[0].str[3:]
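
Slicing a fixed three characters assumes every speed has three digits, but the tier list also contains two-digit speeds such as 55 and 25, so the slice leaves stray spaces on one side or the other. A possible alternative, sketched here on two sample strings, is to split once on the first whitespace; the variable names are illustrative.

```python
import pandas as pd

# Two sample entries: one three-digit speed, one two-digit speed
first_col = pd.Series(['250 Linoone 2 (Choice Scarf)', '55 Quagsire 1234'])

# Split once on the first run of whitespace: speed on the left, the rest on the right
parts = first_col.str.split(n=1, expand=True)

speed_value = parts[0].astype(int)   # numeric, so it sorts as a number
remainder = parts[1].str.strip()     # name text without stray spaces
```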

In [14]:
# Preview data
speed.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,speed
0,Linoone 2 (Choice Scarf),,,,,,,,,,,,,,,,,,,250
1,Porygon-Z 2 (Choice Scarf),,,,,,,,,,,,,,,,,,,234
2,Electrode 23,,,,,,,,,,,,,,,,,,,211
3,Aerodactyl 12,Crobat 1,Jolteon 13,,,,,,,,,,,,,,,,,200
4,Weavile 24,,,,,,,,,,,,,,,,,,,194


## Rearrange and sort speed tiers

In [15]:
# Move speed to first column
cols_to_order = ['speed']
new_columns = cols_to_order + (speed.columns.drop(cols_to_order).tolist())

# Rewrite newly ordered columns index
speed = speed[new_columns]

In [None]:
# Preview data
speed.head()

Unnamed: 0,speed,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
0,250,Linoone 2 (Choice Scarf),,,,,,,,,,,,,,,,,,
1,234,Porygon-Z 2 (Choice Scarf),,,,,,,,,,,,,,,,,,
2,211,Electrode 23,,,,,,,,,,,,,,,,,,
3,200,Aerodactyl 12,Crobat 1,Jolteon 13,,,,,,,,,,,,,,,,
4,194,Weavile 24,,,,,,,,,,,,,,,,,,

## Melting the speed tier dataframe
Ungroup the data so that each Pokemon is listed on its own row next to its speed.

In [16]:
# Select all columns except 'speed' column to be melted
cols_to_melt = speed.columns.drop('speed')

# Melt dataframe speed so that each pokemon is listed next to its speed tier
speed_tiers = pd.melt(speed, id_vars = ['speed'], value_vars = cols_to_melt, value_name='name')

# Drop the 'variable' column, which only holds the original column labels, leaving speed and name
speed_tiers = speed_tiers[speed_tiers.columns.drop('variable')]

# Remove rows with NA
speed_tiers = speed_tiers.dropna(axis=0)

# Sort rows by column 'name' in ascending alphabetical order
speed_tiers = speed_tiers.sort_values('name')

In [17]:
# Preview data
speed_tiers

Unnamed: 0,speed,name
42,127,Absol 4
33,139,Absol1 3
3,200,Aerodactyl 12
9,182,Aerodactyl 34
6,189,Alakazam 13
...,...,...
81,55,Quagsire 1234
87,25,Shuckle 1234
86,40,Torkoal 2
72,68,Torterra 2


## Pokemon set number column

In the original dataset, Pokemon are listed as 'Quagsire 1234', where each digit of '1234' is a set number of that Pokemon that has the listed speed.

Now that every named Pokemon in the speed tier list is on its own row, we can split the set numbers off into their own column to further simplify the data structure.
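
One way this split could look, as a sketch on three sample rows. It assumes every entry has the shape "name, set digits, optional note in parentheses"; the regular expression, the column names `pokemon`, `sets` and `note`, and the name `tiers_long` are all illustrative choices.

```python
import pandas as pd

# Sample rows in the same shape as speed_tiers (one entry keeps a leading space on purpose)
tiers = pd.DataFrame({
    'speed': [55, 250, 200],
    'name': ['Quagsire 1234', ' Linoone 2 (Choice Scarf)', 'Aerodactyl 12'],
})

# Pokemon name, then set digits 1-4, then an optional parenthesised note such as an item
pattern = r'^\s*(?P<pokemon>.+?)\s+(?P<sets>[1-4]+)(?:\s+\((?P<note>[^)]*)\))?\s*$'
parsed = tiers['name'].str.extract(pattern)

# Turn '1234' into ['1', '2', '3', '4'] so each set can get its own row
parsed['sets'] = parsed['sets'].apply(list)

# One row per (pokemon, set) pair, with the set number stored as an integer
tiers_long = pd.concat([tiers[['speed']], parsed], axis=1).explode('sets')
tiers_long['sets'] = tiers_long['sets'].astype(int)
```

Entries that do not match the pattern would come back as NaN from `str.extract` and would need handling before the `apply(list)` step.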

## Code snippets

Work-in-progress experiments. The cell below sketches the set-number idea and has not been run against the full dataset.

In [None]:
# Number each Pokemon's entries in order of appearance:
# the first row for a name gets set_no 1, the next row for that name gets 2, and so on.
# Assumes the rows for each name are already ordered by 'number', as in the previews above.
data_trimmed['set_no'] = data_trimmed.groupby('name').cumcount() + 1

data_trimmed.head()