# API Calling and Transformation of Pokemon Data

## David Berberena

## 5/19/2024

In [1]:
# To work with APIs, I need to import the requests library, which allows the user to connect to the API through Python. To 
# convert the API data into a DataFrame and subsequently transform that data, I will be importing pandas as well. 

import requests
import pandas as pd

# As I have worked with APIs before, I know that the easiest way to request a large amount of information is to create a
# function that parses the API data and stores the specific values I wish to capture. This function captures the most
# pertinent Pokémon data needed for my dataset: the PokéDex number (id), the Pokémon name (name), the six base stats (the
# base_stat value in the hp, attack, defense, special-attack, special-defense, and speed entries of the stats list), and the
# Pokémon type (name in the type dictionary within the types list). As I showed in a previous milestone, many Pokémon have
# two types, a primary and a secondary type, so I have crafted the function to parse each type from the types list within
# the API data.

# The function takes the PokéDex number as a parameter so the API can be called once for each Pokémon. As it stands today,
# there are 1025 Pokémon, so the function needs to be run that many times. The API URL is written as an f-string so that
# it can take whatever number is passed in for the pokemon_id parameter.

def api_pokemon_data(pokemon_id):
    api_url = f'https://pokeapi.co/api/v2/pokemon/{pokemon_id}/'
    response = requests.get(api_url)
    
# The requests library performs the GET request above to access the API data at the URL. The if statement below checks for
# a 200 (OK) status code and, if the request succeeded, parses the JSON response body into a Python dictionary.
    
    if response.status_code == 200:
        data = response.json()
        
# After reviewing the API documentation, I saw that the primary and secondary types are stored under slot 1 and slot 2
# designations, so I parse the name of each type based on how many entries the types list in the JSON data structure
# contains, using conditional expressions.
        
        types = data['types']
        primary_type = types[0]['type']['name'] if len(types) > 0 else None
        secondary_type = types[1]['type']['name'] if len(types) > 1 else None
        
# The elements needed for my dataset are collected into a dictionary in the block of code below, with the keys being the
# column names and the values being the observations extracted from the API data. Since the stats list holds several
# base_stat values, each generator expression filters the list by stat name and next() returns the first (and only) match,
# which ensures the function grabs the right value for each stat. The primary and secondary type variables made above are
# also included here.

        pokemon_info = {
            'id': data['id'],
            'name': data['name'],
            'hp': next(stat['base_stat'] for stat in data['stats'] if stat['stat']['name'] == 'hp'),
            'attack': next(stat['base_stat'] for stat in data['stats'] if stat['stat']['name'] == 'attack'),
            'defense': next(stat['base_stat'] for stat in data['stats'] if stat['stat']['name'] == 'defense'),
            'special_attack': next(stat['base_stat'] for stat in data['stats'] if stat['stat']['name'] == 'special-attack'),
            'special_defense': next(stat['base_stat'] for stat in data['stats'] if stat['stat']['name'] == 'special-defense'),
            'speed': next(stat['base_stat'] for stat in data['stats'] if stat['stat']['name'] == 'speed'),
            'primary_type': primary_type,
            'secondary_type': secondary_type
        }
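        return pokemon_info

# If the request did not succeed, the function returns None so that the calling loop can skip that Pokémon.

    return None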

# To run this function once for every Pokémon, I have created a for loop that takes each number in the range of available
# PokéDex numbers (1 through 1025), passes it to the API call function to extract the statistics for that Pokémon, and
# appends the result to a list until all of the API calls have been made. To verify that each Pokémon's data has been parsed
# successfully, a print statement reports the outcome of each call.

all_pokemon_data = []

for pokemon_id in range(1, 1026):
    pokemon_data = api_pokemon_data(pokemon_id)
    if pokemon_data:
        all_pokemon_data.append(pokemon_data)
        print(f"Parsed Pokémon with PokéDex Number {pokemon_id}")
    else:
        print(f"Could not parse Pokémon with PokéDex Number {pokemon_id}")
    
# The list of Pokémon statistics can now be converted into a DataFrame using pd.DataFrame() from pandas.

all_pokemon = pd.DataFrame(all_pokemon_data)

Parsed Pokémon with PokéDex Number 1
Parsed Pokémon with PokéDex Number 2
Parsed Pokémon with PokéDex Number 3
Parsed Pokémon with PokéDex Number 4
Parsed Pokémon with PokéDex Number 5
Parsed Pokémon with PokéDex Number 6
Parsed Pokémon with PokéDex Number 7
Parsed Pokémon with PokéDex Number 8
Parsed Pokémon with PokéDex Number 9
Parsed Pokémon with PokéDex Number 10
Parsed Pokémon with PokéDex Number 11
Parsed Pokémon with PokéDex Number 12
Parsed Pokémon with PokéDex Number 13
Parsed Pokémon with PokéDex Number 14
Parsed Pokémon with PokéDex Number 15
Parsed Pokémon with PokéDex Number 16
Parsed Pokémon with PokéDex Number 17
Parsed Pokémon with PokéDex Number 18
Parsed Pokémon with PokéDex Number 19
Parsed Pokémon with PokéDex Number 20
Parsed Pokémon with PokéDex Number 21
Parsed Pokémon with PokéDex Number 22
Parsed Pokémon with PokéDex Number 23
Parsed Pokémon with PokéDex Number 24
Parsed Pokémon with PokéDex Number 25
Parsed Pokémon with PokéDex Number 26
...

Parsed Pokémon with PokéDex Number 214
Parsed Pokémon with PokéDex Number 215
Parsed Pokémon with PokéDex Number 216
Parsed Pokémon with PokéDex Number 217
Parsed Pokémon with PokéDex Number 218
Parsed Pokémon with PokéDex Number 219
Parsed Pokémon with PokéDex Number 220
Parsed Pokémon with PokéDex Number 221
Parsed Pokémon with PokéDex Number 222
Parsed Pokémon with PokéDex Number 223
Parsed Pokémon with PokéDex Number 224
Parsed Pokémon with PokéDex Number 225
Parsed Pokémon with PokéDex Number 226
Parsed Pokémon with PokéDex Number 227
Parsed Pokémon with PokéDex Number 228
Parsed Pokémon with PokéDex Number 229
Parsed Pokémon with PokéDex Number 230
Parsed Pokémon with PokéDex Number 231
Parsed Pokémon with PokéDex Number 232
Parsed Pokémon with PokéDex Number 233
Parsed Pokémon with PokéDex Number 234
Parsed Pokémon with PokéDex Number 235
Parsed Pokémon with PokéDex Number 236
Parsed Pokémon with PokéDex Number 237
Parsed Pokémon with PokéDex Number 238
...

Parsed Pokémon with PokéDex Number 425
Parsed Pokémon with PokéDex Number 426
Parsed Pokémon with PokéDex Number 427
Parsed Pokémon with PokéDex Number 428
Parsed Pokémon with PokéDex Number 429
Parsed Pokémon with PokéDex Number 430
Parsed Pokémon with PokéDex Number 431
Parsed Pokémon with PokéDex Number 432
Parsed Pokémon with PokéDex Number 433
Parsed Pokémon with PokéDex Number 434
Parsed Pokémon with PokéDex Number 435
Parsed Pokémon with PokéDex Number 436
Parsed Pokémon with PokéDex Number 437
Parsed Pokémon with PokéDex Number 438
Parsed Pokémon with PokéDex Number 439
Parsed Pokémon with PokéDex Number 440
Parsed Pokémon with PokéDex Number 441
Parsed Pokémon with PokéDex Number 442
Parsed Pokémon with PokéDex Number 443
Parsed Pokémon with PokéDex Number 444
Parsed Pokémon with PokéDex Number 445
Parsed Pokémon with PokéDex Number 446
Parsed Pokémon with PokéDex Number 447
Parsed Pokémon with PokéDex Number 448
Parsed Pokémon with PokéDex Number 449
...

Parsed Pokémon with PokéDex Number 636
Parsed Pokémon with PokéDex Number 637
Parsed Pokémon with PokéDex Number 638
Parsed Pokémon with PokéDex Number 639
Parsed Pokémon with PokéDex Number 640
Parsed Pokémon with PokéDex Number 641
Parsed Pokémon with PokéDex Number 642
Parsed Pokémon with PokéDex Number 643
Parsed Pokémon with PokéDex Number 644
Parsed Pokémon with PokéDex Number 645
Parsed Pokémon with PokéDex Number 646
Parsed Pokémon with PokéDex Number 647
Parsed Pokémon with PokéDex Number 648
Parsed Pokémon with PokéDex Number 649
Parsed Pokémon with PokéDex Number 650
Parsed Pokémon with PokéDex Number 651
Parsed Pokémon with PokéDex Number 652
Parsed Pokémon with PokéDex Number 653
Parsed Pokémon with PokéDex Number 654
Parsed Pokémon with PokéDex Number 655
Parsed Pokémon with PokéDex Number 656
Parsed Pokémon with PokéDex Number 657
Parsed Pokémon with PokéDex Number 658
Parsed Pokémon with PokéDex Number 659
Parsed Pokémon with PokéDex Number 660
...

Parsed Pokémon with PokéDex Number 847
Parsed Pokémon with PokéDex Number 848
Parsed Pokémon with PokéDex Number 849
Parsed Pokémon with PokéDex Number 850
Parsed Pokémon with PokéDex Number 851
Parsed Pokémon with PokéDex Number 852
Parsed Pokémon with PokéDex Number 853
Parsed Pokémon with PokéDex Number 854
Parsed Pokémon with PokéDex Number 855
Parsed Pokémon with PokéDex Number 856
Parsed Pokémon with PokéDex Number 857
Parsed Pokémon with PokéDex Number 858
Parsed Pokémon with PokéDex Number 859
Parsed Pokémon with PokéDex Number 860
Parsed Pokémon with PokéDex Number 861
Parsed Pokémon with PokéDex Number 862
Parsed Pokémon with PokéDex Number 863
Parsed Pokémon with PokéDex Number 864
Parsed Pokémon with PokéDex Number 865
Parsed Pokémon with PokéDex Number 866
Parsed Pokémon with PokéDex Number 867
Parsed Pokémon with PokéDex Number 868
Parsed Pokémon with PokéDex Number 869
Parsed Pokémon with PokéDex Number 870
Parsed Pokémon with PokéDex Number 871
...
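
The stat lookup in the function above scans the stats list six times with next(). As a minimal sketch of an alternative, the parsing step could index the list once by stat name and order the types by their slot number. The names `STAT_KEYS` and `parse_pokemon` are hypothetical, and the sample payload is hand-built from the Charmander values shown in this notebook to mimic the shape of the API response, so no network call is made.

```python
# Hypothetical mapping: output column name -> stat name used in the API's 'stats' list.
STAT_KEYS = {
    'hp': 'hp',
    'attack': 'attack',
    'defense': 'defense',
    'special_attack': 'special-attack',
    'special_defense': 'special-defense',
    'speed': 'speed',
}


def parse_pokemon(data):
    """Flatten one Pokémon JSON payload into a single row dictionary."""
    # Index the stats list once by stat name instead of scanning it six times.
    stats = {entry['stat']['name']: entry['base_stat'] for entry in data['stats']}
    # Order the types by slot number rather than relying on list order.
    ordered = sorted(data['types'], key=lambda entry: entry['slot'])
    types = [entry['type']['name'] for entry in ordered]

    row = {'id': data['id'], 'name': data['name']}
    row.update({column: stats[key] for column, key in STAT_KEYS.items()})
    row['primary_type'] = types[0] if types else None
    row['secondary_type'] = types[1] if len(types) > 1 else None
    return row


# Hand-built payload shaped like the API response (assumption: only the fields used here).
sample = {
    'id': 4,
    'name': 'charmander',
    'stats': [
        {'base_stat': 39, 'stat': {'name': 'hp'}},
        {'base_stat': 52, 'stat': {'name': 'attack'}},
        {'base_stat': 43, 'stat': {'name': 'defense'}},
        {'base_stat': 60, 'stat': {'name': 'special-attack'}},
        {'base_stat': 50, 'stat': {'name': 'special-defense'}},
        {'base_stat': 65, 'stat': {'name': 'speed'}},
    ],
    'types': [{'slot': 1, 'type': {'name': 'fire'}}],
}

sample_row = parse_pokemon(sample)
print(sample_row)
```

Keeping the parsing separate from the request also means the logic can be checked offline before making 1025 API calls.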

In [2]:
# To show that the Pokémon data has been correctly parsed and converted into a DataFrame, I will display the whole dataset,
# as this shows me the shape of the DataFrame in addition to the first and last 5 observations.

all_pokemon

Unnamed: 0,id,name,hp,attack,defense,special_attack,special_defense,speed,primary_type,secondary_type
0,1,bulbasaur,45,49,49,65,65,45,grass,poison
1,2,ivysaur,60,62,63,80,80,60,grass,poison
2,3,venusaur,80,82,83,100,100,80,grass,poison
3,4,charmander,39,52,43,60,50,65,fire,
4,5,charmeleon,58,64,58,80,65,80,fire,
...,...,...,...,...,...,...,...,...,...,...
1020,1021,raging-bolt,125,73,91,137,89,75,electric,dragon
1021,1022,iron-boulder,90,120,80,68,108,124,rock,psychic
1022,1023,iron-crown,90,72,100,122,108,98,steel,psychic
1023,1024,terapagos,90,65,85,65,85,60,normal,
1024,1025,pecharunt,88,88,160,88,88,88,poison,ghost


## Data Transformation 1: Replace Headers

In [3]:
# To match the headers of the two previous datasets that have been transformed for this term project, I will replace the 
# headers accordingly.

real_headers = ['Pokedex_entry_number', 'Pokemon_name', 'HP_stat', 'Attack_stat', 'Defense_stat', 'Special_attack_stat',
                'Special_defense_stat', 'Speed_stat', 'Primary_type', 'Secondary_type']
all_pokemon.columns = real_headers

# The head() function is used simply to verify that the transformation of the data has been performed correctly.

all_pokemon.head()

Unnamed: 0,Pokedex_entry_number,Pokemon_name,HP_stat,Attack_stat,Defense_stat,Special_attack_stat,Special_defense_stat,Speed_stat,Primary_type,Secondary_type
0,1,bulbasaur,45,49,49,65,65,45,grass,poison
1,2,ivysaur,60,62,63,80,80,60,grass,poison
2,3,venusaur,80,82,83,100,100,80,grass,poison
3,4,charmander,39,52,43,60,50,65,fire,
4,5,charmeleon,58,64,58,80,65,80,fire,
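
Assigning a list to `columns` depends on the columns being in exactly the expected order. As a hedged alternative, the sketch below renames by name with a dictionary and asks pandas to raise an error if an expected column is absent. The names `HEADER_MAP` and `apply_headers` are hypothetical, and the one-row DataFrame is built from the Bulbasaur values shown above.

```python
import pandas as pd

# Hypothetical mapping from the API-derived column names to the project headers.
HEADER_MAP = {
    'id': 'Pokedex_entry_number',
    'name': 'Pokemon_name',
    'hp': 'HP_stat',
    'attack': 'Attack_stat',
    'defense': 'Defense_stat',
    'special_attack': 'Special_attack_stat',
    'special_defense': 'Special_defense_stat',
    'speed': 'Speed_stat',
    'primary_type': 'Primary_type',
    'secondary_type': 'Secondary_type',
}


def apply_headers(df):
    """Return a copy of df with project headers; raise KeyError if a column is missing."""
    return df.rename(columns=HEADER_MAP, errors='raise')


# One-row stand-in for the full dataset.
demo = pd.DataFrame([{
    'id': 1, 'name': 'bulbasaur', 'hp': 45, 'attack': 49, 'defense': 49,
    'special_attack': 65, 'special_defense': 65, 'speed': 45,
    'primary_type': 'grass', 'secondary_type': 'poison',
}])

renamed = apply_headers(demo)
print(renamed.columns.tolist())
```

Renaming by name keeps each header attached to the right data even if the column order changes upstream.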


## Data Transformation 2: Fill in Missing Values

In [4]:
# As we saw in the dataset from the second milestone, Pokémon that do not have a secondary type are given a missing value.
# In this dataset, the API function stores that missing value as Python's None. Since pandas treats None as a missing value
# (just like NaN), it can be filled in using fillna(). I will replace each missing value with that Pokémon's primary type
# so that both types are the same for Pokémon that only have a primary type.

all_pokemon['Secondary_type'] = all_pokemon['Secondary_type'].fillna(all_pokemon['Primary_type'])

# The head() function is used simply to verify that the transformation of the data has been performed correctly.

all_pokemon.head()

Unnamed: 0,Pokedex_entry_number,Pokemon_name,HP_stat,Attack_stat,Defense_stat,Special_attack_stat,Special_defense_stat,Speed_stat,Primary_type,Secondary_type
0,1,bulbasaur,45,49,49,65,65,45,grass,poison
1,2,ivysaur,60,62,63,80,80,60,grass,poison
2,3,venusaur,80,82,83,100,100,80,grass,poison
3,4,charmander,39,52,43,60,50,65,fire,fire
4,5,charmeleon,58,64,58,80,65,80,fire,fire
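
Once the secondary type is filled in, single-type Pokémon can only be identified by comparing the two type columns. The sketch below, on a three-row stand-in for the dataset, first confirms that pandas counts None as missing and then records that information in a flag column before filling. The column name `Is_single_type` is a hypothetical addition, not part of the project's datasets.

```python
import pandas as pd

# Three-row stand-in for the dataset; None marks a Pokémon with no secondary type.
demo = pd.DataFrame({
    'Primary_type': ['grass', 'fire', 'water'],
    'Secondary_type': ['poison', None, None],
})

# pandas counts None as a missing value, so isna() finds it.
missing_before = int(demo['Secondary_type'].isna().sum())

# Hypothetical flag column: remember which Pokémon had only one type before filling.
demo['Is_single_type'] = demo['Secondary_type'].isna()

# Fill each missing secondary type with the primary type from the same row.
demo['Secondary_type'] = demo['Secondary_type'].fillna(demo['Primary_type'])
missing_after = int(demo['Secondary_type'].isna().sum())

print(missing_before, missing_after)
print(demo)
```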


## Data Transformation 3: Fix Letter Casing

In [5]:
# To match the casing of my previous two datasets, I will fix the casing of the string columns so that the first letter of
# each word in a data point is capitalized. This is accomplished using the str.title() method.

all_pokemon['Pokemon_name'] = all_pokemon['Pokemon_name'].str.title()
all_pokemon['Primary_type'] = all_pokemon['Primary_type'].str.title()
all_pokemon['Secondary_type'] = all_pokemon['Secondary_type'].str.title()

# The head() function is used simply to verify that the transformation of the data has been performed correctly.

all_pokemon.head()

Unnamed: 0,Pokedex_entry_number,Pokemon_name,HP_stat,Attack_stat,Defense_stat,Special_attack_stat,Special_defense_stat,Speed_stat,Primary_type,Secondary_type
0,1,Bulbasaur,45,49,49,65,65,45,Grass,Poison
1,2,Ivysaur,60,62,63,80,80,60,Grass,Poison
2,3,Venusaur,80,82,83,100,100,80,Grass,Poison
3,4,Charmander,39,52,43,60,50,65,Fire,Fire
4,5,Charmeleon,58,64,58,80,65,80,Fire,Fire
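
To illustrate how str.title() treats multi-word values, here is a small comparison with str.capitalize() on a few sample names. The hyphenated names are illustrative inputs in the lowercase style the API returns.

```python
import pandas as pd

# Illustrative names in the lowercase, hyphenated style returned by the API.
names = pd.Series(['bulbasaur', 'raging-bolt', 'ho-oh'])

# str.title() capitalizes the first letter of every word, including after a hyphen.
titled = names.str.title()

# str.capitalize() only capitalizes the first character of the whole string.
capitalized = names.str.capitalize()

print(titled.tolist())
print(capitalized.tolist())
```

Because every word is capitalized, str.title() is the better fit here: names made of two words come out correctly cased once the hyphen is replaced with a space.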


## Data Transformation 4: Remove Special Characters

In [6]:
# Looking back at the initial version of the dataset gathered and converted from the API data, the last few observations had
# dashes instead of spaces in some of the Pokémon names. I will replace these dashes with spaces using the str.replace()
# method.

all_pokemon['Pokemon_name'] = all_pokemon['Pokemon_name'].str.replace('-', ' ')

# The tail() function is used this time to verify that the transformation of the data has been performed correctly.

all_pokemon.tail()

Unnamed: 0,Pokedex_entry_number,Pokemon_name,HP_stat,Attack_stat,Defense_stat,Special_attack_stat,Special_defense_stat,Speed_stat,Primary_type,Secondary_type
1020,1021,Raging Bolt,125,73,91,137,89,75,Electric,Dragon
1021,1022,Iron Boulder,90,120,80,68,108,124,Rock,Psychic
1022,1023,Iron Crown,90,72,100,122,108,98,Steel,Psychic
1023,1024,Terapagos,90,65,85,65,85,60,Normal,Normal
1024,1025,Pecharunt,88,88,160,88,88,88,Poison,Ghost
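
Replacing every dash also affects names whose official spelling contains a hyphen, such as Ho-Oh and Porygon-Z. A minimal sketch of one way to protect such names is shown below. The `KEEP_HYPHEN` set is an assumption that is illustrative rather than exhaustive, and `hyphens_to_spaces` is a hypothetical helper.

```python
import pandas as pd

# Assumption: names whose hyphen is part of the official spelling (not an exhaustive list).
KEEP_HYPHEN = frozenset({'Ho-Oh', 'Porygon-Z'})


def hyphens_to_spaces(names, keep=KEEP_HYPHEN):
    """Replace hyphens with spaces, except in names listed in keep."""
    # regex=False treats '-' as a literal character on every pandas version.
    replaced = names.str.replace('-', ' ', regex=False)
    # Keep the replaced value unless the original name is protected.
    return replaced.where(~names.isin(keep), names)


names = pd.Series(['Raging-Bolt', 'Iron-Crown', 'Ho-Oh', 'Terapagos'])
cleaned = hyphens_to_spaces(names)
print(cleaned.tolist())
```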


## Data Transformation 5: Add Useful Columns

In [7]:
# With this dataset being the most encompassing of the datasets I have worked with so far in this project, I wanted to add
# the columns that were missing from the last two datasets: the sum of the six Pokémon stats, the average of these stats,
# and the standard deviation of the stats. Each column is created from these stats using the sum() method, the mean()
# method, and NumPy's std() function (which computes the population standard deviation by default), respectively. I have
# imported NumPy here.

import numpy as np

all_pokemon['Total_stats'] = all_pokemon[['HP_stat', 'Attack_stat', 'Defense_stat', 'Special_attack_stat', 
                                          'Special_defense_stat', 'Speed_stat']].sum(axis=1)

# For the average and standard deviation, I chose to round off the answer to the nearest hundredth place with the 
# round() function with 2 as the argument.

all_pokemon['Avg_of_stats'] = all_pokemon[['HP_stat', 'Attack_stat', 'Defense_stat', 'Special_attack_stat', 
                                           'Special_defense_stat', 'Speed_stat']].mean(axis=1).round(2)

all_pokemon['Deviation_of_stats'] = np.std(all_pokemon[['HP_stat', 'Attack_stat', 'Defense_stat', 'Special_attack_stat', 
                                                 'Special_defense_stat', 'Speed_stat']], axis=1).round(2)

# The head() function is used simply to verify that the transformation of the data has been performed correctly.

all_pokemon.head()

Unnamed: 0,Pokedex_entry_number,Pokemon_name,HP_stat,Attack_stat,Defense_stat,Special_attack_stat,Special_defense_stat,Speed_stat,Primary_type,Secondary_type,Total_stats,Avg_of_stats,Deviation_of_stats
0,1,Bulbasaur,45,49,49,65,65,45,Grass,Poison,318,53.0,8.64
1,2,Ivysaur,60,62,63,80,80,60,Grass,Poison,405,67.5,8.9
2,3,Venusaur,80,82,83,100,100,80,Grass,Poison,525,87.5,8.9
3,4,Charmander,39,52,43,60,50,65,Fire,Fire,309,51.5,9.0
4,5,Charmeleon,58,64,58,80,65,80,Fire,Fire,405,67.5,9.23
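
NumPy's std() and the pandas std() method use different defaults: NumPy divides by n (population standard deviation, ddof=0), while pandas divides by n - 1 (sample standard deviation, ddof=1). The sketch below compares the two on Bulbasaur's stats from the table above. The constant `STAT_COLUMNS` is a hypothetical convenience name.

```python
import numpy as np
import pandas as pd

# Hypothetical constant so the six stat columns are listed only once.
STAT_COLUMNS = ['HP_stat', 'Attack_stat', 'Defense_stat',
                'Special_attack_stat', 'Special_defense_stat', 'Speed_stat']

# Bulbasaur's six base stats, taken from the table above.
demo = pd.DataFrame([[45, 49, 49, 65, 65, 45]], columns=STAT_COLUMNS)

# Population standard deviation (divide by n), which is NumPy's default.
numpy_std = np.std(demo[STAT_COLUMNS].to_numpy(), axis=1).round(2)
population_std = demo[STAT_COLUMNS].std(axis=1, ddof=0).round(2)

# Sample standard deviation (divide by n - 1), which is the pandas default.
sample_std = demo[STAT_COLUMNS].std(axis=1).round(2)

print(numpy_std[0], population_std.iloc[0], sample_std.iloc[0])
```

The population value (8.64) matches the Deviation_of_stats column above, which treats the six stats as the complete set of values for each Pokémon rather than a sample.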


## Ethical Implications

With the above changes made to the dataset (replacing headers, filling in missing values, fixing letter casing, removing special characters, and adding useful columns), I have arrived at a clean, human-readable dataset. Having completed the transformation of the initial dataset parsed and converted from my chosen API, I can now see that there are some ethical risks to consider when thinking about the questions I might ask of the data. In the world of Pokémon, each creature is different, as is apparent from the various statistics attached to each observation. I feel that this is the best dataset to describe every Pokémon available to the public, save for regional variants and Mega Evolved Pokémon. With respect to my hypothetical question of which Pokémon has the highest combined total stats, leaving out Pokémon with different forms creates an unintentional bias; to mitigate it, I would have to include both Pokémon with regional or other variants and Mega Evolved Pokémon. The API calls made here (PokéDex numbers 1 through 1025) return no data on these forms, so I would have to gather additional data on them to truly determine which Pokémon has the highest combined stat total. Since these Pokémon are excluded from each dataset, I will need to revise my prior question to something akin to "Which Pokémon (barring variant or unnatural evolution) has the highest total combined stats?" This revised question naturally excludes the aforementioned outlier Pokémon with no harm to the insight needed to answer the question at hand.
Regarding the data itself, I am glad that I can manipulate it freely with little to no worry about legal issues, other than the fact that all of the observations within the dataset belong to the Pokémon Company. The stats originate from the Pokémon games and have been widely published for the public to access, and because the API compiles its information from those games, I am treating the values as accurately reported. The API documentation also outlines its fair use policy, wrapper libraries, and other information that lets a user access the API in an ethically responsible way. Now that the dataset showcases the new features seen in the previous two milestones, I can ethically expand on the question of which Pokémon has the highest combined stats by also asking which Pokémon has the highest average stats. I consider this an ethical approach because the new feature (the average column) simply applies the arithmetic mean formula to the existing Pokémon stats.

While transforming the data from the API, I have assumed that each column has the correct data type. I look forward to seeing if this is the case in the final milestone when I merge all three datasets together, as merging columns with mismatched data types (for example, integer and object) may not go smoothly. Using the requests library in Python to acquire the data from my chosen API may raise some ethical concerns, especially since some APIs monitor usage and limit the number of calls allowed per day. For my data, however, since it was acquired for educational purposes and I have followed the fair use policy, these concerns do not apply here. With the completion of this dataset, I am very much looking forward to merging the three finished datasets I have now crafted. The human-readable dataset I have cleaned and transformed is below.

In [8]:
all_pokemon.head(20)

Unnamed: 0,Pokedex_entry_number,Pokemon_name,HP_stat,Attack_stat,Defense_stat,Special_attack_stat,Special_defense_stat,Speed_stat,Primary_type,Secondary_type,Total_stats,Avg_of_stats,Deviation_of_stats
0,1,Bulbasaur,45,49,49,65,65,45,Grass,Poison,318,53.0,8.64
1,2,Ivysaur,60,62,63,80,80,60,Grass,Poison,405,67.5,8.9
2,3,Venusaur,80,82,83,100,100,80,Grass,Poison,525,87.5,8.9
3,4,Charmander,39,52,43,60,50,65,Fire,Fire,309,51.5,9.0
4,5,Charmeleon,58,64,58,80,65,80,Fire,Fire,405,67.5,9.23
5,6,Charizard,78,84,78,109,85,100,Fire,Flying,534,89.0,11.58
6,7,Squirtle,44,48,65,50,64,43,Water,Water,314,52.33,8.92
7,8,Wartortle,59,63,80,65,80,58,Water,Water,405,67.5,9.14
8,9,Blastoise,79,83,100,85,105,78,Water,Water,530,88.33,10.39
9,10,Caterpie,45,30,35,20,20,45,Bug,Bug,195,32.5,10.31
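
The assumption noted earlier, that each column has the correct data type, can be checked before merging. The following is a minimal sketch on a small stand-in DataFrame in which one stat column has deliberately been stored as text. The names `NUMERIC_COLUMNS` and `non_integer_columns` are hypothetical.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype

# Hypothetical list of the columns expected to hold integers.
NUMERIC_COLUMNS = ['Pokedex_entry_number', 'HP_stat', 'Attack_stat', 'Defense_stat',
                   'Special_attack_stat', 'Special_defense_stat', 'Speed_stat']


def non_integer_columns(df, columns=NUMERIC_COLUMNS):
    """Return the expected-integer columns present in df that are not integer typed."""
    return [col for col in columns if col in df.columns and not is_integer_dtype(df[col])]


# Stand-in data: HP_stat is deliberately stored as text to show the check working.
demo = pd.DataFrame({
    'Pokedex_entry_number': [1, 2],
    'HP_stat': ['45', '60'],
    'Attack_stat': [49, 62],
})

problems = non_integer_columns(demo)

# Convert only the offending columns so the merge keys and stats share one type.
fixed = demo.astype({col: 'int64' for col in problems})

print(problems)
print(fixed.dtypes)
```

Running the same check on all three datasets before the merge would show whether any column needs converting first.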
