# POKÉMON DATA WEBSCRAPING - LOADING DATA

- The webscraping is carried out from PokéAPI:
    - Link: https://pokeapi.co/

- The PokéAPI docs list every attribute available for each Pokémon. Responses are returned in JSON format.
    - PokéAPI docs section: https://pokeapi.co/docs/v2
    - Recommended: use CTRL + F to search the docs for a specific Pokémon attribute.
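
Since the responses are nested JSON, it can help to list the top-level keys and their types before deciding which fields to extract. The sketch below does this on a hand-trimmed sample rather than a live response; the helper name `summarize_keys` and the sample payload are assumptions made for illustration, and a real response contains many more keys.

```python
import json

# Hand-trimmed, illustrative payload (not a full PokéAPI response).
sample_json = """
{
  "id": 5,
  "name": "charmeleon",
  "height": 11,
  "weight": 190,
  "types": [{"slot": 1, "type": {"name": "fire"}}],
  "species": {"name": "charmeleon"}
}
"""


def summarize_keys(payload: dict) -> dict:
    """Map each top-level key to the Python type name of its value."""
    return {key: type(value).__name__ for key, value in payload.items()}


payload = json.loads(sample_json)
summary = summarize_keys(payload)
for key, type_name in summary.items():
    print(f"{key}: {type_name}")
```

Keys reported as `list` or `dict` are the ones that need extra unpacking, such as `types` or `species`.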

- **Imports**

In [1]:
import requests
import csv
import pandas as pd
import os

- **Method for web scraping information about pokémons**

In [6]:
# Base URL of the PokéAPI for Pokémon data
base_url = 'https://pokeapi.co/api/v2/pokemon/'

# Function to get detailed data for a specific Pokémon by its ID
def get_pokemon_info(pokemon_id):
    """Fetches detailed data for a Pokémon without downloading any images."""
    # Construct the URL for the Pokémon data
    url = f'{base_url}{pokemon_id}'
    response = requests.get(url, timeout=10)  # avoid hanging forever on a stalled connection

    if response.status_code != 200:
        print(f"Failed to retrieve data for Pokémon ID {pokemon_id}")
        return None

    # Parse the JSON data
    data = response.json()

    # Get species data from the species URL
    species_url = data['species']['url']
    species_response = requests.get(species_url, timeout=10)
    if species_response.status_code != 200:
        print(f"Failed to retrieve species data from {species_url}")
        return None
    species_data = species_response.json()

    # Prepare stats as individual columns
    stats = {stat['stat']['name']: stat['base_stat'] for stat in data['stats']}

    # Extract relevant information
    pokemon_info = {
        # pokemon number in pokédex
        'id': data.get('id'),
        # name of the pokémon
        'name': data.get('name', '').capitalize(),

        # base experience points
        'base_experience': data.get('base_experience', None),

        # height in decimetres
        'height': data.get('height', None),

        # weight in hectograms
        'weight': data.get('weight', None),

        # whether this is the default form of its species
        'is_default': data.get('is_default', None),

        # sort order: close to National Pokédex order, but with families grouped together
        'order': data.get('order', None),

        #terrain where the pokémon can be found
        'habitat': species_data['habitat']['name'] if species_data.get('habitat') else None,

        #rate at which the pokémon gains levels.
        'growth_rate': species_data['growth_rate']['name'] if species_data.get('growth_rate') else None,

        #if it is a legendary pokémon
        'is_legendary': species_data.get('is_legendary', False),

        #if it is a mythical pokémon
        'is_mythical': species_data.get('is_mythical', False),

        # chance of the pokémon being female, in eighths (-1: genderless)
        'gender_rate': species_data.get('gender_rate', None),

        #base capture rate
        'capture_rate': species_data.get('capture_rate', None),

        #base happiness
        'base_happiness': species_data.get('base_happiness', None),

        #possible abilities of the pokémon
        'abilities': ', '.join([ability['ability']['name'] for ability in data.get('abilities', [])]),

        #possible forms of the pokémon
        'forms': ', '.join([form['name'] for form in data.get('forms', [])]),

        # whether the pokémon has any held items listed (stored as True/False)
        'held_items': bool(data.get('held_items', [])),

        #moves that the pokemon can learn
        'moves': ', '.join([move['move']['name'] for move in data.get('moves', [])]),

        #types of the pokémon
        'types': ', '.join([ptype['type']['name'] for ptype in data.get('types', [])]),

        #stats of the pokémon: hp, attack, defense, special-attack, special-defense, speed
        'hp': stats.get('hp', None),
        'attack': stats.get('attack', None),
        'defense': stats.get('defense', None),
        'special-attack': stats.get('special-attack', None),
        'special-defense': stats.get('special-defense', None),
        'speed': stats.get('speed', None)
    }

    return pokemon_info
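
The function keeps `height`, `weight` and `gender_rate` in the API's raw units (decimetres, hectograms and eighths). A minimal sketch of converting them to more readable units is shown below; the helper name `add_metric_columns` and the new keys `height_m`, `weight_kg` and `female_pct` are hypothetical, and the `gender_rate` value in the sample is only illustrative.

```python
def add_metric_columns(info: dict) -> dict:
    """Return a copy of a record with converted unit columns added.

    The added keys (height_m, weight_kg, female_pct) are names chosen
    for this sketch; they are not part of the original dataset.
    """
    enriched = dict(info)

    height = info.get('height')            # decimetres
    weight = info.get('weight')            # hectograms
    gender_rate = info.get('gender_rate')  # eighths female, -1 if genderless

    enriched['height_m'] = height / 10 if height is not None else None
    enriched['weight_kg'] = weight / 10 if weight is not None else None

    if gender_rate is None or gender_rate < 0:
        enriched['female_pct'] = None      # unknown or genderless
    else:
        enriched['female_pct'] = gender_rate / 8 * 100

    return enriched


# Height and weight match the Charmeleon row; gender_rate is an illustrative value.
sample = {'name': 'Charmeleon', 'height': 11, 'weight': 190, 'gender_rate': 1}
result = add_metric_columns(sample)
print(result['height_m'], result['weight_kg'], result['female_pct'])
# → 1.1 19.0 12.5
```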

- **Test the get_pokemon_info method for web scraping**

In [7]:
# Range of Pokémon IDs to fetch. The PokéAPI has data for every Pokémon in the official Pokédex (IDs up to 1025)
start_id = 5
end_id = 10

# Fetch Pokémon data and store in a list
pokemon_details = []

for pokemon_id in range(start_id, end_id + 1):
    print(f"Fetching data for Pokémon ID {pokemon_id}...")  # Tell the user which Pokémon is being fetched
    info = get_pokemon_info(pokemon_id)

    if info:
        print(f"Fetched data for {info['name']}")  # Confirm that the Pokémon was fetched
        pokemon_details.append(info)
    else:
        print(f"Failed to retrieve data for Pokémon ID {pokemon_id}")  # Report that the Pokémon was not fetched

Fetching data for Pokémon ID 5...
Fetched data for Charmeleon
Fetching data for Pokémon ID 6...
Fetched data for Charizard
Fetching data for Pokémon ID 7...
Fetched data for Squirtle
Fetching data for Pokémon ID 8...
Fetched data for Wartortle
Fetching data for Pokémon ID 9...
Fetched data for Blastoise
Fetching data for Pokémon ID 10...
Fetched data for Caterpie
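
When the range grows from a handful of IDs towards the full Pokédex, it may be worth wrapping the loop in a helper that records which IDs failed and pauses between requests. The sketch below is one possible shape, not the notebook's method: `fetch_range`, `fetcher` and `pause_seconds` are hypothetical names, and a stub stands in for `get_pokemon_info` so the example runs without network access.

```python
import time


def fetch_range(start_id, end_id, fetcher, pause_seconds=0.0):
    """Call fetcher(pokemon_id) for every ID in the inclusive range.

    Returns (records, failed_ids) so failed IDs can be retried later.
    """
    records, failed_ids = [], []
    for pokemon_id in range(start_id, end_id + 1):
        info = fetcher(pokemon_id)
        if info:
            records.append(info)
        else:
            failed_ids.append(pokemon_id)
        if pause_seconds > 0:
            time.sleep(pause_seconds)  # short pause between requests
    return records, failed_ids


def fake_fetcher(pokemon_id):
    """Stub for illustration: pretends that ID 7 could not be fetched."""
    return None if pokemon_id == 7 else {'id': pokemon_id}


records, failed_ids = fetch_range(5, 10, fake_fetcher)
print(len(records), failed_ids)
# → 5 [7]
```

In the notebook, passing `get_pokemon_info` as `fetcher` with a small `pause_seconds` would reproduce the loop above while keeping the list of failed IDs.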


**CSV Creation**
- Define the CSV file name and its columns.
- Write the data to the CSV.
- Check the CSV.

In [8]:
# Define the CSV file name
csv_filename = 'pokemon_dataset.csv'

# Define the column names for the CSV file
fieldnames = [
    'id', 'name', 'base_experience', 'height', 'weight', 'is_default', 'order', 
    'habitat', 'growth_rate', 'is_legendary', 'is_mythical', 'gender_rate', 
    'capture_rate', 'base_happiness', 'abilities', 'forms', 'held_items', 'moves', 
    'types', 'hp', 'attack', 'defense', 'special-attack', 'special-defense', 'speed'
]

# Write data to CSV
with open(csv_filename, mode='w', newline='', encoding='utf-8') as file:
    writer = csv.DictWriter(file, fieldnames=fieldnames)

    # Write the header
    writer.writeheader()

    # Write the data
    for pokemon in pokemon_details:
        if isinstance(pokemon, dict):
            writer.writerow(pokemon)

print(f"Data saved to {csv_filename}")

Data saved to pokemon_dataset.csv
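
Fields such as `abilities`, `moves` and `types` are stored as comma-separated strings, so they contain the CSV delimiter itself. With its default settings, `csv.DictWriter` quotes such fields, which is why they survive the round trip. The sketch below illustrates this with an in-memory buffer instead of a file; the three-column row is a reduced example based on the Charizard record.

```python
import csv
import io

# Reduced example row; 'types' contains the CSV delimiter.
rows = [{'id': 6, 'name': 'Charizard', 'types': 'fire, flying'}]

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['id', 'name', 'types'])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read the text back: the quoted field comes out as a single value.
read_back = list(csv.DictReader(io.StringIO(csv_text, newline='')))
type_list = read_back[0]['types'].split(', ')
print(type_list)
# → ['fire', 'flying']
```

Note that every value read back this way is a string (`'6'`, not `6`), and that multi-valued columns need a `split(', ')` to become lists again.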


**Load the csv as a dataframe**

In [9]:
# Now create a pandas DataFrame from the CSV file
df = pd.read_csv(csv_filename)

# Show the DataFrame
print("\nPokémon DataFrame:")
df



Pokémon DataFrame:


|   | id | name | base_experience | height | weight | is_default | order | habitat | growth_rate | is_legendary | ... | forms | held_items | moves | types | hp | attack | defense | special-attack | special-defense | speed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | Charmeleon | 142 | 11 | 190 | True | 6 | mountain | medium-slow | False | ... | charmeleon | False | mega-punch, fire-punch, thunder-punch, scratch... | fire | 58 | 64 | 58 | 80 | 65 | 80 |
| 1 | 6 | Charizard | 267 | 17 | 905 | True | 7 | mountain | medium-slow | False | ... | charizard | False | mega-punch, fire-punch, thunder-punch, scratch... | fire, flying | 78 | 84 | 78 | 109 | 85 | 100 |
| 2 | 7 | Squirtle | 63 | 5 | 90 | True | 10 | waters-edge | medium-slow | False | ... | squirtle | False | mega-punch, ice-punch, mega-kick, headbutt, ta... | water | 44 | 48 | 65 | 50 | 64 | 43 |
| 3 | 8 | Wartortle | 142 | 10 | 225 | True | 11 | waters-edge | medium-slow | False | ... | wartortle | False | mega-punch, ice-punch, mega-kick, headbutt, ta... | water | 59 | 63 | 80 | 65 | 80 | 58 |
| 4 | 9 | Blastoise | 265 | 16 | 855 | True | 12 | waters-edge | medium-slow | False | ... | blastoise | False | mega-punch, ice-punch, mega-kick, headbutt, ta... | water | 79 | 83 | 100 | 85 | 105 | 78 |
| 5 | 10 | Caterpie | 39 | 3 | 29 | True | 14 | forest | medium | False | ... | caterpie | False | tackle, string-shot, snore, bug-bite, electroweb | bug | 45 | 30 | 35 | 20 | 20 | 45 |
