# Pokemon Scraper
This notebook collects information on Generation II-IX Pokemon from [Bulbapedia's National Pokedex](https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_National_Pok%C3%A9dex_number). All data was collected using `requests` and `Beautiful Soup`.

### Import `requests` library
To access Bulbapedia's HTML, we import the `requests` library, which downloads the page from the website.

In [1]:
import requests
import pandas as pd
import re
URL = "https://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_National_Pok%C3%A9dex_number"

In [2]:
from datetime import datetime
start = datetime.now()
start_time = start.strftime("%H:%M:%S")

### Load the Page

In [3]:
page = requests.get(URL)

### Parse HTML data
#### Beautiful Soup
The Beautiful Soup package parses the downloaded HTML so that its elements can be searched and navigated.

In [4]:
from bs4 import BeautifulSoup
soup = BeautifulSoup(page.content, 'html.parser')

### Find all tables that contain Pokemon details
To locate the text content and the tables that hold the Pokemon details, inspect the page in the browser: press F12, or right-click the page and choose Inspect. The main content sits in the element with the id `mw-content-text`.

In [5]:
poke_content=soup.find(id='mw-content-text')
poke_tables=poke_content.find_all('table')
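
To make the lookup concrete, here is a minimal offline sketch of the same two calls on a hand-written HTML snippet. The snippet is a made-up stand-in for the real page (only the `mw-content-text` id is taken from it), and the `None` check is an added precaution rather than part of the original notebook.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; only the id mirrors the real one.
sample_html = """
<div id="mw-content-text">
  <table><tr><td>#152</td><td>Chikorita</td></tr></table>
  <table><tr><td>#252</td><td>Treecko</td></tr></table>
</div>
<table><tr><td>outside the content div</td></tr></table>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# find() returns None when nothing matches, so check before chaining calls.
content = sample_soup.find(id="mw-content-text")
if content is None:
    raise ValueError("content container not found; the page layout may have changed")

# find_all() searches only inside the container, so the third table is ignored.
tables = content.find_all("table")
print(len(tables))
```

Because the search starts from `content` rather than from the whole document, tables in sidebars or footers do not shift the indexes used later.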

## Generation II Pokemon

This section covers the second-generation Pokemon. Since the `mw-content-text` div contains multiple tables, we first inspect the elements of the Generation II table.

In [6]:
select_generation=2
gen2_list=poke_tables[select_generation]
list(gen2_list.contents)[1]

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #D6D6D6"><a href="/wiki/List_of_Pok%C3%A9mon_by_Johto_Pok%C3%A9dex_number" title="List of Pokémon by Johto Pokédex number"><span style="color:#000;">Jdex</span></a>
</th>
<th style="background: #D6D6D6">Ndex
</th>
<th style="background: #D6D6D6">MS
</th>
<th style="background: #D6D6D6">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #D6D6D6">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#152
</td>
<th><a href="/wiki/Chikorita_(Pok%C3%A9mon)" title="Chikorita"><img alt="Ch

From this output, we can see that each table row (`<tr>`) corresponds to one Pokemon.

In [7]:
for row_num, each in enumerate(gen2_list.contents[1]):
    print('##################################')
    print(row_num)
    print(each)
    print('##################################')

##################################
0
<tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #D6D6D6"><a href="/wiki/List_of_Pok%C3%A9mon_by_Johto_Pok%C3%A9dex_number" title="List of Pokémon by Johto Pokédex number"><span style="color:#000;">Jdex</span></a>
</th>
<th style="background: #D6D6D6">Ndex
</th>
<th style="background: #D6D6D6">MS
</th>
<th style="background: #D6D6D6">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #D6D6D6">Type
</th></tr>
##################################
##################################
1


##################################
##################################
2
<tr style="backgro
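
The blank entry printed at index 1 is most likely a whitespace text node: `.contents` returns every child, including the newline that separates consecutive `<tr>` tags. The sketch below reproduces this on a small made-up table (not Bulbapedia's actual markup) and shows `find_all("tr")` as one alternative that skips the text nodes.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature table; the newlines between rows mimic the real markup.
sample_html = (
    "<table><tbody><tr><th>Ndex</th></tr>\n"
    "<tr><td>#152</td></tr>\n"
    "<tr><td>#153</td></tr></tbody></table>"
)
tbody = BeautifulSoup(sample_html, "html.parser").tbody

# .contents keeps the "\n" text nodes, so tags and whitespace alternate.
kinds = [type(child).__name__ for child in tbody.contents]
print(kinds)

# find_all("tr") returns only the <tr> tags, so no index arithmetic is needed.
rows = tbody.find_all("tr")
print([row.get_text(strip=True) for row in rows])
```

The notebook keeps the index-based approach; the alternative is shown only to explain where the empty entries come from.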

Only the even-numbered rows contain Pokemon information (row 0 is the header), while the odd-numbered entries are blank.
On closer examination, the available information includes the regional Pokedex number (Jdex), the National Pokedex number (Ndex), the name, the types, the URL of the Pokemon's wiki page, and the Pokemon's generation.
To access this information, we convert the HTML block into text.

In [8]:
poke_info=gen2_list.contents[1]
poke_info

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #D6D6D6"><a href="/wiki/List_of_Pok%C3%A9mon_by_Johto_Pok%C3%A9dex_number" title="List of Pokémon by Johto Pokédex number"><span style="color:#000;">Jdex</span></a>
</th>
<th style="background: #D6D6D6">Ndex
</th>
<th style="background: #D6D6D6">MS
</th>
<th style="background: #D6D6D6">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #D6D6D6">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#152
</td>
<th><a href="/wiki/Chikorita_(Pok%C3%A9mon)" title="Chikorita"><img alt="Ch

In [9]:
print(poke_info.text)


Jdex

Ndex

MS

Pokémon

Type


#001

#152



Chikorita

Grass


#002

#153



Bayleef

Grass


#003

#154



Meganium

Grass


#004

#155



Cyndaquil

Fire


#005

#156



Quilava

Fire


#006

#157



Typhlosion

Fire


 

#157



Typhlosion

Fire
Ghost


#007

#158



Totodile

Water


#008

#159



Croconaw

Water


#009

#160



Feraligatr

Water


#019

#161



Sentret

Normal


#020

#162



Furret

Normal


#015

#163



Hoothoot

Normal
Flying


#016

#164



Noctowl

Normal
Flying


#030

#165



Ledyba

Bug
Flying


#031

#166



Ledian

Bug
Flying


#032

#167



Spinarak

Bug
Poison


#033

#168



Ariados

Bug
Poison


#039

#169



Crobat

Poison
Flying


#176

#170



Chinchou

Water
Electric


#177

#171



Lanturn

Water
Electric


#021

#172



Pichu

Electric


#040

#173



Cleffa

Fairy


#044

#174



Igglybuff

Normal
Fairy


#046

#175



Togepi

Fairy


#047

#176



Togetic

Fairy
Flying


#161

#177



Natu

Psychic
Flying


#162

#178



Xatu

Psychic
Fly

Next, we print the raw text of each row as a list of fields.

In [10]:
for each in poke_info:
    if each != '\n':
        if len(each.text.strip().split('\n')):
            print(each.text.strip().split('\n'))

['Jdex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001', '', '#152', '', '', '', 'Chikorita', '', 'Grass']
['#002', '', '#153', '', '', '', 'Bayleef', '', 'Grass']
['#003', '', '#154', '', '', '', 'Meganium', '', 'Grass']
['#004', '', '#155', '', '', '', 'Cyndaquil', '', 'Fire']
['#005', '', '#156', '', '', '', 'Quilava', '', 'Fire']
['#006', '', '#157', '', '', '', 'Typhlosion', '', 'Fire']
['#157', '', '', '', 'Typhlosion', '', 'Fire', 'Ghost']
['#007', '', '#158', '', '', '', 'Totodile', '', 'Water']
['#008', '', '#159', '', '', '', 'Croconaw', '', 'Water']
['#009', '', '#160', '', '', '', 'Feraligatr', '', 'Water']
['#019', '', '#161', '', '', '', 'Sentret', '', 'Normal']
['#020', '', '#162', '', '', '', 'Furret', '', 'Normal']
['#015', '', '#163', '', '', '', 'Hoothoot', '', 'Normal', 'Flying']
['#016', '', '#164', '', '', '', 'Noctowl', '', 'Normal', 'Flying']
['#030', '', '#165', '', '', '', 'Ledyba', '', 'Bug', 'Flying']
['#031', '', '#166', '', '', '', 'Ledian', '', 
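
The empty strings in these lists come from consecutive newlines in the row text. The following sketch shows the effect on a single row; `row_text` is an approximation of the Chikorita row inferred from the printed output, not a string copied from the page.

```python
# Approximate text of one row, inferred from the output above (Chikorita).
row_text = "\n#001\n\n#152\n\n\n\nChikorita\n\nGrass\n"

# strip() removes the outer newlines; split("\n") keeps one empty string
# for every extra newline between two values.
fields = row_text.strip().split("\n")
print(fields)

# Dropping the empty strings gives a compact list of values.
compact = [field for field in fields if field]
print(compact)
```

The compact form is easier to read but loses positional information: a single-type row with a regional number and a dual-type row without one both end up with four values. Keeping the unfiltered list, as the notebook does, lets the list length distinguish these cases.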

Now we add each Pokemon's URL, built from the name found at index 6 of the row.

In [11]:
index = 0
for each in poke_info:
    if each != '\n':
        if len(each.text.strip().split('\n')):
            if index!=0:
                print(each.text.strip().split('\n'), end=" ")
            elif index==0:
                print(each.text.strip().split('\n'))
            spliturl = each.text.strip().split('\n')[6]
            if index!=0:
                print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(spliturl))
            index+=1

['Jdex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001', '', '#152', '', '', '', 'Chikorita', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Chikorita_(Pok%C3%A9mon)
['#002', '', '#153', '', '', '', 'Bayleef', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Bayleef_(Pok%C3%A9mon)
['#003', '', '#154', '', '', '', 'Meganium', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Meganium_(Pok%C3%A9mon)
['#004', '', '#155', '', '', '', 'Cyndaquil', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Cyndaquil_(Pok%C3%A9mon)
['#005', '', '#156', '', '', '', 'Quilava', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Quilava_(Pok%C3%A9mon)
['#006', '', '#157', '', '', '', 'Typhlosion', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Typhlosion_(Pok%C3%A9mon)
['#157', '', '', '', 'Typhlosion', '', 'Fire', 'Ghost'] https://bulbapedia.bulbagarden.net/wiki/Fire_(Pok%C3%A9mon)
['#007', '', '#158', '', '', '', 'Totodile', '', 'Water'] https://bulbapedia.bulbagarden.net/
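
The URL is assembled directly from the displayed name, which works for simple names. The sketch below is one possible, more defensive builder. It assumes MediaWiki-style page titles (spaces become underscores, other special characters are percent-encoded), and `build_pokemon_url` is a hypothetical helper name, not part of the notebook.

```python
from urllib.parse import quote

BASE_URL = "https://bulbapedia.bulbagarden.net/wiki/"

def build_pokemon_url(name):
    """Build a wiki URL from a display name (hypothetical helper)."""
    # Assumption: page titles use underscores instead of spaces.
    title = name.strip().replace(" ", "_") + "_(Pokémon)"
    # Percent-encode non-ASCII and reserved characters, but keep the parentheses.
    return BASE_URL + quote(title, safe="()")

print(build_pokemon_url("Chikorita"))
print(build_pokemon_url("Ho-Oh"))
```

For names without spaces or special characters this produces the same string as the `format` call in the cell above.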

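In the next code snippet, I check whether the field indexes line up once the rows with only 7 or 8 fields (5 Pokemon in this table) are excluded. In those shorter rows, index 6 holds a type rather than the name, which is why the second Typhlosion row above produced a `Fire_(Pok%C3%A9mon)` URL.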

In [12]:
index = 0
for each in poke_info:
    if each != '\n':
        if ((len(each.text.strip().split('\n')) != 8)):
            if (len(each.text.strip().split('\n')) != 7):
                if len(each.text.strip().split('\n')):
                    if index!=0:
                        print(each.text.strip().split('\n'), end=" ")
                    elif index==0:
                        print(each.text.strip().split('\n'))

                    spliturl = each.text.strip().split('\n')[6]
                    if index!=0:
                        print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(spliturl))
                    index+=1

['Jdex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001', '', '#152', '', '', '', 'Chikorita', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Chikorita_(Pok%C3%A9mon)
['#002', '', '#153', '', '', '', 'Bayleef', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Bayleef_(Pok%C3%A9mon)
['#003', '', '#154', '', '', '', 'Meganium', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Meganium_(Pok%C3%A9mon)
['#004', '', '#155', '', '', '', 'Cyndaquil', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Cyndaquil_(Pok%C3%A9mon)
['#005', '', '#156', '', '', '', 'Quilava', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Quilava_(Pok%C3%A9mon)
['#006', '', '#157', '', '', '', 'Typhlosion', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Typhlosion_(Pok%C3%A9mon)
['#007', '', '#158', '', '', '', 'Totodile', '', 'Water'] https://bulbapedia.bulbagarden.net/wiki/Totodile_(Pok%C3%A9mon)
['#008', '', '#159', '', '', '', 'Croconaw', '', 'Water'] https://bulbapedia.bulbagarde

### Dataframe Conversion 

In [13]:
gen2_pokemon = []
info_start = 1
url1 = "https://bulbapedia.bulbagarden.net/wiki/"
url2 = "_(Pok%C3%A9mon)"
# place where to get the pokemon info
info_row=gen2_list.contents[info_start]

for even_index_chec, pokemon_info_values in enumerate(info_row.contents):
    # Pokemon rows are stored at even indexes, excluding index 0 (the header row)
    if even_index_chec % 2 == 0 and even_index_chec != 0:
        pokemon_raw_info = pokemon_info_values.text.strip().split('\n')
        
        generation = "II"

        if len(pokemon_raw_info) == 10:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = pokemon_raw_info[9]
            poke_url = url1+pokemon_raw_info[6]+url2
        
        elif len(pokemon_raw_info) == 9:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = ''
            poke_url = url1+pokemon_raw_info[6]+url2
            
        elif len(pokemon_raw_info) == 8:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = pokemon_raw_info[7]
            poke_url = url1+pokemon_raw_info[4]+url2
            
        elif len(pokemon_raw_info) == 7:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = ''
            poke_url = url1+pokemon_raw_info[4]+url2

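        else:
            # Unexpected row layout: report it and skip the row so that values
            # left over from the previous row are not appended again.
            print('Check out elements containing ' + str(len(pokemon_raw_info)) + ' elements')
            continue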
        
        # Saving as a tuple
        gen2_pokemon.append((kdex, ndex, poke_name, type1, type2, generation, poke_url))
        
    
    else:
        pass
#         print(pokemon_info_values)
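
The four branches above differ in only two respects: whether the regional number is present (lengths 9 and 10) and whether a second type is present (lengths 8 and 10). As a sketch of the same mapping in a more compact form, the hypothetical helper `parse_row` below returns `None` for any other length. It is an illustration only and has not been run against the live page.

```python
URL_PREFIX = "https://bulbapedia.bulbagarden.net/wiki/"
URL_SUFFIX = "_(Pok%C3%A9mon)"

def parse_row(raw, generation):
    """Map one split row to (kdex, ndex, name, type1, type2, generation, url)."""
    if len(raw) not in (7, 8, 9, 10):
        return None  # unexpected layout: let the caller decide what to do
    has_kdex = len(raw) >= 9          # the regional number occupies two extra slots
    has_type2 = len(raw) in (8, 10)   # a second type adds one slot at the end
    shift = 2 if has_kdex else 0
    kdex = raw[0] if has_kdex else ""
    ndex = raw[shift]
    name = raw[shift + 4]
    type1 = raw[shift + 6]
    type2 = raw[shift + 7] if has_type2 else ""
    return (kdex, ndex, name, type1, type2, generation, URL_PREFIX + name + URL_SUFFIX)

# Sample rows copied from the printed lists earlier in the notebook.
print(parse_row(["#001", "", "#152", "", "", "", "Chikorita", "", "Grass"], "II"))
print(parse_row(["#157", "", "", "", "Typhlosion", "", "Fire", "Ghost"], "II"))
```

Returning `None` makes the unexpected case explicit, so the calling loop can log the row and skip it.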

In [14]:
df_gen2_pokemon = pd.DataFrame(gen2_pokemon)

In [15]:
df_gen2_pokemon.columns = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_gen2_pokemon

| | Kdex | Ndex | Pokemon | Type 1 | Type 2 | Generation | Pokemon's URL |
|---|---|---|---|---|---|---|---|
| 0 | #001 | #152 | Chikorita | Grass | | II | https://bulbapedia.bulbagarden.net/wiki/Chikor... |
| 1 | #002 | #153 | Bayleef | Grass | | II | https://bulbapedia.bulbagarden.net/wiki/Baylee... |
| 2 | #003 | #154 | Meganium | Grass | | II | https://bulbapedia.bulbagarden.net/wiki/Megani... |
| 3 | #004 | #155 | Cyndaquil | Fire | | II | https://bulbapedia.bulbagarden.net/wiki/Cyndaq... |
| 4 | #005 | #156 | Quilava | Fire | | II | https://bulbapedia.bulbagarden.net/wiki/Quilav... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 100 | #250 | #247 | Pupitar | Rock | Ground | II | https://bulbapedia.bulbagarden.net/wiki/Pupita... |
| 101 | #251 | #248 | Tyranitar | Rock | Dark | II | https://bulbapedia.bulbagarden.net/wiki/Tyrani... |
| 102 | #252 | #249 | Lugia | Psychic | Flying | II | https://bulbapedia.bulbagarden.net/wiki/Lugia_... |
| 103 | #253 | #250 | Ho-Oh | Fire | Flying | II | https://bulbapedia.bulbagarden.net/wiki/Ho-Oh_... |
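
Cells 14 and 15 can also be combined, since `pd.DataFrame` accepts the column names at construction time. A small sketch with two sample rows in the same layout as `gen2_pokemon` (the variable names `COLUMNS`, `sample_rows`, and `df_sample` are made up for this example):

```python
import pandas as pd

COLUMNS = ["Kdex", "Ndex", "Pokemon", "Type 1", "Type 2", "Generation", "Pokemon's URL"]

# Two sample tuples in the same layout as gen2_pokemon.
sample_rows = [
    ("#001", "#152", "Chikorita", "Grass", "", "II",
     "https://bulbapedia.bulbagarden.net/wiki/Chikorita_(Pok%C3%A9mon)"),
    ("#002", "#153", "Bayleef", "Grass", "", "II",
     "https://bulbapedia.bulbagarden.net/wiki/Bayleef_(Pok%C3%A9mon)"),
]

# Passing columns= names the columns when the frame is built.
df_sample = pd.DataFrame(sample_rows, columns=COLUMNS)
print(df_sample[["Ndex", "Pokemon", "Type 1"]])
```

Naming the columns up front avoids a window in which the frame has only integer column labels.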


Now we repeat the same steps for the other generations.
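
Since the table index matches the generation number in this notebook (`poke_tables[2]` for Generation II, and so on), the repetition could in principle be folded into a loop. The sketch below only shows the control flow: `collect_all`, `GENERATION_LABELS`, and the fake table data are hypothetical stand-ins, and it assumes every generation table shares the same layout, which should be checked per table.

```python
# Table index -> generation label, following the notebook's poke_tables indexing.
GENERATION_LABELS = {2: "II", 3: "III", 4: "IV", 5: "V", 6: "VI", 7: "VII", 8: "VIII", 9: "IX"}

def collect_all(tables, parse_row):
    """Apply parse_row to every data row of every generation table."""
    records = []
    for table_index, label in GENERATION_LABELS.items():
        if table_index >= len(tables):
            continue  # table missing, e.g. because the page layout changed
        for raw in tables[table_index]:
            record = parse_row(raw, label)
            if record is not None:
                records.append(record)
    return records

# Stand-in data: each "table" is a list of already-split rows.
fake_tables = [[], [], [["#152", "Chikorita"]], [["#252", "Treecko"]]]
# Stand-in parser: keeps the first two fields and appends the generation label.
simple_parser = lambda raw, label: (raw[0], raw[1], label)

print(collect_all(fake_tables, simple_parser))
```

The sections below keep one cell per generation, which makes it easier to inspect each table's quirks individually.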

## Generation III Pokemon


In [16]:
select_generation=3
gen3_list=poke_tables[select_generation]
list(gen3_list.contents)[1]

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #5959C1"><a class="mw-disambig" href="/wiki/List_of_Pok%C3%A9mon_by_Hoenn_Pok%C3%A9dex_number" title="List of Pokémon by Hoenn Pokédex number"><span style="color:#000;">Hdex</span></a>
</th>
<th style="background: #5959C1">Ndex
</th>
<th style="background: #5959C1">MS
</th>
<th style="background: #5959C1">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #5959C1">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#252
</td>
<th><a href="/wiki/Treecko_(Pok%C3%A9mon)" title="Treec

In [17]:
for row_num, each in enumerate(gen3_list.contents[1]):
    print('##################################')
    print(row_num)
    print(each)
    print('##################################')

##################################
0
<tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #5959C1"><a class="mw-disambig" href="/wiki/List_of_Pok%C3%A9mon_by_Hoenn_Pok%C3%A9dex_number" title="List of Pokémon by Hoenn Pokédex number"><span style="color:#000;">Hdex</span></a>
</th>
<th style="background: #5959C1">Ndex
</th>
<th style="background: #5959C1">MS
</th>
<th style="background: #5959C1">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #5959C1">Type
</th></tr>
##################################
##################################
1


##################################
##################################


In [18]:
poke_info=gen3_list.contents[1]
poke_info

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #5959C1"><a class="mw-disambig" href="/wiki/List_of_Pok%C3%A9mon_by_Hoenn_Pok%C3%A9dex_number" title="List of Pokémon by Hoenn Pokédex number"><span style="color:#000;">Hdex</span></a>
</th>
<th style="background: #5959C1">Ndex
</th>
<th style="background: #5959C1">MS
</th>
<th style="background: #5959C1">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #5959C1">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#252
</td>
<th><a href="/wiki/Treecko_(Pok%C3%A9mon)" title="Treec

In [19]:
index = 0
for each in poke_info:
    if each != '\n':
        if len(each.text.strip().split('\n')):
            if index!=0:
                print(each.text.strip().split('\n'), end=" ")
            elif index==0:
                print(each.text.strip().split('\n'))
            spliturl = each.text.strip().split('\n')[6]
            if index!=0:
                print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(spliturl))
            index+=1

['Hdex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001', '', '#252', '', '', '', 'Treecko', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Treecko_(Pok%C3%A9mon)
['#002', '', '#253', '', '', '', 'Grovyle', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Grovyle_(Pok%C3%A9mon)
['#003', '', '#254', '', '', '', 'Sceptile', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Sceptile_(Pok%C3%A9mon)
['#004', '', '#255', '', '', '', 'Torchic', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Torchic_(Pok%C3%A9mon)
['#005', '', '#256', '', '', '', 'Combusken', '', 'Fire', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Combusken_(Pok%C3%A9mon)
['#006', '', '#257', '', '', '', 'Blaziken', '', 'Fire', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Blaziken_(Pok%C3%A9mon)
['#007', '', '#258', '', '', '', 'Mudkip', '', 'Water'] https://bulbapedia.bulbagarden.net/wiki/Mudkip_(Pok%C3%A9mon)
['#008', '', '#259', '', '', '', 'Marshtomp', '', 'Water', 'Ground'] https:

### Dataframe Conversion

In [20]:
gen3_pokemon = []
info_start = 1
url1 = "https://bulbapedia.bulbagarden.net/wiki/"
url2 = "_(Pok%C3%A9mon)"
# place where to get the pokemon info
info_row=gen3_list.contents[info_start]

for even_index_chec, pokemon_info_values in enumerate(info_row.contents):
    # Pokemon rows are stored at even indexes, excluding index 0 (the header row)
    if even_index_chec % 2 == 0 and even_index_chec != 0:
        pokemon_raw_info = pokemon_info_values.text.strip().split('\n')
        
        generation = "III"

        if len(pokemon_raw_info) == 10:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = pokemon_raw_info[9]
            poke_url = url1+pokemon_raw_info[6]+url2
        
        elif len(pokemon_raw_info) == 9:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = ''
            poke_url = url1+pokemon_raw_info[6]+url2
            
        elif len(pokemon_raw_info) == 8:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = pokemon_raw_info[7]
            poke_url = url1+pokemon_raw_info[4]+url2

        else:
            print('Check out elements containing ' + str(len(pokemon_raw_info)) + ' elements')
        
        # Saving as a tuple
        gen3_pokemon.append((kdex, ndex, poke_name, type1, type2, generation, poke_url))
        
    
    else:
        pass
#         print(pokemon_info_values)

In [21]:
df_gen3_pokemon = pd.DataFrame(gen3_pokemon)

In [22]:
df_gen3_pokemon.columns = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_gen3_pokemon

| | Kdex | Ndex | Pokemon | Type 1 | Type 2 | Generation | Pokemon's URL |
|---|---|---|---|---|---|---|---|
| 0 | #001 | #252 | Treecko | Grass | | III | https://bulbapedia.bulbagarden.net/wiki/Treeck... |
| 1 | #002 | #253 | Grovyle | Grass | | III | https://bulbapedia.bulbagarden.net/wiki/Grovyl... |
| 2 | #003 | #254 | Sceptile | Grass | | III | https://bulbapedia.bulbagarden.net/wiki/Scepti... |
| 3 | #004 | #255 | Torchic | Fire | | III | https://bulbapedia.bulbagarden.net/wiki/Torchi... |
| 4 | #005 | #256 | Combusken | Fire | Fighting | III | https://bulbapedia.bulbagarden.net/wiki/Combus... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 138 | #201 | #385 | Jirachi | Steel | Psychic | III | https://bulbapedia.bulbagarden.net/wiki/Jirach... |
| 139 | #202 | #386 | Deoxys | Psychic | | III | https://bulbapedia.bulbagarden.net/wiki/Deoxys... |
| 140 | #202 | #386 | Deoxys | Psychic | | III | https://bulbapedia.bulbagarden.net/wiki/Deoxys... |
| 141 | #202 | #386 | Deoxys | Psychic | | III | https://bulbapedia.bulbagarden.net/wiki/Deoxys... |


## Generation IV Pokemon


In [23]:
select_generation=4
gen4_list=poke_tables[select_generation]
list(gen4_list.contents)[1]

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #FFC8C8"><a href="/wiki/List_of_Pok%C3%A9mon_by_Sinnoh_Pok%C3%A9dex_number" title="List of Pokémon by Sinnoh Pokédex number"><span style="color:#000;">Sdex</span></a>
</th>
<th style="background: #FFC8C8">Ndex
</th>
<th style="background: #FFC8C8">MS
</th>
<th style="background: #FFC8C8">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #FFC8C8">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#387
</td>
<th><a href="/wiki/Turtwig_(Pok%C3%A9mon)" title="Turtwig"><img alt="Turt

In [24]:
poke_info=gen4_list.contents[1]
poke_info

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #FFC8C8"><a href="/wiki/List_of_Pok%C3%A9mon_by_Sinnoh_Pok%C3%A9dex_number" title="List of Pokémon by Sinnoh Pokédex number"><span style="color:#000;">Sdex</span></a>
</th>
<th style="background: #FFC8C8">Ndex
</th>
<th style="background: #FFC8C8">MS
</th>
<th style="background: #FFC8C8">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #FFC8C8">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#387
</td>
<th><a href="/wiki/Turtwig_(Pok%C3%A9mon)" title="Turtwig"><img alt="Turt

In [25]:
index = 0
for each in poke_info:
    if each != '\n':
        if len(each.text.strip().split('\n')):
            if index!=0:
                print(each.text.strip().split('\n'), end=" ")
            elif index==0:
                print(each.text.strip().split('\n'))
            spliturl = each.text.strip().split('\n')[6]
            if index!=0:
                print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(spliturl))
            index+=1

['Sdex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001', '', '#387', '', '', '', 'Turtwig', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Turtwig_(Pok%C3%A9mon)
['#002', '', '#388', '', '', '', 'Grotle', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Grotle_(Pok%C3%A9mon)
['#003', '', '#389', '', '', '', 'Torterra', '', 'Grass', 'Ground'] https://bulbapedia.bulbagarden.net/wiki/Torterra_(Pok%C3%A9mon)
['#004', '', '#390', '', '', '', 'Chimchar', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Chimchar_(Pok%C3%A9mon)
['#005', '', '#391', '', '', '', 'Monferno', '', 'Fire', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Monferno_(Pok%C3%A9mon)
['#006', '', '#392', '', '', '', 'Infernape', '', 'Fire', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Infernape_(Pok%C3%A9mon)
['#007', '', '#393', '', '', '', 'Piplup', '', 'Water'] https://bulbapedia.bulbagarden.net/wiki/Piplup_(Pok%C3%A9mon)
['#008', '', '#394', '', '', '', 'Prinplup', '', 'Water'] https:/

### Dataframe Conversion 

In [26]:
gen4_pokemon = []
info_start = 1
url1 = "https://bulbapedia.bulbagarden.net/wiki/"
url2 = "_(Pok%C3%A9mon)"
# place where to get the pokemon info
info_row=gen4_list.contents[info_start]

for even_index_chec, pokemon_info_values in enumerate(info_row.contents):
    # Pokemon rows are stored at even indexes, excluding index 0 (the header row)
    if even_index_chec % 2 == 0 and even_index_chec != 0:
        pokemon_raw_info = pokemon_info_values.text.strip().split('\n')
        
        generation = "IV"

        if len(pokemon_raw_info) == 10:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = pokemon_raw_info[9]
            poke_url = url1+pokemon_raw_info[6]+url2
        
        elif len(pokemon_raw_info) == 9:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = ''
            poke_url = url1+pokemon_raw_info[6]+url2
            
        elif len(pokemon_raw_info) == 8:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = pokemon_raw_info[7]
            poke_url = url1+pokemon_raw_info[4]+url2
            
        elif len(pokemon_raw_info) == 7:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = ''
            poke_url = url1+pokemon_raw_info[4]+url2

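        else:
            # Unexpected row layout: report it and skip the row so that values
            # left over from the previous row are not appended again.
            print('Check out elements containing ' + str(len(pokemon_raw_info)) + ' elements')
            continue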
        
        # Saving as a tuple
        gen4_pokemon.append((kdex, ndex, poke_name, type1, type2, generation, poke_url))
        
    
    else:
        pass
#         print(pokemon_info_values)

In [27]:
df_gen4_pokemon = pd.DataFrame(gen4_pokemon)

In [28]:
df_gen4_pokemon.columns = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_gen4_pokemon

| | Kdex | Ndex | Pokemon | Type 1 | Type 2 | Generation | Pokemon's URL |
|---|---|---|---|---|---|---|---|
| 0 | #001 | #387 | Turtwig | Grass | | IV | https://bulbapedia.bulbagarden.net/wiki/Turtwi... |
| 1 | #002 | #388 | Grotle | Grass | | IV | https://bulbapedia.bulbagarden.net/wiki/Grotle... |
| 2 | #003 | #389 | Torterra | Grass | Ground | IV | https://bulbapedia.bulbagarden.net/wiki/Torter... |
| 3 | #004 | #390 | Chimchar | Fire | | IV | https://bulbapedia.bulbagarden.net/wiki/Chimch... |
| 4 | #005 | #391 | Monferno | Fire | Fighting | IV | https://bulbapedia.bulbagarden.net/wiki/Monfer... |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 117 | #151 | #490 | Manaphy | Water | | IV | https://bulbapedia.bulbagarden.net/wiki/Manaph... |
| 118 | | #491 | Darkrai | Dark | | IV | https://bulbapedia.bulbagarden.net/wiki/Darkra... |
| 119 | | #492 | Shaymin | Grass | | IV | https://bulbapedia.bulbagarden.net/wiki/Shaymi... |
| 120 | | #492 | Shaymin | Grass | Flying | IV | https://bulbapedia.bulbagarden.net/wiki/Shaymi... |


## Generation V Pokemon


In [29]:
select_generation=5
gen5_list=poke_tables[select_generation]
list(gen5_list.contents)[1]

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #EBEBEB"><a class="mw-disambig" href="/wiki/List_of_Pok%C3%A9mon_by_Unova_Pok%C3%A9dex_number" title="List of Pokémon by Unova Pokédex number"><span style="color:#000;">Udex</span></a>
</th>
<th style="background: #EBEBEB">Ndex
</th>
<th style="background: #EBEBEB">MS
</th>
<th style="background: #EBEBEB">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #EBEBEB">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#000
</td>
<td style="font-family:monospace">#494
</td>
<th><a href="/wiki/Victini_(Pok%C3%A9mon)" title="Victi

In [30]:
poke_info=gen5_list.contents[1]
poke_info


<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #EBEBEB"><a class="mw-disambig" href="/wiki/List_of_Pok%C3%A9mon_by_Unova_Pok%C3%A9dex_number" title="List of Pokémon by Unova Pokédex number"><span style="color:#000;">Udex</span></a>
</th>
<th style="background: #EBEBEB">Ndex
</th>
<th style="background: #EBEBEB">MS
</th>
<th style="background: #EBEBEB">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #EBEBEB">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#000
</td>
<td style="font-family:monospace">#494
</td>
<th><a href="/wiki/Victini_(Pok%C3%A9mon)" title="Victi

In [31]:
index = 0
for each in poke_info:
    if each != '\n':
        if len(each.text.strip().split('\n')):
            if index!=0:
                print(each.text.strip().split('\n'), end=" ")
            elif index==0:
                print(each.text.strip().split('\n'))
            spliturl = each.text.strip().split('\n')[6]
            if index!=0:
                print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(spliturl))
            index+=1

['Udex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#000', '', '#494', '', '', '', 'Victini', '', 'Psychic', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Victini_(Pok%C3%A9mon)
['#001', '', '#495', '', '', '', 'Snivy', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Snivy_(Pok%C3%A9mon)
['#002', '', '#496', '', '', '', 'Servine', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Servine_(Pok%C3%A9mon)
['#003', '', '#497', '', '', '', 'Serperior', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Serperior_(Pok%C3%A9mon)
['#004', '', '#498', '', '', '', 'Tepig', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Tepig_(Pok%C3%A9mon)
['#005', '', '#499', '', '', '', 'Pignite', '', 'Fire', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Pignite_(Pok%C3%A9mon)
['#006', '', '#500', '', '', '', 'Emboar', '', 'Fire', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Emboar_(Pok%C3%A9mon)
['#007', '', '#501', '', '', '', 'Oshawott', '', 'Water'] https://bulbapedia.

### Dataframe Conversion 

In [32]:
gen5_pokemon = []
info_start = 1
url1 = "https://bulbapedia.bulbagarden.net/wiki/"
url2 = "_(Pok%C3%A9mon)"
# place where to get the pokemon info
info_row=gen5_list.contents[info_start]

for row_index, pokemon_info_values in enumerate(info_row.contents):
    # Pokemon rows sit at the even, non-zero indices (index 0 is the header row)
    if row_index % 2 == 0 and row_index != 0:
        pokemon_raw_info = pokemon_info_values.text.strip().split('\n')
        
        generation = "V"

        if len(pokemon_raw_info) == 10:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = pokemon_raw_info[9]
            poke_url = url1+pokemon_raw_info[6]+url2
        
        elif len(pokemon_raw_info) == 9:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = ''
            poke_url = url1+pokemon_raw_info[6]+url2
            
        elif len(pokemon_raw_info) == 8:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = pokemon_raw_info[7]
            poke_url = url1+pokemon_raw_info[4]+url2
            
        elif len(pokemon_raw_info) == 7:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = ''
            poke_url = url1+pokemon_raw_info[4]+url2

        else:
            print('Check out rows containing ' + str(len(pokemon_raw_info)) + ' elements')
            # Skip the row; otherwise the previous row's values would be appended again
            continue
        
        # Saving as a tuple
        gen5_pokemon.append((kdex, ndex, poke_name, type1, type2, generation, poke_url))
        
    
    else:
        pass
#         print(pokemon_info_values)
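
The length branches above differ only in whether the regional dex number and the second type are present. A minimal sketch of the same mapping as a standalone function, assuming the field layout seen in the printed rows; the name `parse_row` and the constants are illustrative and not part of the notebook:

```python
URL_PREFIX = "https://bulbapedia.bulbagarden.net/wiki/"
URL_SUFFIX = "_(Pok%C3%A9mon)"

def parse_row(fields, generation):
    """Map one split table row to (kdex, ndex, name, type1, type2, generation, url).

    Returns None for a layout this sketch does not recognise.
    """
    if len(fields) in (9, 10):
        # Full row: regional number, '', national number, '', '', '', name, '', type(s)
        kdex, ndex, name, types = fields[0], fields[2], fields[6], fields[8:]
    elif len(fields) in (7, 8):
        # Row without a regional number cell: everything shifts two places left
        kdex, ndex, name, types = '', fields[0], fields[4], fields[6:]
    else:
        return None
    type1 = types[0]
    type2 = types[1] if len(types) > 1 else ''
    return (kdex, ndex, name, type1, type2, generation, URL_PREFIX + name + URL_SUFFIX)

# Sample rows copied from the printed output of this notebook
full = parse_row(['#000', '', '#494', '', '', '', 'Victini', '', 'Psychic', 'Fire'], 'V')
single = parse_row(['#001', '', '#495', '', '', '', 'Snivy', '', 'Grass'], 'V')
shifted = parse_row(['#724', '', '', '', 'Decidueye', '', 'Grass', 'Fighting'], 'VII')
print(full)
print(single)
print(shifted)
```

Returning `None` for an unknown layout lets the caller decide whether to skip or report the row, instead of appending stale values.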

In [33]:
df_gen5_pokemon = pd.DataFrame(gen5_pokemon)

In [34]:
df_gen5_pokemon.columns = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_gen5_pokemon

Unnamed: 0,Kdex,Ndex,Pokemon,Type 1,Type 2,Generation,Pokemon's URL
0,#000,#494,Victini,Psychic,Fire,V,https://bulbapedia.bulbagarden.net/wiki/Victin...
1,#001,#495,Snivy,Grass,,V,https://bulbapedia.bulbagarden.net/wiki/Snivy_...
2,#002,#496,Servine,Grass,,V,https://bulbapedia.bulbagarden.net/wiki/Servin...
3,#003,#497,Serperior,Grass,,V,https://bulbapedia.bulbagarden.net/wiki/Serper...
4,#004,#498,Tepig,Fire,,V,https://bulbapedia.bulbagarden.net/wiki/Tepig_...
...,...,...,...,...,...,...,...
168,#152,#646,Kyurem,Dragon,Ice,V,https://bulbapedia.bulbagarden.net/wiki/Kyurem...
169,#153,#647,Keldeo,Water,Fighting,V,https://bulbapedia.bulbagarden.net/wiki/Keldeo...
170,#154,#648,Meloetta,Normal,Psychic,V,https://bulbapedia.bulbagarden.net/wiki/Meloet...
171,#154,#648,Meloetta,Normal,Fighting,V,https://bulbapedia.bulbagarden.net/wiki/Meloet...
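
The column names can also be supplied when the dataframe is built, which avoids a separate assignment. A small sketch using two rows taken from the output above; `COLUMNS`, `sample_rows` and `df_sample` are illustrative names:

```python
import pandas as pd

COLUMNS = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', "Pokemon's URL"]

# Two tuples in the same shape the conversion loop appends
sample_rows = [
    ('#000', '#494', 'Victini', 'Psychic', 'Fire', 'V',
     'https://bulbapedia.bulbagarden.net/wiki/Victini_(Pok%C3%A9mon)'),
    ('#001', '#495', 'Snivy', 'Grass', '', 'V',
     'https://bulbapedia.bulbagarden.net/wiki/Snivy_(Pok%C3%A9mon)'),
]

# Passing columns= names the columns at construction time
df_sample = pd.DataFrame(sample_rows, columns=COLUMNS)
print(df_sample[['Ndex', 'Pokemon', 'Type 1', 'Type 2']])
```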


## Generation VI Pokemons


In [35]:
select_generation=6
gen6_list=poke_tables[select_generation]
list(gen6_list.contents)[1]

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #F16A81"><a href="/wiki/List_of_Pok%C3%A9mon_by_Kalos_Pok%C3%A9dex_number" title="List of Pokémon by Kalos Pokédex number"><span style="color:#000;"><small>Ce/Co/Mo</small><br/>Kdex</span></a>
</th>
<th style="background: #F16A81">Ndex
</th>
<th style="background: #F16A81">MS
</th>
<th style="background: #F16A81">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #F16A81">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001Ce
</td>
<td style="font-family:monospace">#650
</td>
<th><a href="/wiki/Chespin_(Pok%C3%A9mon)" ti

In [36]:
poke_info=gen6_list.contents[1]
poke_info

In [37]:
index = 0
for each in poke_info:
    if each != '\n':
        if len(each.text.strip().split('\n')):
            if index!=0:
                print(each.text.strip().split('\n'), end=" ")
            elif index==0:
                print(each.text.strip().split('\n'))
            spliturl = each.text.strip().split('\n')[6]
            if index!=0:
                print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(spliturl))
            index+=1

['Ce/Co/MoKdex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001Ce', '', '#650', '', '', '', 'Chespin', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Chespin_(Pok%C3%A9mon)
['#002Ce', '', '#651', '', '', '', 'Quilladin', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Quilladin_(Pok%C3%A9mon)
['#003Ce', '', '#652', '', '', '', 'Chesnaught', '', 'Grass', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Chesnaught_(Pok%C3%A9mon)
['#004Ce', '', '#653', '', '', '', 'Fennekin', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Fennekin_(Pok%C3%A9mon)
['#005Ce', '', '#654', '', '', '', 'Braixen', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Braixen_(Pok%C3%A9mon)
['#006Ce', '', '#655', '', '', '', 'Delphox', '', 'Fire', 'Psychic'] https://bulbapedia.bulbagarden.net/wiki/Delphox_(Pok%C3%A9mon)
['#007Ce', '', '#656', '', '', '', 'Froakie', '', 'Water'] https://bulbapedia.bulbagarden.net/wiki/Froakie_(Pok%C3%A9mon)
['#008Ce', '', '#657', '', '', '', 'Frogadier', 

### Dataframe Conversion 

In [38]:
gen6_pokemon = []
info_start = 1
url1 = "https://bulbapedia.bulbagarden.net/wiki/"
url2 = "_(Pok%C3%A9mon)"
# place where to get the pokemon info
info_row=gen6_list.contents[info_start]

for row_index, pokemon_info_values in enumerate(info_row.contents):
    # Pokemon rows sit at the even, non-zero indices (index 0 is the header row)
    if row_index % 2 == 0 and row_index != 0:
        pokemon_raw_info = pokemon_info_values.text.strip().split('\n')
        
        generation = "VI"

        if len(pokemon_raw_info) == 10:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = pokemon_raw_info[9]
            poke_url = url1+pokemon_raw_info[6]+url2
        
        elif len(pokemon_raw_info) == 9:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = ''
            poke_url = url1+pokemon_raw_info[6]+url2
            
        elif len(pokemon_raw_info) == 8:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = pokemon_raw_info[7]
            poke_url = url1+pokemon_raw_info[4]+url2

        else:
            print('Check out rows containing ' + str(len(pokemon_raw_info)) + ' elements')
            # Skip the row; otherwise the previous row's values would be appended again
            continue
        
        # Saving as a tuple
        gen6_pokemon.append((kdex, ndex, poke_name, type1, type2, generation, poke_url))
        
    
    else:
        pass
#         print(pokemon_info_values)

In [39]:
df_gen6_pokemon = pd.DataFrame(gen6_pokemon)

In [40]:
df_gen6_pokemon.columns = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_gen6_pokemon

Unnamed: 0,Kdex,Ndex,Pokemon,Type 1,Type 2,Generation,Pokemon's URL
0,#001Ce,#650,Chespin,Grass,,VI,https://bulbapedia.bulbagarden.net/wiki/Chespi...
1,#002Ce,#651,Quilladin,Grass,,VI,https://bulbapedia.bulbagarden.net/wiki/Quilla...
2,#003Ce,#652,Chesnaught,Grass,Fighting,VI,https://bulbapedia.bulbagarden.net/wiki/Chesna...
3,#004Ce,#653,Fennekin,Fire,,VI,https://bulbapedia.bulbagarden.net/wiki/Fennek...
4,#005Ce,#654,Braixen,Fire,,VI,https://bulbapedia.bulbagarden.net/wiki/Braixe...
...,...,...,...,...,...,...,...
71,#150Mo,#718,Zygarde,Dragon,Ground,VI,https://bulbapedia.bulbagarden.net/wiki/Zygard...
72,#151Ce,#719,Diancie,Rock,Fairy,VI,https://bulbapedia.bulbagarden.net/wiki/Dianci...
73,#152Ce,#720,Hoopa,Psychic,Ghost,VI,https://bulbapedia.bulbagarden.net/wiki/Hoopa_...
74,#152Ce,#720,Hoopa,Psychic,Dark,VI,https://bulbapedia.bulbagarden.net/wiki/Hoopa_...


## Generation VII Pokemons


In [41]:
select_generation=7
gen7_list=poke_tables[select_generation]
list(gen7_list.contents)[1]

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #90BDDC">Adex
</th>
<th style="background: #90BDDC">Ndex
</th>
<th style="background: #90BDDC">MS
</th>
<th style="background: #90BDDC">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #90BDDC">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#722
</td>
<th><a href="/wiki/Rowlet_(Pok%C3%A9mon)" title="Rowlet"><img alt="Rowlet" decoding="async" height="68" src="//archives.bulbagarden.net/media/upload/d/d1/722MS8.png" width="68"/></a>
</th>
<td><a href="/wiki/Rowlet_(Pok%C3%A9m

In [42]:
poke_info=gen7_list.contents[1]
poke_info

In [43]:
index = 0
for each in poke_info:
    if each != '\n':
        fields = each.text.strip().split('\n')
        if index == 0:
            print(fields)  # header row
        else:
            # The name is at index 6 in full rows and at index 4 in rows
            # that have no regional dex number
            name = fields[6] if len(fields) >= 9 else fields[4]
            print(fields, end=" ")
            print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(name))
        index += 1

['Adex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001', '', '#722', '', '', '', 'Rowlet', '', 'Grass', 'Flying'] https://bulbapedia.bulbagarden.net/wiki/Rowlet_(Pok%C3%A9mon)
['#002', '', '#723', '', '', '', 'Dartrix', '', 'Grass', 'Flying'] https://bulbapedia.bulbagarden.net/wiki/Dartrix_(Pok%C3%A9mon)
['#003', '', '#724', '', '', '', 'Decidueye', '', 'Grass', 'Ghost'] https://bulbapedia.bulbagarden.net/wiki/Decidueye_(Pok%C3%A9mon)
['#724', '', '', '', 'Decidueye', '', 'Grass', 'Fighting'] https://bulbapedia.bulbagarden.net/wiki/Decidueye_(Pok%C3%A9mon)
['#004', '', '#725', '', '', '', 'Litten', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Litten_(Pok%C3%A9mon)
['#005', '', '#726', '', '', '', 'Torracat', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Torracat_(Pok%C3%A9mon)
['#006', '', '#727', '', '', '', 'Incineroar', '', 'Fire', 'Dark'] https://bulbapedia.bulbagarden.net/wiki/Incineroar_(Pok%C3%A9mon)
['#007', '', '#728', '', '', '', 'Popplio', '', 'Water'] https://bulbapedia.bulbagarden.net/wiki/Popplio_(Pok%
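
Picking the name by position is fragile because the position shifts when a row has fewer cells. One alternative is to read the URL from the row's link rather than rebuilding it from text. The sketch below assumes that every Pokémon row contains a link whose `href` ends in `_(Pok%C3%A9mon)`, as in the HTML shown earlier. The HTML string is a hand-written stand-in modelled on that output, not text fetched from Bulbapedia, and all variable names are illustrative:

```python
from bs4 import BeautifulSoup

BASE_URL = "https://bulbapedia.bulbagarden.net"

# Hand-written sample row (assumption: real rows follow this shape)
sample_html = """
<table><tbody><tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#722
</td>
<th><a href="/wiki/Rowlet_(Pok%C3%A9mon)" title="Rowlet"><img alt="Rowlet"/></a>
</th>
<td><a href="/wiki/Rowlet_(Pok%C3%A9mon)" title="Rowlet">Rowlet</a>
</td></tr></tbody></table>
"""

sample_soup = BeautifulSoup(sample_html, 'html.parser')
row = sample_soup.find('tr')

# First link in the row whose target is a Pokemon page
link = row.find('a', href=lambda h: h and h.endswith('_(Pok%C3%A9mon)'))
pokemon_url = BASE_URL + link['href']
pokemon_name = link.get('title', '')
print(pokemon_name, pokemon_url)
```

Because the link is located by its target rather than by cell position, the same lookup would apply to rows with or without a regional dex number, provided the assumption about the link holds.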

### Dataframe Conversion 

In [44]:
gen7_pokemon = []
info_start = 1
url1 = "https://bulbapedia.bulbagarden.net/wiki/"
url2 = "_(Pok%C3%A9mon)"
# place where to get the pokemon info
info_row=gen7_list.contents[info_start]
for row_index, pokemon_info_values in enumerate(info_row.contents):
    # Pokemon rows sit at the even, non-zero indices (index 0 is the header row)
    if row_index % 2 == 0 and row_index != 0:
        pokemon_raw_info = pokemon_info_values.text.strip().split('\n')
        
        generation = "VII"

        if len(pokemon_raw_info) == 10:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = pokemon_raw_info[9]
            poke_url = url1+pokemon_raw_info[6]+url2
        
        elif len(pokemon_raw_info) == 9:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = ''
            poke_url = url1+pokemon_raw_info[6]+url2
            
        elif len(pokemon_raw_info) == 8:
            kdex = ''
            ndex = pokemon_raw_info[0]
            poke_name = pokemon_raw_info[4]
            type1 = pokemon_raw_info[6]
            type2 = pokemon_raw_info[7]
            poke_url = url1+pokemon_raw_info[4]+url2

        else:
            print('Check out rows containing ' + str(len(pokemon_raw_info)) + ' elements')
            # Skip the row; otherwise the previous row's values would be appended again
            continue
        
        # Saving as a tuple
        gen7_pokemon.append((kdex, ndex, poke_name, type1, type2, generation, poke_url))
        
    
    else:
        pass
#         print(pokemon_info_values)

In [45]:
df_gen7_pokemon = pd.DataFrame(gen7_pokemon)

In [46]:
df_gen7_pokemon.columns = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_gen7_pokemon

Unnamed: 0,Kdex,Ndex,Pokemon,Type 1,Type 2,Generation,Pokemon's URL
0,#001,#722,Rowlet,Grass,Flying,VII,https://bulbapedia.bulbagarden.net/wiki/Rowlet...
1,#002,#723,Dartrix,Grass,Flying,VII,https://bulbapedia.bulbagarden.net/wiki/Dartri...
2,#003,#724,Decidueye,Grass,Ghost,VII,https://bulbapedia.bulbagarden.net/wiki/Decidu...
3,,#724,Decidueye,Grass,Fighting,VII,https://bulbapedia.bulbagarden.net/wiki/Decidu...
4,#004,#725,Litten,Fire,,VII,https://bulbapedia.bulbagarden.net/wiki/Litten...
...,...,...,...,...,...,...,...
87,#392,#805,Stakataka,Rock,Steel,VII,https://bulbapedia.bulbagarden.net/wiki/Stakat...
88,#393,#806,Blacephalon,Fire,Ghost,VII,https://bulbapedia.bulbagarden.net/wiki/Blacep...
89,#403,#807,Zeraora,Electric,,VII,https://bulbapedia.bulbagarden.net/wiki/Zeraor...
90,#---,#808,Meltan,Steel,,VII,https://bulbapedia.bulbagarden.net/wiki/Meltan...


## Generation VIII Pokemons


In [47]:
select_generation=8
gen8_list=poke_tables[select_generation]
list(gen8_list.contents)[1]

<tbody><tr>
<th style="border-top-left-radius: 5px; -moz-border-radius-topleft: 5px; -webkit-border-top-left-radius: 5px; -khtml-border-top-left-radius: 5px; -icab-border-top-left-radius: 5px; -o-border-top-left-radius: 5px; background: #D5598C">Gdex
</th>
<th style="background: #D5598C">Ndex
</th>
<th style="background: #D5598C">MS
</th>
<th style="background: #D5598C">Pokémon
</th>
<th colspan="2" style="border-top-right-radius: 5px; -moz-border-radius-topright: 5px; -webkit-border-top-right-radius: 5px; -khtml-border-top-right-radius: 5px; -icab-border-top-right-radius: 5px; -o-border-top-right-radius: 5px; background: #D5598C">Type
</th></tr>
<tr style="background:#FFF">
<td style="font-family:monospace">#001
</td>
<td style="font-family:monospace">#810
</td>
<th><a href="/wiki/Grookey_(Pok%C3%A9mon)" title="Grookey"><img alt="Grookey" decoding="async" height="68" src="//archives.bulbagarden.net/media/upload/e/ec/810MS8.png" width="68"/></a>
</th>
<td><a href="/wiki/Grookey_(Pok%C3

In [48]:
poke_info=gen8_list.contents[1]
poke_info

In [49]:
index = 0
for each in poke_info:
    if each != '\n':
        fields = each.text.strip().split('\n')
        if index == 0:
            print(fields)  # header row
        else:
            print(fields, end=" ")
            # The Pokemon's name is at index 6; index 8 holds its first type
            print("https://bulbapedia.bulbagarden.net/wiki/{}_(Pok%C3%A9mon)".format(fields[6]))
        index += 1

['Gdex', '', 'Ndex', '', 'MS', '', 'Pokémon', '', 'Type']
['#001', '', '#810', '', '', '', 'Grookey', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Grookey_(Pok%C3%A9mon)
['#002', '', '#811', '', '', '', 'Thwackey', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Thwackey_(Pok%C3%A9mon)
['#003', '', '#812', '', '', '', 'Rillaboom', '', 'Grass'] https://bulbapedia.bulbagarden.net/wiki/Rillaboom_(Pok%C3%A9mon)
['#004', '', '#813', '', '', '', 'Scorbunny', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Scorbunny_(Pok%C3%A9mon)
['#005', '', '#814', '', '', '', 'Raboot', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Raboot_(Pok%C3%A9mon)
['#006', '', '#815', '', '', '', 'Cinderace', '', 'Fire'] https://bulbapedia.bulbagarden.net/wiki/Cinderace_(Pok%C3%A9mon)
['#007', '', '#816', '', '', '', 'Sobble', '', 'Water'] https://bulbapedia.bulbagarden.net/wiki/Sobble_(Pok%C3%A9mon)
['#008', '', '#817', '', '', '', 'Drizzile', '', 'Water'] https://bulbapedia.bulbagarden.net/wiki/Drizzile_(Pok%C3%A9mon

### Dataframe Conversion 

In [50]:
gen8_pokemon = []
info_start = 1
url1 = "https://bulbapedia.bulbagarden.net/wiki/"
url2 = "_(Pok%C3%A9mon)"
# place where to get the pokemon info
info_row=gen8_list.contents[info_start]
for row_index, pokemon_info_values in enumerate(info_row.contents):
    # Pokemon rows sit at the even, non-zero indices (index 0 is the header row)
    if row_index % 2 == 0 and row_index != 0:
        pokemon_raw_info = pokemon_info_values.text.strip().split('\n')
        
        generation = "VIII"

        if len(pokemon_raw_info) == 10:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = pokemon_raw_info[9]
            poke_url = url1+pokemon_raw_info[6]+url2
        
        elif len(pokemon_raw_info) == 9:
            kdex = pokemon_raw_info[0]
            ndex = pokemon_raw_info[2]
            poke_name = pokemon_raw_info[6]
            type1 = pokemon_raw_info[8]
            type2 = ''
            poke_url = url1+pokemon_raw_info[6]+url2

        else:
            print('Check out rows containing ' + str(len(pokemon_raw_info)) + ' elements')
            # Skip the row; otherwise the previous row's values would be appended again
            continue
        
        # Saving as a tuple
        gen8_pokemon.append((kdex, ndex, poke_name, type1, type2, generation, poke_url))
        
    
    else:
        pass
#         print(pokemon_info_values)

In [51]:
df_gen8_pokemon = pd.DataFrame(gen8_pokemon)

In [52]:
df_gen8_pokemon.columns = ['Kdex', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_gen8_pokemon

Unnamed: 0,Kdex,Ndex,Pokemon,Type 1,Type 2,Generation,Pokemon's URL
0,#001,#810,Grookey,Grass,,VIII,https://bulbapedia.bulbagarden.net/wiki/Grooke...
1,#002,#811,Thwackey,Grass,,VIII,https://bulbapedia.bulbagarden.net/wiki/Thwack...
2,#003,#812,Rillaboom,Grass,,VIII,https://bulbapedia.bulbagarden.net/wiki/Rillab...
3,#004,#813,Scorbunny,Fire,,VIII,https://bulbapedia.bulbagarden.net/wiki/Scorbu...
4,#005,#814,Raboot,Fire,,VIII,https://bulbapedia.bulbagarden.net/wiki/Raboot...
...,...,...,...,...,...,...,...
96,#---,#901,Ursaluna,Ground,Normal,VIII,https://bulbapedia.bulbagarden.net/wiki/Ursalu...
97,#---,#902,Basculegion,Water,Ghost,VIII,https://bulbapedia.bulbagarden.net/wiki/Bascul...
98,#---,#903,Sneasler,Fighting,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Sneasl...
99,#---,#904,Overqwil,Dark,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Overqw...


## Joining all Dataframes
The per-generation dataframes are combined into one with `pd.concat`. Setting `ignore_index=True` renumbers the rows from 0 instead of repeating each generation's own index.

In [53]:
df_pokemons = pd.concat([df_gen2_pokemon, 
                         df_gen3_pokemon, 
                         df_gen4_pokemon, 
                         df_gen5_pokemon, 
                         df_gen6_pokemon, 
                         df_gen7_pokemon, 
                         df_gen8_pokemon], ignore_index=True)
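
To see what `ignore_index=True` changes, here is a small sketch with two hand-made frames; the frames and their names are illustrative only:

```python
import pandas as pd

df_a = pd.DataFrame({'Pokemon': ['Chikorita', 'Bayleef'], 'Generation': ['II', 'II']})
df_b = pd.DataFrame({'Pokemon': ['Victini', 'Snivy'], 'Generation': ['V', 'V']})

# Without ignore_index each frame keeps its own labels, so they repeat
kept = pd.concat([df_a, df_b])
# With ignore_index=True the combined frame gets a fresh 0..n-1 index
renumbered = pd.concat([df_a, df_b], ignore_index=True)

print(list(kept.index))        # [0, 1, 0, 1]
print(list(renumbered.index))  # [0, 1, 2, 3]
```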

In [54]:
df_pokemons

Unnamed: 0,Kdex,Ndex,Pokemon,Type 1,Type 2,Generation,Pokemon's URL
0,#001,#152,Chikorita,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Chikor...
1,#002,#153,Bayleef,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Baylee...
2,#003,#154,Meganium,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Megani...
3,#004,#155,Cyndaquil,Fire,,II,https://bulbapedia.bulbagarden.net/wiki/Cyndaq...
4,#005,#156,Quilava,Fire,,II,https://bulbapedia.bulbagarden.net/wiki/Quilav...
...,...,...,...,...,...,...,...
807,#---,#901,Ursaluna,Ground,Normal,VIII,https://bulbapedia.bulbagarden.net/wiki/Ursalu...
808,#---,#902,Basculegion,Water,Ghost,VIII,https://bulbapedia.bulbagarden.net/wiki/Bascul...
809,#---,#903,Sneasler,Fighting,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Sneasl...
810,#---,#904,Overqwil,Dark,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Overqw...


### Print the Dataframe Containing All Pokemon Generations

In [55]:
df_pokemons.columns = ['Region', 'Ndex', 'Pokemon', 'Type 1', 'Type 2', 'Generation', 'Pokemon\'s URL']
df_pokemons

Unnamed: 0,Region,Ndex,Pokemon,Type 1,Type 2,Generation,Pokemon's URL
0,#001,#152,Chikorita,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Chikor...
1,#002,#153,Bayleef,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Baylee...
2,#003,#154,Meganium,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Megani...
3,#004,#155,Cyndaquil,Fire,,II,https://bulbapedia.bulbagarden.net/wiki/Cyndaq...
4,#005,#156,Quilava,Fire,,II,https://bulbapedia.bulbagarden.net/wiki/Quilav...
...,...,...,...,...,...,...,...
807,#---,#901,Ursaluna,Ground,Normal,VIII,https://bulbapedia.bulbagarden.net/wiki/Ursalu...
808,#---,#902,Basculegion,Water,Ghost,VIII,https://bulbapedia.bulbagarden.net/wiki/Bascul...
809,#---,#903,Sneasler,Fighting,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Sneasl...
810,#---,#904,Overqwil,Dark,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Overqw...


## Scrapetime

In [56]:
end = datetime.now()
end_time = end.strftime("%H:%M:%S")

In [57]:
start = start_time.split(":")
startHour = int(start[0])
startMinute = int(start[1])
startSeconds = int(start[2])

end = end_time.split(":")
endHour = int(end[0])
endMinute = int(end[1])
endSeconds = int(end[2])

In [58]:
hours = endHour - startHour
mins  = endMinute - startMinute
secs  = endSeconds - startSeconds

# The modulo keeps the result non-negative if the run crosses midnight
scrapetime = (secs + (mins*60) + (hours*3600)) % 86400

### Total Scrapetime

In [59]:
print("Scrapetime started at "+ start_time +" and ended at "+end_time)
print("Total Scrapetime:",scrapetime, "seconds")

Scrapetime started at 01:29:28 and ended at 01:29:34
Total Scrapetime: 6 seconds
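
Subtracting two `datetime` objects gives a `timedelta`, which removes the need to split the formatted strings and also handles runs that cross midnight. A sketch with fixed timestamps matching the run above; the date is arbitrary and the variable names are illustrative:

```python
from datetime import datetime

# Fixed timestamps standing in for the values captured with datetime.now()
run_start = datetime(2023, 1, 1, 1, 29, 28)
run_end = datetime(2023, 1, 1, 1, 29, 34)

elapsed = run_end - run_start              # a datetime.timedelta
elapsed_seconds = elapsed.total_seconds()  # float number of seconds
print("Total Scrapetime:", elapsed_seconds, "seconds")

# A run that crosses midnight still yields a positive duration
late_start = datetime(2023, 1, 1, 23, 59, 58)
early_end = datetime(2023, 1, 2, 0, 0, 4)
midnight_seconds = (early_end - late_start).total_seconds()
print(midnight_seconds)
```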


## Converting Dataframe to JSON file

In [60]:
df_pokemons.to_json('celestino_dayon_pokemon.json',orient='columns')

### Read JSON file

In [61]:
pd.read_json('celestino_dayon_pokemon.json')

Unnamed: 0,Region,Ndex,Pokemon,Type 1,Type 2,Generation,Pokemon's URL
0,#001,#152,Chikorita,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Chikor...
1,#002,#153,Bayleef,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Baylee...
2,#003,#154,Meganium,Grass,,II,https://bulbapedia.bulbagarden.net/wiki/Megani...
3,#004,#155,Cyndaquil,Fire,,II,https://bulbapedia.bulbagarden.net/wiki/Cyndaq...
4,#005,#156,Quilava,Fire,,II,https://bulbapedia.bulbagarden.net/wiki/Quilav...
...,...,...,...,...,...,...,...
807,#---,#901,Ursaluna,Ground,Normal,VIII,https://bulbapedia.bulbagarden.net/wiki/Ursalu...
808,#---,#902,Basculegion,Water,Ghost,VIII,https://bulbapedia.bulbagarden.net/wiki/Bascul...
809,#---,#903,Sneasler,Fighting,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Sneasl...
810,#---,#904,Overqwil,Dark,Poison,VIII,https://bulbapedia.bulbagarden.net/wiki/Overqw...
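
With `orient='columns'`, the JSON maps each column name to an object keyed by row index. A small sketch of the round trip on an in-memory buffer rather than a file. The two-row frame and its names are illustrative, and `dtype=False` is passed on the assumption that the string columns should be read back without type inference:

```python
from io import StringIO
import json

import pandas as pd

df_small = pd.DataFrame({
    'Ndex': ['#152', '#153'],
    'Pokemon': ['Chikorita', 'Bayleef'],
    'Type 2': ['', ''],
})

# With no path argument, to_json returns the JSON text as a string
json_text = df_small.to_json(orient='columns')
parsed = json.loads(json_text)
print(parsed['Ndex'])

# Read the same text back into a dataframe
restored = pd.read_json(StringIO(json_text), orient='columns', dtype=False)
print(restored)
```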
