# Web Scraping: Several Tables (No ID) on the Same Page

---
* Author:  [Yuttapong Mahasittiwat](mailto:khala1391@gmail.com)
* Technologist | Data Modeler | Data Analyst
* [YouTube](https://www.youtube.com/khala1391)
* [LinkedIn](https://www.linkedin.com/in/yuttapong-m/)
* [Tableau](https://public.tableau.com/app/profile/yuttapong.m/vizzes)
---

Reference: [WS CubeTech YouTube channel](https://www.youtube.com/watch?v=UabBGhnVqSo&list=PLc20sA5NNOvrsn3a78ewy2VTCXVV47NB4&index=1&t=0s)

In [None]:
import datetime
print(datetime.datetime.now())

2024-10-20 20:41:04.090947


## Import libraries

In [1]:
from urllib.request import urlopen  # option 1: standard library (imported but not used below)
import requests  # option 2: used below to fetch the page

import pandas as pd
from bs4 import BeautifulSoup

In [114]:
url = "https://www.iplt20.com/auction/2022"

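r = requests.get(url)
r.raise_for_status()  # stop early if the request failed
soup = BeautifulSoup(r.text, "lxml")  # the "lxml" parser requires the lxml package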

In [115]:
# find() returns only the first table that matches the class
table = soup.find("table", class_="ih-td-tab auction-tbl")
print(table)

<table class="ih-td-tab auction-tbl" width="100%">
<tr class="ih-pt-tbl" style="display:revert">
<th class="skip-filter" style="width:25%;text-align: left;">TEAM</th>
<th class="skip-filter" style="width:25%;">FUNDS REMAINING</th>
<th class="skip-filter" style="width:25%;">OVERSEAS PLAYERS </th>
<th class="skip-filter" style="width:25%;">TOTAL PLAYERS</th>
</tr>
<tbody id="pointsdata">
<tr>
<td class="ih-t-color">
<div class="ih-pt-ic">
<div class="ih-pt-img" style="width: 60px">
<img alt="" src="https://documents.iplt20.com/ipl/franchises/1644311961_CSKroundbig.png"/>
</div>
<h2 class="ih-pt-cont">Chennai Super Kings</h2>
</div>
</td>
<td>₹2,95,00,000</td>
<td>8</td>
<td>25</td>
</tr>
<tr>
<td class="ih-t-color">
<div class="ih-pt-ic">
<div class="ih-pt-img" style="width: 60px">
<img alt="" src="https://documents.iplt20.com/ipl/franchises/1644312373_DCroundbig.png"/>
</div>
<h2 class="ih-pt-cont">Delhi Capitals</h2>
</div>
</td>
<td>₹10,00,000</td>
<td>7</td>
<td>24</td>
</tr>
<tr>
<t

In [116]:
headers = table.find_all("th", class_="skip-filter")
header_list = [header.text for header in headers]
print(header_list)

df = pd.DataFrame(columns=header_list)
df


['TEAM', 'FUNDS REMAINING', 'OVERSEAS PLAYERS ', 'TOTAL PLAYERS']


|   | TEAM | FUNDS REMAINING | OVERSEAS PLAYERS | TOTAL PLAYERS |
|---|------|-----------------|------------------|---------------|
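
The printed header list shows that `'OVERSEAS PLAYERS '` carries a trailing space, which makes later column lookups by name easy to get wrong. A minimal sketch of cleaning the names, using the list exactly as printed above; `clean_headers` is an illustrative name, not part of the original notebook.

```python
# Header texts as printed above; note the trailing space in the third entry.
header_list = ['TEAM', 'FUNDS REMAINING', 'OVERSEAS PLAYERS ', 'TOTAL PLAYERS']

# clean_headers is an illustrative name, not part of the original notebook.
clean_headers = [h.strip() for h in header_list]
print(clean_headers)
```

Passing the cleaned list to `pd.DataFrame(columns=...)` would give column names without stray whitespace.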


In [117]:
rows = table.find_all("tr")

# way#1 strip after loop
for tr in rows[1:]:  # rows[0] is the header row
    cells = tr.find_all("td")
    row = [td.text for td in cells]
    # print(row)
    df.loc[len(df)] = row  # append at the next integer index

# the TEAM cell wraps the name in nested tags, so its text has surrounding newlines
df['TEAM'] = df['TEAM'].str.strip()

# way#2 strip in loop
# for tr in rows[1:]:
#     first_td = tr.find_all("td")[0].find("h2", class_="ih-pt-cont").text.strip()
#     # first_td = tr.find_all("td")[0].find("div", class_="ih-pt-ic").text.strip()
#     cells = tr.find_all("td")[1:]
#     row = [td.text for td in cells]
#     row.insert(0, first_td)
#     # print(row)
#     df.loc[len(df)] = row


df

|   | TEAM | FUNDS REMAINING | OVERSEAS PLAYERS | TOTAL PLAYERS |
|---|------|-----------------|------------------|---------------|
| 0 | Chennai Super Kings | ₹2,95,00,000 | 8 | 25 |
| 1 | Delhi Capitals | ₹10,00,000 | 7 | 24 |
| 2 | Gujarat Titans | ₹15,00,000 | 8 | 23 |
| 3 | Kolkata Knight Riders | ₹45,00,000 | 8 | 25 |
| 4 | Lucknow Super Giants | ₹0 | 7 | 21 |
| 5 | Mumbai Indians | ₹10,00,000 | 8 | 25 |
| 6 | Punjab Kings | ₹3,45,00,000 | 7 | 25 |
| 7 | Rajasthan Royals | ₹95,00,000 | 8 | 24 |
| 8 | Royal Challengers Bangalore | ₹1,55,00,000 | 8 | 22 |
| 9 | Sunrisers Hyderabad | ₹10,00,000 | 8 | 23 |
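
The `TEAM` column needs stripping because the team name sits inside nested `<div>` and `<h2>` tags, and `.text` keeps the newlines between them. The sketch below reproduces this on one cell copied from the table printed earlier and compares `.text` with `get_text(strip=True)`. It uses `html.parser` instead of `lxml` only to avoid the extra dependency, which is an assumption of this sketch.

```python
from bs4 import BeautifulSoup

# One row copied from the table printed earlier, trimmed to two cells.
html = """
<table><tr>
<td class="ih-t-color">
<div class="ih-pt-ic">
<div class="ih-pt-img" style="width: 60px">
<img alt="" src="https://documents.iplt20.com/ipl/franchises/1644311961_CSKroundbig.png"/>
</div>
<h2 class="ih-pt-cont">Chennai Super Kings</h2>
</div>
</td>
<td>₹2,95,00,000</td>
</tr></table>
"""

# html.parser is used here only to avoid the lxml dependency (assumption).
cell = BeautifulSoup(html, "html.parser").find("td")

raw = cell.text                    # keeps the newlines between nested tags
clean = cell.get_text(strip=True)  # strips each text fragment, drops empty ones

print(repr(raw))
print(repr(clean))
```

Using `get_text(strip=True)` inside the loop would be a third option next to way#1 and way#2.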


In [74]:
df.to_csv("data/auction_data.csv")
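
`to_csv` does not create missing folders, and by default it also writes the row index as an extra first column. A hedged sketch of both points follows; it uses a two-row stand-in for `df` and a temporary folder in place of `data/`, so the names `out_dir` and `out_path` are illustrative only.

```python
import os
import tempfile

import pandas as pd

# Two rows copied from the output above, standing in for the scraped df.
df = pd.DataFrame(
    {
        "TEAM": ["Chennai Super Kings", "Delhi Capitals"],
        "FUNDS REMAINING": ["₹2,95,00,000", "₹10,00,000"],
    }
)

# A temporary folder keeps this sketch side-effect free; the notebook writes to "data/".
out_dir = os.path.join(tempfile.mkdtemp(), "data")
os.makedirs(out_dir, exist_ok=True)  # create the folder before writing
out_path = os.path.join(out_dir, "auction_data.csv")

# index=False leaves the 0..n row labels out of the file.
df.to_csv(out_path, index=False)

print(pd.read_csv(out_path).columns.tolist())
```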

## Exercise: data from table#2

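**Key points**
- Use `find_all` and pick the table by index (`[1]` for the second table), because the tables share one class and have no ID.
- The remaining steps are the same as for the first table.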
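
The key points can be sketched on a tiny stand-in page. The markup below is invented for illustration (only the class name comes from the real page), and `html.parser` is used in place of `lxml` to keep the sketch dependency-light.

```python
from bs4 import BeautifulSoup

# Stand-in page: two tables share one class and neither has an id.
html = """
<table class="ih-td-tab auction-tbl"><tr><th>TEAM</th><th>TOTAL PLAYERS</th></tr></table>
<table class="ih-td-tab auction-tbl"><tr><th>TEAM</th><th>PLAYER</th></tr></table>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all returns every match in document order, so an index selects one table.
tables = soup.find_all("table", class_="ih-td-tab auction-tbl")
print(len(tables))

second = tables[1]
second_headers = [th.text for th in second.find_all("th")]
print(second_headers)
```

`soup.find(...)` with the same arguments would return only the first of these tables, which is why the index is needed for the second one.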

In [136]:
table2 = soup.find_all("table", class_="ih-td-tab auction-tbl")[1]
# print(table2)
print(table2.text)



TEAM
PLAYER
TYPE 
PRICE








Mumbai Indians


Ishan Kishan
Wicket Keeper
₹15,25,00,000







Chennai Super Kings


Deepak Chahar
Bowler
₹14,00,00,000







Kolkata Knight Riders


Shreyas Iyer
Batsman
₹12,25,00,000







Punjab Kings


Liam Livingstone
All-Rounder
₹11,50,00,000







Delhi Capitals


Shardul Thakur
Bowler
₹10,75,00,000







Royal Challengers Bangalore


Harshal Patel
All-Rounder
₹10,75,00,000







Royal Challengers Bangalore


Wanindu Hasaranga
All-Rounder
₹10,75,00,000







Sunrisers Hyderabad


Nicholas Pooran
Wicket Keeper
₹10,75,00,000







Gujarat Titans


Lockie Ferguson
Bowler
₹10,00,00,000







Lucknow Super Giants


Avesh Khan
Bowler
₹10,00,00,000







Rajasthan Royals


Prasidh Krishna
Bowler
₹10,00,00,000
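
Because the remaining steps are the same for every table, they can also be wrapped in a small helper. The sketch below is one possible shape, not the notebook's method: `table_to_df` is a hypothetical name, the markup is a stand-in built from two rows of the output above, and the helper collects rows in a list and builds the DataFrame once rather than appending with `df.loc`.

```python
import pandas as pd
from bs4 import BeautifulSoup


def table_to_df(table):
    """Build a DataFrame from one <table> Tag (hypothetical helper)."""
    # Header names, stripped so 'TYPE ' becomes 'TYPE'.
    headers = [th.get_text(strip=True) for th in table.find_all("th")]
    records = []
    for tr in table.find_all("tr"):
        cells = tr.find_all("td")
        if not cells:  # the header row holds only <th> cells
            continue
        records.append([td.get_text(strip=True) for td in cells])
    # Building the frame once avoids growing it row by row.
    return pd.DataFrame(records, columns=headers)


# Stand-in markup (assumed structure) with two rows taken from the output above.
html = """
<table class="ih-td-tab auction-tbl">
<tr><th class="skip-filter">TEAM</th><th class="skip-filter">PLAYER</th>
<th class="skip-filter">TYPE </th><th class="skip-filter">PRICE</th></tr>
<tr><td><h2 class="ih-pt-cont">Mumbai Indians</h2></td><td>Ishan Kishan</td>
<td>Wicket Keeper</td><td>₹15,25,00,000</td></tr>
<tr><td><h2 class="ih-pt-cont">Chennai Super Kings</h2></td><td>Deepak Chahar</td>
<td>Bowler</td><td>₹14,00,00,000</td></tr>
</table>
"""
table = BeautifulSoup(html, "html.parser").find("table")
df_demo = table_to_df(table)
print(df_demo)
```

With the real page, the same helper could be called on each element returned by `find_all`.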





In [140]:
headers = table2.find_all("th", class_="skip-filter")
header_list = [header.text for header in headers]
# print(header_list)

df2 = pd.DataFrame(columns=header_list)
df2

|   | TEAM | PLAYER | TYPE | PRICE |
|---|------|--------|------|-------|


In [141]:
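rows = table2.find_all("tr")

# way#1 strip after loop
for tr in rows[1:]:  # rows[0] is the header row
    cells = tr.find_all("td")
    row = [td.text for td in cells]
    # print(row)
    df2.loc[len(df2)] = row  # append at the next integer index

# the TEAM cell text has surrounding whitespace from its nested tags
df2['TEAM'] = df2['TEAM'].str.strip()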

df2

|   | TEAM | PLAYER | TYPE | PRICE |
|---|------|--------|------|-------|
| 0 | Mumbai Indians | Ishan Kishan | Wicket Keeper | ₹15,25,00,000 |
| 1 | Chennai Super Kings | Deepak Chahar | Bowler | ₹14,00,00,000 |
| 2 | Kolkata Knight Riders | Shreyas Iyer | Batsman | ₹12,25,00,000 |
| 3 | Punjab Kings | Liam Livingstone | All-Rounder | ₹11,50,00,000 |
| 4 | Delhi Capitals | Shardul Thakur | Bowler | ₹10,75,00,000 |
| 5 | Royal Challengers Bangalore | Harshal Patel | All-Rounder | ₹10,75,00,000 |
| 6 | Royal Challengers Bangalore | Wanindu Hasaranga | All-Rounder | ₹10,75,00,000 |
| 7 | Sunrisers Hyderabad | Nicholas Pooran | Wicket Keeper | ₹10,75,00,000 |
| 8 | Gujarat Titans | Lockie Ferguson | Bowler | ₹10,00,00,000 |
| 9 | Lucknow Super Giants | Avesh Khan | Bowler | ₹10,00,00,000 |
