Q1. Write a Python program to scrape all available books from Books to Scrape (https://books.toscrape.com/), a live site built for practicing scraping (safe, legal, no anti-bot measures). For each book, extract the following details: 
1.	Title 
2.	Price 
3.	Availability (In stock / Out of stock) 
4.	Star Rating (One, Two, Three, Four, Five) 

Store the scraped results into a Pandas DataFrame and export them to a CSV file named books.csv. 

(Note: Use the requests library to fetch each HTML page. Use BeautifulSoup to parse the page and extract the book details, and handle pagination so that books from all pages are scraped.) 
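
Pagination can be handled either by counting page numbers until a request fails or by following each page's "next" link. Below is a minimal sketch of the second approach. It assumes the pager marks the link as <li class="next"><a href="...">, which is an assumption about the site's markup, and it uses the standard-library html.parser instead of BeautifulSoup only so that it runs without extra packages. NextLinkParser and next_page_url are hypothetical helper names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkParser(HTMLParser):
    """Remember the href of the first <a> found inside <li class="next">."""

    def __init__(self):
        super().__init__()
        self._in_next = False
        self.href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and "next" in (attrs.get("class") or "").split():
            self._in_next = True
        elif tag == "a" and self._in_next and self.href is None:
            self.href = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_next = False


def next_page_url(current_url, html):
    """Return the absolute URL of the next page, or None on the last page."""
    parser = NextLinkParser()
    parser.feed(html)
    parser.close()
    if parser.href is None:
        return None
    # The href is relative, so resolve it against the page it came from
    return urljoin(current_url, parser.href)


# Assumed pager markup, for illustration only
PAGER = (
    '<ul class="pager">'
    '<li class="previous"><a href="page-1.html">previous</a></li>'
    '<li class="current">Page 2 of 50</li>'
    '<li class="next"><a href="page-3.html">next</a></li>'
    "</ul>"
)

print(next_page_url("https://books.toscrape.com/catalogue/page-2.html", PAGER))
```

A scraping loop built on this would start at the first page and keep requesting next_page_url(...) until it returns None, so it does not depend on knowing the page-number URL pattern in advance.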


In [36]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

base_url = "https://books.toscrape.com/catalogue/page-{}.html"
books_data = []

page = 1
while True:
    url = base_url.format(page)
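    response = requests.get(url, timeout=10)
    
    # A non-200 status (for example a 404 past the last page) ends the loop
    if response.status_code != 200:
        break
    
    # The response declares no charset, so requests falls back to ISO-8859-1
    # and "£" would be decoded as "Â£"; the pages are UTF-8
    response.encoding = "utf-8"
    soup = BeautifulSoup(response.text, "html.parser")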
    books = soup.select("article.product_pod")
    
    if not books:
        break
    
    for book in books:
        title = book.h3.a['title']
        price = book.select_one("p.price_color").text.strip()
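        # Match on "availability" alone so an out-of-stock book does not return None
        availability_tag = book.select_one("p.availability")
        availability = availability_tag.get_text(strip=True) if availability_tag else "Unknown"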
        rating_tag = book.select_one("p.star-rating")
        star_rating = "Unknown"
        if rating_tag:
            for cls in rating_tag.get("class", []):
                if cls in ["One","Two","Three","Four","Five"]:
                    star_rating = cls
                    break
        
        books_data.append({
            "Title": title,
            "Price": price,
            "Availability": availability,
            "Star Rating": star_rating
        })
    
    page += 1

df = pd.DataFrame(books_data)
df.to_csv("books.csv", index=False)

print(f"Scraped {len(df)} books and saved to books.csv")
pd.set_option('display.max_rows', None)
df = pd.read_csv("books.csv")
df.head(100) 

Scraped 1000 books and saved to books.csv


Unnamed: 0,Title,Price,Availability,Star Rating
0,A Light in the Attic,£51.77,In stock,Three
1,Tipping the Velvet,£53.74,In stock,One
2,Soumission,£50.10,In stock,One
3,Sharp Objects,£47.82,In stock,Four
4,Sapiens: A Brief History of Humankind,£54.23,In stock,Five
5,The Requiem Red,£22.65,In stock,One
6,The Dirty Little Secrets of Getting Your Dream...,£33.34,In stock,Four
7,The Coming Woman: A Novel Based on the Life of...,£17.93,In stock,Three
8,The Boys in the Boat: Nine Americans and Their...,£22.60,In stock,Four
9,The Black Maria,£52.15,In stock,One
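
The Price and Star Rating columns are stored as text, which makes sorting or averaging awkward. The following is a minimal sketch of how they could be converted to numbers. parse_price and rating_to_int are hypothetical helper names, and the regular expression assumes prices are written as plain decimal numbers with an optional currency prefix.

```python
import re

# Rating words as they appear in the "Star Rating" column
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_price(text):
    """Return the numeric part of a price string such as '£51.77' as a float."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())


def rating_to_int(word):
    """Return 1-5 for a rating word, or None for anything else (e.g. 'Unknown')."""
    return RATING_WORDS.get(word)


print(parse_price("£51.77"))    # 51.77
print(parse_price("Â£53.74"))   # 53.74, stray characters before the digits are ignored
print(rating_to_int("Three"))   # 3
```

On the DataFrame, these could be applied with df["Price"].map(parse_price) and df["Star Rating"].map(rating_to_int) before exporting.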


Q2. Write a Python program to scrape the IMDB Top 250 Movies list (https://www.imdb.com/chart/top/). For each movie, extract the following details: 
1.	Rank (1–250) 
2.	Movie Title 
3.	Year of Release 
4.	IMDB Rating 

Store the results in a Pandas DataFrame and export it to a CSV file named imdb_top250.csv. 

(Note: Use Selenium/Playwright to scrape the required details from this website) 


In [37]:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd

options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

driver = webdriver.Chrome(options=options)
driver.get("https://www.imdb.com/chart/top/")

wait = WebDriverWait(driver, 20)
wait.until(
    EC.presence_of_element_located((By.CSS_SELECTOR, "ul.ipc-metadata-list"))
)
rows = driver.find_elements(By.CSS_SELECTOR, "li.ipc-metadata-list-summary-item")

print("Found", len(rows), "rows")

data = []

for row in rows:
    title = row.find_element(By.CSS_SELECTOR, "h3.ipc-title__text").text
    year = row.find_element(By.CSS_SELECTOR, "span.cli-title-metadata-item:nth-child(1)").text
    rating = row.find_element(By.CSS_SELECTOR, "span.ipc-rating-star--rating").text
    data.append([title, year, rating])

df = pd.DataFrame(data, columns=["Title", "Year", "Rating"])
df.to_csv("imdb_top250.csv", index=False)

print("✅ imdb_top250.csv saved successfully with", len(df), "movies")

driver.quit()

df = pd.read_csv("imdb_top250.csv")
df.head(100)


Found 250 rows
✅ imdb_top250.csv saved successfully with 250 movies


Unnamed: 0,Title,Year,Rating
0,1. The Shawshank Redemption,1994,9.3
1,2. The Godfather,1972,9.2
2,3. The Dark Knight,2008,9.1
3,4. The Godfather Part II,1974,9.0
4,5. 12 Angry Men,1957,9.0
5,6. The Lord of the Rings: The Return of the King,2003,9.0
6,7. Schindler's List,1993,9.0
7,8. The Lord of the Rings: The Fellowship of th...,2001,8.9
8,9. Pulp Fiction,1994,8.8
9,"10. The Good, the Bad and the Ugly",1966,8.8
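
The task asks for Rank as its own field, while the scraped Title values carry the rank as a prefix ("1. The Shawshank Redemption"). A minimal sketch of splitting the two is shown here. split_rank_title is a hypothetical helper name, and the pattern assumes the prefix is always digits followed by a period and a space.

```python
import re

# "<rank>. <title>", e.g. "1. The Shawshank Redemption"
RANKED_TITLE = re.compile(r"^\s*(\d+)\.\s+(.*\S)\s*$")


def split_rank_title(text):
    """Split '10. Some Title' into (10, 'Some Title'); rank is None if absent."""
    match = RANKED_TITLE.match(text)
    if match is None:
        return None, text.strip()
    return int(match.group(1)), match.group(2)


print(split_rank_title("1. The Shawshank Redemption"))
print(split_rank_title("5. 12 Angry Men"))
print(split_rank_title("10. The Good, the Bad and the Ugly"))
```

Applied to the DataFrame, the pairs returned by df["Title"].map(split_rank_title) could be unpacked into Rank and Title columns before writing imdb_top250.csv.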


Q3. Write a Python program to scrape the weather information for top world cities from the given website (https://www.timeanddate.com/weather/). For each city, extract the following details:

1.	City Name 
2.	Temperature 
3.	Weather Condition (e.g., Clear, Cloudy, Rainy, etc.) 

Store the results in a Pandas DataFrame and export it to a CSV file named weather.csv. 
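
The three requested fields can be pulled out of table rows cell by cell. The sketch below works on a small inline HTML fragment rather than the live page. It assumes that each city occupies four consecutive cells (name, local time, an icon whose title attribute describes the weather, temperature) and that several cities share one row. That layout, the sample values, and the names WeatherRowParser and extract_weather are assumptions for illustration, not taken from the site. The standard-library html.parser is used instead of BeautifulSoup only so the example runs without extra packages.

```python
from html.parser import HTMLParser

# Assumed, simplified row layout with made-up sample values
SAMPLE_ROW = """
<table><tr>
<td><a href="/weather/ghana/accra">Accra</a></td><td>Sat 10:18</td>
<td><img title="Passing clouds. Warm."></td><td>86&nbsp;°F</td>
<td><a href="/weather/ethiopia/addis-ababa">Addis Ababa</a></td><td>Sat 13:18</td>
<td><img title="Partly sunny. Mild."></td><td>68&nbsp;°F</td>
</tr></table>
"""


class WeatherRowParser(HTMLParser):
    """Collect, for every <tr>, the text and image title of each <td>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._cell = {"text": "", "title": None}
            self.rows[-1].append(self._cell)
        elif tag == "img" and self._cell is not None:
            self._cell["title"] = dict(attrs).get("title")

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell["text"] += data


def extract_weather(html, cells_per_city=4):
    """Return one dict per city with the three fields the task asks for."""
    parser = WeatherRowParser()
    parser.feed(html)
    parser.close()
    records = []
    for cells in parser.rows:
        # Walk the row in groups of four cells, one group per city
        for i in range(0, len(cells) - cells_per_city + 1, cells_per_city):
            name, _time, icon, temp = cells[i:i + cells_per_city]
            city = name["text"].strip()
            if not city:
                continue
            records.append({
                "City": city,
                # Non-breaking spaces become ordinary spaces
                "Temperature": temp["text"].replace("\xa0", " ").strip(),
                "Weather Condition": (icon["title"] or "").strip(),
            })
    return records


for record in extract_weather(SAMPLE_ROW):
    print(record)
```

With the page source from a real browser session, the list returned by extract_weather could be passed to pd.DataFrame(...) and written with to_csv("weather.csv", index=False). The cell layout should be checked against the live page first.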


In [None]:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup
import pandas as pd
import io

driver = webdriver.Chrome()
url = "https://www.timeanddate.com/weather/"
try:
    driver.get(url)

    wait = WebDriverWait(driver, 15)
    wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "table")))
    page_source = driver.page_source
finally:
    # Always close the browser, even if loading or waiting fails
    driver.quit()

soup = BeautifulSoup(page_source, "html.parser")

table = soup.find("table")
if not table:
    raise RuntimeError("Could not find any <table> element. The structure may have changed.")

# read_html returns one DataFrame per <table>; take the first
df = pd.read_html(io.StringIO(str(table)))[0]

print(df.head())


  Local Time and Weather Around the World Sort By: CityCountryTimeTemperatureCities Shown: Capitals (215)Most Popular (143)Popular (357)Somewhat Popular (471)  \
0                                              Accra                                                                                                             
1                                        Addis Ababa                                                                                                             
2                                           Adelaide                                                                                                             
3                                            Algiers                                                                                                             
4                                             Almaty                                                                                                             

  Local Time and Weather Ar