# 🧪 Web Scraping Hands-On: Books to Scrape
In this notebook, you'll learn how to scrape data from a simple, static website using `requests` and `BeautifulSoup`.

## ✅ Step 1: Fetch the HTML content

In [1]:
import requests
from bs4 import BeautifulSoup

url = "http://books.toscrape.com/"
headers = {"User-Agent": "Mozilla/5.0"}
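response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on HTTP errors (4xx/5xx)

# The page is UTF-8, but without a declared charset requests falls back to
# ISO-8859-1, which turns "£" into "Â£". Set the encoding explicitly.
response.encoding = "utf-8"
html = response.text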

# Create a BeautifulSoup object
soup = BeautifulSoup(html, 'html.parser')

## ✅ Step 2: Extract book titles

In [2]:
# Each book link keeps the full title in its `title` attribute
# (the visible link text is truncated for long titles)
titles = [tag['title'] for tag in soup.select('article.product_pod h3 a')]
print("Sample Book Titles:")
for title in titles[:10]:
    print("-", title)

Sample Book Titles:
- A Light in the Attic
- Tipping the Velvet
- Soumission
- Sharp Objects
- Sapiens: A Brief History of Humankind
- The Requiem Red
- The Dirty Little Secrets of Getting Your Dream Job
- The Coming Woman: A Novel Based on the Life of the Infamous Feminist, Victoria Woodhull
- The Boys in the Boat: Nine Americans and Their Epic Quest for Gold at the 1936 Berlin Olympics
- The Black Maria


## ✅ Step 3: Extract book prices

In [3]:
prices = [price.text for price in soup.select('.price_color')]
print("Sample Prices:")
print(prices[:10])

Sample Prices:
['£51.77', '£53.74', '£50.10', '£47.82', '£54.23', '£22.65', '£33.34', '£17.93', '£22.60', '£52.15']
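
If a price ever appears as `Â£51.77` instead of `£51.77`, the usual cause is UTF-8 bytes being decoded as Latin-1 (ISO-8859-1). The sketch below reproduces the effect with a plain string, so it needs no network access; the sample price is only an illustration.

```python
# The pound sign is two bytes in UTF-8: 0xC2 0xA3.
raw = "£51.77".encode("utf-8")

# Decoding those bytes as Latin-1 maps each byte to its own character,
# which is where the stray "Â" comes from.
garbled = raw.decode("latin-1")
print(garbled)   # Â£51.77

# Reversing the wrong decode recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # £51.77
```

In a scraper, it is usually cleaner to decode correctly in the first place (for example by setting `response.encoding = "utf-8"` before reading `response.text`) than to repair strings afterwards.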


## ✅ Step 4: Create a DataFrame and Save

In [4]:
import pandas as pd

books = pd.DataFrame({
    "title": titles,
    "price": prices
})

books.to_csv("books_scraped.csv", index=False)
books.head()

| | title | price |
|---|---|---|
| 0 | A Light in the Attic | £51.77 |
| 1 | Tipping the Velvet | £53.74 |
| 2 | Soumission | £50.10 |
| 3 | Sharp Objects | £47.82 |
| 4 | Sapiens: A Brief History of Humankind | £54.23 |
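
The `price` column holds strings, so it cannot be summed or sorted numerically as it stands. Below is a minimal sketch of one way to pull out the numeric part. `parse_price` is a hypothetical helper, not part of the notebook, and it assumes every price contains a single plain decimal number.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Return the numeric part of a price string (e.g. '£51.77') as a Decimal."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no numeric price found in {text!r}")
    return Decimal(match.group())


# Sample values copied from the first three rows above.
sample = ["£51.77", "£53.74", "£50.10"]
values = [parse_price(p) for p in sample]
print(sum(values))  # 155.61
```

Because the pattern ignores everything except the digits, the currency symbol (or any stray characters in front of it) does not affect the result. On the DataFrame, something like `books["price"].map(parse_price)` would apply the same helper to every row.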


## 💡 Reflection Questions
- What would make this site harder to scrape?
- What should you check before scraping any website?
- What are some ethical limits of scraping?
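
As a starting point for the second question, one concrete check is a site's `robots.txt`. The sketch below uses the standard library's `urllib.robotparser`. The rules, the `my-scraper` user agent, and the `example.com` URLs are all made up for illustration, and the rules are parsed from a string so the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. For a real site you would instead call
# parser.set_url("<site>/robots.txt") followed by parser.read().
rules = """\
User-agent: *
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

allowed = parser.can_fetch("my-scraper", "https://example.com/catalogue/page-2.html")
blocked = parser.can_fetch("my-scraper", "https://example.com/private/data.html")
print(allowed)  # True
print(blocked)  # False
```

A passing `robots.txt` check is not the whole answer: a site's terms of service and a polite request rate still matter.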
