In [None]:
# CodeAlpha Internship – Task 1

## Web Scraping Project: Books to Scrape



### Objective

Scrape the *Books to Scrape* practice website and extract each book's title, price, rating, and availability into a dataset for analysis.


### Website Used

http://books.toscrape.com

This is a sandbox site built for practicing web scraping.



### Tools & Libraries Used

* Python
* Requests
* BeautifulSoup
* Pandas
* Jupyter Notebook



### Data Collected

The following information was extracted for each book:

* Title
* Price
* Rating
* Availability
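
Price and Rating are scraped as text (a currency-prefixed string and an English number word), so they usually need converting before any numeric analysis. A minimal sketch of that conversion, using only the standard library; `parse_price`, `parse_rating`, and `RATING_WORDS` are hypothetical helper names, not part of the notebook:

```python
import re

# Ratings are scraped as English words, so map them to integers.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_price(text):
    """Return the numeric part of a price string such as '£51.77' as a float."""
    # Searching for the number ignores the currency symbol and any stray
    # characters left in front of it by a wrong text decoding.
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    return float(match.group())


def parse_rating(word):
    """Map a rating word such as 'Three' to an integer from 1 to 5."""
    return RATING_WORDS[word]


print(parse_price("£51.77"))
print(parse_rating("Three"))
```

Keeping the raw strings in the scraped dataset and converting them in a separate step makes it easy to re-run the cleaning without scraping again.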



### Project Workflow

1. Requested the home page with Requests
2. Parsed the HTML with BeautifulSoup
3. Extracted the title, price, rating, and availability from each `article.product_pod` element
4. Stored the data in a Pandas DataFrame
5. Exported the dataset to a CSV file
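
This workflow requests a single URL, so it collects only the books listed on that one page. A minimal sketch of how step 1 could be extended to several listing pages, assuming the site serves them at `catalogue/page-N.html` (an assumption to confirm in the browser before relying on it; `page_urls` and `last_page` are hypothetical names):

```python
from urllib.parse import urljoin

BASE_URL = "http://books.toscrape.com/"


def page_urls(base_url, last_page):
    """Build listing-page URLs for pages 1..last_page.

    Assumes the catalogue/page-N.html pattern, which is not taken from
    the notebook itself.
    """
    return [
        urljoin(base_url, f"catalogue/page-{n}.html")
        for n in range(1, last_page + 1)
    ]


for page_url in page_urls(BASE_URL, 3):
    print(page_url)
```

Each generated URL could then go through the same request, parse, and extract steps, ideally with a short pause between requests.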



### Output

The scraped dataset was saved as `books_data.csv`.


This file can now be used for:

* Exploratory Data Analysis (EDA)
* Data Visualization
* Dashboard creation
* Machine learning practice
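
As a first step toward EDA, the sketch below computes the mean price per rating. It uses only the standard library and an in-memory sample in the same column layout as `books_data.csv` (the five rows come from this notebook's preview); `mean_price_by_rating` and `SAMPLE_CSV` are hypothetical names introduced for illustration.

```python
import csv
import io
from collections import defaultdict

# Sample rows in the same layout as books_data.csv.
SAMPLE_CSV = """Title,Price,Rating,Availability
A Light in the Attic,£51.77,Three,In stock
Tipping the Velvet,£53.74,One,In stock
Soumission,£50.10,One,In stock
Sharp Objects,£47.82,Four,In stock
Sapiens: A Brief History of Humankind,£54.23,Five,In stock
"""


def mean_price_by_rating(csv_file):
    """Return {rating word: mean price} for a file-like object of book rows."""
    prices = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Strip the currency symbol, plus the stray "Â" that appears when
        # the page text was decoded with the wrong encoding.
        prices[row["Rating"]].append(float(row["Price"].lstrip("Â£")))
    return {rating: sum(values) / len(values) for rating, values in prices.items()}


summary = mean_price_by_rating(io.StringIO(SAMPLE_CSV))
for rating, mean in summary.items():
    print(f"{rating}: {mean:.2f}")
```

To run it on the real file, pass an open file handle instead, for example one obtained from `open("books_data.csv", newline="", encoding="utf-8")`.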



In [1]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

In [2]:
url = "http://books.toscrape.com/"
# A timeout keeps the cell from hanging if the site does not respond
response = requests.get(url, timeout=10)

print(response.status_code)

200


In [3]:
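# Pass the raw bytes so BeautifulSoup detects the page's UTF-8 encoding;
# response.text can mis-decode the pound sign "£" as "Â£"
soup = BeautifulSoup(response.content, "lxml")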

print(soup.title.text)


    All products | Books to Scrape - Sandbox



In [4]:
books = soup.find_all("article", class_="product_pod")

titles = []
prices = []
ratings = []
availability = []

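for book in books:
    # Title: the full title is stored in the link's "title" attribute
    titles.append(book.h3.a["title"])

    # Price: text of the price element, currency symbol included
    prices.append(book.find("p", class_="price_color").text)

    # Rating: second CSS class of the star-rating element, e.g. "Three"
    ratings.append(book.find("p", class_="star-rating")["class"][1])

    # Availability: stock text with surrounding whitespace removed
    availability.append(book.find("p", class_="availability").text.strip())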

In [5]:
df = pd.DataFrame({
    "Title": titles,
    "Price": prices,
    "Rating": ratings,
    "Availability": availability
})

df.head()

|   | Title | Price | Rating | Availability |
|---|-------|-------|--------|--------------|
| 0 | A Light in the Attic | £51.77 | Three | In stock |
| 1 | Tipping the Velvet | £53.74 | One | In stock |
| 2 | Soumission | £50.10 | One | In stock |
| 3 | Sharp Objects | £47.82 | Four | In stock |
| 4 | Sapiens: A Brief History of Humankind | £54.23 | Five | In stock |


In [6]:
df.to_csv("books_data.csv", index=False)

print("File saved successfully!")

File saved successfully!
