Perform web scraping on a provided HTML page that contains different types of elements. The goal is to extract specific data from the page and process it into structured formats such as CSV or JSON. https://baraasalout.github.io/test.html

S123AM/Web-Scraping-Task

🕵️‍♀️ Web Scraping Task

Python BeautifulSoup Pandas


📌 Overview

A Python script that scrapes structured data from an HTML page and exports it into CSV and JSON formats.
Part of a Data Science practice project focusing on real-world web scraping.
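The two export formats can be illustrated with a minimal sketch. It assumes the scraped data has already been collected into a list of dictionaries; the sample rows, `to_csv_text`, and `to_json_text` are hypothetical names made up for this example, and it uses only the standard-library `csv` and `json` modules rather than pandas so that it runs without extra dependencies. It is not taken from `web_scrap_p.py`.

```python
import csv
import io
import json

# Hypothetical records standing in for scraped rows; real values come from the page.
rows = [
    {"Name": "Laptop", "Price": "$1000", "Stock": "In Stock"},
    {"Name": "Phone", "Price": "$500", "Stock": "Out of Stock"},
]


def to_csv_text(records):
    """Serialize a list of dicts to CSV text, using the first record's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json_text(records):
    """Serialize the same records to pretty-printed JSON text."""
    return json.dumps(records, indent=2, ensure_ascii=False)


csv_text = to_csv_text(rows)
json_text = to_json_text(rows)
```

To produce files instead of strings, the same writer can be pointed at a file opened with `open(path, "w", newline="", encoding="utf-8")`.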


📂 Outputs

  • Extract_Text_Data.CSV → Headings, paragraphs, and list items.
  • Extract_Table_Data.CSV → Product table (Name, Price, Stock).
  • Product_Information.JSON → Book cards with title, price, stock, button text.
  • Form_Details.JSON → Form fields with name, type, placeholder, and label.
  • Iframe_Links.JSON → Extracted video links.
  • Featured_Products.JSON → Featured products with hidden prices & colors.
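As an illustration of the product-table output, here is a rough sketch of turning table rows into records keyed by the header cells. The markup in `SAMPLE_HTML`, the sample values, and the names `TableParser` and `parse_table` are assumptions made for this example; the real page may be structured differently. The sketch deliberately swaps in the standard-library `html.parser` for BeautifulSoup so it runs without installing anything, so it should not be read as the script's actual implementation.

```python
from html.parser import HTMLParser

# Hypothetical markup; the real page's table may use different tags or attributes.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th><th>Stock</th></tr>
  <tr><td>Laptop</td><td>$1000</td><td>In Stock</td></tr>
  <tr><td>Phone</td><td>$500</td><td>Out of Stock</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text fragments of the current cell, or None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text between cells (indentation, newlines) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, cells)) for cells in body]


products = parse_table(SAMPLE_HTML)
```

Each resulting dictionary maps the column names (Name, Price, Stock) to one row's cell text, which is a convenient shape for writing either CSV or JSON.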

🛠 Tools

  • Python 3
  • requests, BeautifulSoup4, pandas, json

🚀 Run

pip install requests beautifulsoup4 pandas lxml
python web_scrap_p.py
