🚀 A powerful Python web scraping toolkit with CLI & export support. Scrape any website, clean & save data (CSV/JSON/XLSX), schedule jobs, and extend with AI for insights. Perfect for learners, researchers & pros building data-driven apps.

Tanviib12/Web-Scraper

Web-Scraper

Web Scraping Command-Line Tool 🕵️‍♂️📊

A simple yet powerful Python-based command-line tool for extracting data from websites and presenting it in a clean, tabular format. This project demonstrates the use of requests, BeautifulSoup, and BeautifulTable to fetch, parse, and display web data efficiently.

✨ Features

• 🔎 Fetch and parse live website content

• 📑 Extract structured information from HTML

• 📊 Display data in a neat table format

• 💾 Save scraped data with an alias for later use

• ⚡ Lightweight and easy-to-use command-line interface
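The README does not describe how the alias feature stores data, so the following is only one possible approach: a single JSON file that maps each alias to its scraped rows. The helper names `save_with_alias` and `load_by_alias` and the idea of a JSON store file are assumptions for illustration.

```python
import json
from pathlib import Path


def save_with_alias(alias, rows, store_path):
    """Store rows under an alias, keeping any aliases already in the file."""
    store = Path(store_path)
    data = json.loads(store.read_text(encoding="utf-8")) if store.exists() else {}
    data[alias] = rows
    store.write_text(json.dumps(data, indent=2), encoding="utf-8")


def load_by_alias(alias, store_path):
    """Return the rows saved under an alias, or None if nothing is stored."""
    store = Path(store_path)
    if not store.exists():
        return None
    return json.loads(store.read_text(encoding="utf-8")).get(alias)
```

Because the rows pass through JSON, tuples come back as lists when loaded.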

🛠️ Tech Stack

• Python 3.9+

• Requests – for making HTTP requests

• BeautifulSoup4 – for HTML parsing

• BeautifulTable – for tabular output

📂 Project Structure

🚀 Installation & Usage

1. Create and activate a virtual environment

    python -m venv venv
    source venv/bin/activate    # Mac/Linux
    venv\Scripts\activate       # Windows

2. Install dependencies

In your terminal, install the packages one at a time:

    pip install requests
    pip install beautifulsoup4
    pip install beautifultable
    pip install lxml
    pip install certifi
    pip install attrs
    pip install soupsieve

Or, instead of installing them one by one, install them all with a single command:

    pip install requests beautifulsoup4 beautifultable lxml certifi attrs soupsieve

3. Run the script

python web_scraping_command_line_tool.py
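If the script fails with an import error, a quick check like the one below can show which of the packages from step 2 are not importable in the active environment. This is an optional sketch, not part of the project; `REQUIRED` and `missing_packages` are hypothetical names. Note that some pip package names differ from their import names (`beautifulsoup4` is imported as `bs4`, `attrs` as `attr`).

```python
import importlib.util

# pip package name -> module name it is imported as
REQUIRED = {
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "beautifultable": "beautifultable",
    "lxml": "lxml",
    "certifi": "certifi",
    "attrs": "attr",
    "soupsieve": "soupsieve",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]
```

Running `print(missing_packages())` inside the virtual environment lists anything that still needs a `pip install`; an empty list means all seven packages were found.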

📸 Demo

📈 Future Improvements

• 🌍 Multi-website scraping support

• 📊 Export data to CSV, Excel, or JSON

• 🔧 Add custom scraping rules (XPath/CSS selectors)

• ⚡ Parallel scraping for speed

• 🌐 Option to scrape JS-rendered websites (via Selenium/Playwright)
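As an illustration of the planned export feature, CSV and JSON output could be handled with the standard library alone, as in this sketch. The function names `export_csv` and `export_json`, and the assumption that scraped data is a list of rows plus a list of column headers, are hypothetical. Excel export would need an additional library and is not shown.

```python
import csv
import json


def export_csv(rows, headers, path):
    """Write a header line followed by one CSV line per scraped row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerows(rows)


def export_json(rows, headers, path):
    """Write the rows as a JSON array of objects keyed by column header."""
    records = [dict(zip(headers, row)) for row in rows]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)
```

Opening the CSV file with `newline=""` follows the `csv` module's recommendation and avoids blank lines between rows on Windows.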

🀝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to fork this repo and submit a pull request.

πŸ“œ License

This project is licensed under the MIT License
