🐍 A powerful Python web scraping toolkit with CLI & export support. Scrape any website, clean & save data (CSV/JSON/XLSX), schedule jobs, and extend with AI for insights. Perfect for learners, researchers & pros building data-driven apps.
Web Scraping Command-Line Tool 🕵️‍♂️🌐
A simple yet powerful Python-based command-line tool for extracting data from websites and presenting it in a clean, tabular format. This project demonstrates the use of requests, BeautifulSoup, and BeautifulTable to fetch, parse, and display web data efficiently.
✨ Features
• 🌐 Fetch and parse live website content
• 🔍 Extract structured information from HTML
• 📊 Display data in a neat table format
• 💾 Save scraped data with an alias for later use
• ⚡ Lightweight and easy-to-use command-line interface
🛠️ Tech Stack
• Python 3.9+
• Requests – for making HTTP requests
• BeautifulSoup4 – for HTML parsing
• BeautifulTable – for tabular output
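A minimal sketch of how these pieces fit together: fetch a page with Requests, parse it with BeautifulSoup4, and print rows with BeautifulTable. The URL and the anchor-tag extraction below are hypothetical placeholders, not the tool's actual behavior:

```python
import requests
from bs4 import BeautifulSoup
from beautifultable import BeautifulTable

URL = "https://example.com/products"  # placeholder URL, not the tool's default

# Fetch the page and fail loudly on HTTP errors.
response = requests.get(URL, timeout=10)
response.raise_for_status()

# Parse the HTML with the lxml parser.
soup = BeautifulSoup(response.text, "lxml")

# Build a table of (text, href) pairs from every anchor tag.
table = BeautifulTable()
table.columns.header = ["Title", "Link"]
for link in soup.find_all("a"):
    table.rows.append([link.get_text(strip=True), link.get("href", "")])

print(table)
```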
📂 Project Structure
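A minimal layout, assuming the single-script setup described above (adjust to match the actual repository contents):

```
.
├── web_scraping_command_line_tool.py   # main CLI script
└── README.md
```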
🚀 Installation & Usage
1. Create and activate a virtual environment
python -m venv venv
source venv/bin/activate    # Mac/Linux
venv\Scripts\activate       # Windows
2. Install dependencies
In your terminal, install each package individually:
pip install requests
pip install beautifulsoup4
pip install beautifultable
pip install lxml
pip install certifi
pip install attrs
pip install soupsieve
Or, instead of installing them one by one, install everything with a single command:
pip install requests beautifulsoup4 beautifultable lxml certifi attrs soupsieve
3. Run the script
python web_scraping_command_line_tool.py
📸 Demo
🚀 Future Improvements
• 🌍 Multi-website scraping support
• 📤 Export data to CSV, Excel, or JSON (see the sketch after this list)
• 🧠 Add custom scraping rules (XPath/CSS selectors)
• ⚡ Parallel scraping for speed
• 🌐 Option to scrape JS-rendered websites (via Selenium/Playwright)
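As a rough sketch of the planned CSV/JSON export (the row data and output file names here are hypothetical; this is not yet part of the tool):

```python
import csv
import json

# Hypothetical scraped rows; the real tool would build these from parsed HTML.
rows = [
    {"title": "Example item", "link": "https://example.com/item"},
]

# Write the rows to CSV using only the standard library.
with open("scraped_data.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["title", "link"])
    writer.writeheader()
    writer.writerows(rows)

# Write the same rows to JSON.
with open("scraped_data.json", "w", encoding="utf-8") as f:
    json.dump(rows, f, indent=2)
```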
🤝 Contributing
Contributions, issues, and feature requests are welcome! Feel free to fork this repo and submit a pull request.
📄 License
This project is licensed under the MIT License.