This is a simple command-line web scraper built with Python, requests, and BeautifulSoup.
It's designed to demonstrate the basic workflow of:
- Fetching a web page's HTML.
- Parsing the HTML to find specific data.
- Saving that data to a structured JSON file.
This script scrapes all quotes and authors from http://quotes.toscrape.com/.
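For a concrete picture of that workflow, here is a minimal sketch of the scraping logic. It is not a copy of `scraper.py`; the CSS selectors (`div.quote`, `span.text`, `small.author`) are assumptions based on the markup of http://quotes.toscrape.com/ at the time of writing.

```python
import json

import requests
from bs4 import BeautifulSoup

URL = "http://quotes.toscrape.com/"

# Fetch the page's HTML.
response = requests.get(URL, timeout=10)
response.raise_for_status()

# Parse the HTML and pull out each quote block.
# Selectors are assumptions about the site's current markup.
soup = BeautifulSoup(response.text, "html.parser")
quotes = []
for quote in soup.select("div.quote"):
    text = quote.select_one("span.text").get_text(strip=True)
    author = quote.select_one("small.author").get_text(strip=True)
    print(author)  # progress output, mirroring the script's console output
    quotes.append({"text": text, "author": author})

# Save the structured data to a JSON file.
with open("quotes.json", "w", encoding="utf-8") as f:
    json.dump(quotes, f, ensure_ascii=False, indent=2)
```

To run the actual script you'll need: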
- Python 3
- requests (for making HTTP requests)
- beautifulsoup4 (for parsing HTML)
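These are the only third-party packages, so the project's `requirements.txt` (assuming it lists them unpinned) would contain little more than:

```text
requests
beautifulsoup4
```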
- Clone the repo: `git clone https://github.com/justknuth/python-web-scraper.git`, then `cd python-web-scraper`
- Create a virtual environment: `python -m venv venv`
- Activate the virtual environment:
  - On macOS/Linux (Bash): `source venv/bin/activate`
  - On Windows (Command Prompt or PowerShell): `.\venv\Scripts\activate`
- Install dependencies: `pip install -r requirements.txt`
- Run the scraper: `python scraper.py`
The script prints each author it finds to the console and creates a quotes.json file in the project root containing the full text and author of every scraped quote.
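The key names below are an assumption (the real `scraper.py` may use different field names), but a `quotes.json` matching the description above would look roughly like this, with one object per quote:

```json
[
  {
    "text": "“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”",
    "author": "Albert Einstein"
  },
  {
    "text": "“It is our choices, Harry, that show what we truly are, far more than our abilities.”",
    "author": "J.K. Rowling"
  }
]
```

The entries shown are illustrative examples from the site, not captured scraper output.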