This small scraper fetches product data (name, price, old price, discount, URL) from a Rozetka category page and saves the results to products.json.
Usage
- Install dependencies:
- requests
- beautifulsoup4
- lxml
- (optional) fake-useragent
Example: pip install requests beautifulsoup4 lxml fake-useragent
- Run the script: python script.py
Output
products.json: a JSON file with the following structure:

    {
      "products": [
        { "name": "...", "price": "...", "old_price": "...", "discount": "...", "url": "..." },
        ...
      ]
    }
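
To check a run, the output file can be read back with the standard library. The following is a minimal sketch, assuming the structure described above; the helper names `load_products` and `missing_keys` are hypothetical and are not part of the script.

```python
import json
from pathlib import Path

# Keys each product record is expected to carry, per the structure above.
EXPECTED_KEYS = {"name", "price", "old_price", "discount", "url"}


def load_products(path="products.json"):
    """Return the list stored under the "products" key (empty if absent)."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return data.get("products", [])


def missing_keys(product):
    """Return the expected keys that a single product record lacks."""
    return EXPECTED_KEYS - set(product)
```

Calling `load_products()` after a run returns a list of dictionaries; passing each one to `missing_keys` is a quick way to spot records where a selector did not match.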
Notes
- The scraper uses a User-Agent header. If fake_useragent is not available, a stable fallback is used.
- The script includes basic error handling:
  - Network errors return an empty product list and log an error.
  - Parsing errors return an empty product list and log an error.
  - File write errors are logged.
- To adapt selectors or classes, update the find_all calls in rozetka_scraper.
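
Two of the behaviors in the notes, the User-Agent fallback and the logged file write errors, can be sketched as follows. This is an illustration under assumptions, not the script's actual code: the function names `get_user_agent` and `save_products`, the logger name, and the fallback string are all hypothetical.

```python
import json
import logging

logger = logging.getLogger("rozetka_scraper")

# Hypothetical fallback value; the script's own fallback string may differ.
FALLBACK_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def get_user_agent():
    """Return a random User-Agent if fake_useragent works, else the fallback."""
    try:
        from fake_useragent import UserAgent  # optional dependency
        return UserAgent().random
    except Exception:
        # Covers a missing package as well as failures inside the library.
        return FALLBACK_USER_AGENT


def save_products(products, path="products.json"):
    """Write the products list to disk; log and return False on failure."""
    try:
        with open(path, "w", encoding="utf-8") as f:
            json.dump({"products": products}, f, ensure_ascii=False, indent=2)
        return True
    except OSError as exc:
        logger.error("Could not write %s: %s", path, exc)
        return False
```

The returned string would go into the request headers as `{"User-Agent": get_user_agent()}`. Returning a boolean from `save_products` is a design choice of this sketch, so that a caller can tell a failed write from a successful one without inspecting the log.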