🕷️ Amazon Product Scraper v1.0

⚙️ Overview

AmazonPlaywrightSpider is a Scrapy + Playwright web scraper that extracts product details (title, price, rating, image) from Amazon.com. It automates a Chromium browser so it can scrape dynamic product data, even from JavaScript-heavy pages. It works without any kind of proxy and can make 300+ requests to Amazon.

✨ Features

🕹 Playwright-powered Scraping – Handles JavaScript-rendered Amazon pages.
🌈 Colorful CLI – Fully color-coded output with banners and warnings.
⚠️ Ethical Notice System – Shows a warning box before starting.
📦 Auto Data Export – Saves results to product.csv and product.json.
🧭 Pagination Support – Automatically crawls through multiple result pages.
💻 Lightweight & Customizable – Works directly with scrapy crawl amazon_playwright.
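
The auto-export feature can be reproduced with Scrapy's built-in feed exports. The following is a minimal sketch, assuming the two files are written through the FEEDS setting (the spider itself may write them differently). The `export_fields` name is a hypothetical helper, and the field order is taken from the sample output in this README.

```python
# settings.py (sketch): export every scraped item to both output files.
# Assumption: the project relies on Scrapy's FEEDS setting; the real spider may differ.
export_fields = ["title", "price", "rating", "image"]  # hypothetical helper name

FEEDS = {
    "product.json": {
        "format": "json",
        "encoding": "utf8",
        "overwrite": True,   # replace the file on each crawl instead of appending
        "fields": export_fields,
    },
    "product.csv": {
        "format": "csv",
        "overwrite": True,
        "fields": export_fields,
    },
}
```

With `overwrite` enabled, each crawl starts from an empty file, which keeps product.json valid JSON across repeated runs.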

🧰 Requirements

Python 3.9+
Scrapy
Scrapy-Playwright
Playwright (Chromium browser)
Node.js (bundled with the Playwright Python package; no separate install needed)

🧠 Technology Stack

Python – Programming language
Scrapy – Web scraping framework
Playwright – Headless browser automation
Twisted Reactor – Async I/O event system for Scrapy
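
These pieces are usually wired together in settings.py. The sketch below shows the settings that scrapy-playwright documents for this purpose. The project's actual settings.py is not shown in this README, so treat the values as assumptions.

```python
# settings.py (sketch): route HTTP(S) downloads through scrapy-playwright.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright needs the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Assumed values: Chromium in headless mode, matching the stack listed above.
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
```

A request is only rendered in the browser when it carries `meta={"playwright": True}`; other requests still go through Scrapy's regular downloader.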

🧩 Project Structure

amazon_scraper/
│
├── amazon/
│   ├── spiders/
│   │   └── amazon_playwright_spider.py   # main spider (this file)
│   ├── settings.py                       # Scrapy configuration
│
├── product.json                          # output file (auto-generated)
├── product.csv                           # output file (auto-generated)
└── README.md                             # documentation

⚙️ Installation Guide

1️⃣ Clone Repository

git clone https://github.com/your-username/amazon-playwright-scraper.git
cd amazon-playwright-scraper

2️⃣ Create Virtual Environment

python -m venv venv
venv\Scripts\activate  # (Windows)
# or
source venv/bin/activate  # (Linux/Mac)

3️⃣ Install Dependencies

pip install scrapy scrapy-playwright

4️⃣ Install Playwright Browsers

playwright install
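
After installing, a quick check can confirm that the environment matches the requirements. This is a small sketch using only the standard library. The function names are hypothetical, and it does not verify that the Chromium binaries were downloaded by `playwright install`.

```python
import importlib.util
import sys

# Import names of the packages listed under Requirements.
REQUIRED_MODULES = ["scrapy", "scrapy_playwright", "playwright"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.9 or newer."""
    return tuple(version_info[:2]) >= (3, 9)


if not python_ok():
    print("Python 3.9+ is required.")
for name in missing_modules(REQUIRED_MODULES):
    print(f"Missing package: {name}")
```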

▶️ How to Run

Option 1 — From Scrapy CLI

scrapy crawl amazon_playwright

Option 2 — Run Script Directly

python amazon_playwright_spider.py

When you run it directly, it will:

Show a fancy banner
Display a warning box
Show version and author
Ask for confirmation before crawling
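
The confirmation step can be sketched as follows. This is not the spider's actual code: `is_confirmed` and `main` are hypothetical names, the warning text is a placeholder, and the crawl is started through Scrapy's `CrawlerProcess` using the spider name given in this README.

```python
def is_confirmed(answer: str) -> bool:
    """Treat only an explicit 'y' or 'yes' as consent; anything else aborts."""
    return answer.strip().lower() in {"y", "yes"}


def main() -> None:
    print("WARNING: for educational and research use only.")
    if not is_confirmed(input("Start crawling? [y/N] ")):
        print("Aborted.")
        return

    # Imported here so the prompt appears before Scrapy is loaded.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    process = CrawlerProcess(get_project_settings())
    process.crawl("amazon_playwright")  # spider name from this README
    process.start()  # blocks until the crawl finishes
```

Calling `main()` from inside the Scrapy project directory lets `get_project_settings()` pick up settings.py. Defaulting to "no" means an accidental Enter key press does not start a crawl.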

🧾 Output Example

Sample JSON Output

[
    {
        "title": "Logitech Wireless Mouse M510",
        "price": "$24.99",
        "rating": "4.7 out of 5 stars",
        "image": "https://images.amazon.com/...jpg"
    },
    {
        "title": "HP USB Keyboard 320K",
        "price": "$17.45",
        "rating": "4.5 out of 5 stars",
        "image": "https://images.amazon.com/...jpg"
    }
]
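
The price and rating fields are exported as display strings. If numeric values are needed downstream, they can be parsed after the crawl. A minimal sketch, assuming the string shapes shown in the sample above; `parse_price` and `parse_rating` are hypothetical helpers, not part of the spider.

```python
import json
import re


def parse_price(text):
    """Convert a price string such as "$24.99" to a float, or None if no number is found."""
    match = re.search(r"\d+(?:,\d{3})*(?:\.\d+)?", text or "")
    return float(match.group().replace(",", "")) if match else None


def parse_rating(text):
    """Convert "4.7 out of 5 stars" to 4.7, or None if the text has another shape."""
    match = re.match(r"\s*(\d+(?:\.\d+)?) out of 5", text or "")
    return float(match.group(1)) if match else None


# One record in the same shape as the sample output.
sample = json.loads(
    '[{"title": "Logitech Wireless Mouse M510",'
    ' "price": "$24.99", "rating": "4.7 out of 5 stars"}]'
)
cleaned = [
    {**item, "price": parse_price(item["price"]), "rating": parse_rating(item["rating"])}
    for item in sample
]
```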

📊 Sample CSV Output

When the spider finishes running, it automatically saves results in product.csv and product.json.

Here’s an example of how the CSV output looks:

title,price,rating,image
Logitech MX Master 3S Wireless Mouse,$99.99,4.8 out of 5 stars,https://m.media-amazon.com/images/I/71X9ppvP+aL._AC_SL1500_.jpg
Corsair K70 RGB TKL Mechanical Gaming Keyboard,$129.99,4.7 out of 5 stars,https://m.media-amazon.com/images/I/81uO-KnH1HL._AC_SL1500_.jpg
Razer Kraken V3 X Gaming Headset,$49.99,4.5 out of 5 stars,https://m.media-amazon.com/images/I/61QyH9PoWQL._AC_SL1500_.jpg

📁 The files are saved automatically in your project root directory after each crawl:

product.csv
product.json
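
Once a crawl has finished, the CSV file can be loaded and checked for incomplete rows with the standard library. This is a sketch under the assumption that the file has the four columns shown above; `load_products` and `missing_fields` are hypothetical helper names.

```python
import csv
from pathlib import Path


def load_products(path="product.csv"):
    """Read the exported CSV into a list of dicts keyed by column name."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def missing_fields(rows, required=("title", "price", "rating", "image")):
    """Return (row_number, field) pairs for values that are empty or absent."""
    return [
        (number, field)
        for number, row in enumerate(rows, start=1)
        for field in required
        if not (row.get(field) or "").strip()
    ]
```

For example, `missing_fields(load_products())` returns an empty list when every row has all four values, which is a quick way to spot selectors that stopped matching.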

⚠️ Important Notes

This script is for educational and research use only.
Do NOT use it for aggressive or commercial scraping.
Always respect Amazon’s Terms of Service.
Use download delays and low concurrency to prevent blocking.
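
Download delays and low concurrency are controlled through standard Scrapy settings. The values below are illustrative assumptions, not the project's actual configuration; tune them for your own use.

```python
# settings.py (sketch): conservative crawl pacing. All values are illustrative.
DOWNLOAD_DELAY = 3                    # seconds to wait between requests
RANDOMIZE_DOWNLOAD_DELAY = True       # vary the delay around DOWNLOAD_DELAY
CONCURRENT_REQUESTS = 1               # one request in flight at a time
CONCURRENT_REQUESTS_PER_DOMAIN = 1

# AutoThrottle adjusts the delay based on observed response latency.
AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 3
AUTOTHROTTLE_MAX_DELAY = 30
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
```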

🧑‍💻 Author & Credits

Developer: MS Coder

Version: v1.0
Language: Python
Framework: Scrapy + Playwright

💡 Future Plans

Add support for multiple Amazon categories
Implement rotating user-agents & proxy pool
Add progress bar for live scraping status
Build web dashboard for live scraped data

🧾 License & Credits

Made with ❤️ by MS Coder
Version 1.0 • Built for learning, with style & responsibility 🧠

🏁 Final Note

“Scrape responsibly. Automate smartly. Respect platforms.”

— MS Coder
