# Usage

## List all available spiders

```sh
poetry run scrapy list
```

## Using docker-compose

This step builds and runs the Burplist Scrapy application along with a PostgreSQL Docker container.

This is perfect for users who simply want to try out the application locally.

```sh
# To build and start scraping with all available spiders
docker-compose up -d --build

# To run all available spiders after the build
docker start burplist_scrapy
```

## Run single spider

```sh
# To run a single spider
poetry run scrapy crawl thirsty

# To run a single spider with JSON output
poetry run scrapy crawl coldstorage -o coldstorage.json
```
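
For reference, the same crawl can also be started from Python instead of the CLI. This is only a minimal sketch using Scrapy's `CrawlerProcess`; the `FEEDS` override and the `coldstorage.json` output path simply mirror the command above and are not part of the project's own code.

```python
# Minimal sketch: run the "coldstorage" spider programmatically with a JSON feed.
# Assumes this is executed from the project root so get_project_settings()
# can locate the Scrapy settings module.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

settings = get_project_settings()
settings.set("FEEDS", {"coldstorage.json": {"format": "json"}})

process = CrawlerProcess(settings)
process.crawl("coldstorage")
process.start()  # blocks until the crawl finishes
```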

## Run all spiders

```sh
poetry run scrapy list | xargs -n 1 poetry run scrapy crawl
```

## Run all spiders, in parallel

```sh
poetry shell
scrapy list | xargs -n 1 -P 0 scrapy crawl
```
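
Alternatively, all spiders can be scheduled inside a single process so they share one Twisted reactor. This is a hedged sketch rather than part of the repository; it assumes a hypothetical `run_all.py` placed at the project root.

```python
# run_all.py (hypothetical): schedule every registered spider in one process.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

process = CrawlerProcess(get_project_settings())

# spider_loader.list() returns the same names as `scrapy list`.
for name in process.spider_loader.list():
    process.crawl(name)

process.start()  # runs all scheduled crawls concurrently; blocks until done
```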

## Optional: Integrations

ScraperAPI is used as our proxy server provider. Sentry is used for error monitoring. ScrapeOps is used for job monitoring. To enable them, export the corresponding keys as environment variables:

```sh
export SENTRY_DSN="<YOUR_SENTRY_DSN>"
export SCRAPER_API_KEY="<YOUR_SCRAPER_API_KEY>"
export SCRAPEOPS_API_KEY="<YOUR_SCRAPEOPS_API_KEY>"
```
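
How these variables are consumed is project-specific; a typical Scrapy pattern is to read them in `settings.py` with `os.getenv`, roughly as sketched below. The variable names match the exports above; everything else is illustrative.

```python
# settings.py (illustrative excerpt): pick up the integration keys from the
# environment; each integration stays unconfigured when its key is not set.
import os

SENTRY_DSN = os.getenv("SENTRY_DSN")
SCRAPER_API_KEY = os.getenv("SCRAPER_API_KEY")
SCRAPEOPS_API_KEY = os.getenv("SCRAPEOPS_API_KEY")
```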