An example of how to set up a Docker-based web crawling infrastructure using Scrapy and MongoDB.
This requires:
- A dockerhost
- The docker command in order to talk to your dockerhost
- The docker-compose command
- This repository ;-)
Build the images with:
$> docker-compose build
Edit the files in ./config/*.env to suit your needs.
Start the whole system with:
$> docker-compose up
and check the results in your MongoDB container at <dockerhost>:27017.
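Before looking for results, it can help to confirm that the MongoDB port is actually reachable from your machine. The following is a minimal sketch using only the Python standard library; the function name `wait_for_port` and its defaults are illustrative assumptions and not part of this repository. It only checks that a TCP connection succeeds, not that MongoDB is healthy or contains data.

```python
import socket
import time


def wait_for_port(host: str, port: int = 27017,
                  timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds.

    Retries every `interval` seconds and returns False if no connection
    could be made before `timeout` seconds have passed.
    (Hypothetical helper, not shipped with this repository.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, unreachable, or timed out: retry until deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("<dockerhost>", 27017)` with `<dockerhost>` replaced by the address of your dockerhost returns `True` as soon as the port accepts connections.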
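Scale the number of crawlers with:
$> docker-compose scale crawler=<num>
Stop the crawlers with:
$> docker-compose stop crawler
or
$> docker-compose scale crawler=0
Stop the crawlerstore service with:
$> docker-compose stop crawlerstore
Stop the whole system with:
$> docker-compose stop
Start it again with:
$> docker-compose start
Remove the stopped containers with:
$> docker-compose rm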
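If you want to drive the scaling from a script instead of typing the command by hand, the scale command above can be wrapped as follows. This is a rough sketch: the helper names `scale_command` and `scale` are assumptions made for illustration, and the code simply shells out to the same `docker-compose scale` invocation shown above, so it must be run from the directory containing this repository's compose file.

```python
import subprocess
from typing import List


def scale_command(service: str, count: int) -> List[str]:
    """Build the argument list for `docker-compose scale <service>=<count>`.

    (Hypothetical helper, not shipped with this repository.)
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return ["docker-compose", "scale", f"{service}={count}"]


def scale(service: str, count: int) -> None:
    """Run the scale command and raise CalledProcessError if it fails."""
    subprocess.run(scale_command(service, count), check=True)
```

Calling `scale("crawler", 0)` corresponds to the second way of stopping the crawlers shown above. Note that newer docker-compose releases may prefer `docker-compose up --scale crawler=<num>` over the standalone `scale` command.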