Scrapyd can manage multiple Scrapy projects. Each project can have multiple versions. The latest version is used by default for starting spiders.
The latest version is the alphabetically greatest, unless all version names are version specifiers like ``1.0`` or ``1.0rc1``, in which case they are sorted as version numbers.
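The sorting rule above can be sketched in a few lines. Note that ``sort_versions`` and its regular expression are illustrative assumptions, not Scrapyd's actual implementation (which relies on a version-parsing library):

```python
import re

# Matches version specifiers like "1.0" or "1.0rc1":
# numeric release parts, optionally followed by a pre-release tag.
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)([a-z]+)?(\d+)?$")

def sort_versions(names):
    """If every name is a version specifier, sort numerically
    (pre-releases before the final release); else alphabetically."""
    def key(name):
        m = _VERSION.match(name)
        release = tuple(int(part) for part in m.group(1).split("."))
        # A tag like "rc1" sorts before the final release of the same number.
        pre = (0, m.group(2), int(m.group(3) or 0)) if m.group(2) else (1,)
        return release, pre

    if all(_VERSION.match(n) for n in names):
        return sorted(names, key=key)
    return sorted(names)
```

With this key, ``1.0rc1`` sorts before ``1.0``, and ``1.10`` sorts after ``1.9``, which plain alphabetical order would get wrong.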
Scrapyd is a server (typically run as a daemon) that listens for :doc:`api` and :ref:`webui` requests.
The API is mainly used to upload projects and to schedule crawls. To start a crawl, Scrapyd spawns a process that essentially runs::

    scrapy crawl myspider
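A minimal sketch of that spawn step is below. ``build_crawl_command`` is a hypothetical helper, not Scrapyd's internals; the real launcher passes additional arguments (job identifiers, the selected egg version, and so on). The ``-s`` flag is Scrapy's standard way to override a setting for a single run:

```python
import sys

def build_crawl_command(spider, settings=None):
    """Build the argv a launcher could hand to subprocess.Popen
    to run ``scrapy crawl <spider>`` (illustrative sketch only)."""
    cmd = [sys.executable, "-m", "scrapy", "crawl", spider]
    # Each -s NAME=VALUE overrides one Scrapy setting for this run.
    for name, value in (settings or {}).items():
        cmd += ["-s", f"{name}={value}"]
    return cmd
```

In practice the command would be launched with something like ``subprocess.Popen(build_crawl_command("myspider"))``, one process per scheduled job.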
Scrapyd runs multiple processes in parallel, and manages the number of concurrent processes. See :ref:`config-launcher` for details.
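For example, the process cap can be tuned in ``scrapyd.conf``. The values shown are the documented defaults; see :ref:`config-launcher` for the authoritative option list:

```ini
[scrapyd]
# Hard cap on concurrent Scrapy processes (0 = derive from CPU count).
max_proc = 0
# When max_proc is 0, run at most this many processes per CPU.
max_proc_per_cpu = 4
```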
If you are familiar with the Twisted Application Framework, you can essentially reconfigure every part of Scrapyd. See :doc:`config` for details.
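Reconfiguration happens through the same file: the components Scrapyd wires together are named in ``scrapyd.conf`` and can be replaced with your own classes. The entries below are the documented defaults (consult :doc:`config` before relying on them):

```ini
[scrapyd]
# Point these at your own implementations to swap out Scrapyd components.
application = scrapyd.app.application
launcher    = scrapyd.launcher.Launcher
```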
Scrapyd has a minimal web interface for monitoring running processes and accessing log files and item feeds. By default, it is available at http://localhost:6800/. Other options to manage Scrapyd include: