Overview

Projects and versions

Scrapyd can manage multiple Scrapy projects. Each project can have multiple versions. The latest version is used by default for starting spiders.

Version order

The latest version is the alphabetically greatest, unless all version names are version specifiers like ``1.0`` or ``1.0rc1``, in which case they are sorted as version numbers rather than as plain strings.
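The ordering rule can be illustrated with a small sketch. This is not Scrapyd's implementation: the pattern below is a deliberately simplified assumption that only recognises dotted integers with an optional ``a``, ``b`` or ``rc`` tag, whereas a full version parser (for example ``packaging.version.Version``) accepts many more forms. The function names are hypothetical.

```python
import re

# Simplified assumption: dotted integers plus an optional a/b/rc pre-release tag.
_VERSION_RE = re.compile(r"(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?")
_PRE_RANK = {"a": 0, "b": 1, "rc": 2}


def version_key(name):
    """Return a sortable key for a version-like name, or None if it is not one."""
    match = _VERSION_RE.fullmatch(name)
    if match is None:
        return None
    release = tuple(int(part) for part in match.group(1).split("."))
    if match.group(2) is None:
        # A final release sorts after any pre-release of the same number.
        stage = (1,)
    else:
        stage = (0, _PRE_RANK[match.group(2)], int(match.group(3)))
    return (release, stage)


def latest_version(names):
    """Use version order if every name parses, otherwise alphabetical order."""
    if all(version_key(name) is not None for name in names):
        return max(names, key=version_key)
    return max(names)
```

With this sketch, ``latest_version(["1.9", "1.10"])`` picks ``1.10``, while a plain alphabetical comparison would pick ``1.9``. As soon as one name is not version-like, as in ``["1.9", "1.10", "stable"]``, the comparison falls back to alphabetical order.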

How Scrapyd works

Scrapyd is a server (typically run as a daemon) that listens for :doc:`api` and :ref:`webui` requests.

The API is especially used to upload projects and schedule crawls. To start a crawl, Scrapyd spawns a process that essentially runs::

    scrapy crawl myspider
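As a rough illustration of scheduling a crawl through the API, the sketch below builds a request for the ``schedule.json`` endpoint using only the standard library. The project name ``myproject`` and the helper ``build_schedule_request`` are hypothetical, and the base URL assumes the default address mentioned under the web interface. The request is built but not sent, so the snippet runs without a Scrapyd instance.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_schedule_request(project, spider, base_url="http://localhost:6800"):
    """Build (but do not send) a POST request to the schedule.json endpoint."""
    body = urlencode({"project": project, "spider": spider}).encode("ascii")
    return Request(f"{base_url}/schedule.json", data=body, method="POST")


# "myproject" is a placeholder; "myspider" matches the command shown above.
request = build_schedule_request("myproject", "myspider")

# To send it to a running Scrapyd instance:
#     from urllib.request import urlopen
#     with urlopen(request) as response:
#         print(response.read().decode())
```

See :doc:`api` for the parameters each endpoint accepts.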

Scrapyd runs multiple processes in parallel, and manages the number of concurrent processes. See :ref:`config-launcher` for details.
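The sketch below shows one way the process limit could be derived. It assumes the launcher settings ``max_proc`` and ``max_proc_per_cpu`` with defaults of ``0`` and ``4``, where an unset or zero ``max_proc`` means "CPU count multiplied by ``max_proc_per_cpu``". The function ``process_limit`` is hypothetical and is not Scrapyd code; check :ref:`config-launcher` for the authoritative behaviour.

```python
import os


def process_limit(max_proc=0, max_proc_per_cpu=4, cpu_count=None):
    """Return the assumed maximum number of concurrent crawl processes."""
    if max_proc:
        # An explicit limit takes precedence over the per-CPU calculation.
        return max_proc
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return cpus * max_proc_per_cpu
```

Under these assumptions, a machine with 2 CPUs and default settings would run at most 8 crawl processes at once, while setting ``max_proc`` to 3 would cap it at 3 regardless of CPU count.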

If you are familiar with the Twisted Application Framework, you can essentially reconfigure every part of Scrapyd. See :doc:`config` for details.

Web interface

Scrapyd has a minimal web interface for monitoring running processes and accessing log files and item feeds. By default, it is available at http://localhost:6800/.

Other options to manage Scrapyd include:

- ScrapydWeb
- spider-admin-pro
