Overview

Projects and versions

Scrapyd can manage multiple Scrapy projects. Each project can have multiple versions. The latest version is used by default for starting spiders.

Version order

The latest version is the alphabetically greatest, unless all version names are valid version numbers like 1.0 or 1.0rc1, in which case they are sorted as versions.
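The ordering rule can be illustrated with a minimal sketch. This is not Scrapyd's actual implementation: the helper names ``version_key`` and ``latest_version`` are hypothetical, and the pattern only recognises dotted integers with an optional ``a``, ``b`` or ``rc`` suffix, which is a deliberate simplification of real version parsing.

```python
import re

# Simplified pattern (an assumption for this sketch): dotted integers with an
# optional a/b/rc pre-release tag, e.g. "1.0" or "1.0rc1".
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:(a|b|rc)(\d+))?$")

# Pre-releases sort before the final release of the same number.
_PRE_RANK = {"a": 0, "b": 1, "rc": 2, None: 3}


def version_key(name):
    """Return a sortable key for a version-like name, or None if it is not one."""
    match = _VERSION_RE.match(name)
    if match is None:
        return None
    release, tag, number = match.groups()
    return (
        tuple(int(part) for part in release.split(".")),
        _PRE_RANK[tag],
        int(number or 0),
    )


def latest_version(names):
    """Use version order if every name parses, otherwise alphabetical order."""
    keys = [version_key(name) for name in names]
    if all(key is not None for key in keys):
        return max(names, key=version_key)
    return max(names)
```

With this sketch, ``["1.9", "1.10"]`` is ordered numerically because both names parse as versions, whereas a single non-version name such as ``dev`` in the list makes the whole comparison fall back to plain string order.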

How Scrapyd works

Scrapyd is a server (typically run as a daemon) that listens for :doc:`api` and :ref:`webui` requests.

The API is mainly used to upload projects and schedule crawls. To start a crawl, Scrapyd spawns a process that essentially runs:

scrapy crawl myspider
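From the client side, scheduling such a crawl could look like the following sketch, which builds a POST request for the ``schedule.json`` endpoint using only the standard library. The project name ``myproject``, the helper name ``build_schedule_request`` and the base URL are assumptions for illustration; see :doc:`api` for the authoritative parameters.

```python
import urllib.parse
import urllib.request


def build_schedule_request(base_url, project, spider):
    """Build (but do not send) a POST request that asks Scrapyd to run a spider."""
    # Form-encode the two parameters that identify what to crawl.
    data = urllib.parse.urlencode({"project": project, "spider": spider}).encode()
    # Supplying a body makes urllib issue a POST instead of a GET.
    return urllib.request.Request(base_url.rstrip("/") + "/schedule.json", data=data)


# Hypothetical project and spider names, assuming Scrapyd runs locally.
request = build_schedule_request("http://localhost:6800/", "myproject", "myspider")
```

Passing ``request`` to ``urllib.request.urlopen`` would send it, which requires a running Scrapyd instance with the project already uploaded.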

Scrapyd runs multiple processes in parallel, and manages the number of concurrent processes. See :ref:`config-launcher` for details.
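As a rough illustration of how such a limit can be derived, the sketch below assumes two launcher settings, ``max_proc`` (an absolute cap, where 0 means unset) and ``max_proc_per_cpu`` (a per-CPU multiplier). The function ``process_limit`` is hypothetical and the defaults shown are assumptions; :ref:`config-launcher` remains the reference for the real behavior.

```python
import os


def process_limit(max_proc=0, max_proc_per_cpu=4, cpus=None):
    """Return how many crawl processes may run at once (illustrative only)."""
    # An explicit, non-zero cap takes precedence over the per-CPU rule.
    if max_proc:
        return max_proc
    # Otherwise scale with the number of CPUs; fall back to 1 if it is unknown.
    if cpus is None:
        cpus = os.cpu_count() or 1
    return cpus * max_proc_per_cpu
```

Under these assumptions, a machine with 2 CPUs and no explicit cap would allow 8 concurrent processes, while setting ``max_proc=2`` would fix the limit at 2 regardless of the CPU count.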

If you are familiar with the Twisted Application Framework, you can essentially reconfigure every part of Scrapyd. See :doc:`config` for details.

Web interface

Scrapyd has a minimal web interface for monitoring running processes and accessing log files and item feeds. By default, it is available at http://localhost:6800/. Other options to manage Scrapyd include: