A Python wrapper for working with Scrapyd's API.

Current released version: 2.1.2 (see history).

Allows a Python application to talk to, and therefore control, the Scrapy daemon: Scrapyd.


Installation

The easiest way to install is via pip:

pip install python-scrapyd-api

Quick Usage

Please refer to the full documentation for more detailed usage, but to get you started:

>>> from scrapyd_api import ScrapydAPI
>>> scrapyd = ScrapydAPI('http://localhost:6800')

Add a project egg as a new version:

>>> with open('some_egg.egg', 'rb') as egg:
...     scrapyd.add_version('project_name', 'version_name', egg)
# Returns the number of spiders in the project.
# The with block closes the egg file even if the upload raises.

Cancel a scheduled job:

>>> scrapyd.cancel('project_name', '14a6599ef67111e38a0e080027880ca6')
# Returns the "previous state" of the job before it was cancelled: 'running' or 'pending'.

Delete a project and all sibling versions:

>>> scrapyd.delete_project('project_name')
# Returns True if the request was met with an OK response.

Delete a version of a project:

>>> scrapyd.delete_version('project_name', 'version_name')
# Returns True if the request was met with an OK response.

Request status of a job:

>>> scrapyd.job_status('project_name', '14a6599ef67111e38a0e080027880ca6')
# Returns 'running', 'pending', 'finished' or '' for unknown state.
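
Because job_status reports a single state per call, waiting for a job to complete means polling. The following is a minimal sketch of one way to do that; wait_for_job, poll_interval and timeout are hypothetical names for illustration and are not part of python-scrapyd-api. It assumes only that the client exposes job_status(project, job_id) with the return values listed above.

```python
import time


def wait_for_job(scrapyd, project, job_id, poll_interval=5.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll job_status until the job finishes, is unknown, or times out.

    `scrapyd` is any object with a job_status(project, job_id) method,
    for example a ScrapydAPI instance. Returns the last state seen.
    This helper is an illustrative sketch, not part of the library.
    """
    deadline = clock() + timeout
    while True:
        state = scrapyd.job_status(project, job_id)
        # '' means Scrapyd does not know the job, so stop instead of looping.
        if state in ('finished', ''):
            return state
        if clock() >= deadline:
            raise TimeoutError(
                'job %s still %r after %s seconds' % (job_id, state, timeout))
        sleep(poll_interval)
```

With the client created earlier, a call might look like wait_for_job(scrapyd, 'project_name', '14a6599ef67111e38a0e080027880ca6'). The sleep and clock parameters exist only so the loop can be exercised without real delays.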

List all jobs registered:

>>> scrapyd.list_jobs('project_name')
# Returns a dict of running, finished and pending job lists.
{
    'pending': [
        {
            u'id': u'24c35...f12ae',
            u'spider': u'spider_name'
        }
    ],
    'running': [
        {
            u'id': u'14a65...b27ce',
            u'spider': u'spider_name',
            u'start_time': u'2014-06-17 22:45:31.975358'
        }
    ],
    'finished': [
        {
            u'id': u'34c23...b21ba',
            u'spider': u'spider_name',
            u'start_time': u'2014-06-17 22:45:31.975358',
            u'end_time': u'2014-06-23 14:01:18.209680'
        }
    ]
}
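
The dict returned by list_jobs is plain data, so it can be summarised without further API calls. The sketch below assumes the 'pending', 'running' and 'finished' keys and the per-job 'id' field shown above; count_jobs and find_job_state are hypothetical helper names, not library functions.

```python
JOB_STATES = ('pending', 'running', 'finished')


def count_jobs(jobs):
    """Return the number of jobs per state from a list_jobs result."""
    return {state: len(jobs.get(state, [])) for state in JOB_STATES}


def find_job_state(jobs, job_id):
    """Return the state whose list contains job_id, or '' if it is absent."""
    for state in JOB_STATES:
        if any(job.get('id') == job_id for job in jobs.get(state, [])):
            return state
    return ''
```

For example, count_jobs(scrapyd.list_jobs('project_name')) gives a quick view of how busy a project is.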

List all projects registered:

>>> scrapyd.list_projects()
[u'ecom_project', u'estate_agent_project', u'car_project']

List all spiders available to a given project:

>>> scrapyd.list_spiders('project_name')
[u'raw_spider', u'js_enhanced_spider', u'selenium_spider']
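
list_projects and list_spiders combine naturally into an inventory of everything a Scrapyd instance can run. This is a small sketch under the assumption that both methods return lists of names as shown above; spiders_by_project is a hypothetical helper name.

```python
def spiders_by_project(scrapyd):
    """Map each registered project name to its list of spider names.

    Makes one list_projects call, then one list_spiders call per project.
    Illustrative helper only; it is not part of python-scrapyd-api.
    """
    return {project: scrapyd.list_spiders(project)
            for project in scrapyd.list_projects()}
```

Note that this issues one request per project, so on an instance with many projects it may be worth caching the result.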

List all versions registered to a given project:

>>> scrapyd.list_versions('project_name')
[u'345', u'346', u'347', u'348']
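
list_versions and delete_version can be paired to tidy up old eggs. The sketch below keeps the most recent versions and deletes the rest. It assumes the list is ordered oldest first, with the last entry being the version currently in use, which is how the Scrapyd documentation describes its version listing; check this against your own instance before relying on it. prune_versions and keep are hypothetical names.

```python
def prune_versions(scrapyd, project, keep=2):
    """Delete all but the last `keep` versions of a project.

    Assumes list_versions returns versions oldest first (an assumption
    about ordering, not something this helper verifies). Returns the
    versions for which delete_version reported success.
    """
    if keep < 1:
        raise ValueError('keep must be at least 1')
    versions = scrapyd.list_versions(project)
    stale = versions[:-keep]  # everything except the newest `keep` entries
    deleted = []
    for version in stale:
        if scrapyd.delete_version(project, version):
            deleted.append(version)
    return deleted
```

With the listing shown above and keep=2, this would request deletion of '345' and '346' and leave '347' and '348' in place.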

Schedule a job to run with a specific spider:

>>> scrapyd.schedule('project_name', 'spider_name')
# Returns the Scrapyd job id.

Schedule a job to run while passing override settings:

>>> settings = {'DOWNLOAD_DELAY': 2}
>>> scrapyd.schedule('project_name', 'spider_name', settings=settings)

Schedule a job to run while passing extra attributes to spider initialisation:

>>> scrapyd.schedule('project_name', 'spider_name', extra_attribute='value')
# NB: 'project', 'spider' and 'settings' are reserved kwargs for this
# method and therefore these names should be avoided when trying to pass
# extra attributes to the spider init.
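
When spider attributes come from configuration or user input, it can help to check them against the reserved names before calling schedule. The following is a minimal sketch of such a guard; schedule_with_attributes and RESERVED_KWARGS are hypothetical names introduced here, and the only library call assumed is schedule(project, spider, settings=..., **kwargs) as used above.

```python
# Names the schedule method keeps for itself, per the note above.
RESERVED_KWARGS = frozenset(['project', 'spider', 'settings'])


def schedule_with_attributes(scrapyd, project, spider, attributes,
                             settings=None):
    """Schedule a job, rejecting attribute names that clash with schedule().

    `attributes` is a dict of extra attributes for the spider init.
    Illustrative wrapper only; it is not part of python-scrapyd-api.
    """
    clashes = sorted(RESERVED_KWARGS.intersection(attributes))
    if clashes:
        raise ValueError('reserved names cannot be spider attributes: %s'
                         % ', '.join(clashes))
    return scrapyd.schedule(project, spider, settings=settings, **attributes)
```

Raising early gives a clearer error than letting a clashing key silently override the settings argument or trigger a TypeError inside the call.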

Setting up the project to contribute code

Please see CONTRIBUTING.md. This will guide you through our pull request guidelines, project setup and testing requirements.


License

2-clause BSD. See the full LICENSE.