Added new Scrapy service with support for:
* multiple projects
* uploading Scrapy projects as Python eggs
* scheduling spiders using a JSON API (see the sketch below)

Documentation is added along with the code.

Closes scrapy#218.
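
As an illustrative sketch (an editor's addition, not part of this commit): with the service listening on its default port, a spider run could be scheduled through the JSON API roughly like this. The schedule.json endpoint name follows later Scrapyd documentation, and the project/spider names are hypothetical.

    # Schedule a run of spider "myspider" in project "myproject".
    # Endpoint and parameter names are assumed from later Scrapyd docs.
    curl http://localhost:6800/schedule.json \
        -d project=myproject \
        -d spider=myspider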

--HG--
rename : debian/scrapy-service.default => debian/scrapyd.default
rename : debian/scrapy-service.dirs => debian/scrapyd.dirs
rename : debian/scrapy-service.install => debian/scrapyd.install
rename : debian/scrapy-service.lintian-overrides => debian/scrapyd.lintian-overrides
rename : debian/scrapy-service.postinst => debian/scrapyd.postinst
rename : debian/scrapy-service.postrm => debian/scrapyd.postrm
rename : debian/scrapy-service.upstart => debian/scrapyd.upstart
rename : extras/scrapy.tac => extras/scrapyd.tac
pablohoffman committed Sep 3, 2010
1 parent 1b76687 commit 37e9c5d
Showing 42 changed files with 1,023 additions and 193 deletions.
2 changes: 0 additions & 2 deletions MANIFEST.in
@@ -3,8 +3,6 @@ include AUTHORS
 include INSTALL
 include LICENSE
 include MANIFEST.in
-include scrapy/core/downloader/responsetypes/mime.types
-include scrapy/xlib/pydispatch/license.txt
 recursive-include scrapy/templates *
 recursive-include scrapy/tests/sample_data *
 recursive-include docs *
10 changes: 6 additions & 4 deletions debian/control
@@ -15,9 +15,11 @@ Description: Python web crawling and scraping framework
  It can be used for a wide range of purposes, from data mining to
  monitoring and automated testing.
 
-Package: scrapy-service
+Package: scrapyd
 Architecture: all
-Depends: scrapy
+Depends: scrapy, python-setuptools
 Description: Scrapy Service
- This package provides support for running Scrapy as a system service,
- controlled through an upstart script.
+ The Scrapy service allows you to deploy your Scrapy projects by building
+ Python eggs of them and uploading them to the service using a JSON API,
+ which you can also use to schedule spider runs. It also supports
+ multiple projects.
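
To illustrate the deployment flow this description outlines (an editor's sketch; the addversion.json endpoint and the file names are assumptions based on later Scrapyd documentation, not on this diff):

    # Build a Python egg of the project -- this is why the package now
    # depends on python-setuptools.
    python setup.py bdist_egg

    # Upload the egg to the service (hypothetical project/version names).
    curl http://localhost:6800/addversion.json \
        -F project=myproject -F version=r1 \
        -F egg=@dist/myproject-r1.egg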
4 changes: 0 additions & 4 deletions debian/scrapy-service.default

This file was deleted.

2 changes: 0 additions & 2 deletions debian/scrapy-service.dirs

This file was deleted.

2 changes: 0 additions & 2 deletions debian/scrapy-service.install

This file was deleted.

21 changes: 0 additions & 21 deletions debian/scrapy-service.prerm

This file was deleted.

12 changes: 0 additions & 12 deletions debian/scrapy-service.upstart

This file was deleted.

8 changes: 2 additions & 6 deletions debian/scrapy.1
@@ -1,12 +1,12 @@
 .TH SCRAPY 1 "October 17, 2009"
 .SH NAME
-scrapy \- Python Scrapy control script
+scrapy \- the Scrapy command-line tool
 .SH SYNOPSIS
 .B scrapy
 [\fIcommand\fR] [\fIOPTIONS\fR] ...
 .SH DESCRIPTION
 .PP
-Scrapy is controlled through the \fBscrapy\fR control script. The script provides several commands, for different purposes. Each command supports its own particular syntax. In other words, each command supports a different set of arguments and options.
+Scrapy is controlled through the \fBscrapy\fR command-line tool. The tool provides several commands for different purposes, each with its own syntax and its own set of arguments and options.
 .SH OPTIONS
 .SS fetch\fR [\fIOPTION\fR] \fIURL\fR
 .TP
@@ -50,8 +50,6 @@ Create new project with an initial project template
 
 .SS --help, -h
 Print command help and options
-.SS --version
-Print Scrapy version and exit
 .SS --logfile=FILE
 Log file. If omitted, stderr will be used
 .SS --loglevel=LEVEL, -L LEVEL
@@ -68,8 +66,6 @@ Write lsprof profiling stats to FILE
 Write process ID to FILE
 .SS --set=SET
 Set/override setting (may be repeated)
-.SS --settings=MODULE
-Python path to the Scrapy project settings
 
 .SH AUTHOR
 Scrapy was written by the Scrapy Developers
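
For illustration (an editor's addition; the crawl command and spider name are assumptions, not shown in this diff), the global options above combine like so:

    # Crawl a spider with an explicit log file and level, overriding a setting.
    scrapy crawl example.com \
        --logfile=/tmp/scrapy.log \
        --loglevel=DEBUG \
        --set=DOWNLOAD_DELAY=2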
3 changes: 2 additions & 1 deletion debian/scrapy.install
@@ -1,2 +1,3 @@
-debian/tmp/usr
+usr/lib/python*/*-packages/scrapy
+usr/bin
 extras/scrapy_bash_completion etc/bash_completion.d/
7 changes: 7 additions & 0 deletions debian/scrapyd.cfg
@@ -0,0 +1,7 @@
+[scrapyd]
+http_port = 6800
+debug = off
+#max_proc = 1
+eggs_dir = /var/lib/scrapyd/eggs
+dbs_dir = /var/lib/scrapyd/dbs
+logs_dir = /var/log/scrapyd
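
A quick sanity check (editor's note, not part of this commit): once the service is running, it should answer HTTP requests on the configured http_port.

    # Expect an HTTP response from the Scrapy service on its default port.
    curl -s http://localhost:6800/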
1 change: 1 addition & 0 deletions debian/scrapyd.default
@@ -0,0 +1 @@
+# Defaults for Scrapy service
4 changes: 4 additions & 0 deletions debian/scrapyd.dirs
@@ -0,0 +1,4 @@
+var/lib/scrapyd
+var/lib/scrapyd/eggs
+var/lib/scrapyd/dbs
+var/log/scrapyd
3 changes: 3 additions & 0 deletions debian/scrapyd.install
@@ -0,0 +1,3 @@
+usr/lib/python*/*-packages/scrapyd
+debian/scrapyd.cfg etc
+extras/scrapyd.tac usr/share/scrapyd
2 changes: 1 addition & 1 deletion debian/scrapy-service.lintian-overrides → debian/scrapyd.lintian-overrides
@@ -1,2 +1,2 @@
 new-package-should-close-itp-bug
-script-in-etc-init.d-not-registered-via-update-rc.d /etc/init.d/scrapy-service
+script-in-etc-init.d-not-registered-via-update-rc.d /etc/init.d/scrapyd
13 changes: 8 additions & 5 deletions debian/scrapy-service.postinst → debian/scrapyd.postinst
@@ -6,16 +6,19 @@ case "$1" in
     configure)
         # Create user to run the service as
        if [ -z "`id -u scrapy 2> /dev/null`" ]; then
-            adduser --system --home /var/lib/scrapy --gecos "scrapy" \
+            adduser --system --home /var/lib/scrapyd --gecos "scrapy" \
                 --no-create-home --disabled-password \
                 --quiet scrapy || true
         fi
-        if [ ! -d /var/run/scrapy ]; then
-            mkdir /var/run/scrapy
-            chown scrapy:nogroup /var/run/scrapy
+        if [ ! -d /var/run/scrapyd ]; then
+            mkdir /var/run/scrapyd
+            chown scrapy:nogroup /var/run/scrapyd
         fi
 
-        chown scrapy:nogroup /var/log/scrapy /var/lib/scrapy /var/run/scrapy
+        chown scrapy:nogroup /var/log/scrapyd /var/run/scrapyd \
+            /var/lib/scrapyd /var/lib/scrapyd/eggs /var/lib/scrapyd/dbs
+
+        update-python-modules -p # so upstart restart uses the new code
         ;;
 
     abort-upgrade|abort-remove|abort-deconfigure)
2 changes: 1 addition & 1 deletion debian/scrapy-service.postrm → debian/scrapyd.postrm
@@ -8,7 +8,7 @@ if [ purge = "$1" ]; then
     else
         echo >&2 "not removing scrapy system account because deluser command was not found"
     fi
-    rm -rf /var/run/scrapy
+    rm -rf /var/run/scrapyd
 fi
 
 #DEBHELPER#
13 changes: 13 additions & 0 deletions debian/scrapyd.upstart
@@ -0,0 +1,13 @@
+# Scrapy service
+
+start on runlevel [2345]
+stop on runlevel [06]
+
+script
+    [ -r /etc/default/scrapyd ] && . /etc/default/scrapyd
+    logdir=/var/log/scrapyd
+    exec twistd -ny /usr/share/scrapyd/scrapyd.tac \
+        -u scrapy -g nogroup \
+        --pidfile /var/run/scrapyd/scrapy.pid \
+        -l $logdir/scrapyd.log >$logdir/scrapyd.out 2>$logdir/scrapyd.err
+end script
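
For foreground debugging (an editor's note, not part of this commit), the same daemon the upstart job launches can be run directly; -n keeps twistd from daemonizing and -y points it at the application file.

    # Run the Scrapy service in the foreground as the current user.
    twistd -ny /usr/share/scrapyd/scrapyd.tac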
13 changes: 0 additions & 13 deletions debian/service_conf.py

This file was deleted.

4 changes: 4 additions & 0 deletions docs/index.rst
@@ -125,6 +125,7 @@ Solving specific problems
    topics/leaks
    topics/images
    topics/ubuntu
+   topics/scrapyd
 
 :doc:`faq`
     Get answers to most frequently asked questions.
@@ -144,6 +145,9 @@ Solving specific problems
 :doc:`topics/ubuntu`
     Install latest Scrapy packages easily on Ubuntu
 
+:doc:`topics/scrapyd`
+    Deploying your Scrapy project in production.
+
 .. _extending-scrapy:
 
 Extending Scrapy
