The easiest way to install Scrapy Do is using pip. You can then create a directory where you want your project data stored and just start the daemon there.
$ pip install scrapy-do
$ mkdir /home/user/my-scrapy-do-data
$ cd /home/user/my-scrapy-do-data
$ scrapy-do scrapy-do
Yup, you need to type scrapy-do twice. That's how Twisted works, don't ask me. After doing that, you will see some content in this directory, including the log file and the pidfile of the Scrapy Do daemon.
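If you are curious what exactly ends up there, a quick listing will show you. The exact file names depend on the twistd defaults, so treat this as a rough sanity check rather than gospel:

$ ls -l /home/user/my-scrapy-do-data
# expect a pid file and a log file for the daemon to appear here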
Installing Scrapy Do as a systemd service is a far better idea than the easy way described above. It's a bit of work that should really be done by a proper Debian/Ubuntu package, but we do not have one for the time being, so I will show you how to do it "by hand."
Although not strictly necessary, it's good practice to run the daemon under a separate user account. I will create one called pydaemon because I run a couple more Python daemons this way.

$ sudo useradd -m -d /opt/pydaemon pydaemon
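If you want to double-check that the account came out the way you expect, standard tools will do; there is nothing Scrapy Do specific here:

$ getent passwd pydaemon   # the home directory should show up as /opt/pydaemon
$ id pydaemon              # confirms the account exists and lists its groups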
Make sure you have all of the following packages installed:
$ sudo apt-get install python3 python3-dev python3-virtualenv
$ sudo apt-get install build-essential
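If you want to be sure the toolchain is actually usable before going further, a couple of version checks do the trick:

$ python3 --version
$ gcc --version   # pulled in by build-essential; needed to build any native extensions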
Switch your session to this new user account:
$ sudo su - pydaemon
Create the virtual env and install Scrapy Do:
$ mkdir virtualenv
$ cd virtualenv/
$ python3 /usr/lib/python3/dist-packages/virtualenv.py -p /usr/bin/python3 .
$ . ./bin/activate
$ pip install scrapy-do
$ cd ..
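A quick way to confirm the package landed inside the virtualenv, while it is still activated from the step above. This is just a sanity check, not part of the setup itself:

$ pip show scrapy-do   # the virtualenv is still active, so this is the venv's pip
$ which scrapy-do      # should point at /opt/pydaemon/virtualenv/bin/scrapy-do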
Create a bin directory and a wrapper script that will set up the virtualenv on startup:
$ mkdir bin
$ cat > bin/scrapy-do << EOF
> #!/bin/bash
> . /opt/pydaemon/virtualenv/bin/activate
> exec /opt/pydaemon/virtualenv/bin/scrapy-do "\${@}"
> EOF
$ chmod 755 bin/scrapy-do
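Before relying on the wrapper, it does not hurt to check that the shell is happy with it and that the executable bit is set. These are generic checks, nothing specific to Scrapy Do:

$ bash -n bin/scrapy-do   # parses the script without running it; no output means no syntax errors
$ ls -l bin/scrapy-do     # should show -rwxr-xr-x after the chmod above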
Create a data directory and a configuration file:
$ mkdir -p data/scrapy-do
$ mkdir etc
$ cat > etc/scrapy-do.conf << EOF
> [scrapy-do]
> project-store = /opt/pydaemon/data/scrapy-do
> EOF
As root, create the systemd unit file /etc/systemd/system/scrapy-do.service with the following content:
# cat > /etc/systemd/system/scrapy-do.service << EOF
> [Unit]
> Description=Scrapy Do Service
>
> [Service]
> ExecStart=/opt/pydaemon/bin/scrapy-do --nodaemon --pidfile= \
>     scrapy-do --config /opt/pydaemon/etc/scrapy-do.conf
> User=pydaemon
> Group=pydaemon
> Restart=always
>
> [Install]
> WantedBy=multi-user.target
> EOF
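If you want to be sure the unit will actually start before handing it over to systemd, you can run the exact ExecStart command by hand as the pydaemon user and stop it once you have seen it come up cleanly:

$ sudo -u pydaemon /opt/pydaemon/bin/scrapy-do --nodaemon --pidfile= \
>     scrapy-do --config /opt/pydaemon/etc/scrapy-do.conf
# runs in the foreground; press Ctrl+C to stop it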
You can then reload the systemd configuration and let it manage the Scrapy Do daemon:
$ sudo systemctl daemon-reload
$ sudo systemctl start scrapy-do
$ sudo systemctl enable scrapy-do
Finally, you should be able to see that the daemon is running:
$ sudo systemctl status scrapy-do
● scrapy-do.service - Scrapy Do Service
   Loaded: loaded (/etc/systemd/system/scrapy-do.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2017-12-10 22:42:55 UTC; 4min 23s ago
 Main PID: 27543 (scrapy-do)
...
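Since the unit runs the daemon with --nodaemon, its output should end up in the journal, so that is the place to look if something goes wrong:

$ sudo journalctl -u scrapy-do --since today   # add -f to follow the log live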
I know it's awfully complicated. I will do some packaging work when I have a spare moment.