Merge pull request #37 from funkyfuture/cli_with_subcommands
cli with subcommands
redapple committed Jun 19, 2017
2 parents 7bf56ec + 476c971 commit b650d49
Showing 17 changed files with 636 additions and 91 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -4,3 +4,4 @@ dist
build
.tox/
.cache/
.coverage
16 changes: 7 additions & 9 deletions .travis.yml
@@ -1,15 +1,13 @@
language: python
sudo: false
matrix:
include:
- python: 2.7
env: TOXENV=py27
- python: 3.5
env: TOXENV=py35
- python: 3.6
env: TOXENV=py36
cache: pip
python:
- 2.7
- 3.4
- 3.5
- 3.6
install:
- pip install -U tox twine wheel codecov
- travis_retry pip install -U tox-travis twine wheel codecov
script: tox
after_success:
- codecov
167 changes: 122 additions & 45 deletions README.rst
@@ -5,62 +5,97 @@ Scrapyd-client
.. image:: https://secure.travis-ci.org/scrapy/scrapyd-client.png?branch=master
:target: http://travis-ci.org/scrapy/scrapyd-client

Scrapyd-client is a client for `scrapyd <https://github.com/scrapy/scrapyd>`_. It provides the ``scrapyd-deploy`` utility which allows you to deploy your project to a Scrapyd server.
Scrapyd-client is a client for Scrapyd_. It provides the general-purpose ``scrapyd-client`` utility
and the ``scrapyd-deploy`` utility which allows you to deploy your project to a Scrapyd server.

.. _how-it-works:
.. _Scrapyd: https://scrapyd.readthedocs.io

How It Works
------------

Deploying your project to a Scrapyd server typically involves two steps:
scrapyd-client
--------------

1. `Eggifying <http://peak.telecommunity.com/DevCenter/PythonEggs>`_ your project. You'll need to install `setuptools <http://pypi.python.org/pypi/setuptools>`_ for this. See `Egg Caveats`_ below.
2. Uploading the egg to the Scrapyd server through the `addversion.json <https://scrapyd.readthedocs.org/en/latest/api.html#addversion-json>`_ endpoint.
For a reference on each subcommand, invoke ``scrapyd-client <subcommand> --help``.

The ``scrapyd-deploy`` tool automates the process of building the egg and pushing it to the target Scrapyd server.
Where filtering with wildcards is possible, patterns are matched with fnmatch_.
The ``--project`` option can be omitted if a default project is found in ``scrapy.cfg``.

.. _targets:
.. _fnmatch: https://docs.python.org/library/fnmatch.html

Targets
-------
deploy
~~~~~~

You can define Scrapyd targets in your project's ``scrapy.cfg`` file. Example::
At the moment this is a wrapper around `scrapyd-deploy`_. Note that its command-line options
are likely to change.

[deploy:example]
url = http://scrapyd.example.com/api/scrapyd
username = scrapy
password = secret
projects
~~~~~~~~

While your target needs to be defined with its URL in ``scrapy.cfg``, you can use `netrc <https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html>`_ for username and password, like so::
Lists all projects of a Scrapyd instance::

machine scrapyd.example.com
username scrapy
password secret
# lists all projects on the default target
scrapyd-client projects
# lists all projects from a custom URL
scrapyd-client -t http://scrapyd.example.net projects

If you want to list all available targets, you can use the ``-l`` option::
schedule
~~~~~~~~

scrapyd-deploy -l
Schedules one or more spiders to be executed::

To list projects available on a specific target, use the ``-L`` option::
# schedules any spider
scrapyd-client schedule
# schedules all spiders from the 'knowledge' project
scrapyd-client schedule -p knowledge \*
# schedules any spider from any project whose name ends with '_daily'
    scrapyd-client schedule -p \* \*_daily

scrapyd-deploy -L example
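
The ``schedule`` subcommand also accepts a repeatable ``--arg key=value`` option (defined in
``scrapyd_client/cli.py`` below), which presumably forwards each pair to the scheduled spider as a
spider argument. A hypothetical invocation (the spider name and argument are made up)::

    # hypothetical: schedule one spider of the 'knowledge' project with a spider argument
    scrapyd-client schedule -p knowledge wiki_articles --arg category=physics
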
spiders
~~~~~~~

Lists spiders of one or more projects::

# lists all spiders
scrapyd-client spiders
# lists all spiders from the 'knowledge' project
scrapyd-client spiders -p knowledge


scrapyd-deploy
--------------

How It Works
~~~~~~~~~~~~

Deploying your project to a Scrapyd server typically involves two steps:

1. Eggifying_ your project. You'll need to install setuptools_ for this. See `Egg Caveats`_ below.
2. Uploading the egg to the Scrapyd server through the `addversion.json`_ endpoint.

The ``scrapyd-deploy`` tool automates the process of building the egg and pushing it to the target
Scrapyd server.

.. _addversion.json: https://scrapyd.readthedocs.org/en/latest/api.html#addversion-json
.. _Eggifying: http://peak.telecommunity.com/DevCenter/PythonEggs
.. _setuptools: https://pypi.python.org/pypi/setuptools
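
For illustration, step 2 can also be performed by hand. A minimal sketch, assuming a Scrapyd
instance listening on ``http://localhost:6800`` and an already-built ``myproject.egg``::

    curl http://localhost:6800/addversion.json -F project=myproject -F version=1.0 -F egg=@myproject.egg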

Deploying a Project
-------------------
~~~~~~~~~~~~~~~~~~~

First ``cd`` into your project's root; you can then deploy your project with the following::

scrapyd-deploy <target> -p <project>

This will eggify your project and upload it to the target. If you have a ``setup.py`` file in your project, it will be used, otherwise one will be created automatically.
This will eggify your project and upload it to the target. If you have a ``setup.py`` file in your
project, it will be used; otherwise one will be created automatically.

If successful you should see a JSON response similar to the following::

Deploying myproject-1287453519 to http://localhost:6800/addversion.json
Server response (200):
{"status": "ok", "spiders": ["spider1", "spider2"]}

To save yourself from having to specify the target and project, you can set the defaults in the ``scrapy.cfg`` file::
To save yourself from having to specify the target and project, you can set the defaults in the
``scrapy.cfg`` file::

[deploy]
url = http://scrapyd.example.com/api/scrapyd
@@ -72,54 +72,96 @@ To save yourself from having to specify the
You can now deploy your project with just the following::

scrapyd-deploy
If you have more than one target to deploy, you can deploy your project in all targets with one command::

scrapyd-deploy -a -p <project>
If you have more than one target to deploy to, you can deploy your project to all targets with one
command::

.. _versioning:
scrapyd-deploy -a -p <project>

Versioning
----------
~~~~~~~~~~

By default, ``scrapyd-deploy`` uses the current timestamp for generating the project version, as shown above. However, you can pass a custom version using ``--version``::
By default, ``scrapyd-deploy`` uses the current timestamp for generating the project version, as
shown above. However, you can pass a custom version using ``--version``::

scrapyd-deploy <target> -p <project> --version <version>

Or for all targets::

scrapyd-deploy -a -p <project> --version <version>

The version must be comparable with `LooseVersion <http://epydoc.sourceforge.net/stdlib/distutils.version.LooseVersion-class.html>`_. Scrapyd will use the greatest version unless specified.
The version must be comparable with LooseVersion_. Scrapyd will use the greatest version unless
one is specified.

If you use Mercurial or Git, you can use ``HG`` or ``GIT`` respectively as the argument supplied to ``--version`` to use the current revision as the version. You can save yourself having to specify the version parameter by adding it to your target's entry in ``scrapy.cfg``::
If you use Mercurial or Git, you can use ``HG`` or ``GIT`` respectively as the argument supplied to
``--version`` to use the current revision as the version. You can save yourself from having to
specify the version parameter by adding it to your target's entry in ``scrapy.cfg``::

[deploy:target]
...
version = HG

.. _local-settings:
.. _LooseVersion: http://epydoc.sourceforge.net/stdlib/distutils.version.LooseVersion-class.html

Local Settings
--------------
~~~~~~~~~~~~~~

You may want to keep certain settings local and not have them deployed to Scrapyd. To accomplish this you can create a ``local_settings.py`` file at the root of your project, where your ``scrapy.cfg`` file resides, and add the following to your project's settings::
You may want to keep certain settings local and not have them deployed to Scrapyd. To accomplish
this, you can create a ``local_settings.py`` file at the root of your project, where your
``scrapy.cfg`` file resides, and add the following to your project's settings::

try:
from local_settings import *
except ImportError:
pass

``scrapyd-deploy`` doesn't deploy anything outside of the project module, so the ``local_settings.py`` file won't be deployed.

.. _egg-caveats:
``scrapyd-deploy`` doesn't deploy anything outside of the project module, so the
``local_settings.py`` file won't be deployed.
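
For example, a hypothetical ``local_settings.py`` could hold development-only overrides such as::

    # local-only settings; this file never ends up in the deployed egg
    HTTPCACHE_ENABLED = True
    LOG_LEVEL = 'DEBUG'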

Egg Caveats
-----------
~~~~~~~~~~~

Some things to keep in mind when building eggs for your Scrapy project:

* Make sure no local development settings are included in the egg when you build it. The ``find_packages`` function may be picking up your custom settings. In most cases you want to upload the egg with the default project settings.
* You should avoid using ``__file__`` in your project code as it doesn't play well with eggs. Consider using `pkgutil.get_data() <http://docs.python.org/library/pkgutil.html#pkgutil.get_data>`_ instead.
* Be careful when writing to disk in your project, as Scrapyd will most likely be running under a different user which may not have write access to certain directories. If you can, avoiding writing to disk and always use `tempfile <http://docs.python.org/library/tempfile.html>`_ for temporary files.
* Make sure no local development settings are included in the egg when you build it. The
``find_packages`` function may be picking up your custom settings. In most cases you want to
upload the egg with the default project settings.
* You should avoid using ``__file__`` in your project code as it doesn't play well with eggs.
  Consider using `pkgutil.get_data`_ instead (see the sketch below).
* Be careful when writing to disk in your project, as Scrapyd will most likely be running under a
different user which may not have write access to certain directories. If you can, avoid writing
to disk and always use tempfile_ for temporary files.

.. _pkgutil.get_data: http://docs.python.org/library/pkgutil.html#pkgutil.get_data
.. _tempfile: http://docs.python.org/library/tempfile.html
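
As a sketch of the `pkgutil.get_data`_ alternative mentioned above (the package and resource names
are hypothetical)::

    import pkgutil

    # load a data file bundled inside the 'myproject' package instead of
    # building a filesystem path from __file__
    blacklist = pkgutil.get_data('myproject', 'resources/blacklist.txt')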


Global Settings
---------------

Targets
~~~~~~~

You can define Scrapyd targets in your project's ``scrapy.cfg`` file. Example::

[deploy:example]
url = http://scrapyd.example.com/api/scrapyd
username = scrapy
password = secret

While your target needs to be defined with its URL in ``scrapy.cfg``,
you can use netrc_ for username and password, like so::

machine scrapyd.example.com
username scrapy
password secret

If you want to list all available targets, you can use the ``-l`` option::

scrapyd-deploy -l

To list projects available on a specific target, use the ``-L`` option::

scrapyd-deploy -L example

.. _netrc: https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html
3 changes: 3 additions & 0 deletions requirements-dev.txt
@@ -0,0 +1,3 @@
pytest-console-scripts
pytest-mock
tox
97 changes: 97 additions & 0 deletions scrapyd_client/cli.py
@@ -0,0 +1,97 @@
from __future__ import print_function

import sys
from argparse import ArgumentParser
from traceback import print_exc

from requests.exceptions import ConnectionError

from scrapyd_client import commands
from scrapyd_client.utils import ErrorResponse, MalformedRespone, get_config


DEFAULT_TARGET_URL = 'http://localhost:6800'
ISSUE_TRACKER_URL = 'https://github.com/scrapy/scrapyd-client/issues'


def parse_cli_args(args):
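    """Parses the command-line arguments and returns the populated namespace."""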
target_default = get_config('deploy', 'url', fallback=DEFAULT_TARGET_URL).rstrip('/')
project_default = get_config('deploy', 'project', fallback=None)
project_kwargs = {
'metavar': 'PROJECT', 'required': True,
'help': 'Specifies the project, can be a globbing pattern.'
}
    if project_default:
        # a project defined in scrapy.cfg serves as the default and makes --project optional
        project_kwargs['default'] = project_default
        project_kwargs['required'] = False

description = 'A command line interface for Scrapyd.'
mainparser = ArgumentParser(description=description)
subparsers = mainparser.add_subparsers()
mainparser.add_argument('-t', '--target', default=target_default,
help="Specifies the Scrapyd's API base URL.")

parser = subparsers.add_parser('deploy', description=commands.deploy.__doc__)
parser.set_defaults(action=commands.deploy)

parser = subparsers.add_parser('projects', description=commands.projects.__doc__)
parser.set_defaults(action=commands.projects)

parser = subparsers.add_parser('schedule', description=commands.schedule.__doc__)
parser.set_defaults(action=commands.schedule)
parser.add_argument('-p', '--project', **project_kwargs)
parser.add_argument('spider', metavar='SPIDER',
help='Specifies the spider, can be a globbing pattern.')
parser.add_argument('--arg', action='append', default=[],
help='Additional argument (key=value), can be specified multiple times.')

parser = subparsers.add_parser('spiders', description=commands.spiders.__doc__)
parser.set_defaults(action=commands.spiders)
parser.add_argument('-p', '--project', **project_kwargs)
parser.add_argument('-v', '--verbose', action='store_true', default=False,
help="Prints project's and spider's name in each line, intended for "
"processing stdout in scripts.")

# TODO remove next two lines when 'deploy' is moved to this module
parsed_args, _ = mainparser.parse_known_args(args)
if getattr(parsed_args, 'action', None) is not commands.deploy:
parsed_args = mainparser.parse_args(args)

if not hasattr(parsed_args, 'action'):
mainparser.print_help()
raise SystemExit(0)

return parsed_args


def main():
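    """Entry point for the CLI: runs the selected subcommand and maps errors to exit codes."""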
try:
args = parse_cli_args(sys.argv[1:])
args.action(args)
except KeyboardInterrupt:
print('Aborted due to keyboard interrupt.')
exit_code = 0
except SystemExit as e:
exit_code = e.code
except ConnectionError as e:
print('Failed to connect to target ({}):'.format(args.target))
print(e)
exit_code = 1
except ErrorResponse as e:
print('Scrapyd responded with an error:')
print(e)
exit_code = 1
except MalformedRespone as e:
text = str(e)
if len(text) > 120:
text = text[:50] + ' [...] ' + text[-50:]
print('Received a malformed response:')
print(text)
exit_code = 1
except Exception:
print('Caught unhandled exception, please report at {}'.format(ISSUE_TRACKER_URL))
print_exc()
exit_code = 3
else:
exit_code = 0
finally:
raise SystemExit(exit_code)
