
Commit

added -o option to scrapy crawl, a convenient shortcut for using feed exports
pablohoffman committed Oct 22, 2011
1 parent 13cd9a1 commit ade5efd
Showing 4 changed files with 12 additions and 5 deletions.
6 changes: 3 additions & 3 deletions docs/faq.rst
@@ -197,15 +197,15 @@ Simplest way to dump all my scraped items into a JSON/CSV/XML file?

To dump into a JSON file::

-    scrapy crawl myspider -s FEED_URI=items.json -s FEED_FORMAT=json
+    scrapy crawl myspider -o items.json -t json

To dump into a CSV file::

-    scrapy crawl myspider -s FEED_URI=items.csv -s FEED_FORMAT=csv
+    scrapy crawl myspider -o items.csv -t csv

To dump into an XML file::

-    scrapy crawl myspider -s FEED_URI=items.xml -s FEED_FORMAT=xml
+    scrapy crawl myspider -o items.xml -t xml

For more information see :ref:`topics-feed-exports`

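Worth noting: in this commit ``-t`` defaults to ``jsonlines`` (see the ``default="jsonlines"`` in ``crawl.py`` below), so passing only ``-o`` should already produce a valid export. A hedged example, with ``items.jl`` as an illustrative filename:

    scrapy crawl myspider -o items.jl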
2 changes: 1 addition & 1 deletion docs/intro/overview.rst
@@ -161,7 +161,7 @@ Run the spider to extract the data
Finally, we'll run the spider to crawl the site and output a file
``scraped_data.json`` with the scraped data in JSON format::

-    scrapy crawl mininova.org -s FEED_URI=scraped_data.json -s FEED_FORMAT=json
+    scrapy crawl mininova.org -o scraped_data.json -t json

This uses :ref:`feed exports <topics-feed-exports>` to generate the JSON file.
You can easily change the export format (XML or CSV, for example) or the
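Changing the export format in the overview example should only require swapping the flag values, e.g. to CSV (a sketch; the ``scraped_data.csv`` name is illustrative):

    scrapy crawl mininova.org -o scraped_data.csv -t csv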
2 changes: 1 addition & 1 deletion docs/intro/tutorial.rst
@@ -420,7 +420,7 @@ Storing the scraped data
The simplest way to store the scraped data is by using the :ref:`Feed exports
<topics-feed-exports>`, with the following command::

-    scrapy crawl dmoz -s FEED_URI=items.json -s FEED_FORMAT=json
+    scrapy crawl dmoz -o items.json -t json

That will generate an ``items.json`` file containing all scraped items,
serialized in `JSON`_.
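To sanity-check the tutorial's export, the file can be read back with the standard library. A minimal sketch, assuming the tutorial's items carry ``title``, ``link`` and ``desc`` fields (those fields are not part of this commit):

    from __future__ import print_function
    import json

    # ``-t json`` writes the items as a single JSON array, so one load suffices.
    with open('items.json') as f:
        items = json.load(f)

    for item in items:
        print(item['title'], item['link'])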
7 changes: 7 additions & 0 deletions scrapy/commands/crawl.py
@@ -16,13 +16,20 @@ def add_options(self, parser):
         ScrapyCommand.add_options(self, parser)
         parser.add_option("-a", dest="spargs", action="append", default=[], metavar="NAME=VALUE", \
             help="set spider argument (may be repeated)")
+        parser.add_option("-o", "--output", metavar="FILE", \
+            help="store scraped items into FILE (using feed exports)")
+        parser.add_option("-t", "--output-format", metavar="FORMAT", default="jsonlines", \
+            help="format to use in feed exports (default: %default)")

     def process_options(self, args, opts):
         ScrapyCommand.process_options(self, args, opts)
         try:
             opts.spargs = arglist_to_dict(opts.spargs)
         except ValueError:
             raise UsageError("Invalid -a value, use -a NAME=VALUE", print_help=False)
+        if opts.output:
+            self.settings.overrides['FEED_URI'] = opts.output
+            self.settings.overrides['FEED_FORMAT'] = opts.output_format

     def run(self, args, opts):
         if len(args) < 1:
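Given the override mapping in ``process_options``, the two invocations below should behave identically; the equivalence is inferred from this diff, and ``myspider`` is a placeholder spider name:

    scrapy crawl myspider -o items.json -t json
    scrapy crawl myspider -s FEED_URI=items.json -s FEED_FORMAT=json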
