Asquera/scrapy-elasticsearch-bulk-item-exporter

Description

scrapy-elasticsearch-bulk-item-exporter provides a Scrapy feed exporter that writes scraped items in the Elasticsearch bulk format, so the resulting file can be loaded into Elasticsearch directly.
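
To make the target format concrete, here is a minimal sketch of what bulk output looks like: each document is written as two newline-terminated JSON lines, an action line followed by the document source. The function name `to_bulk_lines` and the index name `pages` are hypothetical, and this is not the package's actual implementation.

```python
import json


def to_bulk_lines(items, index):
    """Serialize dict-like items as Elasticsearch bulk 'index' actions.

    Illustrative only: `index` is a hypothetical index name, and the real
    exporter may emit different action metadata.
    """
    lines = []
    for item in items:
        # Action line: tells Elasticsearch to index the following document.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(dict(item), sort_keys=True))
    # The bulk API expects every line, including the last, to end with "\n".
    return "".join(line + "\n" for line in lines)


example = to_bulk_lines([{"title": "Hello", "url": "http://example.com"}], "pages")
print(example, end="")
```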

Install

pip install scrapy-elasticsearch-bulk-item-exporter

Usage

scrapy crawl <spider> -o my.bulk -t elasticsearchbulk
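
Once my.bulk exists, it can be sent to the Elasticsearch `_bulk` endpoint. The following is a minimal sketch using only the Python standard library; the helper name `build_bulk_request` and the node address `http://localhost:9200` are assumptions, and the actual send is left commented out because it needs a running node.

```python
import urllib.request


def build_bulk_request(payload, base_url="http://localhost:9200"):
    """Build a POST request for the _bulk endpoint from raw bulk bytes.

    `base_url` is an assumed local node address; adjust it as needed.
    """
    # The bulk API requires the body to end with a newline.
    if not payload.endswith(b"\n"):
        payload += b"\n"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_bulk",
        data=payload,
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )


# Usage (requires a running Elasticsearch node):
# with open("my.bulk", "rb") as f:
#     req = build_bulk_request(f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```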

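Elasticsearch limits the size of a single bulk request. The default limit is 100 MB; it can be raised to as much as 2 GB, but that is not advisable. Larger exports should therefore be split into several smaller files. The command below writes the export to stdout, from where it can be piped to split(1):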

scrapy crawl <spider> -o - -t elasticsearchbulk
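
When splitting, an action line must stay in the same file as its document. Assuming the export consists of two-line pairs (action line, then source line), a line-based split should use an even line count. The sketch below shows the same idea in Python with a byte budget per chunk; `split_bulk` is a hypothetical helper, not part of this package.

```python
def split_bulk(lines, max_bytes):
    """Group bulk lines into chunks of roughly at most `max_bytes` bytes.

    Assumes `lines` alternates action line, source line. A pair is never
    separated; a single pair larger than `max_bytes` gets its own chunk.
    """
    if len(lines) % 2:
        raise ValueError("expected an even number of lines (action/source pairs)")
    chunks, current, size = [], [], 0
    # Walk the input two lines at a time: (action, source).
    for i in range(0, len(lines), 2):
        pair = lines[i:i + 2]
        pair_size = sum(len(line.encode("utf-8")) for line in pair)
        # Start a new chunk if adding this pair would exceed the budget.
        if current and size + pair_size > max_bytes:
            chunks.append(current)
            current, size = [], 0
        current.extend(pair)
        size += pair_size
    if current:
        chunks.append(current)
    return chunks


# Three pairs of 4 bytes each with an 8-byte budget: two pairs, then one.
demo = split_bulk(["a\n", "b\n", "c\n", "d\n", "e\n", "f\n"], max_bytes=8)
```

Each resulting chunk can be written to its own file and loaded separately.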

Register the exporter in settings.py so that the elasticsearchbulk format is available:

FEED_EXPORTERS = { 'elasticsearchbulk': 'scrapyelasticsearch.ElasticSearchBulkItemExporter' }

Changelog

0.1: Initial release

Credit

Thanks to Julien Duponchelle; I used his scrapy-elasticsearch for inspiration.

License

Scrapy's license: BSD. See LICENSE for details.

About

A Scrapy feed exporter that exports to the Elasticsearch bulk format. Very handy for development.
