HEPCrawl v0.2.0

HEPCrawl v0.2.0 was released on June 2, 2016.

About

HEPCrawl is a harvesting library based on Scrapy (http://scrapy.org) for INSPIRE-HEP (http://inspirehep.net).

What's new

11 new spiders, including arXiv, APS, Base OAI source, Elsevier and many more.
Updated HEPRecord data items to conform to updates in the INSPIRE data model.
Reorganized loaders so that input and output processing of metadata happens in one place.
New pipelines for pushing crawled content to INSPIRE servers.
Better error handling and reporting, including support for Sentry.
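
The loader reorganization mentioned above can be pictured with a small, dependency-free sketch. This is not HEPCrawl's or Scrapy's actual API: the class RecordLoader, the helper functions, the field names and the sample values are all hypothetical and chosen only for illustration. It shows the general idea of declaring each field's input processor (applied to every raw value as it is collected) and output processor (applied once to the collected values) in a single table.

```python
# Hypothetical, dependency-free sketch; not HEPCrawl's or Scrapy's actual API.


def strip_whitespace(value):
    """Input processor: trim surrounding whitespace from one raw value."""
    return value.strip()


def take_first(values):
    """Output processor: return the first non-empty collected value, or None."""
    for value in values:
        if value:
            return value
    return None


class RecordLoader:
    """Collects raw values per field; all processing rules live in one table."""

    # field name -> (input processor, output processor); names are illustrative.
    processors = {
        "title": (strip_whitespace, take_first),
        "authors": (strip_whitespace, list),
    }

    def __init__(self):
        self._values = {}

    def add_value(self, field, raw):
        """Run the field's input processor on a raw value and store the result."""
        process_in, _ = self.processors[field]
        self._values.setdefault(field, []).append(process_in(raw))

    def load_record(self):
        """Run each field's output processor and return the finished record."""
        return {
            field: self.processors[field][1](values)
            for field, values in self._values.items()
        }


# Made-up sample values, for illustration only.
loader = RecordLoader()
loader.add_value("title", "  A sample title \n")
loader.add_value("authors", " Doe, J. ")
loader.add_value("authors", "Roe, R.")
record = loader.load_record()
print(record)
```

Keeping both processors for a field side by side means a change to how a field is cleaned or assembled touches one entry rather than several modules.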

Installation

$ pip install hepcrawl==0.2.0

Documentation

http://pythonhosted.org/hepcrawl/

Happy hacking and thanks for flying HEPCrawl.

INSPIRE Development Team