HEPCrawl v0.2.0

HEPCrawl v0.2.0 was released on June 2, 2016.

About

HEPCrawl is a harvesting library for INSPIRE-HEP (http://inspirehep.net), built on Scrapy (http://scrapy.org).

What's new

  • 11 new spiders, including arXiv, APS, Base OAI source, Elsevier and more.
  • Updated HEPRecord data items to conform to the latest INSPIRE data model.
  • Reorganized loaders so that input and output processing of metadata lives in one place.
  • New pipelines for pushing crawled content to INSPIRE servers.
  • Improved error handling and reporting, including support for Sentry.
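
The loader reorganization listed above keeps per-field input and output processing in a single class. The sketch below illustrates that general pattern in plain Python; it is not HEPCrawl's actual code and does not use Scrapy. The `SketchLoader` class, its field names, and the sample values are all hypothetical.

```python
from collections import defaultdict


class SketchLoader:
    """Hypothetical loader: one place for per-field input and output processing."""

    # Input processors run on each raw value as it is added.
    input_processors = {
        "title": str.strip,
        "authors": lambda name: " ".join(name.split()),  # collapse whitespace
    }
    # Output processors run once per field, on the collected list of values.
    output_processors = {
        "title": lambda values: values[0],       # keep only the first title
        "authors": lambda values: list(values),  # keep every author
    }

    def __init__(self):
        self._values = defaultdict(list)

    def add_value(self, field, raw):
        # Fields without an input processor are stored unchanged.
        process = self.input_processors.get(field, lambda value: value)
        self._values[field].append(process(raw))

    def load_item(self):
        # Fields without an output processor are returned as a list.
        return {
            field: self.output_processors.get(field, list)(values)
            for field, values in self._values.items()
        }


loader = SketchLoader()
loader.add_value("title", "  A study of crawling  ")
loader.add_value("authors", "Doe,   Jane")
loader.add_value("authors", "Roe, Richard ")
record = loader.load_item()
print(record)
```

Keeping both processor tables on the loader means a spider only has to call `add_value` with raw strings, and cleanup rules can change without touching spider code.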

Installation

$ pip install hepcrawl==0.2.0

Documentation

http://pythonhosted.org/hepcrawl/

Happy hacking and thanks for flying HEPCrawl.