Harvested XML from arXiv does not contain root element #41

Closed
kaplun opened this issue Sep 22, 2017 · 2 comments

kaplun commented Sep 22, 2017

Under some conditions, the XML currently harvested from arXiv is missing its root element (`<ListRecords>`).
E.g.:

$ head ///eos/workspace/i/inspire/PROD/var/data/oaiharvester/crawler/oaiharvest_2017-09-22_FiF81M.xml
<record xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><header><identifier>oai:arXiv.org:1709.07034</identifier><datestamp>2017-09-22</datestamp><setSpec>physics:astro-ph</setSpec></header><metadata><arXiv xmlns="http://arxiv.org/OAI/arXiv/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://arxiv.org/OAI/arXiv/ http://arxiv.org/OAI/arXiv.xsd"><id>1709.07034</id><created>2017-09-20</created><authors><author><keyname>Wijnands</keyname><forenames>Rudy</forenames></author><author><keyname>Degenaar</keyname><forenames>Nathalie</forenames></author><author><keyname>Page</keyname><forenames>Dany</forenames></author></authors><title>Cooling of Accretion-Heated Neutron Stars</title><categories>astro-ph.HE</categories><comments>Has appeared in Journal of Astrophysics and Astronomy special issue
  on 'Physics of Neutron Stars and Related Objects', celebrating the 75th
  birth-year of G. Srinivasan. In case of missing sources and/or references in
  the tables, please contact the first author and they will be included in
  updated versions of this review</comments><journal-ref>J. Astrophys. Astr. (September 2017) 38:49</journal-ref><doi>10.1007/s12036-017-9466-5</doi><license>http://arxiv.org/licenses/nonexclusive-distrib/1.0/</license><abstract>  We present a brief, observational review about the study of the cooling
behaviour of accretion-heated neutron stars and the inferences about the
neutron-star crust and core that have been obtained from these studies.
Accretion of matter during outbursts can heat the crust out of thermal
equilibrium with the core and after the accretion episodes are over, the crust
will cool down until crust-core equilibrium is restored. We discuss the

while another file from the same harvest does start with the root element:

$ head ///eos/workspace/i/inspire/PROD/var/data/oaiharvester/crawler/oaiharvest_2017-09-22_pwE0x6.xml
<ListRecords><record xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><header><identifier>oai:arXiv.org:1611.00593</identifier><datestamp>2017-09-21</datestamp><setSpec>physics</setSpec></header><metadata><arXiv xmlns="http://arxiv.org/OAI/arXiv/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://arxiv.org/OAI/arXiv/ http://arxiv.org/OAI/arXiv.xsd"><id>1611.00593</id><created>2016-11-02</created><updated>2017-09-20</updated><authors><author><keyname>Chaturvedi</keyname><forenames>Pankaj</forenames></author><author><keyname>Malvimat</keyname><forenames>Vinay</forenames></author><author><keyname>Sengupta</keyname><forenames>Gautam</forenames></author></authors><title>Covariant holographic entanglement negativity</title><categories>hep-th</categories><comments>17 pages latex, substantial modifications, references added</comments><license>http://arxiv.org/licenses/nonexclusive-distrib/1.0/</license><abstract>  We propose a covariant holographic conjecture for the entanglement negativity
of mixed states in bipartite systems described by $d$-dimensional conformal
field theories dual to bulk non static $AdS_{d+1}$ configurations. Application
of our conjecture to $(1+1)$-dimensional conformal field theories dual to bulk
rotating BTZ black holes exactly reproduces the corresponding entanglement
negativity in the large central charge limit and characterizes the distillable
entanglement. We further demonstrate that our conjecture applied to the case of
bulk extremal rotating BTZ black holes also characterizes the entanglement
negativity for the chiral half of the corresponding zero temperature
$(1+1)$-dimensional holographic conformal field theories.

kaplun commented Sep 22, 2017

As a result, hepcrawl harvests only the first record from such a file.

cc: @fschwenn
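
To illustrate why a file without a root element is a problem, here is a minimal sketch using only the Python standard library. It does not reproduce hepcrawl's own parsing, which is not shown in this thread; `has_single_root` and the sample strings are hypothetical. A strict parser rejects a file made of several sibling `<record>` elements outright, while a more lenient consumer may plausibly stop after the first one.

```python
import xml.etree.ElementTree as ET


def has_single_root(xml_text):
    """Return True if xml_text parses as one well-formed XML document.

    Hypothetical helper for illustration; not part of hepcrawl or
    invenio-oaiharvester.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        # Raised, among other cases, when content follows the first
        # top-level element (several sibling roots).
        return False
    return True


# Two sibling records with no enclosing element, as in the broken file.
broken = '<record><id>1</id></record><record><id>2</id></record>'

# The same records wrapped in a root element, as in the good file.
wrapped = '<ListRecords>' + broken + '</ListRecords>'
```

Running `has_single_root` over the files in the crawler directory would be one way to find which harvested files are affected.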


kaplun commented Sep 22, 2017

The culprit is this function:

def write_to_dir(records, output_dir, max_records=1000, encoding='utf-8'):

that, when the number of records is larger than `max_records`, closes the current file incorrectly and opens the next one incorrectly.
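
A minimal sketch of the intended behaviour, assuming each record is an already-serialized XML string: every output file gets its own `<ListRecords>` opening and closing tag, including files started after `max_records` is reached. This is not the actual invenio-oaiharvester code; only the signature comes from the function quoted above, while the body, the file naming, and the return value are assumptions, and it targets Python 3.

```python
import os
import tempfile


def write_to_dir(records, output_dir, max_records=1000, encoding='utf-8'):
    """Write records to files of at most max_records each.

    Illustrative sketch only: assumes `records` is a list of serialized
    XML strings. Returns (list of file paths, number of records written).
    """
    if not records:
        return [], 0

    os.makedirs(output_dir, exist_ok=True)
    files_created = []

    # Split the records into consecutive chunks of at most max_records.
    for start in range(0, len(records), max_records):
        chunk = records[start:start + max_records]
        fd, path = tempfile.mkstemp(
            prefix='oaiharvest_', suffix='.xml', dir=output_dir
        )
        with os.fdopen(fd, 'w', encoding=encoding) as handle:
            # Repeat the root tag in every file, not only the first one.
            handle.write('<ListRecords>')
            for record in chunk:
                handle.write(record)
            handle.write('</ListRecords>')
        files_created.append(path)

    return files_created, len(records)
```

Opening and closing the root tag inside the same `with` block as the file handle means a file can never be started or finished without it, whichever chunk it holds.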

michamos added a commit to michamos/invenio-oaiharvester that referenced this issue Sep 22, 2017
* Previously, the root tag was not repeated in every file.
* Fixes inveniosoftware#41.

Signed-off-by: Micha Moskovic <michamos@gmail.com>
michamos added a commit to michamos/invenio-oaiharvester that referenced this issue Sep 22, 2017
* Previously, the root tag was not repeated in every file (closes inveniosoftware#41).

Signed-off-by: Micha Moskovic <michamos@gmail.com>