Error in makeRequestErrorHandling from a listRecords call with from_ parameter #33

Closed
fxcoudert opened this issue Jul 20, 2018 · 4 comments

Comments

@fxcoudert

This very simple code requests records from a figshare set:

#!/usr/bin/env python3

from oaipmh.client import Client
from oaipmh.metadata import MetadataRegistry, oai_dc_reader

import datetime

registry = MetadataRegistry()
registry.registerReader('oai_dc', oai_dc_reader)
client = Client('https://api.figshare.com/v2/oai', registry)

month_ago = datetime.datetime.now() - datetime.timedelta(days=30)
for record in client.listRecords(metadataPrefix='oai_dc', set='portal_259', from_=month_ago):
  print(record[0].datestamp(), end=' ')
  print(record[1]['title'][0])

After finding several records, the code throws an exception with the following error:

Traceback (most recent call last):
  File "./toto.py", line 13, in <module>
    for record in client.listRecords(metadataPrefix='oai_dc', set='portal_259', from_=month_ago):
  File "/Users/fx/anaconda3/lib/python3.6/site-packages/oaipmh/client.py", line 365, in ResumptionListGenerator
    result, token = nextBatch(token)
  File "/Users/fx/anaconda3/lib/python3.6/site-packages/oaipmh/client.py", line 194, in nextBatch
    resumptionToken=token)
  File "/Users/fx/anaconda3/lib/python3.6/site-packages/oaipmh/client.py", line 308, in makeRequestErrorHandling
    raise getattr(error, code[0].upper() + code[1:] + 'Error')(msg)
oaipmh.error.NoRecordsMatchError: The result in an empty list.

If I remove the from_ parameter from the listRecords call, it all works fine.

@fxcoudert
Author

I should also state that, even before breaking, the server seems to return 3 records whose datestamps do not match the requested from_ parameter:

2018-05-17 13:10:54 Quantitative Characterization of Molecular-Stream Separation
2018-01-10 15:47:36 Melting of zeolitic imidazolate frameworks with different topologies: insight from first-principles molecular dynamics
2017-09-07 20:44:45 Facile Fabrication of Ultralow-Density Transparent Boehmite Nanofiber Cryogel Monoliths and Their Application in Volumetric Three-Dimensional Displays

Probably not related, and not as annoying as a crash, but still…

@jascoul
Collaborator

jascoul commented Jul 24, 2018

When I run your code, I get 67 results, so I can't reproduce it. The NoRecordsMatch error is raised when the server returns no results; this is part of the OAI-PMH protocol.

I also got the 3 records with the wrong timestamp. The server should not have returned those.
These seem to be problems with the figshare API and not with this library.

@jascoul jascoul closed this as completed Jul 24, 2018
@fxcoudert
Author

I understand that NoRecordsMatch should be raised when the server returns no results. The bug here is that, sometimes, the pyoai library raises this error even though the server did return results.

In fact, from my testing it appears the NoRecordsMatch occurs when (and only when) the number of records returned is an exact multiple of ten. I thus suspect this is a pagination bug.
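To make the suspicion concrete, here is a minimal sketch of how such a pagination failure could arise. Everything in it is an assumption for illustration: `fake_server`, `list_records`, `harvest`, the page size of 10 and the local `NoRecordsMatchError` class are hypothetical stand-ins, not the figshare API or the pyoai code. It assumes the server hands out a resumption token whenever a page is full, even when nothing is left, and reports the resulting empty page as an error.

```python
class NoRecordsMatchError(Exception):
    """Local stand-in for oaipmh.error.NoRecordsMatchError."""


PAGE_SIZE = 10  # assumed server page size


def fake_server(records, token=None):
    """Hypothetical server: return one page and a resumption token.

    Assumed behaviour: a token is issued whenever the page is full,
    even if no records remain, and an empty page is reported as an error.
    """
    start = int(token) if token else 0
    page = records[start:start + PAGE_SIZE]
    if not page:
        raise NoRecordsMatchError("The result in an empty list.")
    next_token = str(start + PAGE_SIZE) if len(page) == PAGE_SIZE else None
    return page, next_token


def list_records(records):
    """Follow resumption tokens until the server stops returning one."""
    page, token = fake_server(records)
    yield from page
    while token is not None:
        page, token = fake_server(records, token)
        yield from page


def harvest(n):
    """Return (records received, whether the error was raised)."""
    received = []
    try:
        for record in list_records(list(range(n))):
            received.append(record)
    except NoRecordsMatchError:
        return received, True
    return received, False


if __name__ == "__main__":
    for n in (7, 10, 13, 20):
        received, failed = harvest(n)
        print(n, len(received), "error" if failed else "ok")
```

Under these assumptions, totals of 10 and 20 deliver every record and then fail on the extra request, while 7 and 13 finish cleanly, which matches the pattern described above. Whether the real cause lies in the server or in the client is not established by this sketch.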

@fxcoudert
Author

Using from_ and until to craft a time range for which there are exactly 10 results shows the bug:

bli /tmp $ cat a.py 
#!/usr/bin/env python3

from oaipmh.client import Client
from oaipmh.metadata import MetadataRegistry, oai_dc_reader

import datetime

registry = MetadataRegistry()
registry.registerReader('oai_dc', oai_dc_reader)
client = Client('https://api.figshare.com/v2/oai', registry)

f = datetime.datetime.strptime('2018-07-24 14:56:00', '%Y-%m-%d %H:%M:%S')
u = datetime.datetime.strptime('2018-07-27 15:00:00', '%Y-%m-%d %H:%M:%S')
for record in client.listRecords(metadataPrefix='oai_dc', set='portal_259', from_=f, until=u):
  print(record[0].datestamp(), end=' ')
  print(record[1]['title'][0])

which gives:

bli /tmp $ ./a.py  
2018-07-27 14:00:34 Boehmite Nanofiber-Reinforced Resorcinol-Formaldehyde Macroporous Monoliths for Heat/Flame Protection
2018-07-26 21:31:00 Theory of the reactant-stationary kinetics for zymogen activation coupled to  an enzyme catalyzed reaction
2018-07-26 16:57:19 Facile Synthesis of a Diverse Library of Mono-3-substituted β-Cyclodextrin Analogues
2018-07-26 14:00:22 Computationally-Inspired Discovery of an Unsymmetrical Porous Organic Cage
2018-07-26 13:57:17 Unzipping Natural Products: Improved Natural Product Structure Predictions by Ensemble Modeling and Fingerprint Matching
2018-07-25 18:45:57 Air Quality in Puerto Rico in the Aftermath of Hurricane Maria: A Case Study on the Use of Lower-Cost Air Quality Monitors
2018-07-25 15:08:33 Magnetic Structure of UO2 and NpO2 by First-Principle Methods
2018-07-25 15:06:07 Tailing miniSOG: Structural Bases of the Complex Photophysics of a Flavin-Binding Singlet Oxygen Photosensitizing Protein
2018-07-25 14:31:52 On-Surface Radical Oligomerisation: A New Approach to STM Tip-Induced Reactions
2018-07-24 14:56:00 Hue Parameter Fluorescence Identification of Edible Oils with a Smartphone
Traceback (most recent call last):
  File "./a.py", line 14, in <module>
    for record in client.listRecords(metadataPrefix='oai_dc', set='portal_259', from_=f, until=u):
  File "/Users/fx/anaconda3/lib/python3.6/site-packages/oaipmh/client.py", line 365, in ResumptionListGenerator
    result, token = nextBatch(token)
  File "/Users/fx/anaconda3/lib/python3.6/site-packages/oaipmh/client.py", line 194, in nextBatch
    resumptionToken=token)
  File "/Users/fx/anaconda3/lib/python3.6/site-packages/oaipmh/client.py", line 308, in makeRequestErrorHandling
    raise getattr(error, code[0].upper() + code[1:] + 'Error')(msg)
oaipmh.error.NoRecordsMatchError: The result in an empty list.

With this from_/until specification, it should be reproducible for you. I hope you can reopen the bug.
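In the meantime, one possible client-side workaround is sketched below. The helper name `tolerate_trailing_error` is hypothetical and not part of pyoai; it wraps any iterator and ends the iteration quietly if the given exception type is raised after at least one record has already been received, while still re-raising it for a genuinely empty result.

```python
def tolerate_trailing_error(records, error_type):
    """Yield from `records`; stop quietly if `error_type` is raised
    after at least one record has already been yielded."""
    seen_any = False
    try:
        for record in records:
            seen_any = True
            yield record
    except error_type:
        if not seen_any:
            raise  # a genuinely empty result is still reported


if __name__ == "__main__":
    def flaky():
        # Stand-in for an iterator that fails after its last record.
        yield "a"
        yield "b"
        raise LookupError("no more records")

    print(list(tolerate_trailing_error(flaky(), LookupError)))
```

With pyoai, the same wrapper could presumably be applied as `tolerate_trailing_error(client.listRecords(...), NoRecordsMatchError)`, importing `NoRecordsMatchError` from `oaipmh.error` (the class named in the traceback above). This only hides the symptom; it does not address the underlying pagination behaviour.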
