APS spider: use LastRunStore spider #261

Merged 1 commit on Feb 12, 2019

@vbalbp vbalbp commented Feb 6, 2019

Signed-off-by: Victor Balbuena <vbalbp@gmail.com>

@@ -199,3 +213,6 @@ def _get_authors_and_collab(self, article):

    def _file_name_from_url(self, url):
        return "{}.xml".format(url[url.rfind('/') + 1:])

    def make_file_fingerprint(self, set_):
        return u'metadataPrefix={}&set={}'.format(self.format, set_)

Contributor

Where is the format coming from? Do you really need that part? It seems OAI-PMH specific. Are you sure this really works?

@vbalbp vbalbp force-pushed the use-LastRunStoreSpider-on-APS branch from 710a20c to 0efd08a Compare February 6, 2019 15:49
        self.message = u"Failed to load file at {} for set {}".format(
            file_path,
            set_,
        )

Contributor

Is this used anywhere?

@vbalbp vbalbp force-pushed the use-LastRunStoreSpider-on-APS branch 2 times, most recently from 2307c01 to f851cde Compare February 7, 2019 13:43

.. _See documentation here:
http://harvest.aps.org/docs/harvest-api#endpoints

Journals are not supported as a parameter anymore for APS Spider

Contributor

Note:
    Selecting specific journals is not supported for technical reasons as it's incompatible with the way the last run time is stored.
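
One possible reading of this note, sketched under stated assumptions: if the last run date is stored under a key that does not include the journal, a run restricted to one journal would move the stored date forward for every journal. The store class, the `fingerprint` helper, the set name and the dates below are hypothetical and are not taken from this pull request.

```python
class InMemoryLastRunStore(object):
    """Hypothetical stand-in that keeps one last-run date per fingerprint."""

    def __init__(self):
        self._runs = {}

    def save(self, fingerprint, date):
        self._runs[fingerprint] = date

    def resume_from(self, fingerprint):
        return self._runs.get(fingerprint)


def fingerprint(set_):
    # Assumption: the key identifies the harvested set only, not a journal.
    return u'set={}'.format(set_)


store = InMemoryLastRunStore()

# A run limited to one journal finishes and records its date.
store.save(fingerprint('example-set'), '2019-02-01')

# A later run for a different journal looks up the same key, so it would
# resume from 2019-02-01 and skip that journal's older records.
resume_date = store.resume_from(fingerprint('example-set'))
```

Under that assumption, supporting journal selection would mean adding the journal to the stored key, which the note suggests this change does not do.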

@vbalbp vbalbp force-pushed the use-LastRunStoreSpider-on-APS branch 4 times, most recently from 3a19a63 to 4e336d3 Compare February 8, 2019 12:54
            'per_page': self.per_page,
            'date': self.date
        }
        return furl(APSSpider.aps_base_url).add(params).url

Contributor

Are the None checks that were done previously useless?

Contributor Author

I tried it out and the checks do matter, so I'm changing it back to the old way with the None checks.
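
A minimal sketch of why such checks can matter, using the standard library's `urlencode` rather than the `furl` call from the diff. The `build_url` helper, the base URL and the parameter values are hypothetical; the actual checks restored in the spider may look different.

```python
from urllib.parse import urlencode


def build_url(base_url, **params):
    """Build a query URL, leaving out parameters whose value is None."""
    present = {key: value for key, value in params.items() if value is not None}
    if not present:
        return base_url
    return "{}?{}".format(base_url, urlencode(present))


# Without a check, urlencode serialises None as the literal text "None".
unchecked = urlencode({'date': None})

# With the check, the unset parameter is simply left out of the URL.
url = build_url("http://example.org/api", per_page=100, date=None)
```

Here `unchecked` ends up as `date=None`, a value the remote API would have to interpret, while `url` carries only `per_page`.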

Signed-off-by: Victor Balbuena <vbalbp@gmail.com>
@vbalbp vbalbp force-pushed the use-LastRunStoreSpider-on-APS branch from 4e336d3 to 93be8b3 Compare February 11, 2019 15:02
@vbalbp vbalbp merged commit ec17e4e into inspirehep:master Feb 12, 2019