This repository has been archived by the owner on Feb 27, 2021. It is now read-only.

BASE dump import and optimizations for speed #760

Open
wants to merge 12 commits into
base: master

Member

@wetneb wetneb commented May 13, 2019

Apparently we didn't even have any code to import a dump; I'm not sure exactly where it went… This is incremental progress towards staying in sync with BASE more efficiently.

@wetneb wetneb marked this pull request as ready for review May 14, 2019 15:09
@wetneb
Member Author

wetneb commented May 14, 2019

When importing papers that already exist and do not need updating, this code runs at ~600 papers/sec in the sandbox. I think we should start running it in production to import the missing papers.

@wetneb wetneb requested a review from Phyks May 14, 2019 17:23
Member

@Phyks Phyks left a comment

Nice! Note that Travis is unhappy :(

backend/oai.py (resolved)
backend/oai.py (outdated, resolved)
backend/oai.py (outdated, resolved)
backend/oai.py (outdated, resolved)
backend/tests/test_oai.py (resolved)
backend/oai.py (outdated, resolved)
backend/translators.py (outdated, resolved)
backend/utils.py (outdated, resolved)
backend/oai.py (resolved)
@Phyks Phyks mentioned this pull request May 15, 2019
1 task
@coveralls

coveralls commented Jan 8, 2020

Coverage increased (+0.2%) to 82.002% when pulling e2c08f3 on base_optims into 2300bae on master.

@beckstefan
Member

The tests are passing.

In the OaiRecords from BASE, we have three irregular pubtypes: article-journal, '' and None.

The last one seems to come from the value not being set on production, but the other two seem mysterious to me at the moment. I'd like to investigate this; then we can probably merge and fix the incorrect types.
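
One way to start that investigation, as a minimal sketch: tally the pubtype values and separate those outside an expected set. The `VALID_PUBTYPES` set, the `tally_pubtypes` helper and the sample list below are hypothetical stand-ins for the real OaiRecord data; only the three irregular values come from the observation above.

```python
from collections import Counter

# Hypothetical set of accepted pubtypes; the real list lives in the codebase.
VALID_PUBTYPES = {"journal-article", "preprint", "thesis"}


def tally_pubtypes(pubtypes, valid=VALID_PUBTYPES):
    """Count each pubtype value and split the counts into regular and irregular."""
    counts = Counter(pubtypes)
    regular = {value: n for value, n in counts.items() if value in valid}
    irregular = {value: n for value, n in counts.items() if value not in valid}
    return regular, irregular


# Made-up sample values, including the three irregular ones seen in BASE records.
sample = ["journal-article", "article-journal", "", None, "preprint", None]
regular, irregular = tally_pubtypes(sample)
print(irregular)
```

Knowing how often each irregular value occurs would help decide whether a simple mapping (for instance, treating article-journal as a swapped spelling) is enough, or whether the translator needs a closer look.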

4 participants