Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get_marc_record_from_ia: use existing metadata if present #8332

Conversation

scottbarnes
Copy link
Collaborator

@scottbarnes scottbarnes commented Sep 25, 2023

Fix.

Technical

Currently ia_import() will twice query the (memcached) Internet Archive API during the process for importing from an Internet Archive ID.

This PR passes the already-fetched metadata to the function that calls the metadata API again (get_marc_record_from_ia()), thereby avoiding the second call to the (memcached) API.

First, in "case 1", the metadata is gathered:

metadata = ia.get_metadata(identifier)

Once #8331 is merged, this will be the final query to the the IA metadata API, if that IA record has data in the openlibrary_edition field. ("Case 2".)

However, if the IA record does not have data in openlibrary_edition, then there is a query for MARC data from the IA item, which calls get_marc_record_from_ia(), which again calls the metadata API:

metadata = ia.get_metadata(identifier)

This PR avoids that second query.

Testing

Verify already-fetch metadata is now passed into get_marc_record_from_ia(), and that IA imports still work as expected.

Stakeholders

@cdrini

@cdrini cdrini added the Priority: 1 Do this week, receiving emails, time sensitive, . [managed] label Sep 25, 2023
Copy link
Collaborator

@cdrini cdrini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm! Note the ia.get_metadata fn is memcached, so we won't be avoiding an IA query, but will be removing a memached network request, I believe!

openlibrary/tests/catalog/test_get_ia.py Outdated Show resolved Hide resolved
openlibrary/catalog/get_ia.py Outdated Show resolved Hide resolved
openlibrary/catalog/get_ia.py Show resolved Hide resolved
@cdrini cdrini added the Needs: Submitter Input Waiting on input from the creator of the issue/pr [managed] label Oct 19, 2023
@cdrini
Copy link
Collaborator

cdrini commented Oct 19, 2023

Note I didn't test ; this seems pretty safe. (famous last words 😁 )

@cdrini cdrini added the On testing.openlibrary.org This PR has been deployed to testing.openlibrary.org for testing label Oct 19, 2023
@scottbarnes scottbarnes force-pushed the fix/ia_import_twice_queries_ia_metadata_api branch from c637964 to b8456b3 Compare October 19, 2023 22:05
@scottbarnes scottbarnes merged commit 01f6c6d into internetarchive:master Oct 19, 2023
3 checks passed
@scottbarnes scottbarnes deleted the fix/ia_import_twice_queries_ia_metadata_api branch October 19, 2023 22:12
@scottbarnes scottbarnes removed the Needs: Submitter Input Waiting on input from the creator of the issue/pr [managed] label Oct 23, 2023
@jimchamp jimchamp removed the On testing.openlibrary.org This PR has been deployed to testing.openlibrary.org for testing label Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: 1 Do this week, receiving emails, time sensitive, . [managed]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants