This repository has been archived by the owner on Jan 13, 2022. It is now read-only.
ChariniNana requested review from a team, kss682 and mathemancer, and removed request for a team (July 19, 2020 21:52)
…into smithsonian_unit_code_check
…into smithsonian_unit_code_check
mathemancer suggested changes (Aug 3, 2020)
The main change I'd request is to try to make sure not to duplicate any constants from other parts of the code base. Over time, this will make things easier to maintain. See my specific comments for details. Otherwise, please make sure that the function raises an exception if human intervention is required, since it's unlikely I'll remember to always check that table.
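The second request above could be sketched as follows. This is only an illustration, not the code from the pull request: the exception class and function name are hypothetical, and the sketch assumes the check already produces a collection of unit codes that are missing from the dictionary. Raising from an Airflow task marks that task as failed, which makes the need for manual intervention visible without anyone having to inspect a table.

```python
class NewUnitCodesFound(Exception):
    """Hypothetical exception: unit codes need a manual dictionary update."""


def fail_if_new_unit_codes(new_unit_codes):
    """Raise if any unit codes were found that the dictionary lacks.

    `new_unit_codes` is assumed to be an iterable of strings produced by
    the unit code check; an empty iterable means nothing needs doing.
    """
    new_unit_codes = sorted(new_unit_codes)
    if new_unit_codes:
        raise NewUnitCodesFound(
            "New Smithsonian unit codes need manual review: "
            + ", ".join(new_unit_codes)
        )


# No new codes: the call returns quietly.
fail_if_new_unit_codes(set())
```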
src/cc_catalog_airflow/dags/util/loader/smithsonian_unit_codes.py
src/cc_catalog_airflow/dags/util/loader/smithsonian_unit_codes.py
…ode for consistency
…into smithsonian_unit_code_check
mathemancer approved these changes (Aug 4, 2020)
Great work, thank you!
Fixes
Fixes #451 by @ChariniNana
Description
This implementation helps keep the Smithsonian unit codes maintained in the `SMITHSONIAN_SUB_PROVIDERS` dictionary of the `provider_details.py` file up to date. The `SMITHSONIAN_SUB_PROVIDERS` dictionary holds all known unit codes associated with Smithsonian images and is used to retrieve the corresponding sub-provider values. If unit codes change at the Smithsonian API level without our knowing, retrieving Smithsonian sub-provider values would run into issues. Therefore, we have implemented a workflow that can be run frequently to check for potential changes to unit codes at the Smithsonian API level, so that the `SMITHSONIAN_SUB_PROVIDERS` dictionary can be manually updated to reflect those changes.
Technical details
The latest unit codes maintained at the Smithsonian API level for images can be retrieved by calling the following endpoint: https://api.si.edu/openaccess/api/v1.0/terms/unit_code?q=online_media_type:Images&api_key=REDACTED
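A minimal sketch of how a response from this endpoint could be checked against the known unit codes is shown here. It is not the code in `smithsonian_unit_codes.py`. It assumes that the parsed JSON carries the unit codes as a list under `response` then `terms`, and that the dictionary maps each sub-provider name to a set of unit codes. `KNOWN_SUB_PROVIDERS`, the function names, and the sample values are all hypothetical stand-ins.

```python
# Hypothetical stand-in for SMITHSONIAN_SUB_PROVIDERS. The real dictionary
# lives in provider_details.py; its shape (name -> set of codes) is assumed.
KNOWN_SUB_PROVIDERS = {
    "sub_provider_a": {"UNIT1", "UNIT2"},
    "sub_provider_b": {"UNIT3"},
}


def extract_unit_codes(response_json):
    """Return the unit codes from a parsed API response.

    Assumes the codes are listed under response -> terms; a response
    without those keys yields an empty set.
    """
    terms = response_json.get("response", {}).get("terms", [])
    return set(terms)


def find_new_unit_codes(latest_codes, sub_providers):
    """Return, sorted, the codes that appear in no sub-provider entry."""
    known = set().union(*sub_providers.values())
    return sorted(set(latest_codes) - known)


# Made-up sample response, used instead of a live API call.
sample_response = {"response": {"terms": ["UNIT1", "UNIT3", "UNIT9"]}}
new_codes = find_new_unit_codes(
    extract_unit_codes(sample_response), KNOWN_SUB_PROVIDERS
)
print(new_codes)  # ['UNIT9']
```

In a real workflow the known codes would be imported from the existing dictionary rather than redefined, in line with the review request not to duplicate constants.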
We retrieve the latest unit codes by calling this endpoint, and any unit code that does not currently appear in the `SMITHSONIAN_SUB_PROVIDERS` dictionary is stored in a table called `smithsonian_new_unit_codes`. The logic lives in `smithsonian_unit_codes.py`. It can be executed by triggering the `check_new_smithsonian_unit_codes_workflow` via the Airflow UI, after which the `smithsonian_new_unit_codes` table is updated with the latest unit codes that need to be added to the `SMITHSONIAN_SUB_PROVIDERS` dictionary. If no new unit codes are seen, the `smithsonian_new_unit_codes` table will be empty. Please note that a person who maintains the CC repo is expected to make the actual update to the `SMITHSONIAN_SUB_PROVIDERS` dictionary.
Tests
The `test_smithsonian_unit_codes.py` test suite checks that the new unit code retrieval logic works correctly.
The `test_check_new_smithsonian_unit_codes_workflow.py` test suite verifies that the corresponding workflow DAG is loaded properly.
Checklist
[…] `Update index.md`).
[…] `main` or `master`).
I added or updated documentation (if applicable).
[…] visible errors.
Developer Certificate of Origin