This repository has been archived by the owner on Jan 13, 2022. It is now read-only.
Retrieve sub providers within Smithsonian #455
Merged
Fixes
Fixes #454 by @ChariniNana, Related to #392, Related to #451
Description
This PR addresses the requirement of retrieving all sub-providers within Smithsonian. The requirement has two aspects:
Retrieve sub-providers at the API level, as and when data is pulled from the Smithsonian API.
Update the existing Smithsonian-related information in the database to reflect the sub-provider information.
Technical details
The content of the 'unit_code' field of the Smithsonian API response uniquely identifies the sub-provider. We maintain a mapping from each sub-provider name to its 'unit_code' value(s) to support sub-provider retrieval. The 'unit_code' value is stored as metadata in the image store.
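As a rough illustration of the mapping described above, the sketch below inverts a sub-provider-to-unit-codes mapping into a lookup keyed by unit code and attaches the unit code as metadata. All names here (`SUB_PROVIDERS`, `build_unit_code_index`, `describe_image`) and the unit code values are hypothetical placeholders, not the identifiers or values used in this PR.

```python
# Hypothetical mapping: sub-provider name -> set of unit codes.
# The names and codes are placeholders, not real Smithsonian values.
SUB_PROVIDERS = {
    "example_museum_a": {"UNIT_A1", "UNIT_A2"},
    "example_museum_b": {"UNIT_B"},
}


def build_unit_code_index(sub_providers):
    """Invert {sub_provider: {unit_code, ...}} into {unit_code: sub_provider}."""
    index = {}
    for name, unit_codes in sub_providers.items():
        for code in unit_codes:
            # A unit code must identify exactly one sub-provider.
            if code in index:
                raise ValueError(f"unit code {code!r} is mapped twice")
            index[code] = name
    return index


UNIT_CODE_INDEX = build_unit_code_index(SUB_PROVIDERS)


def describe_image(unit_code):
    """Return the source and metadata for an image with a known unit code."""
    return {
        # A plain dict lookup; an unknown code raises KeyError here.
        "source": UNIT_CODE_INDEX[unit_code],
        # The unit code is kept alongside the image as metadata.
        "meta_data": {"unit_code": unit_code},
    }
```

Inverting the mapping once keeps each per-image lookup to a single dictionary access, and the duplicate check guards the assumption that a unit code belongs to only one sub-provider.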
Since our requirement is to categorise every image under a unique sub-provider, we expect the 'unit_code' value of each image to correspond to some sub-provider in our mapping. If we encounter an unknown 'unit_code', we raise an error and terminate program execution. Because the 'unit_code' values supported by Smithsonian can change over time, we need a mechanism that frequently checks whether our known set of unit codes is up to date. With such a mechanism, we can update the mapping from unit codes to sub-providers before executing Smithsonian sub-provider retrieval, and so avoid raising errors. This is tracked in a separate ticket, #451.
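The behaviour described above could be sketched as follows: a strict lookup that raises on an unknown unit code, plus a pre-check that reports codes missing from the mapping before retrieval starts. The exception class, function names, and unit codes are assumptions made for illustration; the actual check is the subject of #451 and may look quite different.

```python
class UnknownUnitCodeError(ValueError):
    """Raised when a unit code has no sub-provider in the mapping."""


# Hypothetical lookup: unit code -> sub-provider name (placeholder values).
UNIT_CODE_INDEX = {
    "UNIT_A1": "example_museum_a",
    "UNIT_A2": "example_museum_a",
    "UNIT_B": "example_museum_b",
}


def get_sub_provider(unit_code, index=UNIT_CODE_INDEX):
    """Return the sub-provider for unit_code, or raise if it is unknown."""
    try:
        return index[unit_code]
    except KeyError:
        raise UnknownUnitCodeError(
            f"no sub-provider mapped for unit code {unit_code!r}"
        ) from None


def find_unknown_unit_codes(observed_codes, index=UNIT_CODE_INDEX):
    """Return the observed unit codes that are missing from the mapping."""
    return set(observed_codes) - set(index)
```

Running something like `find_unknown_unit_codes` against the current list of unit codes before retrieval would surface new codes early, so the mapping can be updated instead of the run terminating partway through.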
The workflow smithsonian_sub_provider_update_workflow allows triggering the DB update related to Smithsonian sub-provider retrieval.
Tests
test_process_image_data_with_sub_provider within the test_smithsonian test suite checks whether the source is properly set when a sub provider from our mapping is encountered.
test_update_smithsonian_sub_providers within test_sql checks the successful updating of the image table.
test_smithsonian_dag_loads_with_no_errors within the test_sub_provider_update_workflow test suite.
Checklist
[…] Update index.md).
[…] master branch of the repository.
I added or updated documentation (if applicable).
[…] visible errors.
Developer Certificate of Origin