-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Map Aardvark.dct_spatial_sm to Locations@kind="Place Name" #146
Map Aardvark.dct_spatial_sm to Locations@kind="Place Name" #146
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The update looks good, and I think achieves what we need, but I might propose that we lean into the post-transform "hook" or "enrichment" (name TBD) pattern that was established in #141.
While I think it's agreed that we'll want to revist how to orchestrate these hooks, this second case of needing to do something very similar suggests there is a pattern that we can identify and abstract out.
Until then, I think we'd benefit from not locating this Subject --> Location
data extraction/duplication in the MITAardvark
transform only, lest we forget about it later and accidentally duplicate it's functionality. Moreover, if we lean into this pattern now and define it at the Transformer
level, I think it provides more concrete use cases to consider when that refactor happens in earnest.
More concretely, where we currently have:
fields = self.create_dates_and_locations_from_publishers(fields)
I might propose we add another like:
# post transform hooks and enrichments
fields = self.create_dates_and_locations_from_publishers(fields)
fields = self.create_locations_from_spatial_subjects(fields) # <--- new
The needle I think we want to thread is: making changes now that are solid and understandable, while understanding a future refactor may combine and extend this pattern that is emerging from these first two use cases.
Why these changes are being introduced: * Support place names for location-based searching How this addresses that need: * Add global Transformer method to create Location objects from spatial subjects Side effects of this change: * None Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/GDT-217
ec52a9c
to
95349bd
Compare
@ghukill Thank you for clarifying! I undid the changes to the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General structure, looks good! Thanks for going with this pattern of locating these "post hook" work together.
I left a syntactical / stylistic comment for the filtering of subjects
down to spatial ones. I find list comprehensions easier to scan than filter
+ lambda, but open to pushback!
if not (subjects := fields.get("subjects")): | ||
return fields | ||
|
||
if not ( | ||
spatial_subjects := list( | ||
filter( | ||
lambda subject: subject.kind == "Dublin Core; Spatial", subjects # type: ignore[arg-type] | ||
) | ||
) | ||
): | ||
return fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that I see the logic here, refining a list of potential subjects down using filter
+ a lambda, but perhaps this could be a simpler list comprehension?
# early return; handles edge case where subjects is value None and would return as such from .get()
if not fields.get("subjects"):
return fields
# filters list, potentially to empty list which is okay
spatial_subjects = [
subject
for subject in fields.get("subjects", []) # defaults to empty list
if subject.kind == "Dublin Core; Spatial"
and subject.value is not None
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree on list comprehension, that's what we've used throughout the app and I find it more readable as well
d47375c
to
df3ab26
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Purpose and background context
Map
Aardvark.dct_spatial_sm
toLocations@kind="Place Name"
to support the use of place namesfor location-based searching for GeoData.
Note:
MITAardvark
is the only source transformer that writesSubject@kind="Dublin Core; Spatial"
(i.e., whatAardvark.dct_spatial_sm
currently maps to), so the updated mapping is only applied to this transformer.How can a reviewer manually see the effects of these changes?
Temporarily set AWS credentials for
TimdexManagers
forDev1
in your terminal.Transform
gismit
recordstransmogrifier
:output/output_gismit.json
. Lines 67-88 should read:Transform
gisogm
records (a partial run, at least!)transmogrifier
:output/output_gisogm.json
. Lines 49-66 should read:Includes new or updated dependencies?
NO
Changes expectations for external applications?
NO
What are the relevant tickets?
https://mitlibraries.atlassian.net/browse/GDT-217
Developer
Code Reviewer(s)