generated from MITLibraries/python-cli-template
Commit
Normalize to controlled terms for dct_format_s and gbl_resourceType_sm fields

Why these changes are being introduced:
Two fields written to the outputted MITAardvark records, 'dct_format_s' and 'gbl_resourceType_sm', did not have their values controlled to suggested values from the Aardvark schema. This was revealed when attempting to map facet filters from the geo TIMDEX UI that rely on these fields. While updates are still required in Transmogrifier and the TIMDEX data model for where these values end up, normalizing them to controlled terms will benefit the quality of the data for facet aggregations.

How this addresses that need:
* sets of controlled terms have been added to records.controlled_terms
* method SourceRecord.get_controlled_dct_format_s_term() created to normalize values from source metadata
* method SourceRecord.get_controlled_gbl_resourceType_sm_terms() created to normalize values from source metadata
* these two new methods applied to FGDC, ISO19139, GBL1, and Aardvark source classes

Side effects of this change:
* Normalization of data for dct_format_s and gbl_resourceType_sm fields

Relevant ticket(s):
* https://mitlibraries.atlassian.net/browse/GDT-195
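The commit message names SourceRecord.get_controlled_dct_format_s_term() but its body is not visible in this part of the diff. As a rough illustration only, normalizing a single-valued field against a controlled set could look like the sketch below. The function name `normalize_format`, the small `FORMAT_TERMS` subset, and the case-insensitive matching rule are all assumptions made for this example, not the commit's actual implementation.

```python
from typing import Optional

# Hypothetical subset of controlled format terms, for illustration only;
# the commit keeps the full set in records.controlled_terms.
FORMAT_TERMS = {"GeoTIFF", "Shapefile", "GeoJSON", "TIFF", "PDF"}

# Map each lowercased term to its canonical spelling so that lookups can be
# case-insensitive (an assumed matching rule, not confirmed by the commit).
_FORMAT_LOOKUP = {term.lower(): term for term in FORMAT_TERMS}


def normalize_format(value: Optional[str]) -> Optional[str]:
    """Return the controlled term matching `value`, or None if nothing matches."""
    if not value:
        return None
    return _FORMAT_LOOKUP.get(value.strip().lower())
```

Returning None for unmatched values keeps uncontrolled strings out of the facet; whether the real method drops, logs, or passes through such values is not stated in the commit message.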
Showing 14 changed files with 330 additions and 16 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
"""harvester.records.controlled_terms""" | ||
|
||
# Controlled terms for Aardvark field: "dct_format_s" | ||
# https://opengeometadata.org/ogm-aardvark/#format | ||
DCT_FORMAT_S_OGM_TERMS = { | ||
"ArcGRID", | ||
"CD-ROM", | ||
"DEM", | ||
"DVD-ROM", | ||
"Feature Class", | ||
"Geodatabase", | ||
"GeoJPEG", | ||
"GeoJSON", | ||
"GeoPackage", | ||
"GeoPDF", | ||
"GeoTIFF", | ||
"JPEG", | ||
"JPEG2000", | ||
"KML", | ||
"KMZ", | ||
"LAS", | ||
"LAZ", | ||
"Mixed", | ||
"MrSID", | ||
"PDF", | ||
"PNG", | ||
"Pulsewaves", | ||
"Raster Dataset", | ||
"Shapefile", | ||
"SQLite Database", | ||
"Tabular Data", | ||
"TIFF", | ||
} | ||
|
||
# https://opengeometadata.org/ogm-aardvark/#resource-type-values-loc | ||
# note: suggested most applicable to scanned maps | ||
GBL_RESOURCETYPE_SM_LOC_TERMS = { | ||
"Aerial photographs", | ||
"Aerial views", | ||
"Aeronautical charts", | ||
"Armillary spheres", | ||
"Astronautical charts", | ||
"Astronomical models", | ||
"Atlases", | ||
"Bathymetric maps", | ||
"Block diagrams", | ||
"Bottle-charts", | ||
"Cadastral maps", | ||
"Cartographic materials", | ||
"Cartographic materials for people with visual disabilities", | ||
"Celestial charts", | ||
"Celestial globes", | ||
"Census data", | ||
"Children's atlases", | ||
"Children's maps", | ||
"Comparative maps", | ||
"Composite atlases", | ||
"Digital elevation models", | ||
"Digital maps", | ||
"Early maps", | ||
"Ephemerides", | ||
"Ethnographic maps", | ||
"Fire insurance maps", | ||
"Flow maps", | ||
"Gazetteers", | ||
"Geological cross-sections", | ||
"Geological maps", | ||
"Globes", | ||
"Gores (Maps)", | ||
"Gravity anomaly maps", | ||
"Index maps", | ||
"Linguistic atlases", | ||
"Loran charts", | ||
"Manuscript maps", | ||
"Mappae mundi", | ||
"Mental maps", | ||
"Meteorological charts", | ||
"Military maps", | ||
"Mine maps", | ||
"Miniature maps", | ||
"Nautical charts", | ||
"Outline maps", | ||
"Photogrammetric maps", | ||
"Photomaps", | ||
"Physical maps", | ||
"Pictorial maps", | ||
"Plotting charts", | ||
"Portolan charts", | ||
"Quadrangle maps", | ||
"Relief models", | ||
"Remote-sensing maps", | ||
"Road maps", | ||
"Statistical maps", | ||
"Stick charts", | ||
"Strip maps", | ||
"Thematic maps", | ||
"Topographic maps", | ||
"Tourist maps", | ||
"Upside-down maps", | ||
"Wall maps", | ||
"World atlases", | ||
"World maps", | ||
"Worm's-eye views", | ||
"Zoning maps", | ||
} | ||
|
||
# https://opengeometadata.org/ogm-aardvark/#resource-type-values-ogm | ||
# note: suggested most applicable to geospatial data | ||
GBL_RESOURCETYPE_SM_OGM_TERMS = { | ||
"Annotations", | ||
"Basemaps", | ||
"LiDAR", | ||
"Line data", | ||
"Mesh data", | ||
"Multi-spectral data", | ||
"Oblique photographs", | ||
"Point cloud data", | ||
"Point data", | ||
"Polygon data", | ||
"Raster data", | ||
"Satellite imagery", | ||
"Streetview photographs", | ||
"Table data", | ||
} | ||
|
||
# Controlled terms for Aardvark field: "gbl_resourceType_sm" | ||
# note: controlled terms are allowed from LOC or OGM terms | ||
GBL_RESOURCETYPE_SM_TERMS = GBL_RESOURCETYPE_SM_LOC_TERMS.union( | ||
GBL_RESOURCETYPE_SM_OGM_TERMS | ||
) |
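
Because gbl_resourceType_sm is multi-valued and accepts terms from either vocabulary, a normalizer for it would plausibly check each source value against the union of the two sets. The sketch below shows one way that might work. The names `normalize_resource_types`, `LOC_TERMS`, and `OGM_TERMS`, the trimmed-down term subsets, and the choice to deduplicate and sort the result are assumptions for illustration; the commit's own get_controlled_gbl_resourceType_sm_terms() is not shown in this file.

```python
from typing import Iterable, List

# Hypothetical subsets of the two vocabularies, for illustration only.
LOC_TERMS = {"Atlases", "Topographic maps", "Nautical charts"}
OGM_TERMS = {"Point data", "Polygon data", "Raster data"}

# A term from either vocabulary is acceptable, so match against the union.
RESOURCE_TYPE_TERMS = LOC_TERMS | OGM_TERMS

# Lowercased term -> canonical spelling, for case-insensitive matching
# (an assumed rule, not confirmed by the commit).
_LOOKUP = {term.lower(): term for term in RESOURCE_TYPE_TERMS}


def normalize_resource_types(values: Iterable[str]) -> List[str]:
    """Return the sorted, deduplicated controlled terms found in `values`."""
    matched = set()
    for value in values:
        key = value.strip().lower() if value else ""
        if key in _LOOKUP:
            matched.add(_LOOKUP[key])
    return sorted(matched)
```

For example, `normalize_resource_types(["polygon data", "Atlases ", "bogus"])` keeps the two recognized terms in canonical spelling and drops the unrecognized one. A stray trailing space inside a controlled term itself would defeat this kind of lookup, which is why the entries in the sets need to be exact.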