Data Quality Flags

Alex Thompson edited this page Mar 23, 2016 · 21 revisions

Purpose

This document describes how iDigBio identifies known data quality issues in ingested specimen data and represents them in the iDigBio Search API. During ingestion, iDigBio often encounters data that are missing, factually incorrect, or out of compliance with metadata standards and controlled vocabularies. To facilitate indexing, iDigBio corrects or fills in these values and flags the affected records in the Search API. For example, taxonomic names are added from the GBIF Backbone Taxonomy.

Flags

The table below describes the flags currently used by iDigBio:

Flag Definition
datecollected_bounds Date Collected was out of bounds (outside the range 1700-01-02 to the date of indexing).
dwc_basisofrecord_paleo_conflict Basis of Record was not FossilSpecimen, but the record contains paleo context terms.
dwc_class_added Darwin Core Class Added. http://terms.tdwg.org/wiki/dwc:class
dwc_class_replaced Darwin Core Class Corrected.
dwc_continent_added Darwin Core Continent Added. http://terms.tdwg.org/wiki/dwc:continent
dwc_continent_replaced Darwin Core Continent Corrected.
dwc_country_added Darwin Core Country Added. http://terms.tdwg.org/wiki/dwc:country
dwc_country_replaced Darwin Core Country Corrected.
dwc_kingdom_added Darwin Core Kingdom Added. http://terms.tdwg.org/wiki/dwc:kingdom
dwc_kingdom_replaced Darwin Core Kingdom Corrected.
dwc_order_added Darwin Core Order Added. http://terms.tdwg.org/wiki/dwc:order
dwc_order_replaced Darwin Core Order Corrected.
dwc_phylum_added Darwin Core Phylum Added. http://terms.tdwg.org/wiki/dwc:phylum
dwc_phylum_replaced Darwin Core Phylum Corrected.
dwc_stateprovince_replaced Darwin Core State or Province Corrected.
geopoint_0_coord Geographic Coordinate had literal '0' values.
geopoint_bounds Geographic Coordinate was out of bounds.
geopoint_datum_error Geographic Coordinate has Invalid Geodetic Datum.
geopoint_datum_missing Geographic Coordinate Missing Geodetic Datum (Assumed to be WGS84).
geopoint_low_precision Geographic Coordinate has Low Precision.
geopoint_pre_flip Prior to examining other factors, the magnitude of latitude was determined to be greater than 180, and the longitude was less than 90, so their values were swapped.
geopoint_similar_coord Geographic Coordinate had similar latitude and longitude (+/- lat == +/- lon).
idigbio_isocountrycode_added iDigBio ISO 3166-1 alpha-3 Country Code Added (from the iDigBio correction table).
rev_geocode_both_sign Geographic Coordinate had its Latitude and Longitude negated to place it in correct country.
rev_geocode_corrected The reverse geocoding process was able to find a coordinate operation that placed the point within the stated country.
rev_geocode_eez The reverse geocode does not fall within the land borders of a country, but does fall inside a country's exclusive economic zone water boundary (approx. 200 nautical miles from shore).
rev_geocode_eez_corrected The reverse geocoding process was able to find a coordinate operation that placed the point within the stated country's exclusive economic zone.
rev_geocode_failure The point was not able to be reverse geocoded to any country.
rev_geocode_flip Geographic Coordinate had its Latitude and Longitude swapped to place it in correct country.
rev_geocode_flip_both_sign Geographic Coordinate had its Latitude and Longitude both swapped and negated to place it in correct country.
rev_geocode_flip_lat_sign Geographic Coordinate had its Latitude and Longitude swapped, and its Latitude negated to place it in correct country.
rev_geocode_flip_lon_sign Geographic Coordinate had its Latitude and Longitude swapped, and its Longitude negated to place it in correct country.
rev_geocode_lat_sign Geographic Coordinate had its Latitude negated to place it in correct country.
rev_geocode_lon_sign Geographic Coordinate had its Longitude negated to place it in correct country.
rev_geocode_mismatch Geographic Coordinate did not reverse geocode to correct country.
scientificname_added Scientific name added by concatenating genus and species.
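
The rev_geocode_* correction flags in the table each name one coordinate operation (negate, swap, or both). The sketch below illustrates how such operations could be tried in turn until one places the point in the stated country. It is not iDigBio's implementation: the function names, the order in which operations are tried, the `locate` callback, and the toy bounding box are all assumptions made for illustration. It also assumes that for the flip_*_sign flags the swap is applied first and the sign change second.

```python
from typing import Callable, Dict, Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude)

# One candidate operation per correction flag listed in the table.
# Assumption: for flip_*_sign, swap first, then negate the resulting value.
OPERATIONS: Dict[str, Callable[[float, float], Coord]] = {
    "rev_geocode_lat_sign": lambda lat, lon: (-lat, lon),
    "rev_geocode_lon_sign": lambda lat, lon: (lat, -lon),
    "rev_geocode_both_sign": lambda lat, lon: (-lat, -lon),
    "rev_geocode_flip": lambda lat, lon: (lon, lat),
    "rev_geocode_flip_lat_sign": lambda lat, lon: (-lon, lat),
    "rev_geocode_flip_lon_sign": lambda lat, lon: (lon, -lat),
    "rev_geocode_flip_both_sign": lambda lat, lon: (-lon, -lat),
}


def in_bounds(lat: float, lon: float) -> bool:
    """Return True if the pair is a valid latitude/longitude."""
    return -90 <= lat <= 90 and -180 <= lon <= 180


def find_correction(
    lat: float,
    lon: float,
    stated_country: str,
    locate: Callable[[float, float], Optional[str]],
) -> Optional[Tuple[str, Coord]]:
    """Try each operation; return (flag, corrected point) for the first match.

    `locate` is a hypothetical reverse geocoder returning a country name
    or None. Returns None when no operation lands in the stated country.
    """
    for flag, operation in OPERATIONS.items():
        candidate = operation(lat, lon)
        if in_bounds(*candidate) and locate(*candidate) == stated_country:
            return flag, candidate
    return None


def toy_locate(lat: float, lon: float) -> Optional[str]:
    """Toy stand-in for a reverse geocoder: one rough bounding box."""
    if 24 <= lat <= 50 and -125 <= lon <= -66:
        return "usa"
    return None


# A point recorded with a positive longitude that should have been negative.
print(find_correction(40.0, 100.0, "usa", toy_locate))
```

Per the table, a record corrected this way would also carry rev_geocode_corrected, and a record for which no operation succeeds would carry rev_geocode_mismatch.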

Query Examples

Searching for records flagged with scientificname_added:

{
  "flags":"scientificname_added"
}
http://search.idigbio.org/v2/search/records?rq={%22flags%22:%22scientificname_added%22}

Searching a single recordset for records flagged with scientificname_added:

{
  "flags":"scientificname_added",
  "recordset":"c38b867b-05f3-4733-802e-d8d2d3324f84"
}
http://search.idigbio.org/v2/search/records?rq={%22flags%22:%22scientificname_added%22,%22recordset%22:%22c38b867b-05f3-4733-802e-d8d2d3324f84%22}
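
The URLs above can also be built programmatically rather than percent-encoded by hand. The following is a minimal sketch using only the Python standard library; the endpoint and the `rq` parameter come from the examples above, while the helper name `build_search_url` is an assumption for illustration. Note that `urlencode` also percent-encodes the braces, colons, and commas, so the result looks different from the hand-written URLs but decodes to the same JSON query.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the query examples above.
SEARCH_ENDPOINT = "http://search.idigbio.org/v2/search/records"


def build_search_url(rq: dict) -> str:
    """Serialize the query as compact JSON and pass it as the rq parameter."""
    rq_json = json.dumps(rq, separators=(",", ":"))
    return SEARCH_ENDPOINT + "?" + urlencode({"rq": rq_json})


# All records flagged with scientificname_added.
flag_url = build_search_url({"flags": "scientificname_added"})

# The same flag, restricted to one recordset.
recordset_url = build_search_url({
    "flags": "scientificname_added",
    "recordset": "c38b867b-05f3-4733-802e-d8d2d3324f84",
})

print(flag_url)
print(recordset_url)
```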