TG2-AMENDMENT_COUNTRYCODE_FROM_COORDINATES #73

Open
iDigBioBot opened this issue Jan 5, 2018 · 111 comments
Labels
Amendment; Completeness; CORE (TG2 CORE tests); ISO/DCMI STANDARD; Parameterized (Test requires a parameter); SPACE; Test (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT); TG2; VOCABULARY

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

| TestField | Value |
| --- | --- |
| GUID | 8c5fe9c9-4ba9-49ef-b15a-9ccd0424e6ae |
| Label | AMENDMENT_COUNTRYCODE_FROM_COORDINATES |
| Description | Proposes an amendment to the value of dwc:countryCode if dwc:decimalLatitude and dwc:decimalLongitude fall within a boundary from the bdq:countryShapes that is attributable to a single valid country code. |
| TestType | Amendment |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:countryCode, dwc:decimalLatitude, dwc:decimalLongitude |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either dwc:decimalLatitude or dwc:decimalLongitude is EMPTY, or if dwc:countryCode is NOT_EMPTY; FILLED_IN dwc:countryCode if dwc:decimalLatitude and dwc:decimalLongitude fall within a boundary from the bdq:countryShapes that is attributable to a single valid country code; otherwise NOT_AMENDED. |
| Data Quality Dimension | Completeness |
| Term-Actions | COUNTRYCODE_FROM_COORDINATES |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "ADM1 boundaries spatial UNION with Exclusive Economic Zones" {[https://gadm.org] spatial UNION [https://marineregions.org]} |
| Specification Last Updated | 2024-08-18 |
| Examples | [dwc:decimalLatitude="-25.23", dwc:decimalLongitude="135.43", dwc:countryCode="": Response.status=FILLED_IN, Response.result=dwc:countryCode="AU", Response.comment="dwc:decimalLatitude and dwc:decimalLongitude contain interpretable values"]<br>[dwc:decimalLatitude="-38.280937", dwc:decimalLongitude="72.047790", dwc:countryCode="": Response.status=NOT_AMENDED, Response.result="", Response.comment="Coordinates do not fall in the boundary of any country"] |
| Source | ALA, GBIF, iDigBio |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | This amendment simply fills dwc:countryCode from a lookup of dwc:decimalLatitude and dwc:decimalLongitude. dwc:coordinateUncertaintyInMeters and dwc:coordinatePrecision (if present) imply a buffer around the provided coordinates. Likewise, country polygons cannot be 100% accurate at all scales (Dooley 2005), so a spatial buffer of the country boundaries is also justified. Taking spatial buffers into account does, however, greatly complicate the logic and the implementation of this and related tests. In this test, detection of multiple country codes by sampling within the buffer, while possible, is not considered. |
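The Expected Response above can be read as a short decision procedure. The following is only a minimal sketch of that reading, not the TG2 reference implementation: the function name, the `Status` enum, the `lookup` callback and the bounding-box `toy_lookup` are all hypothetical stand-ins, and a real bdq:sourceAuthority would be a spatial data store or service rather than a bounding box.

```python
from enum import Enum
from typing import Callable, Optional, Set, Tuple


class Status(Enum):
    EXTERNAL_PREREQUISITES_NOT_MET = "EXTERNAL_PREREQUISITES_NOT_MET"
    INTERNAL_PREREQUISITES_NOT_MET = "INTERNAL_PREREQUISITES_NOT_MET"
    FILLED_IN = "FILLED_IN"
    NOT_AMENDED = "NOT_AMENDED"


def is_empty(value: Optional[str]) -> bool:
    """Assumption: None and whitespace-only strings count as EMPTY."""
    return value is None or value.strip() == ""


def countrycode_from_coordinates(
    decimal_latitude: Optional[str],
    decimal_longitude: Optional[str],
    country_code: Optional[str],
    lookup: Optional[Callable[[float, float], Set[str]]],
) -> Tuple[Status, Optional[str]]:
    """Return (status, proposed dwc:countryCode or None)."""
    # A missing lookup models an unavailable bdq:sourceAuthority.
    if lookup is None:
        return Status.EXTERNAL_PREREQUISITES_NOT_MET, None
    if (is_empty(decimal_latitude) or is_empty(decimal_longitude)
            or not is_empty(country_code)):
        return Status.INTERNAL_PREREQUISITES_NOT_MET, None
    try:
        lat, lon = float(decimal_latitude), float(decimal_longitude)
    except ValueError:
        # The current wording only names EMPTY coordinates as an internal
        # prerequisite failure, so non-numeric values fall to "otherwise".
        return Status.NOT_AMENDED, None
    codes = lookup(lat, lon)
    # Only a single, unambiguous country code is filled in.
    if len(codes) == 1:
        return Status.FILLED_IN, next(iter(codes))
    return Status.NOT_AMENDED, None


def toy_lookup(lat: float, lon: float) -> Set[str]:
    """Hypothetical stand-in: a crude bounding box for mainland Australia."""
    if -39.2 <= lat <= -10.6 and 113.0 <= lon <= 153.7:
        return {"AU"}
    return set()


print(countrycode_from_coordinates("-25.23", "135.43", "", toy_lookup))
print(countrycode_from_coordinates("-38.280937", "72.047790", "", toy_lookup))
```

The two calls mirror the two Examples in the table: the first proposes "AU", the second leaves dwc:countryCode unchanged.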
@iDigBioBot
Collaborator Author

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet:
Only useful if performed AFTER decimalLat and decimalLong interpretation.

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
@pz amendment sequence is indeed important. Tianhong Song has a paper on this in the context of prerequisites for workflow actors.

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
Implementation requires guidance on how to handle marine material inside a country's exclusive economic zone.

@iDigBioBot
Collaborator Author

Comment by Paul Morris (@chicoreus) migrated from spreadsheet:
Should include dwc:coordinateUncertaintyInMeters. Also see Lee's note in Principles about some geographic tests needing buffers.

@ArthurChapman
Collaborator

Change Country to Country Codes - name and elsewhere

@godfoder godfoder changed the title TG2-AMENDMENT_COUNTRY_FROM_COORDINATES TG2-AMENDMENT_COUNTRYCODE_FROM_COORDINATES Jan 17, 2018
@ArthurChapman ArthurChapman added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Jan 17, 2018
@tucotuco
Member

Agreed at TDWG 2018 DQIG meeting that the name TG2-AMENDMENT_COUNTRYCODE_FROM_COORDINATES is satisfactory.

@tucotuco
Member

Similar to the problem raised in Issue #185, this test mentions a source authority that cannot deliver the AMENDMENT. It is also silent on the authority for the geometries for the country codes. To me, the bdq:sourceAuthority should be the GBIF reverse geocoding API (https://github.com/gbif/geocode), coming in 2020. It will be based on Natural Earth, GADM, OpenStreetMap, Exclusive Economic Zones and more. The documentation says it will be for internal GBIF use, but @timrobertson100 says that he expects the API to be exposed.

@timrobertson100
Member

timrobertson100 commented Apr 14, 2020

The geocode service is available today, and provides a lookup based on the given coordinate.

For example latitude 51.0 and longitude 1.0 yields this response:

[
  {
    "id": "80",
    "type": "Political",
    "source": "http://www.naturalearthdata.com",
    "title": "United Kingdom",
    "isoCountryCode2Digit": "GB"
  },
  {
    "id": "212",
    "type": "EEZ",
    "source": "http://vliz.be/vmdcdata/marbound/",
    "title": "United Kingdom",
    "isoCountryCode2Digit": "GB"
  }
]

Because boundaries don't align (resolution of the polygons), we buffer the search which is why multiple results can be returned.
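Since a buffered search can return several hits, a consumer has to decide when the hits still point to a single country. The sketch below parses a copy of the sample response shown above and accepts a code only when all hits agree. The function name is hypothetical, and the "all hits agree" rule is an assumption about how a client might use the response, not a description of GBIF's own logic.

```python
import json
from typing import Optional

# Copy of the sample response for latitude 51.0, longitude 1.0.
SAMPLE_RESPONSE = """
[
  {"id": "80", "type": "Political", "source": "http://www.naturalearthdata.com",
   "title": "United Kingdom", "isoCountryCode2Digit": "GB"},
  {"id": "212", "type": "EEZ", "source": "http://vliz.be/vmdcdata/marbound/",
   "title": "United Kingdom", "isoCountryCode2Digit": "GB"}
]
"""


def single_country_code(response_text: str) -> Optional[str]:
    """Return the country code only if every hit carries the same code."""
    hits = json.loads(response_text)
    codes = {
        hit["isoCountryCode2Digit"]
        for hit in hits
        if hit.get("isoCountryCode2Digit")
    }
    # Zero hits or conflicting codes are treated as ambiguous.
    return next(iter(codes)) if len(codes) == 1 else None


print(single_country_code(SAMPLE_RESPONSE))
```

Here the Political and EEZ hits both carry "GB", so the result is unambiguous. A point near a land border could return two different codes, in which case the sketch returns None.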

To reduce WS traffic, we also encode the database into an image with dictionary encoded colors (i.e. by seeing a non-black or white colour, you can refer to the dictionary to know the country).

Today the service only has EEZ and NaturalEarth files, but can (will) be extended.

When a record has a stated country and coordinates, we verify that the combination seems reasonable and, if not, flip the coordinates around and negate them, "hunting" for a match. This is because in many cases the negative sign is omitted or the coordinates are swapped. All of this happens after a reprojection to WGS84 if necessary.
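The "hunting" step can be pictured as trying a fixed set of sign and axis transformations and keeping those that land in the stated country. This is a sketch of the idea only, under the assumption that exactly eight variants are tried. It is not GBIF's code, and `candidate_coordinates`, `hunt_for_match` and the bounding-box `toy_lookup` are hypothetical names.

```python
from typing import Callable, Iterator, List, Set, Tuple


def candidate_coordinates(lat: float, lon: float) -> Iterator[Tuple[str, float, float]]:
    """Yield (label, latitude, longitude) for each transformation tried."""
    yield "as given", lat, lon
    yield "negated latitude", -lat, lon
    yield "negated longitude", lat, -lon
    yield "negated both", -lat, -lon
    # Swapped variants: the supplied longitude is treated as the latitude.
    yield "swapped", lon, lat
    yield "swapped, negated latitude", -lon, lat
    yield "swapped, negated longitude", lon, -lat
    yield "swapped, negated both", -lon, -lat


def hunt_for_match(
    lat: float,
    lon: float,
    stated_code: str,
    lookup: Callable[[float, float], Set[str]],
) -> List[str]:
    """Return labels of the transformations that fall in the stated country."""
    matches = []
    for label, c_lat, c_lon in candidate_coordinates(lat, lon):
        # Skip candidates that are not valid geographic coordinates.
        if abs(c_lat) > 90 or abs(c_lon) > 180:
            continue
        if stated_code in lookup(c_lat, c_lon):
            matches.append(label)
    return matches


def toy_lookup(lat: float, lon: float) -> Set[str]:
    """Hypothetical stand-in: a crude bounding box for mainland Australia."""
    if -39.2 <= lat <= -10.6 and 113.0 <= lon <= 153.7:
        return {"AU"}
    return set()


# Latitude recorded without its negative sign.
print(hunt_for_match(25.23, 135.43, "AU", toy_lookup))
```

An empty list means no variant matched the stated country. More than one match would itself be ambiguous and probably should not be corrected automatically.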

@tucotuco
Member

@timrobertson100 This is ideal. May we cite it as our bdq:sourceAuthority default?

@chicoreus
Collaborator

We are conflating authority with service with thesaurus in bdq:sourceAuthority. The authority for country codes is the ISO two-letter country code list. The GBIF service is a service that wraps Natural Earth plus (source?) EEZ layers, and it will change over time as layers are added (with versioning?). The thesaurus is the Natural Earth and EEZ layers. For implementation, I'd much rather use a local GIS data store containing Natural Earth data and some appropriate EEZ layer than consult a remote service: I want to use the same thesaurus, but not the service.

@tucotuco
Member

tucotuco commented Apr 14, 2020 via email

@Tasilee
Collaborator

Tasilee commented Apr 14, 2020

Thanks @timrobertson100 , @tucotuco and @chicoreus. I am the least able person in the group to provide wisdom but figure I would put my thoughts down, anyway.

As we have discussed, GBIF is likely, for the near future, to be a service location that aggregates thesauri and what we currently call 'bdq:sourceAuthority' (authorities).

Are the thesauri themselves 'source authorities'? If we are dependent on them as the reference, then in our context, I'd say yes. I agree with @chicoreus that the services associated with any source authority are a separate issue. That's why we use " if the bdq:sourceAuthority service ..."

To address @tucotuco 's question about how we reference, we either reference to the GBIF namespace end point (?) which references the relevant external source authority (e.g., ISO) in a standard way or we reference both the GBIF and the 'external' source authority. The former would be nice. Then there is a separate reference to the service associated with the thesaurus.

In the Expected response, we have agreed to use bdq:sourceAuthority [and service.] where there are implementation-dependent options or directly use the name of the source authority where there is no choice.

What we place in References and Notes, I defer to more appropriate authorities.

@ArthurChapman
Collaborator

My view is the same as @tucotuco's, as in his question to @timrobertson100:
"@timrobertson100 This is ideal. May we cite it as our bdq:sourceAuthority default"

If we can just add the GBIF Geocode Authority as our bdq:sourceAuthority, as we have done elsewhere, then I think this is our best option.

@chicoreus
Collaborator

The more I think about this, the less I like the idea of specifying a query endpoint as a sourceAuthority. This leaves the resolution of the shape files, buffering near borders, and changes over time as opaque to the consumer, and also forces implementations to use a service for a test that should not be implemented with remote service calls but with a local spatial data store.

We should specify three parameters: (1) A shape file for country boundaries. (2) A shape file for exclusive economic zones. (3) An explicit buffer for points that fall near country boundaries that takes into account the resolution of the shapefiles. In addition, we should explicitly state in the specification how points that fall into the buffer are to be handled (preferably by not asserting an amendment, probably with INTERNAL_PREREQUISITES_NOT_MET, point falls too near boundary of shape to determine placement). In addition, we need to consider coordinatePrecision, and how that as an uncertainty on the coordinate intersects with countries or buffers.
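To make the buffer handling concrete, here is a rough sketch of classifying a point against one boundary with an explicit buffer. It assumes the shape has already been projected to planar coordinates in meters and is a simple polygon given as a vertex list. The function names and the TOO_NEAR_BOUNDARY label are hypothetical, and a real implementation would more likely use a GIS library and geodesic distances.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]  # planar (x, y) in meters, an assumption


def point_in_polygon(p: Point, polygon: List[Point]) -> bool:
    """Ray-casting test for a simple polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through p?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest planar distance from p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection of p onto the line so it stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_boundary(p: Point, polygon: List[Point]) -> float:
    n = len(polygon)
    return min(
        distance_to_segment(p, polygon[i], polygon[(i + 1) % n]) for i in range(n)
    )


def classify(p: Point, polygon: List[Point], buffer_m: float) -> str:
    """Points within buffer_m of the boundary are reported as undetermined."""
    if distance_to_boundary(p, polygon) <= buffer_m:
        return "TOO_NEAR_BOUNDARY"
    return "INSIDE" if point_in_polygon(p, polygon) else "OUTSIDE"


# A 100 km square as a stand-in country shape, with a 3 km buffer.
square = [(0.0, 0.0), (100_000.0, 0.0), (100_000.0, 100_000.0), (0.0, 100_000.0)]
print(classify((50_000.0, 50_000.0), square, 3_000.0))
print(classify((1_000.0, 50_000.0), square, 3_000.0))
```

Under this scheme a point 1 km inside the border is not placed in the country at all, which matches the preference stated above for not asserting an amendment near boundaries.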

We need to be much, much more explicit in how edge cases are to be handled in any test that involves GIS data.

This amendment also needs to cover the case of FILLED_IN; AMENDED would only cover the case of an existing countryCode being altered based on the coordinates. We should consider if this test should ever assert AMENDED, or should restrict itself to FILLED_IN. I would generally be much more comfortable with asserting only FILLED_IN, as AMENDED could fix the wrong value (the coordinates, or their precision, or their error radius could be in error, but the country and countryCode be correct). This is particularly true near boundaries, where the error may lie in the resolution of the shape rather than either the coordinate or the countryCode.

@chicoreus
Collaborator

I would suggest:

EXTERNAL_PREREQUISITES_NOT_MET if an external source authority service or local spatial data store was not available; INTERNAL_PREREQUISITES_NOT_MET if the fields dwc:decimalLatitude, dwc:decimalLongitude are EMPTY or the dwc:decimalLatitude and dwc:decimalLongitude cannot be converted to the SRS used for queries to the spatial service or data store, or if the area represented by the dwc:decimalLatitude, dwc:decimalLongitude, and dwc:coordinatePrecision overlaps with a 3km buffer zone on any country or EEZ shape in the service or spatial data store; FILLED_IN if the value of dwc:countryCode was EMPTY and was unambiguously inferred from supplied dwc:decimalLatitude, dwc:decimalLongitude and dwc:coordinatePrecision falling within a single boundary defined by the combination of terrestrial and exclusive economic zone and not overlapping a 3km buffer around such boundary; otherwise NOT_CHANGED
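One way to read "the area represented by the dwc:decimalLatitude, dwc:decimalLongitude, and dwc:coordinatePrecision" is as a small cell of coordinatePrecision degrees centred on the point. The sketch below converts that cell to a radius in meters and checks it against the 3km buffer. The interpretation, the function names, and the constant of about 111,320 m per degree are all assumptions made for illustration.

```python
import math

# Approximate length of one degree of latitude in meters (an assumption;
# the true value varies slightly with latitude).
METERS_PER_DEGREE = 111_320.0


def precision_radius_m(decimal_latitude: float, coordinate_precision: float) -> float:
    """Half-diagonal, in meters, of a precision cell centred on the point."""
    half = coordinate_precision / 2.0
    north_south = half * METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = half * METERS_PER_DEGREE * math.cos(math.radians(decimal_latitude))
    return math.hypot(north_south, east_west)


def overlaps_buffer(
    distance_to_boundary_m: float,
    decimal_latitude: float,
    coordinate_precision: float,
    buffer_m: float = 3_000.0,
) -> bool:
    """True if the precision cell reaches into the buffer around a boundary."""
    radius = precision_radius_m(decimal_latitude, coordinate_precision)
    return distance_to_boundary_m - radius <= buffer_m


# A precision of 0.01 degrees at the equator is a radius of roughly 787 m.
print(round(precision_radius_m(0.0, 0.01)))
print(overlaps_buffer(3_500.0, 0.0, 0.01))
```

With these numbers a point 3.5 km from a border would still be rejected, because its precision cell reaches into the 3 km buffer.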

@ArthurChapman
Collaborator

I would stay with AMENDED and NOT_AMENDED

chicoreus added a commit that referenced this issue Mar 30, 2022
…ons as of 2022-03-29 export has fixes for errors in #73 and #68 found in f8ff0d4.
@Tasilee
Collaborator

Tasilee commented Apr 18, 2022

Changed "AMENDED" to "FILLED_IN" in accordance with discussions April 16.

@Tasilee
Collaborator

Tasilee commented Apr 21, 2022

@ArthurChapman and I have discussed my recent changes to

  • Expected Response (generic use of ISO 3166)
  • Source Authority (added gadm and ISO)
  • References (Do we duplicate the links to ISO and GADM?)

Also, this test no longer uses bdq:spatialBufferInMeters so the question needs to be asked if #50, #51 and #56. To be discussed at next Zoom. Also use of SRS? We need to be consistent.

@ArthurChapman
Collaborator

We have dwc:coordinatePrecision and dwc:geodeticDatum in the Information Elements but they do not occur in the Expected Response?

Should these be deleted from the Information Elements or is our Expected Response deficient?

Note that dwc:coordinatePrecision and dwc:coordinateUncertaintyInMeters occur in the Notes.

@ArthurChapman
Collaborator

in Expected Response bdq:sourceAuthority[countryShapes] changed to bdq:sourceAuthority[countryshapes] to conform with other similar terms under bdq:sourceAuthority

@Tasilee
Collaborator

Tasilee commented Jun 28, 2023

If we believe the Expected Response is correct, then dwc:coordinatePrecision and dwc:geodeticDatum should not be Information Elements. Were we thinking of a buffer?

@chicoreus
Collaborator

The notes suggest using dwc:coordinatePrecision and dwc:geodeticDatum to specify a buffer, but also invoke a buffer associated with the country boundary representation. But we don't list a buffer as an element in the specification. So, in essence, we say both that a buffer should be used and that a buffer is not used...

@Tasilee
Collaborator

Tasilee commented Jun 29, 2023

Here are the tests that include buffer

#50

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if one or more of dwc:decimalLatitude, dwc:decimalLongitude, or dwc:countryCode are EMPTY or invalid; COMPLIANT if the geographic coordinates fall on or within the boundary defined by the union of the boundary of the country from dwc:countryCode plus its Exclusive Economic Zone, if any, plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT.

#51

EXTERNAL_PREREQUISITES_NOT_MET if either bdq:sourceAuthority[taxonomyismarine] or bdq:sourceAuthority[geospatialland] are not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName was EMPTY or the marine/non-marine status of the taxon is not interpretable from bdq:sourceAuthority[taxonomyismarine] or the values of dwc:decimalLatitude or dwc:decimalLongitude are EMPTY; COMPLIANT if the taxon marine/non-marine status from bdq:sourceAuthority[taxonomyismarine] matches the marine/non-marine status of dwc:decimalLatitude and dwc:decimalLongitude on the boundaries given by bdq:sourceAuthority[geospatialland] plus an exterior buffer given by bdq:spatialBufferInMeters; otherwise NOT_COMPLIANT.

#56

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority was not available; INTERNAL_PREREQUISITES_NOT_MET if the values of dwc:decimalLatitude, dwc:decimalLongitude, or dwc:stateProvince are EMPTY or invalid; COMPLIANT if the geographic coordinates fall on or within the boundary from the bdq:sourceAuthority for the given dwc:stateProvince (after coordinate reference system transformations, if any, have been accounted for), or within the distance given by bdq:spatialBufferInMeters outside that boundary; otherwise NOT_COMPLIANT.

BUT, I don't think a buffer is justified in this AMENDMENT. We are not comparing two things. We are looking up dwc:countryCode from coordinates. Using dwc:coordinateUncertaintyInMeters or dwc:coordinatePrecision or dwc:geodeticDatum wouldn't help, unless the test matches to country - which it doesn't of course.

So, I am going to amend the specs accordingly, and the rest of you can argue the case...

@Tasilee
Collaborator

Tasilee commented Jun 29, 2023

Was this the intent?

Notes

This amendment simply fills dwc:countryCode from a lookup of dwc:decimalLatitude and dwc:decimalLongitude. dwc:coordinateUncertaintyInMeters and dwc:coordinatePrecision (if present) imply a buffer around the provided coordinates. Likewise, country polygons cannot be 100% accurate at all scales (Dooley 2005), so a spatial buffer of the country boundaries is also justified. Taking spatial buffers into account does, however, greatly complicate the logic and the implementation of this and related tests. In this test, detection of multiple country codes by sampling within the buffer, while possible, is not considered.

@ArthurChapman
Collaborator

ArthurChapman commented Jun 29, 2023

Another issue with this test (and the others where we have bdq:sourceAuthority[xxxx]): if these are separated into different terms in the Vocabulary, as suggested as a possibility in #152 (comment), then we would need to change Parameter(s) from bdq:sourceAuthority to bdq:sourceAuthority[countryshapes], and Source Authority to: bdq:sourceAuthority default = "ADM1 boundaries" [https://gadm.org] UNION with "EEZs" [https://marineregions.org]

ALSO, we have "bdq:sourceAuthority[countryCode]" in the Source Authority, but this term doesn't occur anywhere in the test (or any other test) and is not in the Vocabulary. Should this be bdq:sourceAuthority[countryshapes]?

@Tasilee
Collaborator

Tasilee commented Jun 30, 2023

I have updated the Source authority and description

@ArthurChapman
Collaborator

I don't think there is anything outstanding wrt NEEDS WORK - other than a decision on how we treat bdq:sourceAuthority[countryshapes]

@ArthurChapman
Collaborator

ArthurChapman commented Jul 4, 2023

Updated Expected Response, bdq:sourceAuthority and Description to replace bdq:sourceAuthority:[countryShapes] with bdq:countryShapes and Specification Last Updated

Changed Parameter(s) from bdq:sourceAuthority to bdq:countryShapes

@Tasilee
Collaborator

Tasilee commented Jul 4, 2023

I thought @tucotuco's suggestion was bdq:sourceAuthority_countryshapes (or maybe "countryShapes") etc? That way, we don't lose the source authority link.

@ArthurChapman
Collaborator

We don't use that for any other bdq: terms. But if all agree, we could change it. See the Vocab or #205 for all the terms we would have to change:

| bdq:annotationAlertIf | [https://github.com//issues/152] | |
| bdq:countryShapes | [https://github.com//issues/152] | |
| bdq:annotationAnnotationSystem | [https://github.com//issues/152] | |
| bdq:defaultGeodeticDatum | [https://github.com//issues/152] | |
| bdq:earliestValidDate | [https://github.com//issues/152] | |
| bdq:geospatialLand | [https://github.com//issues/152] | |
| bdq:includeEventDate | [https://github.com//issues/152] | |
| bdq:latestValidDate | [https://github.com//issues/152] | |
| bdq:maximumValidDepthInMeters | [https://github.com//issues/152] | |
| bdq:maximumValidElevationInMeters | [https://github.com//issues/152] | |
| bdq:minimumValidDepthInMeters | [https://github.com//issues/152] | |
| bdq:minimumValidElevationInMeters | [https://github.com//issues/152] | |
| bdq:sourceAuthority | [https://github.com//issues/152] | |
| bdq:spatialBufferInMeters | [https://github.com//issues/152] | |
| bdq:targetCRS | [https://github.com//issues/152] | |
| bdq:taxonomyIsMarine | [https://github.com//issues/152] | |

@Tasilee
Collaborator

Tasilee commented Jul 4, 2023

Thanks @ArthurChapman. The list includes terms that do relate to a bdq:sourceAuthority (annotations, country shapes/land, geodetic datum) but the rest don't really, unless you want to stretch the concept.

I have therefore changed my mind (backflip?) and think that the syntax like "bdq:countryShapes" seems expedient as all bdqs are listed under the 'heading' "Source Authority" anyway.

@Tasilee
Collaborator

Tasilee commented Jul 6, 2023

Aligned the Source Authority entries between #50 and this test

@Tasilee
Collaborator

Tasilee commented Jul 11, 2023

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "ADM1 boundaries" {[https://gadm.org] spatial UNION with "Exclusive Economic Zones" [https://marineregions.org]}

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jul 25, 2023
…-06-28) specifications. Addressing tdwg/bdq#73 AMENDMENT_COUNTRYCODE_FROM_COORDINATES adding implementation, adding a utility to lookup a country code from a point using the combined natural earth countries and EEZ data sets.  Adding unit tests and integration tests looping through a (mostly correct) list of center points of countries.
chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Jul 28, 2023
… newer coldfusion. Adding TODO notes for tdwg/bdq#73 based on failures with the validation data.  Cleaning up indentation for clarity.
@Tasilee
Collaborator

Tasilee commented Jul 31, 2023

In the light of #50 suggestion, changed Source Authority to

| Source Authority | bdq:sourceAuthority default = "ADM1 boundaries spatial UNION with Exclusive Economic Zones" {[https://gadm.org] spatial UNION [https://marineregions.org]} |
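The "spatial UNION" in this default means a point counts for a country if it falls in either that country's land boundary or its Exclusive Economic Zone. A toy sketch of that union follows. The bounding boxes are invented placeholders rather than GADM or Marine Regions data, and the layer structure and function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# (min_lat, min_lon, max_lat, max_lon); toy stand-ins for real polygons.
BBox = Tuple[float, float, float, float]
Layer = Dict[str, List[BBox]]

LAND_LAYER: Layer = {"AU": [(-39.2, 113.0, -10.6, 153.7)]}
EEZ_LAYER: Layer = {"AU": [(-47.0, 109.0, -8.0, 163.0)]}


def codes_in_layer(lat: float, lon: float, layer: Layer) -> Set[str]:
    """Country codes whose shapes in this layer contain the point."""
    return {
        code
        for code, boxes in layer.items()
        if any(
            min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
            for min_lat, min_lon, max_lat, max_lon in boxes
        )
    }


def lookup_union(lat: float, lon: float, layers: List[Layer]) -> Set[str]:
    """Union of the codes found in any layer (land boundaries plus EEZ)."""
    found: Set[str] = set()
    for layer in layers:
        found |= codes_in_layer(lat, lon, layer)
    return found


layers = [LAND_LAYER, EEZ_LAYER]
print(lookup_union(-25.23, 135.43, layers))  # on land
print(lookup_union(-42.0, 150.0, layers))    # offshore, inside the toy EEZ box
```

A result with exactly one code would support FILLED_IN, while an empty set or several codes would leave dwc:countryCode untouched.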

@Tasilee
Collaborator

Tasilee commented Jul 31, 2023

Changed Parameter(s) from bdq:countryShapes to bdq:sourceAuthority and presume we will leave it Parameterized.

@Tasilee
Collaborator

Tasilee commented Sep 18, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

@chicoreus chicoreus added the CORE TG2 CORE tests label Sep 18, 2023
@Tasilee
Collaborator

Tasilee commented Apr 16, 2024

Standardized reference to bdq:sourceAuthority in Expected Response to "EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available"

@Tasilee
Collaborator

Tasilee commented Aug 18, 2024

Changed Expected Response from

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either dwc:decimalLatitude or dwc:decimalLongitude is EMPTY or uninterpretable, or if dwc:countryCode is NOT_EMPTY; FILLED_IN dwc:countryCode if dwc:decimalLatitude and dwc:decimalLongitude fall within a boundary from the bdq:countryShapes that is attributable to a single valid country code; otherwise NOT_AMENDED.

to

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either dwc:decimalLatitude or dwc:decimalLongitude is EMPTY, or if dwc:countryCode is NOT_EMPTY; FILLED_IN dwc:countryCode if dwc:decimalLatitude and dwc:decimalLongitude fall within a boundary from the bdq:countryShapes that is attributable to a single valid country code; otherwise NOT_AMENDED.


7 participants