TG2-VALIDATION_COUNTRYCODE_NOTEMPTY #98
Comments
Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: |
This test should probably use the same word "EMPTY" as #20 instead of NULL. |
Need to add somewhere (Expected Response) a reference to ISO 3166. I have added a reference in the References. |
Edited your comment (odd that you can) to 3166. |
Thanks @Tasilee - was just about to make that correction. |
Looking at this one again, we aren't checking for a valid dwc:countryCode, only that it is not EMPTY. A reference to ISO 3166 is fine, but isn't needed in Expected response. |
Agreed. |
…st current (2023-06-09) SPACE test descriptions. Adding ProvidesVersion (and Specification) annotations. Removing now empty file stubs for checked methods. Addressed tdwg/bdq#98 VALIDATION_COUNTRYCODE_NOTEMPTY
Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated" |
By including this test in CORE we are asserting that any data from the high seas is not fit for any of the use cases that include this test. We need some specific recommendation for handling data from the High Seas. Country code, using the ISO list, should be empty for data from the high seas. This test needs some way to accommodate that, so that data from the high seas can be fit for use. |
I agree @chicoreus: CORE suggests a universal use case. Does this raise the need for two other use cases - terrestrial and marine ecology? We could set this test as Supplementary for the terrestrial domain, and optionally generate an equivalent for the marine domain (using dwc:waterBody?). As you suggest, we may be able to accommodate this by trying to detect marine domains (dwc:waterBody, dwc:decimalLatitude and dwc:decimalLongitude, dwc:minimumDepthInMeters and dwc:maximumDepthInMeters... or ?). The simplest ER would be something like "COMPLIANT if dwc:countryCode is NOT_EMPTY, or if any of dwc:waterBody, dwc:minimumDepthInMeters or dwc:maximumDepthInMeters are NOT_EMPTY..."? |
dwc:waterBody includes rivers and lakes etc. which are inside countries. I think we also decided sometime earlier, that waterbody in TGN was unworkable. |
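A minimal sketch of the Expected Response proposed above, together with the objection to it. The function and constant names are hypothetical, and the emptiness check (missing, or whitespace only) is an assumed reading of NOT_EMPTY rather than the formal definition.

```python
from typing import Dict, Optional

Record = Dict[str, Optional[str]]

# Terms the proposal would treat as hints of a marine record (from the comment above).
MARINE_HINT_TERMS = (
    "dwc:waterBody",
    "dwc:minimumDepthInMeters",
    "dwc:maximumDepthInMeters",
)


def is_empty(value: Optional[str]) -> bool:
    """Assumed emptiness check: missing, or nothing but whitespace."""
    return value is None or value.strip() == ""


def proposed_countrycode_response(record: Record) -> str:
    """COMPLIANT if countryCode is not empty, or any marine hint term is not empty."""
    if not is_empty(record.get("dwc:countryCode")):
        return "COMPLIANT"
    if any(not is_empty(record.get(term)) for term in MARINE_HINT_TERMS):
        return "COMPLIANT"
    return "NOT_COMPLIANT"


# The objection: a lake record inside a country, with no country code, would also pass.
lake_record = {"dwc:countryCode": "", "dwc:waterBody": "a freshwater lake"}
print(proposed_countrycode_response(lake_record))
```

Under this sketch any value in dwc:waterBody masks a missing country code, which is why the term cannot by itself separate high seas records from inland or coastal ones.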
On Tue, 13 Aug 2024 15:57:56 -0700 Lee Belbin ***@***.***> wrote:
> "COMPLIANT if dwc:countryCode is NOT_EMPTY, or if any of
> dwc:waterBody, dwc:minimumDepthInMeters or dwc:maximumDepthInMeters
> are NOT_EMPTY..."?
dwc:waterBody does not help us. Parts of the Atlantic Ocean are high seas, parts are within the EEZs of various countries, and similarly for many marine water bodies at many hierarchical levels...
|
The UN/LOCODE system uses "XZ" to represent international waters or high seas. This is not an official ISO country code but is commonly used in logistics and transportation systems. I think this could be a good solution for covering fitness for use of data from the high seas. |
Also, ZZ is an often-used user-defined ISO code taken to mean "unknown". This would apply to situations where the location is unknown (i.e., not found or explicitly stated as unknown) as well as situations where the location is known but cannot be assigned to a single country code (e.g., "Argentina/Uruguay"). |
If we had dwc:decimalLatitude and dwc:decimalLongitude, we may be able to use the shapefile download of country+EEZ at https://www.marineregions.org/downloads.php. We could set INTERNAL_PREREQUISITES_NOT_MET if we didn't have latitude and longitude. Just a long shot. If we can't do something like this, then I guess it is Immature/Incomplete until a dwc:countryCode value of "XZ" becomes widely used? |
I don't see a problem - we are not checking against Country Codes with this test - just checking if it has something in the field or not. Where we look at Standard etc. we could check against "country codes + XZ" and add a note about XZ |
As this test now stands, I agree with @chicoreus in that we will be wrongfully returning NOT_COMPLIANT for any 'high seas' records. As this area is more than half of the planet, we need to take it seriously. We therefore have three options
I am slightly inclined to (3). |
I disagree - this test - like all other tests for NOTEMPTY - is only checking if there is a value in that field - it makes no assumption on why it is empty. It is a simple YES/NO test. |
@ArthurChapman, it is a concern here as well. For #98, if we
aspirationally assert that High Seas should use "XZ" as the country
code, then no problem, dwc:countryCode is expected to contain a value,
and the absence of value indicates an absence of quality.
However, if we don't take that position, and take the current Darwin Core guidance, where dwc:countryCode is expected to be left empty for the High Seas, then all data from the High Seas will, according to this test, lack quality, because it lacks a value.
That's the nature of the problem for #98.
On Wed, 25 Sep 2024 15:12:07 -0700 Arthur Chapman ***@***.***> wrote:
> @Tasilee - I think what you are saying applies to Tests #73 and #62,
> not this test. In those tests I think they could be worded
> (especially #73) to include "or XZ ..." I'd have to look more
> closely at those two tests and possibly comment there rather than
> here.
|
I still don't see a problem, as it is still a valuable test. Many datasets would not hold both terrestrial and marine data, and we have a separate test for terrestrial/marine (that would include high seas). There are many tests that one could argue won't add quality in every case - I think we discussed that with at least one other test. But in many datasets it would add quality knowing this. In the NOTEMPTY tests we are testing one simple thing. We then have other tests that test for other things, and we could, as @tucotuco suggested under #73 (#73 (comment)), develop further tests for High Seas - my view is yes, we could do that - but let's leave that for after the Standard is published. Let's not continue adding and deleting tests at this stage. |
That is an easy posture to get behind at this point! |
I am very reluctant to release a suite of core tests that will assert
that a very large portion of the world's marine data is unfit for use
(for several Use Cases, spatial-temporal in particular). This is what
VALIDATION_COUNTRYCODE_NOTEMPTY is guaranteed to do, with no path to
resolving that, unless we assert that High Seas should use the value
"XZ" for dwc:countryCode. This gives a path to data having quality.
There is a fundamental problem that we have to solve here. Otherwise
the test suite is not itself usable. There are multiple possible
solutions. The simplest is to assert that high seas data should use XZ
for the country code. The second simplest is to exclude this test from
core, but this doesn't resolve the issue for the other country code
tests...
We are realizing a problem late in the game, but it is one we must
resolve.
|
Saying that the COUNTRYCODE is EMPTY does not say that the data is not fit for use. It depends on the use, and the user has to make that decision. Anyone working in the marine area knows that marine data would not have a Country Code. There are so many other tests that test for NOTEMPTY - saying they are EMPTY does not make them not fit for use: KINGDOM_NOTEMPTY, GEODETICDATUM_NOTEMPTY, EVENTDATE_NOTEMPTY. There are many other tests that return NOT_COMPLIANT that don't make the data NOT FIT FOR USE for many uses. Don't read too much into what each of the tests is doing and not doing. The EMPTY/NOTEMPTY tests are just that! There is, or there is not, something in the field. Other tests then do the next stages. Because we don't have a workflow and the tests are stand alone, in many cases that test alone won't tell you if the data is fit for your use. If we had a workflow order, you might do the MARINETERRESTRIAL test first and then only run this test on Terrestrial data, but we don't do that. I don't see that there is anything to resolve. If we make a change here, then we have to revisit nearly every other test, because similar arguments could be made for many of the tests. |
@Tasilee wrote: "There is a fundamental problem that we have to solve here. Otherwise the test suite is not itself usable." Put in the Notes that "This test will return 'NOT_COMPLIANT' for records in the "High Seas". We recommend that high seas data use the dwc:countryCode = XZ". I would strongly oppose moving this and similar tests out of CORE. |
On Wed, 25 Sep 2024 17:30:53 -0700 Arthur Chapman ***@***.***> wrote:
> By saying that the COUNTRYCODE is EMPTY does not say that the data is
> Not fit for use.
That is exactly the semantics of NOT_COMPLIANT.
> It depends on the use and the user has to make that
> decision.
The user is free to compose their own use cases. We are asserting that Spatial-Temporal Patterns is a use case. VALIDATION_COUNTRYCODE_NOTEMPTY asserts NOT_COMPLIANT if dwc:countryCode is bdq:Empty. This means that any SingleRecord for which dwc:countryCode is bdq:Empty is not fit for use for that use case. This is exactly the semantics of the test and the use case.
No marine data from the high seas are fit for use for Spatial-Temporal Patterns (unless they incorrectly contain a country code).
Quality Assurance will exclude any data that is NOT_COMPLIANT for any validation in the use case. That is a fundamental of the framework.
We can't get around that by saying that users can compose tests in ways they like. We are asserting a use case (as the framework requires us to).
This is not a problem we can avoid. We must solve it. We can't claim it doesn't exist. We must solve it.
|
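The Quality Assurance argument above can be sketched as a filter: a record is kept for a use case only if no validation in that use case returns NOT_COMPLIANT. This is a minimal illustration, not the framework's implementation; the function names, the record layout and the single-test use case are assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]
Validation = Callable[[Record], str]


def countrycode_notempty(record: Record) -> str:
    """Assumed reading of the test: NOT_COMPLIANT when countryCode is missing or blank."""
    value = record.get("dwc:countryCode")
    if value is None or value.strip() == "":
        return "NOT_COMPLIANT"
    return "COMPLIANT"


def fit_for_use(record: Record, validations: List[Validation]) -> bool:
    """Quality Assurance: exclude a record if any validation is NOT_COMPLIANT."""
    return all(v(record) != "NOT_COMPLIANT" for v in validations)


# Hypothetical use case containing only the test under discussion.
use_case_validations = [countrycode_notempty]

high_seas_empty = {"dwc:countryCode": ""}   # current Darwin Core guidance
high_seas_xz = {"dwc:countryCode": "XZ"}    # recommendation discussed in this thread

print(fit_for_use(high_seas_empty, use_case_validations))  # False
print(fit_for_use(high_seas_xz, use_case_validations))     # True
```

With an empty dwc:countryCode the high seas record is filtered out of the use case; with "XZ" it passes, which is the path to quality described in the comments.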
On Wed, 25 Sep 2024 17:35:34 -0700 Arthur Chapman ***@***.***> wrote:
> Put in the notes that "This test will return 'NOT_COMPLIANT' for
> records in the "High Seas". We recommend that high seas data use the
> dwc:countryCode = XZ".
This is a workable solution.
> I would strongly oppose moving this and
> similar tests out of CORE.
Likewise. This test has value (particularly with requirements for documenting origin of material under the Convention on Biological Diversity).
|
I made a change to the Expected Response, from

COMPLIANT if dwc:countryCode is bdq:NotEmpty; otherwise NOT_COMPLIANT

to

COMPLIANT if dwc:countryCode is bdq:NotEmpty or has a value of "XZ"; otherwise NOT_COMPLIANT

and updated the Notes to:

This test will return 'NOT_COMPLIANT' for records on the "High seas" where dwc:countryCode is bdq:Empty. We recommend that data from the high seas (outside national jurisdictions) use dwc:countryCode = "XZ" and dwc:country = "High seas" until an agreement has been made. |
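A minimal sketch of the revised Expected Response, for illustration only. The function names are hypothetical, and `is_empty` is an assumed approximation of bdq:Empty (missing, or whitespace only), not the formal definition.

```python
from typing import Optional


def is_empty(value: Optional[str]) -> bool:
    """Assumed approximation of bdq:Empty: no value, or only whitespace."""
    return value is None or value.strip() == ""


def validation_countrycode_notempty(country_code: Optional[str]) -> str:
    """COMPLIANT if countryCode is not empty or has a value of "XZ"; otherwise NOT_COMPLIANT."""
    if not is_empty(country_code) or country_code == "XZ":
        return "COMPLIANT"
    return "NOT_COMPLIANT"


for code in ("AU", "XZ", "", None):
    print(repr(code), validation_countrycode_notempty(code))
```

Under this reading, "XZ" is itself a non-empty value, so the added clause documents the recommendation without changing which records pass; a high seas record with an empty dwc:countryCode still returns NOT_COMPLIANT, as the Notes state.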