TG2-VALIDATION_COUNTRYSTATEPROVINCE_UNAMBIGUOUS #201

Open
Tasilee opened this issue Aug 28, 2022 · 15 comments
Labels
Conformance, CORE (TG2 CORE tests), Parameterized (Test requires a parameter), SPACE, Test (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT), TG2, Validation, VOCABULARY

Comments

@Tasilee
Collaborator

Tasilee commented Aug 28, 2022

| TestField | Value |
|---|---|
| GUID | d257eb98-27cb-48e5-8d3c-ab9fca4edd11 |
| Label | VALIDATION_COUNTRYSTATEPROVINCE_UNAMBIGUOUS |
| Description | Is the combination of the values of the terms dwc:country, dwc:stateProvince unique in the bdq:sourceAuthority? |
| TestType | Validation |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:country, dwc:stateProvince |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are bdq:Empty; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are unambiguously resolved to a single result with a child-parent relationship in the bdq:sourceAuthority and the entity matching the value of dwc:country in the bdq:sourceAuthority is an ISO 3166 country-like administrative entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Conformance |
| Term-Actions | COUNTRYSTATEPROVINCE_UNAMBIGUOUS |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]} |
| Specification Last Updated | 2024-09-18 |
| Examples | [dwc:country="Argentina", dwc:stateProvince="Rio Negro": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:country and dwc:stateProvince are unambiguous"]<br>[dwc:country="", dwc:stateProvince="WA": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:country and dwc:stateProvince are ambiguous. Matches Western Australia, Washington State (US)"] |
| Source | VertNet, Kurator |
| References | |
| Example Implementations (Mechanisms) | Kurator |
| Link to Specification Source Code | https://github.com/kurator-org/kurator-validation/blob/master/packages/kurator_dwca/workflows/dwca_geography_assessor.yaml |
| Notes | See table #95 (comment). A fail condition may arise from the content being internally inconsistent (not all of the information can be true at the same time), or from the vocabulary being incapable of uniquely resolving the combination of term values. This test specifically does not consider the content of dwc:higherGeography. If dwc:country contains a value and dwc:stateProvince does not, this test will return NOT_COMPLIANT. Use cases where knowledge to the level of country is adequate for the fitness of the data should not include this test. @tucotuco: "Of #200 and #201, #201 is the strongest test. If it passes for a record, #200 must necessarily also pass and doesn't tell you anything. If #201 fails, #200 could still pass and that would tell you that there are multiple matches on the dwc:country/dwc:stateProvince combo: It would tell you the nature of the problem. Along with #42 (dwc:country not empty), #200 would tell you whether there was an ambiguous combination of country (not empty) and dwc:stateProvince, such as would happen with Argentina/Buenos Aires. While if country is empty, then the ambiguity is purely at the dwc:stateProvince level". |
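
The Expected Response above can be read as a short decision procedure. The following is a minimal sketch of that reading, not the Kurator or any other existing implementation. The `Place` record, the `TOY_AUTHORITY` list, and the `validate` function are all hypothetical names, and the tiny in-memory list merely stands in for the bdq:sourceAuthority (it is not the Getty TGN data model). The sketch also assumes that a record with exactly one empty term is NOT_COMPLIANT, which is one interpretation of the Expected Response and Notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    """Hypothetical gazetteer record; not the Getty TGN data model."""
    name: str
    rank: str               # "country" or "stateProvince" in this toy model
    parent: Optional[str]   # name of the parent entity, if any


# Toy stand-in for bdq:sourceAuthority (assumption, illustration only).
TOY_AUTHORITY = [
    Place("Argentina", "country", None),
    Place("Australia", "country", None),
    Place("United States", "country", None),
    Place("Rio Negro", "stateProvince", "Argentina"),
    Place("WA", "stateProvince", "Australia"),
    Place("WA", "stateProvince", "United States"),
]


def is_empty(value):
    """Rough approximation of bdq:Empty: None or whitespace only."""
    return value is None or value.strip() == ""


def validate(country, state_province, authority=TOY_AUTHORITY):
    """Return (Response.status, Response.result) for one record."""
    if authority is None:  # source authority not available
        return ("EXTERNAL_PREREQUISITES_NOT_MET", None)
    if is_empty(country) and is_empty(state_province):
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    if is_empty(country) or is_empty(state_province):
        # Assumption: with one term empty, no country/child pair can resolve.
        return ("RUN_HAS_RESULT", "NOT_COMPLIANT")
    country, state_province = country.strip(), state_province.strip()
    # The dwc:country value must match a country-like entity.
    country_like = any(
        p.rank == "country" and p.name == country for p in authority
    )
    # The dwc:stateProvince value must resolve to exactly one child of it.
    children = [
        p for p in authority
        if p.name == state_province and p.parent == country
    ]
    if country_like and len(children) == 1:
        return ("RUN_HAS_RESULT", "COMPLIANT")
    return ("RUN_HAS_RESULT", "NOT_COMPLIANT")


print(validate("Argentina", "Rio Negro"))  # ('RUN_HAS_RESULT', 'COMPLIANT')
print(validate("", "WA"))                  # ('RUN_HAS_RESULT', 'NOT_COMPLIANT')
```

A real implementation would query the configured source authority and would need its own rule for what counts as an ISO 3166 country-like administrative entity; the `rank == "country"` check here is only a placeholder for that rule.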
@ArthurChapman
Collaborator

Suggest modifying the Expected Response (changes in italics)

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either of the terms dwc:country or dwc:stateProvince is EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince is unambiguously resolved in the bdq:sourceAuthority; otherwise NOT_COMPLIANT

@Tasilee
Collaborator Author

Tasilee commented Aug 29, 2022

I don't think that is right. As per @tucotuco examples with #95, we are testing for ambiguity and one of the terms can be empty.

chicoreus added a commit that referenced this issue Aug 29, 2022
…w copy of the test specifications as of 2022-08-29 including the new tests #199, #200, and #201.
Tasilee added the Test label (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT) Aug 29, 2022
@tucotuco
Member

tucotuco commented Sep 4, 2022

> I don't think that is right. As per @tucotuco examples with #95, we are testing for ambiguity and one of the terms can be empty.

I agree, it is correct as "INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY".

@chicoreus
Collaborator

How about:

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are unambiguously resolved to a single result with a child-parent relationship in the bdq:sourceAuthority and the entity matching the value of dwc:country in the bdq:sourceAuthority is an ISO country-like entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT

@chicoreus
Collaborator

This phrasing avoids a compliant result from mismapping of dwc:county onto dwc:stateProvince and dwc:stateProvince onto dwc:country, or from instances where dwc:country and dwc:stateProvince are switched.
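
To illustrate why both clauses matter, here is a small sketch using a hypothetical three-level hierarchy. `TOY_HIERARCHY` and `resolves_as_country_and_child` are invented for this example, and keying the hierarchy by name assumes names are unique, which a real source authority does not guarantee. The child-parent clause rejects switched values, and the country-like clause rejects values shifted down one level.

```python
# Hypothetical toy hierarchy: name -> (rank, parent name).
TOY_HIERARCHY = {
    "Argentina": ("country", None),
    "Rio Negro": ("stateProvince", "Argentina"),
    "Bariloche": ("county", "Rio Negro"),
}


def resolves_as_country_and_child(country, state_province,
                                  hierarchy=TOY_HIERARCHY):
    """True only if country is country-like and is the parent of state_province."""
    parent_entry = hierarchy.get(country)
    child_entry = hierarchy.get(state_province)
    if parent_entry is None or child_entry is None:
        return False
    is_country_like = parent_entry[0] == "country"
    is_child_of_country = child_entry[1] == country
    return is_country_like and is_child_of_country


# Correct mapping.
print(resolves_as_country_and_child("Argentina", "Rio Negro"))   # True
# Switched values: the child-parent relationship does not hold.
print(resolves_as_country_and_child("Rio Negro", "Argentina"))   # False
# Shifted values: the child-parent relationship holds, but the entity
# in dwc:country is not country-like.
print(resolves_as_country_and_child("Rio Negro", "Bariloche"))   # False
```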

@Tasilee
Collaborator Author

Tasilee commented Sep 9, 2022

Done

@Tasilee
Collaborator Author

Tasilee commented Sep 12, 2022

Added to Notes: "This test will fail if there are leading or trailing white space or non-printing characters."

@ArthurChapman
Collaborator

In the Notes, the reference to "See table #95 (comment)" will need to be updated, but not sure how we can reference the comment.

#95 can be changed to "VALIDATION_GEOGRAPHY_CONSISTENT (78640f09-8353-411a-800e-9b6d498fb1c9)" but the comment and table won't appear there without us putting it somewhere we can reference it.

@Tasilee
Collaborator Author

Tasilee commented Jul 3, 2023

Updated Parameter(s) value to align with other tests

@Tasilee
Collaborator Author

Tasilee commented Jul 11, 2023

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" [https://www.getty.edu/research/tools/vocabularies/tgn/index.html]

to

bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]}
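
For anyone consuming these values programmatically, the new form can be split into its parts with a regular expression. This is only a sketch: the grammar is inferred from the single example above rather than from a published definition, and `parse_source_authority` is a hypothetical helper name.

```python
import re

# Assumed grammar, inferred from one example:
#   <parameter> default = "<label>" {[<url>]}
PATTERN = re.compile(
    r'^(?P<parameter>\S+)\s+default\s*=\s*'
    r'"(?P<label>[^"]*)"\s*'
    r'\{\[(?P<url>[^\]\s]+)\]\}$'
)


def parse_source_authority(text):
    """Return a dict with parameter, label and url, or None if no match."""
    match = PATTERN.match(text.strip())
    return match.groupdict() if match else None


new_form = ('bdq:sourceAuthority default = '
            '"The Getty Thesaurus of Geographic Names (TGN)" '
            '{[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]}')
old_form = ('bdq:sourceAuthority default = '
            '"The Getty Thesaurus of Geographic Names (TGN)" '
            '[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]')

print(parse_source_authority(new_form)["label"])
print(parse_source_authority(old_form))  # None: the braces are missing
```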

@Tasilee
Collaborator Author

Tasilee commented Sep 18, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

chicoreus added the CORE label (TG2 CORE tests) Sep 18, 2023
@chicoreus
Collaborator

Removed inapplicable "fail" text from note. This is covered by "unambiguous" in the specification, and leading/trailing whitespace should not block matches.
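
One way to keep leading or trailing whitespace from blocking a match is to trim values before the lookup. The sketch below assumes a hypothetical set of authority names, `KNOWN_NAMES`, and a hypothetical helper, `matches`; it trims only the ends of the value and leaves internal spacing untouched.

```python
# Hypothetical names known to the source authority (illustration only).
KNOWN_NAMES = {"Argentina", "Rio Negro"}


def matches(value, known=KNOWN_NAMES):
    """Compare after trimming leading and trailing whitespace."""
    return value.strip() in known


print(matches(" Argentina "))   # True
print(matches("Rio Negro\t"))   # True
print(matches("Rio  Negro"))    # False: internal spacing is not altered
```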

@Tasilee
Collaborator Author

Tasilee commented Aug 12, 2024

Updated Notes from @tucotuco's Comment #21 (comment) which I thought was needed here.

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 16, 2024
…ation of tdwg/bdq#200 with some support for caching of responses from Getty TGN.  Adding a minimal implementation of tdwg/bdq#32 with backing method to interpret a few common forms of verbatim latitudes and longitudes.
@ArthurChapman
Collaborator

Altered Expected Response to add "administrative" entity

chicoreus added a commit to FilteredPush/geo_ref_qc that referenced this issue Aug 27, 2024
…ta, including updates and additions of backing methods, unit tets, and some cleanup. Includes bugfixes and improvements for tdwg/bdq#55 and substantive bugfixes to tdwg/bdq#201.
@Tasilee
Collaborator Author

Tasilee commented Sep 18, 2024

Added 3166 qualifier to the ISO ref in the Expected Response and added two ISO 3166 references
