TG2-AMENDMENT_IDENTIFICATIONQUALIFIER_FROM_TAXON #106

Closed
iDigBioBot opened this issue Jan 5, 2018 · 25 comments
Labels
Amendment Completeness Immature/Incomplete A test where substantial work is needed to develop the specification to the point where the test ca NAME Parameterized Test requires a parameter Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT TG2 VOCABULARY

Comments

@iDigBioBot
Collaborator

iDigBioBot commented Jan 5, 2018

| TestField | Value |
|---|---|
| GUID | 65c5595b-6229-4f89-98e9-7a62dbda492d |
| Label | AMENDMENT_IDENTIFICATIONQUALIFIER_FROM_TAXON |
| Description | Can an identification qualifier be extracted from related taxon terms? |
| TestType | Amendment |
| Darwin Core Class | Taxon, Identification |
| Information Elements ActedUpon | dwc:identificationQualifier |
| Information Elements Consulted | dwc:scientificName, dwc:specificEpithet, dwc:infraspecificEpithet |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if all of the taxon name fields were EMPTY or the field dwc:identificationQualifier was not EMPTY; AMENDED if the field dwc:identificationQualifier was FILLED_IN from any of the fields dwc:scientificName, dwc:specificEpithet or dwc:infraspecificEpithet; otherwise NOT_AMENDED |
| Data Quality Dimension | Completeness |
| Term-Actions | IDENTIFICATIONQUALIFIER_FROM_TAXON |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | default = "Darwin Core Identification Qualifier" {[https://dwc.tdwg.org/list/#identificationQualifier]} {dwc:identificationQualifier vocabulary API [NO CURRENT API EXISTS]} |
| Specification Last Updated | 2024-04-16 |
| Examples | [dwc:scientificName="Quercus aff. agrifolia var. oxyadenia", dwc:identificationQualifier="": Response.status=AMENDED, Response.result=dwc:identificationQualifier="aff. agrifolia var. oxyadenia", Response.comment="dwc:scientificName contains an interpretable dwc:identificationQualifier"]<br>[dwc:scientificName="Quercus", dwc:identificationQualifier="": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:scientificName does not contain an interpretable dwc:identificationQualifier"] |
| Source | VertNet |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | dwc:genus is not included as an Information Element because if a "?" is present only in dwc:genus but not in dwc:scientificName, then by the Darwin Core definition of genus, this implies an uncertainty about placement in the classification rather than uncertainty about the identification (determination). We use a vocabulary to detect an identificationQualifier as a token, but the resulting dwc:identificationQualifier itself need not necessarily follow a controlled vocabulary. |
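
The Expected Response above can be read as a short decision sequence. The following is a minimal sketch of that sequence, not a reference implementation. The function name, the `Response` tuple, and the four-token `QUALIFIER_TOKENS` vocabulary are assumptions made for illustration; the specification itself notes that no vocabulary API currently exists. Detection here is a plain whitespace-token match, so a qualifier fused to a name (for example a leading "?") would not be found.

```python
from typing import NamedTuple, Optional, Sequence

# Hypothetical stand-in for the bdq:sourceAuthority vocabulary (no API exists yet).
QUALIFIER_TOKENS = ("aff.", "cf.", "nr.", "?")


class Response(NamedTuple):
    status: str
    result: dict
    comment: str


def is_empty(value: Optional[str]) -> bool:
    """Treat None and whitespace-only strings as EMPTY."""
    return value is None or value.strip() == ""


def identificationqualifier_from_taxon(
    scientific_name: Optional[str] = None,
    specific_epithet: Optional[str] = None,
    infraspecific_epithet: Optional[str] = None,
    identification_qualifier: Optional[str] = None,
    vocabulary: Optional[Sequence[str]] = QUALIFIER_TOKENS,
) -> Response:
    # A missing vocabulary models an unavailable bdq:sourceAuthority.
    if vocabulary is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET", {},
                        "bdq:sourceAuthority is not available")

    name_fields = {
        "dwc:scientificName": scientific_name,
        "dwc:specificEpithet": specific_epithet,
        "dwc:infraspecificEpithet": infraspecific_epithet,
    }
    if all(is_empty(v) for v in name_fields.values()):
        return Response("INTERNAL_PREREQUISITES_NOT_MET", {},
                        "all taxon name fields are EMPTY")
    if not is_empty(identification_qualifier):
        return Response("INTERNAL_PREREQUISITES_NOT_MET", {},
                        "dwc:identificationQualifier is not EMPTY")

    # Take the text from the first vocabulary token to the end of the field.
    for term, value in name_fields.items():
        if is_empty(value):
            continue
        tokens = value.split()
        for i, token in enumerate(tokens):
            if token in vocabulary:
                qualifier = " ".join(tokens[i:])
                return Response(
                    "AMENDED",
                    {"dwc:identificationQualifier": qualifier},
                    f"{term} contains an interpretable dwc:identificationQualifier",
                )
    return Response("NOT_AMENDED", {},
                    "no interpretable dwc:identificationQualifier found")
```

Called with the first example from the table (`scientific_name="Quercus aff. agrifolia var. oxyadenia"`, `identification_qualifier=""`), this sketch yields AMENDED with the proposed value "aff. agrifolia var. oxyadenia"; with `scientific_name="Quercus"` it yields NOT_AMENDED.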
@iDigBioBot
Collaborator Author

Comment by John Wieczorek (@tucotuco) migrated from spreadsheet:
Name fields would be replaced with amended names and identification qualifier(s) put in identificationQualifier.

@iDigBioBot
Collaborator Author

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet:
Should follow IDENTIFIER_QUALIFIER_DETECTED

@iDigBioBot
Collaborator Author

Comment by John Wieczorek (@tucotuco) migrated from spreadsheet:
Can use a vocabulary to detect identificationQualifier as a token, but the resulting identificationQualifier need not necessarily follow a controlled vocabulary. For examples, see the description for identificationQualifier, where the names are included as well.

@ArthurChapman ArthurChapman added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Jan 18, 2018
@ArthurChapman ArthurChapman moved this from OTHER TESTS to NAME TESTS in Core Tests and Assertions (TG2) Aug 27, 2018
@tucotuco tucotuco added the Parameterized Test requires a parameter label Nov 5, 2018
@tucotuco
Member

The notes could use some cleanup here (full sentences) for clarity.

@Tasilee
Collaborator

Tasilee commented Jan 24, 2019

Is better now?

@tucotuco
Member

Happy face applied.

@ianengelbrecht
Collaborator

Same note here about question marks in genus as for VALIDATION_IDENTIFICATIONQUALIFIER_DETECTED. More importantly, though, the internal prerequisite is that dwc:identificationQualifier is empty. What happens if it's not, and we find a qualifier in one of the taxon name fields that does not match? An example might be dwc:scientificName="Quercus aff. agrifolia var. oxyadenia" and dwc:identificationQualifier="cf.". Not sure if this would ever happen in real examples, though.

@ianengelbrecht
Collaborator

Oh, and does the amendment require that the qualifier also be removed from the taxon name field it was found in?

@ArthurChapman
Collaborator

@ianengelbrecht Good questions. @chicoreus, do we need to discuss this one further in light of these questions?

@chicoreus
Collaborator

The intent of the check that dwc:identificationQualifier is not empty is to prevent this test from suggesting a change to an existing value. I don't think INTERNAL_PREREQUISITES_NOT_MET is the right response for that case.

Instead of:

EXTERNAL_PREREQUISITES_NOT_MET if the specified source authority service was not available; INTERNAL_PREREQUISITES_NOT_MET if all of the taxon name fields were EMPTY or the field dwc:identificationQualifier was not EMPTY; AMENDED if the field dwc:identificationQualifier was FILLED_IN from any of the fields dwc:scientificName, dwc:specificEpithet or dwc:infraspecificEpithet; otherwise NOT_CHANGED

We should have:

EXTERNAL_PREREQUISITES_NOT_MET if the specified source authority service was not available; INTERNAL_PREREQUISITES_NOT_MET if all of the taxon name fields were EMPTY; AMENDED if the field dwc:identificationQualifier was FILLED_IN from any of the fields dwc:scientificName, dwc:specificEpithet or dwc:infraspecificEpithet; NOT_CHANGED if the field dwc:identificationQualifier was not EMPTY; otherwise NOT_CHANGED
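
A rough sketch of how the revised ordering might be evaluated follows. It assumes that an existing dwc:identificationQualifier is never overwritten, so the non-empty check is applied before any amendment is proposed. The function name and its boolean inputs are hypothetical, and detection of the qualifier is assumed to have happened elsewhere.

```python
from typing import Optional


def revised_status(
    authority_available: bool,
    all_name_fields_empty: bool,
    existing_qualifier: str,
    detected_qualifier: Optional[str],
) -> str:
    """Return the response status under the revised wording (illustrative only)."""
    if not authority_available:
        return "EXTERNAL_PREREQUISITES_NOT_MET"
    if all_name_fields_empty:
        return "INTERNAL_PREREQUISITES_NOT_MET"
    # An existing value is left alone, even if a different qualifier was detected.
    if existing_qualifier.strip():
        return "NOT_CHANGED"
    if detected_qualifier:
        return "AMENDED"
    return "NOT_CHANGED"
```

Under this reading, the conflicting case raised earlier in the thread (dwc:scientificName="Quercus aff. agrifolia var. oxyadenia" with dwc:identificationQualifier="cf.") returns NOT_CHANGED rather than INTERNAL_PREREQUISITES_NOT_MET.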

@Tasilee
Collaborator

Tasilee commented Aug 18, 2019

@chicoreus: Hmm, ok. I can live with that. Other comments before I race off to edit?

@ianengelbrecht
Collaborator

ianengelbrecht commented Aug 18, 2019 via email

@ArthurChapman
Collaborator

I can't see any easy way of writing this into the Expected Response. We could if needed, but we can handle it by just changing the Notes:

The AMENDMENT is made by finding the qualifier as a token within dwc:scientificName. If the first match is inside the string, place the text from the qualifier to the end of the string in dwc:identificationQualifier; if the qualifier is first encountered at the end of the string, place the entire string in dwc:identificationQualifier.

Note that dwc:genus is not included as an Information Element because if a "?" is present only in dwc:genus but not in dwc:scientificName, then by the Darwin Core definition of genus, this implies an uncertainty about placement in the classification rather than uncertainty about the identification (determination). We use a vocabulary to detect an identificationQualifier as a token, but the resulting dwc:identificationQualifier itself need not necessarily follow a controlled vocabulary.
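
The matching rule in the proposed Notes can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the default vocabulary is a small placeholder rather than the real source authority, and tokens are split on whitespace only.

```python
from typing import Optional, Sequence


def extract_qualifier(
    scientific_name: str,
    vocabulary: Sequence[str] = ("aff.", "cf.", "nr.", "?"),  # placeholder vocabulary
) -> Optional[str]:
    """Apply the proposed rule to dwc:scientificName; return None if no token matches."""
    tokens = scientific_name.split()
    for i, token in enumerate(tokens):
        if token in vocabulary:
            if i == len(tokens) - 1:
                # Qualifier first encountered at the end: keep the entire string.
                return " ".join(tokens)
            # Qualifier inside the string: keep text from the qualifier onward.
            return " ".join(tokens[i:])
    return None
```

For "Quercus aff. agrifolia var. oxyadenia" this returns "aff. agrifolia var. oxyadenia". For a trailing qualifier such as "Quercus agrifolia ?" it returns the whole string, and for "Quercus" it returns `None`.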

@chicoreus
Collaborator

For a small vocabulary (?, cf., nr.), this is probably tractable, but in the general case we probably can't distinguish a qualifier from all other possible text.
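
To make the limitation concrete, here is a small sketch of exact-token detection against the three-item vocabulary mentioned above. The function name and the sample strings are hypothetical and chosen only to show the kinds of variants that an exact match would miss; they are not drawn from real data.

```python
SMALL_VOCABULARY = frozenset({"?", "cf.", "nr."})


def detects_qualifier(scientific_name: str, vocabulary=SMALL_VOCABULARY) -> bool:
    """Return True if any whitespace-separated token is exactly in the vocabulary."""
    return any(token in vocabulary for token in scientific_name.split())


# Hypothetical inputs: only the first is an exact vocabulary match.
samples = [
    "Quercus cf. agrifolia",   # detected
    "Quercus cf agrifolia",    # missed: no trailing period
    "Quercus ?agrifolia",      # missed: "?" fused to the epithet
    "Quercus aff. agrifolia",  # missed: "aff." not in this small vocabulary
]
results = {name: detects_qualifier(name) for name in samples}
```

Each missed variant could be handled by extending the vocabulary or normalizing tokens, but every such extension also raises the risk of treating ordinary name text as a qualifier.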

@ArthurChapman
Collaborator

Due to the complications in implementing this test and #97, I vote that we move them to Supplemental. I still believe that they are valuable tests, but there are a lot of tests within the Supplemental tests that I would hope would be implemented at a later date. For now, I think the difficulties in implementing these two tests make them impractical. The only alternative I see would be a modification of #97 that just flags any record that has a qualifier in any of the taxonomic fields plus dwc:identificationQualifier.

@Tasilee
Collaborator

Tasilee commented May 21, 2020

On the basis of @pzermoglio's research, which indicates more than a thousand identification qualifier variants, AMENDMENTs based on their detection are fraught with issues. I'd suggest we set this to NOT CORE and hope that Paula's work will elevate the issues and result in a solution (but I am not holding my breath).

@Tasilee
Collaborator

Tasilee commented May 25, 2020

I vote to not include this AMENDMENT as CORE

@tucotuco
Member

tucotuco commented May 25, 2020 via email

@Tasilee Tasilee added Supplementary Tests supplementary to the core test suite. These are tests that the team regarded as not CORE. and removed NEEDS WORK labels May 25, 2020
@Tasilee Tasilee closed this as completed May 25, 2020
@chicoreus
Collaborator

@Tasilee I concur. A basic implementation using a small vocabulary would not gain much and would leave many false negatives. An effective implementation would need to be, or to use, a very high-quality name parser, and would still (given the list of values in the wild) be problematic in interpretation.

@Tasilee
Collaborator

Tasilee commented May 26, 2020

Thanks @chicoreus - concisely put

@ArthurChapman ArthurChapman removed the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Sep 18, 2023
@chicoreus chicoreus added the Test Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT label Sep 18, 2023
@chicoreus
Collaborator

Updated format of markdown table to match current usage.

@chicoreus chicoreus added Immature/Incomplete A test where substantial work is needed to develop the specification to the point where the test ca and removed Supplementary Tests supplementary to the core test suite. These are tests that the team regarded as not CORE. labels Feb 20, 2024
@Tasilee
Collaborator

Tasilee commented Feb 20, 2024

Updated examples to align with current template.

@Tasilee
Collaborator

Tasilee commented Feb 21, 2024

Added Description to align with current template

@Tasilee
Collaborator

Tasilee commented Feb 22, 2024

Changed Field to TestField

@Tasilee
Collaborator

Tasilee commented Apr 16, 2024

Standardized reference to "EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available" in Expected Response.


7 participants