TG2-ISSUE_IDENTIFICATIONQUALIFIER_DETECTED #97
Comments
Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: |
@chicoreus - that may need to be a different test - which is a NOTIFICATION and this one should be a VALIDATION (i.e. dwc:identificationQualifier is not NULL). Need to check if we have that one. |
Just a note on the note here, the current Darwin Core definition of genus doesn't include anything about the interpretation of question marks. Does this need to be added? |
@ianengelbrecht - the note to the test mentions why we haven't included "?" in genus here - mainly because it could have two interpretations - 1) uncertainty in identification, 2) uncertainty in placement in the hierarchy. One could argue that it was worth including as a flag, but at this stage we have agreed not to include it. |
If this test is kept then dwc:identificationQualifier needs to be included in the test. |
As I note under #106, I vote that we move that test and this one to Supplemental. Alternatively, I see some value in retaining this test to flag any record where the identification is suspect: that is, any record that contains one of the terms in the @pzermoglio (final) list, used as a source Authority (except for "L.", which is also used as the abbreviation for Linnaeus as an author), in any of the fields dwc:scientificName, dwc:specificEpithet or dwc:infraspecificEpithet, or that has a value in dwc:identificationQualifier. |
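The flagging rule proposed in the comment above could be sketched roughly as follows. This is only an illustration, not the test specification: `QUALIFIER_TERMS` is a hypothetical stand-in for the @pzermoglio list (which is not reproduced in this thread), the function name is invented, and records are assumed to be plain dicts keyed by Darwin Core term names.

```python
# Hypothetical stand-in for the source authority list; the real list is not in this thread.
# "L." is deliberately left out because it also abbreviates Linnaeus as an author.
QUALIFIER_TERMS = {"cf.", "aff.", "?"}

NAME_FIELDS = (
    "dwc:scientificName",
    "dwc:specificEpithet",
    "dwc:infraspecificEpithet",
)


def has_identification_qualifier(record):
    """Return True if the record appears to carry an identification qualifier."""
    # A non-empty dwc:identificationQualifier is the direct signal.
    if (record.get("dwc:identificationQualifier") or "").strip():
        return True
    # Otherwise look for a listed term as a whitespace-separated token
    # in any of the name fields (case-insensitive).
    for field in NAME_FIELDS:
        tokens = (record.get(field) or "").lower().split()
        if any(token in QUALIFIER_TERMS for token in tokens):
            return True
    return False
```

Token matching like this would miss qualifiers fused to a name (for example a trailing "?" with no space), which is part of why the discussion below treats effective detection as non-trivial.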
Thanks @ArthurChapman. Following today's discussions, the key issue is how many false negatives would result from this VALIDATION, i.e. returning COMPLIANT when the record contains some form of identification qualifier. @pzermoglio's research suggests > 1000 variants of characters that may indicate an identification qualifier. If it were a long-tail distribution where, say, 10 character combinations detected ~90%, then maybe this would be a useful VALIDATION. I also don't understand why this VALIDATION is not using dwc:identificationQualifier. (This may have been raised in our tele today and was indeed mentioned by @ArthurChapman above.) Pondering. Votes by COB May 26 please. |
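The long-tail question raised above (do a handful of variants account for most occurrences?) could be checked against real data with something like the sketch below. The function name and its input, a flat list of observed qualifier strings, are assumptions for illustration; no actual counts from @pzermoglio's research are available in this thread.

```python
from collections import Counter


def top_k_coverage(observed_qualifiers, k=10):
    """Fraction of all qualifier occurrences covered by the k most frequent variants."""
    counts = Counter(observed_qualifiers)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(k) returns the k (variant, count) pairs with the highest counts.
    covered = sum(n for _, n in counts.most_common(k))
    return covered / total
```

A result near 0.9 for `k=10` would support keeping the VALIDATION with a short list; a much lower value would suggest the false-negative rate is too high.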
I vote to deprecate the test. (Sent by email in reply to Lee Belbin's comment above of Thu, May 21, 2020.) |
I concur with moving this test from core to supplemental. Implementing effective detection of an identification qualifier when the identificationQualifier term is empty is non-trivial. It is also unclear what data quality needs the test addresses that would not be met by the consumer of the data asking a simple question such as "is identificationQualifier empty, and does specificEpithet contain exactly one word?" |
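The "simple question" a data consumer might ask instead could look like the following minimal sketch. The function name is hypothetical, and records are assumed to be dicts keyed by the Darwin Core term names with the dwc: prefix.

```python
def is_plain_species_identification(record):
    """True when identificationQualifier is empty and specificEpithet is exactly one word."""
    qualifier = (record.get("dwc:identificationQualifier") or "").strip()
    epithet_words = (record.get("dwc:specificEpithet") or "").split()
    return qualifier == "" and len(epithet_words) == 1
```

Records for which this returns False are the ones a consumer would want to inspect further, without needing a controlled vocabulary of qualifier strings.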
Rephrasing as issue, updating markdown table to more closely conform with current usage. Marking as immature given absence of controlled vocabulary and discussion of immaturity above. |
Example needs updating to conform with current usage. |
Added examples to conform to current template |
Changed Test to TestField and added Description |
Standardized reference to "EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available" in Expected Response and tried to standardize bdq:sourceAuthority |