TG2-AMENDMENT_DAY_STANDARDIZED #127

Open
Tracked by #24
pzermoglio opened this issue Jan 18, 2018 · 10 comments
Labels
Amendment, Conformance, CORE (TG2 CORE tests), Test (Tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT), TG2, TIME

Comments

@pzermoglio
Member

pzermoglio commented Jan 18, 2018

| TestField | Value |
| --- | --- |
| GUID | b129fa4d-b25b-43f7-9645-5ed4d44b357b |
| Label | AMENDMENT_DAY_STANDARDIZED |
| Description | Proposes an amendment to the value of dwc:day as an integer between 1 and 31 inclusive. |
| TestType | Amendment |
| Darwin Core Class | dwc:Event |
| Information Elements ActedUpon | dwc:day |
| Information Elements Consulted | |
| Expected Response | INTERNAL_PREREQUISITES_NOT_MET if dwc:day is bdq:Empty; AMENDED the value of dwc:day if the value is unambiguously interpreted as an integer between 1 and 31 inclusive; otherwise NOT_AMENDED |
| Data Quality Dimension | Conformance |
| Term-Actions | DAY_STANDARDIZED |
| Parameter(s) | |
| Source Authority | |
| Specification Last Updated | 2023-09-18 |
| Examples | [dwc:day="23rd": Response.status=AMENDED, Response.result=dwc:day="23", Response.comment="dwc:day is interpretable as "23""] [dwc:day="X": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:day is ambiguous, either a "X", "No data" or "10""] |
| Source | TG2-Gainesville |
| References | |
| Example Implementations (Mechanisms) | Kurator:event_date_qc |
| Link to Specification Source Code | A potential minimal implementation is at: https://github.com/FilteredPush/event_date_qc/blob/238f234a4947b3c2820fb2fe3987326f9ead5e54/src/main/java/org/filteredpush/qc/date/DwCEventDQ.java#L1114 with a unit test at https://github.com/FilteredPush/event_date_qc/blob/238f234a4947b3c2820fb2fe3987326f9ead5e54/src/test/java/org/filteredpush/qc/date/DwcEventDQTest.java#L824 |
| Notes | If dwc:day contains text that may be interpreted as Roman numerals, the result will be NOT_AMENDED as this is not standard. Values such as "3rd" or "12th" can be interpreted as the integers "3" and "12". Text such as "5th Friday" is ambiguous. |
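
The Expected Response above can be read as a three-way decision. The following is a minimal Python sketch of that decision, not the linked Java implementation. The function name, the response dictionary shape, and the choices to treat `None` or whitespace as bdq:Empty, to accept only English ordinal suffixes, and to report NOT_AMENDED for a value that is already a bare integer are all assumptions made for illustration.

```python
import re

# Assumption for this sketch: only bare digits with an optional English
# ordinal suffix count as "unambiguous". The specification does not list these.
_DAY_PATTERN = re.compile(r"^([0-9]{1,2})(st|nd|rd|th)?$", re.IGNORECASE)


def _response(status, result, comment):
    return {"status": status, "result": result, "comment": comment}


def amendment_day_standardized(day):
    """Hypothetical helper: propose a standardized dwc:day value."""
    # Assumption: None or whitespace-only input is treated as bdq:Empty.
    if day is None or day.strip() == "":
        return _response("INTERNAL_PREREQUISITES_NOT_MET", {}, "dwc:day is empty")

    match = _DAY_PATTERN.match(day.strip())
    if match:
        number = int(match.group(1))
        # Propose a change only if the integer is in range and differs from the input.
        if 1 <= number <= 31 and str(number) != day:
            return _response("AMENDED", {"dwc:day": str(number)},
                             f'dwc:day is interpretable as "{number}"')
    # Covers out-of-range, ambiguous, Roman-numeral, and already-standard values.
    return _response("NOT_AMENDED", {}, "dwc:day was not changed")
```

With this sketch, `amendment_day_standardized("23rd")` proposes `{"dwc:day": "23"}`, while `"X"`, `"5th Friday"`, and `"32nd"` all yield NOT_AMENDED.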
ArthurChapman added the "Test" label (tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT) Jan 19, 2018
@chicoreus
Collaborator

This issue, #128, and #129 feel like they may not belong in CORE, but rather as upstream tests for use cases that involve preparing data for publication in Darwin Core, like UPSTREAM_EVENTDATE_FILLED_IN_FROM_START_END (urn:uuid:e4ddf9bc-cd10-46cc-b307-d6c7233a240a) in Kurator.

@chicoreus
Collaborator

We need a clear definition of what this amendment should attempt to do. Based on the example, I would suggest removing all non-numeric characters and attempting to parse an integer in the range 1-31 from the remaining characters. I would propose explicitly excluding Roman numerals from translation here, as their presence in day suggests a transposition of day and month in the biodiversity domain (MUST_NOT translate Roman numerals to integers). It is unclear whether text such as "first" or "second", including in languages other than English, should be supported. If this is an upstream data preparation test, it may need to translate numbers from more parts of the Unicode character set into integers; this might be a MAY condition.
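
The suggestion above (drop non-numeric characters, then parse an integer in the range 1 to 31, never translating Roman numerals) could be sketched in Python as follows. The function name `strip_to_day` is hypothetical, and the sketch deliberately stays naive so that the weakness of plain character stripping is visible.

```python
import re

# Strings made up only of Roman numeral letters (assumed check for this sketch).
_ROMAN = re.compile(r"^[IVXLCDM]+$", re.IGNORECASE)


def strip_to_day(raw):
    """Return the day as an int in 1..31, or None if none can be extracted."""
    text = raw.strip()
    # Pure Roman numerals contain no digits, so they would return None anyway;
    # the explicit check documents the MUST_NOT rule.
    if _ROMAN.match(text):
        return None
    digits = re.sub(r"[^0-9]", "", text)  # drop every non-digit character
    if not digits:
        return None
    number = int(digits)
    return number if 1 <= number <= 31 else None
```

Here `strip_to_day("23rd")` gives 23 and `strip_to_day("XII")` gives `None`, but `strip_to_day("1-2")` gives 12 because the two digits are concatenated. That is one reason the amendment needs a tighter definition than "remove all non-numeric characters".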

@ArthurChapman
Collaborator

ArthurChapman commented Jan 30, 2018

These tests were not in our original suite and, like Paul, I question their value as CORE. I know some good arguments were put forward in Gainesville that some easy cases may be corrected (as in the example), but we excluded a number of other tests on Day and Month as we didn't regard those fields as CORE (other than helping to populate eventDate). I can see arguments for keeping YEAR, as I know a lot of people use it on its own, but DAY and MONTH less so. I agree with @chicoreus: move to SUPPLEMENTAL.

@Tasilee
Collaborator

Tasilee commented Jan 31, 2018

I would agree with SUPPLEMENTAL for dwc:day

Tasilee added the "Supplementary" label (tests supplementary to the core test suite, regarded by the team as not CORE) and removed the "Test" label Jan 31, 2018
@ArthurChapman
Collaborator

ArthurChapman commented Feb 20, 2018

I am not sure our description for this is good. Perhaps we should add "unambiguously" interpreted. For example, do we want to convert "32" to "23"? Are there other examples?

ArthurChapman removed the "Supplementary" label Feb 20, 2018
@Tasilee
Collaborator

Tasilee commented Feb 21, 2018

I agree about "unambiguously" and will edit it in. That makes "32" ambiguous, as it could be "23" or "13" with a typo, etc.

@chicoreus
Collaborator

chicoreus commented Mar 1, 2018

Let's pose some examples, with some alternatives included:

| Value | Interpretation | Note |
| --- | --- | --- |
| 1 | 1 | It's an integer, no change needed |
| 1st | 1 | Strip off trailing 'st' (ordinal indicator) |
| first | 1 | Interpret string |
| 2nd | 2 | Strip off trailing 'nd' (ordinal indicator) |
| 2ú | 2 | Strip off trailing 'ú' (Irish ordinal indicator) |
| 2º | 2 | Strip off trailing 'º' (Spanish masculine ordinal indicator) |
| 第二 | 2 | Strip off ordinal prefix 第, translate 二 to 2 |
| 二 | 2 | Translate 二 to 2 |
| 3rd | 3 | Strip off trailing 'rd' (ordinal indicator) |
| 3:e | 3 | Strip off trailing ':e' (Swedish ordinal indicator) |
| ke-3 | 3 | Strip off leading 'ke-' (Malay ordinal indicator) |
| 4th | 4 | Strip off trailing 'th' (ordinal indicator) |
| 4ᵗʰ | 4 | Strip off trailing superscript 'th' (ordinal indicator) |
| 1-2 | | A range, can't interpret. |
| 1-2 | 1 | A range, interpret as start value. |
| 1 to 2 | | A range, can't interpret. |
| 1 to 2 | 1 | A range, interpret as start value. |
| 32nd | | Stripping off 'nd' leaves an out of range value. |
| 1u | | Can't interpret. Could be 1 with trailing string, or 17 with right hand in wrong place on keyboard, or Irish ordinal indicator without accent... |
| 1u | 1 | Strip off non-numeric characters, found only a single integer in range. |

We could define this test as removing strings that appear to be ordinal indicators and proposing a bare integer, so long as it is in the range 1 to 31 inclusive. Or we could define this test as checking to see if the day contains a single integer in the range 1 to 31, and if so, removing all other characters from the day value. Or...
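
The two candidate definitions give different answers on some of the values listed above. The Python sketch below implements both so they can be compared. The function names are hypothetical, and the ordinal indicator lists are an illustrative subset drawn from the examples in this comment, not a complete or agreed set.

```python
import re

# Illustrative ordinal indicators taken from the examples; not exhaustive.
_SUFFIXES = ("st", "nd", "rd", "th", "ᵗʰ", "ú", "º", ":e")
_PREFIXES = ("ke-",)


def by_ordinal_stripping(value):
    """Definition 1: remove one ordinal indicator, then require a bare integer."""
    text = value.strip()
    for prefix in _PREFIXES:
        if text.startswith(prefix):
            text = text[len(prefix):]
            break
    for suffix in _SUFFIXES:
        if text.endswith(suffix):
            text = text[: -len(suffix)]
            break
    if re.fullmatch(r"[0-9]{1,2}", text) and 1 <= int(text) <= 31:
        return int(text)
    return None


def by_single_integer(value):
    """Definition 2: accept the value if it contains exactly one integer in range."""
    runs = re.findall(r"[0-9]+", value)
    if len(runs) == 1 and 1 <= int(runs[0]) <= 31:
        return int(runs[0])
    return None
```

Both functions return 1 for "1st" and `None` for "32nd" and "1-2". They disagree on "1u": ordinal stripping returns `None`, while the single-integer rule returns 1, which is exactly the ambiguity raised in the examples.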

@Tasilee
Collaborator

Tasilee commented Aug 14, 2018

So, where are we with this? Various TAGS have been added, removed, re-added etc.

Day-level TIME resolution makes this less important than month and year, but does that alone make it NOT_CORE? @chicoreus provides examples where we could unambiguously AMEND (e.g., "one", "two", "thirty one", "first", "second", "thirty first", "1st", "2nd", "31st").

Comments please.
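
To make the word-form cases concrete, here is a small Python sketch that maps English cardinal and ordinal words for 1 through 31 to integers. It is English-only by assumption, the names are hypothetical, and whether word forms belong in this test at all is still the open question in this thread.

```python
_CARDINALS = ["one", "two", "three", "four", "five", "six", "seven", "eight",
              "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
              "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
_ORDINALS = ["first", "second", "third", "fourth", "fifth", "sixth", "seventh",
             "eighth", "ninth", "tenth", "eleventh", "twelfth", "thirteenth",
             "fourteenth", "fifteenth", "sixteenth", "seventeenth",
             "eighteenth", "nineteenth"]


def _build_lookup():
    """Build a word -> day-number table covering 1..31 only."""
    words = {}
    for number, (cardinal, ordinal) in enumerate(zip(_CARDINALS, _ORDINALS), start=1):
        words[cardinal] = number
        words[ordinal] = number
    words["twenty"] = words["twentieth"] = 20
    words["thirty"] = words["thirtieth"] = 30
    for tens_word, tens in (("twenty", 20), ("thirty", 30)):
        for unit in range(1, 10):
            if tens + unit > 31:
                continue  # no day beyond the 31st
            for unit_word in (_CARDINALS[unit - 1], _ORDINALS[unit - 1]):
                words[f"{tens_word} {unit_word}"] = tens + unit
    return words


_DAY_WORDS = _build_lookup()


def day_from_english_words(value):
    """Return 1..31 for a recognized English day word, else None."""
    # Normalize case, hyphens, and repeated whitespace before the lookup.
    key = " ".join(value.lower().replace("-", " ").split())
    return _DAY_WORDS.get(key)
```

For example, `day_from_english_words("thirty first")` and `day_from_english_words("Thirty-One")` both give 31, while "thirty two" is not in the table and gives `None`.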

ArthurChapman added the "Test" label (tests created by TG2, either CORE, Supplementary or DO NOT IMPLEMENT) Aug 30, 2018
chicoreus added a commit to FilteredPush/event_date_qc that referenced this issue Aug 30, 2019
…and added unit tests resulting from Arthur Chapman's ongoing review of results of test runs. DESCRIPTION: Set distinct guid for mechanism for DwCEventDateTG2DQ. Java to 1.8 and added commons-lang3 in pom for StringUtils.

Moved implementation for dateIdentified into DwCOtherDateDQ.  Fix for single digit days and months being recognized as valid ISO date, updates to unit tests in consequence.  DateUtils.extractInterval() and extractDate() now returns null when given single digit day or month values.  Fix for handling of date ranges with end date before start date.  DateUtils.eventDateValid() now returns false on these.  added unit tests for validationYearEmpty, amendmentDayStandardized, amendmentEventdateStandardized, validationDateidentifiedNotstandard.
@Tasilee
Collaborator

Tasilee commented Mar 11, 2022

Updated Note based on recent discussions:

If dwc:day contains text that may be interpreted as Roman numerals, the result will be "NOT_AMENDED" as this is not standard. Values such as "3rd" or "12th" can be interpreted as the integers "3" and "12". Text such as "5th Friday" is ambiguous.

chicoreus added a commit to FilteredPush/event_date_qc that referenced this issue Mar 12, 2022
…sts to current specifications. DESCRIPTION: Updating implementation and fixing unit tests for AMENDMENT_MONTH_STANDARDIZED and AMENDMENT_DAY_STANDARDIZED to conform with current (2022-03-10) specification. Fixing return of Response.result from amendments to an empty map instead of null for easier handling by consumers of the method responses.
chicoreus added a commit to FilteredPush/event_date_qc that referenced this issue Jun 8, 2023
…t current (2023-06-04) test descriptions. Adding ProvidesVersion annotations. Removing now empty file stubs for checked methods. Addressed tdwg/bdq#127 AMENDMENT_DAY_STANDARDIZED  and tdwg/bdq#128 AMENDMENT_MONTH_STANDARDIZED.  Removing deprecated wrapper methods.
@Tasilee
Collaborator

Tasilee commented Sep 18, 2023

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

chicoreus added the "CORE" label (TG2 CORE tests) Sep 18, 2023