datetime granularity (Date and DateTime considered in need of improvement) #1748

Closed
VladimirAlexiev opened this issue Sep 22, 2017 · 9 comments
Labels
no-issue-activity Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!).

Comments

@VladimirAlexiev

VladimirAlexiev commented Sep 22, 2017

@trypuz, @sopekmir have made very reasonable examples in http://sdo-auto-fix.appspot.com/docs/automotive.html#hybrid_car JSONLD: a car is model "2016", was produced in "2017", and is offered for sale on "2017-05-29".

With the current jsonld context these literals come out as

schema:modelDate          "2016"^^schema:Date ;
schema:productionDate     "2017"^^schema:Date ;
schema:availabilityStarts "2017-05-29"^^schema:DateTime ;

because the respective 3 props are declared to have the respective datatypes.

@danbri I see these problems:

  1. The example literals are realistic, but don't have the precision required by the datatype: the first two would need a Year or YearMonth type, which schema doesn't have, and the third would need a time part. So the ranges need to be relaxed (the same goes for most other props designating a date or datetime!):
    • modelDate: Date or text
    • productionDate: Date or text
    • availabilityStarts: DateTime or Date
  2. The context cannot specify any datatype, because it would misinterpret some literals, as above.
  3. xsd:date and xsd:dateTime are handled specially by most triplestores (in a literal index for fast comparison), but I don't know of any triplestore that handles schema:Date and schema:DateTime in a similar way. So pragmatically these datatypes are not useful, and it's best to leave literals as plain strings. Item 2 would take care of that for JSONLD, but what about examples in other formats?
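
A minimal Python sketch (not from the thread) of problems 1 and 2: a context that blindly attaches a datatype to each property yields typed literals whose values don't match the strict form. The `CONTEXT_TYPES` table and the `coerce` helper are hypothetical stand-ins for the JSON-LD context, and the regexes only approximate the CCYY-MM-DD and CCYY-MM-DDThh:mm:ss forms; they do no range checking.

```python
import re

# Simplified strict lexical forms (assumption: approximate, no range checks).
DATE_RE = re.compile(r"-?\d{4}-\d{2}-\d{2}")
DATETIME_RE = re.compile(r"-?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})?")
PATTERNS = {"Date": DATE_RE, "DateTime": DATETIME_RE}

# Hypothetical stand-in for the datatype coercion declared in the JSON-LD context.
CONTEXT_TYPES = {
    "modelDate": "Date",
    "productionDate": "Date",
    "availabilityStarts": "DateTime",
}

def coerce(prop, value):
    """Attach the context's datatype to a value; report whether it fits the strict form."""
    datatype = CONTEXT_TYPES[prop]
    literal = f'"{value}"^^schema:{datatype}'
    fits = PATTERNS[datatype].fullmatch(value) is not None
    return literal, fits

# The car example from the issue: every value is realistic, none fits its datatype.
car = {"modelDate": "2016", "productionDate": "2017", "availabilityStarts": "2017-05-29"}
for prop, value in car.items():
    literal, fits = coerce(prop, value)
    print(f"schema:{prop} {literal}  fits strict form: {fits}")
```

All three literals from the example are reported as not fitting, which is the mismatch described in item 1.
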
@VladimirAlexiev
Author

Someone may suggest "fixing" issue 1 by fake completion of datetimes:

schema:modelDate          "2016-01-01"^^schema:Date;
schema:availabilityStarts "2017-05-29T00:00:00"^^schema:DateTime ;

This is a bad idea since it constitutes lying, and most HTML authors and webmasters will never get such a "fix" right.
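
A small Python sketch of why fake completion loses information. The `pad_to_date` helper is hypothetical and assumes non-negative years in YYYY, YYYY-MM or YYYY-MM-DD form.

```python
def pad_to_date(value):
    """Naively complete a year or year-month to a full YYYY-MM-DD date."""
    parts = value.split("-")
    parts += ["01"] * (3 - len(parts))  # fill missing month/day with "01"
    return "-".join(parts)

# Three source values with different granularity...
sources = ["2016", "2016-01", "2016-01-01"]
padded = {pad_to_date(v) for v in sources}

# ...collapse into one literal, so a consumer can no longer tell them apart.
print(padded)  # → {'2016-01-01'}
```

After padding, "model year 2016" and "exactly 1 January 2016" are the same literal, so the original precision cannot be recovered.
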

@RichardWallis
Contributor

Note that both schema:Date and schema:DateTime are described as being values in ISO 8601 date format for which Dates such as "2016" and "2016-01-01" are valid.

@VladimirAlexiev
Author

VladimirAlexiev commented Sep 22, 2017

@RichardWallis Where does that Wikipedia page say a date can consist of only a year?

And DateTime is described as

A combination of date and time of day in the form [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]

which doesn't sound like the time part is optional.
If you were right, DateTime would subsume Date, so the latter should be removed.

@RichardWallis
Contributor

The reference is "ISO 8601:2000 allowed truncation (by agreement), where leading components of a date or time are omitted."

This should be taken into account alongside general practice and the use of date values in the wild. Many systems do not have the source data to fill out a complete date (publication year for books, etc.).

Pragmatically it is not possible to enforce adherence to strict conformance on such things for all Schema.org mark up - for many it is difficult to even loosely apply ISO 8601 to historical data stored in differing formats.

This situation can be frustrating for data consumers, who cannot rely on detailed adherence to such standards and have to apply algorithms to identify what might be meant.

A potential alternative would be to introduce individual data types for Year, Month, Day, etc. which at this stage may well cause more confusion.

I believe issues such as this need to be approached in the spirit espoused in this quote from Data Model documentation:

The type/properties associations of schema.org are closer to "guidelines" than to formal rules, and improvements to the guidelines are always welcome.
See also: Postel's Law

@VladimirAlexiev
Author

VladimirAlexiev commented Sep 29, 2017

@RichardWallis
If these datatypes don't indicate the granularity of the literal, what is the purpose of having them at all? In particular, what is the purpose of having two of them?

IMHO a datatype should indicate precisely the format of a literal and thus guide its processing. If schema:Date and schema:DateTime don't do that, people should (and do!) use datatypes that do:

  • xsd:string for unformatted dates or dates with unclear characteristics
  • xsd:gYear for years
  • xsd:gYearMonth for years & months
  • xsd:date for dates
  • xsd:dateTime for date-time stamps
  • xsd:dateTimeStamp for date-time stamps with timezone

I think schema should recommend this.
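
The bullets above can be sketched in Python as a lookup from a literal's lexical form to the most specific xsd: datatype. This is an illustration only: `xsd_datatype_for` is a hypothetical helper, and the patterns are simplified (four-digit years, no range checks, so "2016-13" would still be classified as xsd:gYearMonth).

```python
import re

YEAR = r"-?\d{4}"
TZ = r"(?:Z|[+-]\d{2}:\d{2})"
TIME = r"T\d{2}:\d{2}:\d{2}(?:\.\d+)?"

# Ordered from most to least specific; the first full match wins.
RULES = [
    ("xsd:dateTimeStamp", YEAR + r"-\d{2}-\d{2}" + TIME + TZ),
    ("xsd:dateTime", YEAR + r"-\d{2}-\d{2}" + TIME),
    ("xsd:date", YEAR + r"-\d{2}-\d{2}" + TZ + "?"),
    ("xsd:gYearMonth", YEAR + r"-\d{2}" + TZ + "?"),
    ("xsd:gYear", YEAR + TZ + "?"),
]

def xsd_datatype_for(value):
    """Pick the xsd: datatype whose (simplified) lexical form the value matches."""
    for datatype, pattern in RULES:
        if re.fullmatch(pattern, value):
            return datatype
    return "xsd:string"  # unformatted dates or dates with unclear characteristics

for value in ["2016", "2016-05", "2017-05-29", "2017-05-29T00:00:00", "2007-03-01T13:00:00Z", "c.1655"]:
    print(f'"{value}"^^{xsd_datatype_for(value)}')
```
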

@VladimirAlexiev
Author

VladimirAlexiev commented Sep 29, 2017

@RichardWallis I think you quoted from https://en.wikipedia.org/wiki/ISO_8601#Truncated_representations?
But the very next sentence is: "This provision was removed in ISO 8601:2004".

I very much like Schema's flexibility, and that most props allow a resource or Text.
But I am against datatypes that are so vague that they are useless.

@RichardWallis
Contributor

I may be wrong but I believe "provision removed" is referring to 'two-digit years to be used and the ambiguous formats YY-MM-DD and YYMMDD'.

The allocation of Date as an expected type, along with the defaults of Text & URL, is a guide to someone marking up their data as to the type of information needed, as against a specification that a data consumer can rely on to make decisions.

For example, dateCreated for a CreativeWork could validly be expected to contain things such as "c.1655", "1951", "2007-03-01T13:00:00Z", "2017-09-29", dependent on the data behind the page being marked up.

If there is a choice to make things easier, the Schema.org approach tends toward making it easier for those adopting the vocabulary to mark up their data, rather than for the consumers of that data.

Having spent many years dealing with many anomalies around dates in the fairly narrow and standards literate world of cultural heritage, I believe that the current situation is probably the best we can hope for across the full breadth of all the domains that Schema supports.

We could recommend more granular data types for Schema.org properties, as you suggest. However I believe it would introduce some confusion for adopters, especially in circumstances like the example I mentioned above. Of more relevance however, once introduced I would expect very little difference in the data actually being marked up on sites across the web.

Especially in the area of dates, my experience has often shown the attitude of "that is the date format in my data [from my system] so that is what I output in Schema". From a data consuming/processing point of view not very satisfactory, but pragmatically it is what we have to deal with.
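
As a rough illustration of what a consumer ends up doing with such values, here is a Python sketch that pulls a best-effort year and a "circa" flag out of the dateCreated examples mentioned above. The `lenient_year` helper and its heuristics are assumptions made for this sketch, not something Schema.org specifies.

```python
import re

# First standalone run of exactly four digits is taken as the year (heuristic).
YEAR_RE = re.compile(r"(?<!\d)(\d{4})(?!\d)")
# Leading "c.", "ca." or "circa" marks the value as approximate (heuristic).
CIRCA_RE = re.compile(r"^\s*(c\.|ca\.|circa)\s*", re.IGNORECASE)

def lenient_year(value):
    """Return (year, approximate); year is None when no four-digit year is found."""
    approximate = CIRCA_RE.match(value) is not None
    match = YEAR_RE.search(value)
    if match is None:
        return None, approximate
    return int(match.group(1)), approximate

for value in ["c.1655", "1951", "2007-03-01T13:00:00Z", "2017-09-29", "unknown"]:
    print(value, "->", lenient_year(value))
```
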

@VladimirAlexiev
Author

The whole quote is consistent because it says "leading components ... are omitted". IMHO there's no provision to omit time from DateTime (which is a trailing component), or if there was, it was repealed in 2004.

I agree with you and with Schema's approach of allowing any literal as the value of a date or datetime. But I disagree with having two datatypes that really say nothing. Using them as literal datatypes (e.g. ^^schema:Date) is actually harmful, since no known repository processes them in any special way, and indeed it would be hard to, given the imprecision.

So @danbri my proposal is this:

  • remove DateTime since it's redundant
  • keep Date but recommend that xsd: datatypes should be used (see my bullets above)
  • remove Date from jsonld context
  • fix examples to use xsd: datatypes of appropriate granularity

So schema:Date will become merely an advisory type to give a good range to props like schema:availabilityStarts, and to answer questions like "what are the date/datetime props in Schema?"

@danbri changed the title from "datetime granularity (Date and DateTime considered harmful)" to "datetime granularity (Date and DateTime considered in need of improvement)" on Oct 31, 2017
@github-actions

This issue is being tagged as Stale due to inactivity.

@github-actions bot added the no-issue-activity label on Jul 18, 2020