Time formats should be more consistent and portable across platforms #487

Closed
OriHoch opened this issue Jul 6, 2017 · 11 comments · Fixed by frictionlessdata/datapackage-v2-draft#23

Comments

@OriHoch
Contributor

OriHoch commented Jul 6, 2017

The specs currently support time/date formats which are not portable and might produce different results on different platforms / library implementations:

  • any -
    • not defined at all, supports anything thrown at it. Can produce different results depending on the implementation.
  • strptime format string -
    • strptime is not a defined standard, it is an implementation detail that works differently on different platforms and library implementations, some examples:
    • PHP - supports strptime, but silently ignores timezone details
    • Python 2.7 - doesn't support timezone (raises exception when %z part is specified)
    • strptime is platform dependent, and might work differently on the OS level in different platforms (I understand that on Windows it has some major problems)

I think we should drop the additional formats and support only the ISO standard format.

Another option - support only a small subset of strptime, and validate that only that subset is used.
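
A minimal sketch of that second option, assuming a hypothetical allow-list of directives (the thread does not define one). The check scans a format string and reports every directive outside the list, so a tool could reject a schema before any parsing happens. `ALLOWED_DIRECTIVES` and `unsupported_directives` are illustrative names, not part of any spec or library.

```python
import re

# Hypothetical allow-list: an assumption for illustration, not taken from the spec.
ALLOWED_DIRECTIVES = {"%Y", "%m", "%d", "%H", "%M", "%S", "%%"}


def unsupported_directives(fmt):
    """Return the directives in fmt that are outside the allow-list, in order.

    A lone trailing "%" is reported as "%", since it is not a valid directive.
    """
    found = re.findall(r"%.?", fmt, flags=re.DOTALL)
    return [d for d in found if d not in ALLOWED_DIRECTIVES]


if __name__ == "__main__":
    print(unsupported_directives("%Y-%m-%dT%H:%M:%S"))
    print(unsupported_directives("%d/%m/%y %Z"))
```

Rejecting at the format-string level keeps the platform-specific behaviour of `%z`, `%Z`, and locale-dependent codes out of the picture entirely.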

@ezwelty
Contributor

ezwelty commented Jul 15, 2017

@OriHoch Totally agree.

This is an example where limiting choice probably benefits everyone (data authors, consumers, and developers). I'd suggest dropping formats any and default, scratching field types date, time, datetime, year, and yearmonth, and requiring a pattern among the following subset of the core ISO 8601 standard:

  • YYYY-MM-DDThh:mm:ssZ / %Y-%m-%dT%H:%M:%SZ – UTC
  • YYYY-MM-DDThh:mm:ss / %Y-%m-%dT%H:%M:%S – Unknown time zone
  • YYYY-MM-DDThh:mm(Z) / %Y-%m-%dT%H:%M(Z)
  • YYYY-MM-DDThh(Z) / %Y-%m-%dT%H(Z)
  • YYYY-MM-DD / %Y-%m-%d
  • YYYY-MM / %Y-%m
  • YYYY / %Y

In other words, either you know the time zone and convert to UTC (and tag with "Z"), or you don't know the time zone (and drop the "Z").
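
The subset above is small enough to check with one regular expression. The following is only a sketch under that reading of the list: `SUBSET` and `parse_subset` are hypothetical names, and filling missing fields with their minimum values (so `YYYY` becomes January 1st at midnight) is my own simplification that discards the original precision.

```python
import re
from datetime import datetime, timezone

# One pattern for the whole list: each finer-grained part is optional,
# and "Z" is only accepted once an hour is present.
SUBSET = re.compile(
    r"(?P<year>\d{4})"
    r"(?:-(?P<month>\d{2})"
    r"(?:-(?P<day>\d{2})"
    r"(?:T(?P<hour>\d{2})"
    r"(?::(?P<minute>\d{2})"
    r"(?::(?P<second>\d{2}))?)?"
    r"(?P<utc>Z)?)?)?)?"
)


def parse_subset(value):
    """Parse a value from the proposed subset.

    Returns an aware datetime (UTC) if the value ends in "Z",
    otherwise a naive datetime (unknown time zone).
    """
    m = SUBSET.fullmatch(value)
    if m is None:
        raise ValueError(f"not in the supported subset: {value!r}")
    parts = {
        key: int(text)
        for key, text in m.groupdict().items()
        if key != "utc" and text is not None
    }
    tz = timezone.utc if m.group("utc") else None
    return datetime(
        parts["year"],
        parts.get("month", 1),
        parts.get("day", 1),
        parts.get("hour", 0),
        parts.get("minute", 0),
        parts.get("second", 0),
        tzinfo=tz,
    )
```

Using a regex rather than `strptime` means the check does not depend on how a given platform implements the format codes, which is the portability concern raised in this issue.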

@rufuspollock
Contributor

@ezwelty @OriHoch this was a classic trade-off of supporting publishers in describing data as is vs a desirable strictness for consumers. Remember the specs have to balance supporting publishers who may be publishing data they don't control with making it easier for tool writers to use the spec.

I'm not sure we've yet got the trade-off right for date formats, but what I would say is you see a lot of variety in date formats in the wild (think of what you see in Google Spreadsheets or Excel as options). We want to support that as far as we can because many publishers may be constrained to use that. At the same time I get that this is problematic for consumers and tool authors and that what we have there may not yet be specific enough. My request here would be to see what we can do with v1 as is and consider revisions for v1.1 based on more experience in the wild.

@OriHoch
Contributor Author

OriHoch commented Jul 19, 2017

given this constraint, I would suggest the following for v1.1:

  • any format - should be defined explicitly in the spec (can be as simple as a list of date/time formats to try according to priority)
  • strptime - define in the spec the exact format codes supported, tools should validate that only those format codes are used
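
To illustrate the first bullet, here is a rough sketch of `any` defined as an explicit, ordered list of formats. The list itself (`ANY_FORMATS`) is purely hypothetical; the point is only that once the order is written down, an ambiguous value such as 03/04/2017 resolves the same way in every implementation.

```python
from datetime import datetime

# Hypothetical priority list: an assumption for illustration, not a proposal
# for the actual contents of the spec.
ANY_FORMATS = [
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
    "%d/%m/%Y",  # day-first is tried before month-first
    "%m/%d/%Y",
]


def parse_any(value, formats=ANY_FORMATS):
    """Try each format in priority order; return (datetime, matched_format)."""
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt), fmt
        except ValueError:
            continue
    raise ValueError(f"no listed format matches {value!r}")
```

Returning the matched format alongside the value lets a tool report which interpretation it picked, which seems useful when the input is ambiguous.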

@pwalsh pwalsh added this to the v1.1 milestone Jul 24, 2017
@pwalsh
Member

pwalsh commented Oct 9, 2017

@OriHoch ok, good idea. Let's come back to this when we start working on the v1.1 release.

@Stephen-Gates
Contributor

This may also help infer formats frictionlessdata/tableschema-js#98

@Stephen-Gates
Contributor

As a data publisher, I strongly support Rufus Pollock's comment above. I would like to see time zone support added for ISO 8601 formats e.g. 2016-12-25T00:01:01Z+10

@ezwelty
Contributor

ezwelty commented Oct 31, 2017

@Stephen-Gates Probably a typo, but just in case: 2016-12-25T00:01:01Z+10 should be written as 2016-12-25T00:01:01+10 (or +10:00 or +1000). The Z is a shorthand for +00:00, aka UTC. See
https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators

It's difficult to implement the entire ISO 8601 spec, but I think it'd be reasonable to extend the list I suggested above (#487 (comment)) with support for non UTC ('Z') time zone designators ±hh:mm, ±hhmm, and ±hh.
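
For what it's worth, the designators mentioned above (Z, ±hh:mm, ±hhmm, ±hh) can be recognised with a short pattern. This is a sketch only; `TZ_DESIGNATOR` and `parse_designator` are hypothetical names, and it handles the designator on its own rather than a full timestamp.

```python
import re
from datetime import timedelta, timezone

# Accepts "Z", "+hh", "+hhmm", "+hh:mm" and the "-" equivalents, nothing else.
TZ_DESIGNATOR = re.compile(r"Z|(?P<sign>[+-])(?P<hh>\d{2})(?::?(?P<mm>\d{2}))?")


def parse_designator(text):
    """Convert an ISO 8601 time zone designator into a datetime.timezone."""
    m = TZ_DESIGNATOR.fullmatch(text)
    if m is None:
        raise ValueError(f"invalid time zone designator: {text!r}")
    if text == "Z":
        return timezone.utc
    minutes = int(m.group("hh")) * 60 + int(m.group("mm") or 0)
    if m.group("sign") == "-":
        minutes = -minutes
    # timezone() itself raises ValueError for offsets of 24 hours or more.
    return timezone(timedelta(minutes=minutes))
```

With this pattern the mistyped Z+10 form is rejected, while +10, +10:00, and +1000 all map to the same offset.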

@Stephen-Gates
Contributor

Thanks @ezwelty. Including the Z is my mistake.

@roll roll added this to Specifications in Frictionless General Mar 19, 2019
@roll roll modified the milestones: v1.1, v2 Apr 14, 2023
@roll roll changed the title tableschema: time formats should be more consistent and portable across platforms Time formats should be more consistent and portable across platforms Jan 3, 2024
@roll
Member

roll commented Jan 25, 2024

From @stevage:


"date"/default allows any ISO8601 format, which is incredibly broad (and includes rarely supported features like recurring dates and intervals). Do we intend this?
"date"/datetime requires UTC. Do we not allow times without timezones?
"date": Do we not allow times with milliseconds?


From Rufus:


@pwalsh i agree we should restrict date to yyyy-mm-dd[+T stuff +TZ stuff] strictly. wdyt.
Agree basically on all these simplifications.

@roll
Member

roll commented Jan 26, 2024

From @peterdesmet


Frictionless Framework will correctly parse the following datetimes with a format = default (datapackage.json.zip):

2013-11-23T08:30:00      # No timezone
2013-11-23T08:30:00Z     # UTC time
2013-11-23T06:30:00-0200 # Timezone offset

This is great! But according to the specs only UTC times should be supported (excluding offsets or no timezone):

default: An ISO8601 format string e.g. YYYY-MM-DDThh:mm:ssZ in UTC time

Is the more broad support intentional? Should the specs be updated to drop the in UTC time?
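
To make the distinction between the three values concrete, here is a small sketch using only the Python standard library (3.7 or later, where `%z` also accepts Z). It is not how Frictionless Framework parses them; `parse_default_datetime` is a hypothetical helper for illustration.

```python
from datetime import datetime


def parse_default_datetime(value):
    """Parse YYYY-MM-DDThh:mm:ss with an optional Z or numeric offset.

    Values with a designator come back timezone-aware; values without
    one come back naive (tzinfo is None).
    """
    for fmt in ("%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%dT%H:%M:%S"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported datetime: {value!r}")


if __name__ == "__main__":
    for text in (
        "2013-11-23T08:30:00",
        "2013-11-23T08:30:00Z",
        "2013-11-23T06:30:00-0200",
    ):
        parsed = parse_default_datetime(text)
        print(text, "->", parsed.isoformat(), "| utcoffset:", parsed.utcoffset())
```

The second and third values denote the same instant, while the first cannot be compared with either of them without assuming a time zone, which is why the wording of the spec matters here.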

Originating issue: tdwg/camtrap-dp#333

@OriHoch
Contributor Author

OriHoch commented Feb 20, 2024

🙌
