Feature Request: as.ISO8601 #629

Closed
billdenney opened this Issue Jan 29, 2018 · 7 comments

billdenney commented Jan 29, 2018

as.ISO8601 methods would be useful for all the date, date/time, period, duration, and interval functions.

To make this as useful as possible, I think it would be better not to use as.character and instead to create a new as.ISO8601 generic function with these arguments:

  • x: The object to convert
  • format: This would depend on what is being represented:
    • With dates, it could be "ymd" (year month day), "ywd" (year week day), "yd" (year day in YYYY-DDD format), "y" (year only), "m" (month only), "md" (month day only)
    • With times, it could be "hms", "hm", or "h"
    • With date/times, it could be any pair of the options for dates and times separated by a "T" (default "ymdThms"). I chose "T" rather than something else to align with the expected output as defined in the ISO standard.
    • Periods would share the format specifier with date, times, or date/times.
  • include_tz: if NULL (default), include/exclude the time zone from the date/time based on the input data. If FALSE, exclude the time zone from the representation, and if TRUE include the time zone in the representation. (It would be ignored if time zone does not apply to the output format.)
  • repeat: (default: NULL) For intervals, repeating could be specified by setting to a non-NULL, positive integer or Inf for indefinite, future repeats.
  • repeat_start: (default: NULL) For intervals, repeat start date/time. (Must be a date, time, or date/time object type.)

(As initially discussed in #362)
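
To make the three date representations named in the format argument concrete, here is a minimal sketch in Python rather than R. The function name `iso_date` and its `fmt` values are hypothetical, borrowed from the proposal above; this is not lubridate's API, only an illustration of what "ymd", "ywd", and "yd" output would look like.

```python
from datetime import date


def iso_date(d: date, fmt: str = "ymd") -> str:
    """Format a date in one of three ISO 8601 date representations.

    Hypothetical helper: the fmt names mirror the proposal in this issue.
    """
    if fmt == "ymd":
        # Calendar date: YYYY-MM-DD
        return d.isoformat()
    if fmt == "ywd":
        # Week date: ISO year, ISO week number, ISO weekday (Monday = 1)
        iso_year, iso_week, iso_weekday = d.isocalendar()
        return f"{iso_year:04d}-W{iso_week:02d}-{iso_weekday}"
    if fmt == "yd":
        # Ordinal date: year plus day of year, YYYY-DDD
        return f"{d.year:04d}-{d.timetuple().tm_yday:03d}"
    raise ValueError(f"unsupported format: {fmt!r}")


d = date(2018, 1, 29)
print(iso_date(d, "ymd"))  # 2018-01-29
print(iso_date(d, "ywd"))  # 2018-W05-1
print(iso_date(d, "yd"))   # 2018-029
```

Note that the week-date form uses the ISO year, which can differ from the calendar year around January 1.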

vspinu commented Jan 29, 2018

We should name it something else I guess. There is no class ISO8601, so as methods would be a misnomer. I think the format is not really needed because it's a conversion from lubridate objects to characters which is generally unambiguous. Not sure about repeat either. Lubridate's intervals don't support repetitions.

billdenney commented Jan 30, 2018

For a name, as.character_ISO8601? Or, we could put it into the as.character methods with an argument of format="ISO8601"?

For the format argument: while I personally wouldn't use them, the standard defines several representations for dates: "YYYY-MM-DD", "YYYY-WWW-D", and "YYYY-DDD". Those seem like real formats. For the other date formats, does lubridate have a way to represent a missing year with a specified month and day? (I've not seen one, but maybe I'm missing something.) If not, it can still be helpful to have that as an output format. If it is possible to represent, then I agree that using the simpler 1:1 character relationship to the object makes sense; can you please point me to the way to represent it?

For repeat: no, lubridate's intervals don't support repetition, but that would be information added at the time of character conversion. If that is not preferred, repetitions are pretty simple to add manually, so I don't feel strongly.

vspinu commented Jan 30, 2018

we could put it into the as.character methods with an argument of format="ISO8601"?

This can work but the drawback is that you cannot add methods to classes which you don't own (like difftime). Maybe format_iso or format_ISO8601, or just ISO8601?

For the other date formats, does lubridate have a way to represent a missing year with a specified month and day

Nope. There are two core date-time classes - Date and POSIX.

The final implementation should consider the parser - it should allow for the round trip. Currently not everything in ISO8601 is supported and implementing all those ISO tweaks and tricks is, quite frankly, out of scope.
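
The round-trip requirement can be stated as a simple check: formatting a value and parsing the result should give back the original value. A minimal sketch in Python (not R, and not lubridate's parser), using only the standard library's `isoformat` and `fromisoformat`:

```python
from datetime import datetime, timezone

# A time-zone-aware value to format and parse back.
original = datetime(2018, 1, 30, 12, 34, 56, tzinfo=timezone.utc)

# Format to an ISO 8601 string, then parse that string again.
text = original.isoformat()
parsed = datetime.fromisoformat(text)

print(text)  # 2018-01-30T12:34:56+00:00

# The round trip holds if the parsed value equals the original.
assert parsed == original
```

A formatter that emits variants the parser does not accept would fail this check, which is the argument for starting with the standard 1:1 representations.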

there are different representations of the standard for dates, "YYYY-MM-DD" and "YYYY-WWW-D" and "YYYY-DDD".

With small effort anyone can write their own formatter with format directly. So I would suggest that we first aim at standard 1:1 representations and see if we really need anything else once that's done.

billdenney commented Jan 30, 2018

I like format_ISO8601. It's clear in both intent (format) and target (ISO8601).

The final implementation should consider the parser - it should allow for the round trip. [...]

Perfectly fair. If I find myself needing something more esoteric, I may write an ISO8601 library. For now, I'm happier with it living in lubridate.

I'll just keep the x and include_tz arguments described in the original comment for this issue. I'll add a ... argument so that someone (maybe me) could later extend the generic in another package.

vspinu commented Jan 30, 2018

Maybe with_tz or add_tz to make the argument a bit shorter. If it's UTC, then Z should always be added, I think.

billdenney commented Jan 30, 2018

with_tz sounds like a good name to me. I hesitate to always add the time zone if UTC or if it is the local time zone since those tend to be added accidentally or by default to a lot of datasets (at least ones that I receive).

billdenney commented Jul 26, 2018

I wrote this feature tonight. A few differences relative to our conversation above:

  1. Instead of with_tz, I named the argument usetz to align with the base R as.character method argument.
  2. I didn't do the additional formatting for adding Z instead of -0000 for UTC because that would have required a notable amount of additional code. I can do it, but it didn't seem worth the longer-term maintenance. If you'd like Z, I can make that update.
  3. I added a precision argument to allow the user to request the output precision (I often need "ymdhm" without seconds). It would be a simple modification to make that formatting match the function naming (e.g. ymd_hm). Let me know if you have a preference there.
  4. An odd combination of precision="y", usetz=TRUE gives unusual results (https://github.com/billdenney/lubridate/blob/a3b13ec6813397abae6fff3792062e2be17299fc/tests/testthat/test-format_ISO8601.R#L36-L40). I left it as is because it's what the user is requesting, even if it doesn't make much sense.
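
As a rough illustration of how the usetz and precision arguments interact, here is a sketch in Python rather than R. The function `format_iso8601`, the precision table, and the `utc_as_z` flag are assumptions made for this example. They are not taken from the lubridate implementation, and the output of the actual R function may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical precision names mapped to strftime patterns.
_PRECISION = {
    "y": "%Y",
    "ym": "%Y-%m",
    "ymd": "%Y-%m-%d",
    "ymdh": "%Y-%m-%dT%H",
    "ymdhm": "%Y-%m-%dT%H:%M",
    "ymdhms": "%Y-%m-%dT%H:%M:%S",
}


def format_iso8601(dt: datetime, usetz: bool = False,
                   precision: Optional[str] = None,
                   utc_as_z: bool = False) -> str:
    """Format dt at the requested precision, optionally with its offset."""
    pattern = _PRECISION["ymdhms" if precision is None else precision]
    text = dt.strftime(pattern)
    if usetz and dt.tzinfo is not None:
        if utc_as_z and dt.utcoffset() == timedelta(0):
            # Optional shorthand for a zero offset (assumed flag).
            text += "Z"
        else:
            # Numeric offset such as +0000 or -0500.
            text += dt.strftime("%z")
    return text


dt = datetime(2018, 7, 26, 21, 5, 9, tzinfo=timezone.utc)
print(format_iso8601(dt, precision="ymdhm"))            # 2018-07-26T21:05
print(format_iso8601(dt, usetz=True))                   # 2018-07-26T21:05:09+0000
print(format_iso8601(dt, usetz=True, utc_as_z=True))    # 2018-07-26T21:05:09Z
print(format_iso8601(dt, usetz=True, precision="y"))    # 2018+0000
```

The last call shows why a year-only precision combined with a time zone looks odd in this sketch: the offset is appended to a string that carries no time of day.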

@vspinu vspinu closed this in #700 Jul 27, 2018
