Date rounding with daylight saving time #640
Comments
|
Reproducible example? Version of R, lubridate, OS? I cannot reproduce the SO issues. Very likely the problem is no longer there in the newer versions of lubridate:
> dt_1 <- lubridate::ymd("2017-10-01", tz = "Australia/Adelaide") %>%
+ magrittr::add(lubridate::hours(c(0,1,23,24)))
>
> dt_2 <- lubridate::ymd("2017-04-02", tz = "Australia/Adelaide") %>%
+ magrittr::add(lubridate::hours(c(0,1,23,24)))
>
> lubridate::ceiling_date(dt_1, unit = "days")
[1] "2017-10-01 ACST" "2017-10-02 ACDT" "2017-10-02 ACDT" "2017-10-02 ACDT"
> lubridate::ceiling_date(dt_2, unit = "days")
[1] "2017-04-02 ACDT" "2017-04-03 ACST" "2017-04-03 ACST" "2017-04-03 ACST" |
|
Well that SO post is only 1 month old, and I'm using v1.7.2 on R 3.4.2 on Windows 10. I cannot reproduce the SO problem either, nor can I reproduce my own problem on a contrived dataset. But let me illustrate what's happening with force_tz... Creating a new date and then changing the timezone works as expected:
However, performing the same operations on the same value in my dataset, force_tz fails:
And these values are identical:
Crazy, no? |
|
Okay, I've fixed the problem. Here's the solution:
Now dateRange[1] behaves the same as example. The timezone for dateRange[1] was "", despite showing up as BST when printing the value. Maybe this isn't a bug as such, but it is unexpected behaviour that can wrong-foot new users. Is it worth adding a note in the documentation about explicitly setting time zones, even when they're apparently already set? (And of course this doesn't answer the original SO question, where time zones were explicitly set by the user.) |
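The fix itself did not survive in this thread, so the following is only a minimal base R sketch of how one might detect and set a missing time zone attribute. The helper `has_explicit_tz` and the zone "Europe/London" are assumptions for illustration, not the poster's actual code.

```r
# Hypothetical helper: TRUE only when the "tzone" attribute is a non-empty string.
has_explicit_tz <- function(x) {
  tz <- attr(x, "tzone")
  !is.null(tz) && nzchar(tz[1])
}

# With tz = "" (the default), R prints the session's local zone,
# but the stored attribute is still empty.
x <- as.POSIXct("2017-08-13 12:00:00")
has_explicit_tz(x)

# Setting the attribute changes how the value is displayed,
# not the underlying instant.
y <- x
attr(y, "tzone") <- "Europe/London"  # assumed zone, for illustration only
has_explicit_tz(y)
```

Checking the attribute rather than the printed value avoids being misled by the abbreviation R shows for the session's local zone.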
|
I see. Could you please post the output of the following:
Sys.timezone()
Sys.getenv("TZ")
lubridate:::C_local_tz()
as.POSIXct("2017-08-13", tz = "")
ymd("2017-08-13", tz = "")
force_tz(as.POSIXct("2017-08-13", tz = ""), "UTC")
I bet this one comes from the discrepancy between
In a nutshell, there is no way to determine the current time zone (that is, the time zone R uses when tz = ""), either from R itself or from the C code. I pointed this out to the R folks, but no one seems to give a damn. So from the next lubridate version I will try to remove all dependency on as.POSIXlt, just to avoid dealing with this issue. |
|
Certainly, here you go:
I wasn't expecting the last line to work, so I'm now thoroughly confused... |
|
This is strange indeed. The 1st and 3rd outputs indicate that lubridate's internal time zone and R's time zone match, so there should be no problem. Could it be the funky |
|
Could you please check the master on your vector? Thanks! |
|
Hey there, sorry - I haven't checked back on GitHub all week - apologies for the delay. And, second apology, I don't understand what you mean by the master on my vector? |
|
I meant to check the github master with |
|
Hi, I'm having a similar problem, with DST not being consistently applied. An example:
The data I receive comes with local timestamps, plus a column that flags the repeated hour when DST ends. Ideally, I would convert to UTC and add one hour for the repeated hours. However, 1:00 through 1:30 convert correctly, while 1:45 converts with one hour too many. |
|
By the nature of the problem, one cannot know whether the hour is repeated without an extra flag. You seem to have such a flag, so you could simply add an hour yourself:
timeSeries[repeatedHourFlag == TRUE, Time := Time + hours(1)]
BTW, on my system US/CENTRAL is not a valid time zone. Results are unpredictable with base R and you don't get a warning. With lubridate's functions you will get an error:
> time <- ymd_hms(c('2017-11-05 01:00:00',
+ '2017-11-05 01:15:00',
+ '2017-11-05 01:30:00',
+ '2017-11-05 01:45:00',
+ '2017-11-05 01:00:00',
+ '2017-11-05 01:15:00',
+ '2017-11-05 01:30:00',
+ '2017-11-05 01:45:00'),
+ tz = "US/CENTRAL")
Error in C_force_tz(time, tz = tzone, roll) :
CCTZ: Unrecognized output timezone: "US/CENTRAL"
> ymd_hms(c('2017-11-05 01:00:00',
+ '2017-11-05 01:15:00',
+ '2017-11-05 01:30:00',
+ '2017-11-05 01:45:00',
+ '2017-11-05 01:00:00',
+ '2017-11-05 01:15:00',
+ '2017-11-05 01:30:00',
+ '2017-11-05 01:45:00'),
+ tz = "America/New_York")
[1] "2017-11-05 01:00:00 EST" "2017-11-05 01:15:00 EST"
[3] "2017-11-05 01:30:00 EST" "2017-11-05 01:45:00 EST"
[5] "2017-11-05 01:00:00 EST" "2017-11-05 01:15:00 EST"
[7] "2017-11-05 01:30:00 EST" "2017-11-05 01:45:00 EST" |
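To illustrate why a flag is needed, here is a small base R sketch. It assumes the data is in US Central time under the IANA name "America/Chicago", and the `repeated` vector is a hypothetical stand-in for the flag column. The offsets (5 hours for CDT, 6 hours for CST) apply only to the ambiguous hour at the end of DST.

```r
# Two distinct UTC instants, one hour apart, around the end of DST in 2017.
utc <- as.POSIXct(c("2017-11-05 06:45:00", "2017-11-05 07:45:00"), tz = "UTC")

# Both show the same local wall-clock time in US Central,
# so the wall-clock time alone cannot tell them apart.
local_wall <- format(utc, "%Y-%m-%d %H:%M:%S", tz = "America/Chicago")

# Hypothetical flag: TRUE marks the second (repeated) pass through 01:xx.
repeated <- c(FALSE, TRUE)

# Parse the wall-clock time as if it were UTC, then add the UTC offset
# selected by the flag: 5 hours for CDT (first pass), 6 hours for CST (repeated).
wall <- as.POSIXct(local_wall, tz = "UTC")
offset_hours <- ifelse(repeated, 6, 5)
utc_from_flag <- wall + offset_hours * 3600
```

Doing the arithmetic on a UTC-parsed wall-clock value sidesteps however the platform happens to resolve the ambiguous local time.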
y <- floor_date(x, unit = "month")
Returns values for y at 00:00 during GMT, but 01:00 during BST.
This poster on Stack Overflow seems to be experiencing similar issues with ceiling_date: https://stackoverflow.com/questions/48125040/lubridate-ceiling-date-bug-with-daylight-savings.
My workaround has been to convert all datetimes to GMT, but I think this means that any payments between 00:00 and 01:00 during summer will be allocated to the wrong date.
Another workaround was this:
y <- floor_date(x, unit = "month") %>% floor_date(unit = "day")
but I'm worried it'll suffer from the same problem.
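As a rough alternative sketch, a month floor can be computed in base R by formatting the local date and re-parsing it in the same zone, so the result is local midnight regardless of GMT or BST. The helper name `floor_month_local` and the zone "Europe/London" are assumptions here, and this is not how floor_date is implemented.

```r
# Hypothetical helper: first day of the month at local midnight in `tz`.
floor_month_local <- function(x, tz) {
  # Format the local calendar date with the day forced to 01 ...
  first_day <- format(x, "%Y-%m-01", tz = tz)
  # ... then parse it back in the same zone, which yields local midnight.
  as.POSIXct(first_day, tz = tz)
}

x <- as.POSIXct(c("2017-07-15 10:30:00", "2017-01-15 10:30:00"),
                tz = "Europe/London")
y <- floor_month_local(x, "Europe/London")

# Local wall-clock time is midnight in both summer (BST) and winter (GMT).
format(y, "%H:%M", tz = "Europe/London")
```

Viewed in UTC, the summer value falls at 23:00 on the previous day, which is the same instant as local midnight during BST.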