Date rounding with daylight saving time #640

Closed
tomwwagstaff opened this issue Feb 20, 2018 · 12 comments

@tomwwagstaff

commented Feb 20, 2018

y <- floor_date(x, unit = "month")

Returns values for y at 00:00 during GMT, but 01:00 during BST.

This poster on Stack Overflow seems to be experiencing similar issues with ceiling_date: https://stackoverflow.com/questions/48125040/lubridate-ceiling-date-bug-with-daylight-savings.

My workaround has been to convert all datetimes to GMT, but I think this means that any payments between 00:00 and 01:00 during Summer will be allocated to the wrong date.

Another workaround was this:
y <- floor_date(x, unit = "month") %>% floor_date(unit = "day")
but I'm worried it'll suffer from the same problem.
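
For reference, a minimal sketch of the behaviour expected here, assuming the input carries an explicit `tzone` attribute. The sample timestamps are made up for illustration and are not taken from the dataset in this issue. With an explicit zone, rounding down to the month should land on local midnight in both GMT and BST:

```r
library(lubridate)

# Hypothetical sample values: one winter (GMT) and one summer (BST) timestamp.
x <- ymd_hms(c("2017-01-15 12:00:00", "2017-08-13 00:30:00"),
             tz = "Europe/London")

# Round down to the start of the month in the vector's own time zone.
y <- floor_date(x, unit = "month")

# Show local wall-clock time and zone abbreviation for each result.
format(y, "%Y-%m-%d %H:%M:%S %Z")

# Local hour of each rounded value; 0 means local midnight.
hour(y)
```

If `hour(y)` is not 0 for every element, the `tzone` attribute of the input is worth inspecting with `attr(x, "tzone")`.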

@vspinu

Member

commented Feb 20, 2018

Reproducible example? Version of R, lubridate, OS?

I cannot reproduce the SO issues. Very likely the problem is no longer there in the newer versions of lubridate:

> dt_1 <- lubridate::ymd("2017-10-01", tz = "Australia/Adelaide") %>%
+     magrittr::add(lubridate::hours(c(0,1,23,24)))
> 
> dt_2 <- lubridate::ymd("2017-04-02", tz = "Australia/Adelaide") %>%
+     magrittr::add(lubridate::hours(c(0,1,23,24)))
> 
> lubridate::ceiling_date(dt_1, unit = "days")
[1] "2017-10-01 ACST" "2017-10-02 ACDT" "2017-10-02 ACDT" "2017-10-02 ACDT"
> lubridate::ceiling_date(dt_2, unit = "days")
[1] "2017-04-02 ACDT" "2017-04-03 ACST" "2017-04-03 ACST" "2017-04-03 ACST"

@tomwwagstaff

Author

commented Feb 21, 2018

Well that SO post is only 1 month old, and I'm using v1.7.2 on R 3.4.2 on Windows 10.

I cannot reproduce the SO problem either, nor can I reproduce my own problem on a contrived dataset. But let me illustrate what's happening with force_tz...

Creating a new date and then changing the timezone works as expected:

example <- as.POSIXct("2017-08-13", "Europe/London")
example
[1] "2017-08-13 BST"
example %>% force_tz(tzone = "Etc/GMT")
[1] "2017-08-13 GMT"
example %>% with_tz(tzone = "Etc/GMT")
[1] "2017-08-12 23:00:00 GMT"

However, performing the same operations on the same value in my dataset, force_tz fails:

dateRange[1]
[1] "2017-08-13 BST"
dateRange[1] %>% force_tz(tzone = "Etc/GMT")
[1] "2017-08-12 23:00:00 GMT"
dateRange[1] %>% with_tz(tzone = "Etc/GMT")
[1] "2017-08-12 23:00:00 GMT"

And these values are identical:

example == dateRange[1]
[1] TRUE
example %>% force_tz(tzone = "Etc/GMT") == dateRange[1] %>% force_tz(tzone = "Etc/GMT")
[1] FALSE

Crazy, no?

@tomwwagstaff

Author

commented Feb 21, 2018

Okay, I've fixed the problem. Here's the solution:

attr(dateRange, "tzone") <- "Europe/London"

Now dateRange[1] behaves the same as example. The timezone for dateRange[1] was "", despite showing up as BST when printing the value.

Maybe this isn't a bug as such, but it is unexpected behaviour that can wrong-foot new users. Is it worth adding a note in the documentation about explicitly setting time zones, even when they're apparently already set?

(And of course this doesn't answer the original SO question, where time zones were explicitly set by the user.)
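
A small base-R sketch of the check that would have caught this, assuming the vector was created with `tz = ""` (the helper name `has_explicit_tz` is made up for illustration). Printing falls back to the session time zone, so an empty `tzone` attribute is invisible in the printed output:

```r
# Hypothetical helper: TRUE only if the tzone attribute is set and non-empty.
has_explicit_tz <- function(x) {
  tz <- attr(x, "tzone")
  !is.null(tz) && nzchar(tz[1])
}

implicit <- as.POSIXct("2017-08-13", tz = "")              # session time zone
explicit <- as.POSIXct("2017-08-13", tz = "Europe/London")

has_explicit_tz(implicit)  # FALSE
has_explicit_tz(explicit)  # TRUE

# Setting the attribute changes how the instant is displayed, not the instant
# itself, so this matches `explicit` only when the session zone is Europe/London.
attr(implicit, "tzone") <- "Europe/London"
has_explicit_tz(implicit)  # TRUE
```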

@vspinu

Member

commented Feb 21, 2018

I see. Could you please post the output of the following:

Sys.timezone()
Sys.getenv("TZ")
lubridate:::C_local_tz()
as.POSIXct("2017-08-13", tz = "")
ymd("2017-08-13", tz = "")
force_tz(as.POSIXct("2017-08-13", tz = ""), "UTC")

I bet this one comes from a discrepancy between Sys.timezone() and the time zone inferred by R when tz = "". If so, it's a version of #619. I thought this was an issue with R-devel only, but it looks like it goes back to at least R 3.4.2.

In a nutshell, there is no way to determine the current time zone (i.e. the time zone used by R when tz = "") either from R itself or from the C code. I pointed this out to the R folks, but no one seems to give a damn. So from the next lubridate version I will try to remove all dependency on as.POSIXlt just to avoid dealing with this issue.

@tomwwagstaff

Author

commented Feb 22, 2018

Certainly, here you go:

> Sys.timezone()
[1] "Europe/London"
> Sys.getenv("TZ")
[1] ""
> lubridate:::C_local_tz()
[1] "Europe/London"
> as.POSIXct("2017-08-13", tz = "")
[1] "2017-08-13 BST"
> ymd("2017-08-13", tz = "")
[1] "2017-08-13 BST"
> force_tz(as.POSIXct("2017-08-13", tz = ""), "UTC")
[1] "2017-08-13 UTC"

I wasn't expecting the last line to work, so I'm now thoroughly confused...

@vspinu

Member

commented Feb 22, 2018

This is strange indeed. The 1st and 3rd outputs indicate that lubridate's internal time zone and R's time zone match, so there should be no problem.

Could it be the funky "Etc/GMT" time zone string which you use? I would really need a minimal example here to understand what's going on.

@vspinu

Member

commented Feb 22, 2018

The underlying cause is probably the same as in #642 and #643.

@vspinu

Member

commented Feb 22, 2018

Could you please check the master on your vector? Thanks!

@tomwwagstaff

Author

commented Feb 28, 2018

Hey there, sorry - I haven't checked back on GitHub all week - apologies for the delay.

And, second apology, I don't understand what you mean by the master on my vector?

@vspinu

Member

commented Feb 28, 2018

I meant: please check your vector against the GitHub master, which you can install with devtools::install_github("tidyverse/lubridate").

@leobarlach

commented Mar 20, 2018

Hi, I'm having a similar problem, with DST not being consistently applied.

An example:

Time <- c(as.POSIXct('2017-11-05 01:00:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:15:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:30:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:45:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:00:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:15:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:30:00',tz = 'US/CENTRAL'),
       as.POSIXct('2017-11-05 01:45:00',tz = 'US/CENTRAL'))

repeatedHourFlag <- c(F,F,F,F,T,T,T,T)
timeSeries <- data.table(Time,repeatedHourFlag)

timeSeries[, Time := lubridate::with_tz(Time,tzone = 'UTC')]
timeSeries

The data I receive comes with local timestamps, plus a column flagging the repeated hour at the end of DST. Ideally, I would convert to UTC and add one hour for the repeated hours. However, while 1:00 through 1:30 convert correctly, 1:45 is converted with one hour too many.

@vspinu

Member

commented Mar 22, 2018

By the nature of the problem, one cannot know whether the hour is repeated without an extra flag. You seem to have such a flag, so you could simply add an hour yourself:

timeSeries[repeatedHourFlag == T, Time := Time + hours(1)]

BTW, on my system US/CENTRAL is not a valid time zone. With base R the results are unpredictable and you don't get a warning. With lubridate's functions you will get an error:

 > time <- ymd_hms(c('2017-11-05 01:00:00',
+                   '2017-11-05 01:15:00',
+                   '2017-11-05 01:30:00',
+                   '2017-11-05 01:45:00',
+                   '2017-11-05 01:00:00',
+                   '2017-11-05 01:15:00',
+                   '2017-11-05 01:30:00',
+                   '2017-11-05 01:45:00'),
+                 tz = "US/CENTRAL")
Error in C_force_tz(time, tz = tzone, roll) : 
  CCTZ: Unrecognized output timezone: "US/CENTRAL"

> ymd_hms(c('2017-11-05 01:00:00',
+           '2017-11-05 01:15:00',
+           '2017-11-05 01:30:00',
+           '2017-11-05 01:45:00',
+           '2017-11-05 01:00:00',
+           '2017-11-05 01:15:00',
+           '2017-11-05 01:30:00',
+           '2017-11-05 01:45:00'),
+         tz = "America/New_York")
[1] "2017-11-05 01:00:00 EST" "2017-11-05 01:15:00 EST"
[3] "2017-11-05 01:30:00 EST" "2017-11-05 01:45:00 EST"
[5] "2017-11-05 01:00:00 EST" "2017-11-05 01:15:00 EST"
[7] "2017-11-05 01:30:00 EST" "2017-11-05 01:45:00 EST"
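
The flag-based idea can also be sketched without relying on how ambiguous local times are resolved during parsing. This is only an illustration under stated assumptions: it uses made-up sample values, covers only timestamps inside the repeated hour, and hard-codes the US Central offsets for that date (UTC-5 for CDT, UTC-6 for CST). The wall-clock times are parsed as if they were UTC and then shifted by the offset implied by the flag:

```r
# Sample wall-clock readings from the repeated hour on 2017-11-05 (illustrative).
wall <- c("2017-11-05 01:45:00", "2017-11-05 01:45:00")
repeated <- c(FALSE, TRUE)  # TRUE marks the second pass through 01:00-02:00

# Parse the wall-clock text as UTC so that no DST rule is applied.
naive <- as.POSIXct(wall, tz = "UTC")

# Assumed offsets: first pass is CDT (UTC-5), second pass is CST (UTC-6).
offset_hours <- ifelse(repeated, 6, 5)

# Adding the offset turns local wall-clock time into the true UTC instant.
utc <- naive + offset_hours * 3600

format(utc, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

To view the result in local time, `format(utc, tz = "America/Chicago", usetz = TRUE)` can be used; `America/Chicago` is the canonical IANA name for US Central time.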

vspinu closed this on Apr 10, 2018
