[Bug] CJ() loses timezone of POSIXct vector #2029

Closed
MarkusBonsch opened this issue Feb 16, 2017 · 1 comment · Fixed by #2150

@MarkusBonsch MarkusBonsch commented Feb 16, 2017

I have a problem when using CJ() with POSIXct objects.

When CJ() builds a data.table from a POSIXct vector with timezone UTC plus a second grouping vector, the resulting POSIXct column loses its tzone attribute and is therefore shown in the local timezone.
If the POSIXct vector is the only input to CJ(), the timezone is preserved.
Plain data.table() also preserves the timezone.

See below for a reproducible example and my sessionInfo(). This may be related to #1778.

library(data.table)

## get groups and dates as input for CJ()
input_groups <- letters[1:3]
input_times  <- as.POSIXct(sprintf("2016-%02d-01", 1:12), tz = "UTC")

## input_times timezone is UTC
all(lapply(input_times, attr, "tzone") == "UTC")
# [1] TRUE

## the following shows that CJ changes timezones to NULL (local time).
test <- CJ(group = input_groups, test_time = input_times)
## test$test_time timezone is not UTC but NULL
all(lapply(test$test_time, attr, "tzone") == "UTC")
# [1] FALSE
unique(lapply(test$test_time, attr, "tzone"))
# [[1]]
# NULL

## if only a single POSIXct vector is supplied, timezone is preserved.
test2 <- CJ(test_time = input_times)
all(lapply(test2$test_time, attr, "tzone") == "UTC")
# [1] TRUE

## also, plain data.table() behaves as expected (same cross product, built by hand).
test3 <- data.table(group = rep(input_groups, each = length(input_times)), date = rep(input_times, length(input_groups)))
setkey(test3, group, date)
all(lapply(test3$date, attr, "tzone") == "UTC")
# [1] TRUE

Here is my sessionInfo()

R version 3.3.0 (2016-05-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.10.4

loaded via a namespace (and not attached):
[1] tools_3.3.0

Kind regards,
Markus Bonsch


@RoyalTS RoyalTS commented Apr 19, 2017

Can confirm this.

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.4

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.10.4

loaded via a namespace (and not attached):
[1] tools_3.3.2

As far as I can tell, the cause is stated explicitly in a comment inside the CJ() function definition:

# using rep.int instead of rep speeds things up considerably (but attributes are dropped).
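
The trade-off in that comment can be seen in base R alone. The following is a minimal sketch, not code taken from data.table; the variable names x, kept and dropped are only illustrative.

```r
# Illustration only: compare rep() and rep.int() on a POSIXct value.
x <- as.POSIXct("2016-01-01", tz = "UTC")

kept    <- rep(x, 2L)      # dispatches to the POSIXct method of rep()
dropped <- rep.int(x, 2L)  # fast path, returns a vector without attributes

attr(kept, "tzone")     # timezone attribute as seen after rep()
attr(dropped, "tzone")  # timezone attribute as seen after rep.int()
```

Since rep.int() is documented to return no attributes, the second result should carry neither the class nor the tzone of the input.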

Couldn't one simply re-apply the attributes of the input columns to the output columns after the Cartesian product has been constructed, right around line 356 in https://github.com/Rdatatable/data.table/blob/fb03ad184de08535e28d772dfe172f5ddf384a52/R/setkey.R?
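
A rough sketch of what re-applying the attributes could look like, in base R only. The helper cross_keep_attrs() is hypothetical and is not how CJ() is implemented; it also assumes the inputs carry no length-dependent attributes such as names.

```r
# Hypothetical helper, not data.table's implementation: builds a two-column
# Cartesian product with rep.int(), then copies each input's attributes back.
cross_keep_attrs <- function(a, b) {
  na <- length(a)
  nb <- length(b)
  out <- list(
    rep.int(a, rep.int(nb, na)),  # each element of a repeated nb times
    rep.int(b, na)                # b recycled na times
  )
  # rep.int() returned bare vectors; restore class, tzone, etc. from the inputs.
  # Assumption: no names or dim attributes, which would no longer fit.
  attributes(out[[1L]]) <- attributes(a)
  attributes(out[[2L]]) <- attributes(b)
  out
}

times <- as.POSIXct(c("2016-01-01", "2016-02-01"), tz = "UTC")
res <- cross_keep_attrs(letters[1:3], times)
attr(res[[2L]], "tzone")
```

The copy happens once per column after the expansion, so the speed benefit of rep.int() mentioned in the comment should largely remain.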

This was referenced May 8, 2017
@mattdowle mattdowle added this to the v1.10.6 milestone Aug 4, 2017
@mattdowle mattdowle added the bug label Aug 4, 2017