opened this issue Jul 10, 2019 · 2 comments · Fixed by #3837
# calculating time difference by group might get the units messed up #3694

## Comments

### oliver-oliver commented Jul 10, 2019

#### `Minimal reproducible example`

The objective is to calculate the time between events grouped by some id. Here is an example:

```r
library(data.table)
library(lubridate)
dt <- data.table(id = c(1,1:3),
                 start = c("2015-01-01 12:00:00", "2015-12-01 12:00:00", "2019-01-01 12:00:00", NA),
                 end = c("2016-01-01 12:00:01", "2016-01-01 12:00:01", "2019-01-01 12:00:01", "2019-01-01 12:00:02"))
dt[, start := ymd_hms(start)]
dt[, end := ymd_hms(end)]
dt[, time_diff_1 := min(end) - max(start), by = .(id)]
dt[, time_diff_2 := end - start]
```

which results in:

```
   id               start                 end   time_diff_1   time_diff_2
1:  1 2015-01-01 12:00:00 2016-01-01 12:00:01 31.00001 secs 31536001 secs
2:  1 2015-12-01 12:00:00 2016-01-01 12:00:01 31.00001 secs  2678401 secs
3:  2 2019-01-01 12:00:00 2019-01-01 12:00:01  1.00000 secs        1 secs
4:  3                <NA> 2019-01-01 12:00:02       NA secs       NA secs
```

Both columns, time_diff_1 and time_diff_2, are displayed in seconds. However, time_diff_1, which resulted from the grouped calculation, has its units mixed up: the correct result for id == 1 is 31 days and one second, not 31.00001 seconds. It seems as if the units were chosen automatically per group and then overwritten. To prevent this, one can use `difftime()`. However, I think there is room for improvement, e.g. a warning message when the units do not match across groups.
#### `Output of sessionInfo()`

```
> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252    LC_MONETARY=German_Germany.1252
[4] LC_NUMERIC=C                    LC_TIME=German_Germany.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] lubridate_1.6.0      data.table_1.10.4    RevoUtilsMath_10.0.0

loaded via a namespace (and not attached):
[1] compiler_3.4.0   magrittr_1.5     RevoUtils_10.0.4 tools_3.4.0      stringi_1.1.5    stringr_1.2.0
```
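The per-group unit choice can be reproduced in base R without any grouping. This is a minimal sketch, not taken from the original report: the variable names `long_gap` and `short_gap` are made up for illustration, and the two intervals mirror the groups id == 1 and id == 2 from the example. Subtracting two `POSIXct` values lets R pick the unit automatically, so the two results carry different units even though both are `difftime` objects.

```r
# Interval for id == 1: min(end) - max(start), a bit over 31 days
long_gap <- as.POSIXct("2016-01-01 12:00:01", tz = "UTC") -
  as.POSIXct("2015-12-01 12:00:00", tz = "UTC")

# Interval for id == 2: exactly one second
short_gap <- as.POSIXct("2019-01-01 12:00:01", tz = "UTC") -
  as.POSIXct("2019-01-01 12:00:00", tz = "UTC")

# The unit is chosen automatically per subtraction
units(long_gap)   # "days"
units(short_gap)  # "secs"

# The underlying numbers are therefore not comparable as they stand
as.numeric(long_gap)   # about 31.00001 (days)
as.numeric(short_gap)  # 1 (second)
```

If such values end up in one column under a single unit label, the 31.00001 is shown as seconds, which matches the time_diff_1 output above.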

### oliver-oliver commented Jul 10, 2019

Just saw on my Stack Overflow question that this issue is known.

### MichaelChirico commented Jul 10, 2019

This is a problem with the `-` method for time classes: it picks a unit within each group. Always use `difftime()` when doing grouped time differences and explicitly set a unit; I usually wrap it all in `as.double()`. Alternatively, convert your times to integer (or numeric) first and do your differences on that representation. It's quite a common case, though, so maybe we should add it to the FAQ and/or catch this particular case and warn, or report it in verbose mode.
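Both suggestions can be sketched as follows. This is an illustration under assumptions rather than code from the thread: the column names `diff_secs` and `diff_secs_num` are made up, the timestamps are parsed with base `as.POSIXct()` in UTC instead of `lubridate::ymd_hms()`, and the row with the missing start time is left out for brevity.

```r
library(data.table)

dt <- data.table(
  id    = c(1, 1, 2),
  start = as.POSIXct(c("2015-01-01 12:00:00", "2015-12-01 12:00:00",
                       "2019-01-01 12:00:00"), tz = "UTC"),
  end   = as.POSIXct(c("2016-01-01 12:00:01", "2016-01-01 12:00:01",
                       "2019-01-01 12:00:01"), tz = "UTC")
)

# Option 1: difftime() with an explicit unit, converted to a plain double
# so that no units attribute is left to be mixed up across groups
dt[, diff_secs := as.double(difftime(min(end), max(start), units = "secs")),
   by = id]

# Option 2: do the arithmetic on the numeric representation
# (seconds since the epoch for POSIXct)
dt[, diff_secs_num := min(as.numeric(end)) - max(as.numeric(start)),
   by = id]

print(dt[, .(id, diff_secs, diff_secs_num)])
```

With either option the value for id == 1 is 2678401 (31 days and one second, expressed in seconds) and the value for id == 2 is 1. The columns are plain numerics, so the unit has to be documented in the column name or elsewhere.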

