Wrong units displayed in data.table with POSIXct arithmetic #761

Open
boethian opened this Issue Aug 9, 2014 · 2 comments

boethian commented Aug 9, 2014

I recently posted this on Stack Overflow because I didn't know how to file a bug report or feature request here.

http://stackoverflow.com/questions/25214170/wrong-units-displayed-in-data-table-with-posixct-arithmetic

Problem reproduced here for completeness; I can take down the SO "question" if this is deemed a legitimate bug:

When durations are computed by group in data.table (v1.9.2) with POSIXct arithmetic, the wrong units can be printed. It seems the units of the first group are applied to every group.

library(data.table)
dt <- data.table(id = c(1, 1, 2, 2),
                 event = rep(c("start", "end"), times = 2),
                 time = as.POSIXct(c("2014-01-31 06:05:30",
                                     "2014-01-31 06:45:30",
                                     "2014-01-31 08:10:00",
                                     "2014-01-31 09:30:00")))
dt$time[2] - dt$time[1]  # Time difference of 40 mins
dt$time[4] - dt$time[3]  # Time difference of 1.333333 hours
dt[ , max(time) - min(time), by=id]  # wrong units printed for id 2

I realize that one of the following is the correct way to get the expected behavior, but I wanted to report this anyway. I'm not sure whether it is really a data.table problem or a POSIXct problem.

dt[ , difftime(max(time), min(time), units="mins"), by=id]  # both mins
dt[ , difftime(max(time), min(time), units="hours"), by=id]  # both hours
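
For a single difference computed outside data.table, the units can also be changed after the fact. The following is a minimal base-R sketch rather than anything data.table-specific; the variable names and the explicit `tz = "UTC"` are assumptions added only to make the example reproducible.

```r
# Base R only; the timestamps are the id 2 rows from the example above.
# tz = "UTC" is an assumption so the result does not depend on the local zone.
times <- as.POSIXct(c("2014-01-31 08:10:00", "2014-01-31 09:30:00"), tz = "UTC")

d <- times[2] - times[1]      # units are picked automatically for this one call
auto_units <- units(d)        # "hours", because the gap is over 60 minutes

units(d) <- "mins"            # convert the same difftime to minutes
mins_value <- as.numeric(d)   # plain number of minutes, no units attached
```

Converting to a plain number with known units is the safest form when results from several groups will be combined later.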

mattdowle added the bug label Aug 11, 2014

Owner

mattdowle commented Aug 11, 2014

Agreed it's a bug. Thanks for the nice report. I tested in latest dev just now and it occurs there too.

Good to have the S.O. question as well, i.e. don't take it down. There's more chance people will find it and see the workarounds.

As a first step, the attributes on each group result could be checked for consistency and a warning issued if they differ. I thought it did that already, but clearly not. Tagged High for this detection-and-warning aspect, which should be relatively straightforward and could be an issue in other cases unrelated to POSIXct. Making it return the right result without a speed penalty could be hard, so it is likely best to have the user call difftime(, units=), and that could be a suggestion in the warning message.
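
The consistency check described above might look roughly like this at the R level. This is only a sketch under stated assumptions: `check_group_attrs` is a hypothetical helper, not a data.table function, and the real grouping happens in C rather than through `lapply`.

```r
# Hypothetical helper (not part of data.table): compare the attributes of every
# per-group result with those of the first group and warn on any mismatch.
check_group_attrs <- function(results) {
  first <- attributes(results[[1]])
  same <- vapply(results, function(x) identical(attributes(x), first), logical(1))
  if (!all(same)) {
    warning("Group results have inconsistent attributes; ",
            "for durations, consider difftime(, units=) to fix the units.")
  }
  invisible(all(same))
}

# Same timestamps and ids as in the report; tz = "UTC" is an assumption.
times <- as.POSIXct(c("2014-01-31 06:05:30", "2014-01-31 06:45:30",
                      "2014-01-31 08:10:00", "2014-01-31 09:30:00"), tz = "UTC")
id <- c(1, 1, 2, 2)
idx <- split(seq_along(times), id)

# Automatic units differ between groups ("mins" vs "hours").
auto <- lapply(idx, function(i) max(times[i]) - min(times[i]))
auto_ok <- suppressWarnings(check_group_attrs(auto))

# Explicit units give identical attributes in every group.
fixed <- lapply(idx, function(i) difftime(max(times[i]), min(times[i]), units = "mins"))
fixed_ok <- check_group_attrs(fixed)
```

Here `auto_ok` is `FALSE` and `fixed_ok` is `TRUE`, which matches the suggestion to point users at `difftime(, units=)` in the warning text.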

mattdowle added the High label Aug 11, 2014

arunsrinivasan added this to the v1.9.6 milestone Sep 25, 2014

arunsrinivasan modified the milestones: v1.9.6, v1.9.8 Oct 10, 2014

Contributor

MichaelChirico commented Aug 2, 2016

I was just trying to figure out whether I could fix this, but it seems the mistake happens entirely in C (everything's fine up until .Call(Cdogroups, ...) AFAICT), so it's above my weight class for now. Basically, the group concatenation must be ignoring the attributes of the difftime objects.
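
The suspected mechanism can be imitated in base R. This is an illustration of the hypothesis only, not the actual C code path: it concatenates the raw per-group values and reuses the first group's attributes, with `tz = "UTC"` and all variable names being assumptions.

```r
# Imitation of the suspected behavior in base R (not data.table's C code).
times <- as.POSIXct(c("2014-01-31 06:05:30", "2014-01-31 06:45:30",
                      "2014-01-31 08:10:00", "2014-01-31 09:30:00"), tz = "UTC")

d1 <- times[2] - times[1]   # difftime with units "mins"
d2 <- times[4] - times[3]   # difftime with units "hours"

# Concatenate the bare numbers, then stamp on the first result's attributes.
combined <- c(as.numeric(d1), as.numeric(d2))
attributes(combined) <- attributes(d1)
wrong_units <- units(combined)   # "mins" for both values, although d2 was in hours

# What a units-aware concatenation would need to do instead.
correct <- c(as.numeric(d1, units = "mins"), as.numeric(d2, units = "mins"))
```

Printing `combined` shows the hours value for id 2 labelled as minutes, the same symptom as in the report, while `correct` holds both durations in minutes.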

arunsrinivasan modified the milestones: v2.0.0, v1.9.8 Aug 26, 2016
