forder should detect and switch to integer type on columns that aren't really double #1738
Comments
arunsrinivasan commented on Jun 10, 2016
This looks OK for Date where we don't have to verify that real is actually integer. For ordinary numeric fields we would need to call
I asked R-core about this once. The reason they made Date
Agree. How about just emitting a message for now? Here are the current timings:
> set.seed(1L)
> dt = data.table(d=sample(seq(as.Date("2015-01-01"), as.Date("2015-12-31"), by="days"),
1e7, TRUE))
> typeof(dt$d)
[1] "double"
> dt[, i:=as.integer(d)]
> dt
                   d     i
              <Date> <int>
       1: 2015-04-07 16532
       2: 2015-05-16 16571
       3: 2015-07-29 16645
       4: 2015-11-28 16767
       5: 2015-03-15 16509
      ---
 9999996: 2015-05-04 16559
 9999997: 2015-11-19 16758
 9999998: 2015-12-31 16800
 9999999: 2015-01-24 16459
10000000: 2015-08-08 16655
> system.time(dt[, .N, by=d])
user system elapsed
0.953 0.072 0.377
> system.time(dt[, .N, by=i])
user system elapsed
0.425 0.072 0.172
> system.time(dt[, .N, by=as.integer(d)])
user system elapsed
0.385 0.061 0.183
> isReallyReal(dt$d) # full scan needed here to confirm it is not really real
[1] 0
> system.time(isReallyReal(dt$d))
user system elapsed
0.036 0.000 0.037
> system.time(as.integer(dt$d))
user system elapsed
0.044 0.000 0.043
With 10x the rows (1e8) [one timing shown here to keep it simple; I confirmed timing was stable locally]:
> dt = data.table(d=sample(seq(as.Date("2015-01-01"), as.Date("2015-12-31"), by="days"),
1e8, TRUE))
> dt[, i:=as.integer(d)]
> system.time(dt[, .N, by=d])
user system elapsed
7.929 0.778 2.801
> system.time(dt[, .N, by=i])
user system elapsed
3.415 0.647 1.206
> system.time(dt[, .N, by=as.integer(d)])
user system elapsed
3.745 0.715 1.630
> system.time(isReallyReal(dt$d))
user system elapsed
0.364 0.000 0.364
> system.time(as.integer(dt$d))
user system elapsed
0.348 0.076 0.424
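
In the timings above, the full scan and the `as.integer()` conversion together cost much less than the gap between grouping on the double column and grouping on the integer column. A minimal sketch of the "check, emit a message, then group on integer" idea follows, in base R only. The helper names `is_whole_double` and `as_group_key` are hypothetical and are not part of data.table; the check here is a plain R stand-in and is not the `isReallyReal` used in the transcript.

```r
# Hypothetical helper (not data.table's isReallyReal): TRUE when every
# non-missing value of a double vector is a whole number that fits in
# R's integer range, so as.integer() would be lossless.
is_whole_double <- function(x) {
  stopifnot(is.double(x))
  v <- unclass(x)          # drop classes such as Date to avoid S3 dispatch
  v <- v[!is.na(v)]        # NA and NaN convert to NA_integer_, so skip them
  all(v == trunc(v)) && all(abs(v) <= .Machine$integer.max)
}

# Hypothetical wrapper: return an integer copy to group on when that is
# lossless, emitting a message; otherwise return the input unchanged.
as_group_key <- function(x) {
  if (is.double(x) && is_whole_double(x)) {
    message("column is stored as double but holds only whole numbers; ",
            "grouping on an integer copy")
    return(as.integer(x))
  }
  x
}

d <- as.Date("2015-01-01") + c(0, 5, 5, 10)   # Date is stored as double
typeof(d)
typeof(as_group_key(d))
typeof(as_group_key(c(1.5, 2)))               # has a fractional value, left as is
```

With data.table loaded, the call would then look like `dt[, .N, by = .(d = as_group_key(d))]`. Note that the result groups on the integer day counts, not on `Date` values, so this sketch does not restore the original class on the grouping column.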