Bug: split.data.table can not split on a factor column if it is named 'x' #3151
The previous report was not totally correct. 'x' does not have to be the splitting variable; it is enough for the table to contain a column named 'x':

temp5 <- data.table(y = factor("a"), x = 1)
split(temp5, by = "y")
# Error: is.data.table(x) is not TRUE

However, I think the diagnosis in the original report is still correct, and the proposed fix eliminates the bug. I can submit a PR, but maybe you guys have a less hackish solution for this.
|
Thanks for the report, and thanks for finding a solution. A PR is very welcome.

library(data.table)
temp5 <- data.table(y = factor("a"), x = 1)
split(temp5, by = "y", verbose=TRUE)
#Processing split.data.table with: x[i = make.levels(x, cols = "y", sorted = FALSE), j = list(.ll.tech.split = list(if (.N == 0L) .SD[0L] else .SD)), by = .EACHI, .SDcols = c("y", "x"), on = "y"]
#Error: is.data.table(x) is not TRUE
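
The verbose output shows make.levels(x, ...) being evaluated inside x[...], where column names are in scope. The base R sketch below is only an analogy for that kind of name clash, not data.table's actual code: a data frame stands in for the data.table, and check_naive and check_renamed are hypothetical helpers made up for illustration. It suggests how a column named 'x' can shadow a function argument named 'x' when an unevaluated call is evaluated with the table's columns visible, and why renaming the argument (the .___x workaround from the report) would avoid the clash.

```r
# Base R analogue of the name clash; this is NOT data.table's implementation.
df <- data.frame(y = factor("a"), x = 1)

# Hypothetical helper: the argument is called 'x', like in split.data.table.
check_naive <- function(x) {
  q <- quote(is.data.frame(x))
  # With the data frame as 'envir', its columns are looked up first,
  # so 'x' resolves to the numeric column, not to the argument.
  eval(q, envir = x, enclos = environment())
}

# Hypothetical helper: same check, but through a name no column is likely to use.
check_renamed <- function(x) {
  .___x <- x
  q <- quote(is.data.frame(.___x))
  # '.___x' is not a column, so lookup falls through to the function's
  # environment and finds the whole table.
  eval(q, envir = x, enclos = environment())
}

naive_result <- check_naive(df)      # FALSE: 'x' was the column
renamed_result <- check_renamed(df)  # TRUE: '.___x' was the table
```

The renamed version still depends on no column being called .___x, which is presumably why the reporter describes the fix as hackish.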
|
OK, and thanks for the pointer to |
Just found the following bug:
So the problem only emerges if the splitting variable is called 'x' and it is a factor. The problem can be traced back to the temp = eval(dtq) call in split.data.table. In the unevaluated call dtq, make.levels(x, cols=.cols, sorted=.sorted) finds the 'x' variable instead of the 'x' data.table. A quick fix is to write make.levels(.___x, cols=.cols, sorted=.sorted) instead, and do a temporary assignment .___x <- x before temp = eval(dtq).

SessionInfo (the bug is also present in 1.11.8):