fsetequal() fails for single character columns #2318
Comments
Thanks for the report. I don't have R at hand at the moment, so I cannot check. Maybe it is related to the way the rolling join handles a single character column? https://github.com/Rdatatable/data.table/blob/master/R/setops.R#L235
Installing the latest dev version from the repo here (
data.table version:
The case is somewhat broader -- it happens whenever the final column is a string:
I'm on a months-old version: data.table 1.11.9 IN DEVELOPMENT built 2018-10-06 03:32:28 UTC
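For readers who want to try the "final column is a string" case, here is a minimal sketch. The tables `x` and `y` and their values are hypothetical and are not the example from the comment above. The second call, with the string column moved to the front, is included only for comparison; whether it behaves differently on a given version is an assumption to verify, not a documented result.

```r
library(data.table)

# Hypothetical tables that differ only in the string column
x <- data.table(n = 1:3, s = c("a", "b", "c"))
y <- data.table(n = 1:3, s = c("a", "b", "d"))

# String column last: the arrangement the comment describes as affected
last_col_string <- fsetequal(x, y)

# String column first, for comparison (setcolorder() reorders by reference)
setcolorder(x, c("s", "n"))
setcolorder(y, c("s", "n"))
first_col_string <- fsetequal(x, y)

print(c(last = last_col_string, first = first_col_string))
```

Since the tables differ in one row, a correct result is FALSE in both positions.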
Lines 233 to 244 in 5b16cc5: those rolling joins report matching rows for @franknarf1's example. I tried playing with rollends; the setup below solves this particular issue but makes existing unit tests fail.
ans1 = target[current, roll=tolerance, rollends=c(TRUE,FALSE), which=TRUE, on=jn.on]
ans2 = target[current, roll=-tolerance, rollends=c(TRUE,FALSE), which=TRUE, on=jn.on]
#...
ans1 = current[target, roll=tolerance, rollends=c(FALSE,TRUE), which=TRUE, on=jn.on]
ans2 = current[target, roll=-tolerance, rollends=c(FALSE,TRUE), which=TRUE, on=jn.on]

If the issue occurs only when there are character columns in a data.table, then we don't have to do a rolling join at all. It is there only to handle floating point tolerance.
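A rough sketch of that last idea, assuming tables with no double columns: an exact join can stand in for the rolling join, and unmatched rows show up as NA. The helper `has_unmatched_rows()` is hypothetical and not part of data.table. It only checks one direction and ignores duplicate counts, so it is not a drop-in replacement for the code in setops.R.

```r
library(data.table)

# Hypothetical helper, not data.table API: TRUE if some row of `current`
# has no exact match in `target`.
has_unmatched_rows <- function(target, current) {
  jn.on <- names(target)
  # Double columns are the case the rolling join exists for; bail out here
  if (any(vapply(target, is.double, logical(1L)))) {
    stop("double columns present: a tolerance-aware comparison is still needed")
  }
  # Exact join; which=TRUE returns row numbers of target, NA where no match
  ans <- target[current, which = TRUE, mult = "first", on = jn.on]
  anyNA(ans)
}

# Hypothetical data
target  <- data.table(id = c("a", "b", "c"))
current <- data.table(id = c("a", "b", "d"))
has_unmatched_rows(target, current)
```

Here "d" has no match in `target`, so the join yields an NA for that row.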
fsetequal() on 2 data.tables only containing a single character column fails to detect that they are not equal.
The following reprex:
The issue seems to be with ignore.row.order in all.equal.
When adding an additional column, numeric or character, fsetequal works as expected.
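The behavior described above can be sketched as follows. The data here is hypothetical and is not the reporter's original reprex. The fsetdiff() cross-check at the end is an addition for illustration; it does not go through all.equal().

```r
library(data.table)

# Hypothetical single character column tables that differ in one row
x <- data.table(id = c("a", "b", "c"))
y <- data.table(id = c("a", "b", "d"))

# The comparison the report says fails to notice the difference
print(fsetequal(x, y))

# The call the report points at
print(all.equal(x, y, ignore.row.order = TRUE))

# Same data plus an extra column, the case the report says works as expected
x2 <- copy(x)[, n := 1:3]
y2 <- copy(y)[, n := 1:3]
print(fsetequal(x2, y2))

# Independent cross-check: set differences in both directions
differs <- nrow(fsetdiff(x, y)) > 0L || nrow(fsetdiff(y, x)) > 0L
print(differs)
```

If fsetequal(x, y) and `differs` disagree on a given version, that version is likely affected.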
data.table version: 1.10.4
Session Info