Invalid row order after join #1766

Closed
mllg opened this issue Jul 7, 2016 · 2 comments

Contributor

mllg commented Jul 7, 2016

MWE:

library(data.table)

A = data.table(i = 1:6, j = rep(1:2, 3), x = letters[1:6], key = "i")
B = data.table(j = 1:2, y = letters[1:2], key = "j")

res = A[B, on = "j"]

res still reports key i, but its rows are not ordered by i:

   i j x y
1: 1 1 a a
2: 3 1 c a
3: 5 1 e a
4: 2 2 b b
5: 4 2 d b
6: 6 2 f b

Setting the key manually (setkeyv(res, "i")) yields a warning.
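A quick way to confirm the mismatch is to compare the key the table claims with the actual order of that column. The following is a minimal sketch, not part of the original report; the names `claimed` and `key_is_stale` are made up for illustration, and the check only inspects the first key column, which is enough for the single-column key used here.

```r
library(data.table)

A = data.table(i = 1:6, j = rep(1:2, 3), x = letters[1:6], key = "i")
B = data.table(j = 1:2, y = letters[1:2], key = "j")
res = A[B, on = "j"]

# Key the table claims to have (NULL if the join dropped the key)
claimed = key(res)

# TRUE only if a key is claimed but the first key column is not sorted
key_is_stale = !is.null(claimed) && is.unsorted(res[[claimed[1L]]])

print(claimed)
print(key_is_stale)
```

On the affected version this would be expected to flag key i as stale. On a version where the join drops the key, `claimed` is NULL and the flag is FALSE.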

Session info:

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

locale:
 [1] LC_CTYPE=de_DE.UTF-8       LC_NUMERIC=C               LC_TIME=de_DE.UTF-8        LC_COLLATE=C               LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=de_DE.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] data.table_1.9.6 nvimcom_0.9-19   setwidth_1.0-4

loaded via a namespace (and not attached):
[1] tools_3.3.1    parallel_3.3.1 chron_2.3-47
Contributor Author

mllg commented Jul 7, 2016

Side note: You can still subset the resulting table by key, but the lookup on i silently fails to find existing rows. E.g.,

res[.(2), nomatch = 0]

returns a data.table with 0 rows.
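Until the join stops carrying the key over, one possible workaround is to discard the key and rebuild it before doing keyed lookups. This is a sketch under the assumption that `setkey(res, NULL)` removes the key attribute, as its documentation describes; removing the key first should also avoid the warning from re-keying on the same column. The name `hit` is illustrative only.

```r
library(data.table)

A = data.table(i = 1:6, j = rep(1:2, 3), x = letters[1:6], key = "i")
B = data.table(j = 1:2, y = letters[1:2], key = "j")
res = A[B, on = "j"]

# Drop whatever key the join left behind (a no-op if there is none)
setkey(res, NULL)

# Rebuild the key; this physically reorders the rows by i
setkey(res, i)

# The keyed lookup now has a valid sort order to search
hit = res[.(2L), nomatch = 0]
print(hit)
```

Note that re-keying reorders `res` by reference, so the join's original row order (grouped by j) is lost.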

jangorecki added the bug label Jul 7, 2016
Member

jangorecki commented Jul 7, 2016

Thanks for reporting. Reproducible on 1.9.7. The actual warning:

In setkeyv(x, cols, verbose = verbose, physical = physical) :
  Already keyed by this key but had invalid row order, key rebuilt. If you didn't go under the hood please let datatable-help know so the root cause can be fixed.

It looks like the key is not removed during the join; it should be. Optionally, the result could be keyed on the j column, matching the key of B.
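The second option can be approximated from user code. The sketch below wraps the join in a hypothetical helper, `join_rekeyed` (not a data.table function), that discards any inherited key and keys the result on the join column instead. It assumes the join column has the same name in both tables.

```r
library(data.table)

# Hypothetical helper: join lhs to rhs on `on`, then key the result on `on`
join_rekeyed = function(lhs, rhs, on) {
  out = lhs[rhs, on = on]
  setkey(out, NULL)   # discard any key carried over from lhs
  setkeyv(out, on)    # key on the join column(s) instead
  out
}

A = data.table(i = 1:6, j = rep(1:2, 3), x = letters[1:6], key = "i")
B = data.table(j = 1:2, y = letters[1:2], key = "j")

res = join_rekeyed(A, B, on = "j")
print(key(res))
print(res)
```

The arguments are named `lhs` and `rhs` rather than `x` and `i` to avoid any ambiguity with the column named i in A.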

jangorecki added this to the v1.9.8 milestone Jul 7, 2016
arunsrinivasan self-assigned this Jul 21, 2016