Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] .shallow messes up keys and indices #2336

MarkusBonsch opened this issue Sep 7, 2017 · 1 comment

[BUG] .shallow messes up keys and indices #2336

MarkusBonsch opened this issue Sep 7, 2017 · 1 comment


Copy link

@MarkusBonsch MarkusBonsch commented Sep 7, 2017

.shallow can retain keys with the argument retain.key = TRUE. However, this feature suffers from two problems that need to be solved before the function can be exported as requested in #2323:

  1. Key gets lost if more columns than key are chosen with the cols argument.
DT <- data.table(x1 = sample(1:10, 100, replace = TRUE),
                 x2 = sample(1:10, 100, replace = TRUE),
                 x3 = sample(1:10, 100, replace = TRUE),
                 y  = rnorm(100), 
                 key = c("x1", "x2", "x3"))
## as expected
identical(key(data.table:::.shallow(DT, retain.key = TRUE)), c("x1", "x2", "x3"))
identical(key(data.table:::.shallow(DT, cols = c("x1", "x2", "x3"), retain.key = TRUE)), c("x1", "x2", "x3"))
## key gets lost for no reason
test <- data.table:::.shallow(DT, cols = c("x1", "x2", "x3", "y"), retain.key = TRUE)
identical(key(test), c("x1", "x2", "x3"))
## although the copy is of course sorted properly
data.table:::is.sorted(test, by = c("x1", "x2", "x3"))
  1. Indices remain, even if the columns are dropped
setindex(DT, x2, y)
setindex(DT, x1, y)

## index just gets copied, no matter if the columns are not present anymore
test <- data.table:::.shallow(DT, cols = c("x3"), retain.key = TRUE)
names(attributes(attr(test, "index")))
# "__x2__y" "__x1__y"
# "x3"
Copy link
Contributor Author

@MarkusBonsch MarkusBonsch commented Oct 4, 2017

Due to my slight misperception of secondary indices (see also #2372, my comment on 2017-10-04) , I need to fix my implementation (minor fix):

  • if not all columns of an existing secondary index are present, drop it completely.

I will fix ASAP and create a PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet

No branches or pull requests

2 participants