Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] .shallow messes up keys and indices #2336

Closed
MarkusBonsch opened this issue Sep 7, 2017 · 1 comment
Closed

[BUG] .shallow messes up keys and indices #2336

MarkusBonsch opened this issue Sep 7, 2017 · 1 comment
Milestone

Comments

@MarkusBonsch
Copy link
Contributor

.shallow can retain keys with the argument retain.key = TRUE. However, this feature suffers from two problems that need to be solved before the function can be exported as requested in #2323:

  1. Key gets lost if more columns than key are chosen with the cols argument.
library(data.table)
DT <- data.table(x1 = sample(1:10, 100, replace = TRUE),
                 x2 = sample(1:10, 100, replace = TRUE),
                 x3 = sample(1:10, 100, replace = TRUE),
                 y  = rnorm(100), 
                 key = c("x1", "x2", "x3"))
## as expected
identical(key(data.table:::.shallow(DT, retain.key = TRUE)), c("x1", "x2", "x3"))
# TRUE
identical(key(data.table:::.shallow(DT, cols = c("x1", "x2", "x3"), retain.key = TRUE)), c("x1", "x2", "x3"))
# TRUE
## key gets lost for no reason
test <- data.table:::.shallow(DT, cols = c("x1", "x2", "x3", "y"), retain.key = TRUE)
identical(key(test), c("x1", "x2", "x3"))
# FALSE
## although the copy is of course sorted properly
data.table:::is.sorted(test, by = c("x1", "x2", "x3"))
# TRUE
  1. Indices remain, even if the columns are dropped
setindex(DT, x2, y)
setindex(DT, x1, y)

## index just gets copied, no matter if the columns are not present anymore
test <- data.table:::.shallow(DT, cols = c("x3"), retain.key = TRUE)
names(attributes(attr(test, "index")))
# "__x2__y" "__x1__y"
names(test)
# "x3"
@MarkusBonsch
Copy link
Contributor Author

Due to my slight misperception of secondary indices (see also #2372, my comment on 2017-10-04) , I need to fix my implementation (minor fix):

  • if not all columns of an existing secondary index are present, drop it completely.

I will fix ASAP and create a PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants