Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed .shallow to consistently retain keys and indices. #2337

Merged
merged 10 commits into from Sep 11, 2017

Conversation

MarkusBonsch
Copy link
Contributor

@MarkusBonsch MarkusBonsch commented Sep 7, 2017

Closes #2336.

Keys and indices are retained correctly by shallow if retain.key = TRUE and cols != NULL.
The largest possible part of an index is retained.
If the original index was on x1, x2, x3, and `cols = c("x1", "x2"),
the shallow copy gets the same index for c("x1", "x2") (if there was a native index on x1 and x2, this is kept)

Tests have been added.

A benchmark (code at end of this post) shows that there is no negative speed impact:

master PR
33 ms 33 ms

The improved key retainment makes it possible to clean setattr calls in foverlaps.R while still passing all tests.

Code for benchmark

library(data.table)
library(microbenchmark)

nrow <- 1e6
ncol <- 20
nindex <- 20
ncopycol <- 10

DT <- data.table(x1 = rnorm(nrow))
setindex(DT, x1)
index <- c("x1")
for(col in seq_len(ncol-1)){
  DT[, paste0("x", col + 1) := rnorm(nrow)]
  if(col < nindex){
    index <- c(index, paste0("x", col+1))
    setindexv(DT, index)
  }
}

test <- microbenchmark(shallow = data.table:::.shallow(DT, 
                                                       cols = paste0("x", seq_len(ncopycol)), 
                                                       retain.key = TRUE), 
                       times = 100, 
                       unit = "ms")

@MarkusBonsch
Copy link
Contributor Author

@MarkusBonsch MarkusBonsch commented Sep 8, 2017

Gets a problem on foverlaps test in travis that I don't get locally. Will need to investigate.

@MarkusBonsch MarkusBonsch reopened this Sep 8, 2017
@MarkusBonsch MarkusBonsch reopened this Sep 9, 2017
@codecov-io
Copy link

@codecov-io codecov-io commented Sep 9, 2017

Codecov Report

Merging #2337 into master will increase coverage by 0.05%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2337      +/-   ##
==========================================
+ Coverage   91.09%   91.14%   +0.05%     
==========================================
  Files          61       61              
  Lines       11785    11804      +19     
==========================================
+ Hits        10735    10759      +24     
+ Misses       1050     1045       -5
Impacted Files Coverage Δ
R/foverlaps.R 94.3% <100%> (-0.11%) ⬇️
R/data.table.R 97.45% <100%> (+0.02%) ⬆️
src/rbindlist.c 88.93% <0%> (+0.19%) ⬆️
src/forder.c 94.47% <0%> (+0.52%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3c1b6d0...52622cc. Read the comment docs.

@MarkusBonsch
Copy link
Contributor Author

@MarkusBonsch MarkusBonsch commented Sep 9, 2017

Initially, my local tests didn't include fOverlaps since genomicRanges was not installed. Now, everything should be fixed.

mattdowle added a commit that referenced this issue Sep 11, 2017
… to logic changes only (concerns shallow and keys)
@mattdowle mattdowle merged commit 61a25ba into Rdatatable:master Sep 11, 2017
4 checks passed
@mattdowle mattdowle added this to the v1.10.6 milestone Sep 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants