Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved key handling when assigning with ':=' #2389

merged 5 commits into from Sep 28, 2017


Copy link

@MarkusBonsch MarkusBonsch commented Sep 27, 2017

This closes #2372.

Now, when updating columns with :=, existing keys and indices are retained up to the first modified column (exclusive).
Example: key on x1, x2, x3 --> assignment on x2 --> key on x1.

The tests 1400 ff had to be modified since they were testing for the old behaviour.
Additional tests have been added to make sure, all cases are handled, also on empty data.tables.

I tested for C memory allocation problems with R -d valgrind and found no issues.
There is an issue with a segmentataion fault on unloading the package but I discovered that this is a problem of the master (issue [#2388])

A small benchmark (code at the end of this post) shows that there is no speed penalty associated with the new implementation:

master PR
0.25 ms 0.27 ms

nrow <- 1
ncol <- 20
nindex <- 20
nassigncol <- 10

DT <- data.table(x1 = rnorm(nrow))
setindex(DT, x1)
index <- c("x1")
for(col in seq_len(ncol-1)){
  DT[, paste0("x", col + 1) := rnorm(nrow)]
  if(col < nindex){
    index <- c(index, paste0("x", col+1))
    setindexv(DT, index)

test <- microbenchmark(assign = copy(DT)[, paste0("x", (ncol - nassigncol):ncol) := 1], 
                       times = 100, 
                       unit = "ms")

Copy link

@codecov-io codecov-io commented Sep 27, 2017

Codecov Report

Merging #2389 into master will increase coverage by 0.04%.
The diff coverage is 94.93%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2389      +/-   ##
+ Coverage   91.24%   91.29%   +0.04%     
  Files          62       62              
  Lines       11978    12023      +45     
+ Hits        10929    10976      +47     
+ Misses       1049     1047       -2
Impacted Files Coverage Δ
src/assign.c 94.12% <94.93%> (-0.24%) ⬇️
src/forder.c 94.47% <0%> (+0.78%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f422f5e...5badddf. Read the comment docs.

I added a few more comments in the file at the top explaining the thinking here.
Copy link

@mattdowle mattdowle commented Sep 27, 2017

Sorry - I guess I broke it by removing shift alias. I didn't think that was needed as shift is exported. I'll fix.

I've added you as a project member so that you can create branches inside the main repo if you'd like. This will mean I can pull, test and push to your branch, rather than me using the online editor as I'm doing now.

Copy link
Contributor Author

@MarkusBonsch MarkusBonsch commented Sep 28, 2017

Thank you very much for the invitation! I am happy to join.

mattdowle added 3 commits Sep 28, 2017
And tidied up that section at the top with hopefully clearer comments.
Currently, we have to skip test numbers ending zero because, for example, if 1419.10 fails, it prints as 1419.1 the same as 1419.1 and we can't distinguish.  Haven't thought of a solution yet.  For now, I just think of it as a bit of culture/personality in the code,  like missing unlucky number 13.
@mattdowle mattdowle added this to the v1.10.6 milestone Sep 28, 2017
@mattdowle mattdowle merged commit 6968a0e into Rdatatable:master Sep 28, 2017
4 checks passed
Copy link
Contributor Author

@MarkusBonsch MarkusBonsch commented Sep 29, 2017

Thank you for cleaning up. I am still learning :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet

Successfully merging this pull request may close these issues.

3 participants