in result of non-equi join, table is not actually correctly sorted even though it's keyed #4603

Closed
@myoung3

library(data.table)

x <- setDT(structure(
  list(
    id1 = c(1L, 1L, 1L, 1L, 1L),
    start_date = c(10590L, 10597L, 10601L, 10604L, 10608L),
    end_date = c(10596L, 10600L, 10603L, 10607L, 10610L),
    value1 = c(NA, NA, 2.63896640369205, 2.03415674303523, 1.42934708237841),
    value2 = c(NA, NA, -0.577368046118137, -0.211425473706347, 0.154517098705444)
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
))

y <- setDT(structure(
  list(
    id1 = c(1L, 1L, 1L, 1L, 1L),
    V1 = c(TRUE, TRUE, TRUE, TRUE, TRUE),
    start_date = c(16585L, 14541L, 13429L, 16139L, 13915L),
    end_date = c(16589L, 14544L, 13436L, 16144L, 13919L)
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
))



setkey(x,id1,start_date,end_date)
setkey(y,id1)
z <- x[y, list(V1),
       on=c("id1","end_date>=start_date","start_date<=end_date"),by=.EACHI]


## Note: id1 is the same value in every row, so a correctly keyed result
## must be sorted by the second key column (end_date). This assertion fails:
stopifnot(identical(z$end_date, sort(z$end_date)))

key(z)
setkeyv(z, key(z)) # re-setting the existing key does not fix it

# setting the key to NULL and then setting it again does fix it
setkey(z, NULL)
setkey(z, id1, end_date, start_date)

stopifnot(identical(z$end_date, sort(z$end_date)))

Reproducible in both the release and development versions of data.table.
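
For anyone who wants to check for this independently of the key attribute, here is a minimal base-R sketch that tests whether rows are really in the order a set of columns implies. `is_sorted_by` is a hypothetical helper, not a data.table function, and `res` is only a stand-in data.frame built from the date values in `y` above, not the actual join output.

```r
# Hypothetical helper (not part of data.table): TRUE when the rows of `df`
# are already in ascending order of the columns named in `cols`.
is_sorted_by <- function(df, cols) {
  # unname() keeps column names from being matched to order()'s own arguments
  ord <- do.call(order, unname(as.list(df)[cols]))
  # order() leaves ties in their original position, so sorted input gives 1..n
  identical(ord, seq_len(nrow(df)))
}

# Stand-in for a result that claims to be sorted: id1 is constant and
# end_date is out of order (values copied from y$start_date and y$end_date).
res <- data.frame(
  id1 = c(1L, 1L, 1L, 1L, 1L),
  end_date = c(16585L, 14541L, 13429L, 16139L, 13915L),
  start_date = c(16589L, 14544L, 13436L, 16144L, 13919L)
)
cols <- c("id1", "end_date", "start_date")

is_sorted_by(res, cols)  # FALSE: rows are not in key order

# Re-sorting explicitly, without trusting any stored key, restores the order
sorted_res <- res[do.call(order, unname(as.list(res)[cols])), ]
is_sorted_by(sorted_res, cols)  # TRUE
```

The same helper should also accept a data.table, for example `is_sorted_by(z, key(z))`, since it only relies on `as.list()` and `nrow()`; that usage is an assumption and was not run against the result in this report.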

sessionInfo

R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS:   /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.19.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.12.8

loaded via a namespace (and not attached):
[1] compiler_3.6.2 tools_3.6.2  

Labels: bug, non-equi joins
