'join' causes R to crash on empty-string suffix  #2228

@simon-anders

When two tibbles that share a non-key column name are joined and an empty string is passed as one of the two column-name suffixes, left_join runs into an endless loop, causing R to crash:

library( dplyr )

tbl1 <- tibble( key = LETTERS[1:10], val = 1:10 )
tbl2 <- tibble( key = LETTERS[1:10], val = 11:20 )

# Works as expected: the clashing columns come back as val.A and val.B
left_join( tbl1, tbl2, by = "key", suffix = c( ".A", ".B" ) )

# Crashes R: the first suffix is an empty string
left_join( tbl1, tbl2, by = "key", suffix = c( "", ".B" ) )

My guess is that the implementation somewhere assumes that a column name without a suffix belongs to the original table rather than the joined one, and therefore never reaches its stop condition.
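Until this is fixed, one possible workaround (a sketch I have not tested beyond this toy example) is to avoid the empty suffix altogether: rename the clashing column in the right-hand table before joining, so that no suffix needs to be applied. The names tbl2_renamed and joined are just placeholders for this illustration.

```r
library(dplyr)

tbl1 <- tibble(key = LETTERS[1:10], val = 1:10)
tbl2 <- tibble(key = LETTERS[1:10], val = 11:20)

# Give the right-hand column its final name up front, so the join
# finds no duplicate column names and the suffix argument is never used.
tbl2_renamed <- rename(tbl2, val.B = val)

# The left-hand column keeps its plain name "val", which is what
# suffix = c("", ".B") was meant to achieve.
joined <- left_join(tbl1, tbl2_renamed, by = "key")

names(joined)
```

This should give the columns key, val and val.B, the layout the crashing call was aiming for.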


Session info:

R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=fi_FI.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dplyr_0.5.0

loaded via a namespace (and not attached):
[1] magrittr_1.5   R6_2.1.2       assertthat_0.1 tools_3.3.2    DBI_0.4-1     
[6] tibble_1.1     Rcpp_0.12.6   

Labels: bug (an unexpected problem or unintended behavior)
