Error in [.data.table due to lapply #5685

@pj-paul

Minimal reproducible example

See the post on StackOverflow. The bug diagnosis is from Roland; I'm sharing it here because I asked the original question on SO.

library(data.table)
library(ggplot2)

mtDT <- as.data.table(mtcars)

tmp <- mtDT[, .(plot = list(ggplot(data = .SD) + geom_point(aes(x=wt, y=mpg)))), 
     by=.(carb,am)]

> Finding groups using forderv ... forder.c received 32 rows and 2 columns
> 0.000s elapsed (0.001s cpu) 
> Finding group sizes from the positions (can be avoided to save RAM) ... 0.000s elapsed (0.000s cpu) 
> Getting back original order ... forder.c received a vector type 'integer' length 9
> 0.000s elapsed (0.000s cpu) 
> lapply optimization is on, j unchanged as 'list(list(ggplot(data = .SD) + geom_point(aes(x = wt, y = mpg))))'
> GForce is on, left j unchanged
> Old mean optimization is on, left j unchanged.
> Making each group and running j (GForce FALSE) ... 
>   collecting discontiguous groups took 0.000s for 9 groups
>   eval(j) took 0.026s for 9 calls
> 0.027s elapsed (0.027s cpu) 

a <- tmp[,.(list(patchwork::wrap_plots(plot)))][[1]]

> Detected that j uses these columns: plot

MAX_DEPTH = 5L

runlock = function(x, current_depth = 1L) {
  if (is.list(x) && current_depth <= MAX_DEPTH) {  # is.list() used to be is.recursive(), #4814
    if (inherits(x, 'data.table')) .Call(data.table:::C_unlock, x)
    else return(lapply(x, runlock, current_depth = current_depth + 1L))
  }
  return(invisible())
}

runlock(a)
#Error in `X[[i]]`:
#! Index out of bounds
#Called from: `[[.patchwork`(X, i)
#Run `rlang::last_trace()` to see where the error occurred.

If I replace the lapply loop with a for loop, it appears to work:

runlock = function(x, current_depth = 1L) {
  if (is.list(x) && current_depth <= MAX_DEPTH) {  # is.list() used to be is.recursive(), #4814
    if (inherits(x, 'data.table')) .Call(data.table:::C_unlock, x)
    else return(for(y in x) runlock(y, current_depth = current_depth + 1L))
  }
  return(invisible())
}

runlock(a)
#no error

I suspect the issue results from this code in lapply:

if (!is.vector(X) || is.object(X))
    X <- as.list(X)
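
A minimal sketch of the suspected mechanism, using a hypothetical toy S3 class instead of patchwork. The class name `toy` and its `[[` method are made up for illustration. The sketch assumes, as the traceback above suggests, that lapply evaluates `X[[i]]` through S3 dispatch for every position of the underlying list, while a for loop walks the underlying list directly. It is meant to be run at top level so the method is visible to dispatch.

```r
# Hypothetical toy class (not patchwork): the underlying list has three
# elements, but the `[[` method only exposes the first one.
toy <- structure(list(a = 1, b = 2, c = 3), class = "toy")

`[[.toy` <- function(x, i) {
  exposed <- unclass(x)[1L]
  if (i > length(exposed)) stop("Index out of bounds")
  exposed[[i]]
}

# as.list() leaves the class in place for a classed list, so lapply()
# still goes through `[[.toy` for each position of the underlying list
# and fails once the index exceeds what the method exposes.
lapply_result <- tryCatch(
  lapply(toy, identity),
  error = function(e) conditionMessage(e)
)

# A for loop iterates over the underlying list without calling `[[.toy`.
for_count <- 0L
for (el in toy) for_count <- for_count + 1L

# Dropping the class first avoids the dispatch altogether.
unclassed_result <- lapply(unclass(toy), identity)
```

If this matches what happens with patchwork objects, calling lapply on `unclass(x)` inside runlock might be an alternative to the for loop, though I have not tested that against data.table itself.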

Output of sessionInfo()

R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3 
LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3;  LAPACK version 3.9.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8    
 [5] LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8    LC_PAPER=C.UTF-8       LC_NAME=C             
 [9] LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   

time zone: UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] data.table_1.14.8 htmlwidgets_1.6.2 ggwordcloud_0.5.0 wordcloud2_0.2.1  tm_0.7-11        
 [6] NLP_0.2-1         sentimentr_2.9.0  patchwork_1.1.2   magrittr_2.0.3    openxlsx_4.2.5.2 
[11] lubridate_1.9.2   forcats_1.0.0     stringr_1.5.0     dplyr_1.1.2       purrr_1.0.1      
[16] readr_2.1.4       tidyr_1.3.0       tibble_3.2.1      ggplot2_3.4.2     tidyverse_2.0.0  

loaded via a namespace (and not attached):
 [1] utf8_1.2.3        generics_0.1.3    xml2_1.3.5        slam_0.1-50       stringi_1.7.12   
 [6] hms_1.1.3         digest_0.6.33     evaluate_0.21     grid_4.3.1        timechange_0.2.0 
[11] fastmap_1.1.1     rprojroot_2.0.3   zip_2.3.0         syuzhet_1.0.6     fansi_1.0.4      
[16] scales_1.2.1      cli_3.6.1         rlang_1.1.1       munsell_0.5.0     yaml_2.3.7       
[21] withr_2.5.0       parallel_4.3.1    tools_4.3.1       tzdb_0.4.0        textclean_0.9.3  
[26] colorspace_2.1-0  png_0.1-8         vctrs_0.6.3       R6_2.5.1          lifecycle_1.0.3  
[31] pkgconfig_2.0.3   pillar_1.9.0      gtable_0.3.3      glue_1.6.2        Rcpp_1.0.11      
[36] xfun_0.39         tidyselect_1.2.0  rstudioapi_0.15.0 knitr_1.43        htmltools_0.5.5  
[41] rmarkdown_2.23    compiler_4.3.1    qdapRegex_0.7.5   lexicon_1.2.1  
