write_block() method chokes on all-zero block when TileDBRealizationSink is sparse #33

@hpages

Trying to write an all-zero matrix with writeTileDBArray():

library(TileDBArray)

m0 <- matrix(0L, nrow=50, ncol=80)

M0 <- writeTileDBArray(m0, sparse=TRUE)
# Error: [TileDB::Query] Error: Fix-Sized input attribute/dimension 'd2' is not set correctly. 
# Data buffer is not set.
# In addition: Warning messages:
# 1: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)
# 2: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)
# 3: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)

This works fine if the matrix has at least one non-zero value:

m1 <- matrix(c(0L, -5L, integer(3998)), nrow=50, ncol=80)
M1 <- writeTileDBArray(m1, sparse=TRUE)  # ok!

However, if we reduce the block size in order to trigger block processing, the same call fails:

setAutoBlockSize(5000)
M1 <- writeTileDBArray(m1, sparse=TRUE)
# Error: [TileDB::Query] Error: Fix-Sized input attribute/dimension 'd2' is not set correctly. 
# Data buffer is not set.
# In addition: Warning messages:
# 1: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)
# 2: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)
# 3: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)

So it looks like all-zero blocks are causing problems.
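To see why block processing exposes all-zero blocks here, a rough back-of-the-envelope check in base R may help. It assumes that setAutoBlockSize() takes a size in bytes and that R integers occupy 4 bytes each; the variable names are only illustrative, and the exact block grid chosen by DelayedArray may differ.

```r
# Rough arithmetic only: how many blocks does m1 need at a 5000-byte block size?
block_bytes   <- 5000                            # value passed to setAutoBlockSize()
bytes_per_int <- 4                               # size of one R integer
max_elts      <- block_bytes %/% bytes_per_int   # 1250 elements per block at most

m1 <- matrix(c(0L, -5L, integer(3998)), nrow=50, ncol=80)

min_blocks <- ceiling(length(m1) / max_elts)     # at least 4 blocks
n_nonzero  <- sum(m1 != 0L)                      # only 1 non-zero value in m1

# A single non-zero value can fall in only one block, so at least
# this many blocks contain nothing but zeros:
min_all_zero_blocks <- min_blocks - n_nonzero
```

So with a single non-zero value spread over at least four blocks, most of the blocks handed to write_block() would be all-zero.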

Trying to write an all-zero block with write_block():

sink <- TileDBRealizationSink(c(50L, 80L), type="integer", sparse=TRUE)
viewport <- ArrayViewport(dim(sink), IRanges(c("21-24", "31-40")))

block <- matrix(rpois(40, lambda=0.2), nrow=4, ncol=10)
write_block(sink, viewport, block)  # ok

block <- matrix(0L, nrow=4, ncol=10)  # all-zero block
write_block(sink, viewport, block)
# Error: [TileDB::Query] Error: Fix-Sized input attribute/dimension 'd2' is not set correctly. 
# Data buffer is not set.
# In addition: Warning messages:
# 1: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)
# 2: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)
# 3: In libtiledb_query_buffer_assign_ptr(buflist[[k]], alltypes[k],  :
#   subscript out of bounds (index 0 >= vector size 0)

A quick debug() session indicates that the error happens here:

obj[] <- data.frame(store)

So it seems that the root of the problem is that the subassignment ([<-) method for tiledb_array objects chokes on a zero-row data.frame.

The easy workaround would be for write_block() to not do anything when sink@sparse is TRUE and length(idx) is 0.
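A minimal sketch of that guard, in base R only so it can be run without TileDB. This is not the actual write_block() code: collect_nonzero(), write_nonzero() and write_fun are hypothetical names, the attribute name x is an assumption, and the d1/d2 column names are only guessed from the 'd2' dimension mentioned in the error message.

```r
# Hypothetical helper: gather the non-zero cells of a dense block into a
# data.frame of array coordinates plus values. 'starts' holds the 0-based
# offsets of the viewport along each dimension.
collect_nonzero <- function(block, starts) {
    coords <- which(block != 0, arr.ind=TRUE)  # zero-row matrix if all-zero
    data.frame(
        d1 = coords[, 1L] + starts[1L],
        d2 = coords[, 2L] + starts[2L],
        x  = block[coords]
    )
}

# Hypothetical guarded writer: 'write_fun' stands in for the real
# 'obj[] <- data.frame(store)' step and is skipped for all-zero blocks.
write_nonzero <- function(block, starts, write_fun) {
    store <- collect_nonzero(block, starts)
    if (nrow(store) == 0L) {
        return(invisible(FALSE))  # nothing to write for an all-zero block
    }
    write_fun(store)
    invisible(TRUE)
}

# Usage: same viewport offsets as rows 21-24, columns 31-40 above.
written <- list()
record <- function(df) written[[length(written) + 1L]] <<- df

zero_block <- matrix(0L, nrow=4, ncol=10)
skipped <- write_nonzero(zero_block, c(20L, 30L), record)

one_block <- zero_block
one_block[2, 3] <- 7L
done <- write_nonzero(one_block, c(20L, 30L), record)
```

With a guard like this the all-zero block never reaches the writer, while a block with at least one non-zero value is passed through unchanged.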

Thanks,
H.

sessionInfo():

> sessionInfo()
R version 4.6.0 alpha (2026-04-05 r89793)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.4 LTS

Matrix products: default
BLAS:   /home/hpages/R/R-4.6.r89793/lib/libRblas.so 
LAPACK: /home/hpages/R/R-4.6.r89793/lib/libRlapack.so;  LAPACK version 3.12.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_GB              LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] RcppSpdlog_0.0.28     TileDBArray_1.21.1    DelayedArray_0.37.1  
 [4] SparseArray_1.11.13   S4Arrays_1.11.1       IRanges_2.45.0       
 [7] abind_1.4-8           S4Vectors_0.49.1      MatrixGenerics_1.23.0
[10] matrixStats_1.5.0     BiocGenerics_0.57.0   generics_0.1.4       
[13] Matrix_1.7-5         

loaded via a namespace (and not attached):
 [1] nanotime_0.3.13 bit_4.6.0       lattice_0.22-9  zoo_1.8-15     
 [5] spdl_0.0.5      RcppCCTZ_0.2.14 XVector_0.51.0  bit64_4.6.0-1  
 [9] nanoarrow_0.8.0 grid_4.6.0      compiler_4.6.0  tools_4.6.0    
[13] tiledb_0.33.0   Rcpp_1.1.1     
