Coercion from SparseArraySeed to sparse matrix fails with duplicates #88

@LTLA

Using an example in ?SparseArraySeed to illustrate:

library(DelayedArray)
set.seed(10)

## A big very sparse DelayedMatrix object:
nzindex4 <- cbind(sample(25000, 600000, replace=TRUE),
                   sample(195000, 600000, replace=TRUE))
nzdata4 <- runif(600000)
sas4 <- SparseArraySeed(c(25000, 195000), nzindex4, nzdata4)
X <- as(sas4, "sparseMatrix")
## Error in validObject(.Object) : 
##   invalid class “dgCMatrix” object: slot i is not *strictly* increasing inside a column

The problem seems to be caused by duplicate non-zero coordinates (i.e., same i and j) in the SparseArraySeed (SAS) itself. One could argue that they should not be allowed to exist in the SAS at all: what does it even mean to have a duplicate entry?
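To make the diagnosis concrete, here is a minimal sketch in base R of how duplicated coordinates could be detected. The small `nzindex` matrix is a made-up illustration, not data from the report; the same `duplicated()` call should apply to `nzindex4` above. The last lines give a rough birthday-style estimate of how many colliding pairs to expect when 600000 coordinates are sampled uniformly from a 25000 x 195000 grid, which suggests that duplicates are likely in the example rather than a fluke of the seed.

```r
## Hypothetical toy coordinates (not from the original report).
nzindex <- rbind(c(1L, 1L),
                 c(2L, 3L),
                 c(1L, 1L),   # same (i, j) as the first row
                 c(4L, 2L))

## duplicated() on a matrix compares whole rows by default (MARGIN = 1),
## so each TRUE marks a repeated (i, j) pair.
is_dup <- as.vector(duplicated(nzindex))
n_dup <- sum(is_dup)
any(is_dup)

## Rough expected number of colliding pairs for n uniform draws from
## N cells: n * (n - 1) / (2 * N).
n <- 600000
N <- 25000 * 195000
expected_pairs <- n * (n - 1) / (2 * N)
expected_pairs   # roughly 37
```

Running `sum(duplicated(nzindex4))` on the data from the report would give the actual count for that seed.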

The constructor should probably warn/error/remove duplicates, though this may be an expensive operation. Perhaps add a check=TRUE option that can be set to FALSE for all internal uses where duplicates cannot exist.
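As a rough sketch of what such a check might look like, the helper below validates the coordinates before they would be handed to the constructor. `check_nz()` and its `on_dup` argument are hypothetical names invented for this illustration and are not part of DelayedArray; only the `check` flag mirrors the proposal above. Dropping all but the first occurrence is just one possible policy, since it is unclear what a duplicate entry should mean.

```r
## Hypothetical helper (not part of DelayedArray): validate or clean the
## coordinates before they reach the constructor.
check_nz <- function(nzindex, nzdata, check = TRUE,
                     on_dup = c("error", "drop")) {
    on_dup <- match.arg(on_dup)
    if (check) {
        ## Row-wise comparison of (i, j) pairs.
        dup <- as.vector(duplicated(nzindex))
        if (any(dup)) {
            if (on_dup == "error")
                stop(sum(dup), " duplicated coordinate(s) in 'nzindex'")
            warning("dropping ", sum(dup), " duplicated coordinate(s)")
            ## Keep only the first occurrence of each (i, j) pair.
            nzindex <- nzindex[!dup, , drop = FALSE]
            nzdata <- nzdata[!dup]
        }
    }
    list(nzindex = nzindex, nzdata = nzdata)
}

## Toy input with one repeated coordinate (made up for illustration).
nzindex <- rbind(c(1L, 1L), c(2L, 3L), c(1L, 1L))
nzdata <- c(0.5, 1.2, 0.7)
cleaned <- suppressWarnings(check_nz(nzindex, nzdata, on_dup = "drop"))
nrow(cleaned$nzindex)
```

Internal callers that already guarantee unique coordinates could pass `check = FALSE` to skip the `duplicated()` pass, which is the expensive part on large inputs.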

Session information
R Under development (unstable) (2021-01-24 r79876)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.5 LTS

Matrix products: default
BLAS:   /home/luna/Software/R/trunk/lib/libRblas.so
LAPACK: /home/luna/Software/R/trunk/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] DelayedArray_0.17.7  IRanges_2.25.6       S4Vectors_0.29.7    
[4] MatrixGenerics_1.3.1 matrixStats_0.58.0   BiocGenerics_0.37.1 
[7] Matrix_1.3-2        

loaded via a namespace (and not attached):
[1] compiler_4.1.0  grid_4.1.0      lattice_0.20-41
