Sparse to dense coercion when running merge on two Seurat objects. #9125

joshuak94 · 2024-07-19T09:51:37Z

I was trying to create a reproducible example of another issue I'm having with JoinLayers() taking an indefinite amount of time (killed manually after ~12 hours).

The dataset I used is from here; I used the gene_count_cleaned_sampled_100k.rds file along with the cell_annotation.csv file for metadata.

I split the gene matrix into two groups: E11.5 cells and E13.5 cells. When merging, I get the following warnings, and then eventually an error:

Warning message in asMethod(object):
“sparse->dense coercion: allocating vector of size 6.6 GiB”
Warning message in asMethod(object):
“sparse->dense coercion: allocating vector of size 3.6 GiB”

Error: cannot allocate vector of size 5.1 Gb
Traceback:

1. merge(data_115, data_135, add.cell.ids = c("115", "135"))
2. merge(data_115, data_135, add.cell.ids = c("115", "135"))
3. merge.default(data_115, data_135, add.cell.ids = c("115", "135"))
4. merge(as.data.frame(x), as.data.frame(y), ...)
5. merge.data.frame(as.data.frame(x), as.data.frame(y), ...)
6. cbind(x[ij[, 1L], , drop = FALSE], y[ij[, 2L], , drop = FALSE])
7. x[ij[, 1L], , drop = FALSE]
8. `[.data.frame`(x, ij[, 1L], , drop = FALSE)

My memory usage also skyrockets to 400+ GB.

Source code:

library(Seurat)

data = readRDS("/project/moca/gene_count_cleaned_sampled_100k.RDS")
metadata = read.csv("/project/moca/cell_annotate.csv")
rownames(data) = gsub("\\.\\d+$", "", rownames(data))

metadata_subset115 = metadata[which(metadata$sample %in% colnames(data) & metadata$development_stage == 11.5), ]
metadata_subset135 = metadata[which(metadata$sample %in% colnames(data) & metadata$development_stage == 13.5), ]

data_115 = data[, which(colnames(data) %in% metadata_subset115$sample)]
data_seurat_115 = CreateSeuratObject(data_115, meta.data = metadata_subset115)

data_135 = data[, which(colnames(data) %in% metadata_subset135$sample)]
data_seurat_135 = CreateSeuratObject(data_135, meta.data = metadata_subset135)

merged_data = merge(data_115, data_135, add.cell.ids=c("115", "135"))
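
A note on the traceback above: frames 3 to 5 show merge.default() and merge.data.frame(), which is the base R path taken when the arguments are not Seurat objects. In the source code, data_115 and data_135 are the subsetted count matrices, while the Seurat objects are named data_seurat_115 and data_seurat_135. Below is a minimal sketch of checking the classes before merging. It uses small made-up matrices, and every name and size in it is illustrative rather than taken from the dataset:

```r
library(Matrix)
library(Seurat)

set.seed(1)

# Toy sparse count matrices standing in for the E11.5 and E13.5 subsets
# (hypothetical sizes and names, for illustration only).
make_counts <- function(prefix, n_genes = 50, n_cells = 20) {
  m <- matrix(rpois(n_genes * n_cells, lambda = 0.5), nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)),
                              paste0(prefix, seq_len(n_cells))))
  Matrix(m, sparse = TRUE)
}
counts_a <- make_counts("a")
counts_b <- make_counts("b")

obj_a <- CreateSeuratObject(counts_a)
obj_b <- CreateSeuratObject(counts_b)

# The raw matrices are not Seurat objects, so merge() on them would fall
# through to base R's merge.default() and coerce them to data frames.
is_seurat <- c(inherits(counts_a, "Seurat"), inherits(obj_a, "Seurat"))

# Merging the Seurat objects dispatches to the Seurat merge method instead.
merged <- merge(obj_a, obj_b, add.cell.ids = c("115", "135"))
```

If this reading is right, the equivalent call in the original script would pass data_seurat_115 and data_seurat_135 to merge(). Whether that also resolves the JoinLayers() slowness mentioned at the top is a separate question.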

sessionInfo():

R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: MarIuX64 2.0 GNU/Linux

Matrix products: default
BLAS:   /pkg/R-4.4.0-0/lib/R/lib/libRblas.so 
LAPACK: /usr/lib/liblapack.so.3.10.1

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

time zone: Europe/Berlin
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Seurat_5.1.0       SeuratObject_5.0.2 sp_2.1-4          

loaded via a namespace (and not attached):
  [1] deldir_2.0-4           pbapply_1.7-2          gridExtra_2.3         
  [4] rlang_1.1.4            magrittr_2.0.3         RcppAnnoy_0.0.22      
  [7] spatstat.geom_3.3-2    matrixStats_1.3.0      ggridges_0.5.6        
 [10] compiler_4.4.0         png_0.1-8              vctrs_0.6.5           
 [13] reshape2_1.4.4         stringr_1.5.1          pkgconfig_2.0.3       
 [16] crayon_1.5.3           fastmap_1.2.0          utf8_1.2.4            
 [19] promises_1.3.0         purrr_1.0.2            jsonlite_1.8.8        
 [22] goftest_1.2-3          later_1.3.2            uuid_1.1-1            
 [25] spatstat.utils_3.0-5   irlba_2.3.5.1          parallel_4.4.0        
 [28] cluster_2.1.6          R6_2.5.1               ica_1.0-3             
 [31] stringi_1.8.4          RColorBrewer_1.1-3     spatstat.data_3.1-2   
 [34] reticulate_1.38.0      spatstat.univar_3.0-0  parallelly_1.37.1     
 [37] lmtest_0.9-40          scattermore_1.2        Rcpp_1.0.12           
 [40] IRkernel_1.3.2         tensor_1.5             future.apply_1.11.2   
 [43] zoo_1.8-12             base64enc_0.1-3        sctransform_0.4.1     
 [46] httpuv_1.6.15          Matrix_1.7-0           splines_4.4.0         
 [49] igraph_2.0.3           tidyselect_1.2.1       abind_1.4-5           
 [52] spatstat.random_3.3-1  codetools_0.2-20       miniUI_0.1.1.1        
 [55] spatstat.explore_3.3-1 listenv_0.9.1          lattice_0.22-6        
 [58] tibble_3.2.1           plyr_1.8.9             shiny_1.8.1.1         
 [61] ROCR_1.0-11            evaluate_0.24.0        Rtsne_0.17            
 [64] future_1.33.2          fastDummies_1.7.3      survival_3.5-8        
 [67] polyclip_1.10-6        fitdistrplus_1.2-1     pillar_1.9.0          
 [70] KernSmooth_2.23-22     plotly_4.10.4          generics_0.1.3        
 [73] RcppHNSW_0.6.0         IRdisplay_1.1          ggplot2_3.5.1         
 [76] munsell_0.5.1          scales_1.3.0           globals_0.16.3        
 [79] xtable_1.8-4           glue_1.7.0             lazyeval_0.2.2        
 [82] tools_4.4.0            data.table_1.15.4      RSpectra_0.16-1       
 [85] pbdZMQ_0.3-10          RANN_2.6.1             leiden_0.4.3.1        
 [88] dotCall64_1.1-1        cowplot_1.1.3          grid_4.4.0            
 [91] tidyr_1.3.1            colorspace_2.1-0       nlme_3.1-164          
 [94] patchwork_1.2.0        repr_1.1.6             cli_3.6.3             
 [97] spatstat.sparse_3.1-0  spam_2.10-0            fansi_1.0.6           
[100] viridisLite_0.4.2      dplyr_1.1.4            uwot_0.2.2            
[103] gtable_0.3.5           digest_0.6.36          progressr_0.14.0      
[106] ggrepel_0.9.5          htmlwidgets_1.6.4      htmltools_0.5.8.1     
[109] lifecycle_1.0.4        httr_1.4.7             mime_0.12             
[112] MASS_7.3-60.2

rsatija · 2024-07-19T20:04:26Z

Thank you for sending this, which is very helpful for us to debug.

Can you check whether the rownames of your metadata match the column names of your object? I.e., run all(rownames(object@meta.data)==colnames(object)), assuming your object is called object.

This relates to #9125

Let us know, and we will take a look early next week and get back to you ASAP.

joshuak94 · 2024-07-20T08:02:31Z

all(rownames(data_seurat_115@meta.data)==colnames(data_seurat_115)) 
all(rownames(data_seurat_135@meta.data)==colnames(data_seurat_135))

Both yield TRUE.

joshuak94 · 2024-08-12T14:25:31Z

Hi @rsatija, I was wondering if there was an update regarding this issue?

xlucpu · 2024-10-18T16:36:50Z

Same issue here, but with Xenium data. I have no idea why it happens or how to resolve it.

xenium.obj <- SCTransform(xenium.obj, assay = "Xenium")
Running SCTransform on assay: Xenium
Running SCTransform on layer: counts
vst.flavor='v2' set. Using model with fixed slope and excluding poisson genes.
Variance stabilizing transformation of count matrix of size 377 by 376392
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 376 genes, 5000 cells
Found 2 outliers - those will be ignored in fitting/regularization step

Second step: Get residuals using fitted parameters for 377 genes
Error in asMethod(object) :
(converted from warning) sparse->dense coercion: allocating vector of size 1.1 GiB
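
One detail in this output: the "(converted from warning)" prefix is what R prints when a warning is promoted to an error, which happens when options(warn = 2) (or higher) is in effect. The following base R sketch shows that mechanism with an unrelated warning; it makes no claim about where the option was set in the session above or about whether the dense allocation itself would succeed:

```r
# Promote warnings to errors, as options(warn = 2) does.
old <- options(warn = 2)

# as.numeric("x") normally warns ("NAs introduced by coercion");
# with warn = 2 that warning is raised as an error instead.
msg <- tryCatch(
  as.numeric("x"),
  error = function(e) conditionMessage(e)
)

# Restore the previous setting; the same call now only warns and returns NA.
options(old)
value <- suppressWarnings(as.numeric("x"))
```

Checking getOption("warn") before calling SCTransform() would show whether this applies. With the default setting, the coercion message would be reported as a warning rather than stopping the run.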

joshuak94 added the bug (Something isn't working) label on Jul 19, 2024