Getting error subsetting object created via saveHDF5SummarizedExperiment on different machine (and different R version) #69

@epurdom

Hello,

I am having difficulty trying to subset a SingleCellExperiment object that was created via saveHDF5SummarizedExperiment on a different machine.

It was created from a SingleCellExperiment object on my compute server, which is running R 4.5.0:

HDF5Array::saveHDF5SummarizedExperiment(
  sce_object,
  dir = harmony_dir,
  replace = TRUE,
  verbose = TRUE
)

When I copied the saved folder to my laptop running R 4.6.0, I was able to load it with

sce_object <- loadHDF5SummarizedExperiment(file.path(results_dir, "sce_qc_harmony_h5se"))

but when I tried to subset it, I got this error message:

> sce_object[1:2,1:4]
Error in validObject(.Object) : invalid class “DelayedSubset” object: 
    the supplied seed must support extract_array()

On my compute server, however, I have no problem loading it via loadHDF5SummarizedExperiment and subsetting it.

I can do other things with the object on my laptop, however, like

assay(sce_object)[1:3,1:4]
reducedDim(sce_object,"PCA")[1:4,1:3]
colData(sce_object)

Indeed, I was able to work with it and extract information perfectly well for a good bit until I tried to subset it.

I tried to make a minimal example object on the compute server to recreate the error (using an example from the saveHDF5SummarizedExperiment documentation). I saved it and transferred it to my laptop, and I did not have a problem: the subsetting works fine. So the error doesn't seem to be directly related to the difference in R versions (or if it is, the connection is much more subtle!).

# This object can be read and subsetted on both machines
library(SingleCellExperiment)

nrow <- 200
ncol <- 6
counts <- matrix(as.integer(runif(nrow * ncol, 1, 1e4)), nrow)
colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
                     row.names=LETTERS[1:6])
se0 <- SingleCellExperiment(assays=list(counts=counts), colData=colData)
reducedDim(se0,"FakePCA")<-matrix(rnorm(ncol*3),ncol)

## Save 'se0' as an HDF5-based SummarizedExperiment object:
dir <- "temph5/test1"
h5_se0 <- HDF5Array::saveHDF5SummarizedExperiment(
  se0,
  dir = dir,
  replace = TRUE,
  verbose = TRUE
)
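For reference, the save/load/subset round trip that succeeds on both machines looks roughly like the sketch below. It is self-contained for convenience: a temporary directory stands in for the folder copied between machines, and `se_small`, `tmp_dir`, `se_loaded`, and `se_sub` are illustrative names, not objects from my real analysis.

```r
library(SingleCellExperiment)
library(HDF5Array)

# Small stand-in object, built like se0 above (illustrative only)
counts <- matrix(as.integer(runif(200 * 6, 1, 1e4)), nrow = 200)
se_small <- SingleCellExperiment(assays = list(counts = counts))

# Save to a temporary directory; in the real case this folder is
# what gets copied from the compute server to the laptop
tmp_dir <- file.path(tempdir(), "h5se_check")
saveHDF5SummarizedExperiment(se_small, dir = tmp_dir, replace = TRUE)

# Reload and subset: this is the step that fails for the real object
se_loaded <- loadHDF5SummarizedExperiment(tmp_dir)
se_sub <- se_loaded[1:2, 1:4]
dim(se_sub)
```

With the real object, the same `[1:2, 1:4]` call is what triggers the DelayedSubset validity error on the laptop.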

So I am baffled about how to even go about diagnosing the error. I don't know if this is related to these issues on DelayedArray: Bioconductor/DelayedArray#125 or Bioconductor/DelayedArray#112?

I would appreciate any suggestions! Thanks

Here is the session info on my compute server:

# On my compute server
> sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 24.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Los_Angeles
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] SingleCellExperiment_1.32.0 SummarizedExperiment_1.40.0
 [3] Biobase_2.70.0              GenomicRanges_1.62.0       
 [5] Seqinfo_1.0.0               HDF5Array_1.38.0           
 [7] h5mread_1.2.0               rhdf5_2.54.0               
 [9] DelayedArray_0.36.0         SparseArray_1.10.2         
[11] S4Arrays_1.10.0             IRanges_2.44.0             
[13] abind_1.4-8                 S4Vectors_0.48.0           
[15] MatrixGenerics_1.22.0       matrixStats_1.5.0          
[17] BiocGenerics_0.56.0         generics_0.1.4             
[19] Matrix_1.7-4                SCF_4.1.0                  

loaded via a namespace (and not attached):
[1] lattice_0.22-7      rhdf5filters_1.22.0 XVector_0.50.0     
[4] Rhdf5lib_1.32.0     grid_4.5.0          compiler_4.5.0     
[7] tools_4.5.0        
> 

And here is the session info from my laptop:

# My laptop
> sessionInfo()
R version 4.6.0 (2026-04-24)
Platform: aarch64-apple-darwin23
Running under: macOS Sonoma 14.8.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SingleCellExperiment_1.34.0 SummarizedExperiment_1.42.0
 [3] Biobase_2.72.0              GenomicRanges_1.64.0       
 [5] Seqinfo_1.2.0               HDF5Array_1.40.0           
 [7] h5mread_1.4.0               rhdf5_2.56.0               
 [9] DelayedArray_0.38.1         SparseArray_1.12.2         
[11] S4Arrays_1.12.0             IRanges_2.46.0             
[13] abind_1.4-8                 S4Vectors_0.50.1           
[15] MatrixGenerics_1.24.0       matrixStats_1.5.0          
[17] BiocGenerics_0.58.1         generics_0.1.4             
[19] Matrix_1.7-5               

loaded via a namespace (and not attached):
[1] lattice_0.22-9      rhdf5filters_1.24.0 XVector_0.52.0      Rhdf5lib_2.0.0     
[5] grid_4.6.0          compiler_4.6.0      tools_4.6.0        

On my laptop, BiocManager::valid() flags only bit64 as out-of-date, but I can't actually update it with the suggested command: nothing happens, because the installed version is already the most current one available for my machine.

> BiocManager::valid()

* sessionInfo()

R version 4.6.0 (2026-04-24)
Platform: aarch64-apple-darwin23
Running under: macOS Sonoma 14.8.4

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.6/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SingleCellExperiment_1.34.0 SummarizedExperiment_1.42.0
 [3] Biobase_2.72.0              GenomicRanges_1.64.0       
 [5] Seqinfo_1.2.0               HDF5Array_1.40.0           
 [7] h5mread_1.4.0               rhdf5_2.56.0               
 [9] DelayedArray_0.38.1         SparseArray_1.12.2         
[11] S4Arrays_1.12.0             IRanges_2.46.0             
[13] abind_1.4-8                 S4Vectors_0.50.1           
[15] MatrixGenerics_1.24.0       matrixStats_1.5.0          
[17] BiocGenerics_0.58.1         generics_0.1.4             
[19] Matrix_1.7-5               

loaded via a namespace (and not attached):
[1] lattice_0.22-9      rhdf5filters_1.24.0 XVector_0.52.0      Rhdf5lib_2.0.0     
[5] grid_4.6.0          compiler_4.6.0      tools_4.6.0         BiocManager_1.30.27

Bioconductor version '3.23'

  * 1 packages out-of-date
  * 0 packages too new

create a valid installation with

  BiocManager::install("bit64", update = TRUE, ask = FALSE, force = TRUE)

more details: BiocManager::valid()$too_new, BiocManager::valid()$out_of_date

Warning message:
1 packages out-of-date; 0 packages too new 
