can't read mudata created with muon (python) #9

bio-la · 2022-07-28T14:21:58Z

Hello, thanks for working on interoperability between Seurat and MuData!

I can't read a MuData file, which I created by following your multimodal tutorial, with ReadH5MU:

test <- ReadH5MU("data_test.dir/pbmc_w3_teaseq.h5mu")
Error in dataset[[name]]$read() : attempt to apply non-function

I have no problems loading the object with muon in Python:

import muon as mu
mu.read_h5mu("data_test.dir/pbmc_w3_teaseq.h5mu")
MuData object with n_obs × n_vars = 5805 × 113187
  obs:  'sample', 'well', 'leiden_multiplex', 'leiden_mofa', 'leiden_wnn'
  var:  'highly_variable', 'gene_ids', 'feature_types', 'genome', 'interval'
  obsm: 'X_mofa', 'X_umap', 'X_wnn_umap'
  varm: 'LFs'
  obsp: 'mofa_connectivities', 'mofa_distances', 'wnn_connectivities', 'wnn_distances'
  3 modalities
    rna:        5805 x 16381
      obs:      'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt', 'leiden'
      var:      'gene_ids', 'feature_types', 'genome', 'interval', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
      uns:      'hvg', 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
      obsm:     'X_pca', 'X_umap'
      varm:     'PCs'
      layers:   'lognorm'
      obsp:     'connectivities', 'distances'
    atac:       5805 x 96760
      obs:      'n_fragments', 'n_duplicate', 'n_mito', 'n_unique', 'altius_count', 'altius_frac', 'gene_bodies_count', 'gene_bodies_frac', 'peaks_count', 'peaks_frac', 'tss_count', 'tss_frac', 'barcodes', 'cell_name', 'well_id', 'chip_id', 'batch_id', 'pbmc_sample_id', 'DoubletScore', 'DoubletEnrichment', 'TSSEnrichment', 'n_genes_by_counts', 'total_counts', 'n_counts', 'leiden'
      var:      'gene_ids', 'feature_types', 'genome', 'interval', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'mean', 'std'
      uns:      'hvg', 'leiden', 'leiden_colors', 'log1p', 'neighbors', 'pca', 'umap'
      obsm:     'X_pca', 'X_umap'
      varm:     'PCs'
      layers:   'counts', 'lognorm'
      obsp:     'connectivities', 'distances'
    prot:       5805 x 46
      obs:      'total_counts'
      var:      'highly_variable'
      uns:      'neighbors', 'pca', 'umap'
      obsm:     'X_pca', 'X_umap'
      varm:     'PCs'
      layers:   'counts'
      obsp:     'connectivities', 'distances'

I can explore the HDF5 file, but reading breaks at the call named in the error. The reader also seems to expect some attributes that my MuData file doesn't have:

h5 <- open_and_check_mudata("~/Documents/devel/data_test.dir/pbmc_w3_teaseq.h5mu")
metadata <- read_with_index(h5[["obs"]])
dataset <- h5[["obs"]]
dataset_attr <- tryCatch({
  h5attributes(dataset)
}, error = function(e) {
  list("_index" = "_index")
})
indexcol <- "_index"
if ("_index" %in% names(dataset_attr)) {
  indexcol <- dataset_attr$`_index`
}
dataset_attr
columns <- names(dataset)
columns <- columns[columns != "__categories"]
columns

dataset[["sample"]]$read()

Error in dataset[[name]]$read() : attempt to apply non-function

values_attr <- h5attributes(dataset)
values_attr
$`column-order`
[1] "sample" "well"  

$`_index`
[1] "_index"

$`encoding-type`
[1] "dataframe"

$`encoding-version`
[1] "0.2.0"

# so the following line will be NULL
# values_attr$categories

Any suggestions?

Thanks!

R version 4.1.2 (2021-11-01)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] bmcite.SeuratData_0.3.0 pbmc3k.SeuratData_3.1.4 SeuratData_0.2.2        hdf5r_1.3.5             MuDataSeurat_0.0.0.9000 magrittr_2.0.3          datapasta_3.1.0        
 [8] forcats_0.5.1           stringr_1.4.0           dplyr_1.0.8             purrr_0.3.4             readr_2.1.2             tidyr_1.2.0             tibble_3.1.6           
[15] ggplot2_3.3.5           tidyverse_1.3.1        

loaded via a namespace (and not attached):
  [1] readxl_1.4.0          backports_1.4.1       plyr_1.8.7            igraph_1.3.0          lazyeval_0.2.2        splines_4.1.2         listenv_0.8.0         scattermore_0.8      
  [9] digest_0.6.29         htmltools_0.5.2       fansi_1.0.3           tensor_1.5            cluster_2.1.3         ROCR_1.0-11           tzdb_0.3.0            remotes_2.4.2        
 [17] globals_0.14.0        modelr_0.1.8          matrixStats_0.62.0    spatstat.sparse_2.1-0 prettyunits_1.1.1     colorspace_2.0-3      rappdirs_0.3.3        rvest_1.0.2          
 [25] ggrepel_0.9.1         haven_2.4.3           callr_3.7.0           crayon_1.5.1          jsonlite_1.8.0        spatstat.data_2.1-4   survival_3.3-1        zoo_1.8-9            
 [33] glue_1.6.2            polyclip_1.10-0       gtable_0.3.0          leiden_0.3.9          clipr_0.8.0           pkgbuild_1.3.1        future.apply_1.8.1    abind_1.4-5          
 [41] scales_1.1.1          DBI_1.1.2             spatstat.random_2.2-0 miniUI_0.1.1.1        Rcpp_1.0.8.3          viridisLite_0.4.0     xtable_1.8-4          reticulate_1.24      
 [49] spatstat.core_2.4-2   bit_4.0.4             htmlwidgets_1.5.4     httr_1.4.2            anndata_0.7.5.3       RColorBrewer_1.1-3    ellipsis_0.3.2        Seurat_4.1.0         
 [57] ica_1.0-2             pkgconfig_2.0.3       uwot_0.1.11           dbplyr_2.1.1          deldir_1.0-6          utf8_1.2.2            tidyselect_1.1.2      rlang_1.0.2          
 [65] reshape2_1.4.4        later_1.3.0           munsell_0.5.0         cellranger_1.1.0      tools_4.1.2           cli_3.3.0             generics_0.1.2        broom_0.7.12         
 [73] ggridges_0.5.3        fastmap_1.1.0         goftest_1.2-3         processx_3.5.3        bit64_4.0.5           fs_1.5.2              fitdistrplus_1.1-8    RANN_2.6.1           
 [81] pbapply_1.5-0         future_1.24.0         nlme_3.1-157          mime_0.12             formatR_1.12          xml2_1.3.3            compiler_4.1.2        rstudioapi_0.13      
 [89] plotly_4.10.0         curl_4.3.2            png_0.1-7             spatstat.utils_2.3-0  reprex_2.0.1          stringi_1.7.6         ps_1.6.0              lattice_0.20-45      
 [97] Matrix_1.4-1          SeuratDisk_0.0.0.9019 vctrs_0.3.8           pillar_1.7.0          lifecycle_1.0.1       spatstat.geom_2.4-0   lmtest_0.9-40         RcppAnnoy_0.0.19     
[105] addinexamples_0.1.0   data.table_1.14.2     cowplot_1.1.1         irlba_2.3.5           httpuv_1.6.5          patchwork_1.1.1       R6_2.5.1              promises_1.2.0.1     
[113] KernSmooth_2.23-20    gridExtra_2.3         parallelly_1.31.0     codetools_0.2-18      MASS_7.3-56           assertthat_0.2.1      rprojroot_2.0.3       withr_2.5.0          
[121] SeuratObject_4.0.4    sctransform_0.3.3     mgcv_1.8-40           parallel_4.1.2        hms_1.1.1             grid_4.1.2            rpart_4.1.16          Rtsne_0.15           
[129] shiny_1.7.1           lubridate_1.8.0


gtca · 2022-07-28T14:45:37Z

Hey @bio-la, thanks for the detailed report. I think the issue is simply that anndata v0.8 broke forward compatibility: AnnData files written by versions >=0.8 cannot be read by packages that target the earlier file format.
Support for the new format is what we're implementing next for the MuDataMAE/MuDataSeurat packages, but I don't have a solid timeline to share yet.

bio-la · 2022-07-28T14:56:12Z

Thanks for answering this! Do you mean that MuDataSeurat was built expecting objects created with anndata <0.8? And with which version of muon/mudata?
In the meantime, do you think I could create a readable version of the MuData file by downgrading muon? (I'm currently running the latest muon.)
Thanks!
f

gtca · 2022-07-28T15:18:50Z

Sorry for being unclear. Here is a better way to phrase it: MuDataSeurat was built before the AnnData v0.8 spec and hasn't been updated to handle the new versions of AnnData files yet.
Basically, it is the same issue as using anndata <0.8 in Python to read new files (scverse/anndata#698, scverse/anndata#739).

I believe using mudata <0.2.0 (e.g. 0.1.2) together with anndata <0.8.0 (e.g. 0.7.8) should help with that!
muon relies on mudata for serialisation, so its version shouldn't be a problem here.
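To check whether a Python environment matches the suggested workaround before writing a file, something like the following sketch may help. The version thresholds are taken from the comment above; the helper names (`parse_version`, `writes_legacy_format`, `installed_versions`) are hypothetical and not part of mudata or anndata, and the version parsing is deliberately simplistic.

```python
import re
from importlib import metadata

# Upper bounds suggested in the thread: anndata <0.8.0 and mudata <0.2.0.
LEGACY_LIMITS = {"anndata": (0, 8, 0), "mudata": (0, 2, 0)}


def parse_version(text):
    """Turn '0.7.8' or '0.8.0rc1' into a (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def writes_legacy_format(versions):
    """True only if every listed package is installed and below its limit."""
    for name, limit in LEGACY_LIMITS.items():
        version = versions.get(name)
        if version is None or parse_version(version) >= limit:
            return False
    return True


def installed_versions():
    """Look up installed versions; None marks a package that is absent."""
    found = {}
    for name in LEGACY_LIMITS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    versions = installed_versions()
    print(versions, "->", writes_legacy_format(versions))
```

If the check reports `False`, the file written from that environment would presumably use the newer encoding and run into the same reading error.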

gtca · 2022-08-04T13:30:17Z

Hey @bio-la, I added handling for the new categorical values encoding in the latest commits, so you might now be able to read files written by the latest anndata/mudata. This is not fully compliant with the rest of the new encoding spec, but relying on rhdf5 to figure out all the other types seems to do the trick so far for the files I've tested it on.

In other words, this is a small fix to remedy the current issue, but also the first step towards this library becoming v0.8-compliant. 😃

tea <- ReadH5MU("muon-tutorials/data/teaseq/pbmc_w3_teaseq.h5mu")
tea
# An object of class Seurat 
# 113187 features across 5805 samples within 3 assays 
# Active assay: rna (16381 features, 2910 variable features)
#  2 other assays present: atac, prot
#  10 dimensional reductions calculated: MOFA, MOFA_UMAP, UMAP, WNN_UMAP, rnaPCA, rnaUMAP, atacPCA, atacUMAP, protPCA, protUMAP

Note that it now also respects the mod-order attribute and keeps rna as the default assay.
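For context on the categorical handling mentioned above: as far as I understand the AnnData v0.8 on-disk format, a categorical column is stored as a group with an `encoding-type` of `categorical` that holds a `codes` array and a `categories` array, rather than as a single dataset. That would be consistent with `dataset[["sample"]]$read()` failing in the original report. The sketch below illustrates the decoding step on plain Python dicts and lists standing in for HDF5 objects. The function names and the example labels are made up for illustration, and treating negative codes as missing values is an assumption based on how pandas encodes categoricals.

```python
def decode_categorical(codes, categories, missing=None):
    """Map integer codes to labels; negative codes become `missing`."""
    return [categories[code] if code >= 0 else missing for code in codes]


def read_column(node):
    """Read one column from a stand-in for an HDF5 node.

    A dict tagged as categorical plays the role of a v0.8-style group;
    anything else is treated as a plain array of values.
    """
    if isinstance(node, dict) and node.get("encoding-type") == "categorical":
        return decode_categorical(node["codes"], node["categories"])
    return list(node)


# Hypothetical column contents, not taken from the file in this issue.
new_style = {
    "encoding-type": "categorical",
    "codes": [1, 1, 0, -1],
    "categories": ["x", "y"],
}
plain = ["p", "q"]

print(read_column(new_style))
print(read_column(plain))
```

A reader written for the older layout would only know the second branch, which is one plausible way to end up with the "attempt to apply non-function" error.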

rcannood · 2022-09-29T16:57:16Z

Hi @gtca !

I was encountering the above issue as well. I installed the latest development version of MuDataSeurat and am now experiencing the following error:

library(MuDataSeurat)

## VIASH START
par <- list(
  input = "resources_test/pbmc_1k_protein_v3/pbmc_1k_protein_v3_mms.h5mu",
  output = "output.rds"
)
## VIASH END

obj <- ReadH5MU(par$input)

Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 't': attempt to apply non-function
In addition: Warning messages:
1: In missing_on_read("/var", paste0("global variables metadata (",  :
  Missing on read: /var. Seurat does not support global variables metadata (genome, feature_types, gene_symbol).
2: In missing_on_read(paste0("some of mod/", modalityname, "/layers"),  :
  Missing on read: some of mod/rna/layers. Seurat does not support custom layers, unless labeled 'counts'.

Files which trigger this error:


DriesSchaumont · 2023-11-09T07:47:45Z

Hi @gtca, I am still observing the issue @rcannood described. Are there any updates on supporting objects created with anndata >= 0.8? Thanks!

DriesSchaumont · 2023-11-14T07:02:19Z

Thanks a lot, @ilia-kats 🙇

gtca added the "0.8" (AnnData v0.8 spec) label on Aug 3, 2022

rcannood added a commit (fed4702) to openpipelines-bio/openpipeline that referenced this issue on Sep 29, 2022:

convert/from_h5mu_to_seurat: Disabled because MuDataSeurat is currently broken, see PMBio/MuDataSeurat#9.

ilia-kats closed this as completed in 48a1a10 Nov 13, 2023
