Error loading h5ad file from ScanPy. #1689

Closed
dkeitley opened this issue Jun 14, 2019 · 11 comments

Comments

@dkeitley

I'm trying to load a dataset into R (Wagner et al. 2018) that has been exported from ScanPy.

The exported file is an AnnData object in the h5ad file format, so I've tried to run the ReadH5AD Seurat function, but I get the following error:

> data = ReadH5AD(file = "scanpy_data.h5ad")
Pulling expression matrices and metadata
Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][],  : all(dims >= dims.min) is not TRUE

What does this mean? Is there something I can do to edit the file in Python so that it can be imported into R and used with Seurat?

SessionInfo:

> sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] Seurat_3.0.1

loaded via a namespace (and not attached):
 [1] httr_1.4.0          tidyr_0.8.3         bit64_0.9-7
 [4] hdf5r_1.2.0         jsonlite_1.6        viridisLite_0.3.0
 [7] splines_3.6.0       lsei_1.2-0          R.utils_2.9.0
[10] gtools_3.8.1        Rdpack_0.11-0       assertthat_0.2.1
[13] ggrepel_0.8.1       globals_0.12.4      pillar_1.4.0
[16] lattice_0.20-38     glue_1.3.1          reticulate_1.12
[19] digest_0.6.19       RColorBrewer_1.1-2  SDMTools_1.1-221.1
[22] colorspace_1.4-1    cowplot_0.9.4       htmltools_0.3.6
[25] Matrix_1.2-17       R.oo_1.22.0         plyr_1.8.4
[28] pkgconfig_2.0.2     bibtex_0.4.2        tsne_0.1-3
[31] listenv_0.7.0       purrr_0.3.2         scales_1.0.0
[34] RANN_2.6.1          gdata_2.18.0        Rtsne_0.15
[37] tibble_2.1.1        ggplot2_3.1.1       ROCR_1.0-7
[40] pbapply_1.4-0       lazyeval_0.2.2      survival_2.43-3
[43] magrittr_1.5        crayon_1.3.4        R.methodsS3_1.7.1
[46] future_1.13.0       nlme_3.1-140        MASS_7.3-51.1
[49] gplots_3.0.1.1      ica_1.0-2           tools_3.6.0
[52] fitdistrplus_1.0-14 data.table_1.12.2   gbRd_0.4-11
[55] stringr_1.4.0       plotly_4.9.0        munsell_0.5.0
[58] cluster_2.0.9       irlba_2.3.3         compiler_3.6.0
[61] rsvd_1.0.1          caTools_1.17.1.2    rlang_0.3.4
[64] grid_3.6.0          ggridges_0.5.1      htmlwidgets_1.3
[67] igraph_1.2.4.1      bitops_1.0-6        npsurv_0.4-0
[70] gtable_0.3.0        codetools_0.2-16    reshape2_1.4.3
[73] R6_2.4.0            gridExtra_2.3       zoo_1.8-6
[76] dplyr_0.8.1         bit_1.1-14          future.apply_1.2.0
[79] KernSmooth_2.23-15  metap_1.1           ape_5.3
[82] stringi_1.4.3       parallel_3.6.0      Rcpp_1.0.1
[85] sctransform_0.2.0   png_0.1-7           tidyselect_0.2.5
[88] lmtest_0.9-37
@ChengTao2017

Have you solved this problem?

@dkeitley
Author

I've implemented a workaround that converts the data into another format. I loaded the data in ScanPy and exported to a loom file, which I'm loading with loomR. However, I'm still interested in what causes this error and if the h5ad file can be read directly.

@mojaveazure
Member

Hi Dan,

Sorry for the delay. This dataset seems a bit off; I don't think it was made in the standard cells-as-columns, genes-as-rows format. The shape listed in the file says the H5AD file was made from a matrix that had 63,530 rows (should be genes) and 30,677 columns (should be cells). I would double check to ensure that the matrix is as you'd expect.
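
For readers wondering what the `all(dims >= dims.min)` check in the error compares, here is a minimal sketch in plain Python. It assumes that the stored `indices` and `indptr` arrays are handed to `sparseMatrix` as row indices and column pointers, as the call in the error message suggests, together with dimensions declared in the file. The function names and toy arrays are hypothetical, and this is an illustration rather than the actual Seurat or Matrix code.

```python
# Illustrative only: mimics the kind of dimension check named in the error.
# The function names and the toy arrays below are hypothetical.

def min_dims(indices, indptr):
    """Smallest (rows, cols) that can hold the entries when `indices` are
    0-based row indices and `indptr` are column pointers."""
    n_rows = max(indices) + 1 if indices else 0
    n_cols = len(indptr) - 1
    return (n_rows, n_cols)


def dims_are_valid(declared, indices, indptr):
    """True when every declared dimension is at least the minimum required."""
    needed = min_dims(indices, indptr)
    return all(d >= m for d, m in zip(declared, needed))


# Toy matrix stored row by row: 2 rows (cells) x 5 columns (genes).
# Non-zero entries sit at (0, 0), (0, 4) and (1, 2).
indices = [0, 4, 2]   # column (gene) index of each non-zero entry
indptr = [0, 2, 3]    # where each row (cell) starts in `indices`

# Read as row indices plus column pointers, the same arrays describe the
# transposed matrix: 5 rows (genes) x 2 columns (cells).
print(min_dims(indices, indptr))                # (5, 2)
print(dims_are_valid((5, 2), indices, indptr))  # True
print(dims_are_valid((2, 5), indices, indptr))  # False: 2 rows cannot hold row index 4
```

If the declared dimensions and the stored arrays disagree about orientation, a check of this kind fails, which is consistent with the advice to inspect the stored shape.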

If the data seem intact, you can either remove the h5sparse_shape attribute from the matrix (the X group in the H5AD file) or reverse it.
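
A minimal sketch of both options using h5py, assuming the attribute is stored on the `X` group exactly as described above and that you work on a copy of the file. The function names are hypothetical, and this has not been tested against the file from this issue.

```python
import h5py
import numpy as np

ATTR = "h5sparse_shape"  # attribute name mentioned above


def reverse_shape_attr(path, group="X"):
    """Reverse the stored shape in place; returns the new shape, or None if absent."""
    with h5py.File(path, "r+") as f:
        attrs = f[group].attrs
        if ATTR not in attrs:
            return None
        # Copy into a contiguous array before writing it back.
        reversed_shape = np.ascontiguousarray(attrs[ATTR][::-1])
        attrs[ATTR] = reversed_shape
        return tuple(int(n) for n in reversed_shape)


def remove_shape_attr(path, group="X"):
    """Delete the stored shape; returns True if the attribute was present."""
    with h5py.File(path, "r+") as f:
        attrs = f[group].attrs
        if ATTR not in attrs:
            return False
        del attrs[ATTR]
        return True
```

Either function would be called with the path to a copy of the H5AD file; only one of the two should be needed.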

@qi825

qi825 commented Oct 15, 2019

Would you please tell me how to export to a loom file? I have loaded the data in Scanpy.
My code:
data=sc.read_h5ad("abc.h5ad", backed=None, chunk_size=6000)
I would appreciate it if you could provide the code that follows this step! @dkeitley

@Counts-Xin

Sorry to bother you. In Scanpy I converted to a loom file, but I still cannot load it in Seurat. How did you solve that?

@dkeitley
Author

dkeitley commented May 9, 2020

Sorry @qi825 I must have missed this.

I seem to have lost my script for exporting/importing as a loom file, but I still have this one, which exports an mtx file.

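import scanpy as sc
import scipy.io  # import the submodule explicitly so mmwrite is available

# path: location of the h5ad file exported from ScanPy
zd = sc.read(path)
# Write the expression matrix (cells x genes) in Matrix Market format
scipy.io.mmwrite("matrix.mtx", zd.X)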

...which you can then read into R with

library(Matrix)
counts = t(readMM(paste0(path,"matrix.mtx")))
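
Note that a Matrix Market file holds only the matrix entries, so cell and gene names are not carried over by the export above. A minimal sketch for saving them alongside the matrix, one name per line; the helper name and file names are hypothetical, and `zd.obs_names` / `zd.var_names` are the AnnData cell and gene name indexes.

```python
def write_names(path, names):
    """Write one name per line so the file can be read back with readLines() in R."""
    with open(path, "w", encoding="utf-8") as handle:
        for name in names:
            handle.write(f"{name}\n")


# Stand-in lists; with the AnnData object above you would pass
# zd.obs_names (cells) and zd.var_names (genes) instead.
write_names("cells.txt", ["cell_1", "cell_2"])
write_names("genes.txt", ["geneA", "geneB", "geneC"])
```

After the transpose in the R snippet, rows are genes and columns are cells, so the gene names would become the row names and the cell names the column names.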

Hope this helps @kzxkzx.

@AIBio

AIBio commented Sep 21, 2020

Hi,
I ran into the same problem, but I am not familiar with Python and Scanpy. I have loaded my H5AD file into Python and the data seem intact. Could you show me how to remove the h5sparse_shape attribute or transpose it?
Thank you very much.

@AIBio

AIBio commented Sep 21, 2020

Hi,
After reading the documentation for the 'anndata' package, I figured it out.

Here is my code:
adata.T.write_h5ad(dat_dir + "/brie_quant_cell_trans.h5ad")

After that, I succeeded in loading the H5AD file into R with 'SeuratDisk'! I hope my experience can help other people.
Thank you for your reply.

Hanwen

@pjlmac

pjlmac commented Nov 18, 2020

Hi Hanwen, I just tried your solution and I am getting the following error:
Error: Cannot add more or fewer meta.features information without values being named with feature names

Any thoughts? Thanks!
-Phil

@JulieBaker1

JulieBaker1 commented Mar 20, 2022

adata.T.T.write_h5ad(dat_dir + "/brie_quant_cell_trans.h5ad") solved my problem. Transposing once transposes the counts matrix, so transposing twice works.
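
As a side note on what a double transpose does to a sparse matrix, here is a small SciPy sketch on a toy matrix (not the data from this issue). It only shows that transposing twice restores the original orientation and values; it does not explain why re-writing the file made it load.

```python
import numpy as np
import scipy.sparse as sp

# Toy counts matrix: 2 cells (rows) x 3 genes (columns), stored as CSR.
counts = sp.csr_matrix(np.array([[1, 0, 2],
                                 [0, 3, 0]]))

once = counts.T    # 3 x 2, genes as rows
twice = once.T     # back to 2 x 3, cells as rows

print(counts.shape, once.shape, twice.shape)
print(once.format, twice.format)

# The double transpose holds the same values in the same positions.
same = np.array_equal(twice.toarray(), counts.toarray())
print(same)
```

So on the matrix itself, `.T.T` leaves the orientation unchanged; any difference in the written file would have to come from the write step rather than from the transposes.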

@denvercal1234GitHub

@JulieBaker1 -- Can you help explain why we need to transpose twice? I downgraded the anndata module and rewrote the h5ad object without any T, as in mojaveazure/seurat-disk#109, and it appeared to work. Am I missing something?
