10x_HDF5 format support #5

Closed
radio1988 opened this issue Mar 21, 2018 · 6 comments

Comments

@radio1988

Can you add support for the 10x HDF5 format? Thanks!

https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/advanced/h5_matrices

@mohuangx
Owner

Hi,

It seems that if you use the cellrangerRkit package, you can load the matrix as a GeneBCMatrix class. cellrangerRkit then has a function called exprs which takes in the GeneBCMatrix class and outputs a sparse matrix. The sparse matrix can then be provided as an input to the saver function.

Unfortunately, since cellrangerRkit is not on CRAN, it would be a bit cumbersome to have SAVER import functions from cellrangerRkit. Thus, I recommend that you extract the matrix according to the cellrangerRkit vignette and use that as the input to SAVER.
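
The steps described above can be sketched roughly as follows. This is a minimal sketch, not a tested recipe: it assumes cellrangerRkit and SAVER are both installed, and `load_10x_counts()` is a hypothetical helper name used only for illustration (it is not part of either package). The function calls inside it are the ones named in this thread.

```r
# Sketch only: assumes the cellrangerRkit package is installed.
# load_10x_counts() is a hypothetical helper, not part of any package.
load_10x_counts <- function(h5_path, genome) {
  # Attach cellrangerRkit so get_matrix_from_h5() and exprs() are available.
  library(cellrangerRkit)
  # Read the 10x HDF5 file into a GeneBCMatrix object.
  gbm <- get_matrix_from_h5(h5_path, genome)
  # Extract the sparse genes-by-cells count matrix.
  exprs(gbm)
}

# Usage (not run; the path and genome are placeholders):
# counts  <- load_10x_counts("path/to/filtered_gene_bc_matrices_h5.h5", "mm10")
# imputed <- SAVER::saver(counts)
```

Keeping the loading step separate from the call to saver means SAVER itself does not need to depend on cellrangerRkit.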

Let me know if you have any additional questions.

Mo

@radio1988
Author

Thanks, Mo!

This method worked on a smaller test dataset.

On the target dataset, however, the get_matrix_from_h5() function from cellrangerRkit reported an error and failed to read the sparse data matrix. This is probably because the dataset is huge: 1.3M cells, 27k genes, and more than 2.6 billion non-zero values. How can I solve this problem?

fname = '~/imputation/data/10x_mouse_brain_1.3M/1M_neurons_filtered_gene_bc_matrices_h5.h5'
genome = 'mm10'
GeneBCMatrix = get_matrix_from_h5(fname ,genome)
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem, :
the dims contain negative values
Error in H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem, :
the dims contain negative values
Error in if ((lp <- length(p)) < 1 || p[1] != 0 || any((dp <- p[-1] - :
missing value where TRUE/FALSE needed
Calls: get_matrix_from_h5 -> sparseMatrix
In addition: Warning message:
In H5Dread(h5dataset = h5dataset, h5spaceFile = h5spaceFile, h5spaceMem = h5spaceMem, :
NAs produced by integer overflow while converting 64-bit integer or unsigned 32-bit integer from HDF5 to a 32-bit integer in R. Choose bit64conversion='bit64' or bit64conversion='double' to avoid data loss and see the vignette 'rhdf5' for more details about 64-bit integers.
Execution halted
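
For reference, the arithmetic behind the overflow warning can be checked in base R. This is only an illustration of the size limit, under the assumption that the file stores the matrix in compressed sparse column form, so that the last column pointer equals the total number of non-zero values. The figure of 2.6 billion is the approximate count quoted above, not a value read from the file.

```r
# Largest value R's 32-bit integer type can hold (2^31 - 1).
int_max <- .Machine$integer.max

# Approximate non-zero count quoted in this thread, stored as a double.
n_nonzero <- 2.6e9

# The count is larger than any 32-bit integer.
exceeds_int <- n_nonzero > int_max

# Forcing it into an integer yields NA (with a warning, suppressed here),
# which is consistent with the NAs reported by H5Dread.
as_int <- suppressWarnings(as.integer(n_nonzero))

# Doubles represent whole numbers exactly up to 2^53, so the count fits there.
fits_double <- n_nonzero < 2^53
```

Any pointer or dimension value above `int_max` would be affected in the same way, which would explain why the subsequent sparseMatrix call sees missing values.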

@mohuangx
Owner

Hi,

I'm not exactly sure what is causing the error, but the get_matrix_from_h5() function seems to be calling functions from the rhdf5 package. I would take a look at that package to see how to resolve the error.
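
As a possible starting point, here is a rough, untested sketch of reading the column pointers directly with rhdf5, using the `bit64conversion = "double"` option that the warning message suggests. It assumes the file has a `<genome>/indptr` dataset, which is an assumption about the 10x layout and should be checked against the 10x documentation. `read_indptr()` and `plan_cell_chunks()` are hypothetical helper names, and the chunking idea is only one way the data might be split so that each piece stays under the 32-bit limit.

```r
# Sketch only: assumes rhdf5 is installed and that the file contains a
# "<genome>/indptr" dataset (an assumption about the file layout).
read_indptr <- function(h5_path, genome) {
  # Return 64-bit integers as doubles instead of overflowing to NA.
  as.numeric(rhdf5::h5read(h5_path, paste0(genome, "/indptr"),
                           bit64conversion = "double"))
}

# Hypothetical helper: assign consecutive cells (columns) to chunks so that
# no chunk holds more than max_nnz non-zero values.
plan_cell_chunks <- function(indptr, max_nnz = .Machine$integer.max) {
  nnz_per_cell <- diff(indptr)          # non-zero count for each cell
  stopifnot(all(nnz_per_cell <= max_nnz))
  chunk_id <- integer(length(nnz_per_cell))
  current <- 1L                         # chunk currently being filled
  running <- 0                          # non-zeros already in that chunk
  for (j in seq_along(nnz_per_cell)) {
    if (running + nnz_per_cell[j] > max_nnz) {
      current <- current + 1L           # start a new chunk
      running <- 0
    }
    running <- running + nnz_per_cell[j]
    chunk_id[j] <- current
  }
  chunk_id
}

# Toy example with four cells and at most 5 non-zeros per chunk.
toy_chunks <- plan_cell_chunks(c(0, 3, 5, 9, 10), max_nnz = 5)
```

Reading the matching slices of the data and row-index arrays for each chunk, and building one sparse matrix per chunk, is left out here.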

Since this is not directly related to SAVER, I'm going to close the issue.

Best,
Mo

@radio1988
Author

Thanks for your help!

@paupuigdevall

Hi @radio1988. Did you find a workaround for this limitation of the function?

@radio1988
Author

Hello @paupuigdevall. Mo's solution worked for smaller datasets, but it ran into issues with larger datasets two years ago, and I did not find an easy solution at that time.

3 participants