HDF5 file output is not compatible with cellranger aggr v3 #36

Closed
whelena opened this issue Oct 10, 2019 · 9 comments

Comments

@whelena

whelena commented Oct 10, 2019

Hi, I am trying out CellBender on my dataset and ran into a problem when aggregating the output files (.h5). I am wondering whether they are incompatible with Cell Ranger v3 in general or with cellranger aggr specifically.

Here's the error message that came out:
The molecule info HDF5 file (/mnt/data/20190923_R317_Run2_CRV3/R317_r2_Basal.h5) was produced by an older version of Cell Ranger. Reading these files is unsupported.

@davidlieb

I am also about to test CellBender remove-background on a couple of 10X channels that I will want to aggregate, so I'm interested in this issue too. Thanks.

@whelena
Author

whelena commented Nov 5, 2019

I tried running CellBender after cellranger aggr, and it works fine.

@sjfleming
Member

I am not sure if I totally understand, but I would be a bit cautious here... CellBender is really meant to be run on a per-sample level. If you are aggregating reads from different flow cell runs of the exact same library prep, you might be okay. But if you are aggregating anything else, I would urge you to run CellBender on the separate datasets first, and then aggregate them downstream.

@sjfleming
Member

@whelena Was your first post about trying to run CellBender on the .h5 output of cellranger aggr (this should run okay), or were you trying to run cellranger aggr on the outputs of CellBender (this has not been tested, but there are other ways to aggregate data from two experiments downstream, using scanpy or Seurat for instance)?

@whelena
Author

whelena commented Nov 6, 2019

@sjfleming I was aggregating reads from the same library prep on different flow cells (I ran CellBender on the .h5 output of cellranger aggr).

My original post was about using cellranger aggr on the outputs of CellBender. I ended up integrating them in Seurat.

Overall, I found that running CellBender on the cellranger aggr output, and running CellBender on the individual datasets and then integrating in Seurat, lead to almost identical output.

@gauravgadhvi

Hi,
I am wondering if this feature has been added to newer releases of CellBender? Can I aggregate multiple outputs of CellBender (raw.h5) using cellranger aggr?

Thank you for your help.

@sjfleming
Member

Hi @gauravgadhvi , sorry for the delayed response. I do not envision adding this functionality, because I do not think we would be able to. As I understand it, cellranger aggr now runs on the molecule_info.h5 files. These contain read-level information (as opposed to count matrix information collapsed to the level of UMIs). The count matrix (which is the input and output of CellBender) has less information than the molecule_info.h5 files which are needed for cellranger aggr to run.
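To make the information gap concrete, here is a toy sketch of collapsing molecule-level records into a count matrix. The record layout (barcode, gene, UMI, reads) and the function name `to_count_matrix` are made up for illustration and are not the real molecule_info.h5 schema. As far as I know, the default depth normalization in cellranger aggr subsamples reads, which is why it needs the per-molecule read counts that a count matrix no longer carries.

```python
from collections import Counter

# Toy molecule-level records: (barcode, gene, UMI, number of reads).
# Illustrative layout only, not the actual molecule_info.h5 schema.
molecules = [
    ("AAAC", "GeneA", "UMI1", 5),
    ("AAAC", "GeneA", "UMI2", 1),
    ("AAAC", "GeneB", "UMI3", 2),
    ("AAAG", "GeneA", "UMI4", 7),
]


def to_count_matrix(records):
    """Collapse distinct molecules to UMI counts per (barcode, gene)."""
    counts = Counter()
    for barcode, gene, _umi, _reads in records:
        # Each distinct molecule contributes exactly one count;
        # how many reads supported it is discarded here.
        counts[(barcode, gene)] += 1
    return counts


counts = to_count_matrix(molecules)
total_reads = sum(reads for *_, reads in molecules)

print(dict(counts))
print("reads known at molecule level:", total_reads)
```

The resulting `counts` cannot be turned back into `molecules`: the read counts (and the UMI sequences) are gone, so a tool that subsamples reads has nothing to work with.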

I would recommend integrating separate samples, which have separately been run through CellBender, using scanpy or Seurat or whatever downstream single cell analysis tools you prefer. I personally never have a use case for cellranger aggr.

@gvassart

gvassart commented Mar 4, 2023

I am trying CellBender on data from Cell Ranger 7.1.0 and I get the following error message (even after checking that h5py and hdf5 are present and compatible):
(CellBender) C:\Users\vassa>cellbender remove-background --input "D:\Documents\Resultats\LGR\Maryam\scRNASeq\Second-bis\out-230223_cr-mm_MMA2-230110-2-pp\outs" --output "C:\Users\vassa\Downloads\raw_feature_bc_matrix.h5" --expected-cells 5000 --total-droplets-included 20000 --fpr 0.01 --epochs 5
Traceback (most recent call last):
  File "C:\Users\vassa\miniconda3\envs\CellBender\Scripts\cellbender-script.py", line 33, in <module>
    sys.exit(load_entry_point('cellbender', 'console_scripts', 'cellbender')())
  File "c:\users\vassa\cellbender\cellbender\base_cli.py", line 91, in main
    cli_dict = generate_cli_dictionary()
  File "c:\users\vassa\cellbender\cellbender\base_cli.py", line 52, in generate_cli_dictionary
    module_cli = importlib.import_module('.'.join(module_cli_str_list))
  File "C:\Users\vassa\miniconda3\envs\CellBender\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "c:\users\vassa\cellbender\cellbender\remove_background\cli.py", line 4, in <module>
    from cellbender.remove_background.data.dataset import SingleCellRNACountsDataset
  File "c:\users\vassa\cellbender\cellbender\remove_background\data\dataset.py", line 10, in <module>
    import anndata
  File "C:\Users\vassa\miniconda3\envs\CellBender\lib\site-packages\anndata\__init__.py", line 7, in <module>
    from ._core.anndata import AnnData
  File "C:\Users\vassa\miniconda3\envs\CellBender\lib\site-packages\anndata\_core\anndata.py", line 17, in <module>
    import h5py
  File "C:\Users\vassa\miniconda3\envs\CellBender\lib\site-packages\h5py\__init__.py", line 33, in <module>
    from . import version
  File "C:\Users\vassa\miniconda3\envs\CellBender\lib\site-packages\h5py\version.py", line 15, in <module>
    from . import h5 as _h5
  File "h5py\h5.pyx", line 1, in init h5py.h5
ImportError: DLL load failed: the procedure specified is not available.
Thanks for helping.

@sjfleming
Member

Hi @gvassart , based on the traceback, I gather you're running on a Windows machine.

It looks to me like a problem with your installation of anndata: the traceback runs through the line import anndata in cellbender/remove_background/data/dataset.py

Ultimately, though, the failure is inside h5py, which anndata imports, so the problem points at your h5py installation not loading correctly.

I would recommend trying this in your CellBender conda environment:

conda install -c conda-forge anndata
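
Before or after reinstalling, it can help to confirm which import actually fails, independently of CellBender. The following is a small sketch using only the standard library; the helper name `check_imports` is hypothetical, and it should be run with the Python interpreter of the CellBender conda environment.

```python
import importlib


def check_imports(names):
    """Try importing each module; map name -> None on success, else the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:
            # A failed DLL load surfaces as ImportError; stay broad so
            # any other import-time failure is reported as well.
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # h5py first: anndata imports it, so an h5py failure explains both.
    for name, err in check_imports(["h5py", "anndata"]).items():
        print(f"{name}: {'OK' if err is None else err}")
```

If h5py fails on its own here, the issue is in the h5py/HDF5 installation rather than in CellBender or anndata.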
