Illegal slicing argument for scalar dataspace when attempting to read 10x_h5 with version 1.9.0 #2203

Closed
nadavyayon opened this issue Apr 2, 2022 · 5 comments
Labels
Area - IO Reading and writing Bug 🐛 Needs info❔ More information needed

Comments

@nadavyayon

nadavyayon commented Apr 2, 2022

Hi

When attempting to simply read an h5 file with:

Python version - 3.8.8
# results_file = path to 10X h5 file 
# adata = sc.read_10x_h5(results_file)

I get the following error, which goes away when I roll back to scanpy=1.8.2:

ValueError                                Traceback (most recent call last)
<ipython-input-3-8ddd0a13aab2> in <module>
      8     print(results_file)
----> 9     adata = sc.read_10x_h5(results_file)
     10     adata.var_names_make_unique()
     11     adata.obs.index = meta.iloc[idx,2] + '-' + adata.obs.index

/opt/conda/lib/python3.8/site-packages/scanpy/readwrite.py in read_10x_h5(filename, genome, gex_only, backup_url)
    181         v3 = '/matrix' in f
    182     if v3:
--> 183         adata = _read_v3_10x_h5(filename, start=start)
    184         if genome:
    185             if genome not in adata.var['genome'].values:

/opt/conda/lib/python3.8/site-packages/scanpy/readwrite.py in _read_v3_10x_h5(filename, start)
    266         try:
    267             dsets = {}
--> 268             _collect_datasets(dsets, f["matrix"])
    269 
    270             from scipy.sparse import csr_matrix

/opt/conda/lib/python3.8/site-packages/scanpy/readwrite.py in _collect_datasets(dsets, group)
    254     for k, v in group.items():
    255         if isinstance(v, h5py.Dataset):
--> 256             dsets[k] = v[:]
    257         else:
    258             _collect_datasets(dsets, v)

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

/opt/conda/lib/python3.8/site-packages/h5py/_hl/dataset.py in __getitem__(self, args, new_dtype)
    767         if self.shape == ():
    768             fspace = self.id.get_space()
--> 769             selection = sel2.select_read(fspace, args)
    770             if selection.mshape is None:
    771                 arr = numpy.ndarray((), dtype=new_dtype)

/opt/conda/lib/python3.8/site-packages/h5py/_hl/selections2.py in select_read(fspace, args)
     99     """
    100     if fspace.shape == ():
--> 101         return ScalarReadSelection(fspace, args)
    102 
    103     raise NotImplementedError()

/opt/conda/lib/python3.8/site-packages/h5py/_hl/selections2.py in __init__(self, fspace, args)
     84             self.mshape = ()
     85         else:
---> 86             raise ValueError("Illegal slicing argument for scalar dataspace")
     87 
     88         self.mspace = h5s.create(h5s.SCALAR)

ValueError: Illegal slicing argument for scalar dataspace
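The last frames of this traceback can probably be reproduced with h5py alone, without scanpy or a 10X file. The following is a minimal sketch, assuming an in-memory HDF5 file; the file name and the dataset names `counts` and `version` are made up for illustration and do not come from any real 10X or CellBender output.

```python
import h5py
import numpy as np

# In-memory HDF5 file: nothing is written to disk.
# "scalar_demo.h5", "counts" and "version" are hypothetical names.
with h5py.File("scalar_demo.h5", "w", driver="core", backing_store=False) as f:
    f["counts"] = np.arange(3)  # 1-D dataset, shape (3,)
    f["version"] = 2            # scalar dataset, shape ()

    # Slicing with [:] works for array datasets.
    array_value = f["counts"][:]

    # The same slice on a scalar dataspace raises ValueError.
    try:
        f["version"][:]
        error_message = None
    except ValueError as err:
        error_message = str(err)

    # Indexing with an empty tuple reads a scalar dataset.
    scalar_value = f["version"][()]

print(error_message)
print(scalar_value)
```

So the failing `v[:]` in `_collect_datasets` would be hit by any file that stores a scalar dataset under `/matrix`.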

Thanks!!

Nadav

@ivirshup
Member

ivirshup commented Apr 4, 2022

Can you share the output of sc.logging.print_versions() in the environment that's causing you problems?

I'm unable to reproduce with recent cellranger outputs.
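Since the traceback points at a scalar dataspace, one way to check whether a particular file is affected might be to list its scalar datasets. This is only a sketch: `find_scalar_datasets` is a hypothetical helper (not part of scanpy or h5py), and the demo file and its dataset names are invented so the snippet runs on its own.

```python
import h5py
import numpy as np


def find_scalar_datasets(h5_file):
    """Return the sorted paths of all datasets whose shape is ()."""
    scalar_paths = []

    def visitor(name, obj):
        # visititems passes each group and dataset with its path relative to the root.
        if isinstance(obj, h5py.Dataset) and obj.shape == ():
            scalar_paths.append(name)

    h5_file.visititems(visitor)
    return sorted(scalar_paths)


# Hypothetical in-memory file standing in for a real .h5 output.
with h5py.File("inspect_demo.h5", "w", driver="core", backing_store=False) as f:
    matrix = f.create_group("matrix")
    matrix["data"] = np.array([1, 2, 3])
    matrix["scalar_field"] = 0.5  # made-up scalar entry
    scalar_paths = find_scalar_datasets(f)

print(scalar_paths)
```

For a real file, the same helper could be called on `h5py.File(results_file, "r")`; an empty list would suggest the error has a different cause.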

@ivirshup ivirshup added Bug 🐛 Needs info❔ More information needed Area - IO Reading and writing labels Apr 4, 2022
@nadavyayon
Author

nadavyayon commented Apr 8, 2022

Hey, sorry for the delay:

-----
anndata     0.7.5
scanpy      1.9.0
-----
PIL                 8.1.2
anyio               NA
attr                20.3.0
babel               2.9.0
backcall            0.2.0
brotli              NA
cairo               1.20.0
certifi             2020.12.05
cffi                1.14.5
chardet             4.0.0
cloudpickle         1.6.0
colorama            0.4.4
cycler              0.10.0
cython_runtime      NA
cytoolz             0.11.0
dask                2021.03.1
dateutil            2.8.1
decorator           4.4.2
fsspec              0.8.7
google              NA
h5py                3.1.0
idna                2.10
igraph              0.8.3
ipykernel           5.5.0
ipython_genutils    0.2.0
jedi                0.18.0
jinja2              2.11.3
joblib              1.0.1
json5               NA
jsonschema          3.2.0
jupyter_server      1.4.1
jupyterlab_server   2.3.0
kiwisolver          1.3.1
leidenalg           0.8.3
llvmlite            0.34.0
louvain             0.7.0
markupsafe          1.1.1
matplotlib          3.3.4
mpl_toolkits        NA
natsort             7.1.1
nbclassic           NA
nbformat            5.1.2
numba               0.51.2
numpy               1.20.1
packaging           20.9
pandas              1.2.3
parso               0.8.1
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
prometheus_client   NA
prompt_toolkit      3.0.16
psutil              5.8.0
ptyprocess          0.7.0
pvectorc            NA
pyarrow             0.16.0
pygments            2.8.0
pyparsing           2.4.7
pyrsistent          NA
pytoml              NA
pytz                2021.1
requests            2.25.1
ruamel              NA
scipy               1.6.1
send2trash          NA
session_info        1.0.0
setuptools_scm      NA
six                 1.15.0
sklearn             0.24.1
sniffio             1.2.0
socks               1.7.1
sphinxcontrib       NA
storemagic          NA
tblib               1.7.0
texttable           1.6.3
tlz                 0.11.0
toolz               0.11.1
tornado             6.1
traitlets           5.0.5
typing_extensions   NA
urllib3             1.26.3
wcwidth             0.2.5
yaml                5.3.1
zmq                 22.0.3
-----
IPython             7.21.0
jupyter_client      6.1.11
jupyter_core        4.7.1
jupyterlab          3.0.9
notebook            6.2.0
-----
Python 3.8.8 | packaged by conda-forge | (default, Feb 20 2021, 16:22:27) [GCC 9.3.0]
Linux-4.15.0-112-generic-x86_64-with-glibc2.10
-----
Session information updated at 2022-04-08 14:58

@beetlejuice007

beetlejuice007 commented May 17, 2022

I got a similar error when trying to read an .h5 file from CellBender output. I have multiome data.

>>> adata = scanpy.read_10x_h5("/sc/arion/projects/hmDNAmap/snHeroin/analysis/ARC_TD005235-354/outs/cellbender/cb_feature_bc_matrix_filtered.h5", gex_only=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/scanpy/readwrite.py", line 183, in read_10x_h5
    adata = _read_v3_10x_h5(filename, start=start)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/scanpy/readwrite.py", line 268, in _read_v3_10x_h5
    _collect_datasets(dsets, f["matrix"])
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/scanpy/readwrite.py", line 256, in _collect_datasets
    dsets[k] = v[:]
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/h5py/_hl/dataset.py", line 738, in __getitem__
    selection = sel2.select_read(fspace, args)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/h5py/_hl/selections2.py", line 101, in select_read
    return ScalarReadSelection(fspace, args)
  File "/sc/arion/work/gujarh01/software/anaconda3/lib/python3.9/site-packages/h5py/_hl/selections2.py", line 86, in __init__
    raise ValueError("Illegal slicing argument for scalar dataspace")

ValueError: Illegal slicing argument for scalar dataspace

>>> scanpy.logging.print_versions()

anndata 0.8.0
scanpy 1.9.1

PIL 8.4.0
beta_ufunc NA
binom_ufunc NA
bottleneck 1.3.2
cffi 1.14.6
cloudpickle 2.0.0
colorama 0.4.4
concurrent NA
cycler 0.10.0
cython_runtime NA
cytoolz 0.11.0
dask 2021.10.0
dateutil 2.8.2
defusedxml 0.7.1
encodings NA
fsspec 2021.08.1
genericpath NA
h5py 3.3.0
igraph 0.9.6
jinja2 2.11.3
joblib 1.1.0
kiwisolver 1.3.1
leidenalg 0.8.7
llvmlite 0.37.0
markupsafe 1.1.1
matplotlib 3.4.3
mkl 2.4.0
mpl_toolkits NA
natsort 7.1.1
nbinom_ufunc NA
ntpath NA
numba 0.54.1
numexpr 2.7.3
numpy 1.20.3
opcode NA
packaging 21.0
pandas 1.3.4
pkg_resources NA
posixpath NA
psutil 5.8.0
pyexpat NA
pyparsing 3.0.4
pytz 2021.3
scipy 1.7.1
scrublet NA
session_info 1.0.0
six 1.16.0
sklearn 0.24.2
sphinxcontrib NA
sre_compile NA
sre_constants NA
sre_parse NA
tblib 1.7.0
texttable 1.6.4
tlz 0.11.0
toolz 0.11.1
typing_extensions NA
wcwidth 0.2.5
yaml 6.0
zope NA

Python 3.9.7 (default, Sep 16 2021, 13:09:58) [GCC 7.5.0]
Linux-3.10.0-957.10.1.el7.x86_64-x86_64-with-glibc2.17

Session information updated at 2022-05-17 14:56

@erankotler

erankotler commented May 24, 2022

I'm getting the same error using the CellBender tutorial output. Attaching the file to make it easier to reproduce.

tiny_10x_pbmc_filtered.h5.zip

sc.logging.print_versions()

-----
anndata     0.7.8
scanpy      1.9.1
-----
PIL                 9.0.1
asttokens           NA
backcall            0.2.0
beta_ufunc          NA
binom_ufunc         NA
cffi                1.15.0
cycler              0.10.0
cython_runtime      NA
dateutil            2.8.2
debugpy             1.6.0
decorator           5.1.1
defusedxml          0.7.1
doubletdetection    4.2
entrypoints         0.4
executing           0.8.3
google              NA
h5py                3.6.0
hypergeom_ufunc     NA
igraph              0.9.9
ipykernel           6.10.0
ipython_genutils    0.2.0
ipywidgets          7.7.0
jedi                0.18.1
joblib              1.1.0
kiwisolver          1.4.2
leidenalg           0.8.9
llvmlite            0.38.0
louvain             0.7.1
matplotlib          3.5.1
matplotlib_inline   NA
mkl                 2.4.0
mpl_toolkits        NA
mudata              0.1.1
muon                0.1.2
natsort             8.1.0
nbinom_ufunc        NA
numba               0.55.1
numexpr             2.8.1
numpy               1.21.2
organize_metadata   NA
packaging           21.3
pandas              1.4.1
parso               0.8.3
pexpect             4.8.0
phenograph          1.5.7
pickleshare         0.7.5
pkg_resources       NA
prompt_toolkit      3.0.28
psutil              5.9.0
ptyprocess          0.7.0
pure_eval           0.2.2
pycparser           2.21
pydev_ipython       NA
pydevconsole        NA
pydevd              2.8.0
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.11.2
pynndescent         0.5.6
pyparsing           3.0.7
pytz                2022.1
scikits             NA
scipy               1.8.0
seaborn             0.11.2
session_info        1.0.0
setuptools          62.0.0
setuptools_scm      NA
six                 1.16.0
sklearn             1.0.2
stack_data          0.2.0
statsmodels         0.13.2
tables              3.7.0
texttable           1.6.4
threadpoolctl       3.1.0
tornado             6.1
tqdm                4.63.1
traitlets           5.1.1
typing_extensions   NA
umap                0.5.2
wcwidth             0.2.5
yaml                6.0
zipp                NA
zmq                 22.3.0
-----
IPython             8.2.0
jupyter_client      7.1.2
jupyter_core        4.9.2
notebook            6.4.10
-----
Python 3.9.11 (main, Mar 28 2022, 10:10:35) [GCC 7.5.0]
Linux-4.15.0-142-generic-x86_64-with-glibc2.27
-----
Session information updated at 2022-05-24 15:05

@chris-rands
Contributor

chris-rands commented Oct 10, 2022

Was this fixed by #2344? Edit: Yes.

gokceneraslan added a commit that referenced this issue Oct 10, 2022
* Handle scalar datasets too

After @ivirshup's pytables PR (#2064), we started having issues loading h5 files with scalar datasets, such as those created by CellBender (broadinstitute/CellBender#128). This is not currently an issue for 10X h5 files, since they don't contain any scalars. However, it would be good to handle scalars as well as arrays: (1) to fix the CellBender file loading problem, and (2) to avoid potential problems if the 10X h5 format ever includes scalar datasets.

* Add a scalar to the multiple_genomes.h5 test file

* Fixes #2203
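
The idea in the commit message (read scalars as well as arrays) can be sketched roughly as below. This is an illustration only and not necessarily the exact change made in #2344: it assumes the recursive structure of `_collect_datasets` shown in the traceback and swaps `v[:]` for `v[()]`, which h5py accepts for both scalar and array datasets. The demo file, `scalar_field`, and the other dataset names are hypothetical.

```python
import h5py
import numpy as np


def collect_datasets(dsets, group):
    """Recursively read every dataset under `group` into the dict `dsets`."""
    for key, value in group.items():
        if isinstance(value, h5py.Dataset):
            # [()] reads the full contents of scalar and array datasets alike,
            # whereas [:] raises ValueError on a scalar dataspace.
            dsets[key] = value[()]
        else:
            collect_datasets(dsets, value)


# Hypothetical in-memory file loosely mimicking a /matrix group.
with h5py.File("collect_demo.h5", "w", driver="core", backing_store=False) as f:
    matrix = f.create_group("matrix")
    matrix["data"] = np.array([1, 2, 3])
    matrix["shape"] = np.array([2, 2])
    matrix["scalar_field"] = 0.5  # made-up scalar entry
    features = matrix.create_group("features")
    features["id"] = np.array([b"g1", b"g2"])

    collected = {}
    collect_datasets(collected, f["matrix"])

print(sorted(collected))
```

Array datasets still come back as NumPy arrays, and scalar datasets come back as NumPy scalars, so downstream code that only uses the array entries should be unaffected.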