Cannot open example data files with current spatialdata #129

Closed
VolkerH opened this issue Mar 27, 2024 · 10 comments

Comments

@VolkerH

VolkerH commented Mar 27, 2024

Hi,
I tried to load some of the example datasets that are linked to here:
https://spatialdata.scverse.org/en/latest/tutorials/notebooks/datasets/README.html

  • sd.__version__ is 0.1.0, installed via pip as per the instructions.
  • python is 3.11 on Windows (miniconda)
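For anyone trying to reproduce this, a small script like the following can collect the same environment details in one go. It is only a sketch: the function name `environment_report` is made up for illustration, and the default package list (spatialdata, anndata, zarr) is an assumption based on the packages that appear in the tracebacks below.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages=("spatialdata", "anndata", "zarr")):
    """Collect interpreter, OS and package versions for a bug report.

    `environment_report` is a hypothetical helper, not part of spatialdata.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.system(),
    }
    for name in packages:
        try:
            # Version of the installed distribution, as pip sees it.
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the issue makes it easier to compare environments across Windows, macOS and Linux.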

Steinbock dataset

sdata = sd.read_zarr("C:/Users/xxx/Downloads/steinbock_io/data.zarr/")

Traceback:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[20], line 1
----> 1 sdata = sd.read_zarr("C:/Users/xxx/Downloads/steinbock_io/data.zarr/")
      2 print(sdata)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\spatialdata\_io\io_zarr.py:126, in read_zarr(store, selection)
    124 if "tables" in selector and "tables" in f:
    125     group = f["tables"]
--> 126     tables = read_table_and_validate(f_store_path, f, group, tables)
    128 if "table" in selector and "table" in f:
    129     warnings.warn(
    130         f"Table group found in zarr store at location {f_store_path}. Please update the zarr store"
    131         f"to use tables instead.",
    132         DeprecationWarning,
    133         stacklevel=2,
    134     )

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\spatialdata\_io\_utils.py:349, in read_table_and_validate(zarr_store_path, group, subgroup, tables)
    344     tables[table_name] = read_elem(f_elem)
    345     # we can replace read_elem with read_anndata_zarr after this PR gets into a release (>= 0.6.5)
    346     # https://github.com/scverse/anndata/pull/1057#pullrequestreview-1530623183
    347     # table = read_anndata_zarr(f_elem)
    348 else:
--> 349     tables[table_name] = read_anndata_zarr(f_elem_store)
    350 if TableModel.ATTRS_KEY in tables[table_name].uns:
    351     # fill out eventual missing attributes that has been omitted because their value was None
    352     attrs = tables[table_name].uns[TableModel.ATTRS_KEY]

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\zarr.py:87, in read_zarr(store)
     84         return _read_legacy_raw(f, func(elem), read_dataframe, func)
     85     return func(elem)
---> 87 adata = read_dispatched(f, callback=callback)
     89 # Backwards compat (should figure out which version)
     90 if "raw.X" in f:

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\experimental\_dispatch_io.py:48, in read_dispatched(elem, callback)
     44 from anndata._io.specs import _REGISTRY, Reader
     46 reader = Reader(_REGISTRY, callback=callback)
---> 48 return reader.read_elem(elem)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\utils.py:207, in report_read_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    205     raise ValueError("No element found in args.")
    206 try:
--> 207     return func(*args, **kwargs)
    208 except Exception as e:
    209     path, key = _get_display_path(store).rsplit("/", 1)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\specs\registry.py:256, in Reader.read_elem(self, elem, modifiers)
    254 if self.callback is None:
    255     return read_func(elem)
--> 256 return self.callback(read_func, elem.name, elem, iospec=iospec)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\zarr.py:72, in read_zarr.<locals>.callback(func, elem_name, elem, iospec)
     69 def callback(func, elem_name: str, elem, iospec):
     70     if iospec.encoding_type == "anndata" or elem_name.endswith("/"):
     71         return AnnData(
---> 72             **{
     73                 k: read_dispatched(v, callback)
     74                 for k, v in elem.items()
     75                 if not k.startswith("raw.")
     76             }
     77         )
     78     elif elem_name.startswith("/raw."):
     79         return None

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\zarr.py:73, in <dictcomp>(.0)
     69 def callback(func, elem_name: str, elem, iospec):
     70     if iospec.encoding_type == "anndata" or elem_name.endswith("/"):
     71         return AnnData(
     72             **{
---> 73                 k: read_dispatched(v, callback)
     74                 for k, v in elem.items()
     75                 if not k.startswith("raw.")
     76             }
     77         )
     78     elif elem_name.startswith("/raw."):
     79         return None

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\experimental\_dispatch_io.py:48, in read_dispatched(elem, callback)
     44 from anndata._io.specs import _REGISTRY, Reader
     46 reader = Reader(_REGISTRY, callback=callback)
---> 48 return reader.read_elem(elem)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\utils.py:207, in report_read_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    205     raise ValueError("No element found in args.")
    206 try:
--> 207     return func(*args, **kwargs)
    208 except Exception as e:
    209     path, key = _get_display_path(store).rsplit("/", 1)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\specs\registry.py:256, in Reader.read_elem(self, elem, modifiers)
    254 if self.callback is None:
    255     return read_func(elem)
--> 256 return self.callback(read_func, elem.name, elem, iospec=iospec)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\zarr.py:81, in read_zarr.<locals>.callback(func, elem_name, elem, iospec)
     79     return None
     80 elif elem_name in {"/obs", "/var"}:
---> 81     return read_dataframe(elem)
     82 elif elem_name == "/raw":
     83     # Backwards compat
     84     return _read_legacy_raw(f, func(elem), read_dataframe, func)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\utils.py:207, in report_read_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    205     raise ValueError("No element found in args.")
    206 try:
--> 207     return func(*args, **kwargs)
    208 except Exception as e:
    209     path, key = _get_display_path(store).rsplit("/", 1)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\zarr.py:141, in read_dataframe(group)
    139     return read_dataframe_legacy(group)
    140 else:
--> 141     return read_elem(group)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\specs\registry.py:332, in read_elem(elem)
    320 def read_elem(elem: StorageType) -> Any:
    321     """
    322     Read an element from a store.
    323 
   (...)
    330         The stored element.
    331     """
--> 332     return Reader(_REGISTRY).read_elem(elem)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\utils.py:207, in report_read_key_on_error.<locals>.func_wrapper(*args, **kwargs)
    205     raise ValueError("No element found in args.")
    206 try:
--> 207     return func(*args, **kwargs)
    208 except Exception as e:
    209     path, key = _get_display_path(store).rsplit("/", 1)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\specs\registry.py:255, in Reader.read_elem(self, elem, modifiers)
    250 read_func = partial(
    251     self.registry.get_reader(type(elem), iospec, modifiers),
    252     _reader=self,
    253 )
    254 if self.callback is None:
--> 255     return read_func(elem)
    256 return self.callback(read_func, elem.name, elem, iospec=iospec)

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\specs\methods.py:713, in read_dataframe(elem, _reader)
    710 columns = list(_read_attr(elem.attrs, "column-order"))
    711 idx_key = _read_attr(elem.attrs, "_index")
    712 df = pd.DataFrame(
--> 713     {k: _reader.read_elem(elem[k]) for k in columns},
    714     index=_reader.read_elem(elem[idx_key]),
    715     columns=columns if len(columns) else None,
    716 )
    717 if idx_key != "_index":
    718     df.index.name = idx_key

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\anndata\_io\specs\methods.py:713, in <dictcomp>(.0)
    710 columns = list(_read_attr(elem.attrs, "column-order"))
    711 idx_key = _read_attr(elem.attrs, "_index")
    712 df = pd.DataFrame(
--> 713     {k: _reader.read_elem(elem[k]) for k in columns},
    714     index=_reader.read_elem(elem[idx_key]),
    715     columns=columns if len(columns) else None,
    716 )
    717 if idx_key != "_index":
    718     df.index.name = idx_key

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\zarr\hierarchy.py:500, in Group.__getitem__(self, item)
    498         raise KeyError(item)
    499 else:
--> 500     raise KeyError(item)

KeyError: 'Final Concentration / Dilution'
Error raised while reading key 'var' of <class 'zarr.hierarchy.Group'> from /

Results for xenium_rep_1_io

sdata = sd.read_zarr("C:/Users/hilsenstein/Downloads/xenium_rep1_io/data.zarr/")

Traceback:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[22], line 1
----> 1 sdata = sd.read_zarr("C:/Users/xxx/Downloads/xenium_rep1_io/data.zarr/")

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\spatialdata\_io\io_zarr.py:121, in read_zarr(store, selection)
    119     f_elem = group[subgroup_name]
    120     f_elem_store = os.path.join(f_store_path, f_elem.path)
--> 121     shapes[subgroup_name] = _read_shapes(f_elem_store)
    122     count += 1
    123 logger.debug(f"Found {count} elements in {group}")

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\spatialdata\_io\io_shapes.py:35, in _read_shapes(store, fmt)
     33 coords = np.array(f["coords"])
     34 index = np.array(f["Index"])
---> 35 typ = fmt.attrs_from_dict(f.attrs.asdict())
     36 if typ.name == "POINT":
     37     radius = np.array(f["radius"])

File ~\AppData\Local\miniconda3\envs\spatialdata\Lib\site-packages\spatialdata\_io\format.py:108, in ShapesFormatV01.attrs_from_dict(self, metadata)
    106 def attrs_from_dict(self, metadata: dict[str, Any]) -> GeometryType:
    107     if Shapes_s.ATTRS_KEY not in metadata:
--> 108         raise KeyError(f"Missing key {Shapes_s.ATTRS_KEY} in shapes metadata.")
    109     metadata_ = metadata[Shapes_s.ATTRS_KEY]
    110     if Shapes_s.GEOS_KEY not in metadata_:

KeyError: 'Missing key spatialdata_attrs in shapes metadata.'
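The second error says that the `spatialdata_attrs` key is absent from the metadata of a shapes element. One way to check this without going through spatialdata is to look at the attribute files on disk. The sketch below rests on two assumptions: that the store is a Zarr v2 directory store, where group attributes live in a `.zattrs` JSON file, and that shapes elements sit in a `shapes` subgroup of the store. The helper name `groups_missing_attr` is hypothetical.

```python
import json
from pathlib import Path


def groups_missing_attr(store, group="shapes", key="spatialdata_attrs"):
    """Return names of sub-elements of `group` whose .zattrs lacks `key`.

    Assumes a Zarr v2 directory store (attributes stored in `.zattrs`).
    A sub-element without any `.zattrs` file is also reported as missing.
    """
    missing = []
    group_dir = Path(store) / group
    if not group_dir.is_dir():
        return missing
    for elem_dir in sorted(p for p in group_dir.iterdir() if p.is_dir()):
        attrs_file = elem_dir / ".zattrs"
        if attrs_file.is_file():
            attrs = json.loads(attrs_file.read_text(encoding="utf-8"))
        else:
            attrs = {}
        if key not in attrs:
            missing.append(elem_dir.name)
    return missing
```

Calling `groups_missing_attr("C:/Users/xxx/Downloads/xenium_rep1_io/data.zarr")` would list the affected elements, if any. An empty `.zattrs` or a missing one could point to an incomplete download or extraction, but that is only a guess and is not confirmed anywhere in this report.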
@LucaMarconato
Member

Hi @VolkerH, thanks for reporting this. We have an automated testing system (Ubuntu) that checks that the example datasets can be written and read, and last night's run did not raise any errors. I have also just tried and could not reproduce it on my macOS machine. My guess is that the error is due to the / character and that it happens only on Windows machines.

@melonora could you please try this on your Windows machine?
@VolkerH do the other datasets work? AFAIK, only the Steinbock dataset uses the / character for one of the names.

@VolkerH
Author

VolkerH commented Mar 27, 2024

As mentioned, I installed spatialdata from pip. Should I try with the latest version in the Github repo?

Note: I haven't tried all the other datasets, as I was downloading via slow DSL. Will try and download a few more ...

@melonora
Collaborator

I can confirm with python 3.11:
image

@melonora
Collaborator

same when tried with python 3.10.

@melonora
Collaborator

perhaps related: scverse/spatialdata#516

@LucaMarconato
Member

LucaMarconato commented Mar 27, 2024

> As mentioned, I installed spatialdata from pip. Should I try with the latest version in the Github repo?

I don't think this would help, unfortunately, as I believe it's really a Windows-specific behavior. @melonora I don't think it's related to scverse/spatialdata#516.

steinbock

Looking more into this, I see that on macOS too the Zarr substore for the anndata var column called 'Final Concentration / Dilution' is actually written to disk in a subdirectory; the difference is that there it can also be read correctly.

image
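To illustrate why a column name containing / ends up in a subdirectory, here is a minimal sketch using only `pathlib`. It does not reproduce what zarr or anndata do internally; it just shows how a name with a slash is split into path components when it is joined onto a parent path, under both POSIX and Windows path rules.

```python
from pathlib import PurePosixPath, PureWindowsPath

key = "Final Concentration / Dilution"

# Joining the column name onto a parent path splits it at the slash,
# so one logical name becomes two nested path components.
posix_parts = (PurePosixPath("var") / key).parts
windows_parts = (PureWindowsPath("var") / key).parts

print(posix_parts)
print(windows_parts)
```

Both flavours yield three components, and the middle one ('Final Concentration ') ends with a space while the last one (' Dilution') starts with one. Whether those surrounding spaces play any role in the Windows failure has not been established in this thread.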

So to sum up:

  • for this specific bug, it seems to be an edge case of anndata appearing only on Windows; I reported it here:
  • in general, we should address the handling of / in spatialdata
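As a stopgap on the user side, names containing / could be detected and renamed before writing. The following is only a sketch of that idea, not how spatialdata or anndata handle it: the helper names `find_slash_names` and `sanitize_names` and the choice of `_` as replacement are assumptions made for illustration.

```python
def find_slash_names(names):
    """Return the names that contain a forward slash."""
    return [name for name in names if "/" in name]


def sanitize_names(names, replacement="_"):
    """Map each name to a slash-free version.

    Raises ValueError if two names would collide after replacement,
    since silently merging columns would lose data.
    """
    mapping = {}
    seen = set()
    for name in names:
        new_name = name.replace("/", replacement)
        if new_name in seen:
            raise ValueError(f"Renaming {name!r} collides with {new_name!r}")
        seen.add(new_name)
        mapping[name] = new_name
    return mapping
```

With an AnnData object, the mapping could be applied with something like `adata.var = adata.var.rename(columns=sanitize_names(adata.var.columns))` before the table is written.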

xenium

The xenium_rep_1_io dataset is heavily tested on macOS and Linux, so I believe that this is also an error due to Windows. I will add a warning to the readme that Windows support is currently not tested, and we should try to prioritize adding a Windows CI: scverse/spatialdata#413. The problem was that none of the core devs used Windows until @melonora joined the team, so we postponed this.

@melonora
Collaborator

melonora commented Mar 27, 2024

I tried reproducing the error for xenium_rep1_io using a fresh Python 3.11 environment and installing spatialdata via pip, but this works correctly for me on my Windows machine.

This was also using data that I freshly downloaded.

@melonora
Collaborator

I think this can be closed unless you still have the problem. I can't reproduce on my side anymore. Feel free to reopen if the issue persists.

@VolkerH
Author

VolkerH commented Apr 24, 2024

Thanks for looking into it. Will try again.

@melonora
Collaborator

Thanks! let us know
