Skip to content

Transfering annotations from AnnData to Spatial Data (overlap of annotations with HE image) and error saving cropped SpatialData object #1094

@Rafael-Silva-Oliveira

Description

@Rafael-Silva-Oliveira

Hello,

I am trying to pass annotations from a AnnData file (filtered):

AnnData object with n_obs × n_vars = 12218 × 17731
    obs: 'region', 'slide', 'cell_id', 'area', 'n_counts', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'total_counts_mt', 'log1p_total_counts_mt', 'pct_counts_mt', 'novae_sid', 'neighborhood_valid', 'novae_leaves', 'novae_domains_5', 'novae_domains_6', 'novae_domains_7', 'novae_domains_8', 'novae_domains_9', 'novae_domains_10', 'novae_domains_11', 'novae_domains_12', 'novae_domains_13', 'novae_domains_14', 'novae_domains_15', 'novae_domains_16', 'novae_domains_17', 'novae_domains_18', 'novae_domains_19', 'novae_domains_20', 'novae_domains_21', 'novae_domains_22', 'novae_domains_23', 'novae_domains_24', 'novae_domains_25', 'novae_domains_26', 'novae_domains_27', 'novae_domains_28', 'novae_domains_29', 'novae_domains_30', 'novae_domains_31', 'novae_domains_32', 'novae_domains_33', 'novae_domains_34', 'novae_domains_35', 'novae_domains_36', 'novae_domains_37', 'novae_domains_38', 'novae_domains_39', 'novae_domains_40', 'Tumor_score', 'Fibroblasts_score', 'Macrophages_score', 'Neutrophils_score', 'Goblet_cells_score', 'cell_type_enrichmap', 'cell_type_domain', 'instance_id'
    var: 'gene_ids', 'feature_types', 'genome', 'n_counts', 'mt', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'novae_use_gene', 'in_vocabulary'
    uns: 'cell_type_domain_colors', 'gene_contributions', 'hvg', 'log1p', 'novae_attrs', 'novae_domains_10_colors', 'novae_domains_11_colors', 'novae_domains_12_colors', 'novae_domains_13_colors', 'novae_domains_14_colors', 'novae_domains_15_colors', 'novae_domains_16_colors', 'novae_domains_17_colors', 'novae_domains_18_colors', 'novae_domains_19_colors', 'novae_domains_20_colors', 'novae_domains_21_colors', 'novae_domains_22_colors', 'novae_domains_23_colors', 'novae_domains_24_colors', 'novae_domains_25_colors', 'novae_domains_26_colors', 'novae_domains_27_colors', 'novae_domains_28_colors', 'novae_domains_29_colors', 'novae_domains_30_colors', 'novae_domains_31_colors', 'novae_domains_32_colors', 'novae_domains_33_colors', 'novae_domains_34_colors', 'novae_domains_35_colors', 'novae_domains_36_colors', 'novae_domains_37_colors', 'novae_domains_38_colors', 'novae_domains_39_colors', 'novae_domains_40_colors', 'novae_domains_5_colors', 'novae_domains_6_colors', 'novae_domains_7_colors', 'novae_domains_8_colors', 'novae_domains_9_colors', 'pca', 'sopa_attrs', 'spatial_neighbors', 'spatialdata_attrs'
    obsm: 'X_pca', 'bins_assignments', 'novae_latent', 'novae_latent_10', 'novae_latent_11', 'novae_latent_12', 'novae_latent_13', 'novae_latent_14', 'novae_latent_15', 'novae_latent_16', 'novae_latent_17', 'novae_latent_18', 'novae_latent_19', 'novae_latent_20', 'novae_latent_21', 'novae_latent_22', 'novae_latent_23', 'novae_latent_24', 'novae_latent_25', 'novae_latent_26', 'novae_latent_27', 'novae_latent_28', 'novae_latent_29', 'novae_latent_30', 'novae_latent_31', 'novae_latent_32', 'novae_latent_33', 'novae_latent_34', 'novae_latent_35', 'novae_latent_36', 'novae_latent_37', 'novae_latent_38', 'novae_latent_39', 'novae_latent_40', 'novae_latent_5', 'novae_latent_6', 'novae_latent_7', 'novae_latent_8', 'novae_latent_9', 'spatial'
    varm: 'PCs'
    layers: 'counts', 'lognorm_counts'
    obsp: 'spatial_connectivities', 'spatial_distances', 'spatial_distances_local', 'spatial_distances_view'


Which come from this segmented sdata using stardist and proseg:

SpatialData object
├── Images
│     ├── 'Visium_HD_Human_Colon_Cancer_full_image': DataTree[cyx] (3, 5000, 5000), (3, 2500, 2500), (3, 1250, 1250), (3, 625, 625), (3, 313, 313)
│     ├── 'Visium_HD_Human_Colon_Cancer_hires_image': DataArray[cyx] (3, 398, 399)
│     └── 'Visium_HD_Human_Colon_Cancer_lowres_image': DataArray[cyx] (3, 40, 40)
├── Shapes
│     ├── 'Visium_HD_Human_Colon_Cancer_square_002um': GeoDataFrame shape: (470416, 1) (2D shapes)
│     ├── 'Visium_HD_Human_Colon_Cancer_square_008um': GeoDataFrame shape: (29659, 1) (2D shapes)
│     ├── 'Visium_HD_Human_Colon_Cancer_square_016um': GeoDataFrame shape: (7500, 1) (2D shapes)
│     ├── 'image_patches': GeoDataFrame shape: (9, 3) (2D shapes)
│     ├── 'proseg_boundaries': GeoDataFrame shape: (12394, 2) (2D shapes)
│     ├── 'region_of_interest': GeoDataFrame shape: (1, 1) (2D shapes)
│     └── 'stardist_boundaries': GeoDataFrame shape: (12419, 1) (2D shapes)
└── Tables
      ├── 'square_002um': AnnData (470416, 18085)
      ├── 'square_008um': AnnData (29659, 18085)
      ├── 'square_016um': AnnData (7500, 18085)
      └── 'table': AnnData (12394, 18085)
with coordinate systems:
    ▸ 'Visium_HD_Human_Colon_Cancer', with elements:
        Visium_HD_Human_Colon_Cancer_full_image (Images), Visium_HD_Human_Colon_Cancer_hires_image (Images), Visium_HD_Human_Colon_Cancer_lowres_image (Images), Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), image_patches (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes), stardist_boundaries (Shapes)
    ▸ 'Visium_HD_Human_Colon_Cancer_downscaled_hires', with elements:
        Visium_HD_Human_Colon_Cancer_hires_image (Images), Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes)
    ▸ 'Visium_HD_Human_Colon_Cancer_downscaled_lowres', with elements:
        Visium_HD_Human_Colon_Cancer_lowres_image (Images), Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes)
    ▸ 'microns', with elements:
        Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes)


This SD object has been cropped the following way first (before the segmentation):

sdata_sub = sd.bounding_box_query(
    sdata,
    min_coordinate=[51000, 9000],
    max_coordinate=[56000, 14000],
    axes=("x", "y"),
    target_coordinate_system=sample_id,
    filter_table=True,
)

However, I can't save this segmented SpatialData that has been cropped:

sdata_sub.write("test.zarr")  # save it
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[16], line 1
----> 1 sdata_sub.write("test.zarr")  # save it

File c:\venv\segmentation\Lib\site-packages\spatialdata\_utils.py:263, in _deprecation_alias.<locals>.deprecation_decorator.<locals>.wrapper(*args, **kwargs)
    261     raise ValueError("version for deprecation must be specified")
    262 rename_kwargs(f.__name__, kwargs, alias_copy, class_name, library, version)
--> 263 return f(*args, **kwargs)

File c:\venv\segmentation\Lib\site-packages\spatialdata\_core\spatialdata.py:1176, in SpatialData.write(self, file_path, overwrite, consolidate_metadata, update_sdata_path, sdata_formats, shapes_geometry_encoding)
   1173 store.close()
   1175 for element_type, element_name, element in self.gen_elements():
-> 1176     self._write_element(
   1177         element=element,
   1178         zarr_container_path=file_path,
   1179         element_type=element_type,
   1180         element_name=element_name,
   1181         overwrite=False,
   1182         parsed_formats=parsed,
   1183         shapes_geometry_encoding=shapes_geometry_encoding,
   1184     )
   1186 if self.path != file_path and update_sdata_path:
   1187     self.path = file_path

File c:\venv\segmentation\Lib\site-packages\spatialdata\_core\spatialdata.py:1229, in SpatialData._write_element(self, element, zarr_container_path, element_type, element_name, overwrite, parsed_formats, shapes_geometry_encoding)
   1226     parsed_formats = _parse_formats(formats=parsed_formats)
   1228 if element_type == "images":
-> 1229     write_image(
   1230         image=element,
   1231         group=element_group,
   1232         name=element_name,
   1233         element_format=parsed_formats["raster"],
   1234     )
   1235 elif element_type == "labels":
   1236     write_labels(
   1237         labels=element,
   1238         group=root_group,
   1239         name=element_name,
   1240         element_format=parsed_formats["raster"],
   1241     )

File c:\venv\segmentation\Lib\site-packages\spatialdata\_io\io_raster.py:371, in write_image(image, group, name, element_format, storage_options, **metadata)
    363 def write_image(
    364     image: DataArray | DataTree,
    365     group: zarr.Group,
   (...)    369     **metadata: str | JSONDict | list[JSONDict],
    370 ) -> None:
--> 371     _write_raster(
    372         raster_type="image",
    373         raster_data=image,
    374         group=group,
    375         name=name,
    376         raster_format=element_format,
    377         storage_options=storage_options,
    378         **metadata,
    379     )

File c:\venv\segmentation\Lib\site-packages\spatialdata\_io\io_raster.py:207, in _write_raster(raster_type, raster_data, group, name, raster_format, storage_options, label_metadata, **metadata)
    197     _write_raster_dataarray(
    198         raster_type,
    199         group,
   (...)    204         **metadata,
    205     )
    206 elif isinstance(raster_data, DataTree):
--> 207     _write_raster_datatree(
    208         raster_type,
    209         group,
    210         name,
    211         raster_data,
    212         raster_format,
    213         storage_options,
    214         **metadata,
    215     )
    216 else:
    217     raise ValueError("Not a valid labels object")

File c:\venv\segmentation\Lib\site-packages\spatialdata\_io\io_raster.py:339, in _write_raster_datatree(raster_type, group, element_name, raster_data, raster_format, storage_options, **metadata)
    337 storage_options = [{"chunks": chunk} for chunk in chunks]
    338 ome_zarr_format = get_ome_zarr_format(raster_format)
--> 339 dask_delayed = write_multi_scale_ngff(
    340     pyramid=data,
    341     group=group,
    342     fmt=ome_zarr_format,
    343     axes=parsed_axes,
    344     coordinate_transformations=None,
    345     storage_options=storage_options,
    346     **metadata,
    347     compute=False,
    348 )
    349 # Compute all pyramid levels at once to allow Dask to optimize the computational graph.
    350 # Optimize_graph is set to False for now as this causes permission denied errors when during atomic writes
    351 # os.replace is called. These can also be alleviated by using 'single-threaded' scheduler.
    352 da.compute(*dask_delayed, optimize_graph=False)

File c:\venv\segmentation\Lib\site-packages\ome_zarr\writer.py:325, in write_multiscale(pyramid, group, fmt, axes, coordinate_transformations, storage_options, name, compute, **metadata)
    319 axes = _get_valid_axes(dims, axes, fmt)
    321 pyramid = [
    322     da.from_array(level) if not isinstance(level, da.Array) else level
    323     for level in pyramid
    324 ]
--> 325 dask_delayed = _write_pyramid_to_zarr(
    326     pyramid,
    327     group,
    328     fmt=fmt,
    329     axes=axes,
    330     coordinate_transformations=coordinate_transformations,
    331     storage_options=storage_options,
    332     name=name,
    333     compute=compute,
    334     **metadata,
    335 )
    337 return dask_delayed

File c:\venv\segmentation\Lib\site-packages\ome_zarr\writer.py:754, in _write_pyramid_to_zarr(pyramid, group, fmt, axes, coordinate_transformations, storage_options, name, compute, **metadata)
    745     shapes.append(level_image.shape)
    747     LOGGER.debug(
    748         "write dask.array to_zarr shape: %s, dtype: %s",
    749         level_image.shape,
    750         level_image.dtype,
    751     )
    753     delayed.append(
--> 754         da.to_zarr(
    755             arr=level_image,
    756             url=group.store,
    757             component=str(Path(group.path, f"s{idx}")),
    758             compute=False,
    759             zarr_array_kwargs=zarr_array_kwargs,
    760         )
    761     )
    762     datasets.append({"path": f"s{idx}"})
    764 # Computing delayed jobs if necessary

File c:\venv\segmentation\Lib\site-packages\dask\array\core.py:4101, in to_zarr(arr, url, component, storage_options, region, compute, return_stored, zarr_array_kwargs, zarr_read_kwargs, **kwargs)
   4099 root = zarr.open_group(store=zarr_store, mode="a") if array_name else None
   4100 if array_name:
-> 4101     z = root.create_array(name=array_name, **zarr_array_kwargs)
   4102 else:
   4103     zarr_array_kwargs["store"] = zarr_store

File c:\venv\segmentation\Lib\site-packages\zarr\core\group.py:2715, in Group.create_array(self, name, shape, dtype, data, chunks, shards, filters, compressors, compressor, serializer, fill_value, order, attributes, chunk_key_encoding, dimension_names, storage_options, overwrite, config, write_data)
   2612 """Create an array within this group.
   2613 
   2614 This method lightly wraps [zarr.core.array.create_array][].
   (...)   2709 AsyncArray
   2710 """
   2711 compressors = _parse_deprecated_compressor(
   2712     compressor, compressors, zarr_format=self.metadata.zarr_format
   2713 )
   2714 return Array(
-> 2715     self._sync(
   2716         self._async_group.create_array(
   2717             name=name,
   2718             shape=shape,
   2719             dtype=dtype,
   2720             data=data,
   2721             chunks=chunks,
   2722             shards=shards,
   2723             fill_value=fill_value,
   2724             attributes=attributes,
   2725             chunk_key_encoding=chunk_key_encoding,
   2726             compressors=compressors,
   2727             serializer=serializer,
   2728             dimension_names=dimension_names,
   2729             order=order,
   2730             filters=filters,
   2731             overwrite=overwrite,
   2732             storage_options=storage_options,
   2733             config=config,
   2734             write_data=write_data,
   2735         )
   2736     )
   2737 )

File c:\venv\segmentation\Lib\site-packages\zarr\core\sync.py:204, in SyncMixin._sync(self, coroutine)
    201 def _sync(self, coroutine: Coroutine[Any, Any, T]) -> T:
    202     # TODO: refactor this to to take *args and **kwargs and pass those to the method
    203     # this should allow us to better type the sync wrapper
--> 204     return sync(
    205         coroutine,
    206         timeout=config.get("async.timeout"),
    207     )

File c:\venv\segmentation\Lib\site-packages\zarr\core\sync.py:159, in sync(coro, loop, timeout)
    156 return_result = next(iter(finished)).result()
    158 if isinstance(return_result, BaseException):
--> 159     raise return_result
    160 else:
    161     return return_result

File c:\venv\segmentation\Lib\site-packages\zarr\core\sync.py:119, in _runner(coro)
    114 """
    115 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
    116 exception, the exception will be returned.
    117 """
    118 try:
--> 119     return await coro
    120 except Exception as ex:
    121     return ex

File c:\venv\segmentation\Lib\site-packages\zarr\core\group.py:1142, in AsyncGroup.create_array(self, name, shape, dtype, data, chunks, shards, filters, compressors, compressor, serializer, fill_value, order, attributes, chunk_key_encoding, dimension_names, storage_options, overwrite, config, write_data)
   1041 """Create an array within this group.
   1042 
   1043 This method lightly wraps [zarr.core.array.create_array][].
   (...)   1137 
   1138 """
   1139 compressors = _parse_deprecated_compressor(
   1140     compressor, compressors, zarr_format=self.metadata.zarr_format
   1141 )
-> 1142 return await create_array(
   1143     store=self.store_path,
   1144     name=name,
   1145     shape=shape,
   1146     dtype=dtype,
   1147     data=data,
   1148     chunks=chunks,
   1149     shards=shards,
   1150     filters=filters,
   1151     compressors=compressors,
   1152     serializer=serializer,
   1153     fill_value=fill_value,
   1154     order=order,
   1155     zarr_format=self.metadata.zarr_format,
   1156     attributes=attributes,
   1157     chunk_key_encoding=chunk_key_encoding,
   1158     dimension_names=dimension_names,
   1159     storage_options=storage_options,
   1160     overwrite=overwrite,
   1161     config=config,
   1162     write_data=write_data,
   1163 )

File c:\venv\segmentation\Lib\site-packages\zarr\core\array.py:4941, in create_array(store, name, shape, dtype, data, chunks, shards, filters, compressors, serializer, fill_value, order, zarr_format, attributes, chunk_key_encoding, dimension_names, storage_options, overwrite, config, write_data)
   4936 mode: Literal["a"] = "a"
   4938 store_path = await make_store_path(
   4939     store, path=name, mode=mode, storage_options=storage_options
   4940 )
-> 4941 return await init_array(
   4942     store_path=store_path,
   4943     shape=shape_parsed,
   4944     dtype=dtype_parsed,
   4945     chunks=chunks,
   4946     shards=shards,
   4947     filters=filters,
   4948     compressors=compressors,
   4949     serializer=serializer,
   4950     fill_value=fill_value,
   4951     order=order,
   4952     zarr_format=zarr_format,
   4953     attributes=attributes,
   4954     chunk_key_encoding=chunk_key_encoding,
   4955     dimension_names=dimension_names,
   4956     overwrite=overwrite,
   4957     config=config,
   4958 )

File c:\venv\segmentation\Lib\site-packages\zarr\core\array.py:4755, in init_array(store_path, shape, dtype, chunks, shards, filters, compressors, serializer, fill_value, order, zarr_format, attributes, chunk_key_encoding, dimension_names, overwrite, config)
   4752     if order is not None:
   4753         _warn_order_kwarg()
-> 4755     meta = AsyncArray._create_metadata_v3(
   4756         shape=shape_parsed,
   4757         dtype=zdtype,
   4758         fill_value=fill_value,
   4759         chunk_shape=chunks_out,
   4760         chunk_key_encoding=chunk_key_encoding_parsed,
   4761         codecs=codecs_out,
   4762         dimension_names=dimension_names,
   4763         attributes=attributes,
   4764     )
   4766 arr = AsyncArray(metadata=meta, store_path=store_path, config=config)
   4767 await arr._save_metadata(meta, ensure_parents=True)

File c:\venv\segmentation\Lib\site-packages\zarr\core\array.py:777, in AsyncArray._create_metadata_v3(shape, dtype, chunk_shape, fill_value, chunk_key_encoding, codecs, dimension_names, attributes)
    774 else:
    775     fill_value_parsed = fill_value
--> 777 chunk_grid_parsed = RegularChunkGrid(chunk_shape=chunk_shape)
    778 return ArrayV3Metadata(
    779     shape=shape,
    780     data_type=dtype,
   (...)    786     attributes=attributes or {},
    787 )

File c:\venv\segmentation\Lib\site-packages\zarr\core\chunk_grids.py:180, in RegularChunkGrid.__init__(self, chunk_shape)
    179 def __init__(self, *, chunk_shape: ShapeLike) -> None:
--> 180     chunk_shape_parsed = parse_shapelike(chunk_shape)
    182     object.__setattr__(self, "chunk_shape", chunk_shape_parsed)

File c:\venv\segmentation\Lib\site-packages\zarr\core\common.py:200, in parse_shapelike(data)
    198 if not all(isinstance(v, int) for v in data_tuple):
    199     msg = f"Expected an iterable of integers. Got {data} instead."
--> 200     raise TypeError(msg)
    201 if not all(v > -1 for v in data_tuple):
    202     msg = f"Expected all values to be non-negative. Got {data} instead."

TypeError: Expected an iterable of integers. Got ((1, 1, 1), (1024, 1024, 1024, 1024, 904), (1024, 1024, 1024, 1024, 904)) instead.

Therefore,what I am trying to do is: once I have the segmented table using StarDist and Proseg, I extract it from sdata_sub.tables["table"]. I then do cell annotation and now I would like to overlap the cell type annotations with the cropped HE image.

I got to something that seems to work, but it is way overcomplicated, so there must be a better way of doing this.

A section of the image I got:

Image

My spatialdata version is 0.7.2.

Is there an approach to transfer these annotations and plot them on top of the cropped HE image correctly? Since I can't save the segmented sdata, I will not have the "stardist_boundaries" nor "proseg_boundaries" as seen above (I am essentially loading the full sdata and then cropping it again). So, my guess is that we would need to create new points/shapes to represent the spatial coordinates of these annotated cell IDs, but what would be the easiest way of accomplishing this?

Thanks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions