-
Notifications
You must be signed in to change notification settings - Fork 86
Description
Hello,
I am trying to pass annotations from a AnnData file (filtered):
AnnData object with n_obs × n_vars = 12218 × 17731
obs: 'region', 'slide', 'cell_id', 'area', 'n_counts', 'n_genes_by_counts', 'log1p_n_genes_by_counts', 'total_counts', 'log1p_total_counts', 'total_counts_mt', 'log1p_total_counts_mt', 'pct_counts_mt', 'novae_sid', 'neighborhood_valid', 'novae_leaves', 'novae_domains_5', 'novae_domains_6', 'novae_domains_7', 'novae_domains_8', 'novae_domains_9', 'novae_domains_10', 'novae_domains_11', 'novae_domains_12', 'novae_domains_13', 'novae_domains_14', 'novae_domains_15', 'novae_domains_16', 'novae_domains_17', 'novae_domains_18', 'novae_domains_19', 'novae_domains_20', 'novae_domains_21', 'novae_domains_22', 'novae_domains_23', 'novae_domains_24', 'novae_domains_25', 'novae_domains_26', 'novae_domains_27', 'novae_domains_28', 'novae_domains_29', 'novae_domains_30', 'novae_domains_31', 'novae_domains_32', 'novae_domains_33', 'novae_domains_34', 'novae_domains_35', 'novae_domains_36', 'novae_domains_37', 'novae_domains_38', 'novae_domains_39', 'novae_domains_40', 'Tumor_score', 'Fibroblasts_score', 'Macrophages_score', 'Neutrophils_score', 'Goblet_cells_score', 'cell_type_enrichmap', 'cell_type_domain', 'instance_id'
var: 'gene_ids', 'feature_types', 'genome', 'n_counts', 'mt', 'n_cells_by_counts', 'mean_counts', 'log1p_mean_counts', 'pct_dropout_by_counts', 'total_counts', 'log1p_total_counts', 'highly_variable', 'means', 'dispersions', 'dispersions_norm', 'novae_use_gene', 'in_vocabulary'
uns: 'cell_type_domain_colors', 'gene_contributions', 'hvg', 'log1p', 'novae_attrs', 'novae_domains_10_colors', 'novae_domains_11_colors', 'novae_domains_12_colors', 'novae_domains_13_colors', 'novae_domains_14_colors', 'novae_domains_15_colors', 'novae_domains_16_colors', 'novae_domains_17_colors', 'novae_domains_18_colors', 'novae_domains_19_colors', 'novae_domains_20_colors', 'novae_domains_21_colors', 'novae_domains_22_colors', 'novae_domains_23_colors', 'novae_domains_24_colors', 'novae_domains_25_colors', 'novae_domains_26_colors', 'novae_domains_27_colors', 'novae_domains_28_colors', 'novae_domains_29_colors', 'novae_domains_30_colors', 'novae_domains_31_colors', 'novae_domains_32_colors', 'novae_domains_33_colors', 'novae_domains_34_colors', 'novae_domains_35_colors', 'novae_domains_36_colors', 'novae_domains_37_colors', 'novae_domains_38_colors', 'novae_domains_39_colors', 'novae_domains_40_colors', 'novae_domains_5_colors', 'novae_domains_6_colors', 'novae_domains_7_colors', 'novae_domains_8_colors', 'novae_domains_9_colors', 'pca', 'sopa_attrs', 'spatial_neighbors', 'spatialdata_attrs'
obsm: 'X_pca', 'bins_assignments', 'novae_latent', 'novae_latent_10', 'novae_latent_11', 'novae_latent_12', 'novae_latent_13', 'novae_latent_14', 'novae_latent_15', 'novae_latent_16', 'novae_latent_17', 'novae_latent_18', 'novae_latent_19', 'novae_latent_20', 'novae_latent_21', 'novae_latent_22', 'novae_latent_23', 'novae_latent_24', 'novae_latent_25', 'novae_latent_26', 'novae_latent_27', 'novae_latent_28', 'novae_latent_29', 'novae_latent_30', 'novae_latent_31', 'novae_latent_32', 'novae_latent_33', 'novae_latent_34', 'novae_latent_35', 'novae_latent_36', 'novae_latent_37', 'novae_latent_38', 'novae_latent_39', 'novae_latent_40', 'novae_latent_5', 'novae_latent_6', 'novae_latent_7', 'novae_latent_8', 'novae_latent_9', 'spatial'
varm: 'PCs'
layers: 'counts', 'lognorm_counts'
obsp: 'spatial_connectivities', 'spatial_distances', 'spatial_distances_local', 'spatial_distances_view'
Which come from this segmented sdata using stardist and proseg:
SpatialData object
├── Images
│ ├── 'Visium_HD_Human_Colon_Cancer_full_image': DataTree[cyx] (3, 5000, 5000), (3, 2500, 2500), (3, 1250, 1250), (3, 625, 625), (3, 313, 313)
│ ├── 'Visium_HD_Human_Colon_Cancer_hires_image': DataArray[cyx] (3, 398, 399)
│ └── 'Visium_HD_Human_Colon_Cancer_lowres_image': DataArray[cyx] (3, 40, 40)
├── Shapes
│ ├── 'Visium_HD_Human_Colon_Cancer_square_002um': GeoDataFrame shape: (470416, 1) (2D shapes)
│ ├── 'Visium_HD_Human_Colon_Cancer_square_008um': GeoDataFrame shape: (29659, 1) (2D shapes)
│ ├── 'Visium_HD_Human_Colon_Cancer_square_016um': GeoDataFrame shape: (7500, 1) (2D shapes)
│ ├── 'image_patches': GeoDataFrame shape: (9, 3) (2D shapes)
│ ├── 'proseg_boundaries': GeoDataFrame shape: (12394, 2) (2D shapes)
│ ├── 'region_of_interest': GeoDataFrame shape: (1, 1) (2D shapes)
│ └── 'stardist_boundaries': GeoDataFrame shape: (12419, 1) (2D shapes)
└── Tables
├── 'square_002um': AnnData (470416, 18085)
├── 'square_008um': AnnData (29659, 18085)
├── 'square_016um': AnnData (7500, 18085)
└── 'table': AnnData (12394, 18085)
with coordinate systems:
▸ 'Visium_HD_Human_Colon_Cancer', with elements:
Visium_HD_Human_Colon_Cancer_full_image (Images), Visium_HD_Human_Colon_Cancer_hires_image (Images), Visium_HD_Human_Colon_Cancer_lowres_image (Images), Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), image_patches (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes), stardist_boundaries (Shapes)
▸ 'Visium_HD_Human_Colon_Cancer_downscaled_hires', with elements:
Visium_HD_Human_Colon_Cancer_hires_image (Images), Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes)
▸ 'Visium_HD_Human_Colon_Cancer_downscaled_lowres', with elements:
Visium_HD_Human_Colon_Cancer_lowres_image (Images), Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes)
▸ 'microns', with elements:
Visium_HD_Human_Colon_Cancer_square_002um (Shapes), Visium_HD_Human_Colon_Cancer_square_008um (Shapes), Visium_HD_Human_Colon_Cancer_square_016um (Shapes), proseg_boundaries (Shapes), region_of_interest (Shapes)
This SD object has been cropped the following way first (before the segmentation):
sdata_sub = sd.bounding_box_query(
sdata,
min_coordinate=[51000, 9000],
max_coordinate=[56000, 14000],
axes=("x", "y"),
target_coordinate_system=sample_id,
filter_table=True,
)
However, I can't save this segmented SpatialData that has been cropped:
sdata_sub.write("test.zarr") # save it
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[16], line 1
----> 1 sdata_sub.write("test.zarr") # save it
File c:\venv\segmentation\Lib\site-packages\spatialdata\_utils.py:263, in _deprecation_alias.<locals>.deprecation_decorator.<locals>.wrapper(*args, **kwargs)
261 raise ValueError("version for deprecation must be specified")
262 rename_kwargs(f.__name__, kwargs, alias_copy, class_name, library, version)
--> 263 return f(*args, **kwargs)
File c:\venv\segmentation\Lib\site-packages\spatialdata\_core\spatialdata.py:1176, in SpatialData.write(self, file_path, overwrite, consolidate_metadata, update_sdata_path, sdata_formats, shapes_geometry_encoding)
1173 store.close()
1175 for element_type, element_name, element in self.gen_elements():
-> 1176 self._write_element(
1177 element=element,
1178 zarr_container_path=file_path,
1179 element_type=element_type,
1180 element_name=element_name,
1181 overwrite=False,
1182 parsed_formats=parsed,
1183 shapes_geometry_encoding=shapes_geometry_encoding,
1184 )
1186 if self.path != file_path and update_sdata_path:
1187 self.path = file_path
File c:\venv\segmentation\Lib\site-packages\spatialdata\_core\spatialdata.py:1229, in SpatialData._write_element(self, element, zarr_container_path, element_type, element_name, overwrite, parsed_formats, shapes_geometry_encoding)
1226 parsed_formats = _parse_formats(formats=parsed_formats)
1228 if element_type == "images":
-> 1229 write_image(
1230 image=element,
1231 group=element_group,
1232 name=element_name,
1233 element_format=parsed_formats["raster"],
1234 )
1235 elif element_type == "labels":
1236 write_labels(
1237 labels=element,
1238 group=root_group,
1239 name=element_name,
1240 element_format=parsed_formats["raster"],
1241 )
File c:\venv\segmentation\Lib\site-packages\spatialdata\_io\io_raster.py:371, in write_image(image, group, name, element_format, storage_options, **metadata)
363 def write_image(
364 image: DataArray | DataTree,
365 group: zarr.Group,
(...) 369 **metadata: str | JSONDict | list[JSONDict],
370 ) -> None:
--> 371 _write_raster(
372 raster_type="image",
373 raster_data=image,
374 group=group,
375 name=name,
376 raster_format=element_format,
377 storage_options=storage_options,
378 **metadata,
379 )
File c:\venv\segmentation\Lib\site-packages\spatialdata\_io\io_raster.py:207, in _write_raster(raster_type, raster_data, group, name, raster_format, storage_options, label_metadata, **metadata)
197 _write_raster_dataarray(
198 raster_type,
199 group,
(...) 204 **metadata,
205 )
206 elif isinstance(raster_data, DataTree):
--> 207 _write_raster_datatree(
208 raster_type,
209 group,
210 name,
211 raster_data,
212 raster_format,
213 storage_options,
214 **metadata,
215 )
216 else:
217 raise ValueError("Not a valid labels object")
File c:\venv\segmentation\Lib\site-packages\spatialdata\_io\io_raster.py:339, in _write_raster_datatree(raster_type, group, element_name, raster_data, raster_format, storage_options, **metadata)
337 storage_options = [{"chunks": chunk} for chunk in chunks]
338 ome_zarr_format = get_ome_zarr_format(raster_format)
--> 339 dask_delayed = write_multi_scale_ngff(
340 pyramid=data,
341 group=group,
342 fmt=ome_zarr_format,
343 axes=parsed_axes,
344 coordinate_transformations=None,
345 storage_options=storage_options,
346 **metadata,
347 compute=False,
348 )
349 # Compute all pyramid levels at once to allow Dask to optimize the computational graph.
350 # Optimize_graph is set to False for now as this causes permission denied errors when during atomic writes
351 # os.replace is called. These can also be alleviated by using 'single-threaded' scheduler.
352 da.compute(*dask_delayed, optimize_graph=False)
File c:\venv\segmentation\Lib\site-packages\ome_zarr\writer.py:325, in write_multiscale(pyramid, group, fmt, axes, coordinate_transformations, storage_options, name, compute, **metadata)
319 axes = _get_valid_axes(dims, axes, fmt)
321 pyramid = [
322 da.from_array(level) if not isinstance(level, da.Array) else level
323 for level in pyramid
324 ]
--> 325 dask_delayed = _write_pyramid_to_zarr(
326 pyramid,
327 group,
328 fmt=fmt,
329 axes=axes,
330 coordinate_transformations=coordinate_transformations,
331 storage_options=storage_options,
332 name=name,
333 compute=compute,
334 **metadata,
335 )
337 return dask_delayed
File c:\venv\segmentation\Lib\site-packages\ome_zarr\writer.py:754, in _write_pyramid_to_zarr(pyramid, group, fmt, axes, coordinate_transformations, storage_options, name, compute, **metadata)
745 shapes.append(level_image.shape)
747 LOGGER.debug(
748 "write dask.array to_zarr shape: %s, dtype: %s",
749 level_image.shape,
750 level_image.dtype,
751 )
753 delayed.append(
--> 754 da.to_zarr(
755 arr=level_image,
756 url=group.store,
757 component=str(Path(group.path, f"s{idx}")),
758 compute=False,
759 zarr_array_kwargs=zarr_array_kwargs,
760 )
761 )
762 datasets.append({"path": f"s{idx}"})
764 # Computing delayed jobs if necessary
File c:\venv\segmentation\Lib\site-packages\dask\array\core.py:4101, in to_zarr(arr, url, component, storage_options, region, compute, return_stored, zarr_array_kwargs, zarr_read_kwargs, **kwargs)
4099 root = zarr.open_group(store=zarr_store, mode="a") if array_name else None
4100 if array_name:
-> 4101 z = root.create_array(name=array_name, **zarr_array_kwargs)
4102 else:
4103 zarr_array_kwargs["store"] = zarr_store
File c:\venv\segmentation\Lib\site-packages\zarr\core\group.py:2715, in Group.create_array(self, name, shape, dtype, data, chunks, shards, filters, compressors, compressor, serializer, fill_value, order, attributes, chunk_key_encoding, dimension_names, storage_options, overwrite, config, write_data)
2612 """Create an array within this group.
2613
2614 This method lightly wraps [zarr.core.array.create_array][].
(...) 2709 AsyncArray
2710 """
2711 compressors = _parse_deprecated_compressor(
2712 compressor, compressors, zarr_format=self.metadata.zarr_format
2713 )
2714 return Array(
-> 2715 self._sync(
2716 self._async_group.create_array(
2717 name=name,
2718 shape=shape,
2719 dtype=dtype,
2720 data=data,
2721 chunks=chunks,
2722 shards=shards,
2723 fill_value=fill_value,
2724 attributes=attributes,
2725 chunk_key_encoding=chunk_key_encoding,
2726 compressors=compressors,
2727 serializer=serializer,
2728 dimension_names=dimension_names,
2729 order=order,
2730 filters=filters,
2731 overwrite=overwrite,
2732 storage_options=storage_options,
2733 config=config,
2734 write_data=write_data,
2735 )
2736 )
2737 )
File c:\venv\segmentation\Lib\site-packages\zarr\core\sync.py:204, in SyncMixin._sync(self, coroutine)
201 def _sync(self, coroutine: Coroutine[Any, Any, T]) -> T:
202 # TODO: refactor this to to take *args and **kwargs and pass those to the method
203 # this should allow us to better type the sync wrapper
--> 204 return sync(
205 coroutine,
206 timeout=config.get("async.timeout"),
207 )
File c:\venv\segmentation\Lib\site-packages\zarr\core\sync.py:159, in sync(coro, loop, timeout)
156 return_result = next(iter(finished)).result()
158 if isinstance(return_result, BaseException):
--> 159 raise return_result
160 else:
161 return return_result
File c:\venv\segmentation\Lib\site-packages\zarr\core\sync.py:119, in _runner(coro)
114 """
115 Await a coroutine and return the result of running it. If awaiting the coroutine raises an
116 exception, the exception will be returned.
117 """
118 try:
--> 119 return await coro
120 except Exception as ex:
121 return ex
File c:\venv\segmentation\Lib\site-packages\zarr\core\group.py:1142, in AsyncGroup.create_array(self, name, shape, dtype, data, chunks, shards, filters, compressors, compressor, serializer, fill_value, order, attributes, chunk_key_encoding, dimension_names, storage_options, overwrite, config, write_data)
1041 """Create an array within this group.
1042
1043 This method lightly wraps [zarr.core.array.create_array][].
(...) 1137
1138 """
1139 compressors = _parse_deprecated_compressor(
1140 compressor, compressors, zarr_format=self.metadata.zarr_format
1141 )
-> 1142 return await create_array(
1143 store=self.store_path,
1144 name=name,
1145 shape=shape,
1146 dtype=dtype,
1147 data=data,
1148 chunks=chunks,
1149 shards=shards,
1150 filters=filters,
1151 compressors=compressors,
1152 serializer=serializer,
1153 fill_value=fill_value,
1154 order=order,
1155 zarr_format=self.metadata.zarr_format,
1156 attributes=attributes,
1157 chunk_key_encoding=chunk_key_encoding,
1158 dimension_names=dimension_names,
1159 storage_options=storage_options,
1160 overwrite=overwrite,
1161 config=config,
1162 write_data=write_data,
1163 )
File c:\venv\segmentation\Lib\site-packages\zarr\core\array.py:4941, in create_array(store, name, shape, dtype, data, chunks, shards, filters, compressors, serializer, fill_value, order, zarr_format, attributes, chunk_key_encoding, dimension_names, storage_options, overwrite, config, write_data)
4936 mode: Literal["a"] = "a"
4938 store_path = await make_store_path(
4939 store, path=name, mode=mode, storage_options=storage_options
4940 )
-> 4941 return await init_array(
4942 store_path=store_path,
4943 shape=shape_parsed,
4944 dtype=dtype_parsed,
4945 chunks=chunks,
4946 shards=shards,
4947 filters=filters,
4948 compressors=compressors,
4949 serializer=serializer,
4950 fill_value=fill_value,
4951 order=order,
4952 zarr_format=zarr_format,
4953 attributes=attributes,
4954 chunk_key_encoding=chunk_key_encoding,
4955 dimension_names=dimension_names,
4956 overwrite=overwrite,
4957 config=config,
4958 )
File c:\venv\segmentation\Lib\site-packages\zarr\core\array.py:4755, in init_array(store_path, shape, dtype, chunks, shards, filters, compressors, serializer, fill_value, order, zarr_format, attributes, chunk_key_encoding, dimension_names, overwrite, config)
4752 if order is not None:
4753 _warn_order_kwarg()
-> 4755 meta = AsyncArray._create_metadata_v3(
4756 shape=shape_parsed,
4757 dtype=zdtype,
4758 fill_value=fill_value,
4759 chunk_shape=chunks_out,
4760 chunk_key_encoding=chunk_key_encoding_parsed,
4761 codecs=codecs_out,
4762 dimension_names=dimension_names,
4763 attributes=attributes,
4764 )
4766 arr = AsyncArray(metadata=meta, store_path=store_path, config=config)
4767 await arr._save_metadata(meta, ensure_parents=True)
File c:\venv\segmentation\Lib\site-packages\zarr\core\array.py:777, in AsyncArray._create_metadata_v3(shape, dtype, chunk_shape, fill_value, chunk_key_encoding, codecs, dimension_names, attributes)
774 else:
775 fill_value_parsed = fill_value
--> 777 chunk_grid_parsed = RegularChunkGrid(chunk_shape=chunk_shape)
778 return ArrayV3Metadata(
779 shape=shape,
780 data_type=dtype,
(...) 786 attributes=attributes or {},
787 )
File c:\venv\segmentation\Lib\site-packages\zarr\core\chunk_grids.py:180, in RegularChunkGrid.__init__(self, chunk_shape)
179 def __init__(self, *, chunk_shape: ShapeLike) -> None:
--> 180 chunk_shape_parsed = parse_shapelike(chunk_shape)
182 object.__setattr__(self, "chunk_shape", chunk_shape_parsed)
File c:\venv\segmentation\Lib\site-packages\zarr\core\common.py:200, in parse_shapelike(data)
198 if not all(isinstance(v, int) for v in data_tuple):
199 msg = f"Expected an iterable of integers. Got {data} instead."
--> 200 raise TypeError(msg)
201 if not all(v > -1 for v in data_tuple):
202 msg = f"Expected all values to be non-negative. Got {data} instead."
TypeError: Expected an iterable of integers. Got ((1, 1, 1), (1024, 1024, 1024, 1024, 904), (1024, 1024, 1024, 1024, 904)) instead.
Therefore,what I am trying to do is: once I have the segmented table using StarDist and Proseg, I extract it from sdata_sub.tables["table"]. I then do cell annotation and now I would like to overlap the cell type annotations with the cropped HE image.
I got to something that seems to work, but it is way overcomplicated, so there must be a better way of doing this.
A section of the image I got:
My spatialdata version is 0.7.2.
Is there an approach to transfer these annotations and plot them on top of the cropped HE image correctly? Since I can't save the segmented sdata, I will not have the "stardist_boundaries" nor "proseg_boundaries" as seen above (I am essentially loading the full sdata and then cropping it again). So, my guess is that we would need to create new points/shapes to represent the spatial coordinates of these annotated cell IDs, but what would be the easiest way of accomplishing this?
Thanks!