In [None]:
import sparrow as sp

In [None]:
import os
import tempfile
import uuid

from datasets import sdata_resolve

OUTPUT_DIR =  tempfile.gettempdir()

sdata=sdata_resolve( output=os.path.join( OUTPUT_DIR, f"sdata_{uuid.uuid4()}.zarr" ) )

In [None]:
sdata

In [None]:
print( sdata.is_backed() )
print( sdata.path )

In [None]:
print( f"Content of {sdata.path}:" )
! ls {sdata.path}
print( "\n" )

print( f"Content of {sdata.path}/images:" )
! ls {sdata.path}/images

Note: you can remove an element from the zarr store (e.g. on the command line with `rm -r dummy_image`), without 'breaking' the `SpatialData` object. After reloading it from the `.zarr` store, the element that was removed will no longer be an element of the `SpatialData` object.

If the `SpatialData` object is not backed by a `.zarr` store, elements can be removed in the Python shell via `del ...`.

Exercise:

Try removing `dummy_image` from the `.zarr` store.
Next reload the `SpatialData` object.
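The removal step can also be done from Python rather than with `rm -r`. The following is a minimal sketch on a mock directory tree, assuming the store is laid out as `<store>/images/<element>` (as the `ls` output above suggests). The helper `remove_element` and the mock store are hypothetical and exist only for illustration.

```python
import os
import shutil
import tempfile

# Mock store with the assumed layout <store>/images/<element>.
# It stands in for a real .zarr store and contains no actual data.
store = os.path.join(tempfile.mkdtemp(), "sdata_mock.zarr")
for name in ("clahe", "dummy_image"):
    os.makedirs(os.path.join(store, "images", name))


def remove_element(store_path, element_type, name):
    """Delete one element directory from the store (equivalent to `rm -r`)."""
    shutil.rmtree(os.path.join(store_path, element_type, name))


remove_element(store, "images", "dummy_image")

# Only the elements that were not removed are left on disk.
remaining = sorted(os.listdir(os.path.join(store, "images")))
print(remaining)
```

With a real object you would pass `sdata.path` instead of the mock path, and then reload it, for example with `spatialdata.read_zarr( sdata.path )`.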

### Images

For example DAPI, PolyT, multiplex, ...

In [None]:
sdata[ "clahe" ] # -> xarray.DataArray (or datatree.DataTree for multiscale )
sdata[ "clahe" ].data # -> Dask array
sdata[ "clahe" ].data.compute() # -> numpy array

In [None]:
from sparrow.image._image import _get_spatial_element
sdata[ "raw_image" ] # -> datatree.DataTree
se=_get_spatial_element( sdata, layer="raw_image" )  # gets scale0 in case it is multiscale
se # -> xarray.DataArray
se.data # -> Dask array

Images, Labels and Points are lazy if the `SpatialData` object is backed by a `.zarr` store. Lazy means they are not 'pulled' into RAM unless you ask for it (e.g. by calling `.compute()` or `.persist()` on the Dask objects).

[Dask](https://www.dask.org/) enables out-of-core computation, allowing you to process datasets that exceed the available RAM, and also facilitates parallelized computations.

Note that currently Tables and Shapes are not lazy, and will be loaded into memory when you load a `SpatialData` object.
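To make the lazy behaviour described above concrete, here is a small sketch on a synthetic Dask array. It does not touch `sdata`; the array shape `(c, y, x)` and the chunk sizes are arbitrary assumptions chosen for illustration.

```python
import dask.array as da
import numpy as np

# Synthetic stand-in for an image layer with dims (c, y, x), split into 4x4 chunks.
image = da.from_array(
    np.arange(64, dtype=np.float32).reshape(1, 8, 8), chunks=(1, 4, 4)
)

# Building an expression is lazy: this only records a task graph.
lazy_mean = (image * 2).mean()
print(type(lazy_mean))  # still a Dask array, not a number

# .compute() triggers the actual work, chunk by chunk.
result = lazy_mean.compute()

# Slicing before computing only materializes the chunks that overlap the crop.
crop = image[0, :4, :4].compute()
print(result, crop.shape)
```

The same pattern applies to `sdata[ "clahe" ].data`: slice or reduce first, and call `.compute()` only on the part you need in memory.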

We can visualize the images:

Using SPArrOW:

In [None]:
sp.pl.plot_image( sdata, img_layer="clahe", figsize=( 5,5 ), colorbar=True )

Via SpatialData:

In [None]:
import spatialdata_plot

sdata.pl.render_images( "clahe" ).pl.show()

Exercise: use matplotlib to visualize the image layer with name `min_max_filtered`.

In [None]:
# solution

import matplotlib.pyplot as plt

plt.imshow( sdata[ "min_max_filtered" ].data[0].compute() )

Interactive exploration of the `SpatialData` object:

In [None]:
from napari_spatialdata import Interactive

Interactive( sdata )

Images can have multiple channels:

In [None]:
sdata_macsima=sp.datasets.macsima_example()
sdata_macsima.images[ "HumanLiverH35" ]

In [None]:
Interactive( sdata_macsima )

### Labels

Labels typically represent a segmentation mask.

Labels and images are sometimes referred to as `raster` data.

In [None]:
sdata[ "segmentation_mask" ]

In [None]:
sdata[ "segmentation_mask" ].data.compute()

In [None]:
sdata[ "segmentation_mask" ].data.compute().dtype

In [None]:
sdata.pl.render_images( "clahe" ).pl.render_labels( "segmentation_mask" ).pl.show()

Exercise:

Calculate the total number of cells (based on the segmentation mask provided).

Bonus: try not to load the segmentation mask in memory.

In [None]:
# Solution:

import dask.array as da

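# da.unique stays lazy until .compute(); label 0 is assumed to be background, so exclude it
labels = da.unique( sdata[ "segmentation_mask" ].data ).compute()
( labels != 0 ).sum()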
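One pitfall when counting cells from unique label values is that the background value is itself one of the unique values. The toy sketch below illustrates this with plain NumPy on a hand-written mask, assuming background pixels are labelled `0`; the mask is made up for illustration.

```python
import numpy as np

# Toy segmentation mask: 0 is assumed to be background, 1..3 are cell labels.
mask = np.array(
    [
        [0, 0, 1, 1],
        [0, 2, 2, 1],
        [3, 3, 0, 0],
    ],
    dtype=np.uint32,
)

labels = np.unique(mask)

# Counting all unique values also counts the background.
n_unique = labels.size

# Excluding label 0 gives the number of cells.
n_cells = int((labels != 0).sum())
print(n_unique, n_cells)
```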

In [None]:
# Conversion between labels and shapes: via sp.sh.vectorize (requires a recent version of sparrow)

### Shapes

Shapes either represent the boundaries of a segmentation mask, or an annotation (e.g. tumor region).

### Coordinate systems

All elements in a `SpatialData` object are assigned to one or more coordinate systems, which allows for storing multiple samples in the same `SpatialData` object.

In [None]:
from spatialdata.transformations import get_transformation

get_transformation( sdata[ "clahe" ], get_all=True )
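With `get_all=True`, the call above returns the transformations of the element keyed by coordinate system name. Conceptually, each transformation maps the element's own (intrinsic) coordinates into that coordinate system. The sketch below illustrates the idea with plain NumPy and homogeneous affine matrices. It is not the `spatialdata` API; the helper `to_global` and the 100-unit offset are hypothetical.

```python
import numpy as np

# Hypothetical setup: sample A uses the identity, sample B is shifted
# 100 units along x so both samples fit side by side in one global system.
identity = np.eye(3)
translate_b = np.array(
    [
        [1.0, 0.0, 100.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
)


def to_global(points_xy, affine):
    """Map an (n, 2) array of intrinsic (x, y) points into the global system."""
    homogeneous = np.column_stack([points_xy, np.ones(len(points_xy))])
    return (homogeneous @ affine.T)[:, :2]


corners = np.array([[0.0, 0.0], [50.0, 20.0]])
corners_a = to_global(corners, identity)
corners_b = to_global(corners, translate_b)
print(corners_a)
print(corners_b)
```

The same intrinsic coordinates end up at different global positions depending on the transformation, which is what lets several samples share one `SpatialData` object.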