# Spatialdata - A quick introduction 

This tutorial provides a quick overview of the key functionalities of the spatialdata format. This includes:

1. data storage and loading,
2. alignment of modalities,
3. plotting, and 
4. visualization in napari image viewers. 

For detailed explanations and tutorials, check out the official [documentation](https://spatialdata.scverse.org/en/latest/index.html).


## What is spatialdata? 

Spatialdata is a data framework for the joint and accessible storage of imaging data, annotations, and -omics data. This is extremely useful for spatial -omics experiments, which commonly generate imaging data and (partially) paired -omics measurements.

### Yet another data format?
The power of spatialdata comes from the fact that it was designed from the start as an integrated format for -omics and imaging data. The spatialdata format implements FAIR principles and follows (future versions of) the OME-Zarr storage/disk format. It also provides many convenient functionalities, including the overlay of imaging data and -omics measurements as static plots or in dynamic viewers (napari), as well as interfaces with -omics analysis tools and deep learning frameworks. Spatialdata is performant and can handle large imaging and -omics data, as well as complex annotations.

Most notably, spatialdata makes it simple to keep track of cells or shapes of interest across modalities (imaging, annotation, -omics).

We expect spatialdata to become the de facto standard for the analysis of spatial -omics data in the coming years, at least in the Python ecosystem.

### Spatialdata and the DVP workflow 
Spatialdata implements key functionalities that are relevant in the DVP workflow. 

A typical DVP workflow is outlined below. Note that spatialdata provides a storage option for almost every step of the workflow and thus helps to keep track of data across modalities.

\# | Step | Modality | Tool/Format | Spatialdata attribute
--- | --- | --- | --- | ---
1 | Immunofluorescence/Pathology staining | Imaging | `.czi`, `.mrxs`, `.tiff` | `.images`
2 | Cell segmentation | Annotation | cellpose, ... (e.g. `.tiff`) | `.shapes` (vector data), `.labels` (raster data)
3 | Selection of cells | Annotation/Featurization | scPortrait (various formats) | `.tables`
4 | Excision of cells | - | pyLMD (`.xml`) | -
5 | MS measurement | -omics | alphaDIA, alphabase, DIANN (various formats) | `.tables`


## Getting started

In the following, we will explore the spatialdata format in a little more detail. You should also check out the official [documentation](https://spatialdata.scverse.org/en/latest/index.html).


In [2]:
import spatialdata as sd

Let's load a built-in mock dataset from spatialdata:

In [5]:
blobs = sd.datasets.blobs()


Let's explore the structure of the spatialdata object:

In [6]:
blobs

SpatialData object
├── Images
│     ├── 'blobs_image': DataArray[cyx] (3, 512, 512)
│     └── 'blobs_multiscale_image': DataTree[cyx] (3, 512, 512), (3, 256, 256), (3, 128, 128)
├── Labels
│     ├── 'blobs_labels': DataArray[yx] (512, 512)
│     └── 'blobs_multiscale_labels': DataTree[yx] (512, 512), (256, 256), (128, 128)
├── Points
│     └── 'blobs_points': DataFrame with shape: (<Delayed>, 4) (2D points)
├── Shapes
│     ├── 'blobs_circles': GeoDataFrame shape: (5, 2) (2D shapes)
│     ├── 'blobs_multipolygons': GeoDataFrame shape: (2, 1) (2D shapes)
│     └── 'blobs_polygons': GeoDataFrame shape: (5, 1) (2D shapes)
└── Tables
      └── 'table': AnnData (26, 3)
with coordinate systems:
    ▸ 'global', with elements:
        blobs_image (Images), blobs_multiscale_image (Images), blobs_labels (Labels), blobs_multiscale_labels (Labels), blobs_points (Points), blobs_circles (Shapes), blobs_multipolygons (Shapes), blobs_polygons (Shapes)

You can see that the dataset contains multiple attributes:

1. **Images** 
Images are 