In [1]:
import spatialdata as sd




# Subsetting SpatialData objects

In this tutorial, we illustrate how SpatialData objects can be subset using the `spatialdata-plot` preprocessing accessor `.pp`. We use the MIBI-TOF dataset, which can be obtained from the spatialdata-sandbox repository (https://github.com/giovp/spatialdata-sandbox).

In [2]:
data_dir = "../../../spatialdata-sandbox/mibitof/data.zarr"

In [3]:
mibi = sd.read_zarr(data_dir)


SpatialData objects may contain various *elements*, including images, labels, shapes and points, as well as *coordinate systems*, which represent groups of associated elements. The content of an object can be inspected by evaluating it in a cell, which displays its `__repr__`.

In [4]:
mibi

SpatialData object with:
├── Images
│     ├── 'point8_image': SpatialImage[cyx] (3, 1024, 1024)
│     ├── 'point16_image': SpatialImage[cyx] (3, 1024, 1024)
│     └── 'point23_image': SpatialImage[cyx] (3, 1024, 1024)
├── Labels
│     ├── 'point8_labels': SpatialImage[yx] (1024, 1024)
│     ├── 'point16_labels': SpatialImage[yx] (1024, 1024)
│     └── 'point23_labels': SpatialImage[yx] (1024, 1024)
└── Table
      └── AnnData object with n_obs × n_vars = 3309 × 36
    obs: 'row_num', 'point', 'cell_id', 'X1', 'center_rowcoord', 'center_colcoord', 'cell_size', 'category', 'donor', 'Cluster', 'batch', 'library_id'
    uns: 'spatialdata_attrs'
    obsm: 'X_scanorama', 'X_umap', 'spatial': AnnData (3309, 36)
with coordinate systems:
▸ 'point8', with elements:
        point8_image (Images), point8_labels (Labels)
▸ 'point16', with elements:
        point16_image (Images), point16_labels (Labels)
▸ 'point23', with elements:
        point23_image (Images), point23_labels (Labels)

Importing `spatialdata-plot` equips SpatialData objects with so-called accessors, which extend the object with additional methods. The preprocessing accessor `.pp` is used to subset SpatialData objects and exposes the methods `.pp.get_elements` and `.pp.get_bb`.

In [5]:
import spatialdata_plot
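
The accessor mechanism can be pictured with a small, library-independent sketch. All names below (`register_accessor`, `Container`, `ToyPreprocessing`) are hypothetical and only illustrate the general pattern of attaching a namespace to an existing class when a module is imported; this is not meant to describe how `spatialdata-plot` is actually implemented.

```python
class _AccessorDescriptor:
    """Descriptor that builds an accessor bound to the instance on attribute access."""

    def __init__(self, accessor_cls):
        self._accessor_cls = accessor_cls

    def __get__(self, instance, owner):
        if instance is None:
            # Accessed on the class itself: return the accessor class.
            return self._accessor_cls
        return self._accessor_cls(instance)


def register_accessor(name, target_cls):
    """Return a class decorator that attaches the decorated class to target_cls."""

    def decorator(accessor_cls):
        setattr(target_cls, name, _AccessorDescriptor(accessor_cls))
        return accessor_cls

    return decorator


class Container:
    """Hypothetical stand-in for a data container with named elements."""

    def __init__(self, elements):
        self.elements = dict(elements)


@register_accessor("pp", Container)
class ToyPreprocessing:
    """Hypothetical accessor; registering it makes `Container().pp` available."""

    def __init__(self, obj):
        self._obj = obj

    def get_elements(self, keys):
        # Accept a single key or a list of keys and return a new container.
        keys = [keys] if isinstance(keys, str) else list(keys)
        return Container({k: self._obj.elements[k] for k in keys})


data = Container({"a_image": "image data", "a_labels": "label data"})
subset = data.pp.get_elements("a_image")
print(sorted(subset.elements))
```

Because the toy accessor returns a new `Container`, the result again exposes `.pp`, and the original object is left unchanged.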

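## Extracting elements and coordinate systems

Any element or coordinate system can be extracted using `.pp.get_elements`, which takes the respective key(s) as an argument and returns a copy of the subset SpatialData object. In the outputs below, the table is included and filtered to the matching observations whenever labels are part of the selection.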

In [6]:
mibi.pp.get_elements("point8_image")  # extract the image point8_image

SpatialData object with:
└── Images
      └── 'point8_image': SpatialImage[cyx] (3, 1024, 1024)
with coordinate systems:
▸ 'point8', with elements:
        point8_image (Images)

In [7]:
mibi.pp.get_elements("point16_labels")  # extract point16_labels

SpatialData object with:
├── Labels
│     └── 'point16_labels': SpatialImage[yx] (1024, 1024)
└── Table
      └── AnnData object with n_obs × n_vars = 1023 × 36
    obs: 'row_num', 'point', 'cell_id', 'X1', 'center_rowcoord', 'center_colcoord', 'cell_size', 'category', 'donor', 'Cluster', 'batch', 'library_id'
    uns: 'spatialdata_attrs'
    obsm: 'X_scanorama', 'X_umap', 'spatial': AnnData (1023, 36)
with coordinate systems:
▸ 'point16', with elements:
        point16_labels (Labels)

In [8]:
mibi.pp.get_elements("point23")  # extract the coordinate system point23

SpatialData object with:
├── Images
│     └── 'point23_image': SpatialImage[cyx] (3, 1024, 1024)
├── Labels
│     └── 'point23_labels': SpatialImage[yx] (1024, 1024)
└── Table
      └── AnnData object with n_obs × n_vars = 1241 × 36
    obs: 'row_num', 'point', 'cell_id', 'X1', 'center_rowcoord', 'center_colcoord', 'cell_size', 'category', 'donor', 'Cluster', 'batch', 'library_id'
    uns: 'spatialdata_attrs'
    obsm: 'X_scanorama', 'X_umap', 'spatial': AnnData (1241, 36)
with coordinate systems:
▸ 'point23', with elements:
        point23_image (Images), point23_labels (Labels)

Multiple elements or coordinate systems can be selected by passing their keys as a list.

In [9]:
mibi.pp.get_elements(["point23_image", "point23_labels"])  # extract image and labels of point23

SpatialData object with:
├── Images
│     └── 'point23_image': SpatialImage[cyx] (3, 1024, 1024)
├── Labels
│     └── 'point23_labels': SpatialImage[yx] (1024, 1024)
└── Table
      └── AnnData object with n_obs × n_vars = 1241 × 36
    obs: 'row_num', 'point', 'cell_id', 'X1', 'center_rowcoord', 'center_colcoord', 'cell_size', 'category', 'donor', 'Cluster', 'batch', 'library_id'
    uns: 'spatialdata_attrs'
    obsm: 'X_scanorama', 'X_umap', 'spatial': AnnData (1241, 36)
with coordinate systems:
▸ 'point23', with elements:
        point23_image (Images), point23_labels (Labels)

In [10]:
mibi.pp.get_elements(["point8", "point16"])  # extract coordinate systems point8 and point16

SpatialData object with:
├── Images
│     ├── 'point8_image': SpatialImage[cyx] (3, 1024, 1024)
│     └── 'point16_image': SpatialImage[cyx] (3, 1024, 1024)
├── Labels
│     ├── 'point8_labels': SpatialImage[yx] (1024, 1024)
│     └── 'point16_labels': SpatialImage[yx] (1024, 1024)
└── Table
      └── AnnData object with n_obs × n_vars = 2068 × 36
    obs: 'row_num', 'point', 'cell_id', 'X1', 'center_rowcoord', 'center_colcoord', 'cell_size', 'category', 'donor', 'Cluster', 'batch', 'library_id'
    uns: 'spatialdata_attrs'
    obsm: 'X_scanorama', 'X_umap', 'spatial': AnnData (2068, 36)
with coordinate systems:
▸ 'point8', with elements:
        point8_image (Images), point8_labels (Labels)
▸ 'point16', with elements:
        point16_image (Images), point16_labels (Labels)
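
The key handling seen in this section, where a key may name either a single element or a whole coordinate system, can be sketched in plain Python. The helper `resolve_keys` is hypothetical and is not part of `spatialdata-plot`; the mapping is copied from the repr of `mibi` above, and the sketch only mirrors the behaviour visible in the outputs.

```python
# Coordinate systems and their elements, copied from the repr of `mibi` above.
COORDINATE_SYSTEMS = {
    "point8": ["point8_image", "point8_labels"],
    "point16": ["point16_image", "point16_labels"],
    "point23": ["point23_image", "point23_labels"],
}


def resolve_keys(keys, coordinate_systems=COORDINATE_SYSTEMS):
    """Expand element and coordinate-system keys into a list of element names."""
    if isinstance(keys, str):
        keys = [keys]
    known = {name for names in coordinate_systems.values() for name in names}
    selected = []
    for key in keys:
        if key in coordinate_systems:
            # A coordinate-system key selects all of its elements.
            candidates = coordinate_systems[key]
        elif key in known:
            candidates = [key]
        else:
            raise KeyError(f"unknown element or coordinate system: {key!r}")
        for name in candidates:
            if name not in selected:  # keep order, drop duplicates
                selected.append(name)
    return selected


print(resolve_keys("point8_image"))
print(resolve_keys(["point8", "point16"]))
```

The table sizes reported above are consistent with such a per-region selection: 1023 observations for `point16` and 1241 for `point23` leave 3309 − 1023 − 1241 = 1045 for `point8`, and 1045 + 1023 = 2068 matches the output for `["point8", "point16"]`.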

## Extracting bounding boxes

The method `.pp.get_bb` selects a rectangular bounding box. It takes the x and y ranges of the region of interest and, by default, applies the selection to all elements within the object. In the output below, the images and labels are cropped to 300 × 300 pixels, while the table still lists all 3309 observations.

In [11]:
mibi.pp.get_bb([200, 500], [200, 500])  # select the region with x and y in the range [200, 500]

SpatialData object with:
├── Images
│     ├── 'point8_image': SpatialImage[cyx] (3, 300, 300)
│     ├── 'point16_image': SpatialImage[cyx] (3, 300, 300)
│     └── 'point23_image': SpatialImage[cyx] (3, 300, 300)
├── Labels
│     ├── 'point8_labels': SpatialImage[yx] (300, 300)
│     ├── 'point16_labels': SpatialImage[yx] (300, 300)
│     └── 'point23_labels': SpatialImage[yx] (300, 300)
└── Table
      └── AnnData object with n_obs × n_vars = 3309 × 36
    obs: 'row_num', 'point', 'cell_id', 'X1', 'center_rowcoord', 'center_colcoord', 'cell_size', 'category', 'donor', 'Cluster', 'batch', 'library_id'
    uns: 'spatialdata_attrs'
    obsm: 'X_scanorama', 'X_umap', 'spatial': AnnData (3309, 36)
with coordinate systems:
▸ 'point8', with elements:
        point8_image (Images), point8_labels (Labels)
▸ 'point16', with elements:
        point16_image (Images), point16_labels (Labels)
▸ 'point23', with elements:
        point23_image (Images), point23_labels (Labels)
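
The 300 × 300 result suggests that each range is treated as a half-open interval, as in ordinary Python slicing; this is an inference from the output above rather than a documented guarantee. A minimal sketch of the shape arithmetic, using a hypothetical helper `cropped_shape` that is not part of `spatialdata-plot`:

```python
def cropped_shape(shape, x_range, y_range):
    """Shape of an array with trailing (y, x) axes after a half-open crop."""
    *leading, height, width = shape
    # slice.indices clips the requested range to the axis length
    # (clipping is a choice made in this sketch).
    y_len = len(range(*slice(*y_range).indices(height)))
    x_len = len(range(*slice(*x_range).indices(width)))
    return (*leading, y_len, x_len)


print(cropped_shape((3, 1024, 1024), [200, 500], [200, 500]))  # image, axes cyx
print(cropped_shape((1024, 1024), [200, 500], [200, 500]))  # labels, axes yx
```

Under this assumption, the range `[200, 500]` covers pixels 200 through 499, which gives the 300 pixels per axis reported for every image and label element.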

## Chaining preprocessing methods

Methods of the preprocessing accessor can be chained, so that a specific region of selected element(s) can be extracted in a single expression.

In [12]:
mibi.pp.get_elements("point16").pp.get_bb([200, 500], [200, 500])  # first select the coordinate system, then the ROI

SpatialData object with:
├── Images
│     └── 'point16_image': SpatialImage[cyx] (3, 300, 300)
├── Labels
│     └── 'point16_labels': SpatialImage[yx] (300, 300)
└── Table
      └── AnnData object with n_obs × n_vars = 1023 × 36
    obs: 'row_num', 'point', 'cell_id', 'X1', 'center_rowcoord', 'center_colcoord', 'cell_size', 'category', 'donor', 'Cluster', 'batch', 'library_id'
    uns: 'spatialdata_attrs'
    obsm: 'X_scanorama', 'X_umap', 'spatial': AnnData (1023, 36)
with coordinate systems:
▸ 'point16', with elements:
        point16_image (Images), point16_labels (Labels)
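
Chaining works because each call returns a new object that offers the same methods again. The toy class below is a hypothetical illustration of that idea (`ToySelection`, `select` and `crop` are invented names, not the `spatialdata-plot` API); it shows that a chained call is equivalent to storing the intermediate result in a variable.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToySelection:
    """Hypothetical record of the kept elements and their (y, x) extent."""

    elements: tuple
    shape: tuple = (1024, 1024)

    def select(self, prefix):
        # Keep only elements whose name starts with the given prefix.
        kept = tuple(e for e in self.elements if e.startswith(prefix))
        return replace(self, elements=kept)

    def crop(self, x_range, y_range):
        # Record the extent of a half-open crop; the original is untouched.
        height = y_range[1] - y_range[0]
        width = x_range[1] - x_range[0]
        return replace(self, shape=(height, width))


full = ToySelection(("point8_image", "point16_image", "point16_labels"))

# Chained form: select first, then crop.
chained = full.select("point16").crop([200, 500], [200, 500])

# Equivalent step-by-step form with an intermediate variable.
stepwise = full.select("point16")
stepwise = stepwise.crop([200, 500], [200, 500])

print(chained == stepwise)
print(chained)
```

Since neither method modifies the object it is called on, `full` still describes the complete, uncropped selection afterwards.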