# InSituPy demonstration - Add annotations

In [1]:
## The following code ensures that all functions and init files are reloaded before execution.
%load_ext autoreload
%autoreload 2

In [2]:
from pathlib import Path
from insitupy import XeniumData

## Previous steps

1. Download the example data for demonstration: [01_InSituPy_demo_download_data.ipynb](./01_InSituPy_demo_download_data.ipynb)
2. Register images from external stainings: [02_InSituPy_demo_register_images.ipynb](./02_InSituPy_demo_register_images.ipynb)
3. Visualize data with napari and do preprocessing steps: [03_InSituPy_demo_analyze.ipynb](./03_InSituPy_demo_analyze.ipynb)

At this point, the structure of the data should look like this:

    ```
    ./demo_dataset
    ├───cropped_processed
    ├───output-XETG00000__slide_id__sample_id
    │   ├───analysis
    │   │   ├───clustering
    │   │   ├───diffexp
    │   │   ├───pca
    │   │   ├───tsne
    │   │   └───umap
    │   └───cell_feature_matrix
    ├───registered_images
    ├───registration_qc
    └───unregistered_images
    ```


## Load Xenium data into `XeniumData` object

Now the Xenium data can be parsed by providing the data path to `XeniumData`.

In [3]:
# prepare paths
data_dir = Path("../demo_dataset") # output directory
xenium_dir = data_dir / "output-XETG00000__slide_id__sample_id" # directory of xenium data
image_dir = data_dir / "unregistered_images" # directory of images

In [24]:
xd = XeniumData(xenium_dir)

In [25]:
xd

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium

In [26]:
xd.read_all()

No `annotations` modality found.
Reading cells...
Reading images...
No `regions` modality found.
Reading transcripts...


In [6]:
# read all data modalities at once
#xd.read_all()

# alternatively, it is also possible to read each modality separately
xd.read_cells()
xd.read_images(names=["HE"])
#xd.read_annotations()
# xd.read_boundaries()
# xd.read_transcripts()


Reading cells...
Reading images...


## Load annotations

When analyzing spatial transcriptomic datasets, annotations from experts in disease pathology are key. Here, we demonstrate how to annotate data in [QuPath](https://qupath.github.io/), export the annotations as a `.geojson` file, and import them into the `XeniumData` object.

### Create annotations in QuPath

To create annotations in QuPath, follow these steps:

1. Select an annotation tool from the bar on the top left:

<center><img src="./demo_annotations/qupath_annotation_buttons.jpg" width="300"/></center>

2. Add as many annotations as you want and label them by setting classes in the annotation list. Do not forget to press the "Set class" button:

<center><img src="./demo_annotations/qupath_annotation_list.jpg" width="350"/></center>

3. Export annotations using `File > Export objects as GeoJSON`. Tick `Pretty JSON` to get an easily readable JSON file. The file name needs to have the following structure: `annotation-{slide_id}__{sample_id}__{annotation_label}`.
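
The naming convention can be checked before importing a file. The following is a minimal sketch using only the standard library; the helper `parse_annotation_filename` and the example file name are hypothetical and not part of InSituPy:

```python
from pathlib import Path


def parse_annotation_filename(path):
    """Split 'annotation-{slide_id}__{sample_id}__{annotation_label}' into its parts."""
    stem = Path(path).stem  # drop the file extension
    prefix = "annotation-"
    if not stem.startswith(prefix):
        raise ValueError(f"File name must start with '{prefix}': {stem}")
    # The three fields are separated by double underscores
    parts = stem[len(prefix):].split("__")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected three '__'-separated fields, got: {stem}")
    slide_id, sample_id, label = parts
    return {"slide_id": slide_id, "sample_id": sample_id, "label": label}


# Hypothetical file name following the convention above
info = parse_annotation_filename("annotation-slide_id__sample_id__demo.geojson")
print(info)
```

Because the fields are separated by double underscores, single underscores inside a field (as in `slide_id`) do not interfere with the split.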

### Import annotations into `XeniumData`

For demonstration purposes, we created a dummy annotation file in `./demo_annotations/`. To add the annotations to `XeniumData` follow the steps below.



In [27]:
xd.read_annotations(annotations_dir="../demo_annotations/")

Reading annotations...
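
For orientation, the exported file is plain GeoJSON and can be inspected without InSituPy. The sketch below counts annotations per class with the standard library. The inline GeoJSON is a hypothetical, heavily shortened example, and the assumption that the class sits under `properties.classification.name` is based on the columns shown by `geopandas.read_file` in this notebook, not on a guaranteed schema:

```python
import json
from collections import Counter

# Hypothetical, shortened GeoJSON with two annotations (not the demo file itself)
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "id": "a1",
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [10, 0], [10, 10], [0, 0]]]},
     "properties": {"objectType": "annotation",
                    "classification": {"name": "Positive", "color": [250, 62, 62]}}},
    {"type": "Feature", "id": "a2",
     "geometry": {"type": "Polygon",
                  "coordinates": [[[20, 20], [30, 20], [30, 30], [20, 20]]]},
     "properties": {"objectType": "annotation",
                    "classification": {"name": "Negative", "color": [112, 112, 225]}}}
  ]
}
"""


def count_classes(text):
    """Count features per classification name; unclassified features are grouped."""
    data = json.loads(text)
    # Accept both a FeatureCollection and a bare list of features
    features = data["features"] if isinstance(data, dict) else data
    names = [
        (f.get("properties", {}).get("classification") or {}).get("name", "unclassified")
        for f in features
    ]
    return Counter(names)


class_counts = count_classes(geojson_text)
print(dict(class_counts))
```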


In [28]:
xd.read_regions(regions_dir="../demo_regions/")

Reading regions...


In [30]:
xd

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium
    ➤ images
       nuclei:	(25778, 35416)
       CD20:	(25778, 35416)
       HER2:	(25778, 35416)
       HE:	(25778, 35416, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 167780 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
       boundaries
           BoundariesData object with 2 entries:
               cellular
               nuclear
    ➤ transcripts
       DataFrame with shape 42638083 x 8
    ➤ annotations
       demo:	4 annotations, 2 cla

In [34]:
xd.show()

In [12]:
xd.annotations

annotations
demo:	4 annotations, 2 classes ('Positive', 'Negative')
demo2:	5 annotations, 3 classes ('Negative', 'Positive', 'Other')

In [55]:
xd.regions.TMA

id,objectType,name,isMissing,geometry,origin
7ab5d5a6-49bd-4122-bc64-05477bc0207b,tmaCore,B-2,False,"POLYGON ((4299.61025 4213.18862, 4298.62425 42...",file
06ef93c1-f86d-45e6-ad9a-896e254638ea,tmaCore,A-3,False,"POLYGON ((7201.19150 903.26738, 7200.20550 934...",file
7933d3fd-ccd3-46af-8f15-fcc01ec9c128,tmaCore,B-1,False,"POLYGON ((1555.14725 4333.64638, 1554.15912 43...",file
7015118d-2947-48e3-baf0-4b220a76a951,tmaCore,B-3,False,"POLYGON ((7311.17300 4228.90087, 7310.18700 42...",file
bf86657f-31f6-40fe-983b-f80c3d75512b,tmaCore,A-1,False,"POLYGON ((1481.82625 908.50338, 1480.83812 939...",file
440d8f00-fb0e-42e7-9f98-30d30adfc8df,tmaCore,A-2,False,"POLYGON ((4173.91650 856.13063, 4172.93050 887...",file


In [60]:
xd.annotations.demo

id,objectType,geometry,name,color,origin
bd3aacca-1716-4df8-91dd-bf8f6413a7bd,annotation,"POLYGON ((1883.38750 2297.97500, 1883.38750 23...",Positive,"[250, 62, 62]",file
69814505-4059-42cd-8df2-752f7eb0810d,annotation,"POLYGON ((2782.90000 2654.55000, 2777.88500 26...",Positive,"[250, 62, 62]",file
1957cd32-0a21-4b45-9dae-ecf236217140,annotation,"POLYGON ((6582.24275 4874.32500, 6583.67500 48...",Negative,"[112, 112, 225]",file
19d2197a-1b8e-456f-8223-fba74641ac1c,annotation,"POLYGON ((6622.56250 3486.70000, 6619.16250 34...",Negative,"[112, 112, 225]",file


In [61]:
xd.show()



In [48]:
[elem for elem in (1,2,3)]

[autoreload of insitupy._core._widgets failed: Traceback (most recent call last):
  File "C:\Users\ge37voy\AppData\Local\miniconda3\envs\insitupy\lib\site-packages\IPython\extensions\autoreload.py", line 276, in check
    superreload(m, reload, self.old_objects)
  File "C:\Users\ge37voy\AppData\Local\miniconda3\envs\insitupy\lib\site-packages\IPython\extensions\autoreload.py", line 475, in superreload
    module = reload(module)
  File "C:\Users\ge37voy\AppData\Local\miniconda3\envs\insitupy\lib\importlib\__init__.py", line 169, in reload
    _bootstrap._exec(spec, module)
  File "<frozen importlib._bootstrap>", line 613, in _exec
  File "<frozen importlib._bootstrap_external>", line 846, in exec_module
  File "<frozen importlib._bootstrap_external>", line 983, in get_code
  File "<frozen importlib._bootstrap_external>", line 913, in source_to_code
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "C:\Users\ge37voy\Github\InSituPy\insitupy\_core\_wid

[1, 2, 3]

In [16]:
import matplotlib

In [17]:
colormap = matplotlib.colormaps["tab20"]

In [22]:
colormap.colors[0]

(0.12156862745098039, 0.4666666666666667, 0.7058823529411765)

In [37]:
len(colormap.colors)

20

In [34]:
colormap(4)

(0.17254901960784313, 0.6274509803921569, 0.17254901960784313, 1.0)

In [45]:
colormap(25)

(0.6196078431372549, 0.8549019607843137, 0.8980392156862745, 1.0)

In [46]:
list(xd.regions.metadata.keys())

['demo_regions', 'TMA']

In [16]:
xd.viewer

Viewer(camera=Camera(center=(0.0, 2738.80625, 3762.84375), zoom=0.09227467811158797, angles=(0.0, 0.0, 90.0), perspective=0.0, mouse_pan=True, mouse_zoom=True), cursor=Cursor(position=(1.0, 1.0), scaled=True, size=1, style=<CursorStyle.STANDARD: 'standard'>), dims=Dims(ndim=2, ndisplay=2, last_used=0, range=((0.0, 5477.825, 0.2125), (0.0, 7525.9, 0.2125)), current_step=(12888, 17707), order=(0, 1), axis_labels=('0', '1')), grid=GridCanvas(stride=1, shape=(-1, -1), enabled=False), layers=[<Image layer 'nuclei' at 0x13cdde5e4c0>, <Image layer 'CD20' at 0x13ce0f6eeb0>, <Image layer 'HER2' at 0x13ce0f82d30>, <Image layer 'HE' at 0x13ce0859fa0>], help='use <2> for transform', status='Ready', tooltip=Tooltip(visible=False, text=''), theme='dark', title='napari', mouse_over_canvas=False, mouse_move_callbacks=[], mouse_drag_callbacks=[], mouse_double_click_callbacks=[], mouse_wheel_callbacks=[<function dims_scroll at 0x0000013CF97ED0D0>], _persisted_mouse_event={}, _mouse_drag_gen={}, _mouse_w

In [31]:
xd.viewer.camera.center = (0,0,0)

In [38]:
xd.viewer.camera.zoom = 0.1

In [27]:
xd.viewer.camera

Camera(center=(0.0, 1652.8386517220902, 3715.2126964854133), zoom=0.06071755930404312, angles=(0.0, 0.0, 90.0), perspective=0.0, mouse_pan=True, mouse_zoom=True)

In [82]:
xd.regions.TMA

id,objectType,name,isMissing,geometry,origin
7ab5d5a6-49bd-4122-bc64-05477bc0207b,tmaCore,B-2,False,"POLYGON ((4299.61025 4213.18862, 4298.62425 42...",file
06ef93c1-f86d-45e6-ad9a-896e254638ea,tmaCore,A-3,False,"POLYGON ((7201.19150 903.26738, 7200.20550 934...",file
7933d3fd-ccd3-46af-8f15-fcc01ec9c128,tmaCore,B-1,False,"POLYGON ((1555.14725 4333.64638, 1554.15912 43...",file
7015118d-2947-48e3-baf0-4b220a76a951,tmaCore,B-3,False,"POLYGON ((7311.17300 4228.90087, 7310.18700 42...",file
bf86657f-31f6-40fe-983b-f80c3d75512b,tmaCore,A-1,False,"POLYGON ((1481.82625 908.50338, 1480.83812 939...",file
440d8f00-fb0e-42e7-9f98-30d30adfc8df,tmaCore,A-2,False,"POLYGON ((4173.91650 856.13063, 4172.93050 887...",file


In [56]:
query_res = getattr(xd.regions, "TMA").query('name == "B-1"')

In [79]:
geom = xd.annotations.demo2.iloc[2]["geometry"]
# A MultiPolygon has no single `exterior`; collect the exterior of each part instead
[part.exterior for part in getattr(geom, "geoms", [geom])]

In [80]:
from shapely.geometry.multipolygon import MultiPolygon


In [81]:
print("{MultiPolygon}")

{MultiPolygon}


In [61]:
getattr(xd.regions, "TMA")

id,objectType,name,isMissing,geometry,origin
7ab5d5a6-49bd-4122-bc64-05477bc0207b,tmaCore,B-2,False,"POLYGON ((4299.61025 4213.18862, 4298.62425 42...",file
06ef93c1-f86d-45e6-ad9a-896e254638ea,tmaCore,A-3,False,"POLYGON ((7201.19150 903.26738, 7200.20550 934...",file
7933d3fd-ccd3-46af-8f15-fcc01ec9c128,tmaCore,B-1,False,"POLYGON ((1555.14725 4333.64638, 1554.15912 43...",file
7015118d-2947-48e3-baf0-4b220a76a951,tmaCore,B-3,False,"POLYGON ((7311.17300 4228.90087, 7310.18700 42...",file
bf86657f-31f6-40fe-983b-f80c3d75512b,tmaCore,A-1,False,"POLYGON ((1481.82625 908.50338, 1480.83812 939...",file
440d8f00-fb0e-42e7-9f98-30d30adfc8df,tmaCore,A-2,False,"POLYGON ((4173.91650 856.13063, 4172.93050 887...",file


In [72]:
from shapely import get_coordinates  # needed here; available in shapely >= 2.0

get_coordinates(query_res["geometry"].iloc[0].centroid)

array([[1055.14636063, 4333.64579273]])

In [59]:
get_coordinates(query_res.geometry.centroid)


array([[1055.14636063, 4333.64579273]])

In [55]:
from shapely import get_coordinates

In [42]:
xd.annotations.metadata

{'demo': {'n_annotations': 4,
  'classes': ['Positive', 'Negative'],
  'analyzed': ''},
 'demo2': {'n_annotations': 5,
  'classes': ['Negative', 'Positive', 'Other'],
  'analyzed': ''}}

In [None]:
xd.regions.metadata

{'demo_regions': {'n_annotations': 3,
  'classes': ['Region1', 'Region2', 'Region3'],
  'analyzed': ''},
 'TMA': {'n_annotations': 6,
  'classes': ['B-2', 'A-3', 'B-1', 'B-3', 'A-1', 'A-2'],
  'analyzed': ''}}

In [22]:
xd.viewer.dims.range = ((0.0, 5477.825, 0.5125), (0.0, 1525.9, 0.5125))

In [13]:
xd.viewer.camera

Camera(center=(0.0, 2575.575468754741, 4159.764054695486), zoom=0.08152545395814942, angles=(0.0, 0.0, 90.0), perspective=0.0, mouse_pan=True, mouse_zoom=True)

In [13]:
xd.regions.metadata.keys()

dict_keys(['demo_regions', 'TMA'])

In [23]:
from shapely import get_coordinates

In [24]:
get_coordinates(xd.regions.TMA.iloc[0].geometry.centroid)

array([[3799.61058111, 4213.18920728]])

In [22]:
xd.regions.TMA.iloc[0].geometry.centroid.coords.xy

(array('d', [3799.610581113812]), array('d', [4213.189207276972]))

In [14]:
xd.regions.metadata

{'demo_regions': {'n_annotations': 3,
  'classes': ['Region1', 'Region2', 'Region3'],
  'analyzed': ''},
 'TMA': {'n_annotations': 6,
  'classes': ['B-2', 'A-3', 'B-1', 'B-3', 'A-1', 'A-2'],
  'analyzed': ''}}

In [14]:
"nuclei" in xd.viewer.layers

True

In [16]:
import geopandas

In [18]:
geopandas.read_file(r"C:\Users\ge37voy\Downloads\mixed_regions.geojson")

Unnamed: 0,id,objectType,classification,name,isMissing,geometry
0,3b1e2f98-ec6e-44f2-a895-ab319670286d,tmaCore,,B-1,False,"POLYGON ((4705.88000 23425.06000, 4701.24000 2..."
1,5db6897e-1830-4ec1-8b22-171eefa39eb1,annotation,,,,"POLYGON ((24621.00000 11484.00000, 29181.00000..."
2,c82af7f3-369e-4018-b0cd-9c34ba4882c2,tmaCore,,A-1,False,"POLYGON ((4705.88000 2352.94000, 4701.24000 25..."
3,f52e883f-fe4e-4eac-91e6-1866eca8cf2e,tmaCore,,B-2,False,"POLYGON ((20060.94000 23425.06000, 20056.30000..."
4,ac1f14a8-9901-4159-98c8-5f68f214b926,annotation,"{'name': 'Stroma', 'color': [150, 200, 150]}",,,"POLYGON ((6803.00000 6431.00000, 13925.00000 6..."
5,f6f639c6-62d0-4e3c-ac41-b50d92da7f82,tmaCore,,A-2,False,"POLYGON ((20060.94000 2352.94000, 20056.30000 ..."
6,f785735d-ff05-4d4d-b0d8-0b6c157acbda,tmaCore,,A-3,False,"POLYGON ((35416.00000 2352.94000, 35411.36000 ..."
7,1b57dd5a-87de-4550-9f0e-e8f3b5c2c7bc,tmaCore,,B-3,False,"POLYGON ((35416.00000 23425.06000, 35411.36000..."


In [19]:
geopandas.read_file(r"C:\Users\ge37voy\Github\InSituPy\notebooks\demo_regions\regions-slide_id__sample_id__demo_regions.geojson")

Unnamed: 0,id,objectType,name,geometry
0,2d0da635-c408-459f-9178-839097fe5a98,annotation,Region1,"POLYGON ((7362.00000 6221.00000, 10672.00000 6..."
1,ce6c2342-620d-4f44-be03-68a4454e9b33,annotation,Region2,"POLYGON ((21373.00000 6383.00000, 26418.00000 ..."
2,70a125ec-c53e-469b-8927-efe224e504c1,annotation,Region3,"POLYGON ((9933.00000 12745.00000, 15942.00000 ..."


In [64]:
xd.regions.TMA

id,objectType,name,isMissing,geometry,origin
7ab5d5a6-49bd-4122-bc64-05477bc0207b,tmaCore,B-2,False,"POLYGON ((4299.61025 4213.18862, 4298.62425 42...",file
06ef93c1-f86d-45e6-ad9a-896e254638ea,tmaCore,A-3,False,"POLYGON ((7201.19150 903.26738, 7200.20550 934...",file
7933d3fd-ccd3-46af-8f15-fcc01ec9c128,tmaCore,B-1,False,"POLYGON ((1555.14725 4333.64638, 1554.15912 43...",file
7015118d-2947-48e3-baf0-4b220a76a951,tmaCore,B-3,False,"POLYGON ((7311.17300 4228.90087, 7310.18700 42...",file
bf86657f-31f6-40fe-983b-f80c3d75512b,tmaCore,A-1,False,"POLYGON ((1481.82625 908.50338, 1480.83812 939...",file
440d8f00-fb0e-42e7-9f98-30d30adfc8df,tmaCore,A-2,False,"POLYGON ((4173.91650 856.13063, 4172.93050 887...",file


In [None]:
xd.read_annotations

In [9]:
import pandas as pd

In [None]:
pd.read_csv()

In [8]:
xd

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium
    ➤ images
       HE:	(25778, 35416, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 167780 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
       boundaries
           BoundariesData object with 2 entries:
               cellular
               nuclear

In [10]:
xd.read_annotations()

<bound method XeniumData.read_annotations of XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium
    ➤ images
       HE:	(25778, 35416, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 167780 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
       boundaries
           BoundariesData object with 2 entries:
               cellular
               nuclear>

In [9]:
xd

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium
    ➤ images
       HE:	(25778, 35416, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 167780 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
       boundaries
           BoundariesData object with 2 entries:
               cellular
               nuclear

In [13]:
xd.annotations.demo

id,objectType,geometry,name,color,origin
bd3aacca-1716-4df8-91dd-bf8f6413a7bd,annotation,"POLYGON ((8863.00000 10814.00000, 8863.00000 1...",Positive,"[250, 62, 62]",file
69814505-4059-42cd-8df2-752f7eb0810d,annotation,"POLYGON ((13096.00000 12492.00000, 13072.40000...",Positive,"[250, 62, 62]",file
1957cd32-0a21-4b45-9dae-ecf236217140,annotation,"POLYGON ((30975.26000 22938.00000, 30982.00000...",Negative,"[112, 112, 225]",file
19d2197a-1b8e-456f-8223-fba74641ac1c,annotation,"POLYGON ((31165.00000 16408.00000, 31149.00000...",Negative,"[112, 112, 225]",file


In [22]:
# GeoSeries.scale has no `inplace` argument; it returns a new GeoSeries, so assign the result back
xd.annotations.demo["geometry"] = xd.annotations.demo.geometry.scale(xfact=1, yfact=1, origin=(0, 0))

### Visualize and edit annotations using napari

To show all annotation labels, set `annotation_labels="all"`. It is also possible to show only one specific annotation label or a list of labels, e.g. `xd.show(annotation_labels="demo2")`.
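
How the three accepted forms (`"all"`, a single label, or a list of labels) could be resolved into a list is sketched below. The helper `resolve_labels` is hypothetical and only illustrates the behavior described above; it is not InSituPy's implementation:

```python
def resolve_labels(annotation_labels, available):
    """Return the labels to display for 'all', a single label, or a list of labels."""
    if annotation_labels == "all":
        return list(available)
    if isinstance(annotation_labels, str):
        # A single label is treated like a one-element list
        annotation_labels = [annotation_labels]
    unknown = [label for label in annotation_labels if label not in available]
    if unknown:
        raise KeyError(f"Unknown annotation label(s): {unknown}")
    return list(annotation_labels)


available = ["demo", "demo2"]  # labels present in this demonstration
print(resolve_labels("all", available))
print(resolve_labels("demo2", available))
print(resolve_labels(["demo"], available))
```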


In [12]:
xd.show(annotation_labels="demo2")



In [10]:
xd.read_cells()

Reading cells...


In [58]:
xd.show()



In [61]:
xd.viewer.layers['Shapes'].data[0]

array([[ 9828.86495405,  6183.45101857],
       [ 9828.86495405, 10917.9511538 ],
       [13318.43676687, 10917.9511538 ],
       [13318.43676687,  6183.45101857]])

In [63]:
xd.viewer.layers['Shapes'].data[0] * xd.metadata["pixel_size"]

array([[2088.63380274, 1313.98334145],
       [2088.63380274, 2320.06462018],
       [2830.16781296, 2320.06462018],
       [2830.16781296, 1313.98334145]])
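
The two preceding cells convert shape coordinates from pixels to micrometers by multiplying with the pixel size stored in the metadata (`0.2125` for this dataset). The same arithmetic in plain Python, as a minimal sketch with a hypothetical helper `pixels_to_um`:

```python
PIXEL_SIZE = 0.2125  # µm per pixel, taken from the metadata of this dataset


def pixels_to_um(vertices, pixel_size=PIXEL_SIZE):
    """Scale each vertex from pixel units to micrometers."""
    return [(a * pixel_size, b * pixel_size) for a, b in vertices]


# Vertices of the rectangle shown above, in pixel coordinates
shape_px = [
    (9828.86495405, 6183.45101857),
    (9828.86495405, 10917.9511538),
    (13318.43676687, 10917.9511538),
    (13318.43676687, 6183.45101857),
]
shape_um = pixels_to_um(shape_px)
print(shape_um[0])
```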

In [65]:
xd.cells.matrix.obsm['spatial']

array([[ 847.25991211,  326.19136505],
       [ 826.34199524,  328.03182983],
       [ 848.76691895,  331.74318695],
       ...,
       [7470.15942383, 5119.13205566],
       [7477.73720703, 5128.71281738],
       [7489.3765625 , 5123.19777832]])

In [119]:
xd.show()



In [120]:
xd_cropped = xd.crop(shape_layer="Shapes")

In [121]:
xd_cropped

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium
    ➤ images
       nuclei:	(3680, 4130)
       CD20:	(3680, 4130)
       HER2:	(3680, 4130)
       HE:	(3680, 4130, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 3949 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
       boundaries
           BoundariesData object with 2 entries:
               cellular
               nuclear
    ➤ transcripts
       DataFrame with shape 829113 x 8

In [122]:
xd_cropped.images.HE

[dask.array<getitem, shape=(3680, 4130, 3), dtype=uint8, chunksize=(1024, 1024, 3), chunktype=numpy.ndarray>,
 dask.array<getitem, shape=(1840, 2065, 3), dtype=uint8, chunksize=(1024, 1024, 3), chunktype=numpy.ndarray>,
 dask.array<getitem, shape=(920, 1032, 3), dtype=uint8, chunksize=(574, 697, 3), chunktype=numpy.ndarray>,
 dask.array<getitem, shape=(460, 516, 3), dtype=uint8, chunksize=(460, 348, 3), chunktype=numpy.ndarray>,
 dask.array<getitem, shape=(230, 258, 3), dtype=uint8, chunksize=(230, 258, 3), chunktype=numpy.ndarray>,
 dask.array<getitem, shape=(115, 129, 3), dtype=uint8, chunksize=(115, 129, 3), chunktype=numpy.ndarray>,
 dask.array<getitem, shape=(57, 64, 3), dtype=uint8, chunksize=(57, 64, 3), chunktype=numpy.ndarray>,
 dask.array<getitem, shape=(29, 32, 3), dtype=uint8, chunksize=(29, 32, 3), chunktype=numpy.ndarray>]

In [123]:
xd_cropped.show()



In [83]:
xd_cropped.images.metadata['HE']['OME']['Image']['Pixels']['PhysicalSizeX'] == xd_cropped.images.metadata['HE']['OME']['Image']['Pixels']['PhysicalSizeY']

True

In [None]:
xd

In [69]:
xd_cropped.metadata

{'run_name': 'PREVIEW: Human Breast Cancer',
 'run_start_time': '',
 'region_name': 'Replicate 1',
 'preservation_method': 'ffpe',
 'num_cells': 167780,
 'transcripts_per_cell': 166.0,
 'transcripts_per_100um': 105.9945809453222,
 'cassette_name': 'human_breast_cancer_SIM1',
 'slide_id': '0001879',
 'panel_design_id': 'PD_260',
 'panel_name': 'Breast Cancer Tumor Microenvironment',
 'panel_organism': 'Human',
 'panel_tissue_type': 'Breast',
 'panel_num_targets_predesigned': 280,
 'panel_num_targets_custom': 33,
 'pixel_size': 0.2125,
 'instrument_sn': 'Xenium prototype instrument',
 'instrument_sw_version': 'Development',
 'analysis_sw_version': 'Xenium-1.0.1',
 'experiment_uuid': '',
 'cassette_uuid': '',
 'roi_uuid': '',
 'z_step_size': 3.0,
 'well_uuid': '',
 'images': {'morphology_filepath': 'morphology.ome.tif',
  'morphology_mip_filepath': 'morphology_mip.ome.tif',
  'morphology_focus_filepath': 'morphology_focus.ome.tif',
  'registered_CD20_filepath': '..\\registered_images\\sli

In [57]:
xd_cropped.show()



In [48]:
xd_cropped

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium
    ➤ images
       HE:	(4686, 5794, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 0 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
       boundaries
           BoundariesData object with 2 entries:
               cellular
               nuclear

In [53]:
xd_cropped.viewer.layers['Shapes'].data

[array([[1598.09534724, 1653.58033604],
        [1598.09534724, 3640.90446785],
        [2878.06682197, 3640.90446785],
        [2878.06682197, 1653.58033604]])]

In [21]:
xd_cropped.metadata

{'run_name': 'PREVIEW: Human Breast Cancer',
 'run_start_time': '',
 'region_name': 'Replicate 1',
 'preservation_method': 'ffpe',
 'num_cells': 167780,
 'transcripts_per_cell': 166.0,
 'transcripts_per_100um': 105.9945809453222,
 'cassette_name': 'human_breast_cancer_SIM1',
 'slide_id': '0001879',
 'panel_design_id': 'PD_260',
 'panel_name': 'Breast Cancer Tumor Microenvironment',
 'panel_organism': 'Human',
 'panel_tissue_type': 'Breast',
 'panel_num_targets_predesigned': 280,
 'panel_num_targets_custom': 33,
 'pixel_size': 0.2125,
 'instrument_sn': 'Xenium prototype instrument',
 'instrument_sw_version': 'Development',
 'analysis_sw_version': 'Xenium-1.0.1',
 'experiment_uuid': '',
 'cassette_uuid': '',
 'roi_uuid': '',
 'z_step_size': 3.0,
 'well_uuid': '',
 'images': {'morphology_filepath': 'morphology.ome.tif',
  'morphology_mip_filepath': 'morphology_mip.ome.tif',
  'morphology_focus_filepath': 'morphology_focus.ome.tif',
  'registered_CD20_filepath': '..\\registered_images\\sli

In [22]:
xd.annotations.metadata

{'demo': {'n_annotations': 4,
  'classes': ['Positive', 'Negative'],
  'analyzed': ''},
 'demo2': {'n_annotations': 5,
  'classes': ['Negative', 'Positive', 'Other'],
  'analyzed': ''}}

In [23]:
xd.cells

matrix
    AnnData object with n_obs × n_vars = 167780 × 313
    obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
    var: 'gene_ids', 'feature_types', 'genome'
    obsm: 'spatial'
boundaries
    BoundariesData object with 2 entries:
        cellular
        nuclear

In [24]:
xd.images.metadata

{'HE': {'file': '..\\registered_images\\slide_id__sample_id__HE__registered.ome.tif',
  'shape': (25778, 35416, 3),
  'subresolutions': 7,
  'axes': 'YXS',
  'OME': {'xmlns': 'http://www.openmicroscopy.org/Schemas/OME/2016-06',
   'xmlns:xsi': 'http://www.w3.org/2001/XMLSchema-instance',
   'xsi:schemaLocation': 'http://www.openmicroscopy.org/Schemas/OME/2016-06 http://www.openmicroscopy.org/Schemas/OME/2016-06/ome.xsd',
   'UUID': 'urn:uuid:5b74ab29-9b2b-11ee-a7a5-047bcbbe29a1',
   'Creator': 'tifffile.py 2023.9.26',
   'Image': {'ID': 'Image:0',
    'Name': 'Image0',
    'Pixels': {'ID': 'Pixels:0',
     'DimensionOrder': 'XYCZT',
     'Type': 'uint8',
     'SizeX': '35416',
     'SizeY': '25778',
     'SizeC': '3',
     'SizeZ': '1',
     'SizeT': '1',
     'Interleaved': 'true',
     'SignificantBits': '8',
     'PhysicalSizeX': '0.2125',
     'PhysicalSizeXUnit': 'µm',
     'PhysicalSizeY': '0.2125',
     'PhysicalSizeYUnit': 'µm',
     'Channel': {'ID': 'Channel:0:0',
      'Samp

In [27]:
xd.images.metadata['HE']['OME']['Image']['Pixels']['PhysicalSizeX']

'0.2125'

In [25]:
xd.cells.pixel_size

0.2125

#### Annotation layers

The annotations are added as shapes layers to the layer list. The layer name always starts with a "*" and has the following syntax: `"* Class (Label)"`:

<left><img src="./demo_annotations/napari_layerlist_annotations.jpg" width="300"/></left>

- **Label**: A label for one collection of annotations. It could, for example, indicate who created the annotations or what this collection of annotations focuses on.
- **Class**: Specifies the class of one specific annotation. It could be, for example, the cell type, the morphological structure, or the disease state that was annotated.
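
The layer-name syntax can be split back into class and label with a regular expression. This is a minimal sketch; the helper `parse_layer_name` is hypothetical and not part of InSituPy:

```python
import re

# "* Class (Label)": a leading asterisk, the class, then the label in parentheses
LAYER_PATTERN = re.compile(r"^\*\s+(?P<cls>.+)\s+\((?P<label>[^()]+)\)$")


def parse_layer_name(name):
    """Return (class, label) for an annotation layer name, or None for other layers."""
    match = LAYER_PATTERN.match(name)
    if match is None:
        return None
    return match.group("cls"), match.group("label")


print(parse_layer_name("* Positive (demo)"))
print(parse_layer_name("nuclei"))
```

Layers that do not follow the syntax, such as the image layer `nuclei`, yield `None`, which makes it easy to tell annotation layers apart from image layers.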

#### Add custom annotations using the Annotation Widget

<left><img src="./demo_annotations/napari_annotation_widget.jpg" width="200"/></left>

Clicking the `"Add annotation layer"` button adds a new layer with the above-mentioned syntax. The layer controls on the top left can then be used to add new shapes as annotations:

<left><img src="./demo_annotations/napari_layerconrols_annotations.jpg" width="300"/></left>

An example annotation is shown here:

<left><img src="./demo_annotations/napari_annotation_example.jpg" width="200"/></left>

The annotations can then be stored in the `XeniumData` object using the `store_annotations` function.


In [41]:
xd.store_annotations()

'XeniumData' object has no attribute 'viewer'. Use `.show()` first to open a napari viewer.


UnboundLocalError: local variable 'viewer' referenced before assignment

In [42]:
xd

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	..\demo_dataset
Data folder:	output-XETG00000__slide_id__sample_id
Metadata file:	experiment_modified.xenium
    ➤ images
       HE:	(25778, 35416, 3)
    ➤ cells
       matrix
           AnnData object with n_obs × n_vars = 167780 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
       boundaries
           BoundariesData object with 2 entries:
               cellular
               nuclear
    ➤ annotations
       demo:	4 annotations, 2 classes ('Positive', 'Negative')
       demo2:	5 annotations, 3 classes ('Negative', 'Positive', 'Other')

### Assign annotations to observations

To use the annotations in analyses (e.g. to select only observations within a certain annotation or to compare gene expression between different annotations), one can use the `assign_annotations` function. It adds columns containing the annotation class to `xd.cells.matrix.obs`. Each column is named `annotation-{Label}`; observations that are not part of any annotation within this label contain `NaN`.

In [43]:
xd.assign_annotations()

Assigning label 'demo'...
Assigning label 'demo2'...
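
Conceptually, the assignment is a point-in-polygon test for every observation. The sketch below illustrates the idea with a plain ray-casting test on toy data. The helpers `point_in_polygon` and `assign_classes` and all coordinates are hypothetical; this is not how InSituPy implements `assign_annotations`:

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon given as (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_classes(cells, annotations):
    """Return one class name per cell, or NaN if the cell lies in no annotation."""
    assigned = []
    for x, y in cells:
        cls = float("nan")
        for name, polygon in annotations:
            if point_in_polygon(x, y, polygon):
                cls = name
                break
        assigned.append(cls)
    return assigned


# Toy annotations (class name, polygon vertices) and toy cell centroids
annotations = [
    ("Positive", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ("Negative", [(20, 20), (30, 20), (30, 30), (20, 30)]),
]
cells = [(5, 5), (25, 22), (50, 50)]
classes = assign_classes(cells, annotations)
print(classes)
```

The third cell lies outside both polygons and therefore keeps `NaN`, mirroring the behavior described for observations that are not part of any annotation.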


After assigning the annotations, the labels analyzed here are marked with a tick (✔):

In [45]:
import numpy as np

In [59]:
xd.cells.matrix.obs['annotation-demo'][xd.cells.matrix.obs['annotation-demo'].notnull()].values

array(['Positive', 'Positive', 'Positive', ..., 'Negative', 'Negative',
       'Negative'], dtype=object)

In [66]:
import pandas as pd

In [67]:
pd.notnull(xd.cells.matrix.obs['annotation-demo'].values)

array([False, False, False, ..., False, False, False])

In [48]:
~np.isnan(xd.cells.matrix.obs['annotation-demo'].values)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

In [47]:
xd.cells.matrix.obs['annotation-demo'][~np.isnan(xd.cells.matrix.obs['annotation-demo'].values)]

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

In [32]:
np.sum(~np.isnan(xd.cells.matrix.obs['annotation-demo']))

0

In [68]:
xd.show()

In [61]:
xd.show()

The following cells show examples of how to explore the assigned annotations:

In [44]:
# print number of cells within one annotation
xd.cells.matrix.obs["annotation-demo2"].notna().sum()

9431

In [17]:
# show only observations that were part of this annotation label
xd.cells.matrix.obs[xd.cells.matrix.obs["annotation-demo2"].notna()]

Unnamed: 0,transcript_counts,control_probe_counts,control_codeword_counts,total_counts,cell_area,nucleus_area,annotation-demo,annotation-demo2,annotation-newlabel
4921,281,0,0,281,733.247187,26.010000,,Other,
4922,273,1,0,274,380.576875,30.074063,,Other,
4923,189,2,0,191,285.658437,8.263594,,Other,
4924,212,0,0,212,282.226562,24.068281,,Other,
4925,58,0,0,58,81.823125,4.470469,,Other,
...,...,...,...,...,...,...,...,...,...
165374,96,1,0,97,150.234844,11.063281,Negative,Negative,
165375,379,0,0,379,153.666719,75.681875,Negative,Negative,
165376,101,0,0,101,27.996875,17.836719,Negative,Negative,
165377,472,0,0,472,200.177656,52.652188,Negative,Negative,


## Save results

The cropped and/or processed data can be saved into a folder using the `.save()` function of `XeniumData`.

The resulting folder has the following structure:
```
with_annotations
│   xenium.json
│   xeniumdata.json
│
├───annotations
│       demo.geojson
│
├───boundaries
│       cells.parquet
│       nuclei.parquet
│
├───images
│       morphology_focus.ome.tif
│       slide_id__sample_id__CD20__registered.ome.tif
│       slide_id__sample_id__HER2__registered.ome.tif
│       slide_id__sample_id__HE__registered.ome.tif
│
├───matrix
│       matrix.h5ad
│
└───transcripts
        transcripts.parquet
```
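
A quick way to see which of these subfolders exist in a saved dataset is sketched below, using only the standard library. The helper `list_saved_modalities` is hypothetical, and the path assumes the output directory used in this demonstration:

```python
from pathlib import Path

# Subfolders listed in the tree above; a saved dataset may contain only some of them
EXPECTED_MODALITIES = ["annotations", "boundaries", "images", "matrix", "transcripts"]


def list_saved_modalities(folder):
    """Map each expected subfolder name to whether it exists in `folder`."""
    folder = Path(folder)
    return {name: (folder / name).is_dir() for name in EXPECTED_MODALITIES}


# Assumed location of the saved data; missing folders are simply reported as missing
status = list_saved_modalities("../demo_dataset/with_annotations")
for name, present in status.items():
    print(f"{name}: {'found' if present else 'missing'}")
```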

In [18]:
out_dir = data_dir / "with_annotations"
xd.save(out_dir, overwrite=True)

In [19]:
xd_reloaded = XeniumData(out_dir)

In [20]:
xd_reloaded

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	demo_dataset
Data folder:	with_annotations
Metadata file:	xeniumdata.json

In [21]:
xd_reloaded.read_all()

Reading annotations...
No `boundaries` modality found.
Reading images...
Reading matrix...
No `transcripts` modality found.




In [22]:
xd_reloaded

XeniumData
Slide ID:	slide_id
Sample ID:	sample_id
Data path:	demo_dataset
Data folder:	with_annotations
Metadata file:	xeniumdata.json
    ➤ images
       HE:	(25778, 35416, 3)
    ➤ matrix
       AnnData object with n_obs × n_vars = 167780 × 313
           obs: 'transcript_counts', 'control_probe_counts', 'control_codeword_counts', 'total_counts', 'cell_area', 'nucleus_area', 'annotation-demo', 'annotation-demo2', 'annotation-newlabel'
           var: 'gene_ids', 'feature_types', 'genome'
           obsm: 'spatial'
    ➤ annotations
       demo:	4 annotations, 2 classes ('Positive', 'Negative') ✔
       demo2:	5 annotations, 3 classes ('Negative', 'Positive', 'Other') ✔
       newlabel:	6 annotations, 1 classes ('newclass',) ✔

In [23]:
xd_reloaded.show(annotation_labels="all")