# 1. Run segmentation 

We segment the dataset into cells using Sopa. Further information is available in the [Sopa documentation](https://gustaveroussy.github.io/sopa).

## Import packages

In [1]:
import spatialdata as sd
import sopa 
import troutpy
import scanpy as sc

import sys



## Set up paths

In [3]:
sys.path.insert(0, "../../")  # this depends on the notebook depth and must be adapted per notebook

from _paths import PROJECT_DIR, RESULTS_DIR, xenium_path_cropped

## Read SpatialData dataset

We read the cropped SpatialData object that we created in ``0.format_xenium_sdata.ipynb``.

In [4]:
import os
os.getcwd()

'/ictstr01/home/icb/francesca.drummer/1-Projects/troutpy/notebooks/spatialdata_tutorials'

In [5]:
xenium_path_cropped

PosixPath('/lustre/groups/ml01/datasets/projects/2025_sergio_troutpy/example_datasets/mousebrain_prime_crop_communication.zarr')

In [6]:
os.listdir('/lustre/groups/ml01/datasets/projects/2025_sergio_troutpy/')

['example_datasets', 'mousebrain_prime_crop.zarr']

In [7]:
ls -lh /lustre/groups/ml01/datasets/projects/2025_sergio_troutpy/mousebrain_prime_crop_communication.zarr

ls: cannot access '/lustre/groups/ml01/datasets/projects/2025_sergio_troutpy/mousebrain_prime_crop_communication.zarr': No such file or directory
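The failed listing above comes from a path that does not match `xenium_path_cropped`, which points inside `example_datasets`. As a minimal sketch, a small check like the following can confirm that a store path exists before reading it. The helper `describe_store` is hypothetical (it is not part of spatialdata or Sopa), and the demonstration uses a temporary directory as a stand-in for the real dataset root; it also assumes the Zarr store is saved as a directory on disk.

```python
from pathlib import Path
import tempfile


def describe_store(path):
    """Return a short status string for a candidate Zarr store path (hypothetical helper)."""
    path = Path(path)
    if not path.exists():
        return f"missing: {path}"
    if not path.is_dir():
        # Assumption: the store is an on-disk directory, as in this tutorial.
        return f"not a directory: {path}"
    n_entries = sum(1 for _ in path.iterdir())
    return f"ok: {path} ({n_entries} top-level entries)"


# Demonstration on a temporary directory standing in for the dataset root.
with tempfile.TemporaryDirectory() as root:
    store = Path(root) / "example_datasets" / "demo.zarr"
    store.mkdir(parents=True)
    status_ok = describe_store(store)                         # path that exists
    status_missing = describe_store(Path(root) / "demo.zarr")  # wrong parent folder
    print(status_ok)
    print(status_missing)
```

In the notebook, the same check could be applied to `xenium_path_cropped` directly before calling `sd.read_zarr`.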


In [9]:
sdata = sd.read_zarr(xenium_path_cropped)

Next, we create the image patches needed for segmentation.

In [11]:
sopa.make_image_patches(sdata) # creating overlapping patches

[INFO] (sopa.patches._patches) Added 6 patche(s) to sdata['image_patches']
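To illustrate what overlapping patches mean, here is a minimal sketch of tiling an image with fixed-size patches that share a margin with their neighbours. It is not Sopa's actual implementation: the function `patch_origins`, the image size, the patch width, and the overlap are all made-up values chosen for illustration.

```python
from itertools import product


def patch_origins(length, patch_width, overlap):
    """Start offsets of patches of size patch_width that cover [0, length)."""
    if patch_width <= overlap:
        raise ValueError("patch_width must be larger than overlap")
    if length <= patch_width:
        return [0]  # a single patch already covers the whole axis
    step = patch_width - overlap
    origins = list(range(0, length - patch_width, step))
    origins.append(length - patch_width)  # last patch sits flush with the edge
    return origins


# Hypothetical image of 3000 x 2000 pixels, 1200-pixel patches, 200-pixel overlap.
xs = patch_origins(3000, patch_width=1200, overlap=200)
ys = patch_origins(2000, patch_width=1200, overlap=200)

# Each patch is (x_min, y_min, x_max, y_max) in pixel coordinates.
patches = [(x, y, x + 1200, y + 1200) for x, y in product(xs, ys)]
print(len(patches), "patches")
```

The overlap matters because a cell cut by one patch border is fully contained in the neighbouring patch, so its segmentation can be recovered when the per-patch results are merged.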


In this example, we run Cellpose segmentation on the DAPI and 18S channels, then aggregate the data per cell into a table.

In [None]:
sopa.segmentation.cellpose(sdata, ["DAPI", "18S"], diameter=2)  # run Cellpose segmentation on the image patches
sopa.aggregate(sdata, shapes_key="cellpose_boundaries", key_added="table", gene_column="feature_name")  # build the per-cell table

  0%|          | 0/6 [00:00<?, ?it/s]
100%|██████████| 25.3M/25.3M [00:01<00:00, 16.7MB/s]

100%|██████████| 3.54k/3.54k [00:00<00:00, 7.20MB/s]
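Conceptually, the aggregation step assigns each transcript to the cell boundary that contains it and counts transcripts per gene and per cell. The following is a simplified sketch of that idea in plain Python, not Sopa's implementation: the helper names, the square "cells", the coordinates, and the gene names are all made up for illustration.

```python
from collections import Counter


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_transcripts(cells, transcripts):
    """Count transcripts per gene for each cell; unassigned transcripts are dropped."""
    counts = {cell_id: Counter() for cell_id in cells}
    for x, y, gene in transcripts:
        for cell_id, polygon in cells.items():
            if point_in_polygon(x, y, polygon):
                counts[cell_id][gene] += 1
                break  # a transcript belongs to at most one cell
    return counts


# Two made-up square cells and four made-up transcripts (x, y, gene).
cells = {
    "cell_0": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "cell_1": [(5, 0), (9, 0), (9, 4), (5, 4)],
}
transcripts = [
    (1.0, 1.0, "Gad1"),
    (2.0, 3.0, "Gad1"),
    (6.0, 2.0, "Slc17a7"),
    (4.5, 2.0, "Gad1"),  # falls between the two cells
]
counts = count_transcripts(cells, transcripts)
print(counts)
```

In the notebook, `gene_column='feature_name'` plays the role of the gene label in each transcript tuple, and `shapes_key="cellpose_boundaries"` plays the role of the `cells` dictionary.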


We plot the locations of the segmented cells.

In [None]:
sc.pl.spatial(sdata.tables["table"], spot_size=10, palette="viridis")

Finally, we save the SpatialData object as a Zarr store.

In [None]:
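# Machine-specific output path; adapt it to your environment.
# Note that this reassigns xenium_path_cropped, which previously held the input path.
xenium_path_cropped = "/media/sergio/Discovair_final/mousebrain_prime_crop_points2regions.zarr"
sdata.write(xenium_path_cropped, overwrite=True)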