
feat: xenium patch tiling, cellpose/stardist updates, and processing improvements #123

Merged
heylf merged 1 commit into nf-core:dev from an-altosian:pr/xenium-processing-updates
Mar 13, 2026

@an-altosian

@an-altosian an-altosian commented Mar 10, 2026

Summary

  • Xenium patch tiling/stitching: New XENIUM_PATCH_DIVIDE and XENIUM_PATCH_STITCH modules enable spatial tiling of large Xenium datasets for parallel Baysor segmentation, with configurable tile width, overlap, and balanced partitioning. Includes outlier filtering (IQR/Z-score) at stitch time.
  • Cellpose v24 updates: process_high label (240 GB memory), --batch_size 1 and --flow_threshold 0 defaults, output narrowed to *_cp_masks.tif only, zero-cell bail-out patch.
  • StarDist module: New nf-core StarDist module with GPU support, configurable thresholds (prob_thresh, nms_thresh, n_tiles), and stardist_resolift_morphology_ome_tif subworkflow.
  • Segger improvements: GPU auto-detection via task.accelerator, PyTorch Lightning --accelerator auto, updated dataset creation with xenium bundle support.
  • SpatialData fixes: Disable cell/nucleus boundaries in pixel coordinate space to avoid degenerate polygon crash from XeniumRanger outputs with < 4 vertices.
  • Resource tuning: resourceLimits directive in base.config, GPU shared memory configuration, retry strategies for OOM (exit 137) and Xenium Ranger errors.
  • New utility modules: downscale_morphology, upscale_mask, parquet_to_csv, unzip.
  • Proseg tiled subworkflow: proseg_preset_proseg2baysor_tiled for tiled proseg segmentation.
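
As an illustration of the tiling idea in the first bullet, here is a minimal sketch of splitting one coordinate axis into overlapping tiles and assigning transcripts to them. It is not the code of XENIUM_PATCH_DIVIDE: the function names, and the reading of "balanced" as "equal-width tiles no wider than the requested width", are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple


def tile_intervals(lo: float, hi: float, tile_width: float,
                   overlap: float) -> List[Tuple[float, float]]:
    """Split [lo, hi] into equal-width tiles, each padded by `overlap`.

    Hypothetical helper: the number of tiles is chosen so that no core tile
    is wider than `tile_width`, and the padding is clamped to [lo, hi].
    """
    if tile_width <= 0 or overlap < 0 or hi <= lo:
        raise ValueError("need tile_width > 0, overlap >= 0 and hi > lo")
    span = hi - lo
    n_tiles = max(1, math.ceil(span / tile_width))
    width = span / n_tiles  # balanced: every core tile has the same width
    tiles = []
    for i in range(n_tiles):
        start = max(lo, lo + i * width - overlap)
        end = min(hi, lo + (i + 1) * width + overlap)
        tiles.append((start, end))
    return tiles


def assign_to_tiles(xs: Sequence[float],
                    tiles: Sequence[Tuple[float, float]]) -> List[List[int]]:
    """For each tile, list the indices of the points that fall inside it.

    Points in an overlap region are deliberately assigned to both tiles.
    """
    return [[i for i, x in enumerate(xs) if a <= x <= b] for a, b in tiles]


tiles = tile_intervals(0.0, 100.0, tile_width=40.0, overlap=5.0)
members = assign_to_tiles([10.0, 35.0, 50.0, 90.0], tiles)
```

With these inputs the axis is cut into three tiles, and the point at x = 35 lands in both the first and second tile, which is the duplication a stitching step has to resolve.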

Changes

New modules

  • modules/local/xenium_patch/divide/ — Tile-based transcript partitioning
  • modules/local/xenium_patch/stitch/ — Tile stitching with outlier filtering
  • modules/local/utility/downscale_morphology/ — Morphology image downscaling
  • modules/local/utility/upscale_mask/ — Mask upscaling post-segmentation
  • modules/local/parquet_to_csv/ — Parquet to CSV conversion
  • modules/nf-core/stardist/ — StarDist nuclear segmentation
  • modules/nf-core/unzip/ — ZIP decompression
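
The outlier filtering mentioned for the stitch module (IQR/Z-score) can be sketched as follows. Which per-cell quantity the module actually filters on is not stated in this PR, so the example uses a generic list of numeric values; the function names, the 1.5 x IQR fence, and the Z-score threshold are assumptions.

```python
import statistics
from typing import List, Sequence


def iqr_keep_mask(values: Sequence[float], k: float = 1.5) -> List[bool]:
    """True for values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [low <= v <= high for v in values]


def zscore_keep_mask(values: Sequence[float], threshold: float = 3.0) -> List[bool]:
    """True for values whose absolute Z-score is at most `threshold`."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values identical: nothing can be an outlier.
        return [True] * len(values)
    return [abs(v - mean) / std <= threshold for v in values]


# Hypothetical per-cell values with one obvious outlier at the end.
areas = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0]
kept_iqr = [v for v, keep in zip(areas, iqr_keep_mask(areas)) if keep]
kept_z = [v for v, keep in zip(areas, zscore_keep_mask(areas, threshold=2.0)) if keep]
```

On a sample this small the outlier inflates the standard deviation, so a Z-score threshold of 3 would keep it; the example therefore uses 2.0. The IQR fence is less sensitive to that effect.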

Updated modules

  • modules/nf-core/cellpose/ — v24 updates (process_high, *_cp_masks.tif output, zero-cell check)
  • modules/nf-core/xeniumranger/ — Updated import-segmentation, relabel, resegment
  • modules/local/segger/ — GPU auto-detect, xenium bundle dataset creation
  • modules/local/spatialdata/write/ — Degenerate polygon fix
  • modules/local/baysor/run/ — Tiling-aware memory scaling
  • modules/local/proseg/ — Updated resource labels
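
To make "degenerate polygon" concrete: the summary describes boundaries with fewer than 4 vertices as the trigger for the crash. The PR works around this by disabling cell/nucleus boundaries in pixel coordinate space; the sketch below is only an illustration of how such polygons could be detected, not what the spatialdata/write module does. All names are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def is_degenerate(ring: Sequence[Point], min_vertices: int = 4) -> bool:
    """A boundary with fewer than `min_vertices` coordinates is treated as degenerate."""
    return len(ring) < min_vertices


def split_polygons(polygons: Dict[str, List[Point]]):
    """Separate usable boundaries from degenerate ones, keyed by cell ID."""
    valid = {cid: ring for cid, ring in polygons.items() if not is_degenerate(ring)}
    dropped = sorted(cid for cid in polygons if cid not in valid)
    return valid, dropped


# Hypothetical boundaries: one closed square and one two-point "polygon".
boundaries = {
    "cell_1": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)],
    "cell_2": [(0.0, 0.0), (1.0, 1.0)],
}
valid, dropped = split_polygons(boundaries)
```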

New subworkflows

  • subworkflows/local/proseg_preset_proseg2baysor_tiled/
  • subworkflows/local/stardist_resolift_morphology_ome_tif/

Configuration

  • conf/base.config — resourceLimits, GPU labels, retry strategies
  • conf/modules.config — StarDist, cellpose, segger, xenium_patch configs
  • nextflow.config — Environment variables (PYTORCH_CUDA_ALLOC_CONF, NUMBA), singularity timeout

Tests

  • tests/test_xenium_patch/ — Unit tests for divide and stitch transcripts
  • modules/nf-core/stardist/tests/ — nf-test for StarDist module

Test plan

  • nextflow run main.nf -profile test,docker -stub passes
  • nf-test suite passes for new/updated modules
  • pytest passes for xenium_patch unit tests
  • Cellpose runs with --batch_size 1 --flow_threshold 0 and outputs *_cp_masks.tif
  • StarDist runs with configurable thresholds
  • Tiled Baysor segmentation produces correct stitched output
  • SpatialData write handles degenerate polygons without crashing

🤖 Generated with Claude Code

@an-altosian an-altosian force-pushed the pr/xenium-processing-updates branch from f181a3e to 4c8e2a5 March 10, 2026 02:22
@heylf heylf self-requested a review March 10, 2026 17:03
@heylf
Collaborator

heylf commented Mar 10, 2026

I will take a look during the hackathon.

Collaborator

@heylf heylf left a comment

Not many things. Just very minor stuff.

I couldn't fully review your Python scripts for the divide and stitch steps of test_xenium_patch; I assume you know exactly what is happening there. Some parts looked like they could break if the data format is not what the scripts expect. Let's see.


diameter = ${diameter}
diam_mean = ${diam_mean}
scale = diameter / diam_mean # e.g., 9/30 = 0.3
Collaborator

Maybe better to do
scale = min(diameter / diam_mean, 1.0)
to prevent accidental upscaling.
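
A small sketch of the suggested clamp, wrapped in a function so the behaviour is easy to check. The function name and the input validation are assumptions added for the example; only the `min(diameter / diam_mean, 1.0)` expression comes from the review comment.

```python
def downscale_factor(diameter: float, diam_mean: float) -> float:
    """Scale factor for downscaling, capped at 1.0 so images are never enlarged."""
    if diameter <= 0 or diam_mean <= 0:
        raise ValueError("diameter and diam_mean must be positive")
    return min(diameter / diam_mean, 1.0)


shrink = downscale_factor(9, 30)     # diameter below diam_mean: image shrinks
no_change = downscale_factor(45, 30)  # diameter above diam_mean: capped at 1.0
```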

# Handle multichannel OME-TIFFs: shape can be (H, W), (C, H, W), or (Z, C, H, W)
if img.ndim == 2:
    orig_h, orig_w = img.shape
    new_h = max(int(orig_h * scale), 256)
Collaborator

Why the floor of 256 (the max(..., 256))? Wouldn't that mean you upscale smaller images? Is it that Cellpose breaks on smaller images?
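
The concern can be reproduced with plain arithmetic. The first function mirrors the quoted line; the second is one possible alternative that keeps the 256-pixel floor but never exceeds the original size. The alternative and both function names are assumptions for illustration, not what the PR ended up doing.

```python
def floored_size(orig: int, scale: float, floor: int = 256) -> int:
    """Mirror of the quoted line: the result is never smaller than `floor`."""
    return max(int(orig * scale), floor)


def floored_size_no_upscale(orig: int, scale: float, floor: int = 256) -> int:
    """Hypothetical variant: apply the floor, but never exceed the original size."""
    return min(orig, max(int(orig * scale), floor))


small = floored_size(200, 0.5)                   # a 200 px side becomes 256 px
small_fixed = floored_size_no_upscale(200, 0.5)  # stays at 200 px
large = floored_size_no_upscale(1000, 0.5)       # ordinary downscale to 500 px
```

If Cellpose really does need at least 256 pixels per side, the first behaviour may be intentional; the variant only matters if upscaling small images is undesirable.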

segmentation_csv,
polygons2d,
[], [], [],
ch_coordinate_space.val)
Collaborator

I think using .val on a channel has not been good practice since DSL2.

Better to just do:

.map { meta, bundle, segmentation_csv, polygons2d ->
    tuple(meta, bundle,
        segmentation_csv,
        polygons2d,
        [], [], [],
        "microns")
}


output:
tuple val(meta), path("${prefix}/filtered_transcripts.parquet"), emit: transcripts_parquet
tuple val(meta), path("${prefix}/filtered_transcripts.csv"), emit: transcripts_parquet
Collaborator

Misleading here: you output a .csv but emit it as transcripts_parquet.

Collaborator

I think this can be removed

…improvements

- Add xenium patch tiling/stitching for parallel Baysor segmentation
- Add StarDist nf-core module with GPU support and configurable thresholds
- Update Cellpose to v24 defaults (process_high, batch_size 1, flow_threshold 0, *_cp_masks.tif output)
- Add GPU auto-detection via task.accelerator for Cellpose and Segger
- Fix degenerate polygon crash in SpatialData write for XeniumRanger boundaries
- Add resourceLimits, GPU shared memory config, and retry strategies in base.config
- Add new utility modules: downscale_morphology, upscale_mask, parquet_to_csv, unzip
- Add proseg tiled subworkflow and stardist resolift subworkflow
@an-altosian an-altosian force-pushed the pr/xenium-processing-updates branch from 4c8e2a5 to 74588c5 March 11, 2026 16:05
@an-altosian
Author

an-altosian commented Mar 11, 2026

Thank you for all the comments. I replied to two and addressed all the others.

One comment on the Baysor tiling and stitching subworkflow: this strategy only lets Baysor run without exhausting the 1 TB of RAM. In my tests, the IoU of the resulting cell masks is < 70%. This is another reason I suggest making Proseg and Cellpose the defaults rather than Baysor. Without any tweaks, Proseg processes entire samples within 1 hour, much faster even than Segger (4 hours max).
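
For readers unfamiliar with the metric: IoU (intersection over union) of two binary masks is the number of pixels set in both divided by the number set in either. The comment does not say how the comparison was set up (which masks were paired, or at what resolution), so the sketch below only shows the metric itself on two tiny hypothetical masks; the function name is an assumption.

```python
from typing import Sequence


def mask_iou(a: Sequence[Sequence[int]], b: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two equally shaped binary masks."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("masks must have the same shape")
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += int(bool(pa) and bool(pb))
            union += int(bool(pa) or bool(pb))
    # Two empty masks are treated as identical by convention here.
    return inter / union if union else 1.0


mask_a = [[1, 1, 0],
          [1, 1, 0]]
mask_b = [[0, 1, 1],
          [0, 1, 1]]
iou = mask_iou(mask_a, mask_b)  # 2 shared pixels out of 6 covered pixels
```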

@heylf
Collaborator

heylf commented Mar 11, 2026

Yeah, I agree. I will change it immediately after the PR is merged.

@heylf
Collaborator

heylf commented Mar 11, 2026

I will also add a comment to the README about Baysor's runtime and memory usage.

@an-altosian
Author

Actually, lemme draft something. I have fully tested the runtime using an internal dataset on Nextflow Tower.

@heylf
Collaborator

heylf commented Mar 11, 2026

That would be great if you have something ready. Then it's more concrete.

@an-altosian
Author

an-altosian commented Mar 11, 2026

This is the final version.
Tested on the same set of real Xenium datasets.

| Tool | Compute | Runtime (min / med / max) | Peak RSS (min / med / max) |
|---|---|---|---|
| Cellpose | GPU | 1m / 4m / 1.4h | 10 GB / 26 GB / 554 GB |
| Cellpose | CPU | 1.3h / 2.3h / 6.5h | 161 GB / 426 GB / 1115 GB |
| StarDist | GPU | 1m / 4m / 7m | 5 GB / 12 GB / 18 GB |
| StarDist | CPU | 5m / 6m / 7m | 18 GB / 18 GB / 18 GB |
| Segger (create_dataset) | GPU | 2m / 9m / 31m | 1.7 GB / 14 GB / 50 GB |
| Segger (create_dataset) | CPU | 13m / 21m / 46m | 13 GB / 19 GB / 49 GB |
| Segger (train) | GPU | 10m / 43m / 2.9h | 30 GB / 33 GB / 60 GB |
| Segger (predict) | GPU | 2m / 16m / 59m | 10 GB / 25 GB / 87 GB |
| Baysor (whole-image) | CPU | 2m / 30m / 17h | 6 GB / 10 GB / 650 GB |
| Baysor (tiled) | CPU | 1m / 18m / 13h | 0.2 GB / 34 GB / 530 GB |
| Proseg | CPU | 1m / 18m / 6.8h | 279 MB / 3.8 GB / 136 GB |
| XeniumRanger (resegment) | CPU | 18m / 39m / 3.7h | 28 GB / 54 GB / 60 GB |
| XeniumRanger (import_seg) | CPU | 2m / 7m / 2.7h | 2.6 GB / 11 GB / 51 GB |
| Ficture (preprocess) | CPU | 3m / 4m / 13m | 331 MB / 357 MB / 21 GB |

  • Cellpose GPU vs CPU: 35x faster on GPU (4m median vs 2.3h), 16x less memory (26 GB vs 426 GB)
  • Segger: Only tool that truly requires GPU for all 3 steps (create_dataset, train, predict)
  • StarDist: Very fast on CPU, GPU is not necessary to run its default model
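
The Cellpose GPU vs CPU ratios quoted above follow from the median columns of the table. A quick check, using a small hypothetical parser for the "4m" / "2.3h" notation used in the table:

```python
def to_minutes(text: str) -> float:
    """Convert a duration such as '4m' or '2.3h' to minutes."""
    value, unit = float(text[:-1]), text[-1]
    if unit == "m":
        return value
    if unit == "h":
        return value * 60.0
    raise ValueError(f"unsupported unit in {text!r}")


# Median values copied from the Cellpose rows of the table.
runtime_ratio = to_minutes("2.3h") / to_minutes("4m")  # CPU median / GPU median
memory_ratio = 426 / 26                                # CPU median GB / GPU median GB
```

This gives a runtime ratio of roughly 34.5 and a memory ratio of roughly 16.4, consistent with the rounded "35x" and "16x" in the comment.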

@heylf heylf mentioned this pull request Mar 13, 2026
9 tasks
Collaborator

@heylf heylf left a comment

LGTM

@heylf
Collaborator

heylf commented Mar 13, 2026

Will be merged for now; we can build on top of it from there.

@heylf heylf merged commit 64a6468 into nf-core:dev Mar 13, 2026
24 of 32 checks passed
an-altosian added a commit to an-altosian/spatialxe that referenced this pull request Mar 20, 2026
…napshots

- Restore extract_dapi and convert_mask_uint32 modules lost during PR nf-core#123 merge
- Remove flows/cells outputs from cellpose meta.yml to match patched main.nf
- Update cellpose.diff to include meta.yml patch alongside main.nf patch
- Restore stardist.diff patch file (process_high label change)
- Update proseg snapshot for v3.1.0 container versions.yml hash
- Refresh all workflow test snapshots for current pipeline output structure

All 21 nf-tests pass with --profile=+docker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>