feat: xenium patch tiling, cellpose/stardist updates, and processing improvements #123
f181a3e to 4c8e2a5
I will take a look during the hackathon.
heylf
left a comment
Not many things. Just very minor stuff.
I couldn't fully review your Python scripts for the divide and stitching of test_xenium_patch, since I think you know exactly what is happening there. Some parts looked like they could break if the data is not in the expected format. Let's see.
diameter = ${diameter}
diam_mean = ${diam_mean}
scale = diameter / diam_mean  # e.g., 9/30 = 0.3
maybe better to do
scale = min(diameter / diam_mean, 1.0)
to prevent accidental upscaling.
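A minimal sketch of the suggested clamp, written as a standalone Python function rather than the templated script from the diff. The function name and the input validation are illustrative assumptions, not part of the PR:

```python
def downscale_factor(diameter: float, diam_mean: float) -> float:
    """Resize ratio for the image, capped at 1.0 so it can only shrink."""
    if diameter <= 0 or diam_mean <= 0:
        raise ValueError("diameter and diam_mean must be positive")
    # Without min(), a diameter larger than diam_mean would give a ratio
    # above 1.0 and the image would be upscaled.
    return min(diameter / diam_mean, 1.0)


print(downscale_factor(9, 30))   # 0.3
print(downscale_factor(45, 30))  # 1.0 (clamped from 1.5)
```

With the clamp in place, a diameter above diam_mean leaves the image at its original size.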
# Handle multichannel OME-TIFFs: shape can be (H, W), (C, H, W), or (Z, C, H, W)
if img.ndim == 2:
    orig_h, orig_w = img.shape
    new_h = max(int(orig_h * scale), 256)
Why the 256 minimum? Wouldn't that mean you upscale smaller images? Is it that Cellpose breaks on smaller images?
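To make the question concrete, here is a small sketch of how a max(..., 256) floor behaves. Only the new_h expression appears in the diff; the helper name, the min_side parameter, and the matching new_w line are assumptions for illustration:

```python
def target_size(orig_h, orig_w, scale, min_side=256):
    """Scaled (height, width), with each side floored at min_side pixels."""
    new_h = max(int(orig_h * scale), min_side)
    new_w = max(int(orig_w * scale), min_side)  # assumed to mirror new_h
    return new_h, new_w


print(target_size(4000, 6000, 0.5))  # (2000, 3000): plain downscale
print(target_size(400, 600, 0.5))    # (256, 300): height held at the floor
print(target_size(200, 200, 1.0))    # (256, 256): larger than the input
```

The last case is the one the comment asks about: an input smaller than 256 pixels on a side comes out bigger than it went in, even with a scale of 1.0.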
segmentation_csv,
polygons2d,
[], [], [],
ch_coordinate_space.val)
I think using .val on a channel is not so nice in DSL2.
Better to just do
.map { meta, bundle, segmentation_csv, polygons2d ->
    tuple(meta, bundle,
        segmentation_csv,
        polygons2d,
        [], [], [],
        "microns")
}
output:
tuple val(meta), path("${prefix}/filtered_transcripts.parquet"), emit: transcripts_parquet
tuple val(meta), path("${prefix}/filtered_transcripts.csv"), emit: transcripts_parquet
This is misleading: you output a .csv but emit it as transcripts_parquet.
tests/test_xenium_patch/__init__.py
Outdated
I think this can be removed
…improvements
- Add xenium patch tiling/stitching for parallel Baysor segmentation
- Add StarDist nf-core module with GPU support and configurable thresholds
- Update Cellpose to v24 defaults (process_high, batch_size 1, flow_threshold 0, *_cp_masks.tif output)
- Add GPU auto-detection via task.accelerator for Cellpose and Segger
- Fix degenerate polygon crash in SpatialData write for XeniumRanger boundaries
- Add resourceLimits, GPU shared memory config, and retry strategies in base.config
- Add new utility modules: downscale_morphology, upscale_mask, parquet_to_csv, unzip
- Add proseg tiled subworkflow and stardist resolift subworkflow
4c8e2a5 to 74588c5
Thank you for all the comments. I replied to two and addressed all the others. One comment on the Baysor tiling and stitching subworkflow: this strategy only lets Baysor run without exhausting the 1 TB of RAM. In my tests, the IoU of the resulting cell masks is < 70%. This is another reason I suggest setting the defaults to proseg and Cellpose rather than Baysor. Without any tweaking, proseg processes entire samples within 1 hour, much faster even than Segger (4 hours max).
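For readers unfamiliar with the metric, a rough sketch of one way to compute an IoU between two cell masks. The comment does not say how the IoU was measured or which masks were compared, so the foreground-only definition, the function name, and the toy masks below are all assumptions:

```python
def foreground_iou(mask_a, mask_b):
    """IoU of the non-zero pixels of two equally shaped 2D label masks."""
    same_rows = len(mask_a) == len(mask_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(mask_a, mask_b)):
        raise ValueError("masks must have the same shape")
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            in_a, in_b = a != 0, b != 0
            inter += in_a and in_b   # pixel labelled in both masks
            union += in_a or in_b    # pixel labelled in at least one mask
    # Two empty masks are treated as identical.
    return inter / union if union else 1.0


# Toy 2x3 label masks; 0 is background, other values are cell IDs.
tiled = [[1, 1, 0], [0, 2, 2]]
untiled = [[1, 0, 0], [0, 2, 2]]
print(foreground_iou(tiled, untiled))  # 0.75
```

A per-cell IoU (matching labels between the two masks first) would be stricter than this foreground version and may be closer to what was actually measured.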
Yeah, I agree. I will change it immediately after the PR is merged.
I will also add a note to the README about the runtime and memory usage with Baysor.
Actually, lemme draft something. I have fully tested the runtime using an internal dataset on Nextflow Tower.
That would be great if you have something ready. Then it's more concrete.
This is the final version.
Will be merged for now; we can build on top of it from there.
…napshots
- Restore extract_dapi and convert_mask_uint32 modules lost during PR nf-core#123 merge
- Remove flows/cells outputs from cellpose meta.yml to match patched main.nf
- Update cellpose.diff to include meta.yml patch alongside main.nf patch
- Restore stardist.diff patch file (process_high label change)
- Update proseg snapshot for v3.1.0 container versions.yml hash
- Refresh all workflow test snapshots for current pipeline output structure
All 21 nf-tests pass with --profile=+docker.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- XENIUM_PATCH_DIVIDE and XENIUM_PATCH_STITCH modules enable spatial tiling of large Xenium datasets for parallel Baysor segmentation, with configurable tile width, overlap, and balanced partitioning. Includes outlier filtering (IQR/Z-score) at stitch time.
- process_high label (240 GB memory), --batch_size 1 and --flow_threshold 0 defaults, output narrowed to *_cp_masks.tif only, zero-cell bail-out patch.
- … prob_thresh, nms_thresh, n_tiles), and stardist_resolift_morphology_ome_tif subworkflow.
- … task.accelerator, PyTorch Lightning --accelerator auto, updated dataset creation with xenium bundle support.
- resourceLimits directive in base.config, GPU shared memory configuration, retry strategies for OOM (exit 137) and Xenium Ranger errors.
- downscale_morphology, upscale_mask, parquet_to_csv, unzip.
- proseg_preset_proseg2baysor_tiled for tiled proseg segmentation.
Changes
New modules
- modules/local/xenium_patch/divide/: Tile-based transcript partitioning
- modules/local/xenium_patch/stitch/: Tile stitching with outlier filtering
- modules/local/utility/downscale_morphology/: Morphology image downscaling
- modules/local/utility/upscale_mask/: Mask upscaling post-segmentation
- modules/local/parquet_to_csv/: Parquet to CSV conversion
- modules/nf-core/stardist/: StarDist nuclear segmentation
- modules/nf-core/unzip/: ZIP decompression
Updated modules
- modules/nf-core/cellpose/: v24 updates (process_high, *_cp_masks.tif output, zero-cell check)
- modules/nf-core/xeniumranger/: Updated import-segmentation, relabel, resegment
- modules/local/segger/: GPU auto-detect, xenium bundle dataset creation
- modules/local/spatialdata/write/: Degenerate polygon fix
- modules/local/baysor/run/: Tiling-aware memory scaling
- modules/local/proseg/: Updated resource labels
New subworkflows
- subworkflows/local/proseg_preset_proseg2baysor_tiled/
- subworkflows/local/stardist_resolift_morphology_ome_tif/
Configuration
- conf/base.config: resourceLimits, GPU labels, retry strategies
- conf/modules.config: StarDist, cellpose, segger, xenium_patch configs
- nextflow.config: Environment variables (PYTORCH_CUDA_ALLOC_CONF, NUMBA), singularity timeout
Tests
- tests/test_xenium_patch/: Unit tests for divide and stitch transcripts
- modules/nf-core/stardist/tests/: nf-test for StarDist module
Test plan
- nextflow run main.nf -profile test,docker -stub passes
- … --batch_size 1 --flow_threshold 0 and outputs *_cp_masks.tif
🤖 Generated with Claude Code
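The Summary mentions tiling with a configurable tile width, overlap, and balanced partitioning. As a rough illustration of that idea along one axis, and not the actual XENIUM_PATCH_DIVIDE implementation, the sketch below splits an extent into equal-width cores and pads each with an overlap margin. The function name, parameters, and the reading of "balanced" as equal-width cores are assumptions:

```python
import math


def tile_bounds(min_x, max_x, tile_width, overlap):
    """Split [min_x, max_x] into equal cores no wider than tile_width,
    then pad each core by `overlap` on both sides, clipped to the extent."""
    if tile_width <= 0:
        raise ValueError("tile_width must be positive")
    if overlap < 0 or overlap >= tile_width:
        raise ValueError("overlap must satisfy 0 <= overlap < tile_width")
    span = max_x - min_x
    n_tiles = max(1, math.ceil(span / tile_width))
    step = span / n_tiles  # equal-width cores instead of a short last tile
    tiles = []
    for i in range(n_tiles):
        core_start = min_x + i * step
        core_end = min_x + (i + 1) * step
        tiles.append((max(min_x, core_start - overlap),
                      min(max_x, core_end + overlap)))
    return tiles


print(tile_bounds(0, 900, 400, 50))
# [(0, 350.0), (250.0, 650.0), (550.0, 900)]
```

Transcripts that fall in an overlap margin belong to two tiles, which is why a stitch step has to reconcile duplicate or conflicting cell assignments afterwards.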