Address PR #139 review: segger refactor + 5 targeted fixes#148
Open
an-altosian wants to merge 9 commits intonf-core:devfrom
Open
Address PR #139 review: segger refactor + 5 targeted fixes#148an-altosian wants to merge 9 commits intonf-core:devfrom
an-altosian wants to merge 9 commits intonf-core:devfrom
Conversation
Add inline comment documenting the semantics of each exit-code range in the errorStrategy retry list. Notably, 2147483647 is Integer.MAX_VALUE — Nextflow's sentinel for tasks that died before writing .exitcode (e.g. AWS Batch spot reclamation, kubernetes preemption, grid-scheduler cancellations) — not an AWS-specific exit code. Cite Nextflow docs/aws.md for the AWS case. Addresses PR nf-core#139 review comment r3165322845.
…m() for STARDIST ext.args
Replaces the Groovy-fallthrough `findAll()` (no closure) with the documented
stdlib `join` + `trim` chain. Internal double-spaces from empty-string entries
get collapsed by bash word-splitting at `${args}` interpolation; `.trim()`
removes leading/trailing whitespace.
Matches the existing project idiom used in
subworkflows/local/utils_nfcore_spatialxe_pipeline/main.nf:341,352
(toolCitationText / toolBibliographyText).
Addresses PR nf-core#139 review comment r3165345526.
This Dockerfile was unreferenced by any module: a repo-wide grep for "modules/local/utility/Dockerfile" returns zero hits. Each utility submodule (convert_mask_uint32, extract_dapi, resize_tif, segger2xr, etc.) declares its own container directly via `container "..."`. The associated `conda.yml` it expected (per the COPY instruction) does not exist next to it. Leftover from an earlier development phase; no longer needed. Addresses PR nf-core#139 review comment r3165438419.
…hema_input.json Add nf-schema `format` keys so the validator confirms paths exist: - bundle: format: directory-path - image: format: file-path These validate path existence at runtime, not name conventions — so the team's concern about varying 10x bundle/image naming doesn't conflict. The format key is documented at https://nextflow-io.github.io/nf-schema/latest/nextflow_schema/nextflow_schema_specification/ Addresses PR nf-core#139 review comments r3165291219 (C3) and r3165292878 (C4).
…ll via ext.args
Removes the inline `params.baysor_tiling_min_transcripts_per_cell` reference
from the stitch process script. Instead:
- The XENIUM_PATCH_STITCH withName: block in conf/modules.config sets the
flag via ext.args
- The .nf script declares `def args = task.ext.args ?: ''` and interpolates
`${args}` into the command
Per Florian's narrowing directive, only the cited param at line 40 is fixed.
Other params references in the file (if any) remain as-is for follow-up work.
Addresses PR nf-core#139 review comment r3165442902.
Replaces the three inline `python3 - <<'PYEOF'` heredoc blocks (parquet
column statistics, tile-split workaround, NaN bd.x fix) plus the bundle
symlink shell loop with a single module binary at
resources/usr/bin/run_create_dataset.py.
The .nf script block shrinks from ~190 lines to ~15. The container
(quay.io/dongzehe/segger:1.0.14) stays general — pipeline-specific
workarounds for upstream segger bugs now live in the module binary
where they can be reviewed, tested, and removed when fixed upstream.
Each WORKAROUND function in the binary has a section comment naming
the upstream bug and removal condition.
Pattern matches existing baysor reference modules
(modules/local/baysor/{create_dataset,preprocess}/) which use
resources/usr/bin/ scripts called without ${moduleDir}/ prefix.
Addresses PR nf-core#139 review comment r3165415041 (segger create_dataset only;
predict refactor in next commit; train kept as-is per single-line debug
P2 carve-out).
Replaces inline GPU enumeration (`python3 -c` for torch.cuda.device_count), predict_parquet.py path resolver, and two `sed -i` runtime patches against installed segger source with a single module binary at resources/usr/bin/run_predict.py. The .nf script block shrinks from ~50 lines to ~15. The two sed patches (torch.no_grad() for VRAM savings, deterministic GPU seed) move into a named `patch_predict_parquet()` function with a clear "remove once upstreamed to segger" comment. Container (quay.io/dongzehe/segger:1.0.14) stays general — patches live in the pipeline. Addresses PR nf-core#139 review comment r3165415041 (segger predict).
This was referenced Apr 30, 2026
… with tar.gz test data The format: directory-path / file-path validators added in d59fac3 break CI because: - bundle field accepts EITHER a directory path (real Xenium bundle) OR a .tar.gz archive URL (test data, extracted at runtime by UNTAR). Neither 'directory-path' nor 'file-path' fits both, and both fail against remote URLs. - image field is similar — can be a local OME-TIFF path, a URL, or empty. The reviewer's suggestion in r3165291219 / r3165292878 was reasonable in isolation but conflicts with this pipeline's dual-purpose input semantics where the same field can be a path or a URL or an archive. Reverts the format keys; keeps the type / pattern / errorMessage entries.
4 tasks
5 tasks
…dict The heredocs were inadvertently introduced when extracting the inline Python to module binaries. Project convention (CLAUDE.md "Version Reporting") is topic-channel only — no `versions.yml` files. Topic channels (versions_segger, versions_python, etc.) emit the actual versions; the hardcoded `segger: 0.1.0` heredoc was unused and risked going stale. No other modules/local/* writes a versions.yml file. Aligns the segger modules with the rest of the pipeline.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Addresses the remaining PR #139 review comments after Florian's PR #147 already landed ~15 of the items. This PR contains 7 focused commits, each mapped 1:1 to a specific review thread for clean SHA-to-thread traceability:
7eedabddocs(base.config): explain errorStrategy exit codes— addresses r3165322845. Clarifies that2147483647isInteger.MAX_VALUE, the Nextflow sentinel for tasks killed before writing.exitcode(citingnextflow/processor/TaskRun.groovyand the AWS Batch troubleshooting docs).1585a9dstyle(modules.config): replace findAll().join(' ') with join(' ').trim() for STARDIST ext.args— addresses r3165345526. Matches existing project idiom insubworkflows/local/utils_nfcore_spatialxe_pipeline/main.nf:341,352.74aa917chore: remove orphan modules/local/utility/Dockerfile— addresses r3165438419. Repo-wide grep confirmed zero references.d59fac3feat(schema): add format validators (directory-path, file-path) to schema_input.json— addresses r3165291219 (C3) and r3165292878 (C4). Validates path existence at runtime, not name conventions.cf0d42dfix(xenium_patch/stitch): thread baysor_tiling_min_transcripts_per_cell via ext.args— addresses r3165442902. Per Florian's narrowing directive, only the cited param is fixed.1db7443refactor(segger): extract create_dataset orchestration to module binary— addresses r3165415041 (segger create_dataset). 3 inlinepython3 - <<PYEOFheredocs extracted to a singleresources/usr/bin/run_create_dataset.py(253 lines) following thebaysor/preprocess/reference pattern. The.nfscript block shrinks from ~190 lines to ~15. Each WORKAROUND for upstream segger bugs is named and commented for easy removal when fixed upstream.381c9e3refactor(segger): extract predict orchestration to module binary— addresses r3165415041 (segger predict). GPU enumeration + twosed -iruntime patches againstpredict_parquet.pyextracted toresources/usr/bin/run_predict.py(137 lines). The.nfscript block shrinks from ~50 lines to ~15.Out of scope (separate follow-up PRs)
utility/{convert_mask_uint32,extract_dapi,downscale_morphology,upscale_mask},xenium_patch/stitchpostprocess,parquet_to_csv). Same pattern as the segger refactors here.params.*references inworkflows/spatialxe.nfto formaltake:workflow inputs (addresses r3165543921 more thoroughly than the cited line alone).Test plan
pre-commiton changed files — passes (prettier, trim trailing whitespace, fix end of files all green)nextflow configparses without errors after all 7 commitstest_image_mode(skipped locally due to image-pull timeout; CI will exercise this)Notes for reviewers
nf-test -stubsnapshots should not regenerate.python3 -c "import torch; print(...)"debug print, which falls under principle P2's carve-out for single-line debug prints. Train has no heredocs, no sed patches, and no orchestration logic warranting a binary.