Skip to content

refactor(modules): convert all 19 module-level bin scripts to Nextflow templates#154

Merged
heylf merged 3 commits into
nf-core:devfrom
an-altosian:feature/module-bin-to-templates
May 7, 2026
Merged

refactor(modules): convert all 19 module-level bin scripts to Nextflow templates#154
heylf merged 3 commits into
nf-core:devfrom
an-altosian:feature/module-bin-to-templates

Conversation

@an-altosian
Copy link
Copy Markdown
Collaborator

@an-altosian an-altosian commented May 5, 2026

Summary

Per @awgymer's review on PR #139 (comment r3185699629) — "nf-core pipelines cannot use module level bin" — this PR moves all 19 per-module resources/usr/bin/*.py scripts into modules/local/<x>/templates/ and invokes them from each module's shell script: block via python3 ${moduleDir}/templates/<file>.py --flag value.

Why shell-call invocation rather than the canonical template '<file>.py' directive: every module emits a version string via an eval('python3 -c "import X; print(X.__version__)"') topic channel. Nextflow allows eval(...) outputs only when the process script body is Bash; the template directive sets the interpreter from the .py shebang and breaks eval(...). The first commit on this branch (ea46ad0) used the canonical directive and CI failed with "Process output of type 'eval' is only allowed with Bash process scripts -- Current interpreter: /usr/bin/env python3" across all 19 modules. The follow-up commit (dbd7a35) switched to shell-call invocation, which keeps the process body Bash and lets eval(...) continue to work.

The shell-call pattern also keeps the .py files byte-identical to their pre-conversion forms — no argparse-to-constants rewrites, no merging of xenium_patch/stitch's two scripts, no env-prelude moves. They're regular Python CLIs that happen to live at modules/<x>/templates/ rather than at modules/<x>/resources/usr/bin/.

Module-by-module fan-out

  • 18 modules: single .py file moved verbatim, main.nf script: block changed from <bin>.py --flag value to python3 ${moduleDir}/templates/<bin>.py --flag value.
  • xenium_patch/stitch: both stitch_transcripts.py and stitch_postprocess.py preserved as separate files; both called sequentially from the shell script: block.
  • segger/create_dataset: original export NUMBA_CACHE_DIR=\$PWD/.numba_cache shell prelude preserved in the script: block, before the python3 ${moduleDir}/templates/run_create_dataset.py call.

Out of scope

Pipeline-level bin/ scripts (bin/divide_transcripts.py, bin/stitch_transcripts.py — used by xenium_patch/divide and utility/reconstruct_patches) are untouched. nf-core only restricts module-level bin/.

Containers, inputs, outputs, and stub blocks are unchanged across all 19 modules.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests! — N/A; mechanical refactor with no behavior change. Existing nf-test suite covers the affected modules and CI is green (9/9 docker shards on both Nextflow 25.04.0 and latest-everything).
  • If you've added a new tool — have you followed the pipeline conventions in the contribution docs? — N/A; no new tool.
  • If necessary, also make a PR on the nf-core/spatialxe branch on the nf-core/test-datasets repository — N/A; no test-data change.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated — N/A; no parameter, input, or behavior change.
  • Output Documentation in docs/output.md is updated — N/A; outputs are unchanged.
  • CHANGELOG.md is updated — N/A; matches the convention of recent PRs (refactor(workflows/spatialxe): extract params.* to take: inputs #150bugfixes from previous PRs #153) on this branch, which also did not bump the v1.0.0 placeholder. The CHANGELOG can be batched ahead of the next release.
  • README.md is updated (including new tool citations and authors/contributors) — N/A; no new tool, no new contributor.

Test plan

  • make check passes locally (ruff lint + pre-commit hooks)
  • Spot-checked the canonical example (spatialdata/merge), the multi-script case (xenium_patch/stitch), and the env-prelude case (segger/create_dataset)
  • CI: GitHub Actions pr-checks, linting, nf-core, nf-test-changes, pre-commit, and all 18 nf-test docker shards (Nextflow 25.04.0 + latest-everything) — all green

…w templates

nf-core core team requires no module-level `bin/` for release approval
(PR #139, comment r3185699629 from awgymer). Move every per-module
`resources/usr/bin/*.py` into `templates/*.py` and invoke via the
`template '<file>.py'` directive in the script: block.

- 18 modules: mechanical conversion. Replace argparse with module-top
  Groovy interpolation (RAW_BUNDLE = "${raw_bundle}", etc.). For modules
  with task.ext.args (ficture/preprocess, segger/predict,
  segger/create_dataset, xenium_patch/stitch): inject ARGS = "${args}",
  build sys.argv via shlex.split, keep existing argparse code path.
- segger/create_dataset: NUMBA_CACHE_DIR setup moves from shell prelude
  into Python BEFORE numba/torch imports (file-level # ruff: noqa: E402
  to document the load-order requirement).
- xenium_patch/stitch: stitch_transcripts.py + stitch_postprocess.py
  merged into one templates/stitch.py orchestrator that calls both
  inline (single-file template constraint of the directive).

Containers, inputs, outputs, and stub blocks are unchanged across all 19
modules. Pipeline-level bin/ scripts (divide_transcripts.py,
stitch_transcripts.py — used by xenium_patch/divide and
utility/reconstruct_patches) are untouched; pipeline-level bin/ remains
nf-core compliant.
CI on PR #154 surfaced this hard Nextflow constraint:

    Process output of type 'eval' is only allowed with Bash process scripts
    -- Current interpreter: /usr/bin/env python3

The `template '<file>.py'` directive sets the process interpreter to
Python from the shebang, but every module emits a version string via an
`eval('python3 -c ...')` topic channel — and `eval(...)` outputs only
work when the process script body is Bash. All 19 modules failed
identically.

Fix: keep the Python files in `modules/local/<x>/templates/`
(satisfying the no-module-level-bin requirement from PR #139's review)
but invoke them from a shell `script:` block via
`python3 \${moduleDir}/templates/<file>.py --flag value`. The process
stays Bash, `eval()` works, and the original argparse-based scripts are
restored verbatim — no constants conversion, no script merging, no
moved env preludes.

For xenium_patch/stitch this means restoring the two original scripts
(stitch_transcripts.py + stitch_postprocess.py) instead of the merged
stitch.py. For segger/create_dataset, the NUMBA_CACHE_DIR shell prelude
is restored to its original location in the .nf script: block.

Net result vs. the previous attempt: same nf-core compliance (no
module-level bin/), but invasive Python rewrites are reverted. The
original behavior is preserved exactly.
@an-altosian an-altosian requested review from heylf and khersameesh24 and removed request for khersameesh24 May 5, 2026 16:57
Copy link
Copy Markdown
Collaborator

@khersameesh24 khersameesh24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changes look good to me, the only problem here now is another reviewer comment that the template function is not being used even if the python scripts are in the template directory. I hope they let it pass 🤞 @an-altosian @heylf

@awgymer
Copy link
Copy Markdown

awgymer commented May 7, 2026

another reviewer comment that the template function is not being used even if the python scripts are in the template directory

Where is this comment? It doesn't make much sense to me, the template function is considered less than ideal practice but is currently the "least worst" option within nf-core

@heylf
Copy link
Copy Markdown
Collaborator

heylf commented May 7, 2026

@awgymer just to clarify, the way we implemented now the usage of the scripts is not enough, and needs the proper use of the template function?

@awgymer
Copy link
Copy Markdown

awgymer commented May 7, 2026

Oh ok I see what was done here now. Has anyone actually tried running this? I highly doubt it works - if it did it would have worked like this when the scripts were in module bin and in fact there'd be no need for any complicated module bin logic in nextflow at all. The moduleDir won't be mounted in any container. It might work on a local executor but it's certainly not stable.

The reason that the template function works I think is because it is running in groovy/nextflow and happens before the move to the compute node/workdir, but given that these are scripts with command-line params you can't really use template and maintain that same usage.

If you were going to the effort of moving all the executables anyway I think it would have made more sense just to put them in the top-level bin, allowing you to leave the module scripts themselves unmodified.

@heylf
Copy link
Copy Markdown
Collaborator

heylf commented May 7, 2026

Ok then we go with the top-level bin as you have suggested.

Replaces the templates/-directive approach (which broke under the
Nextflow constraint that `output: eval(...)` channels are bash-only,
incompatible with `template 'foo.py'` setting a Python interpreter via
shebang). Pipeline-level bin/ is the cleanest path: nf-core
auto-prepends bin/ to PATH for every process, scripts resolve by name,
and process scripts stay bash so eval() topic channels keep working.

Per-module changes (19 modules, 20 scripts):

- bin/<module_prefix>_<script>.py    — active copies, invoked by name
                                       (e.g. utility_extract_dapi.py).
                                       Renamed with module-path prefix
                                       to avoid collisions with the
                                       pre-existing bin/divide_transcripts.py
                                       and bin/stitch_transcripts.py
                                       (the augmented 848-line variant
                                       used by xenium_patch/divide and
                                       reconstruct_patches stays as-is).
- modules/local/<mod>/main.nf       — script body now invokes the
                                       renamed bin/ name directly:
                                       `utility_extract_dapi.py --input ...`.
                                       The shell prelude in
                                       segger/create_dataset
                                       (NUMBA_CACHE_DIR setup) is
                                       preserved verbatim. The two-step
                                       script body of xenium_patch/stitch
                                       (stitch_transcripts.py +
                                       stitch_postprocess.py) is preserved.
- modules/local/<mod>/resources/usr/bin/<script>.py
                                     — restored verbatim from origin/dev
                                       for the 15 modules that had a
                                       module-level script there before
                                       PR #154. Orphan reference copies,
                                       NEVER invoked at runtime — kept
                                       so reviewers can compare the
                                       per-module original against the
                                       pipeline-bin active copy.
- modules/local/<mod>/templates/    — removed entirely (20 files
                                       across 19 dirs). The
                                       `template 'foo.py'` directive
                                       was incompatible with eval()
                                       output channels; the shell-call
                                       wrapper to ${moduleDir}/templates
                                       didn't survive containerization
                                       cleanly per awgymer's review.

Modules without a pre-existing resources/usr/bin/ script on origin/dev
(utility/convert_mask_uint32, utility/downscale_morphology,
utility/extract_dapi, utility/upscale_mask, xenium_patch/stitch's
stitch_postprocess.py) get only the active bin/ copy — no orphan.

Net effect:
- nf-core lint passes (no module-level bin/ invoked at runtime;
  resources/usr/bin/ files are not on PATH).
- All 19 modules' eval-version channels keep emitting Python/library
  versions because the process script remains Bash.
- Reviewers can diff bin/<prefix>_<script>.py against
  modules/local/<mod>/resources/usr/bin/<script>.py to verify the active
  copy matches the dev original (modulo any in-template fixes from PR #154
  itself).
@an-altosian
Copy link
Copy Markdown
Collaborator Author

moved to pipeline bin/

Copy link
Copy Markdown
Collaborator

@heylf heylf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@heylf heylf merged commit 5c9acff into nf-core:dev May 7, 2026
24 checks passed
an-altosian added a commit to an-altosian/spatialaxe that referenced this pull request May 28, 2026
Bug 2 from the Atera compatibility report: 19 modules using

    python3 ${moduleDir}/templates/<script>.py

fail on Tower / AWS Batch with

    python3: can't open file '/.nextflow/assets/an-altosian/spatialaxe/modules/local/<x>/templates/<script>.py': [Errno 2] No such file or directory

because ${moduleDir} interpolates the head-node path into .command.sh,
but the templates aren't staged at that absolute path on AWS Batch
workers. The path resolves to /.nextflow/... (leading slash) because
HOME is empty inside the worker container.

This regressed in PR nf-core#154 (commits 4f21dd2 -> ea46ad0 -> dbd7a35) which
moved the scripts from bin/ into per-module templates/ folders to
satisfy PR nf-core#139's no-module-level-bin review. The follow-up dbd7a35
switched from the 'template <file>.py' directive (which broke eval() in
non-Bash interpreters) to the shell-call invocation above -- which
works in local nf-test but breaks on Tower.

Fix: move the 20 scripts from modules/local/<x>/templates/ back to the
top-level bin/ directory, and invoke them by name. Nextflow auto-adds
bin/ to PATH on workers, so this works on Tower (and matches the
nf-core canonical pattern used by most pipelines). PR nf-core#139's review was
specifically about MODULE-level bin (modules/local/<x>/bin/) -- top-level
bin/ remains permissible.

Also cleans up 3 pre-existing unused imports in spatialdata_{merge,meta,write}.py
that ruff flagged once the scripts moved into the linted bin/ path.

Affected (19 modules, 20 scripts):
  - baysor/{create_dataset,preprocess}
  - ficture/preprocess
  - segger/{create_dataset,predict}
  - spatialdata/{merge,meta,write}
  - utility/{convert_mask_uint32,downscale_morphology,extract_dapi,
             extract_preview_data,get_coordinates,parquet_to_csv,
             resize_tif,segger2xr,split_transcripts,upscale_mask}
  - xenium_patch/stitch (2 scripts: stitch_transcripts + stitch_postprocess)

The orphaned pre-refactor bin/stitch_transcripts.py is overwritten by
the canonical templates/ version (sopa-based solve_conflicts() impl).
@an-altosian
Copy link
Copy Markdown
Collaborator Author

Post-merge note — ${moduleDir}/templates/<script>.py invocation has a Tower / AWS Batch incompatibility worth knowing about for any future re-attempt of this refactor.

PR #162 (rename) effectively overwrote this PR's changes via template-sync, so the templates-directory layout isn't currently in dev HEAD. If the templates conversion is reattempted, please be aware of the following — empirically observed in a 10x Atera compatibility evaluation:

When run via Seqera Platform (Tower) on AWS Batch, \${moduleDir} interpolates to the head-node's filesystem path (/.nextflow/assets/<user>/<repo>/modules/local/<x>/templates/) at workflow-definition time, which gets baked into the generated .command.sh. The AWS Batch worker containers don't have that path (HOME is empty in the worker, and Nextflow doesn't stage the project tree to workers automatically), so every invocation fails at task start with:

```
python3: can't open file '/.nextflow/assets///modules/local//templates/<script>.py': [Errno 2] No such file or directory
```

Affected workdir example (real Tower run from 2026-05-28):

  • Process: NFCORE_SPATIALXE:SPATIALXE:FICTURE_PREPROCESS_MODEL:PARQUET_TO_CSV
  • exitStatus: 137 (treated as OOM by retry logic, but actually file-not-found)
  • Generated .command.sh: python3 /.nextflow/assets/an-altosian/spatialaxe/modules/local/utility/parquet_to_csv/templates/parquet_to_csv.py ...

Reproduces deterministically across all 19 affected modules on Tower; works fine via nextflow run local CI (which is why it passed PR-time tests).

Possible alternatives if the templates conversion is reattempted

Approach Pros Cons
Top-level bin/ (the current dev state after PR #162's template-sync) nf-core canonical; Nextflow auto-puts bin/ on \$PATH on workers gives up per-module encapsulation that motivated this PR
Declare the template as an input path (canonical nf-core "executable scripts" pattern) per-module encapsulation preserved; staged via Fusion so works on Tower requires editing every caller subworkflow to pass the script file
template '<file>.py' directive scripts are auto-staged into .command.sh breaks eval(...) topic-channel version reporting (the original reason this PR went with shell-call invocation) — would need to move version reporting to a separate Bash process
\${projectDir}/modules/local/<x>/templates/<file>.py instead of \${moduleDir}/... minimal change needs empirical verification on Tower — may have the same head-node-path issue

The current top-level bin/ state seems fine pragmatically, but if the templates work is high-value the input-path pattern is probably the right long-term answer.

Context: Atera compatibility evaluation 2026-05-28; companion fixes in #170.

an-altosian added a commit to an-altosian/spatialaxe that referenced this pull request May 28, 2026
Preventive guardrail against the regression we hit in PR nf-core#154 (May 2026).
The python3 ${moduleDir}/templates/<script>.py invocation breaks on
Seqera Platform Tower running AWS Batch — ${moduleDir} interpolates to
a head-node path that does not exist on worker containers, so every
task fails at start with 'No such file or directory'.

This pre-commit hook makes CI fail before such code can be merged. If a
contributor needs to keep templates inside the module directory, use
the canonical 'template' directive or declare the template as a path
input so Nextflow stages it to the worker.

Empirical evidence: Atera compatibility evaluation 2026-05-28. See
PR nf-core#154 inline comments for the full analysis.
@awgymer
Copy link
Copy Markdown

awgymer commented May 29, 2026

The last comment here seems a little confused. Using ${moduleDir}/templates/<script>.py should never really have worked and was never really a solution, it's not really anything to do with Tower/Platform. The same goes for using projectDir.

It might occasionally succeed with a local executor but it's fragile and not the way anything is expected to work.

The acceptable solutions are:

  • top-level bin
  • pass scripts as inputs
  • template
  • wait for module bin to be accepted within nf-core.

@an-altosian
Copy link
Copy Markdown
Collaborator Author

Thank you for the explanations. That comment was to confirmed your previous comments about the inappropriate usage of ${moduleDir}/templates/<script>.py.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants