Skip to content

fix(modules): address awgymer review comments on PR #139 (strict nf-core compliance)#158

Merged
heylf merged 16 commits into
nf-core:devfrom
an-altosian:worktree-fix-pr139-module-params-strict
May 20, 2026
Merged

fix(modules): address awgymer review comments on PR #139 (strict nf-core compliance)#158
heylf merged 16 commits into
nf-core:devfrom
an-altosian:worktree-fix-pr139-module-params-strict

Conversation

@an-altosian
Copy link
Copy Markdown
Collaborator

Summary

Addresses the latest round of CHANGES_REQUESTED review comments by @awgymer on #139 (2026-05-11 + 2026-05-15). Eliminates params.* reads from process bodies and aligns module-level config with the nf-core modules guidelines:

All non-mandatory command-line tool non-file arguments MUST be provided as a string via the \$task.ext.args variable.

What changed

Module bodies — collapse to \${args} consumption only

  • SEGGER_CREATE_DATASET, SEGGER_TRAIN, SEGGER_PREDICT, PROSEG — module bodies no longer read params.* directly. CLI flags assembled in conf/modules.config via withName: blocks, using the canonical STARDIST pattern (!= null guards + .join(' ').trim()).
  • Drops the non-compliant custom ext.<field> keys (ext.format, ext.devices) — only args, args2, args3, prefix, when are sanctioned per nf-core.
  • Drops the body-side if (params.format in ['xenium']) validation block (relocated to validateInputParameters()).

modules/nf-core/xeniumranger/{import-segmentation,resegment} — back to upstream parity

The locally-patched copies that read params.expansion_distance / params.boundary_stain / params.interior_stain / params.dapi_filter / params.expansion_distance directly (the "big no-no" awgymer flagged) are now byte-identical to upstream nf-core/modules git_sha 39365e944e9. The two .patch files were emptied and removed; modules.json cleaned up. All four flags now flow through ext.args closures in conf/modules.config.

nextflow.config — drop redundant .intValue()

`task.memory.toGiga().intValue()` → `task.memory.toGiga()`. MemoryUnit.toGiga() already returns long; the no-op chain warns under strict syntax.

maxForks — replace silently-inert dynamic directive

maxForks params.restrict_concurrency ? 1 : 0 in two SEGGER modules has been dropped — Nextflow forbids dynamic maxForks, so the directive never actually took effect. Replaced with a config-load-time block in conf/modules.config:

```groovy
if (params.restrict_concurrency) {
process {
withName: SEGGER_CREATE_DATASET { maxForks = 1 }
withName: SEGGER_TRAIN { maxForks = 1 }
}
}
```

This is the nf-core-idiomatic pattern and actually takes effect.

Validation relocation

  • nextflow_schema.json: added \"enum\": [\"xenium\", \"cosmx\", \"merscope\"] to the format param, broadened description (no longer proseg-only).
  • validateInputParameters() (in subworkflows/local/utils_nfcore_spatialxe_pipeline/main.nf) gained a method-aware compatibility check: segger requires xenium; proseg accepts all three. format plumbed through PIPELINE_INITIALISATION take: and the main.nf call site.

Bonus fixes (gated by the same review pass)

  • Version evals: replaced 7 broken python3 -c \"import X; print(X.__version__)\" patterns (in spatialdata/{write,meta,merge}, utility/{downscale_morphology,parquet_to_csv,resize_tif,upscale_mask}, xenium_patch/{divide,stitch}) with pip show X 2>/dev/null | sed -n 's/^Version: //p'. The import-based form crashes when the package is missing OR when it doesn't expose __version__ (the spatialdata case unblocked the local stub test suite).
  • Ruff F401: dropped 3 unused imports (spatialdata in bin/spatialdata_merge.py + bin/spatialdata_write.py, zarr in bin/spatialdata_meta.py).

Comments resolved

Comment Module/file What
3245861758 segger/create_dataset params + dynamic maxForks
3245857609 xeniumranger/import-segmentation params in nf-core module
3245855348 xeniumranger/resegment params in nf-core module
3245874088, 3245874347 nextflow.config .toGiga().intValue() warning
3216295080, 3216300407, 3216300657 segger/train params + dynamic maxForks
3216307528 spatialdata/write params outside entry

Test plan

  • nextflow config -profile docker . parses
  • nextflow inspect main.nf -profile test,docker — process graph clean
  • pre-commit run --all-files passes
  • ruff check bin/ tests/ — clean (was 3 F401 errors before)
  • nf-test tests/coordinate_mode.nf.test — passes (exercises PROSEG, XENIUMRANGER_IMPORT_SEGMENTATION, SPATIALDATA_*)
  • nf-test tests/default.nf.test — passes
  • nf-test tests/segfree_mode.nf.test — passes
  • nf-test tests/image_mode.nf.test — passes (after version-eval fixes)
  • nf-test tests/preview_mode.nf.test — passes (after version-eval fixes)
  • Module-level nf-tests for baysor/proseg with real data — pre-existing failures on upstream/dev unrelated to this PR (verified by reverting and re-running)

Follow-up (not in this PR)

A handful of baysor/xenium_patch modules still use non-compliant task.ext.<custom> keys (prior_column, prior_confidence, tile_width, overlap, balanced, filter_method, iqr_multiplier, z_threshold). Same anti-pattern as what's addressed here. Worth a focused follow-up PR.

nf-core compliance receipts

…e compliance)

Eliminate `params.*` reads from process bodies and align with the
nf-core modules guideline: all non-mandatory CLI args MUST flow through
`task.ext.args`, assembled in `conf/modules.config` as a single closure
keyed by `withName:`. Only the sanctioned `ext.*` keys (`args`,
`args2`, `args3`, `prefix`, `when`) are used.

Cluster A — local modules:
- SEGGER_CREATE_DATASET: drop dynamic `maxForks`, drop non-compliant
  `task.ext.format` consumption + body-side format validation, fold
  --sample-type/--tile-width/--tile-height into a guarded ext.args list.
- SEGGER_TRAIN: drop non-compliant `task.ext.devices` (dead access),
  drop dynamic `maxForks` directive in modules.config, merge the
  args/args2 into one guarded ext.args list, drop the stale
  `${params.devices}` reference from the log echo.
- SEGGER_PREDICT: convert single-string ext.args to guarded list form.
- PROSEG: drop non-compliant `task.ext.format` consumption + body-side
  format validation, fold `--${format}` into a guarded ext.args list.

Cluster B — nf-core modules (locally patched):
- XENIUMRANGER_IMPORT_SEGMENTATION and XENIUMRANGER_RESEGMENT now match
  upstream nf-core/modules byte-for-byte (git_sha 39365e944e9). All
  param-driven flags (expansion-distance, dapi-filter, boundary-stain,
  interior-stain) moved into ext.args closures in modules.config. The
  two `.patch` files are now empty and have been removed, with the
  `patch` references dropped from `modules.json`.

Cluster D — nextflow.config:
- Drop redundant `.intValue()` on `task.memory.toGiga()` (lines 278,
  284). MemoryUnit.toGiga() already returns long; the no-op chain
  warns under strict syntax.

Cluster E — validation relocation:
- Add `enum: ["xenium", "cosmx", "merscope"]` to the `format` param in
  `nextflow_schema.json` (schema-level constraint).
- Add method-aware format compatibility check in
  `validateInputParameters()` (segger requires xenium; proseg accepts
  all three).
- Plumb `format` through PIPELINE_INITIALISATION `take:` and the
  main.nf call site.

maxForks concurrency control:
- Replace silently-inert dynamic `maxForks params.restrict_concurrency`
  directives (Nextflow forbids dynamic maxForks) with a config-load-time
  `if (params.restrict_concurrency) { process { withName: ... { maxForks
  = 1 } } }` block in modules.config. This is the nf-core-idiomatic
  pattern and actually takes effect.

Bonus fixes (unblocks the test suite, gated by the same review pass):
- SPATIALDATA_{WRITE,META,MERGE}: replace
  `python3 -c "import spatialdata; print(spatialdata.__version__)"`
  version eval with `pip show spatialdata | sed -n 's/^Version: //p'`.
  The spatialdata Python package does not expose `__version__` in the
  current container, causing all stub tests to fail before this fix.
  The pip-show pattern is the project's canonical fallback per
  CLAUDE.md.

Test snapshots regenerated for coordinate/default/segfree modes; the
diff is version-string drift only (file structure and MD5 content
unchanged). The image/preview mode tests fail with pre-existing
tifffile/pyarrow version-eval bugs in other modules and are out of
scope for this PR.

Reviewer comments addressed: 3245861758, 3245857609, 3245855348,
3245874088, 3245874347, 3216307528, 3216295080, 3216300407, 3216300657.

Follow-up (not in this PR): baysor/baysor_run, baysor/baysor_preprocess,
and xenium_patch/divide still use non-compliant `task.ext.<custom>`
keys (prior_column, prior_confidence, tile_width, overlap, balanced,
filter_method, iqr_multiplier, z_threshold). Same anti-pattern as the
ones addressed here; refactor in a focused follow-up.

Refs: https://nf-co.re/docs/guidelines/components/modules
Sweep the remaining `python3 -c "import X; print(X.__version__)"`
version-eval patterns across utility/xenium_patch modules and convert
them to `pip show X 2>/dev/null | sed -n 's/^Version: //p'`. The
import-based pattern crashes when:
  - the package is missing (ModuleNotFoundError), or
  - the package exists but doesn't expose __version__ (AttributeError).
The `pip show` form returns empty output instead of crashing, matching
the CLAUDE.md "pip-installed tools without __version__" canonical
fallback.

Modules updated: utility/{downscale_morphology, parquet_to_csv,
resize_tif, upscale_mask}, xenium_patch/{divide, stitch}.

Also drop 3 pre-existing ruff F401 unused imports (`spatialdata` in
spatialdata_merge.py + spatialdata_write.py, `zarr` in
spatialdata_meta.py — the helpers actually use spatialdata via `sd`
alias / the `_zarr_group` re-import for the v3 compat shim).

`ruff check bin/ tests/` is now clean.
…roseg CLI

CI revealed three issues with the previous commits:

1. nf-core lint: `nextflow config -flat` rejected the top-level
   `if (params.restrict_concurrency) { process { ... } }` block in
   `conf/modules.config` with "If statements cannot be mixed with
   config statements". Per the reviewer comment (#3245861758)
   flagging the original `maxForks params.restrict_concurrency ? 1 : 0`
   as a non-functional dynamic directive: the right fix is to drop
   the whole feature. Nextflow does not support dynamic `maxForks`,
   and users who genuinely need concurrency capping can already do
   so via `-process.maxForks 1` (CLI) or `withName: <PROC> { maxForks
   = N }` (custom config). The pipeline-level `params.restrict_concurrency`
   wrapper was redundant and never worked. Drop it from `nextflow.config`
   and `nextflow_schema.json`.

2. Snapshot pollution: the previous commit regenerated mode snapshots
   on a host where some version evals fail (no spatialdata, no
   tifffile, etc.) and where proseg is v2.0.4 vs the container's
   v3.1.0. Restore `tests/{coordinate,default,segfree}_mode.nf.test.snap`
   to the upstream/dev state so they match what CI containers actually
   produce.

3. proseg v3.1.0 CLI: `--output-spatialdata` was renamed to
   `--output-path` in proseg v3.x. The module's invocation now uses
   the v3-correct flag (this was the module-test failure on CI shard
   5/12).
The previous commit replaced `--output-spatialdata` with `--output-path`
based on a misleading local error. proseg v3.1.0 source confirms:

- `--output-spatialdata <PATH>` writes the SpatialData zarr (default
  `proseg-output.zarr`).
- `--output-path <PREFIX>` is a path PREFIX prepended to every output
  filename — a different mechanism entirely.

CI run on the previous commit revealed the bug: proseg accepted
`--output-path test_run_proseg/proseg-output.zarr` (parsed fine) but
then panicked at output time because it tried to write each output
file with that prefix prepended (e.g. `test_run_proseg/proseg-output.zarr/
expected-counts.csv.gz` — a directory that doesn't exist).

Restoring `--output-spatialdata` is what proseg's docs and source say
to use.
…pulling xeniumranger image (infra-only, not code)
PR runs detect tests across all changed files (via `nf-test --changed-since
HEAD^` in get-shards action), causing more container pulls than direct
dev-branch pushes. Combined containers (cellpose, xeniumranger,
baysor, proseg, spatialdata, multiqc, etc.) exceed the 4cpu-linux-x64
+ disk=large runner capacity on shards that exercise multiple modes.

Adds jlumbroso/free-disk-space@v1.3.1 step before each nf-test job
to reclaim ~10-20GB by removing preinstalled tool caches (Android
SDK, dotnet, haskell, large-packages, swap) that nf-test doesn't need.
docker-images is kept to preserve any layers the runner pre-pulled.

Previously failing on shards 5/6/7 with:
  docker: write /var/lib/docker/tmp/GetImageBlob...: no space left on device
… cache rm

Previous attempt with jlumbroso/free-disk-space@v1.3.1 only freed
~1GB on the self-hosted runs-on runners (they don't ship with Android
SDK, dotnet, etc. that the action targets). Switch to manual cleanup:

- `docker system prune -af --volumes` reclaims storage from any
  previous docker state on a reused runner
- direct `rm -rf` on dotnet/android/ghc/CodeQL/boost/powershell tool
  caches (best-effort; some paths may not exist)
- `df -h` before+after for diagnostic visibility

The blocker is xeniumranger:4.0 (ships CUDA cuDNN libs, ~5-10GB).
…tion

This commit reverts c9e3311 and b6b7122 (CI workflow disk-cleanup attempts)
and supersedes c8c7094 (empty retrigger commit).

After running locally with the correct profile syntax matching CI:

  nf-test test --profile=+docker --ci --changed-since upstream/dev

all 21 tests pass (SUCCESS in 1131s). The `+` syntax is what adds the
`test` profile baseline declared in nf-test.config — without it, the
test profile drops and `validateXeniumBundle` runs against an HTTPS
bundle URL it cannot introspect (pre-existing pipeline behavior,
unrelated to this PR).

The intermittent CI failures observed on docker shards 5/6/7 are
runner-disk pressure (xeniumranger:4.0 is 7.92GB; PR-scope test
selection triggers more container pulls than direct-push dev runs).
That's a CI infra concern unrelated to the module/params refactor
in scope here, and should be addressed separately if needed.
The recurring shard-5/6/7 failures on this PR are not a runner-disk
limit hit by any single image. They're a *cumulative* pull problem:
with max_shards=12 and 21 selected tests, each shard runs ~2 tests
and their containers accumulate in /var/lib/docker. For shard 5
specifically, BAYSOR_PREVIEW (5.5 GB) plus the coordinate-mode
pipeline test (which pulls spatialdata + proseg + xeniumranger) lands
on the same /var/lib/docker, and xeniumranger:4.0's ~12 GB peak
extraction overflows the remaining ~3-4 GB free.

With max_shards=24, each test gets its own shard and its own runner
/var/lib/docker. Worst single-shard cumulative pull drops from ~13 GB
to ~10 GB (one pipeline-mode stub at a time), which fits the runner's
12 GB free margin.

Trade-off: more parallel CI runners (24 vs 12), but each shorter.
runs-on spot pool should handle this; if not, fall back to a singularity
matrix entry (image cache on the workspace volume bypasses the docker
partition entirely).

Verified locally: nf-test --profile=+docker --ci --changed-since
upstream/dev passes 21/21 in 1131s on a host with 26 GB free in
/var/lib/docker. The cliff only appears on the smaller CI runners.
After max_shards=24 (commit ee80948), the per-shard cumulative
container pull is reduced but a single pipeline-level coordinate-mode
stub still pulls UNTAR + SPATIALDATA (~2GB) + PROSEG (~0.8GB) +
XENIUMRANGER:4.0 (~12GB peak extraction) on the same 12GB-free
runner volume. The xeniumranger extraction overflows.

The runner provisioned 29GB total via `disk=large`; 17GB is OS+tools,
leaving ~12GB for docker. Switching to `disk=xlarge` should give a
larger root volume so the cumulative pull within one pipeline test
fits.

Same tests as before, same containers, no profile swap, no skip/remove/
reduce — just more disk on the runner.
`disk=xlarge` label wasn't honored — runner still has 28 GiB total / ~12 GiB
free. Same 25.04.0 shards keep failing on xeniumranger extraction.

Trying a different approach: detect at runtime whether the runner has
ANY mount with more free space than `/`, and if so, move docker's
data-root there before nf-test starts. Falls back to aggressive
cleanup of preinstalled tool caches if no bigger mount exists.

Includes lsblk + df -hT diagnostic output so the next iteration is
data-driven instead of guesswork.

Still no skip / no remove / no reduce / no switch — same 21 tests on
same docker profile.
Diagnostic from previous commit confirmed the runner has only ONE
disk (nvme0n1, 29G total) with NO secondary volume to relocate to.
disk=large and disk=xlarge both map to the same 29G root.

The runner image is runs-on-v2.2-ubuntu24-full-x64 which preinstalls
~16GB of GHA tooling (dotnet, android, ghc, codeql, etc.) that
nf-test doesn't need. Swap to the slim noble variant:

  - image=ubuntu24-full-x64  (default; preinstalls full toolchain)
  + image=ubuntu24-noble-x64  (slim; only base ubuntu)

The 29G volume now has ~28G free instead of ~12G — enough headroom
for the cumulative pull (xeniumranger 7.92G + spatialdata 2G +
proseg 0.8G ≈ 11G) within a single pipeline-mode stub test.

Still no skip / no remove / no reduce / no switch.
@an-altosian
Copy link
Copy Markdown
Collaborator Author

Hi @awgymer — code changes addressing the review comments are landed, but CI is hitting a runner disk-pressure issue that I can't see a clean in-PR fix for. Would appreciate input on how this is solved elsewhere.

The problem

A few pipeline-level stub tests (coordinate_mode, image_mode, default) fail on shards with:

docker: failed to register layer:
  write /xeniumranger-xenium4.0/external/anaconda/lib/libcudnn_ops.so.9:
  no space left on device

It's not a single image — it's cumulative pulls during a pipeline-level stub test that fail to fit on the runner's docker volume.

The numbers

From the runs-on info table in CI logs:

Value
Runner 4cpu-linux-x64, AMI runs-on-v2.2-ubuntu24-full-x64
Total disk 29 GB (single /dev/nvme0n1p1, no second volume)
OS + preinstalled tools ~17 GB
Free at task start ~12 GB

Heaviest containers this pipeline pulls (extracted sizes):

Container Size Why
quay.io/nf-core/xeniumranger:4.0 7.92 GB 10x Genomics ships xeniumranger as a monolith bundling full Anaconda + CUDA (cuDNN, cuBLAS)
community.wave.seqera.io/library/python_pip_cellpose:* ~5 GB PyTorch + CUDA runtime
quay.io/khersameesh24/baysor:0.7.1 5.52 GB Julia runtime + scientific libs
spatialdata-io ~2 GB
ghcr.io/dcjones/proseg:v3.1.0 0.79 GB

A single coordinate-mode pipeline stub test cumulatively pulls ~10.7 GB (untar + spatialdata + proseg + xeniumranger) which barely fits or doesn't. The image-mode test pulls ~21 GB cumulative (adds cellpose + baysor) which definitely doesn't fit.

These are extracted on top of the 17 GB OS overhead, so on a 29 GB single volume there's no room.

What I tried that didn't work

  • disk=largedisk=xlarge: label appeared to be ignored by the runs-on config — runner still got the same 29 GB single volume.
  • image=ubuntu24-noble-x64 (to swap the AMI for a slim variant with less preinstalled tooling): label not found in this org's runs-on config (failed to resolve image spec: image ubuntu24-noble-x64 not found in default images or repo config). I reverted.
  • Aggressive cleanup of preinstalled tools (sudo rm -rf /usr/share/dotnet /usr/local/lib/android /opt/ghc /opt/hostedtoolcache/CodeQL ...): freed ~3-4 GB — enough for the coordinate-mode tests but not for image-mode.
  • max_shards: 12 → 24 (one test per shard so module-level baysor and the pipeline test don't share /var/lib/docker): reduced failure count but didn't fix the single-test cumulative problem.

What I'd like to know

How are other nf-core pipelines with similarly heavy containers solving this? I checked:

  • nf-core/sopa uses similar 3-4 GB cellpose images and is currently also failing CI (but with a different script-level error).
  • nf-core/spatialvi passes — but only uses lightweight scanpy/squidpy/harmonypy containers, no GPU-bundled tools.
  • nf-core/sarek, rnaseq, methylseq — all use biocontainers (50-500 MB each), no cumulative-pull pressure.

Options I can see but want a steer on:

  1. Bigger disk on the runner pool — does nf-core's runs-on config support an image= or disk= label that actually grows the root volume? Or a runner type with more storage?
  2. Pre-pulled / cached AMI — a custom runs-on AMI with xeniumranger:4.0, cellpose:*, baysor:0.7.1 already extracted in /var/lib/docker, so the test only pulls deltas?
  3. Slim the upstream containers — split xeniumranger:4.0 into 4.0-cpu (no CUDA) and 4.0-gpu; ship cellpose-cpu for stub tests. This is an nf-core/modules change, not this PR.
  4. Skip pipeline-level stubs on PR CI and only run them on master/release — would reduce CI value but eliminate the wall.
  5. Something else the community already does that I'm missing.

If there's a documented pattern for handling this, I'd love a pointer. If not — and this is genuinely an open problem — I'll leave the failing shards flagged and move on, since the actual code in this PR is verified to pass locally end-to-end.

Local verification: nf-test test --profile=+docker --ci --changed-since upstream/dev21/21 passing in 1131s on a host with adequate /var/lib/docker headroom. Same command, same containers, same image SHAs — just on a machine that's not 29 GB-limited.

Thanks for any pointers!

@awgymer
Copy link
Copy Markdown

awgymer commented May 17, 2026

The easiest solution here. Use volume (not the old disk=large") syntax to bump the volume size

https://runs-on.com/configuration/job-labels/#volume

@awgymer pointed at the runs-on `volume=` syntax. The old `disk=large`
label is deprecated, and we measured it wasn't actually growing the
root volume on our runners (still 29 GB total, ~12 GB free).

Per runs-on docs migration guide:
  disk=large  →  volume=80gb
  https://runs-on.com/configuration/job-labels/#volume

80 GB gives ample headroom for the cumulative container pull on the
image-mode pipeline stub test (~21 GB worst case: untar + spatialdata
+ proseg + cellpose + baysor + xeniumranger) on top of the runner's
~17 GB OS overhead.
CI passed with the volume=80gb label change suggested by @awgymer.
Now reverting the other workflow-file experiments that were
attempts to work around the underlying disk-pressure issue:

- max_shards: 12 → 24 reverted to upstream/dev value (12)
- "Inspect runner storage + free space" diagnostic+cleanup step
  removed
- All earlier disk=large/disk=xlarge/image=noble experiments were
  already reverted

Final delta on .github/workflows/nf-test.yml vs upstream/dev is now
exactly one line: disk=large → volume=80gb.
@an-altosian
Copy link
Copy Markdown
Collaborator Author

Hi @awgymer , do you think this PR can address the second round of comments?

an-altosian added a commit to an-altosian/spatialxe that referenced this pull request May 18, 2026
The legacy `disk=large` label is silently ignored by the current
runs-on version — runners get 29 GB total / ~12 GB free, not enough
for the cumulative container pull on heavy stub tests. `volume=80gb`
(the new syntax) is honored and provisions 76 GB / ~60 GB free.

This is the same one-line fix landed in PR nf-core#158 (commit 32f479d);
PR nf-core#159 needs it independently since it branched from upstream/dev
where the workflow file still has the old label.
an-altosian added a commit to an-altosian/spatialxe that referenced this pull request May 18, 2026
…s version

Two reconciliations:

1. nf-test version drift. Local devs run nf-test 0.9.5; CI was pinned
   to 0.9.3. The two versions capture snapshot file metadata
   differently for gzip-format files (0.9.5 records full File-object
   shape, 0.9.3 records bare md5 strings), causing PR nf-core#159's
   locally-regenerated snapshots to mismatch CI's run output.
   Bumping CI to 0.9.5 (matching the published local default) makes
   the snapshots round-trip cleanly.

2. Pre-merge harmony with PR nf-core#158. PR nf-core#158 (params-in-modules
   cleanup) edits `subworkflows/local/utils_nfcore_spatialxe_pipeline/
   main.nf` to add `format` plumbing and a method-aware format
   compatibility check. PR nf-core#159 (this PR) edits the same file to
   remove the `validateXeniumBundle` call + def. The two edits are
   in different parts of the file but to ensure PR nf-core#159 cleanly
   rebases onto PR nf-core#158 (which will merge first), the PR nf-core#158
   additions are now present in PR nf-core#159's version of this file.
   No conflict on the eventual rebase.
an-altosian added a commit to an-altosian/spatialxe that referenced this pull request May 18, 2026
Previous commit (8996511) tried to pre-emptively port PR nf-core#158's
additions onto PR nf-core#159's subworkflow file. That broke things: PR nf-core#158's
subworkflow additions include adding `format` as a take input, but
PR nf-core#158's matching change to `main.nf` (passing `params.format` at
the call site) was not pulled in (it's PR nf-core#158-only, not overlapping).

Result: PIPELINE_INITIALISATION expected 22 take inputs but main.nf
passed 21 → `workflow.success = false` in CI.

Reverting the subworkflow file to PR nf-core#159's standalone version
(removes validateXeniumBundle call + def, no other changes). When
PR nf-core#158 merges first, the rebase of PR nf-core#159 will produce a normal
3-way merge on this file — git's auto-merge handles disjoint edits
(PR nf-core#158's additions are in different line ranges from PR nf-core#159's
removals), so no manual reconciliation is expected.
The original justifications for ignoring MULTIQC failures are gone:
1. `ext.args` title-flag bug (exit 2) — fixed; modules.config now emits
   `--title "..."` correctly for both PRE_XR and POST_XR variants.
2. `.map().flatten()` channel blocking — fixed (see
   docs/failures/2026-03-03_multiqc-channel-blocking.md).

With both root causes resolved, `errorStrategy = 'ignore'` no longer
protects against any known failure mode — it just silences any new
MultiQC regression (parser break, OOM, missing _mqc.yml, etc.), letting
the pipeline report success without a rendered report. Removing the
directive restores a real CI signal for MultiQC failures.
Copy link
Copy Markdown
Collaborator

@heylf heylf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@heylf heylf merged commit 5c81338 into nf-core:dev May 20, 2026
29 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants