Skip to content

fastq_fastqc_umitools_trimgalore: handle null trim_log in read-count map#11228

Merged
pinin4fjords merged 4 commits intomasterfrom
pinin4fjords/fastq-fastqc-umitools-trimgalore-null-log
Apr 20, 2026
Merged

fastq_fastqc_umitools_trimgalore: handle null trim_log in read-count map#11228
pinin4fjords merged 4 commits intomasterfrom
pinin4fjords/fastq-fastqc-umitools-trimgalore-null-log

Conversation

@pinin4fjords
Copy link
Copy Markdown
Member

@pinin4fjords pinin4fjords commented Apr 20, 2026

Summary

Three commits, one behaviour change:

  1. 1c2b050309 — null-log fix.
  2. 546c722afdch_ prefix rename on outer local channels (removes the collision with closure params for trim_log). No behaviour change.
  3. 7af199b8ad — lint fix: reads_ keeps its trailing underscore in closures because reads is a take: parameter and Nextflow lint blocks the shadow.

The bug

When ext.args includes --hardtrim5 / --hardtrim3, trim_galore skips the adapter / quality pass and emits no *_trimming_report.txt. The .join(trim_log, remainder: true) then carries a null log element, and the null guard had two scoping bugs:

.map { meta, reads_, trim_log_ ->
    if (trim_log) {                                     // outer channel, always truthy
        def num_reads = getTrimGaloreReadsAfterFiltering(
            meta.single_end ? trim_log_ : trim_log_[-1])
        [meta, reads_, num_reads]
    } else {
        [meta, reads, min_trimmed_reads.toFloat() + 1]  // outer take param, a channel
    }
}
  • if (trim_log) checked the outer channel (always truthy) → getTrimGaloreReadsAfterFiltering(null)Cannot invoke method eachLine() on null object.
  • Else branch referenced the outer reads take parameter instead of the closure element.

Commit 1 fixes both by targeting the _-suffixed closure params. Commit 2 renames outer channels to ch_* so the trim_log closure param can drop its underscore. Commit 3 reinstates reads_ because Nextflow lint rejects shadowing the take: parameter.

Reproduction

Surfaced triaging nf-core/rnaseq#1807. Running the pipeline with --extra_trimgalore_args '--hardtrim5 8' crashes on master at main.nf:15.

Test

New case test paired end read without UMI - hardtrim5 (no trim log) sets ext.args = '--hardtrim5 8' on TRIMGALORE. Fails on master, passes here. All 9 tests pass.

Review tip

Most of the line count lives in commit 2 (the rename). The behaviourally-important lines appear unchanged in the final master-vs-HEAD diff because the rename flipped which name resolves to what. Self-review pins the real fix inline. Commit-by-commit review recommended.

PR checklist

  • This comment contains a description of changes (with reason).
  • CHANGELOG.md is updated.
  • If you've fixed a bug or added code that should be tested, add tests!
  • Ensure the test suite passes (nf-core subworkflows test <SUBWORKFLOW> --profile docker).

🤖 Generated with Claude Code

When Trim Galore does not emit a `*_trimming_report.txt` (for example
when the user passes `--hardtrim5` or `--hardtrim3` via `ext.args`,
which makes trim_galore skip the adapter/quality pass and write only
a hard-trimmed FASTQ), the `.join(trim_log, remainder: true)` leaves
a `null` log element. The subsequent `.map` had two scoping bugs in
its null guard:

- `if (trim_log)` checked the channel itself (always truthy) instead
  of `trim_log_` (the current element), so the null-log fallback was
  never reached and `getTrimGaloreReadsAfterFiltering(null)` blew up
  with `Cannot invoke method eachLine() on null object`.
- The else branch emitted `[meta, reads, ...]`, referencing the
  outer-scope `reads` channel parameter instead of the mapped
  element `reads_`.

Swap to `if (trim_log_)` and `reads_` so the fallback actually fires
and carries through the correct per-sample reads list.

Added an nf-test case that exercises this path by configuring
TRIMGALORE with `ext.args = '--hardtrim5 8'`, which reproduces the
crash on master.

Originally surfaced while triaging nf-core/rnaseq#1807.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure-rename follow-up to the null-log fix. The previous layout used
the same base name for both the outer-scope channel variable
(`trim_log`, `reads`, etc.) and the closure parameter inside nearby
maps (`trim_log_`, `reads_`, ...), distinguished only by a trailing
underscore. That's the footgun that caused the null-log bug in the
first place -- `if (trim_log)` read the channel instead of the
element and was silently always truthy.

Prefix the local channel-bearing variables in the workflow body with
`ch_` (`ch_trim_log`, `ch_reads`, `ch_umi_log`, etc.) and drop the
trailing underscore from closure parameters now that there's no
collision. `take:` parameters are left as-is so the subworkflow's
public signature is unchanged.

No behaviour change; all 9 nf-tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/m and removed size/s labels Apr 20, 2026
Copy link
Copy Markdown
Member Author

@pinin4fjords pinin4fjords left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-review: three commits, one behaviour change.

  • 1c2b050309 — null-log fix. Two scoping bugs (if (trim_log) tested the outer channel; else branch emitted the outer reads take parameter). Fixed by targeting the _-suffixed closure params.
  • 546c722afdch_ prefix rename of the outer locals, drops the underscore from trim_log in the null-log map. Pure rename, no behaviour change.
  • 7af199b8ad — lint fix: reads_ stays underscored in closures because reads is a take: parameter and Nextflow lint won't permit the shadow.

The behaviourally-important lines appear unchanged in the master-vs-HEAD diff because the rename flipped which name resolves to what. Inline comments pin the real fix. Review commit-by-commit.

TRIMGALORE.out.reads
.join(trim_log, remainder: true)
.map { meta, reads_, trim_log_ ->
.join(ch_trim_log, remainder: true)
Copy link
Copy Markdown
Member Author

@pinin4fjords pinin4fjords Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix anchor. Master had .join(trim_log, ...) with trim_log_ for the element below — same base name, distinguished only by a trailing underscore. That footgun is what let bug 1 slip in. Now ch_trim_log = channel, trim_log below = element, unambiguous. (reads_ keeps its underscore because reads is a take: parameter and Nextflow lint won't allow shadowing.)

.map { meta, reads_, trim_log_ ->
.join(ch_trim_log, remainder: true)
.map { meta, reads, trim_log ->
if (trim_log) {
Copy link
Copy Markdown
Member Author

@pinin4fjords pinin4fjords Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug fix line 1. Text unchanged vs master, semantics flipped: trim_log used to resolve to the outer channel (always truthy), so the null-log branch was unreachable and getTrimGaloreReadsAfterFiltering(null) crashed. After the rename it resolves to the closure param (the element, null when trim_galore emits no report). The new hardtrim5 test fails here on master.

}
}

test("test paired end read without UMI - hardtrim5 (no trim log)") {
Copy link
Copy Markdown
Member Author

@pinin4fjords pinin4fjords Apr 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regression test for the null-log path. --hardtrim5 8 makes trim_galore emit no report, so .join(..., remainder: true) carries a null — exactly the crash scenario on master.

Nextflow's linter rejects closure parameters that shadow an outer
name, and `reads` is declared as a `take:` parameter of the
subworkflow. Dropping the trailing underscore in the previous rename
commit (546c722) meant the closures at lines 59, 84 and 97
redeclared `reads`, which `nextflow lint` (and the repo's pre-commit
hook) flagged with `` `reads` is already declared ``.

Keep `reads_` in those three closures, leaving the `trim_log` rename
in place (that one's fine because the outer was renamed to
`ch_trim_log`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
}
else {
[meta, reads, min_trimmed_reads.toFloat() + 1]
[meta, reads_, min_trimmed_reads.toFloat() + 1]
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug fix line 2. The text changed from reads to reads_, but the interesting change is the semantics: on master reads resolved to the outer take: parameter (a channel), so had this branch ever been reached (it wasn't — see bug 1) it would have emitted a channel in place of the per-sample FASTQ list. reads_ keeps the underscore because reads is a take parameter and Nextflow lint blocks the shadow.

Copy link
Copy Markdown
Contributor

@vagkaratzas vagkaratzas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Minor adjustments and a question :D

Comment thread subworkflows/nf-core/fastq_fastqc_umitools_trimgalore/tests/main.nf.test Outdated
])
input[1] = true // skip_fastqc
input[2] = false // with_umi
input[3] = false // skip_umi_extract
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldnt this be true since reads are without UMI? or are you deliberately testing this parameter combo?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No deliberate combo — I'd mirrored the existing "without UMI" test above (skip_umi_extract: false, with_umi: false), which is behaviourally equivalent because with_umi: false gates the UMI block off regardless. Flipped to true in b56425b for semantic clarity.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm tests are failing now though; could it be that the flag is not properly used? Or just needs a snapshot update?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just singularity having a bad day I think

Comment thread subworkflows/nf-core/fastq_fastqc_umitools_trimgalore/tests/main.nf.test Outdated
- `Channel.of` → `channel.of` (lowercase namespace).
- `skip_umi_extract` → `true` for consistency with the "without UMI"
  wording; no-op behaviourally since `with_umi: false` already gates
  the UMI block off.
- Align closing `).match() }` with surrounding lines.

Per review feedback on #11228.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vagkaratzas vagkaratzas self-requested a review April 20, 2026 11:35
Copy link
Copy Markdown
Contributor

@vagkaratzas vagkaratzas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@pinin4fjords pinin4fjords added this pull request to the merge queue Apr 20, 2026
Merged via the queue into master with commit e5777ea Apr 20, 2026
89 of 105 checks passed
@pinin4fjords pinin4fjords deleted the pinin4fjords/fastq-fastqc-umitools-trimgalore-null-log branch April 20, 2026 11:41
pinin4fjords added a commit to nf-core/rnaseq that referenced this pull request Apr 22, 2026
…ntegration

Bump to the latest upstream of:

- fq/lint (nf-core/modules#11227): constrain reads arity to 1..2
- ribodetector (nf-core/modules#11258): unpin GPU container from pytorch-gpu=1.11.0; emit cuda version on the topic
- tximeta/tximport (nf-core/modules#11141): fix gene-level crash on mismatched transcript FASTA/GTF
- fastq_fastqc_umitools_trimgalore (nf-core/modules#11228): handle null trim_log in the read-count map
- custom/catadditionalfasta (nf-core/modules#11256): topic-based versions, explicit out/\${prefix}.{fasta,gtf} paths, task.ext.prefix ?: meta.id prefix handling

The custom/catadditionalfasta interface change needs pipeline-side follow-up in conf/modules/prepare_genome.config:

- Fix the stale CAT_ADDITIONAL_FASTA selector (now CUSTOM_CATADDITIONALFASTA) and split PREPROCESS_TRANSCRIPTS_FASTA_GENCODE into its own block.
- Set ext.prefix = "\${params.genome ?: fasta.baseName}_\${add_fasta.baseName}" so output filenames follow the previous {genome}_{add_name} pattern; the new module default (meta.id) would otherwise rename outputs to genome_transcriptome.{fasta,gtf}.

Behaviour note: fixing the withName selector also exposes a pre-existing intent that was masked. CUSTOM_CATADDITIONALFASTA outputs now only publish when --save_reference is set; the stale selector previously let them fall through to the default publishDir and land in <outdir>/custom/out/ regardless of --save_reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants