fastq_fastqc_umitools_trimgalore: handle null trim_log in read-count map#11228
Conversation
When Trim Galore does not emit a `*_trimming_report.txt` (for example when the user passes `--hardtrim5` or `--hardtrim3` via `ext.args`, which makes trim_galore skip the adapter/quality pass and write only a hard-trimmed FASTQ), the `.join(trim_log, remainder: true)` leaves a `null` log element. The subsequent `.map` had two scoping bugs in its null guard: - `if (trim_log)` checked the channel itself (always truthy) instead of `trim_log_` (the current element), so the null-log fallback was never reached and `getTrimGaloreReadsAfterFiltering(null)` blew up with `Cannot invoke method eachLine() on null object`. - The else branch emitted `[meta, reads, ...]`, referencing the outer-scope `reads` channel parameter instead of the mapped element `reads_`. Swap to `if (trim_log_)` and `reads_` so the fallback actually fires and carries through the correct per-sample reads list. Added an nf-test case that exercises this path by configuring TRIMGALORE with `ext.args = '--hardtrim5 8'`, which reproduces the crash on master. Originally surfaced while triaging nf-core/rnaseq#1807. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure-rename follow-up to the null-log fix. The previous layout used the same base name for both the outer-scope channel variable (`trim_log`, `reads`, etc.) and the closure parameter inside nearby maps (`trim_log_`, `reads_`, ...), distinguished only by a trailing underscore. That's the footgun that caused the null-log bug in the first place -- `if (trim_log)` read the channel instead of the element and was silently always truthy. Prefix the local channel-bearing variables in the workflow body with `ch_` (`ch_trim_log`, `ch_reads`, `ch_umi_log`, etc.) and drop the trailing underscore from closure parameters now that there's no collision. `take:` parameters are left as-is so the subworkflow's public signature is unchanged. No behaviour change; all 9 nf-tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Self-review: three commits, one behaviour change.
1c2b050309— null-log fix. Two scoping bugs (if (trim_log)tested the outer channel; else branch emitted the outerreadstake parameter). Fixed by targeting the_-suffixed closure params.546c722afd—ch_prefix rename of the outer locals, drops the underscore fromtrim_login the null-log map. Pure rename, no behaviour change.7af199b8ad— lint fix:reads_stays underscored in closures becausereadsis atake:parameter and Nextflow lint won't permit the shadow.
The behaviourally-important lines appear unchanged in the master-vs-HEAD diff because the rename flipped which name resolves to what. Inline comments pin the real fix. Review commit-by-commit.
| TRIMGALORE.out.reads | ||
| .join(trim_log, remainder: true) | ||
| .map { meta, reads_, trim_log_ -> | ||
| .join(ch_trim_log, remainder: true) |
There was a problem hiding this comment.
Fix anchor. Master had .join(trim_log, ...) with trim_log_ for the element below — same base name, distinguished only by a trailing underscore. That footgun is what let bug 1 slip in. Now ch_trim_log = channel, trim_log below = element, unambiguous. (reads_ keeps its underscore because reads is a take: parameter and Nextflow lint won't allow shadowing.)
| .map { meta, reads_, trim_log_ -> | ||
| .join(ch_trim_log, remainder: true) | ||
| .map { meta, reads, trim_log -> | ||
| if (trim_log) { |
There was a problem hiding this comment.
Bug fix line 1. Text unchanged vs master, semantics flipped: trim_log used to resolve to the outer channel (always truthy), so the null-log branch was unreachable and getTrimGaloreReadsAfterFiltering(null) crashed. After the rename it resolves to the closure param (the element, null when trim_galore emits no report). The new hardtrim5 test fails here on master.
| } | ||
| } | ||
|
|
||
| test("test paired end read without UMI - hardtrim5 (no trim log)") { |
There was a problem hiding this comment.
Regression test for the null-log path. --hardtrim5 8 makes trim_galore emit no report, so .join(..., remainder: true) carries a null — exactly the crash scenario on master.
Nextflow's linter rejects closure parameters that shadow an outer name, and `reads` is declared as a `take:` parameter of the subworkflow. Dropping the trailing underscore in the previous rename commit (546c722) meant the closures at lines 59, 84 and 97 redeclared `reads`, which `nextflow lint` (and the repo's pre-commit hook) flagged with `` `reads` is already declared ``. Keep `reads_` in those three closures, leaving the `trim_log` rename in place (that one's fine because the outer was renamed to `ch_trim_log`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| } | ||
| else { | ||
| [meta, reads, min_trimmed_reads.toFloat() + 1] | ||
| [meta, reads_, min_trimmed_reads.toFloat() + 1] |
There was a problem hiding this comment.
Bug fix line 2. The text changed from reads to reads_, but the interesting change is the semantics: on master reads resolved to the outer take: parameter (a channel), so had this branch ever been reached (it wasn't — see bug 1) it would have emitted a channel in place of the per-sample FASTQ list. reads_ keeps the underscore because reads is a take parameter and Nextflow lint blocks the shadow.
vagkaratzas
left a comment
There was a problem hiding this comment.
Nice! Minor adjustments and a question :D
| ]) | ||
| input[1] = true // skip_fastqc | ||
| input[2] = false // with_umi | ||
| input[3] = false // skip_umi_extract |
There was a problem hiding this comment.
shouldnt this be true since reads are without UMI? or are you deliberately testing this parameter combo?
There was a problem hiding this comment.
No deliberate combo — I'd mirrored the existing "without UMI" test above (skip_umi_extract: false, with_umi: false), which is behaviourally equivalent because with_umi: false gates the UMI block off regardless. Flipped to true in b56425b for semantic clarity.
There was a problem hiding this comment.
Hmm tests are failing now though; could it be that the flag is not properly used? Or just needs a snapshot update?
There was a problem hiding this comment.
It's just singularity having a bad day I think
- `Channel.of` → `channel.of` (lowercase namespace). - `skip_umi_extract` → `true` for consistency with the "without UMI" wording; no-op behaviourally since `with_umi: false` already gates the UMI block off. - Align closing `).match() }` with surrounding lines. Per review feedback on #11228. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ntegration Bump to the latest upstream of: - fq/lint (nf-core/modules#11227): constrain reads arity to 1..2 - ribodetector (nf-core/modules#11258): unpin GPU container from pytorch-gpu=1.11.0; emit cuda version on the topic - tximeta/tximport (nf-core/modules#11141): fix gene-level crash on mismatched transcript FASTA/GTF - fastq_fastqc_umitools_trimgalore (nf-core/modules#11228): handle null trim_log in the read-count map - custom/catadditionalfasta (nf-core/modules#11256): topic-based versions, explicit out/\${prefix}.{fasta,gtf} paths, task.ext.prefix ?: meta.id prefix handling The custom/catadditionalfasta interface change needs pipeline-side follow-up in conf/modules/prepare_genome.config: - Fix the stale CAT_ADDITIONAL_FASTA selector (now CUSTOM_CATADDITIONALFASTA) and split PREPROCESS_TRANSCRIPTS_FASTA_GENCODE into its own block. - Set ext.prefix = "\${params.genome ?: fasta.baseName}_\${add_fasta.baseName}" so output filenames follow the previous {genome}_{add_name} pattern; the new module default (meta.id) would otherwise rename outputs to genome_transcriptome.{fasta,gtf}. Behaviour note: fixing the withName selector also exposes a pre-existing intent that was masked. CUSTOM_CATADDITIONALFASTA outputs now only publish when --save_reference is set; the stale selector previously let them fall through to the default publishDir and land in <outdir>/custom/out/ regardless of --save_reference. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Three commits, one behaviour change:
1c2b050309— null-log fix.546c722afd—ch_prefix rename on outer local channels (removes the collision with closure params fortrim_log). No behaviour change.7af199b8ad— lint fix:reads_keeps its trailing underscore in closures becausereadsis atake:parameter and Nextflow lint blocks the shadow.The bug
When
ext.argsincludes--hardtrim5/--hardtrim3, trim_galore skips the adapter / quality pass and emits no*_trimming_report.txt. The.join(trim_log, remainder: true)then carries anulllog element, and the null guard had two scoping bugs:.map { meta, reads_, trim_log_ -> if (trim_log) { // outer channel, always truthy def num_reads = getTrimGaloreReadsAfterFiltering( meta.single_end ? trim_log_ : trim_log_[-1]) [meta, reads_, num_reads] } else { [meta, reads, min_trimmed_reads.toFloat() + 1] // outer take param, a channel } }if (trim_log)checked the outer channel (always truthy) →getTrimGaloreReadsAfterFiltering(null)→Cannot invoke method eachLine() on null object.readstake parameter instead of the closure element.Commit 1 fixes both by targeting the
_-suffixed closure params. Commit 2 renames outer channels toch_*so thetrim_logclosure param can drop its underscore. Commit 3 reinstatesreads_because Nextflow lint rejects shadowing thetake:parameter.Reproduction
Surfaced triaging nf-core/rnaseq#1807. Running the pipeline with
--extra_trimgalore_args '--hardtrim5 8'crashes on master atmain.nf:15.Test
New case
test paired end read without UMI - hardtrim5 (no trim log)setsext.args = '--hardtrim5 8'on TRIMGALORE. Fails on master, passes here. All 9 tests pass.Review tip
Most of the line count lives in commit 2 (the rename). The behaviourally-important lines appear unchanged in the final master-vs-HEAD diff because the rename flipped which name resolves to what. Self-review pins the real fix inline. Commit-by-commit review recommended.
PR checklist
CHANGELOG.mdis updated.nf-core subworkflows test <SUBWORKFLOW> --profile docker).🤖 Generated with Claude Code