Fix NETMHCSTABANDPAN to support samples without SV data #246

Merged
johnoooh merged 1 commit into develop from feature/netmhcstabandpan on May 14, 2026

@johnoooh
Collaborator

Description

Fixes a silent failure in the NETMHCSTABANDPAN subworkflow when samples are run without structural variants — a common case in MSK pipelines where the samplesheet only carries MAF/HLA/CNV inputs.

Root cause

createNETMHCInput joins ch_fasta_and_hla against ch_sv_fasta with an inner .join(...). When no SV files are provided in the samplesheet, the upstream NEOSV process never runs, so the SV fasta channel emits zero items. The inner join therefore produces an empty channel, and NETMHCSTABPAN / NETMHCPAN4 / NEOANTIGENUTILS_FORMATNETMHCPAN — and everything downstream of them in any pipeline using this subworkflow — silently get skipped.

Existing tests cover the case where the SV channel emits [meta, [], []] (empty file lists with a meta), but not the "no items at all" case, which is what real pipelines hit when SVs aren't part of the sample.
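
The difference between the two cases can be reproduced in isolation. The following is a minimal Nextflow sketch, not the subworkflow's actual code: the channel names, the `sample1` key, and the file names are placeholders, and the tuples are simplified to `[key, value]` pairs.

```groovy
// join_demo.nf (hypothetical file name); run with: nextflow run join_demo.nf
workflow {
    // Stand-in for ch_fasta_and_hla: one item keyed by sample id
    fasta_and_hla = channel.of(['sample1', 'mut.fa'])

    // Case covered by the existing tests: the SV channel emits an item
    // for the sample, but its payload is an empty list
    placeholder_sv = channel.of(['sample1', []])

    // Case hit when no SV files are provided: the SV channel emits nothing
    no_sv = channel.empty()

    // The inner join finds a matching key, so one joined item is emitted
    fasta_and_hla
        .join(placeholder_sv, by: 0)
        .view { item -> "placeholder SV: ${item}" }

    // The inner join has nothing to match against, so nothing is emitted
    // and no "empty SV" line is printed
    fasta_and_hla
        .join(no_sv, by: 0)
        .view { item -> "empty SV: ${item}" }
}
```

Only the first `view` should print a line. The second join completes without error and without output, which is why the skipped downstream steps go unnoticed.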

Change

Switch the join to a left join with remainder: true and treat a missing SV side as [null, [], []], so a MUT and WT tuple is still emitted per sample regardless of SV presence:

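// remainder: true keeps samples that have no SV match; their SV side arrives as null
def merged_mut = fastas_and_hla_channel
    .join(sv_fastas_channel, by:0, remainder: true)
    .map({
        // Fall back to empty file lists when the SV side is missing
        def sv = it[2] ?: [null, [], []]
        [it[1][0], it[1][1], sv[1], it[1][3], "MUT"]
    })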
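
The fallback inside the `.map` closure can be exercised on its own in plain Groovy, outside Nextflow. In this sketch the tuple layout (`[key, [meta, mutFasta, wtFasta, hla], [meta, svMutFasta, svWtFasta]]`) and every value are assumptions chosen for illustration. Only the closure body is taken from the change above.

```groovy
// Closure body copied from the change; `it` is one joined item
def toMutTuple = {
    def sv = it[2] ?: [null, [], []]
    [it[1][0], it[1][1], sv[1], it[1][3], "MUT"]
}

// Hypothetical sample data; the real element order may differ
def meta = [id: 'sample1']
def fastaAndHla = [meta, 'mut.fa', 'wt.fa', 'HLA-A*02:01']

// Sample with an SV match: the SV fasta is carried through
def withSv = ['sample1', fastaAndHla, [meta, 'sv_mut.fa', 'sv_wt.fa']]
assert toMutTuple(withSv) == [meta, 'mut.fa', 'sv_mut.fa', 'HLA-A*02:01', 'MUT']

// Sample without an SV match: remainder: true leaves null on the SV side,
// and the fallback substitutes an empty file list
def withoutSv = ['sample1', fastaAndHla, null]
assert toMutTuple(withoutSv) == [meta, 'mut.fa', [], 'HLA-A*02:01', 'MUT']
```

Because `?:` tests Groovy truth rather than strict nullness, an empty list on the SV side would trigger the same fallback as `null`.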

Tests

Adds netmhcstabandpan - empty SV channel - fa,hla_str - tsv - stub, which feeds input[1] = channel.empty() to lock in the SV-less code path.

All 7 tests pass locally (Nextflow 24.10.6, nf-test 0.9.5):

Test [f411820f] 'netmhcstabandpan - SV - fa,hla_str - tsv'                                 PASSED
Test [474d8bcb] 'netmhcstabandnetmhc3 - SV - fa,hla_str - tsv'                             PASSED
Test [f20bb7d2] 'netmhcstabandpan - fa,hla_str - tsv'                                      PASSED
Test [b2d766c9] 'netmhcstabandnetmhc3 - fa,hla_str - tsv'                                  PASSED
Test [958a9bbe] 'netmhcstabandpan - empty SV channel - fa,hla_str - tsv - stub'  (NEW)     PASSED
Test [d1dc7b0c] 'netmhcstabandpan - fa,hla_str - tsv - stub'                               PASSED
Test [1be0a66d] 'netmhcstabandnetmhc3 - fa,hla_str - tsv - stub'                           PASSED

The fix was also verified end-to-end by running the mskcc/neoantigenpipeline (-profile test,docker): before the fix the pipeline silently completed with 6 of 13 stages skipped after GENERATEMUTFASTA; after the fix all 31 processes run to completion (0 failures), producing the final TSV and annotated JSON outputs.

PR Checklist

  • Description of changes with reasoning
  • No nf-core equivalent (MSK-specific netMHC stack)
  • Branch named feature/netmhcstabandpan
  • Tests added for new code (regression test for empty SV channel)
  • No external test data needed (stub-mode test)
  • No TODOs introduced
  • versions.yml emission untouched
  • Naming conventions followed
  • No resource label changes
  • No container changes

The createNETMHCInput helper used an inner .join() on the SV fasta channel.
When the upstream NEOSV process does not run (no structural variants
provided in the samplesheet, a common case), the SV channel emits zero
items and the inner join silently produces an empty channel, causing
NETMHCSTABPAN, NETMHCPAN4, and all downstream subworkflow steps to be
skipped.

Switch to a left join with remainder: true and default to empty file
lists when there is no SV match, so MUT/WT tuples are still emitted per
sample whether or not SV data is present.

Adds a regression test that drives input[1] = channel.empty() to lock in
the SV-less code path.

johnoooh requested a review from pintoa1-mskcc on May 14, 2026 13:31
johnoooh marked this pull request as ready for review on May 14, 2026 15:55
johnoooh requested review from a team and nikhil as code owners on May 14, 2026 15:55
Contributor

@pintoa1-mskcc left a comment

Good for our purposes. If we ever plan on using this in clinical production, we would need to output empty files in cases like this rather than skipping the step.

@johnoooh johnoooh merged commit 42c230a into develop May 14, 2026
17 checks passed