update subworkflows for differential and functional enrichment analysis #11024

Merged: suzannejin merged 9 commits into master from update-for-workflow-outputs on Mar 26, 2026
Contributor

@suzannejin suzannejin commented Mar 23, 2026

Update subworkflows for differential and functional enrichment analysis.
This update is needed for the workflow output migration here

PR checklist

Closes #XXX

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Broadcast software version numbers to topic: versions - See version_topics
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

Copilot AI review requested due to automatic review settings March 23, 2026 18:59
Contributor

Copilot AI left a comment

Pull request overview

Updates nf-core subworkflows used by differential abundance + functional enrichment pipelines to align emitted outputs with an upcoming output migration in nf-core/differentialabundance.

Changes:

  • Refactors differential_functional_enrichment GSEA wiring (CHIP handling + input reshaping) and adds new artifact output channels for gprofiler2 and GSEA.
  • Adds new plots and other output channels to abundance_differential_filter and updates workflow wiring accordingly.
  • Updates nf-test snapshots to reflect the new output structure.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| subworkflows/nf-core/differential_functional_enrichment/main.nf | Adjusts GSEA channel construction and adds aggregated artifact outputs for gprofiler2/GSEA. |
| subworkflows/nf-core/differential_functional_enrichment/meta.yml | Documents new outputs (gprofiler2_artifacts, gsea_artifacts). |
| subworkflows/nf-core/differential_functional_enrichment/tests/main.nf.test.snap | Snapshot updates for new outputs and metadata ordering. |
| subworkflows/nf-core/abundance_differential_filter/main.nf | Adds plots + other channels and minor closure style tweaks. |
| subworkflows/nf-core/abundance_differential_filter/meta.yml | Documents the new outputs (plots, other). |
| subworkflows/nf-core/abundance_differential_filter/tests/main.nf.test.snap | Snapshot updates for new outputs and metadata. |

suzannejin and others added 3 commits March 23, 2026 20:10
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@suzannejin suzannejin enabled auto-merge March 23, 2026 19:30
Member

@pinin4fjords pinin4fjords left a comment

I'm very unconvinced by the channel combinations, can we leave them separate please to allow flexible usage? All you should really need to do here is add any outputs that currently don't go to channels, to support the workflow output syntax in the pipeline.

GSEA_GSEA.out.rpt
.mix(GSEA_GSEA.out.index_html)
.mix(GSEA_GSEA.out.heat_map_corr_plot)
.mix(GSEA_GSEA.out.report_tsvs_ref)
Member

Did you intend to mix these here as well as above?

decoupler_png = DECOUPLER_DECOUPLER.out.png

// main results
results = ch_results
Member

I'm unconvinced by outputting everything in a single channel this way. You're mixing channel 'arity' and forcing anyone who wants to access specific channels to sort through a heterogeneous set. Can we leave these separate please, and defer any combinations to the workflow?
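
For illustration, the concern about mixed 'arity' can be sketched as follows. This is a minimal, hypothetical example rather than the code under review: the workflow name `MIXED_VS_SEPARATE`, the emit names, and the file names are all assumptions made up for the sketch.

```nextflow
// Hypothetical sketch: names and values are illustrative, not from the PR.
workflow MIXED_VS_SEPARATE {
    main:
    // Two channels whose items have different shapes ("arity")
    ch_plot    = channel.of([[id: 'contrast1'], 'dispersion.png']) // [ meta, file ]
    ch_session = channel.of('session_info.log')                    // file only

    // Mixing them yields a heterogeneous channel
    ch_other = ch_plot.mix(ch_session)

    emit:
    other   = ch_other   // mixed shapes
    plot    = ch_plot    // [ meta, file ]
    session = ch_session // file only
}

workflow {
    MIXED_VS_SEPARATE()

    // Mixed channel: the consumer has to filter by item shape
    MIXED_VS_SEPARATE.out.other
        .filter { item -> !(item instanceof List) }
        .view { item -> "session (filtered from mixed): ${item}" }

    // Separate emits: the consumer addresses the output by name
    MIXED_VS_SEPARATE.out.session
        .view { item -> "session (by name): ${item}" }
}
```

With separate emits, a consuming workflow can still combine channels itself if it wants to, whereas un-mixing a heterogeneous channel requires knowledge of each item's shape.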

Contributor Author

Yes, I was trying to see if I could simplify, but I didn't like it either. Reverting to the previous commit now.

@pinin4fjords
Member

e.g. here's how I prepared subworkflows for workflow output syntax: #9765

Essentially just making sure that everything was available via channels. No need to combine everything artificially into a small number of outputs.
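
As a rough sketch of that approach (not taken from #9765), a subworkflow can simply expose each process output as its own named emit. The process `TOOL`, the subworkflow `EXAMPLE_SUBWORKFLOW`, and all output names below are hypothetical placeholders.

```nextflow
// Hypothetical process standing in for a real module; all names are assumptions.
process TOOL {
    input:
    val meta

    output:
    tuple val(meta), path('result.tsv'), emit: results
    tuple val(meta), path('plot.png'),   emit: plot
    path 'session_info.log',             emit: session_info

    script:
    """
    touch result.tsv plot.png session_info.log
    """
}

workflow EXAMPLE_SUBWORKFLOW {
    take:
    ch_meta // channel: [ meta ]

    main:
    TOOL(ch_meta)

    emit:
    // One emit per process output, no artificial grouping
    results      = TOOL.out.results
    plot         = TOOL.out.plot
    session_info = TOOL.out.session_info
}

workflow {
    EXAMPLE_SUBWORKFLOW(channel.of([id: 'contrast1']))

    // A consuming pipeline picks exactly the output it needs
    EXAMPLE_SUBWORKFLOW.out.session_info.view()
}
```

The pipeline can then decide how each named output is published, without the subworkflow making that choice for it.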

@suzannejin suzannejin force-pushed the update-for-workflow-outputs branch from 55f1d16 to 2957563 Compare March 24, 2026 10:59
@suzannejin
Contributor Author

Hi @pinin4fjords, sorry for the confusion. I was trying to see whether extra simplification would work, but it did not.
I just reverted to the previous commit; this should work for the workflow output migration.

ch_plots = DESEQ2_DIFFERENTIAL.out.dispersion_plot_png
.mix(LIMMA_DIFFERENTIAL.out.md_plot)

ch_other = DESEQ2_DIFFERENTIAL.out.rdata
Member

Sorry to be a pain, thinking about this we should probably emit disparate things separately, the 'other' doesn't really work. If I want the session info in a consuming workflow I don't want to have to filter it out.

I'm inclined to say the same about the plots unless they're equivalent plots (which I don't think is the case here).

Contributor Author

@suzannejin suzannejin Mar 24, 2026

I split the outputs for the ABUNDANCE_DIFFERENTIAL_FILTER subworkflow.
Now, for the GSEA artifacts, do you want a separate output emitted for each of the 21 artifacts?
That's a lot of duplicated verbosity to deal with at both the subworkflow and workflow levels.

Member

It just doesn't make sense to combine disparate things into channels. I agree the current implementation makes things awkward- it's why I put this PR on hold for rnaseq: nf-core/rnaseq#1679

Nextflow will have some improvements, notably record types, that may help this soon. It's fine to put this on pause until that happens if you feel the same here as I did for rnaseq.

Contributor Author

@suzannejin suzannejin Mar 24, 2026

The migration here is mainly needed to solve this issue affecting pipeline cache.
That's why I wouldn't pause it for this PR. I'd vote to have a less perfect subworkflow output structure for the moment (it can always be adapted later on).

Member

Could you explain why separating the outputs would block that work?

Contributor Author

@suzannejin suzannejin Mar 26, 2026

What I meant is that pausing the migration will block that work, not splitting the outputs itself.

However, the entire idea of the pipeline hub is that alternative methods (in this case, functional enrichment) can be exchanged easily. This means adding a new method will ideally only require changes in the subworkflow, not the pipeline itself. Splitting outputs, instead of classifying them into the same category, will make this even more difficult.

But yes, I agree that having them in artifacts/other is also not ideal. I will split the outputs as a temporary solution. With syntax improvements in the future, this may be better addressed.

Member

@pinin4fjords pinin4fjords Mar 26, 2026

When different methods have clear analogs, putting them in the same channel makes sense (perhaps after some normalisation of the structure- for future). But combining e.g. size factors and session info will never make sense at the component level.
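
To illustrate the "clear analogs" case, here is a minimal sketch in which two alternative methods produce results with the same `[ meta, file ]` shape, so a single channel stays homogeneous. The method names, metadata keys, and file names are hypothetical and not taken from the subworkflows in this PR.

```nextflow
workflow {
    // Hypothetical results from two alternative methods, both shaped [ meta, file ]
    ch_method_a = channel.of([[id: 'contrast1', method: 'method_a'], 'method_a.results.tsv'])
    ch_method_b = channel.of([[id: 'contrast2', method: 'method_b'], 'method_b.results.tsv'])

    // Every item has the same structure, so consumers can treat them uniformly
    ch_results = ch_method_a.mix(ch_method_b)

    ch_results.view { meta, file -> "${meta.method}: ${file}" }
}
```

Because each item carries its method in the metadata, a consumer can still tell the results apart without inspecting item shapes.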

@suzannejin suzannejin force-pushed the update-for-workflow-outputs branch from 03506ff to 729a7c3 Compare March 24, 2026 14:56
gprofiler2_all_enrich = GPROFILER2_GOST.out.all_enrich
gprofiler2_sub_enrich = GPROFILER2_GOST.out.sub_enrich
gprofiler2_plot_html = GPROFILER2_GOST.out.plot_html
gprofiler2_artifacts = GPROFILER2_GOST.out.plot_png
Member

Same deal with these things. Arguably makes sense to group e.g. plots and reports, but we shouldn't be combining all the disparate things

Member

To be fair, I know why you want to do this- I wanted something similar when I opened nextflow-io/nextflow#6756, i.e. to have channels mapping directly to publish locations.

But we can't make assumptions in components about how pipelines will want to publish.

Contributor Author

OK, nice. It looks like things will become easier with records.

Member

@pinin4fjords pinin4fjords left a comment

Better- thanks!

@suzannejin suzannejin added this pull request to the merge queue Mar 26, 2026
Merged via the queue into master with commit 2f1b98c Mar 26, 2026
65 checks passed
@suzannejin suzannejin deleted the update-for-workflow-outputs branch March 26, 2026 16:49

3 participants