@suzannejin
Contributor

@suzannejin suzannejin commented Apr 22, 2025

  • update differential subworkflow to take list of contrasts as input, and use differential_method instead of method_differential
  • update functional subworkflow to take list of contrasts as input, and use functional_method instead of method_functional
  • update limma module to use round_digits

PR checklist

Closes #XXX

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@suzannejin suzannejin changed the title from "Update differential subworkflows" to "Update differential subworkflows to take list of contrasts as input" Apr 23, 2025
@suzannejin suzannejin marked this pull request as ready for review April 23, 2025 13:25
@suzannejin
Contributor Author

@pinin4fjords hello!! Some minor changes here: I just changed the subworkflows to take in a list of contrasts, as required by the new pipeline structure.

Member

@pinin4fjords pinin4fjords left a comment

Thanks!

Some points. I haven't replicated for the other subworkflow, but you'll see what I mean.

def meta_input_new = meta_input + [ 'method_differential': analysis_method ]
def criteria = multiMapCriteria { meta, abundance, analysis_method, fc_threshold, stat_threshold, samplesheet, transcript_length, control_features, contrast, variable, reference, target, formula, comparison ->
def meta_with_method = meta + [ 'differential_method': analysis_method ]
def meta_for_diff = meta_with_method + contrast
Member

Suggested change
- def meta_for_diff = meta_with_method + contrast
+ def meta_for_diff = meta_with_method + meta_contrast

Member

We should probably keep the mergeMaps usage? I realise I didn't use it in #448, but that was probably a mistake. We still need to concatenate e.g. the study-wise ID with the contrast ID, rather than just overwriting the former with the latter.

Contributor Author

@suzannejin suzannejin Apr 24, 2025

I actually don't like the fact that it will merge every key, though.
If you want, I can modify the function so that it only merges id, and raises an error if other keys are repeated.
I think this should be handled explicitly by the user rather than by automatic merging.

Member

@pinin4fjords pinin4fjords Apr 25, 2025

Do you think silent over-writing of common keys (as this version will do) is better? I do not. I'd rather they were concatenated, as at least information is not lost.
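
To make the two behaviours concrete, here is a minimal sketch in plain Groovy. The metas are hypothetical, and `concatMerge` is an illustrative stand-in rather than the repository's mergeMaps function; the `_` separator in particular is an assumption.

```groovy
// Hypothetical metas, for illustration only.
def metaStudy    = [id: 'study1', differential_method: 'limma']
def metaContrast = [id: 'treated_vs_control', variable: 'treatment']

// Groovy's Map `+` keeps the right-hand value for any shared key,
// so the study-wise id is silently lost.
def overwritten = metaStudy + metaContrast
assert overwritten.id == 'treated_vs_control'

// Illustrative merge that concatenates the values of shared keys instead.
// NOT the repository's mergeMaps; the '_' separator is an assumption.
def concatMerge = { Map left, Map right ->
    (left + right).collectEntries { key, value ->
        def shared = left.containsKey(key) && right.containsKey(key)
        [(key): shared ? "${left[key]}_${right[key]}".toString() : value]
    }
}

def merged = concatMerge(metaStudy, metaContrast)
assert merged.id == 'study1_treated_vs_control'
assert merged.differential_method == 'limma'
assert merged.variable == 'treatment'
```

With concatenation both ids survive in the merged meta, whereas the plain `+` keeps only the contrast id.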

}

// create joined channel with samplesheet and optional transcript lengths and control features
ch_samplesheet_with_control = ch_samplesheet
Member

Did we just completely forget about handling these correctly before?

Contributor Author

it seems so...

Contributor Author

@suzannejin suzannejin Apr 24, 2025

Well, before, we basically implemented the subworkflow assuming that the pipeline would only ever provide one ch_samplesheet, ch_contrast, etc. It only parallelized across sets of different methods.

Now we want the pipeline to be able to parallelize over any rows of full params, so the subworkflow now assumes that any input channel can carry several rows, which need to be joined/combined based on the meta.

Comment on lines 68 to 74
.combine(
// take the first contrast for each common meta
ch_contrasts.map { meta, contrast, variable, reference, target, formula, comparison ->
[meta, contrast[0], variable[0], reference[0], target[0], formula[0], comparison[0]]
}
, by:0
)
Member

Suggested change
- .combine(
- // take the first contrast for each common meta
- ch_contrasts.map { meta, contrast, variable, reference, target, formula, comparison ->
- [meta, contrast[0], variable[0], reference[0], target[0], formula[0], comparison[0]]
- }
- , by:0
- )
+ .join(ch_contrasts.transpose(), by:0)

If we use a join here after the transpose, instead of a combine, that should give us the first matching contrast more simply, right?

Contributor Author

The thing is that the subworkflow can potentially receive several ch_contrasts items with different metas (if rows in the toolsheet also allow changing the input data, etc.).
So we do need to take the first for each common meta.

Member

I'm not sure you quite understood. Unless I'm having a senior moment (it happens), these two sets of logic are functionally identical, mine is just simpler.

You have a map which extracts the first element of the collapsed contrasts, prior to doing a combine 'with key' (the by part). This is effectively an outer join, made to function like an inner join by reducing all items on the right to a single contrast for a given key.

In my version I 'uncollapse' the contrasts using transpose(), and then do an inner join with join. Since there will be multiple items on the right with the same key value, this will have the effect of taking the 'first' matching one, replicating your logic. It's the same deal.
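
As a rough check of that equivalence, the sketch below models both approaches on plain Groovy lists rather than real Nextflow channels. The channel contents are hypothetical, and the model assumes at most one left-hand item per meta and that the keyed join pairs a left item with the first right item to arrive for that key; neither assumption is verified against the Nextflow runtime here.

```groovy
// Hypothetical channel contents, modelled as plain lists (no Nextflow runtime).
def samplesheets = [
    [[id: 'study1'], 'samplesheet1.csv'],
    [[id: 'study2'], 'samplesheet2.csv'],
]
// One collapsed item per meta: [meta, [contrast ids], [variables]]
def contrasts = [
    [[id: 'study1'], ['c1', 'c2'], ['treatment', 'genotype']],
    [[id: 'study2'], ['c3'], ['dose']],
]

// Approach A: keep the first element of each collapsed list, then pair
// every left item with every right item sharing its meta (keyed combine).
def firstOnly = contrasts.collect { meta, ids, vars -> [meta, ids[0], vars[0]] }
def viaCombine = samplesheets.collectMany { meta, sheet ->
    firstOnly.findAll { it[0] == meta }.collect { [meta, sheet] + it.tail() }
}

// Approach B: uncollapse the lists (the role transpose() plays), then pair
// each left item with only the first right item sharing its meta.
def transposed = contrasts.collectMany { meta, ids, vars ->
    [ids, vars].transpose().collect { [meta] + it }
}
def viaJoin = samplesheets.collect { meta, sheet ->
    def match = transposed.find { it[0] == meta }
    match ? [meta, sheet] + match.tail() : null
}.findAll { it != null }

// Both approaches yield the same pairs under the assumptions above.
assert viaCombine == viaJoin
```

If the same meta could appear more than once on the left, the two approaches would no longer be guaranteed to agree, since a keyed combine pairs every left item with every matching right item.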

Contributor Author

Sorry, I did not see it before!
Thank you for explaining, I will change it.

@suzannejin
Contributor Author

just applied everything you suggested :)
Now it should be fine @pinin4fjords

Member

@pinin4fjords pinin4fjords left a comment

OK, looks good, I think.

@suzannejin suzannejin added this pull request to the merge queue Apr 25, 2025
Merged via the queue into master with commit 6ac0725 Apr 25, 2025
69 checks passed
@suzannejin suzannejin deleted the update_differential_method branch April 25, 2025 13:37
famosab pushed a commit to famosab/modules that referenced this pull request Jun 3, 2025
…f-core#8346)

* update abundance_differential_filter

* round limma digits for consistency

* update test configs and snapshots

* update enrichment workflow. Still need to solve bug when multiply running differential subworkflow

* fix bug (abundance_differential_filter): proper multiply combine norm channels

* add comments

* update functional subworkflow snapshot

* update meta.yaml

* update meta.yaml

* apply suggestions

* add back mergeMaps

3 participants