Subworkflow for phylogenetic placement: `fasta_newick_epang_gappa` #2856

erikrikarddaniel · 2023-02-08T10:07:35Z

PR checklist

This adds a subworkflow that runs phylogenetic placement with EPA-NG and Gappa.

d4straub

Looks good to me! But this is the first subworkflow that I have reviewed, so a second opinion would probably be good as well.

subworkflows/nf-core/fasta_newick_epang_gappa/main.nf

subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml

nvnieuwk · 2023-02-08T15:29:22Z

subworkflows/nf-core/fasta_newick_epang_gappa/main.nf

+workflow FASTA_NEWICK_EPANG_GAPPA {
+
+    take:
+    ch_pp_data // channel: [ meta: val(meta), data: [ alignmethod: val(alignmethod), queryseqfile: file(queryseqfile), refseqfile: file(refseqfile), refphylogeny: file(refphylogeny), hmmfile: file(hmmfile), model: val(model) ] ]


It's preferred to split this channel into multiple input channels that get merged in the subworkflow. This also reduces the complexity for the user (so you don't have to use arrays in arrays etc.).

The problem is that that's difficult to synchronise if the pieces come from different sources, i.e. to make sure that they come in the same order. (At least I don't know how to do that.)

You can join the channels on meta. Just gotta make sure that all channels that have to be merged do contain exactly the same meta. (You can use the .join() operator for this: https://www.nextflow.io/docs/latest/operator.html#join)
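For illustration, a minimal sketch of joining on meta might look like the following. The channel names and file names are hypothetical and only two of the inputs are shown; `join()` matches on the first tuple element by default, so emission order no longer matters as long as the meta maps are identical.

```nextflow
workflow {
    // Hypothetical inputs: each channel emits [ meta, file ] tuples,
    // deliberately in a different order to show that join() resynchronises them.
    ch_query = Channel.of(
        [ [ id: 'sample1' ], file('sample1.query.faa') ],
        [ [ id: 'sample2' ], file('sample2.query.faa') ]
    )
    ch_tree = Channel.of(
        [ [ id: 'sample2' ], file('sample2.ref.newick') ],
        [ [ id: 'sample1' ], file('sample1.ref.newick') ]
    )

    // join() pairs items whose first element (the meta map) is equal
    // and emits [ meta, query, tree ].
    ch_query
        .join(ch_tree)
        .view()
}
```

If the meta maps differ in any key, the items simply do not match, which is why all channels must carry exactly the same meta.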

Right. I'd like to give some more context. The workflow is called by my phyloplace pipeline (it's basically the whole pipeline). The sample sheet for the pipeline, parsed with splitCsv(), is turned into a single channel with this structure. If I were to give the subworkflow separate channels, I'd have to split the channel before calling the subworkflow, then join() them in the subworkflow.
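As a rough sketch of the split-then-join route described here (not what the PR implements), the single channel could be fanned out with `multiMap` and re-joined on meta. `ch_pp_data` follows the structure in the diff, but only two of its `data` keys are used and every other name is hypothetical.

```nextflow
workflow {
    // Hypothetical stand-in for the parsed sample sheet: one map per row.
    ch_pp_data = Channel.of(
        [
            meta: [ id: 'sample1' ],
            data: [ queryseqfile: file('query.faa'), refphylogeny: file('ref.newick') ]
        ]
    )

    // Split the single channel into one [ meta, file ] channel per input.
    ch_pp_data
        .multiMap { row ->
            query: [ row.meta, row.data.queryseqfile ]
            tree:  [ row.meta, row.data.refphylogeny ]
        }
        .set { ch_split }

    // Inside the subworkflow the pieces would then be joined on meta again.
    ch_split.query
        .join(ch_split.tree)
        .view()
}
```

This shows the extra round trip: the caller takes the map apart only for the subworkflow to put it back together.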

I chose the map structure for the channel because I personally prefer to have named inputs when the number of inputs is large.

I also plan to include the subworkflow in ampliseq. I don't foresee any problems calling it with this structure. (Of course it should be documented, though. That was an oversight of mine, mostly because I'm not so familiar with the documentation format.)

Moreover, I was recently advised to merge channels into one when submitting a module. That was advice I was later grateful for, because I found it more explicit to join() before calling the module rather than have the module do that for me.

While I agree that the input channel here is a little more complex than for other workflows, I find it quite intuitive due to the naming. I am fine with it as is. @nvnieuwk is there a rule for input channels?

Aaah okay the context is very clear, thanks! You can leave it as it is :)

subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml

d4straub

Not sure my suggestions are perfect, but at least to me it makes understanding much easier.

subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml

erikrikarddaniel · 2023-02-10T11:17:58Z

> Not sure my suggestions are perfect, but at least to me it makes understanding much easier.

Thank you, much better!

erikrikarddaniel · 2023-02-13T09:43:20Z

Thank you both @nvnieuwk and @d4straub !

…f-core#2856)

* Generated subworkflow from template
* Started
* Continued work
* Rename, add tests
* Prettier, remove extra tag
* Remove trailing whitespace
* Remove quotes to hopefully fix prettier issue
* Remove checks for md5sums that frequently fail
* It whas the colon not the quotes of course
* Update subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml
  Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>
* Improved documentation of input
* Better documentation of input channel
* Remoed two remaining md5sum tests for versions.yml
* Update subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml
  Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>
* Update subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml
  Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

---------

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

erikrikarddaniel added 7 commits December 22, 2022 11:07

Generated subworkflow from template

f8907cb

Started

b55f1eb

Merge branch 'master' into phyloplace-swf

e71bb9f

Continued work

8b066fe

Fix merge conflict

57db5e7

Rename, add tests

8df0cdf

Prettier, remove extra tag

3bee22e

erikrikarddaniel requested a review from d4straub February 8, 2023 10:07

erikrikarddaniel added 5 commits February 8, 2023 11:11

Remove trailing whitespace

543eb1e

Remove quotes to hopefully fix prettier issue

6b75ca3

Remove checks for md5sums that frequently fail

905a1a2

It whas the colon not the quotes of course

315e041

Merge branch 'master' into phyloplace-swf

ddd5362

d4straub reviewed Feb 8, 2023

nvnieuwk reviewed Feb 8, 2023

erikrikarddaniel and others added 5 commits February 8, 2023 17:35

Update subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml

f1ecd2f

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

Improved documentation of input

8e721f9

Merge branch 'phyloplace-swf' of github.com:erikrikarddaniel/nf-core-…

c04b8b2

…modules into phyloplace-swf

Better documentation of input channel

a84085b

Remoed two remaining md5sum tests for versions.yml

7c83698

d4straub reviewed Feb 10, 2023

erikrikarddaniel and others added 2 commits February 10, 2023 12:16

Update subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml

ab97114

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

Update subworkflows/nf-core/fasta_newick_epang_gappa/meta.yml

39aaab6

Co-authored-by: Daniel Straub <42973691+d4straub@users.noreply.github.com>

erikrikarddaniel added 3 commits February 10, 2023 12:18

Merge branch 'master' into phyloplace-swf

80c784c

Merge branch 'master' into phyloplace-swf

b451e07

Merge branch 'master' into phyloplace-swf

7c4ac6d

nvnieuwk approved these changes Feb 13, 2023

erikrikarddaniel merged commit 6ad90f5 into nf-core:master Feb 13, 2023

erikrikarddaniel deleted the phyloplace-swf branch February 13, 2023 09:49

d4straub mentioned this pull request Feb 17, 2023

Phylogenetic placement of ASVs nf-core/ampliseq#236

Closed
