each lineage in BUSCO module #2089

Closed
2 tasks done
muffato opened this issue Sep 22, 2022 · 2 comments · Fixed by #2135

muffato (Member) commented Sep 22, 2022

Have you checked the docs?

Description of the bug

The BUSCO module uses an each qualifier on the lineage input, and @alxndrdiaz and I are having trouble making it work in a pipeline.

The main problem is that it doesn't work on dynamic lists, i.e. lists that only exist as channel content at runtime. See the minimal example below:

process BUSCO {
  input:
    val fasta
    each lineage
  output:
    stdout
  script:
    """
    echo '$fasta $lineage Busco!'
    """
}

workflow {
  ch_inputs = Channel.of(
    ['Bonjour', file('lineages.txt')],
  ).map {
    [it[0], it[1].readLines()]
  }
  ch_inputs.view()
  BUSCO (
    ch_inputs.map {it[0]},
    ch_inputs.map {it[1]},
  ).view()
}

This creates a single BUSCO job that takes the list itself as lineage rather than unrolling it:

[13/4bcd3c] process > BUSCO (1) [100%] 1 of 1 ✔
[Bonjour, [agaricomycetes_odb10, polyporales_odb10, basidiomycota_odb10, fungi_odb10]]
Bonjour [agaricomycetes_odb10, polyporales_odb10, basidiomycota_odb10, fungi_odb10] Busco!

If I do a flatMap instead of map:

  ).flatMap {
    it[1].readLines().collect { line -> [it[0], line] }
  }

ch_inputs is like this:

[Bonjour, agaricomycetes_odb10]
[Bonjour, polyporales_odb10]
[Bonjour, basidiomycota_odb10]
[Bonjour, fungi_odb10]

but Nextflow creates 4 × 4 = 16 jobs!

[61/a05438] process > BUSCO (15) [100%] 16 of 16 ✔
Bonjour polyporales_odb10 Busco!
Bonjour agaricomycetes_odb10 Busco!
Bonjour basidiomycota_odb10 Busco!
Bonjour fungi_odb10 Busco!
Bonjour agaricomycetes_odb10 Busco!
Bonjour basidiomycota_odb10 Busco!
Bonjour polyporales_odb10 Busco!
Bonjour agaricomycetes_odb10 Busco!
Bonjour fungi_odb10 Busco!
Bonjour basidiomycota_odb10 Busco!
Bonjour fungi_odb10 Busco!
Bonjour agaricomycetes_odb10 Busco!
Bonjour polyporales_odb10 Busco!
Bonjour polyporales_odb10 Busco!
Bonjour fungi_odb10 Busco!
Bonjour basidiomycota_odb10 Busco!

We couldn't find a way to make it spawn exactly 4 jobs with the each in place. With a val instead, the input behaves like any ordinary channel input and the flatMap version works.
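
For reference, the val variant described above might look like the sketch below. It is the minimal example from this issue with only the input qualifier changed and the flatMap applied; it is not the actual nf-core module code, and pairing the two mapped channels by position is an assumption about how a pipeline would feed the module.

```nextflow
process BUSCO {
  input:
    val fasta
    val lineage   // was: each lineage
  output:
    stdout
  script:
    """
    echo '$fasta $lineage Busco!'
    """
}

workflow {
  // Unroll [fasta, lineages file] into one [fasta, lineage] pair per line
  ch_inputs = Channel.of(
    ['Bonjour', file('lineages.txt')]
  ).flatMap {
    it[1].readLines().collect { line -> [it[0], line] }
  }
  // Both inputs are plain queue channels derived from the same source,
  // so they are consumed pairwise: one task per [fasta, lineage] pair
  BUSCO (
    ch_inputs.map { it[0] },
    ch_inputs.map { it[1] }
  ).view()
}
```

With the four-line lineages.txt shown under "Relevant files", this should spawn four tasks rather than sixteen.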

Command used and terminal output

No response

Relevant files

lineages.txt:

agaricomycetes_odb10
polyporales_odb10
basidiomycota_odb10
fungi_odb10

System information

Nextflow 22.04.0

muffato added the bug (Something isn't working) label Sep 22, 2022
mahesh-panchal (Member) commented Sep 29, 2022

Sorry, was on holiday.
Does this solve the issue?

workflow {
    ch_inputs = Channel.of( ['Bonjour', file('lineages.txt').readLines() ] )
        .multiMap { fa, lineage ->
            fa_ch: fa
            lineage_ch: lineage
        }.set { busco_input }

    BUSCO (
        busco_input.fa_ch,
        busco_input.lineage_ch.collect()
    ).view()
}

This solution isn't valid if ch_inputs contains more than one element, though, because of how each works (I hadn't realised it worked that way either, so thanks for pointing it out).
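
To illustrate the each behaviour referred to here, a minimal sketch follows. It is not the nf-core module: the second value 'Hola' and the hard-coded lineage list are made up for illustration. The each qualifier repeats the process for every item of the collection, for every value arriving on the other input, so the two inputs are combined as a cross-product rather than paired up.

```nextflow
process BUSCO {
  input:
    val fasta
    each lineage
  output:
    stdout
  script:
    """
    echo '$fasta $lineage Busco!'
    """
}

workflow {
  // Illustrative values only
  ch_fasta = Channel.of('Bonjour', 'Hola')
  lineages = ['fungi_odb10', 'bacteria_odb10']
  // every fasta is run against every lineage, so 2 x 2 tasks are expected
  BUSCO (ch_fasta, lineages).view()
}
```

This is convenient when every sample should be run against the same fixed list, but it cannot express "this sample with these lineages only".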

muffato (Member, Author) commented Sep 29, 2022

Hi @mahesh-panchal. Yes, this works 🙌🏼!
But, as you said, only if BUSCO is called once. With two elements in ch_inputs:

[Bonjour, [agaricomycetes_odb10, polyporales_odb10, basidiomycota_odb10, fungi_odb10]]
[Hola, [eukaryota_odb10, bacteria_odb10, archaea_odb10]]

it also runs Bonjour against eukaryota_odb10 etc., and Hola against agaricomycetes_odb10 etc.

Would you be happy for me to make a PR that turns the each into a val and updates the test cases?
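
One possible way to keep each fasta tied to its own lineages, once each is no longer used, is sketched below. It relies on the transpose operator to unroll the nested lists; the tuple input is an assumption made for this sketch and differs from the two separate inputs of the example process in this thread.

```nextflow
process BUSCO {
  input:
    tuple val(fasta), val(lineage)   // hypothetical signature for this sketch
  output:
    stdout
  script:
    """
    echo '$fasta $lineage Busco!'
    """
}

workflow {
  ch_inputs = Channel.of(
    ['Bonjour', ['agaricomycetes_odb10', 'polyporales_odb10', 'basidiomycota_odb10', 'fungi_odb10']],
    ['Hola', ['eukaryota_odb10', 'bacteria_odb10', 'archaea_odb10']]
  )
  // transpose() turns each [fasta, [lineages]] item into one
  // [fasta, lineage] pair per lineage, keeping the association intact
  BUSCO (ch_inputs.transpose()).view()
}
```

Under these assumptions Bonjour should only be run against its four lineages and Hola against its three.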
