Closed

humann3 #1679

76 changes: 76 additions & 0 deletions modules/humann/main.nf
@@ -0,0 +1,76 @@
// TODO nf-core: If in doubt look at other nf-core/modules to see how we are doing things! :)
Contributor

Remove all of the TODO statements, both in meta.yml and in modules/humann/main.nf.

// https://github.com/nf-core/modules/tree/master/modules
// You can also ask for help via your pull request or on the #modules channel on the nf-core Slack workspace:
// https://nf-co.re/join
// TODO nf-core: A module file SHOULD only define input and output files as command-line parameters.
// All other parameters MUST be provided using the "task.ext" directive, see here:
// https://www.nextflow.io/docs/latest/process.html#ext
// where "task.ext" is a string.
// Any parameters that need to be evaluated in the context of a particular sample
// e.g. single-end/paired-end data MUST also be defined and evaluated appropriately.
// TODO nf-core: Software that can be piped together SHOULD be added to separate module files
// unless there is a run-time, storage advantage in implementing in this way
// e.g. it's ok to have a single module for bwa to output BAM instead of SAM:
// bwa mem | samtools view -B -T ref.fasta
// TODO nf-core: Optional inputs are not currently supported by Nextflow. However, using an empty
// list (`[]`) instead of a file can be used to work around this issue.

process HUMANN {
    tag "$meta.id"
    label 'process_low'

    // TODO nf-core: List required Conda package(s).
    // Software MUST be pinned to channel (i.e. "bioconda"), version (i.e. "1.10").
    // For Conda, the build (i.e. "h9402c20_2") must be EXCLUDED to support installation on different operating systems.
    // TODO nf-core: See section in main README for further information regarding finding and adding container addresses to the section below.
    conda (params.enable_conda ? "bioconda::humann=3.0.0" : null)
    container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
        'https://depot.galaxyproject.org/singularity/humann:3.0.0--pyh5e36f6f_1':
        'quay.io/biocontainers/humann:3.0.0--pyh5e36f6f_1' }"

    input:
    // TODO nf-core: Where applicable all sample-specific information e.g. "id", "single_end", "read_group"
    // MUST be provided as an input via a Groovy Map called "meta".
    // This information may not be required in some instances e.g. indexing reference genome files:
    // https://github.com/nf-core/modules/blob/master/modules/bwa/index/main.nf
    // TODO nf-core: Where applicable please provide/convert compressed files as input/output
    // e.g. "*.fastq.gz" and NOT "*.fastq", "*.bam" and NOT "*.sam" etc.
    tuple val(meta), path(sample), path(nt),path(protein),path(metadb)

    output:
    // TODO nf-core: Named file extensions MUST be emitted for ALL output channels
    tuple val(meta), path("outdir"), emit: outdir
    // TODO nf-core: List additional required output channels/values here
Contributor

Align the emit statements, e.g.:

tuple val(meta), path("outdir")  , emit: outdir
path "versions.yml"              , emit: versions

    path "versions.yml" , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    // TODO nf-core: Where possible, a command MUST be provided to obtain the version number of the software e.g. 1.10
    // If the software is unable to output a version number on the command-line then it can be manually specified
    // e.g. https://github.com/nf-core/modules/blob/master/modules/homer/annotatepeaks/main.nf
    // Each software used MUST provide the software name and version number in the YAML version file (versions.yml)
    // TODO nf-core: It MUST be possible to pass additional parameters to the tool as a command-line string via the "task.ext.args" directive
    // TODO nf-core: If the tool supports multi-threading then you MUST provide the appropriate parameter
    // using the Nextflow "task" variable e.g. "--threads $task.cpus"
    // TODO nf-core: Please replace the example samtools command below with your module's command
    // TODO nf-core: Please indent the command appropriately (4 spaces!!) to help with readability ;)
    """
    humann \\
        -o outdir \\
Contributor

I would suggest giving the output directory a custom name per sample (e.g. ${prefix}_outdir). If the module is used to analyze multiple samples in a pipeline, the results would otherwise overwrite each other. If you do this, you should also change the output channel to: tuple val(meta), path("*_outdir"), emit: outdir

        --threads $task.cpus \\
        -i $sample \\
        --nucleotide-database $nt \\
        --protein-database $protein \\
        --metaphlan-options "-x $prefix --bowtie2db $metadb "


    cat <<-END_VERSIONS > versions.yml
    humann: \$(echo \$(humann --version 2>&1) )
    END_VERSIONS

    """
}
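For reference, the two review suggestions on this file (aligned emit statements and a per-sample output directory) could be combined roughly as follows. This is only a sketch, not the module as merged: the conda and container lines are left out for brevity, the trailing $args is an assumed addition so that task.ext.args actually reaches the command, and the versions block is copied unchanged from the PR.

```nextflow
process HUMANN {
    tag "$meta.id"
    label 'process_low'

    input:
    tuple val(meta), path(sample), path(nt), path(protein), path(metadb)

    output:
    // The glob matches the per-sample directory created in the script below
    tuple val(meta), path("*_outdir"), emit: outdir
    path "versions.yml"              , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args   = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    """
    humann \\
        -o ${prefix}_outdir \\
        --threads $task.cpus \\
        -i $sample \\
        --nucleotide-database $nt \\
        --protein-database $protein \\
        --metaphlan-options "-x $prefix --bowtie2db $metadb" \\
        $args

    cat <<-END_VERSIONS > versions.yml
    humann: \$(echo \$(humann --version 2>&1) )
    END_VERSIONS
    """
}
```

With the directory named after the prefix, two samples published to the same location end up in separate folders instead of both writing to outdir.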
51 changes: 51 additions & 0 deletions modules/humann/meta.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
name: "humann"
## TODO nf-core: Add a description of the module and list keywords
description: write your description here
Contributor

Don't forget to write a short description of the tool :p

keywords:
  - sort
tools:
  - "humann":
      ## TODO nf-core: Add a description and other details for the software below
      description: "HUMAnN: The HMP Unified Metabolic Analysis Network, version 3"
      homepage: "http://huttenhower.sph.harvard.edu/humann"
      documentation: "http://huttenhower.sph.harvard.edu/humann"
      tool_dev_url: "https://github.com/biobakery/humann"
      doi: ""
      licence: "['MIT']"

## TODO nf-core: Add a description of all of the variables used as input
input:
  # Only when we have meta
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  #
  ## TODO nf-core: Delete / customise this example input
Contributor

Update your input fields to correspond with the actual input of the module :)

  - bam:
      type: file
      description: BAM/CRAM/SAM file
      pattern: "*.{bam,cram,sam}"

## TODO nf-core: Add a description of all of the variables used as output
output:
  #Only when we have meta
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  #
  - versions:
      type: file
      description: File containing software versions
      pattern: "versions.yml"
  ## TODO nf-core: Delete / customise this example output
  - bam:
Contributor

Also update your output fields to correspond with the real output :)

      type: file
      description: Sorted BAM/CRAM/SAM file
      pattern: "*.{bam,cram,sam}"

authors:
  - "@kohjy-ag"
18 changes: 18 additions & 0 deletions tests/modules/humann/main.nf
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { HUMANN } from '../../../modules/humann/main.nf'

workflow test_humann {

    input = [
        [ id:'mpa_v30_CHOCOPhlAn_201901' ], // meta map
        file('https://raw.githubusercontent.com/nf-core/test-datasets/sarek/testdata/umi-dna/qiaseq/SRR7545951-small_1.fastq.gz', checkIfExists: true),
Contributor

I would suggest using params.test_data['homo_sapiens']['illumina']['test_1_fastq_gz'] here instead of the full URL. You can find all test datasets for modules either at https://github.com/nf-core/test-datasets/tree/modules or in conf/test_data.config.

        file('chocophlan', checkIfExists: true),
Contributor

What are these files meant to be?

        file('uniref', checkIfExists: true),
        file('metaphlandb', checkIfExists: true)
    ]

    HUMANN ( input )
}
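The reviewer's suggestion to use the shared test data key instead of a hard-coded URL might look like the sketch below. It assumes params.test_data is populated by the test configuration passed with -c, and it only swaps the FASTQ input. The three database paths and the meta id are kept exactly as in the PR, since the review leaves open what those files are meant to be.

```nextflow
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { HUMANN } from '../../../modules/humann/main.nf'

workflow test_humann {

    input = [
        [ id:'mpa_v30_CHOCOPhlAn_201901' ], // meta map, unchanged from the PR
        // Reads are looked up by key in the shared test data map rather than by full URL
        file(params.test_data['homo_sapiens']['illumina']['test_1_fastq_gz'], checkIfExists: true),
        // Database paths copied as-is from the PR; their intended content is not stated
        file('chocophlan', checkIfExists: true),
        file('uniref', checkIfExists: true),
        file('metaphlandb', checkIfExists: true)
    ]

    HUMANN ( input )
}
```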
5 changes: 5 additions & 0 deletions tests/modules/humann/nextflow.config
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }

}
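The template comments in main.nf require that extra tool parameters come in through task.ext.args. A test config could set them per process along these lines. This is an illustrative sketch only: the --verbose value is just an example argument, and it has an effect only if the module script interpolates $args into the humann command, which the script submitted in this PR does not do.

```nextflow
process {

    publishDir = { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" }

    // Per-process overrides; HUMANN is the process name defined in modules/humann/main.nf
    withName: HUMANN {
        ext.args   = '--verbose'        // example value only, appended to the command via $args
        ext.prefix = { "${meta.id}" }   // same default the module falls back to when unset
    }

}
```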
7 changes: 7 additions & 0 deletions tests/modules/humann/test.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
- name: humann test_humann
  command: nextflow run tests/modules/humann -entry test_humann -c tests/config/nextflow.config
  tags:
    - humann
  files:
    - path: output/humann/versions.yml
Contributor

Remove versions.yml from this file. If you update to the latest version of nf-core, you won't have to delete these entries manually. You can also see that the test is not picking up any other output; is this supposed to happen?

      md5sum: f6ffecc08eaf8ab67253798065a54158