Workflow output definition #275

Closed
wants to merge 17 commits into from
28 changes: 28 additions & 0 deletions assets/schema_mappings.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
+$schema: 'http://json-schema.org/draft-07/schema'
+$id: 'https://raw.githubusercontent.com/nf-core/fetchngs/master/assets/schema_mappings.yml'
+title: 'nf-core/fetchngs pipeline - id_mappings.csv schema'
+description: 'Schema for the mappings file produced by fetchngs'
+type: array
+items:
+  type: object
+  properties:
+    sample:
+      type: string
+    experiment_accession:
+      type: string
+    run_accession:
+      type: string
+    sample_accession:
+      type: string
+    experiment_alias:
+      type: string
+    run_alias:
+      type: string
+    sample_alias:
+      type: string
+    experiment_title:
+      type: string
+    sample_title:
+      type: string
+    sample_description:
+      type: string
81 changes: 81 additions & 0 deletions assets/schema_samplesheet.yml
@@ -0,0 +1,81 @@
+$schema: 'http://json-schema.org/draft-07/schema'
+$id: 'https://raw.githubusercontent.com/nf-core/fetchngs/master/assets/schema_samplesheet.yml'
+title: 'nf-core/fetchngs pipeline - samplesheet.csv schema'
+description: 'Schema for the samplesheet file produced by fetchngs'
+type: array
+items:
+  type: object
+  properties:
+    sample:
+      type: string
+    fastq_1:
+      type: string
+      format: file-path
+      pattern: '^\S+\.f(ast)?q\.gz$'
+    fastq_2:
+      type: string
+      format: file-path
+      pattern: '^\S+\.f(ast)?q\.gz$'
+    run_accession:
+      type: string
+    experiment_accession:
+      type: string
+    sample_accession:
+      type: string
+    secondary_sample_accession:
+      type: string
+    study_accession:
+      type: string
+    secondary_study_accession:
+      type: string
+    submission_accession:
+      type: string
+    run_alias:
+      type: string
+    experiment_alias:
+      type: string
+    sample_alias:
+      type: string
+    study_alias:
+      type: string
+    library_layout:
+      type: string
+    library_selection:
+      type: string
+    library_source:
+      type: string
+    library_strategy:
+      type: string
+    library_name:
+      type: string
+    instrument_model:
+      type: string
+    instrument_platform:
+      type: string
+    base_count:
+      type: integer
+    read_count:
+      type: integer
+    tax_id:
+      type: string
+    scientific_name:
+      type: string
+    sample_title:
+      type: string
+    experiment_title:
+      type: string
+    study_title:
+      type: string
+    sample_description:
+      type: string
+    fastq_md5:
+      type: string
+      pattern: '^[0-9a-f]{32}$'
+    fastq_bytes:
+      type: integer
+    fastq_ftp:
+      type: string
+    fastq_galaxy:
+      type: string
+    fastq_aspera:
+      type: string
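To make the intent of the `fastq_1`/`fastq_2` and `fastq_md5` patterns concrete, here is a minimal Python sketch that applies them to a single samplesheet row. The patterns are written as a regex engine would see them (single backslashes). The function name `check_row` and the sample values are hypothetical and not part of fetchngs; a real validator would read the schema file rather than hard-code the patterns.

```python
import re

# Patterns taken from the samplesheet schema, written with single backslashes
# (i.e. as the regex engine is meant to interpret them).
FASTQ_PATTERN = re.compile(r"^\S+\.f(ast)?q\.gz$")
MD5_PATTERN = re.compile(r"^[0-9a-f]{32}$")

# Hypothetical helper, not part of fetchngs.
def check_row(row):
    """Return the names of the fields in `row` that violate their pattern."""
    checks = {
        "fastq_1": FASTQ_PATTERN,
        "fastq_2": FASTQ_PATTERN,
        "fastq_md5": MD5_PATTERN,
    }
    bad = []
    for field, pattern in checks.items():
        value = row.get(field, "")
        # Empty or missing values are skipped; only present values are checked.
        if value and not pattern.match(value):
            bad.append(field)
    return bad


# Illustrative rows (made-up file names and checksum).
good_row = {
    "fastq_1": "SAMPLE_1.fastq.gz",
    "fastq_2": "SAMPLE_2.fq.gz",
    "fastq_md5": "d41d8cd98f00b204e9800998ecf8427e",
}
bad_row = {"fastq_1": "reads.fastq", "fastq_md5": "XYZ"}
```

Calling `check_row(good_row)` returns an empty list, while `check_row(bad_row)` flags both fields, because the file name lacks the `.gz` suffix and the checksum is not 32 lowercase hexadecimal characters.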
6 changes: 0 additions & 6 deletions conf/base.config
@@ -14,12 +14,6 @@ process {
     memory = { check_max( 6.GB * task.attempt, 'memory' ) }
     time = { check_max( 4.h * task.attempt, 'time' ) }

-    publishDir = [
-        path: { "${params.outdir}/${task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()}" },
-        mode: params.publish_dir_mode,
-        saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
-    ]
-
     errorStrategy = { task.exitStatus in ((130..145) + 104) ? 'retry' : 'finish' }
     maxRetries = 1
     maxErrors = '-1'
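For readers unfamiliar with the default `publishDir` path that this hunk removes from conf/base.config, the following Python sketch mirrors what the Groovy expression `task.process.tokenize(':')[-1].tokenize('_')[0].toLowerCase()` computes. The function name and the example process names are assumptions chosen for illustration; the sketch only models the string manipulation, not Nextflow itself.

```python
# Hypothetical helper that models the removed Groovy closure in Python.
def default_publish_subdir(process_name):
    """Take the last ':'-separated part, then its first '_'-separated token, lowercased."""
    # Groovy's tokenize() drops empty tokens, so empty strings are filtered out here too.
    simple_name = [part for part in process_name.split(":") if part][-1]
    first_token = [tok for tok in simple_name.split("_") if tok][0]
    return first_token.lower()


# Illustrative, assumed process names.
examples = {
    "NFCORE_FETCHNGS:SRA:SRA_FASTQ_FTP": default_publish_subdir("NFCORE_FETCHNGS:SRA:SRA_FASTQ_FTP"),
    "ASPERA_CLI": default_publish_subdir("ASPERA_CLI"),
}
```

Under these assumptions, a process called `SRA_FASTQ_FTP` would have defaulted to an `sra` subdirectory of `params.outdir` unless a module-level `publishDir` overrode it.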
6 changes: 6 additions & 0 deletions main.nf
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,7 @@
*/

nextflow.enable.dsl = 2
nextflow.preview.output = true

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expand Down Expand Up @@ -83,6 +84,11 @@ workflow {
)
}

output {
directory params.outdir
mode params.publish_dir_mode
}

/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
THE END
Expand Down
12 changes: 0 additions & 12 deletions modules/local/aspera_cli/nextflow.config
Original file line number Diff line number Diff line change
@@ -1,17 +1,5 @@
process {
withName: 'ASPERA_CLI' {
ext.args = '-QT -l 300m -P33001'
publishDir = [
[
path: { "${params.outdir}/fastq" },
mode: params.publish_dir_mode,
pattern: "*.fastq.gz"
],
[
path: { "${params.outdir}/fastq/md5" },
mode: params.publish_dir_mode,
pattern: "*.md5"
]
]
}
}
9 changes: 0 additions & 9 deletions modules/local/multiqc_mappings_config/nextflow.config

This file was deleted.

12 changes: 0 additions & 12 deletions modules/local/sra_fastq_ftp/nextflow.config
Original file line number Diff line number Diff line change
@@ -1,17 +1,5 @@
process {
withName: 'SRA_FASTQ_FTP' {
ext.args = '-t 5 -nv -c -T 60'
publishDir = [
[
path: { "${params.outdir}/fastq" },
mode: params.publish_dir_mode,
pattern: "*.fastq.gz"
],
[
path: { "${params.outdir}/fastq/md5" },
mode: params.publish_dir_mode,
pattern: "*.md5"
]
]
}
}
8 changes: 0 additions & 8 deletions modules/local/sra_ids_to_runinfo/nextflow.config

This file was deleted.

9 changes: 0 additions & 9 deletions modules/local/sra_runinfo_to_ftp/nextflow.config

This file was deleted.

8 changes: 0 additions & 8 deletions modules/local/sra_to_samplesheet/nextflow.config

This file was deleted.

5 changes: 0 additions & 5 deletions modules/nf-core/sratools/fasterqdump/nextflow.config

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 0 additions & 8 deletions modules/nf-core/sratools/prefetch/nextflow.config

This file was deleted.

16 changes: 16 additions & 0 deletions output.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
$schema: 'http://json-schema.org/draft-07/schema'
$id: 'https://raw.githubusercontent.com/nf-core/fetchngs/master/output.yml'
title: 'nf-core/fetchngs pipeline outputs'
description: ''
type: object
properties:
id_mappings:
type: string
format: file-path
mimetype: text/csv
schema: assets/schema_mappings.yml
samplesheet:
type: string
format: file-path
mimetype: text/csv
schema: assets/schema_samplesheet.yml

18 changes: 15 additions & 3 deletions workflows/sra/main.nf
@@ -123,6 +123,7 @@ workflow SRA {
             .fastq
             .mix(SRA_FASTQ_FTP.out.fastq)
             .mix(FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS.out.reads)
+            .tap { ch_fastq }
             .map {
                 meta, fastq ->
                     def reads = fastq instanceof List ? fastq.flatten() : [ fastq ]
@@ -153,7 +154,7 @@ workflow SRA {
             .map { it[1] }
             .collectFile(name:'tmp_samplesheet.csv', newLine: true, keepHeader: true, sort: { it.baseName })
             .map { it.text.tokenize('\n').join('\n') }
-            .collectFile(name:'samplesheet.csv', storeDir: "${params.outdir}/samplesheet")
+            .collectFile(name:'samplesheet.csv')
             .set { ch_samplesheet }

         SRA_TO_SAMPLESHEET
@@ -162,7 +163,7 @@ workflow SRA {
             .map { it[1] }
             .collectFile(name:'tmp_id_mappings.csv', newLine: true, keepHeader: true, sort: { it.baseName })
             .map { it.text.tokenize('\n').join('\n') }
-            .collectFile(name:'id_mappings.csv', storeDir: "${params.outdir}/samplesheet")
+            .collectFile(name:'id_mappings.csv')
             .set { ch_mappings }

         //
@@ -181,14 +182,25 @@ workflow SRA {
     // Collate and save software versions
     //
     softwareVersionsToYAML(ch_versions)
-        .collectFile(storeDir: "${params.outdir}/pipeline_info", name: 'nf_core_fetchngs_software_mqc_versions.yml', sort: true, newLine: true)
+        .collectFile(name: 'nf_core_fetchngs_software_mqc_versions.yml', sort: true, newLine: true)
+        .set { ch_versions_yml }

     emit:
     samplesheet = ch_samplesheet
     mappings = ch_mappings
     sample_mappings = ch_sample_mappings_yml
     sra_metadata = ch_sra_metadata
     versions = ch_versions.unique()
+
+    publish:
+    ch_fastq >> 'fastq/'
+    ASPERA_CLI.out.md5 >> 'fastq/md5/'
+    SRA_FASTQ_FTP.out.md5 >> 'fastq/md5/'
+    SRA_RUNINFO_TO_FTP.out.tsv >> 'metadata/'
+    ch_versions_yml >> 'pipeline_info/'
+    ch_samplesheet >> 'samplesheet/'
+    ch_mappings >> 'samplesheet/'
+    ch_sample_mappings_yml >> 'samplesheet/'
Review comment from the author on lines +195 to +203:

By default, the "topic" name is used as the publish path, which makes fetchngs really simple. There is no need to define any rules in the output DSL, just the base directory and publish mode; all of these channels are then published exactly as written above.

These names don't have to be paths, they can also be arbitrary names which you would then use in the output DSL to customize publish options for that name. I'll demonstrate this with rnaseq. You can think of the names as "topics" if you want, but at this point I'm not even using topics under the hood because they aren't necessary.
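The default described in this comment can be modelled in a few lines. The Python sketch below is only an illustration of the stated behaviour (the target name doubles as the publish path unless a rule overrides it); it is not how Nextflow implements it. The function `resolve_publish_dir`, the `rules` mapping and the `star_index` name are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical model of the behaviour described in the comment, not Nextflow code.
def resolve_publish_dir(base_dir, target, rules=None):
    """Return the directory where a channel published to `target` would land.

    base_dir: the base output directory (e.g. the value of params.outdir)
    target:   the name on the right-hand side of `>>` in the publish section
    rules:    optional mapping of target name -> {"path": ...}, standing in for
              per-name customization in the output DSL (assumed shape)
    """
    rules = rules or {}
    # With no rule for this name, the name itself is used as the relative path.
    sub_path = rules.get(target, {}).get("path", target)
    return str(PurePosixPath(base_dir) / sub_path)


# Path-like names are published as written.
md5_dir = resolve_publish_dir("results", "fastq/md5/")

# An arbitrary name can be mapped to a path by a rule (illustrative values).
index_dir = resolve_publish_dir(
    "results", "star_index", rules={"star_index": {"path": "genome/index"}}
)
```

In this model, fetchngs needs no rules at all because every target name in the `publish:` section is already the desired relative path.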

 }

 /*
5 changes: 0 additions & 5 deletions workflows/sra/nextflow.config
Original file line number Diff line number Diff line change
@@ -1,8 +1,3 @@
includeConfig "../../modules/local/multiqc_mappings_config/nextflow.config"
includeConfig "../../modules/local/aspera_cli/nextflow.config"
includeConfig "../../modules/local/sra_fastq_ftp/nextflow.config"
includeConfig "../../modules/local/sra_ids_to_runinfo/nextflow.config"
includeConfig "../../modules/local/sra_runinfo_to_ftp/nextflow.config"
includeConfig "../../modules/local/sra_to_samplesheet/nextflow.config"
includeConfig "../../modules/nf-core/sratools/prefetch/nextflow.config"
includeConfig "../../subworkflows/nf-core/fastq_download_prefetch_fasterqdump_sratools/nextflow.config"