Pull request overview
This PR cleans up and standardizes the Talos Nextflow pipelines by moving further toward DSL2 + workflow outputs, tightening bash execution behavior across modules, and correcting several workflow/module inconsistencies.
Changes:
- Standardize Nextflow DSL2 usage and workflow outputs (removing `publishDir` usage across many processes).
- Harden module script execution (`set -euo pipefail`), adjust indexing/IO wiring for some VCF flows, and fix a bcftools output mode bug.
- Add a `nextflow_schema.json` for parameter documentation/validation and update docs to reflect `outputDir` usage.
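
As a rough illustration of what the `set -euo pipefail` hardening buys, the sketch below runs the same failing commands with and without strict mode. It is a standalone bash demo, not code from the PR; the variable names and `SOME_UNDEFINED_DEMO_VAR` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical demo, not taken from the PR: compare lenient and strict bash behaviour.

# Without pipefail, a pipeline reports the status of its LAST command (cat),
# so the failure of `false` is silently swallowed.
lenient_status=$(bash -c 'false | cat; echo $?')

# With `set -euo pipefail`, the failing pipeline aborts the script with a
# non-zero status before the echo can run.
strict_status=0
bash -c 'set -euo pipefail; false | cat; echo "not reached"' >/dev/null || strict_status=$?

# The -u flag turns a reference to an unset variable into an error as well.
unset_status=0
bash -c 'set -euo pipefail; echo "$SOME_UNDEFINED_DEMO_VAR"' >/dev/null 2>&1 || unset_status=$?

echo "lenient=${lenient_status} strict=${strict_status} unset=${unset_status}"
```

In a module script block the same line would sit at the top of the bash body, so a failure early in a pipe (for example a decompression step feeding another tool) stops the task instead of producing a truncated output.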
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| talos_only.nf | Adds DSL2/shebang and adjusts how cohort MT inputs are discovered for TALOS-only runs. |
| src/talos/cpg_internal_scripts/cpgflow_jobs/make_config.py | Comment wording/grammar tweak. |
| nextflow_schema.json | Introduces a Nextflow params JSON schema for the pipeline. |
| nextflow/modules/talos/ValidateMOI/main.nf | Removes publishDir; adds stricter bash options. |
| nextflow/modules/talos/UnifiedPanelAppParser/main.nf | Removes publishDir; adds missing script: + stricter bash options. |
| nextflow/modules/talos/StartupChecks/main.nf | Removes publishDir comment; strengthens bash error handling. |
| nextflow/modules/talos/RunHailFiltering/main.nf | Removes publishDir; adds stricter bash options. |
| nextflow/modules/talos/HPOFlagging/main.nf | Removes publishDir; adds missing script: + stricter bash options. |
| nextflow/modules/talos/CreateTalosHTML/main.nf | Removes publishDir; adds stricter bash options. |
| nextflow/modules/prep/ResummariseRawSubmissions/main.nf | Emits VCF+index as a tuple, adds tabix indexing, and hardens cleanup. |
| nextflow/modules/prep/ParseManeIntoJson/main.nf | Adds stricter bash options. |
| nextflow/modules/prep/ParseAlphaMissense/main.nf | Adds stricter bash options. |
| nextflow/modules/prep/MakeClinvarbitrationPm5/main.nf | Adds stricter bash options and makes cleanup tolerant. |
| nextflow/modules/prep/EncodeAlphaMissense/main.nf | Adds stricter bash options. |
| nextflow/modules/prep/DownloadPanelApp/main.nf | Adds stricter bash options. |
| nextflow/modules/prep/DownloadClinVarFiles/main.nf | Adds stricter bash options in shell: block. |
| nextflow/modules/prep/CreateRoiFromGff3/main.nf | Adds stricter bash options; minor comment adjustment. |
| nextflow/modules/prep/ConvertSpliceVarDb/main.nf | Adds stricter bash options. |
| nextflow/modules/prep/AnnotateClinvarWithBcftools/main.nf | Updates input signature to include VCF index; adds stricter bash options. |
| nextflow/modules/annotation/SplitVcf/main.nf | Adds stricter bash options. |
| nextflow/modules/annotation/NormaliseAndRegionFilterVcf/main.nf | Adds stricter bash options and indexes the input VCF. |
| nextflow/modules/annotation/MergeVcfsWithBcftools/main.nf | Adds stricter bash options and fixes bcftools output mode to match .vcf.bgz. |
| nextflow/modules/annotation/AnnotatedVcfIntoMatrixTable/main.nf | Adds stricter bash options; makes checkpoint cleanup tolerant. |
| nextflow/modules/annotation/AnnotateWithEchtvar/main.nf | Adds stricter bash options. |
| nextflow/modules/annotation/AnnotateCsqWithBcftools/main.nf | Adds stricter bash options and indexes before csq. |
| nextflow/annotation.nf | Fixes Channel.fromPath capitalization/usage consistency. |
| nextflow.config | Removes outdir, normalizes resource units to X.GB, and updates process resources. |
| main.nf | Explicitly enables DSL2. |
| docs/NextflowConfiguration.md | Removes outdir docs; documents outputDir instead. |
| README.md | Updates output path documentation to use {outputDir}. |
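
Several rows above mention making cleanup "tolerant". A minimal sketch of why that matters once strict mode is on, assuming a hypothetical intermediate file `missing.tmp` that was never written:

```shell
#!/usr/bin/env bash
# Hypothetical demo, not taken from the PR: cleanup under strict mode.
workdir=$(mktemp -d)

# Plain `rm` on a file that does not exist exits non-zero; under
# `set -euo pipefail` that would fail the whole task during cleanup.
plain_status=0
bash -c 'set -euo pipefail; rm "$1/missing.tmp"' _ "$workdir" 2>/dev/null || plain_status=$?

# `rm -f` ignores nonexistent files and exits 0, so cleanup cannot
# turn an otherwise successful task into a failure.
forced_status=0
bash -c 'set -euo pipefail; rm -f "$1/missing.tmp"' _ "$workdir" || forced_status=$?

rmdir "$workdir"
echo "rm=${plain_status} rm -f=${forced_status}"
```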
      .map { row -> tuple(
          row.cohort,
    -     files("${params.outdir}/${row.cohort}_outputs/*.mt", type: 'dir'),
    +     Channel.fromPath("${outputDir}/${row.cohort}_outputs/*.mt", checkIfExists: true),
`Channel.fromPath(...)` inside the `.map { row -> tuple(...) }` produces a nested channel, but downstream TALOS inputs expect a concrete path/collection (e.g., the list of `*.mt` directories), not a `Channel`. This will break channel typing/tuple structure at runtime. Build the `mts` value as a path collection (e.g., via `files(...)` with `type: 'dir'`) and avoid creating a new channel per row; also prefer `workflow.outputDir` here (the bare `outputDir` identifier is not consistently used elsewhere in this script).
Suggested change

    -     Channel.fromPath("${outputDir}/${row.cohort}_outputs/*.mt", checkIfExists: true),
    +     files("${workflow.outputDir}/${row.cohort}_outputs/*.mt", type: 'dir'),
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Fixes
- `nextflow.enable.dsl=2` etc.)
- `"1 GB"` instead of `1.GB` - same result, but through string parsing instead of plain Groovy
- `set -euo pipefail` added to all script blocks),
- `rm -f` to prevent failure if files (which are definitely written) stop being written
- `script:` directives to some files

Proposed Changes
Checklist