-
Notifications
You must be signed in to change notification settings - Fork 384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implementation of goleft/indexcov #1312
base: dev
Are you sure you want to change the base?
Conversation
Candidate release 3.3.1
FEAT: Release Canditate 3.3.2
|
label 'process_single' | ||
|
||
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? | ||
'https://depot.galaxyproject.org/singularity/goleft:0.2.4--h9ee0642_1': | ||
'biocontainers/goleft:0.2.4--h9ee0642_1' }" | ||
|
||
input: | ||
val(meta) | ||
path(bams) | ||
path(fasta) | ||
path(fai) | ||
output: | ||
path("${prefix}/*"),emit:output | ||
path("versions.yml"),emit:versions | ||
script: | ||
def args = task.ext.args ?: '' | ||
prefix = task.ext.prefix ?: "${meta.id}" | ||
def input_files = bams.findAll{it.name.endsWith(".bam")} + bams.findAll{it.name.endsWith(".crai")} | ||
def extranormalize = input_files.any{it.name.endsWith(".crai")} ? " --extranormalize " : "" | ||
""" | ||
goleft indexcov \\ | ||
--fai "${fai}" \\ | ||
--directory "${prefix}" \\ | ||
${extranormalize} \\ | ||
$args \\ | ||
${input_files.join(" ")} | ||
|
||
cat <<-END_VERSIONS > versions.yml | ||
"${task.process}": | ||
goleft: \$(goleft --version 2>&1 | head -n 1 | sed 's/^.*goleft Version: //') | ||
END_VERSIONS | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you indent so it is similar to the other modules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
How about combining goleft.indexcov.config and indexcov_reindex.config since they belong to the same tool?
-
Would it be possible to add the modules upstream to nf-core/modules so everyone can use them?
-
Missing some output docs
Looks super good already! 🚀 ❤️
|
||
input: | ||
val(meta) | ||
path(bams) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the tool run on all bams for all samples in one job?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nevermind I see it further down. The docs say 30sec per 30 WGS 🤯 . Any idea how it scales with number of genomes?
nextflow_schema.json
Outdated
@@ -100,7 +100,7 @@ | |||
"fa_icon": "fas fa-toolbox", | |||
"description": "Tools to use for duplicate marking, variant calling and/or for annotation.", | |||
"help_text": "Multiple tools separated with commas.\n\n**Variant Calling:**\n\nGermline variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit\n\nTumor-only somatic variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, mpileup, Mutect2, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit, ControlFREEC\n\nSomatic variant calling can currently only be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, Mutect2, Strelka2\n- Structural variants: Manta, TIDDIT\n- Copy-Number: ASCAT, CNVKit, Control-FREEC\n- Microsatellite Instability: MSIsensorpro\n\n> **NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.", | |||
"pattern": "^((ascat|bcfann|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|sentieon_dnascope|sentieon_haplotyper|manta|merge|mpileup|msisensorpro|mutect2|sentieon_dedup|snpeff|strelka|tiddit|vep)?,?)*(?<!,)$" | |||
"pattern": "^((ascat|bcfann|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|sentieon_dnascope|sentieon_haplotyper|manta|merge|mpileup|msisensorpro|mutect2|sentieon_dedup|snpeff|strelka|tiddit|vep|indexcov)?,?)*(?<!,)$" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need alphabetical sorting or Maxime won't let it pass 😄
workflow BAM_GOLEFT_INDEXCOV { | ||
take: | ||
bam // channel: [mandatory] [ meta, bam, bai ] | ||
fasta // channel: [mandatory] [ fasta ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fasta // channel: [mandatory] [ fasta ] | |
fasta // channel: [mandatory] [ fasta ] |
Hi, thank you for the review. FYI I currently don't have the time to work on this. I'll fix things later. |
No worries, we're getting a release ready, so this one will only be included in the next one |
This is a preliminary PR to implement goleft/indexcov in sarek: https://nfcore.slack.com/archives/CE6SDEDAA/p1696931394165659 . For now I haven't looked at the indentation, syntax, doc, etc...
Basically the BAMs are re-indexed (see below) and collected into indexcov. The output of indexcov is a set of png, html, bed.gz, etc... files.
NEW FILE:
modules/local/samtools_index_for_indexcov/main.nf
instead of using the original bam files, I choose to re-index the bam, ignoring the unmapped, secondary, etc... It's quite fast as we only need a BAM header and a BAM index.NEW FILE:
modules/local/goleft/indexcov/main.nf
the process for indexcov itself. I know I should have used the nf-core/module I designed but somehow it was easier for me to handle the 'input:' from sarek. Also I don't know what is meta.id' in that context.NEW_FILE:
subworkflows/local/bam_goleft_indexcov/main.nf
the glue for the two processes above. I don't like that id:"PREFIX".NEW FILE:
modules/goleft.indexcov.config
a config for goleft/indexcov . I cannot make it work: the publishDir, the files are currently not copied intoresults
NEW FILE:
modules/indexcov_reindex.config
a config for re-indexing the bamMODIFIED:
subworkflows/local/bam_variant_calling_germline_all/main.nf
. I only added the subworkflow for indexcov here, just after manta. I'm not sure it's the best place to insert it.MODIFIED:
conf/test_full_germline.config
. I added indexcov to the list of toolsMODIFIED: nextflow_schema.json, same, I added indexcov in the pattern.MODIFIED: nextflow.config to include the configs above.
PR checklist
nf-core lint
).nextflow run . -profile test,docker --outdir <OUTDIR>
).docs/usage.md
is updated.docs/output.md
is updated.CHANGELOG.md
is updated.README.md
is updated (including new tool citations and authors/contributors).