Filtering of Haplotypecaller VC should only be done for the single-sample route #1025

Closed
amizeranschi opened this issue May 16, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@amizeranschi
Contributor

Description of the bug

This is essentially the same error as one reported recently on the GATK forums:
https://gatk.broadinstitute.org/hc/en-us/community/posts/4412714547611-FilterVariantTranches-Error-no-variants-with-INFO-score-key-CNN-2D-

The fix mentioned on that page was to make sure that the VCF file passed as input to CNNScoreVariants was a proper VCF file, and NOT a GVCF file.

I've checked the file passed as input to FilterVariantTranches in the working directory where the error occurred, and it appears to be a GVCF file.
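
A check of this kind can be sketched in Python. This is a minimal heuristic, assuming GATK-style GVCF markers (`##GVCFBlock` header lines or a `<NON_REF>` symbolic ALT allele); the function name `looks_like_gvcf` and the `max_records` limit are illustrative and are not part of sarek or GATK.

```python
import gzip


def looks_like_gvcf(path, max_records=1000):
    """Heuristically decide whether a VCF file is a GATK-style GVCF.

    Hypothetical helper for illustration only. It returns True if the
    header contains a ##GVCFBlock line or if any of the first
    `max_records` records lists <NON_REF> among its ALT alleles.
    """
    # bgzip-compressed files are valid multi-member gzip streams,
    # so gzip.open can read both .vcf.gz and plain .vcf inputs.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        records_seen = 0
        for line in handle:
            if line.startswith("##"):
                if line.startswith("##GVCFBlock"):
                    return True
                continue
            if line.startswith("#"):
                # The #CHROM column header line carries no evidence.
                continue
            fields = line.rstrip("\n").split("\t")
            # Column 5 (index 4) is ALT; alleles are comma-separated.
            if len(fields) > 4 and "<NON_REF>" in fields[4].split(","):
                return True
            records_seen += 1
            if records_seen >= max_records:
                break
    return False
```

Calling `looks_like_gvcf(...)` on the file staged for FilterVariantTranches in the work directory should return `True` if that file is a GVCF. A `False` result only means that no marker was found in the header or in the records that were scanned.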

The nextflow command that I'm running is:

nextflow run nf-core/sarek -r dev -latest \
--input sample-sheet.csv \
--outdir sarek-test-jointVC \
-profile docker \
--max_time '336.h' \
--max_cpus '64' \
--max_memory '256.GB' \
--fasta ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae.R64-1-1-chr.fa \
--fasta_fai ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae.R64-1-1-chr.fa.fai \
--dict ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae.R64-1-1-chr.dict \
--save_mapped \
--save_output_as_bam \
--vep_cache ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae \
--bwa ${main_dir}/reproducible-sarek-error/bwa-index \
--igenomes_ignore \
--genome null \
--joint_germline \
--tools haplotypecaller,vep \
--dbsnp ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae.vcf.gz \
--dbsnp_tbi ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae.vcf.gz.tbi \
--dbsnp_vqsr '--resource:ensemblvcf,known=false,training=true,truth=true,prior=10.0 Saccharomyces_cerevisiae.vcf.gz' \
--known_indels ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae_indels.vcf.gz \
--known_indels_tbi ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae_indels.vcf.gz.tbi \
--known_indels_vqsr '--resource:ensemblsnps,known=false,training=true,truth=true,prior=10.0 Saccharomyces_cerevisiae_indels.vcf.gz' \
--known_snps ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae_snps.vcf.gz \
--known_snps_tbi ${main_dir}/reproducible-sarek-error/Saccharomyces_cerevisiae_snps.vcf.gz.tbi \
--known_snps_vqsr '--resource:ensemblindels,known=false,training=true,truth=true,prior=10.0 Saccharomyces_cerevisiae_snps.vcf.gz' \
--vep_out_format tab \
--vep_include_fasta

Command used and terminal output

No response

Relevant files

nextflow-error.log.txt

System information

No response

@amizeranschi added the bug label on May 16, 2023
@FriederikeHanssen
Contributor

🤦‍♀️ Good catch. I think a workaround until this is fixed is to add --skip_haplotypecaller_filter

@amizeranschi
Contributor Author

What does --skip_haplotypecaller_filter do, exactly? I couldn't find any mention of it on https://nf-co.re/sarek/dev/parameters

@FriederikeHanssen
Contributor

Ah sorry, my mistake: it is actually --skip_tools haplotypecaller_filter (I was looking in the wrong place myself).

@amizeranschi
Contributor Author

OK, I'll try one run with --skip_tools haplotypecaller_filter for testing purposes. But please check whether you can fix the GVCF issue as well, because we do need variant filtration for this joint-calling use case.

@amizeranschi
Contributor Author

For the record, the pipeline finished successfully with --skip_tools haplotypecaller_filter.

@FriederikeHanssen changed the title from "Error with FilterVariantTranches in joint VC: Bad input: VCF contains no variants or no variants with INFO score key "CNN_1D"" to "Filtering of Haplotypecaller VC should only be done for the single-sample route" on May 24, 2023
@FriederikeHanssen
Contributor

Hey! Let's keep it open until the param thing is fixed or we will forget it :)

@amizeranschi
Contributor Author

Fixed in #1050.
