CatVariant not handling allele "*" in gatk-haplotype-joint #1057

Closed
ghost opened this issue Oct 9, 2015 · 1 comment


ghost commented Oct 9, 2015

I am trying bcbio on an exome trio using joint calling with various callers. My run of gatk-haplotype-joint is failing with the following error.

##### ERROR stack trace 
htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately line number 649: unparsable vcf record with allele *, for input source: /media/jawhite/Data/662-exome-trio/work/joint/gatk-haplotype-joint/trio-bwa-j/1/trio-bwa-j-1_0_15543565.vcf.gz
    at htsjdk.variant.vcf.AbstractVCFCodec.generateException(AbstractVCFCodec.java:793)
    at htsjdk.variant.vcf.AbstractVCFCodec.checkAllele(AbstractVCFCodec.java:578)
    at htsjdk.variant.vcf.AbstractVCFCodec.parseSingleAltAllele(AbstractVCFCodec.java:618)
    at htsjdk.variant.vcf.AbstractVCFCodec.parseAlleles(AbstractVCFCodec.java:548)
    at htsjdk.variant.vcf.AbstractVCFCodec.parseVCFLine(AbstractVCFCodec.java:342)
    at htsjdk.variant.vcf.AbstractVCFCodec.decodeLine(AbstractVCFCodec.java:285)
    at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:263)
    at htsjdk.variant.vcf.AbstractVCFCodec.decode(AbstractVCFCodec.java:60)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.readNextRecord(TabixFeatureReader.java:150)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:183)
    at htsjdk.tribble.TabixFeatureReader$FeatureIterator.next(TabixFeatureReader.java:125)
    at org.broadinstitute.gatk.tools.CatVariants.execute(CatVariants.java:300)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
    at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
    at org.broadinstitute.gatk.tools.CatVariants.main(CatVariants.java:317)
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 3.2-18-g478145d):
##### ERROR

Line 649 does indeed have an ALT entry of A,*, presumably generated by HaplotypeCaller. I looked at the CatVariants code on GitHub, and the current version appears to handle * alleles in ALT. Is this a version problem? I haven't figured out how to upgrade the htsjdk version that bcbio uses.
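
For anyone wanting to confirm which records trigger this kind of parse failure, the following is a minimal sketch in Python that lists records whose ALT column contains a `*` allele. It uses only the standard library and assumes a tab-delimited VCF with ALT in the fifth column; the function names `open_vcf` and `find_star_alleles` are hypothetical helpers, not part of bcbio, GATK, or htsjdk.

```python
import gzip


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    # bgzip output is a series of gzip members, which the gzip module can read.
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


def find_star_alleles(path):
    """Yield (line_number, chrom, pos, alt) for records with '*' among the ALT alleles.

    line_number counts every line in the file, header lines included,
    so it is comparable to the line number in the parser error message.
    """
    with open_vcf(path) as handle:
        for line_number, line in enumerate(handle, start=1):
            if line.startswith("#"):
                continue  # skip meta-information and column header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # not a complete VCF record
            if "*" in fields[4].split(","):
                yield line_number, fields[0], fields[1], fields[4]
```

Calling `list(find_star_alleles(path))` on one of the per-region files under `work/joint/` would show whether the `*` records line up with the line reported in the error. This only locates the records; it does not change how the older parser treats them.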

chapmanb added a commit that referenced this issue Oct 9, 2015
- Fixes required options in CombineVariants for moving to GATK 3.4
  framework #1057
- Handle bgzip inputs to variant validation with renaming.
- Default to use QUAL scores for variant calling for any
  cases where we can (cc @scatreux)
- Avoid CWL tests on standard quick test runs.

chapmanb commented Oct 9, 2015

Sorry about the issue and thanks for reporting it. We package the MIT-licensed parts of the GATK in a separate script/jar to avoid needing to install GATK when running non-variant-calling parts of the GATK code. This package was, as you noticed, out of date and still on version 3.2. I updated it to 3.4-46 to match the latest release, so it should hopefully work cleanly for you now if you upgrade with:

bcbio_nextgen.py upgrade -u development --tools

Hope this gets everything running cleanly for you. Thanks again for the report.
