contig header lines are malformed and break GATK VariantAnnotator #74

Closed
ian1roberts opened this Issue Sep 24, 2012 · 1 comment

Collaborator

ian1roberts commented Sep 24, 2012

Apologies if this issue has already been dealt with in a more recent version ...

When trying to add effects generated by snpEff to a PyVCF-filtered file using GATK VariantAnnotator, I receive the following error:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A GATK RUNTIME ERROR has occurred (version 1.6-13-g91f02df):
##### ERROR
##### ERROR Please visit the wiki to see if this is a known problem
##### ERROR If not, please post the error, with stack trace, to the GATK forum
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Invalid VCFSimpleHeaderLine: key=contig name=null
##### ERROR ------------------------------------------------------------------------------------------

It appears to be the result of improperly written contig header lines.

PyVCF writes each contig entry as a Python dictionary:

 (amp)[ir210@beast vars]$ grep 'contig' gatk_run2_vars.vcf
##contig={'length': '249250621', 'assembly': 'hg19', 'ID': 'chr1'}
##contig={'length': '135534747', 'assembly': 'hg19', 'ID': 'chr10'}
##contig={'length': '135006516', 'assembly': 'hg19', 'ID': 'chr11'}

but they should take the following form:

(amp)[ir210@beast vars]$ grep 'contig' gatk_t1-t2_raw.vcf
##contig=<ID=chr1,length=249250621,assembly=hg19>
##contig=<ID=chr10,length=135534747,assembly=hg19>
##contig=<ID=chr11,length=135006516,assembly=hg19>
##contig=<ID=chr11_gl000202_random,length=40103,assembly=hg19>
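
To illustrate the difference between the two forms, here is a minimal sketch of how a dictionary of contig fields could be rendered in the angle-bracket form. The function name `format_contig_line` is hypothetical and is not part of PyVCF; the sketch assumes every contig entry carries an `ID` key and that no value needs quoting.

```python
def format_contig_line(fields):
    """Render a mapping of contig metadata as a VCF ##contig header line.

    Hypothetical helper for illustration only; not PyVCF's API.
    Raises KeyError if the mapping has no 'ID' entry.
    """
    # Emit ID first, as in the GATK-written header, then the other keys
    # in the order the mapping yields them.
    keys = ["ID"] + [key for key in fields if key != "ID"]
    body = ",".join("{0}={1}".format(key, fields[key]) for key in keys)
    return "##contig=<{0}>".format(body)


contig = {"length": "249250621", "assembly": "hg19", "ID": "chr1"}
print(format_contig_line(contig))
```

On Python 3.7 or later, where dictionaries preserve insertion order, this prints the first line of the expected header shown above.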
Owner

jamescasbon commented Sep 25, 2012

That looks like a regression to me, fancy submitting a test case?
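
One possible shape for such a test, sketched without relying on PyVCF's API: scan the header lines of a written file and flag any `##contig` line that is not in the `<key=value,...>` form. The name `malformed_contig_lines` and the regular expression are assumptions made for illustration. The pattern is deliberately simplified: it expects `ID` as the first key and does not handle quoted values.

```python
import re

# Simplified pattern (an assumption, not the full VCF grammar):
# "##contig=<ID=...>" followed by optional ",key=value" pairs.
CONTIG_LINE = re.compile(r"^##contig=<ID=[^,>]+(,[^=,>]+=[^,>]+)*>$")


def malformed_contig_lines(lines):
    """Return the ##contig header lines that do not match CONTIG_LINE."""
    bad = []
    for line in lines:
        stripped = line.rstrip("\n")
        if stripped.startswith("##contig=") and not CONTIG_LINE.match(stripped):
            bad.append(stripped)
    return bad


header = [
    "##fileformat=VCFv4.1\n",
    "##contig={'length': '249250621', 'assembly': 'hg19', 'ID': 'chr1'}\n",
    "##contig=<ID=chr10,length=135534747,assembly=hg19>\n",
]
print(malformed_contig_lines(header))
```

A regression test could write a VCF with the writer, read the output back as text, and assert that this function returns an empty list.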

@jamescasbon jamescasbon pushed a commit that referenced this issue Jan 2, 2013

James Casbon Merge pull request #81 from chapmanb/master
Correctly format contig output lines from writer, making output VCFs compatible with GATK. Fixes #74
9ca7798

@gotgenes gotgenes pushed a commit to gotgenes/PyVCF that referenced this issue May 13, 2014

@chapmanb chapmanb Correctly format contig output lines from writer, making output VCFs …
…compatible with GATK. Fixes #74
ead9a86

@gotgenes gotgenes pushed a commit to gotgenes/PyVCF that referenced this issue May 13, 2014

James Casbon Merge pull request #81 from chapmanb/master
Correctly format contig output lines from writer, making output VCFs compatible with GATK. Fixes #74
0292557