VCF header genotype reserved key FT cardinality clobbered by htsjdk #1535

Closed
heuermh opened this Issue May 17, 2017 · 0 comments

heuermh commented May 17, 2017

See test failures at
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2020/

The VCF specification does not say what the FORMAT header line for FT should be, in particular its Number (cardinality). It only describes the field:

FT : sample genotype filter indicating if this genotype was “called” (similar in concept
to the FILTER field). Again, use PASS to indicate that all filters have been passed, a
semi-colon separated list of codes for filters that fail, or ‘.’ to indicate that filters
have not been applied. These values should be described in the meta-information
in the same way as FILTERs (String, no white-space or semi-colons permitted)

htsjdk uses Number=.
https://github.com/samtools/htsjdk/blame/master/src/main/java/htsjdk/variant/vcf/VCFStandardHeaderLines.java#L160

ADAM attempts to use Number=1
https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/org/bdgenomics/adam/converters/DefaultHeaderLines.scala#L153

Our test resources use Number=1
https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/test/resources/random.vcf#L19
https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/test/resources/sorted.vcf#L20
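
To make the mismatch concrete, here is a minimal Python sketch that parses two FT FORMAT header lines and reports which fields differ. The helper name `parse_format_line` and the Description text are placeholders for illustration and are not copied from htsjdk or ADAM; only the Number values come from the links above.

```python
import re

# Matches a VCF FORMAT meta-information line and captures its key=value body.
FORMAT_LINE = re.compile(r'^##FORMAT=<(?P<body>.*)>$')
# Splits the body into fields; quoted values may contain commas.
FIELD = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|[^,]*)')


def parse_format_line(line):
    """Return the fields of a ##FORMAT header line as a dict, or None if it is not one."""
    match = FORMAT_LINE.match(line.strip())
    if match is None:
        return None
    return {key: value.strip('"') for key, value in FIELD.findall(match.group("body"))}


# Placeholder descriptions; only the Number values reflect the issue.
htsjdk_style = '##FORMAT=<ID=FT,Number=.,Type=String,Description="Genotype filter, placeholder">'
adam_style = '##FORMAT=<ID=FT,Number=1,Type=String,Description="Genotype filter, placeholder">'

htsjdk_fields = parse_format_line(htsjdk_style)
adam_fields = parse_format_line(adam_style)

# Collect the keys whose values disagree between the two header lines.
differing = sorted(k for k in htsjdk_fields if htsjdk_fields[k] != adam_fields.get(k))
print(differing)
```

With these two inputs the only differing field is Number, which is the discrepancy the failing tests surface.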

When reading from VCF and writing to VCF, as in VariantContextRDDSuite, Number=1 is retained.

When reading from VCF, writing to Parquet, reading from Parquet, and writing to VCF, as in TransformGenotypesSuite, Number=. is written.

The solution would be to update our DefaultHeaderLines and all the test resources to use Number=. for FT, matching htsjdk.
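
As a rough sketch of the test-resource half of that change, the following Python snippet rewrites the FT FORMAT header line in VCF text to Number=. and leaves every other line alone. The function name `normalize_ft_cardinality` is hypothetical, and the regular expression assumes the header line starts with `ID=FT,Number=` in that order, which is conventional but not guaranteed. The sample header is made up for illustration and is not the content of random.vcf or sorted.vcf.

```python
import re

# Matches the Number value in a ##FORMAT line whose ID is FT.
# Assumes ID comes first and Number second in the header line.
FT_NUMBER = re.compile(r'^(##FORMAT=<ID=FT,Number=)[^,]+(,.*)$')


def normalize_ft_cardinality(vcf_text, number="."):
    """Rewrite the FT FORMAT header line to the given Number; other lines are unchanged."""
    out = []
    for line in vcf_text.splitlines():
        # A function replacement avoids any backreference parsing of `number`.
        out.append(FT_NUMBER.sub(lambda m: m.group(1) + number + m.group(2), line))
    trailing = "\n" if vcf_text.endswith("\n") else ""
    return "\n".join(out) + trailing


# Illustrative header only, not an actual ADAM test resource.
sample = (
    "##fileformat=VCFv4.2\n"
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    '##FORMAT=<ID=FT,Number=1,Type=String,Description="placeholder">\n'
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA00001\n"
)

fixed = normalize_ft_cardinality(sample)
print(fixed)
```

The GT line keeps Number=1, since only the FT line matches the pattern.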

fnothaft closed this in b7762c2 on May 31, 2017

heuermh modified the milestone: 0.23.0 on Jul 22, 2017

heuermh added this to Completed in Release 0.23.0 on Jan 4, 2018
