Added support for importing dosages as call values #5077
danking merged 13 commits into hail-is:master from tmwong2003:tmwong2003/merge-dosage-patch
tpoterba
left a comment
I think the ability to load entry floats as float32 is a great change, but I think all the language about "dosage" is a bit use-case-specific and misleading. My proposal would be to add a parameter to import_vcf called entry_float_type or something, which defaults to hl.tfloat64.
This would get passed in as a config param, and would let us do the following in the headerField calculation:
case (VCFHeaderLineType.Float, false) => entryFloatType
I don't think this will be a huge change from what you have, and will mostly involve deleting code.
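A minimal sketch of the proposal above, assuming a hypothetical helper that maps a VCF FORMAT header type to an entry field type. The names `HeaderLineType` and `entry_field_type` are illustrative only, and plain strings stand in for Hail types such as hl.tfloat64; this is not Hail's actual implementation.

```python
from enum import Enum


class HeaderLineType(Enum):
    """Hypothetical stand-in for the VCF header line types."""
    INTEGER = "Integer"
    FLOAT = "Float"
    STRING = "String"
    CHARACTER = "Character"


def entry_field_type(line_type, is_array, entry_float_type="float64"):
    """Pick the type for one entry field.

    entry_float_type mirrors the suggested parameter: it defaults to
    64-bit floats, and callers can opt in to 32-bit floats instead.
    """
    if entry_float_type not in ("float32", "float64"):
        raise ValueError("entry_float_type must be 'float32' or 'float64'")

    scalar = {
        HeaderLineType.INTEGER: "int32",
        # The only case that depends on the new parameter.
        HeaderLineType.FLOAT: entry_float_type,
        HeaderLineType.STRING: "str",
        HeaderLineType.CHARACTER: "str",
    }[line_type]
    return f"array<{scalar}>" if is_array else scalar
```

With this shape, a float field such as DS needs no dosage-specific code path: the caller passes `entry_float_type="float32"` and every scalar or array float entry field follows.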
tpoterba
left a comment
a few more small changes, and good to go.
looks like a compile error in TestUtils.scala. The rest looks good, will approve when tests pass
I fixed it. Should I rebase and squash all of my commits into a single commit, or are you OK with merging my branch with the discrete feature commits (i.e., non-"Merge remote-tracking branch" commits)?
This PR adds support for optionally loading dosage values as the call value for genotype loci. This is to support VCF files (such as the ones we use at 23andMe) that have dosages (DS) instead of genotype call information (typically GT).