Added support for importing dosages as call values #5077

Merged
merged 13 commits into from Jan 14, 2019

4 participants
@tmwong2003
Contributor

commented Jan 4, 2019

This PR adds support for optionally loading dosage values as the call value for genotype loci. This supports VCF files (such as the ones we use at 23andMe) that carry dosages (DS) instead of genotype call information (typically GT).
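For readers less familiar with the layout, here is a minimal sketch of the difference between a GT entry and a DS-only entry. It relies only on the VCF convention that FORMAT keys and per-sample values are colon-separated and paired by position. The helper `parse_sample` and the record fragments are hypothetical illustrations, not code or data from this PR.

```python
def parse_sample(format_keys: str, sample: str) -> dict:
    """Pair colon-separated FORMAT keys with one sample's values."""
    return dict(zip(format_keys.split(":"), sample.split(":")))


# Hypothetical record fragments: one with hard calls, one with dosages only.
gt_entry = parse_sample("GT:DP", "0/1:35")
ds_entry = parse_sample("DS", "1.02")

# A dosage-only entry has no GT key to build a call from,
# so the DS value is the only genotype information available.
dosage = float(ds_entry["DS"])
has_gt = "GT" in ds_entry
```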

@tmwong2003 tmwong2003 changed the title Added support for importing dosages into entries Added support for importing dosages as call values Jan 4, 2019

@tpoterba
Collaborator

left a comment

I think the ability to load entry floats as float32 is a great change, but I think all the language about "dosage" is a bit use-case-specific and misleading. My proposal would be to add a parameter to import_vcf called entry_float_type or something, which defaults to hl.tfloat64.

This would get passed in as a config param, and would let us do the following in the headerField calculation:

    case (VCFHeaderLineType.Float, false) => entryFloatType

I don't think this will be a huge change from what you have, and will mostly involve deleting code.
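The proposal above can be modelled in a few lines of plain Python. This is only a sketch of the idea, not Hail's implementation: `HeaderLineType` is a stand-in for htsjdk's `VCFHeaderLineType`, type names are plain strings, and every case other than the `(Float, false)` one is an assumption added to make the example complete. It also assumes the second tuple element marks whether the field is loaded as a call.

```python
from enum import Enum


class HeaderLineType(Enum):
    """Stand-in for htsjdk's VCFHeaderLineType; not the real class."""
    INTEGER = "Integer"
    FLOAT = "Float"
    STRING = "String"


FLOAT_TYPES = {"float32", "float64"}


def entry_field_type(line_type, is_call, entry_float_type="float64"):
    """Pick a storage type for one FORMAT field (type names are strings here)."""
    if entry_float_type not in FLOAT_TYPES:
        raise ValueError(f"unsupported float type: {entry_float_type}")
    if (line_type, is_call) == (HeaderLineType.FLOAT, False):
        # The one case named in the review: the float type comes from configuration.
        return entry_float_type
    # The remaining cases are assumptions added to make the sketch complete.
    if (line_type, is_call) == (HeaderLineType.INTEGER, False):
        return "int32"
    if line_type is HeaderLineType.STRING:
        return "call" if is_call else "str"
    raise ValueError(f"cannot load {line_type.value} field as a call")


default_type = entry_field_type(HeaderLineType.FLOAT, False)
narrow_type = entry_field_type(HeaderLineType.FLOAT, False, entry_float_type="float32")
```

With the default left at 64-bit floats, callers that never set the parameter see no change, while a dosage-heavy import can opt into the narrower type without any dosage-specific wording in the API.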

looking again

@tpoterba
Collaborator

left a comment

a few more small changes, and good to go.

hail/python/hail/methods/impex.py Outdated
hail/python/test/hail/methods/test_impex.py Outdated
hail/python/test/hail/methods/test_statgen.py Outdated
hail/python/test/hail/methods/test_statgen.py Outdated
hail/src/main/scala/is/hail/io/vcf/LoadVCF.scala Outdated
hail/src/main/scala/is/hail/io/vcf/HtsjdkRecordReader.scala Outdated
hail/src/main/scala/is/hail/io/vcf/LoadVCF.scala
.gitignore Outdated

@tpoterba tpoterba self-assigned this Jan 9, 2019

Theodore Wong
@tpoterba


Collaborator

commented Jan 10, 2019

looks like a compile error in TestUtils.scala.

The rest looks good, will approve when tests pass

addressed

@tmwong2003


Contributor Author

commented Jan 10, 2019

> looks like a compile error in TestUtils.scala.
>
> The rest looks good, will approve when tests pass

I fixed TestUtils.scala; the issue was a missing parameter in a call to the (changed) MatrixVCFReader.

Should I rebase and squash all of my commits into a single commit, or are you OK with merging my branch with the discrete feature commits (i.e., non-"Merge remote-tracking branch" commits)?

@tpoterba
Collaborator

left a comment

looks good to me!

@danking danking merged commit 8097463 into hail-is:master Jan 14, 2019

1 check passed

hail-ci-0-1 successful build

@tmwong2003 tmwong2003 deleted the tmwong2003:tmwong2003/merge-dosage-patch branch Jan 14, 2019
