Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for importing dosages as call values #5077

Merged
merged 13 commits into from Jan 14, 2019
Merged

Added support for importing dosages as call values #5077

merged 13 commits into from Jan 14, 2019

Conversation

@tmwong2003
Copy link
Contributor

@tmwong2003 tmwong2003 commented Jan 4, 2019

This PR adds support for optionally loading a dosage values as the call value for genotype loci. This is to support VCF files (such as the ones we use at 23andMe) that have dosages (DS) instead of genotype call information (typically GT).

@tmwong2003 tmwong2003 changed the title Added support for importing dosages into entries Added support for importing dosages as call values Jan 4, 2019
hail/python/hail/ir/matrix_reader.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@tpoterba tpoterba left a comment

I think the ability to load entry floats as float32 is a great change, but I think all the language about "dosage" is a bit use-case-specific and misleading. My proposal would be to add a parameter to import_vcf called entry_float_type or something, which defaults to hl.tfloat64.

This would get passed in as a config param, and would let us do the following in the headerField calculation:

    case (VCFHeaderLineType.Float, false) => entryFloatType

I don't think this will be a huge change from what you have, and will mostly involve deleting code.

hail/python/test/hail/stats/test_linear_mixed_model.py Outdated Show resolved Hide resolved
@tpoterba tpoterba dismissed their stale review Jan 8, 2019

looking again

Copy link
Collaborator

@tpoterba tpoterba left a comment

a few more small changes, and good to go.

hail/python/hail/methods/impex.py Outdated Show resolved Hide resolved
hail/python/test/hail/methods/test_impex.py Outdated Show resolved Hide resolved
hail/python/test/hail/methods/test_statgen.py Outdated Show resolved Hide resolved
hail/python/test/hail/methods/test_statgen.py Outdated Show resolved Hide resolved
hail/src/main/scala/is/hail/io/vcf/LoadVCF.scala Outdated Show resolved Hide resolved
.gitignore Outdated Show resolved Hide resolved
@tpoterba tpoterba self-assigned this Jan 9, 2019
@tpoterba
Copy link
Collaborator

@tpoterba tpoterba commented Jan 10, 2019

looks like a compile error in TestUtils.scala.

The rest looks good, will approve when tests pass

@tpoterba tpoterba dismissed their stale review Jan 10, 2019

addressed

@tmwong2003
Copy link
Contributor Author

@tmwong2003 tmwong2003 commented Jan 10, 2019

looks like a compile error in TestUtils.scala.

The rest looks good, will approve when tests pass

I fixed TestUtils.scala; the issue was a missing parameter in a call to the (changed) MatrixVCFReader.

Should I rebase and squash all of my commits into a single commit, or are you OK with merging my branch with the discrete feature commits (i.e., non-"Merge remote-tracking branch" commits)?

Copy link
Collaborator

@tpoterba tpoterba left a comment

looks good to me!

@danking danking merged commit 8097463 into hail-is:master Jan 14, 2019
1 check passed
@tmwong2003 tmwong2003 deleted the tmwong2003/merge-dosage-patch branch Jan 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants