
Options for reporting genotypes. #2

Closed
arq5x opened this Issue · 7 comments

Aaron Quinlan

Currently, we can do the following:

>>> for sample in record.samples:
...     print sample['GT']
'1|2'
'2|1'
'2/2'

It would be nice to have a built-in method that looks at the ref and alt alleles and converts the encoded genotypes into DNA alleles (GTS == genotypes using Sequence).

>>> for sample in record.samples:
...     print sample['GTS']
'A|C'
'C|A'
'C/C'
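
For illustration only, here is a minimal standalone sketch of the GTS conversion described above. The function name `gt_to_bases` and its arguments are hypothetical and are not part of PyVCF; the sketch assumes the genotype is a VCF-style string such as '1|2', that index 0 refers to the ref allele, and that higher indices refer to the alt alleles in order. Missing alleles ('.') are passed through unchanged.

```python
import re


def gt_to_bases(gt, ref, alts):
    """Convert an encoded genotype such as '1|2' into bases such as 'A|C'.

    Hypothetical helper, not a PyVCF API. `ref` is the reference allele and
    `alts` is the list of alternate alleles, in the order given in the record.
    """
    alleles = [ref] + list(alts)
    # The capture group keeps the phasing separators ('|' or '/') in the result.
    parts = re.split(r'([|/])', gt)
    out = []
    for part in parts:
        if part in ('|', '/', '.'):
            # Separators and missing alleles are copied through as-is.
            out.append(part)
        else:
            out.append(alleles[int(part)])
    return ''.join(out)
```

With a ref of 'G' and alts of ['A', 'C'] (values assumed here, since the issue does not state them), `gt_to_bases('1|2', 'G', ['A', 'C'])` gives 'A|C', which matches the first line of the example above.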

Also, an option that returns the standard numeric encoding for genotypes would be useful: hom_ref == 0, het == 1, hom_alt == 2, unknown (./.) == -1. This would make it easy to compute popgen statistics such as HWE and pi_hat, and to run multi-dimensional scaling comparisons.

>>> for sample in record.samples:
...     print sample['GTN']
1
2
0 
-1
etc.
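
As a rough sketch of the GTN encoding described above, the mapping could look like the following. The name `gt_to_number` is hypothetical and not a PyVCF API. Two choices here are assumptions that the issue does not settle: a genotype with two different alt alleles (for example '1|2') is counted as het, and a genotype with any missing allele (for example './1') is counted as unknown.

```python
import re


def gt_to_number(gt):
    """Encode a genotype string as 0 (hom_ref), 1 (het), 2 (hom_alt) or -1.

    Hypothetical helper, not a PyVCF API.
    """
    # Phasing does not matter for this encoding, so drop the separators.
    alleles = re.split(r'[|/]', gt)
    if '.' in alleles:
        # Assumption: any missing allele makes the whole call unknown.
        return -1
    if len(set(alleles)) > 1:
        # Assumption: two different alleles are het, even if both are alts.
        return 1
    # All alleles are identical: reference gives hom_ref, anything else hom_alt.
    return 0 if alleles[0] == '0' else 2
```

Under these assumptions, the three genotypes from the first example ('1|2', '2|1', '2/2') would encode as 1, 1 and 2.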
James Casbon
Owner
Aaron Quinlan

Hi James,

Yeah, the idea of a samples object makes the most sense to me. The default behavior could just mimic the current functionality, but specific methods could be created to return a dict or list of tuples for the scenarios above.

So are you the "official" maintainer of this library now?

James Casbon
Owner
James Casbon jamescasbon closed this issue from a commit
James Casbon use ordered dict for samples, fixes #2 bc8c85e
James Casbon
Owner

Oops, wrong issue number in commit. Didn't mean to close, but it appears this cannot be reopened!

James Casbon
Owner

I created a branch in which I added a sample object; see issue-2-sample-objects.

Perhaps you can add your method there?

Aaron Quinlan

Thanks @jamescasbon, this looks good. I am swamped for the next few days, but I have some existing functions for this in a project I am working on and can make a first pass at this early next week.

James Casbon
Owner
Chris Lasher gotgenes referenced this issue from a commit in gotgenes/PyVCF
James Casbon use ordered dict for samples, fixes #2 1501163
Chris Lasher gotgenes referenced this issue from a commit in gotgenes/PyVCF
James Casbon merge @arq5x's work on call objects. fixes #2,#10 732e1bb