Options for reporting genotypes. #2

Closed
arq5x opened this Issue Jan 12, 2012 · 7 comments

2 participants

@arq5x

Currently, we can do the following:

>>> for sample in record.samples:
...     print sample['GT']
'1|2'
'2|1'
'2/2'

It would be nice to have a built-in method that looks at the ref and alt alleles and converts the encoded genotypes into DNA alleles (GTS == genotypes using sequence).

>>> for sample in record.samples:
...     print sample['GTS']
'A|C'
'C|A'
'C/C'
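
A minimal sketch of how such a conversion might work, written as a standalone function rather than as part of the parser. The name `genotype_to_bases` is hypothetical, and the REF/ALT values in the demo (REF `G`, ALT `A,C`) are assumed, since the example above does not state them.

```python
import re


def genotype_to_bases(gt, ref, alts):
    """Translate an encoded GT string such as '1|2' into bases such as 'A|C'.

    Hypothetical helper, not part of the library: index 0 is the REF allele,
    indexes 1..n are the ALT alleles in the order they appear in the record.
    """
    alleles = [ref] + list(alts)
    # Split on the phase separator but keep it, so '|' vs '/' is preserved.
    parts = re.split(r"([|/])", gt)
    out = []
    for part in parts:
        if part in ("|", "/", "."):
            # Separators and missing calls pass through unchanged.
            out.append(part)
        else:
            out.append(alleles[int(part)])
    return "".join(out)


# Assumed record: REF=G, ALT=A,C
for gt in ("1|2", "2|1", "2/2"):
    print(genotype_to_bases(gt, "G", ["A", "C"]))
```

Keeping the separator in the output means phased and unphased calls stay distinguishable after conversion.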

Also, an option that returns the standard numeric encoding for genotypes would be useful: hom_ref == 0, het == 1, hom_alt == 2, unknown (./.) == -1. This would make it easy to compute popgen statistics such as HWE and pi_hat, and to run multi-dimensional scaling comparisons.

>>> for sample in record.samples:
...     print sample['GTN']
1
2
0 
-1
etc.
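
The numeric encoding could be sketched roughly like this. The function name `genotype_to_number` is hypothetical, and two choices here are assumptions rather than anything settled in this thread: a call with any missing allele is treated as unknown, and a call with two different non-reference alleles (for example `1|2`) is counted as het.

```python
def genotype_to_number(gt):
    """Map a GT string to 0 (hom_ref), 1 (het), 2 (hom_alt) or -1 (unknown).

    Hypothetical helper, not part of the library.
    """
    # Phasing does not matter for this encoding, so normalise the separator.
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return -1  # assumption: any missing allele makes the call unknown
    if all(a == "0" for a in alleles):
        return 0   # every allele is the reference
    if len(set(alleles)) == 1:
        return 2   # every allele is the same non-reference allele
    return 1       # mixed alleles, including two different ALTs (assumption)


for gt in ("0/0", "0|1", "2/2", "./."):
    print(gt, genotype_to_number(gt))
```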
@jamescasbon
@arq5x

Hi James,

Yeah, the idea of a samples object makes the most sense to me. The default behavior could just mimic the current functionality, but specific methods could be created to return a dict or list of tuples for the scenarios above.
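
Roughly what I have in mind, as a sketch only. The class name `Samples` and the methods `as_tuples` and `as_dict` are made up for illustration and are not taken from the existing code.

```python
class Samples(object):
    """Hypothetical container for the per-sample calls of one record."""

    def __init__(self, names, calls):
        # names: sample names in VCF column order
        # calls: one dict of FORMAT fields per sample, in the same order
        self._names = list(names)
        self._calls = list(calls)

    def __iter__(self):
        # Default behaviour mimics the current list of per-sample dicts.
        return iter(self._calls)

    def __len__(self):
        return len(self._calls)

    def as_tuples(self, field="GT"):
        """Return [(sample_name, value), ...] for one FORMAT field."""
        return [(n, c[field]) for n, c in zip(self._names, self._calls)]

    def as_dict(self, field="GT"):
        """Return {sample_name: value} for one FORMAT field."""
        return dict(self.as_tuples(field))


samples = Samples(["NA00001", "NA00002"], [{"GT": "1|2"}, {"GT": "2/2"}])
for sample in samples:
    print(sample["GT"])
print(samples.as_tuples())
```

Existing loops over `record.samples` would keep working, and the GTS or GTN style conversions could be added as further methods.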

So are you the "official" maintainer of this library now?

@jamescasbon
@jamescasbon pushed a commit that closed this issue Jan 16, 2012
James Casbon: use ordered dict for samples, fixes #2 (bc8c85e)
@jamescasbon
Owner

Oops, wrong issue number in commit. Didn't mean to close, but it appears this cannot be reopened!

@jamescasbon reopened this Jan 16, 2012
@jamescasbon
Owner

I created a branch in which I added a sample object; see issue-2-sample-objects.

Perhaps you can add your method there?

@arq5x

Thanks @jamescasbon, this looks good. I am swamped for the next few days, but I have some existing functions for this in a project I am working on and can make a first pass at this early next week.

@jamescasbon pushed a commit that closed this issue Jan 23, 2012
James Casbon: merge @arq5x's work on call objects. fixes #2,#10 (7dfe3ed)
@jamescasbon
@gotgenes pushed a commit to gotgenes/PyVCF that referenced this issue May 13, 2014
James Casbon: use ordered dict for samples, fixes #2 (1501163)
@gotgenes pushed a commit to gotgenes/PyVCF that referenced this issue May 13, 2014
James Casbon: merge @arq5x's work on call objects. fixes #2,#10 (732e1bb)