GL fields in output VCF don't sum to 1 #18

Closed
atcg opened this issue Sep 21, 2015 · 2 comments

atcg commented Sep 21, 2015

Hello,
I've been trying to output a VCF file to look at IBD between samples, and I was wondering whether the GLs in the VCF file are scaled in any way. The VCF header says they have been scaled to sum to 1, but they don't (the GP fields do sum to 1).

Here's the command I used:
angsd0.902/angsd -bam bamlist.txt -out OUTPUT -uniqueOnly 1 -only_proper_pairs 1 -remove_bads 1 -doCounts 1 -nThreads 27 -doMaf 1 -doMajorMinor 2 -GL 2 -minMapQ 20 -minQ 30 -dovcf 1 -doPost 2

And here's what my VCF looks like:

##fileformat=VCFv4.2(angsd version)
##FORMAT=<ID=GT,Number=1,Type=Integer,Description="Genotype">
##FORMAT=<ID=GP,Number=G,Type=Float,Description="Genotype Probabilities">
##FORMAT=<ID=PL,Number=G,Type=Float,Description="Phred-scaled Genotype Likelihoods">
##FORMAT=<ID=GL,Number=G,Type=Float,Description="scaled Genotype Likelihoods (these are really llh eventhough they sum to one)">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT ind0 ind1
scaffold_0 1616 . T A . PASS . GP:GL 0.666631,0.333333,0.000035:0.000000,-0.301007,-4.277052 0.666578,0.333333,0.000088:0.000000,-0.300972,-3.876948

Thanks very much for any insight!
Evan

ANGSD commented Sep 22, 2015

OK, that is a documentation bug in the VCF header. I've changed this in the latest commit on GitHub.

The GL values are log-likelihood ratios relative to the most likely genotype, on a log10 scale.
Let me know if you find other issues with the VCF output. I don't think this is the most used part of ANGSD.
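
To make the scaling concrete, here is a minimal Python sketch that takes the two sample fields from the record posted above, converts each log10 ratio back to a linear likelihood with `10 ** GL`, and renormalizes so the three genotypes sum to 1. The helper names `parse_sample` and `gl_to_normalized` are hypothetical and are not part of ANGSD; the sketch assumes a `GP:GL` field layout exactly as in the posted line.

```python
def parse_sample(field):
    """Split a 'GP:GL' sample field into two lists of floats."""
    gp_str, gl_str = field.split(":")
    gp = [float(x) for x in gp_str.split(",")]
    gl = [float(x) for x in gl_str.split(",")]
    return gp, gl


def gl_to_normalized(gl_log10):
    """Turn log10 likelihood ratios (best genotype = 0) into values summing to 1."""
    raw = [10.0 ** g for g in gl_log10]  # linear-scale ratios, best genotype = 1.0
    total = sum(raw)
    return [r / total for r in raw]


# Sample fields copied from the VCF record in this thread.
samples = {
    "ind0": "0.666631,0.333333,0.000035:0.000000,-0.301007,-4.277052",
    "ind1": "0.666578,0.333333,0.000088:0.000000,-0.300972,-3.876948",
}

for name, field in samples.items():
    gp, gl = parse_sample(field)
    norm = gl_to_normalized(gl)
    print(name, "normalized GL:", [round(x, 6) for x in norm], "reported GP:", gp)
```

For this record the renormalized values agree with the reported GP fields to within rounding. That is what one would expect if the posterior was computed with a uniform prior, which is an assumption about `-doPost 2` that this thread does not confirm.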

ANGSD commented Oct 12, 2015

OK, I'm closing this issue. Feel free to reopen if needed.

ANGSD closed this as completed Oct 12, 2015