Genotypes wrongly displayed in VCF tracks (JBrowse version 1.11.3) #488
We are currently using JBrowse 1.11.3 to display different features of newly assembled genomes. This version does not display heterozygous genotypes properly.
These are the header line and the first record of the VCF:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT D_A007 S_A005 S_A002 D_A010 S_A001 S_A006 S_A008 D_A003 S_B992 S_B991 D_A009
lp23s00001 1254 . G A 5710.69 PASS AC=14;AF=0.636;AN=22;BaseQRankSum=-2.405;DP=315;Dels=0.00;FS=4.126;HaplotypeScore=0.8621;InbreedingCoeff=0.2139;MLEAC=14;MLEAF=0.636;MQ=51.05;MQ0=23;MQRankSum=1.793;QD=21.88;ReadPosRankSum=0.086;EFF=INTERGENIC(MODIFIER||||||||) GT:AD:DP:GQ:PL 1/1:0,20:21:54:693,54,0 0/1:14,16:30:99:545,0,411 0/0:25,0:25:69:0,69,925 1/1:2,23:26:26:768,26,0 0/1:13,8:21:99:198,0,252 0/0:28,0:28:75:0,75,972 0/1:17,18:35:99:483,0,522 1/1:1,33:34:96:1152,96,0 0/1:17,15:32:99:485,0,481 1/1:0,20:21:48:622,48,0 1/1:2,28:30:69:832,69,0
But in the browser the genotype table says (I added pipes to separate columns here):
variant 9 81.82%
Thus, JBrowse counts 4 heterozygous sites but then shows the same genotype for them in the table. The only way to distinguish them is the coverage support for each allele (DP), with S_A001 and S_A005 having support for both in the example. However, the three genotypes are labeled as ref(G) instead of G/A.
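As a cross-check of the counts above, the GT subfields of the record can be tallied by hand. This is only a sketch in plain JavaScript, independent of JBrowse; the variable names are made up for illustration, and the sample fields are copied from the VCF record quoted earlier.

```javascript
// Sample columns copied from the VCF record above (FORMAT is GT:AD:DP:GQ:PL).
var samples = [
  "1/1:0,20:21:54:693,54,0", "0/1:14,16:30:99:545,0,411",
  "0/0:25,0:25:69:0,69,925", "1/1:2,23:26:26:768,26,0",
  "0/1:13,8:21:99:198,0,252", "0/0:28,0:28:75:0,75,972",
  "0/1:17,18:35:99:483,0,522", "1/1:1,33:34:96:1152,96,0",
  "0/1:17,15:32:99:485,0,481", "1/1:0,20:21:48:622,48,0",
  "1/1:2,28:30:69:832,69,0"
];

var counts = { homRef: 0, het: 0, homAlt: 0 };
samples.forEach(function (field) {
  // GT is the first colon-separated subfield; alleles are split on "/" or "|".
  var alleles = field.split(":")[0].split(/[\/|]/);
  if (alleles[0] !== alleles[1]) {
    counts.het++;
  } else if (alleles[0] === "0") {
    counts.homRef++;
  } else {
    counts.homAlt++;
  }
});

// Share of samples carrying at least one non-reference allele.
var variantPct = (100 * (counts.het + counts.homAlt) / samples.length).toFixed(2);
console.log(counts, variantPct + "%");
// { homRef: 2, het: 4, homAlt: 5 } 81.82%
```

The 4 heterozygous calls and the 81.82% variant share agree with the summary JBrowse reports, so the summary counts are right and only the per-sample genotype labels are wrong.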
Do you know how to fix this?
Just in case it gives some more clues:
Thanks in advance,
I see what you mean after looking into it (it's just that the full genotype table doesn't show the alternative alleles correctly, if I understand correctly).
Apparently this code actually worked in JBrowse 1.10.6 but has since been changed to non-working code here
I think this commit was made to account for rendering "multipart" divs inside the genotype table, such as shown in this screenshot
However, the genotype field is of course the real issue. The code for "_mungeGenotypeValue" was introduced in this version but it loses functionality from older code that it was refactoring from. I propose the following fix
Given that I don't think there are going to be "multipart" genotypes (no need to draw little boxes like in my screenshot), I think using "var value_parse = value.values;" is safe. I'm not actually sure what the code was doing before, because it was receiving data such as array.map(["0|0"],...) for example, and then the line "gtIndex = parseInt( gtIndex );" would receive a string like "0|0", which would just return 0, and thus we lose the genotype information.
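The loss of information described here can be reproduced outside JBrowse. A minimal sketch using only the standard `parseInt` function (the sample strings are illustrative, not taken from the JBrowse source):

```javascript
// parseInt stops at the first character that cannot be part of a number,
// so the separator and the second allele index are silently dropped.
var calls = ["0|0", "0/1", "1/1"];
var parsed = calls.map(function (gt) {
  return parseInt(gt, 10);
});
console.log(parsed); // [ 0, 0, 1 ]
```

Note that "0/1" and "0|0" both collapse to 0, which would explain why heterozygous calls end up labeled as the reference allele.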
I think we need to do the "splitting" and "joining" of the string, and I'm not sure why this disappeared from the code, so that's what my proposed fix does!
Thanks for your quick response, Colin,
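For readers without the patch at hand, the split-and-join idea can be sketched roughly as follows. This is not the actual patch and not JBrowse's `_mungeGenotypeValue`; `formatGenotype` and its parameters are hypothetical names chosen for illustration, and the `ref(...)` label simply mimics what the genotype table displays.

```javascript
// Hypothetical helper: turn a GT string such as "0/1" into allele labels.
function formatGenotype(gt, ref, alts) {
  // Keep the original separator: "/" is unphased, "|" is phased.
  var separator = gt.indexOf("|") !== -1 ? "|" : "/";
  return gt
    .split(/[\/|]/)
    .map(function (gtIndex) {
      if (gtIndex === ".") {
        return "."; // missing call
      }
      var i = parseInt(gtIndex, 10);
      // Index 0 is the reference allele; 1..n index into the ALT list.
      return i === 0 ? "ref(" + ref + ")" : alts[i - 1];
    })
    .join(separator);
}

console.log(formatGenotype("0/1", "G", ["A"])); // ref(G)/A
console.log(formatGenotype("1/1", "G", ["A"])); // A/A
console.log(formatGenotype("0|0", "G", ["A"])); // ref(G)|ref(G)
```

Each allele index is parsed separately after splitting, so `parseInt` only ever sees a single number and no call is truncated.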
we applied the patch and it works. Now genotypes look fine, in the example:
Yes, it does. The 11 genotypes are properly reported now. Sorry, I just
After committing this change, it seems that I still have problems showing the correct genotypes: I don't get the genotypes with bases separated by a slash (/). I wonder whether this is the same problem raised here or a different one. The multi-sample VCF I'm trying to visualize looks like this in JBrowse:
Shouldn't those be ref(A)/ref(A), T/T, and ref(A)/ref(A)?