Mixed Diploid / Haploid calls result in incorrect zygosity call #227
Comments
yes, a lot of the features like |
I just got bitten by this too. It looks like a lot of things are broken for mixed ploidy. htslib/vcf.h has this to say about
However, cyvcf uses "ndst" instead of the returned value here: Lines 1354 to 1356 in 036b58f
and here: Line 1489 in 036b58f
Curiously, this means I guess it would be easy to fix these cases. But at first glance it seems that fixing |
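For anyone following along, here is a minimal pure-Python sketch of the layout that `bcf_get_genotypes` fills in, as I read it from htslib/vcf.h: the per-sample width is the returned count divided by the number of samples, each allele is stored as `(index + 1) << 1 | phased`, and samples with lower ploidy are padded with `bcf_int32_vector_end`. The name `decode_genotypes` and the example buffer are hypothetical; this is not cyvcf2's code.

```python
# Sentinel value mirroring htslib/vcf.h (INT32_MIN + 1).
BCF_INT32_VECTOR_END = -2147483647


def decode_genotypes(gt_arr, n_returned, n_samples):
    """Decode a flat htslib-style GT buffer into per-sample (alleles, phased) pairs.

    n_returned is the value returned by the call, NOT the buffer capacity.
    A missing allele is reported as None.
    """
    max_ploidy = n_returned // n_samples
    samples = []
    for i in range(n_samples):
        alleles, phased = [], []
        for value in gt_arr[i * max_ploidy:(i + 1) * max_ploidy]:
            if value == BCF_INT32_VECTOR_END:
                break  # this sample has lower ploidy than the widest one
            if (value >> 1) == 0:
                alleles.append(None)  # same test as bcf_gt_is_missing
            else:
                alleles.append((value >> 1) - 1)  # same as bcf_gt_allele
            phased.append(bool(value & 1))
        samples.append((alleles, phased))
    return samples


# GTs "0/1" and "1" for two samples; the two trailing zeros stand in for
# unused capacity left over in a reused buffer (an assumption for illustration).
buffer = [2, 4, 4, BCF_INT32_VECTOR_END, 0, 0]
decoded = decode_genotypes(buffer, n_returned=4, n_samples=2)
print(decoded)
```

Dividing the buffer capacity (6 here) by the sample count instead of the returned count (4) would give a per-sample width of 3 and misalign the second sample, which is the kind of mix-up described above.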
@grahamgower , thanks for pointing this out. I'll make these changes now and see if I can get Variant.genotype.array() fixed as well. |
with the changes recommended by @grahamgower along with a fix in the helper function and these genotypes shown in original comment:
I now get:
for I have pushed that fix and will make a release soon. Would be good if either/both of you could verify this matches what you expect. I'd also recommend using the |
btw, I left this open in case you uncover more issues. Given test-cases as above, I think they can be fixed. |
Thanks a bunch @brentp! So to be clear, which features should I avoid if I want to support ploidy>2 in my application? |
for ploidy > 2, avoid:
The gt_ and num_ attributes like those were added since cyvcf2 was (as the name implies) copying the API of cyvcf which was in cython, but did not use htslib. |
Perfect, thanks again @brentp. And to answer your earlier question...
Yes, this matches my expectations. |
Hi, thanks for the very quick fix, still an issue I think: 3 GTs of "0/1 0 0/1" (strict_gt=True) gives me:
Expected num_hom_ref = 1 |
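To spell out that expectation, a small pure-Python sketch that classifies each GT string on its own, so a haploid `0` counts as HOM_REF whatever the ploidy of its neighbours. This is not cyvcf2's implementation; `classify_gt` and `count_zygosity` are hypothetical names, and treating any `.` allele as UNKNOWN is an assumption of the sketch.

```python
import re
from collections import Counter


def classify_gt(gt):
    """Classify one VCF GT string such as "0/1", "1", "./." or "0|0|1"."""
    alleles = re.split(r"[/|]", gt)
    if any(a == "." for a in alleles):
        return "UNKNOWN"  # assumption: any missing allele makes the call unknown
    if len(set(alleles)) > 1:
        return "HET"
    return "HOM_REF" if alleles[0] == "0" else "HOM_ALT"


def count_zygosity(gts):
    """Tally zygosity classes across the samples of one VCF record."""
    return Counter(classify_gt(gt) for gt in gts)


counts = count_zygosity(["0/1", "0", "0/1"])
print(counts["HOM_REF"], counts["HET"], counts["HOM_ALT"], counts["UNKNOWN"])
```

For the three GTs above this gives one HOM_REF and two HET, with nothing unknown.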
PS hello @grahamgower long time no see! |
Hey @davmlaw! Sorry, I didn't even notice it was your issue! I see you're still going strong with variantgrid. :) |
Thanks for following up. I just pushed 8c41b9c so that the variant with genotypes:
and code:
now gives:
note that |
Hi, works for me, just ran it through the VCF I had problems with before and everything was as I expected. Happy to close the issue if you are |
This is fixed in v0.30.13. Thanks for reporting and providing the test-cases! |
DRAGEN 05.021.604.3.7.7 produced a VCF with a genotype call of "1" on chrX for a male. This seems valid - the VCF spec says "Haploid calls, e.g. on Y, male non-pseudoautosomal X, or mitochondrion, are indicated by having only one allele value"
CyVCF2 doesn't seem to handle this if the VCF row contains a mix of diploid / haploid calls (eg chrX record joint called with a Male/Female)
Attached a minimal version to reproduce below:
test.vcf.gz
Outputs:
Both Haploid works - recognises 1 as HOM ALT
Both Diploid works - explicitly include the missing genotype call:
One Diploid / One Haploid is wrong - it treats the haploid call as missing a zygosity call - and the ploidy and PL width (usually ploidy+1) also diverge:
Expected result: GT=1 should be consistently read as HOM ALT, regardless of other genotypes on the line
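As a side note on the "usually ploidy+1" PL width mentioned above: the number of genotype likelihood entries per sample follows from the ploidy and the number of alleles at the site, so haploid and diploid samples on the same line naturally have different widths. A short sketch of that arithmetic, with the hypothetical helper name `n_genotype_likelihoods`:

```python
from math import comb


def n_genotype_likelihoods(ploidy, n_alleles):
    """Number of unordered genotypes, i.e. PL/GL entries, for one sample.

    n_alleles counts REF plus all ALT alleles.
    """
    return comb(ploidy + n_alleles - 1, n_alleles - 1)


# Biallelic site (REF plus one ALT): the width works out to ploidy + 1.
haploid_width = n_genotype_likelihoods(1, 2)
diploid_width = n_genotype_likelihoods(2, 2)
print(haploid_width, diploid_width)
```

At a biallelic site this is 2 entries for a haploid call and 3 for a diploid call; with more ALT alleles the "ploidy+1" shortcut no longer holds.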