bcftools norm -m-both lost some record after about 1K records #336
Comments
Not only may the second allele of a multiallelic record be lost, but some single-allele records may be lost too. |
PS Thank you for the bug report! |
I failed to create a public gist, so I sent the VCF file as an attachment to wkretzsch@gmail.com. Best regards |
Thanks, I'll take a look. |
I can confirm the existence of a bug. I have narrowed down the problem to a small VCF with 79 sites. If I remove a site from the front or the end, then the bug disappears. I have posted the file here. However, I am not familiar with This is the code I used to replicate the error:
where the first number should be the same as the sum of the second and third number. I am using bcftools version:
|
It appears as if the first site is being dropped (position |
Yes. Also, the number of lines modified is not updated: Lines total/modified/skipped: 78/0/0. I'm having a look, but I'm also not familiar with bcftools norm. |
The number of lines modified is not updated in bcftools 1.2 either. |
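A minimal sketch of a record-count sanity check for this kind of loss. It assumes uncompressed VCF text, that splitting multiallelic sites yields one output record per ALT allele, and that no other option removes records. The function names and the toy data are hypothetical and are not taken from the attached VCF.

```python
def data_lines(vcf_text):
    """Return the non-header, non-blank lines of a VCF given as text."""
    return [ln for ln in vcf_text.splitlines()
            if ln.strip() and not ln.startswith("#")]

def count_records(vcf_text):
    """Number of data records in the VCF text."""
    return len(data_lines(vcf_text))

def expected_after_split(vcf_text):
    """Records expected after splitting: one per ALT allele of each site."""
    # Column 5 (index 4) is ALT; alleles are comma-separated.
    return sum(len(ln.split("\t")[4].split(",")) for ln in data_lines(vcf_text))

def dropped_records(input_text, output_text):
    """Expected record count minus the count actually present in the output."""
    return expected_after_split(input_text) - count_records(output_text)

# Toy input (invented for illustration): 1 + 2 + 3 = 6 ALT alleles in total.
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
TOY_INPUT = HEADER + "\n".join([
    "1\t100\t.\tA\tC\t.\t.\t.",
    "1\t200\t.\tG\tT,A\t.\t.\t.",
    "1\t300\t.\tT\tTA,TAA,C\t.\t.\t.",
]) + "\n"

# Toy output with one split record missing (5 records instead of 6).
TOY_OUTPUT = HEADER + "\n".join([
    "1\t100\t.\tA\tC\t.\t.\t.",
    "1\t200\t.\tG\tT\t.\t.\t.",
    "1\t200\t.\tG\tA\t.\t.\t.",
    "1\t300\t.\tT\tTA\t.\t.\t.",
    "1\t300\t.\tT\tTAA\t.\t.\t.",
]) + "\n"

if __name__ == "__main__":
    print("input records:", count_records(TOY_INPUT))
    print("expected after split:", expected_after_split(TOY_INPUT))
    print("dropped:", dropped_records(TOY_INPUT, TOY_OUTPUT))
```

For real files, the same functions can be applied to the text of the input VCF and of the `bcftools norm -m-both` output; a non-zero result from `dropped_records` indicates missing records.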
I can prevent specific examples from dropping lines by changing buffer limits (e.g. increasing the limit above 100 at vcfnorm.c line 1486 makes the example above work), but there is a fundamental over-writing issue that will take a while to figure out. |
Line 1486 of vcfnorm.c is the same as in bcftools 1.2, so maybe another fundamental over-writing issue is the real cause? |
Sure. |
@wangyugui Should be fixed on the test case from you and @wkretzsch, but let us know if it hasn't fixed up the issue with your original VCF. |
PS I can confirm this code passes fine using |
This problem is fixed by this patch in my case. |
bcftools norm -m-both works correctly for the first 1K records, but loses some records after about 1K records.
bcftools version 1.2 does NOT have this problem, but the latest source on GitHub does.
Features of this VCF file:
1) VCF 4.2
2) It has 200 samples