BCFTools 1.9 - Consensus error "fasta sequence does not match REF allele". #888
Thank you for the bug report, this should be fixed. Is there any chance you could provide a tarball with a small reproducible test case? |
Hi there! Thanks for replying. I have attached a small sample of the data, in which the error is reproducible.
And using bcftools 1.9-51-g20a170e the result is:
Checking the sequence with samtools faidx:
Thanks again, and please, let me know if there is anything else I can do to help in fixing this issue. |
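For readers following the thread, the condition behind this error message can be sketched in a few lines of Python. This is only an illustration of the general rule that a VCF record's REF allele must equal the reference bases starting at its 1-based POS; it is not the code in consensus.c. The function name `ref_matches`, the toy sequence, and the case-insensitive comparison are all assumptions made for this sketch.

```python
def ref_matches(sequence: str, pos: int, ref: str) -> bool:
    """Return True if `ref` equals the reference bases starting at 1-based `pos`.

    Hypothetical helper for illustration; not part of bcftools.
    """
    start = pos - 1  # VCF POS is 1-based, Python string indices are 0-based
    # Compare case-insensitively (an assumption of this sketch, so that
    # soft-masked lowercase reference bases do not count as mismatches).
    return sequence[start:start + len(ref)].upper() == ref.upper()


# Toy reference with a short TGAA repeat (made-up data, not from this issue).
toy_ref = "ACGTTGAATGAATGAACC"

print(ref_matches(toy_ref, 5, "TGAA"))  # REF placed on the repeat start
print(ref_matches(toy_ref, 6, "TGAA"))  # same REF shifted by one base
```

Running the same comparison by hand against the `samtools faidx` output for the reported position is a quick way to tell whether the VCF and the FASTA really disagree, or whether the mismatch only appears inside the tool.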
Thank you for the test case, this is now fixed by 253a1fd |
I'm still experiencing this error, and I'm running the current dev branch (I checked that my consensus.c matches that from 253a1fd above);
But I have the same problem as the original post on this issue:
and yet:
results from the command
|
I am not able to reproduce the error. Can you make sure that |
Yup, it's the only version I've got installed:
This is happening with the 1000 Genomes VCF against Chr4 from the appropriate reference, hs37d5. It's typically pretty far along the chromosome, but not always; here's the resulting length of the consensus fasta for the first bunch of individuals (the ref Chr4 is 191154276 bases):
I can upload the data to reproduce this, it's fairly big, of course. |
BTW, it always occurs on repeat extensions, and the bug is always on the first base of the REF allele, e.g.
Presumably the VCF is saying this individual has an extra TGAA inserted in that short repeat. (I've been studying Huntington's Disease, which is a CAG repeat extension in the HTT gene which is luckily near the beginning of Chr4 and hasn't been affected by this bug for ranges within 0-5Mb.) Do you know of another tool that folks use to pull individuals' genomes out of the 1kG VCF and corresponding reference? |
Here's a very slightly different case: the REF is notated with a single base, but it is once again an extension of a four-base repeat:
|
I was able to reproduce the problem now and have fixed it. Let me know if you encounter any other issues. |
Nice work! Thanks!!! Happy holidays!! |
Hi, I think the problem still exists. I used bcftools consensus and got this error: The fasta sequence does not match the REF allele at NC_035902.1:45610288: [I have the latest bcftools version, downloaded from https://github.com/samtools/bcftools] Thanks and happy holidays, Maxime |
Hi guys,
I have been looking to generate a consensus sequence. I mapped reads with Bowtie 2, then I used bcftools mpileup, and bcftools call. For this I used:
After mpileup and call, I used bcftools consensus, and got the following message:
The following is the info from the bcf file at said region:
Using samtools faidx for region 12426 on the same sequence returns the following:
Using bcftools downloaded from GitHub (following the instructions here http://samtools.github.io/bcftools/):
I get the same error:
However, using an older version of bcftools,
I get no error, rather the warning:
The site JH651516.1:12426 overlaps with another variant, skipping...
and the FASTA file is generated.
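As a rough illustration of what "overlaps with another variant" means in the warning above, here is a small Python sketch that flags records starting inside the REF span of an earlier record. It assumes records are sorted by position on a single sequence; the function name `find_overlaps` and the example records are hypothetical, and this is not necessarily how bcftools itself decides which sites to skip.

```python
def find_overlaps(variants):
    """Return the positions of records that overlap an earlier kept record.

    `variants` is a list of (pos, ref) tuples, sorted by 1-based pos,
    all on the same sequence. Hypothetical helper for illustration only.
    """
    skipped = []
    prev_end = 0  # last reference position covered by the last kept record
    for pos, ref in variants:
        if pos <= prev_end:
            # Starts inside the span of a previously kept REF allele.
            skipped.append(pos)
            continue
        prev_end = pos + len(ref) - 1  # REF covers pos .. pos + len(ref) - 1
    return skipped


# Made-up records: the one at 12 falls inside the 4-base REF starting at 10.
records = [(10, "ACGT"), (12, "G"), (14, "T"), (20, "A")]
print(find_overlaps(records))
```

A check like this on the records around the reported position can show whether a multi-base REF from an earlier indel covers the site that triggers the error.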