Consensus error "fasta sequence does not match REF allele" #938

MaximePolicarpo opened this issue Dec 26, 2018 · 8 comments

MaximePolicarpo commented Dec 26, 2018

Hi everyone,

I tried to run bcftools consensus on my data but got an error that was already reported in #888.

I saw that the issue had been marked as solved, so I downloaded the commit that was supposed to fix it, but it still fails:

The fasta sequence does not match the REF allele at NC_035902.1:45610288:
.vcf: [AT] <- (REF)
.vcf: [ATT] <- (ALT)
.fa: [TT]ATTATTATTATTATTATTATTATTATTATATTTT
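For readers unfamiliar with this message: it reports that the bases found in the FASTA at the record's position differ from the REF allele in the VCF. The comparison can be reproduced by hand with a minimal sketch like the one below. The helper names `ref_at` and `check_ref` are hypothetical (they are not part of bcftools, and this is not how bcftools implements the check), and the sketch assumes an uncompressed FASTA that fits in memory.

```shell
# ref_at FASTA CHROM POS LEN
# Print LEN bases starting at 1-based position POS of sequence CHROM.
# Hypothetical helper for illustration only; not part of bcftools.
ref_at() {
    awk -v chrom="$2" -v pos="$3" -v len="$4" '
        /^>/ { name = substr($1, 2); next }   # header line: the name is the first word after >
        name == chrom { seq = seq $0 }        # join the wrapped sequence lines of CHROM
        END { print toupper(substr(seq, pos, len)) }
    ' "$1"
}

# check_ref FASTA CHROM POS REF
# Compare the VCF REF allele (assumed uppercase) with the FASTA bases at POS.
check_ref() {
    got=$(ref_at "$1" "$2" "$3" "${#4}")
    if [ "$got" = "$4" ]; then
        echo "match"
    else
        echo "mismatch: fasta has $got, VCF REF is $4"
    fi
}

# Example call for the record reported above (file name is an assumption):
#   check_ref Genome_Reference.fasta NC_035902.1 45610288 AT
```

For a whole genome, an indexed lookup such as `samtools faidx Genome_Reference.fasta NC_035902.1:45610288-45610289` is the more practical way to fetch the same bases. In this thread the maintainer later confirmed a genuine problem in bcftools, so a check like this only helps rule out a mismatched reference file.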

Result_bcf_call.vcf.gz
Astyanax_Surface_EyegenesRegions.fasta.gz

I have attached the VCF file and the reference above so you can try it.

Thanks and happy holidays,

Maxime

@pd3
Member

pd3 commented Dec 26, 2018

Can you check that you are using the latest GitHub version of bcftools by running bcftools --version?

@MaximePolicarpo
Author

bcftools --version returns this:

bcftools 1.9
Using htslib 1.9
Copyright (C) 2018 Genome Research Ltd.
License Expat: The MIT/Expat license
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

I saw that there was a new commit 5 days ago on https://github.com/samtools/bcftools and I downloaded it directly from this repository.

Maxime

@pd3
Member

pd3 commented Dec 26, 2018

That's the latest versioned release, 1.9. You'll need to get the latest GitHub version, as described here: http://samtools.github.io/bcftools/howtos/install.html

@MaximePolicarpo
Author

I think the error still happens with this version :/

Here is what I get from bcftools --version after installing as described at http://samtools.github.io/bcftools/howtos/install.html

bcftools 1.9-93-g27c9e5d
Using htslib 1.9
Copyright (C) 2018 Genome Research Ltd.
License Expat: The MIT/Expat license
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Is this the right version? Sorry for the inconvenience,

Maxime
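The two version strings in this exchange differ in a way that is easy to test for: a tagged release prints a plain number such as 1.9, while a build from the GitHub repository prints a `git describe` style string (tag, number of commits since the tag, abbreviated commit hash) such as 1.9-93-g27c9e5d. A minimal sketch of that distinction follows; `classify_version` is a hypothetical helper name, and the pattern is an assumption based only on the two strings shown above.

```shell
# classify_version VERSION
# Print "release" for a plain tag such as 1.9, or "development" for a
# git-describe style string such as 1.9-93-g27c9e5d (tag-commits-ghash).
# Hypothetical helper for illustration only.
classify_version() {
    case "$1" in
        *-[0-9]*-g[0-9a-f]*) echo "development" ;;
        *)                   echo "release" ;;
    esac
}

# Example call on an installed bcftools (first line of the output, second word):
#   classify_version "$(bcftools --version | head -n 1 | cut -d' ' -f2)"
```

With the outputs quoted in this thread, the first install classifies as a release and the second as a development build.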

@pd3
Member

pd3 commented Dec 26, 2018

OK, it is a genuine problem then. Can you first run bcftools norm to normalize the indels? The fasta you kindly provided with your test case is an excerpt so I cannot run it myself.

@MaximePolicarpo
Author

I tried it, but it failed with the same error. Here is my full pipeline with the files attached (I didn't upload the original BAM file, only the VCF resulting from bcftools mpileup):

bcftools mpileup -R Regions.bed -O v -f Zip_File_Bcftools/Genome_Reference.fasta ../Bam_files/Tinaja11_vs_SurfaceGenome_sorted.bam -o VCF_Tinaja11.vcf
bgzip VCF_Tinaja11.vcf
vcf-sort VCF_Tinaja11.vcf.gz > VCF_Sorted_Tinaja11.vcf
bcftools norm -f Zip_File_Bcftools/Genome_Reference.fasta VCF_Sorted_Tinaja11.vcf -O v -o VCF_Sorted_Norm_Tinaja11.vcf

bcftools call -mv VCF_Sorted_Norm_Tinaja11.vcf -o Results_Call_Tinaja11.vcf
bgzip Results_Call_Tinaja11.vcf
tabix -p vcf Results_Call_Tinaja11.vcf.gz

# bcftools consensus only on the regions defined in the BED file that I used for mpileup. Fails with the error "fasta sequence does not match..."
bcftools consensus -f Zip_File_Bcftools/Regions.fasta Results_Call_Tinaja11.vcf.gz -o Test_consensus.fa

# Same command but with the whole genome, to see if it changes anything: same error
bcftools consensus -f Zip_File_Bcftools/Genome_Reference.fasta Results_Call_Tinaja11.vcf.gz -o Test_consensus.fa
[Whole genome fasta not provided as it is too big even when compressed]

Regions.bed.tar.gz
Regions.fasta.tar.gz
VCF_Tinaja11.vcf.gz
VCF_Sorted_Norm_Tinaja11.vcf.gz

I don't know if I ran bcftools norm too late or at the wrong step.

Best,

Maxime

pd3 closed this as completed in 9589876 on Dec 27, 2018
@pd3
Member

pd3 commented Dec 27, 2018

Thank you for the test files, this should now be fixed, hopefully for good. I also expanded the documentation to show the indel normalizing and filtering step: http://samtools.github.io/bcftools/howtos/consensus-sequence.html
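For reference, the order of steps discussed in this thread (call, then normalize indels, then filter, then consensus) can be sketched as one shell function. This is a sketch modeled on the linked how-to, not a verbatim copy of it: the function name `consensus_workflow`, the intermediate file names, and the 5 bp indel gap are illustrative assumptions. Note that normalization here runs on the called variants, whereas the pipeline posted earlier ran bcftools norm on the mpileup output before calling.

```shell
# consensus_workflow REF BAM OUT
# Hypothetical wrapper: call variants, normalize indels, filter adjacent
# indels, then apply the remaining variants to the reference.
# Intermediate file names and the 5 bp gap are assumptions for illustration.
consensus_workflow() {
    ref=$1 bam=$2 out=$3

    # Call variants and index the result
    bcftools mpileup -Ou -f "$ref" "$bam" | bcftools call -mv -Oz -o calls.vcf.gz &&
    bcftools index calls.vcf.gz &&

    # Normalize (left-align) indels against the same reference
    bcftools norm -f "$ref" calls.vcf.gz -Ob -o calls.norm.bcf &&

    # Filter indels that lie within 5 bp of each other
    bcftools filter --IndelGap 5 calls.norm.bcf -Ob -o calls.norm.flt-indels.bcf &&
    bcftools index calls.norm.flt-indels.bcf &&

    # Apply the normalized, filtered variants to create the consensus
    bcftools consensus -f "$ref" calls.norm.flt-indels.bcf -o "$out"
}

# Example call (file names are assumptions):
#   consensus_workflow Genome_Reference.fasta Tinaja11_vs_SurfaceGenome_sorted.bam Test_consensus.fa
```

Chaining the steps with `&&` stops the workflow at the first failing command, so a failure in normalization or filtering is not masked by a later consensus error.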

@MaximePolicarpo
Author

Thanks a lot! 👍
Maxime
