bcftools mpileup + call seemingly confident in allele only present in a few reads? #2185

emos8710 · 2024-05-17T13:41:49Z

Hi!

I have an issue that is puzzling me. I have two salmonella samples that I'm comparing and expect them to be highly similar if not identical. I have mapped them with bowtie2 (three different versions as I was investigating if it was the mapping that was the issue) and created a mpileup and called variants with bcftools v 1.14 and 1.17 (no difference in the result)

bcftools call -m -v -O v --ploidy 1

I'm getting some variant calls that I don't understand. Here is the best example. There are three copies of each sample as I was checking if there was any difference between bowtie2 versions.

NC_003197.2	1349066	.	C	T	720.935	.	DP=815;VDB=0.00653457;SGB=-21.0274;RPBZ=-12.8027;MQBZ=-8.78292;MQSBZ=4.75308;BQBZ=6.99271;SCBZ=0;FS=0;MQ0F=0.00368098;AC=3;AN=6;DP4=4,76,70,101;MQ=37	GT:PL	0:0,112	0:0,112	0:0,118	1:255,0	1:255,0	1:255,0

I have also tried running the mpileup with a MQ threshold of 20, which barely makes a difference.
For sample 1 there are 23 reads supporting REF and 126 reads supporting ALT. For sample 2 there are 8 reads supporting REF and 112 reads supporting ALT (plus 2 reads supporting a second ALT). However bcftools seems pretty confident in the REF call for sample 1 and very confident in the ALT call for sample 2. Why is this? I would expect more uncertainty, so I could filter this position :)

The text was updated successfully, but these errors were encountered:

Lilu-guo · 2024-05-17T15:17:02Z

Yes, I also found that the specified MQ threshold didn't seem to work.
For example, when using -q 20, it still uses 30 as the threshold; and when using -q 40 (greater than 30), it uses 40 normally.

Maybe we can discuss bcftools further. It will be more efficient to use WeChat or other instant tools.

pd3 · 2024-05-21T15:15:02Z

I am unable to comment without seeing the data, ideally a small bam slice + corresponding fasta reference chunk.

EDIT: This script can be used to create a small test case https://github.com/pd3/mpileup-tests/blob/main/misc/create-bam-test

emos8710 · 2024-05-23T14:32:44Z

Thanks for your reply, I will get back with some data!

pd3 added the requires-test-case label May 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bcftools mpileup + call seemingly confident in allele only present in a few reads? #2185

bcftools mpileup + call seemingly confident in allele only present in a few reads? #2185

emos8710 commented May 17, 2024

Lilu-guo commented May 17, 2024

pd3 commented May 21, 2024 •

edited

emos8710 commented May 23, 2024

bcftools mpileup + call seemingly confident in allele only present in a few reads? #2185

bcftools mpileup + call seemingly confident in allele only present in a few reads? #2185

Comments

emos8710 commented May 17, 2024

Lilu-guo commented May 17, 2024

pd3 commented May 21, 2024 • edited

emos8710 commented May 23, 2024

pd3 commented May 21, 2024 •

edited