Description
I am genotyping 10 individuals with mpileup and call commands and I am getting some strange results.
I am using this code:
bcftools mpileup --skip-indels --annotate FORMAT/AD,FORMAT/DP,INFO/AD -C 0 -d 550 -Ou -f $ref $name | bcftools call -mv -Ov -o calls.vcf
variable $name has the 10 bam files of each individual and $ref the reference genome.
Here is one of the lines of the final vcf:
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Number of high-quality bases">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths (high-quality bases)">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
........GT:PL:DP:AD 1/1:31,6,0:26:13,13 1/1:0,7,1:26:14,12 1/1:8,9,0:38:17,21 1/1:0,16,1:44:28,16 1/1:28,6,0:24:14,10 1/1:3,5,0:22:11,11 1/1:9,0,10:17:11,6 1/1:2,0,6:20:10,10 1/1:10,12,0:57:33,24 1/1:6,0,10:24:9,15
so, it looks so strange for me that the pipeline calls all the 10 genotypes 1/1 when the allele depths are similar in every case.
If I just look at the allele depth I would say all the genotypes should be 0/1
What am I missing?
And yes, I can see the Shred-scaled genotype likelihoods, and several case looks to indicate homozygous 1/1 (here for example 1/1:31,6,0:26:13,13), but in other cases, wouldn't be heterozygous? (here for example 1/1:28,6,0:24:14,10).
And another final question. If I am getting all individuals 1/1, in other words.. a monomorphic site.....why is this site present in the final vcf? I would spect at least one allele of ALT (or REF).