You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
so, it looks so strange for me that the pipeline calls all the 10 genotypes 1/1 when the allele depths are similar in every case.
If I just look at the allele depth I would say all the genotypes should be 0/1
What am I missing?
And yes, I can see the Shred-scaled genotype likelihoods, and several case looks to indicate homozygous 1/1 (here for example 1/1:31,6,0:26:13,13), but in other cases, wouldn't be heterozygous? (here for example 1/1:28,6,0:24:14,10).
And another final question. If I am getting all individuals 1/1, in other words.. a monomorphic site.....why is this site present in the final vcf? I would spect at least one allele of ALT (or REF).
The text was updated successfully, but these errors were encountered:
Starting with the last question, homozygous alternate genotypes count as variants relative to the reference genome.
Regarding the genotypes, it depends on the actual reads and base qualities, and BAQ realignment - try to switch it off. The results look dodgy, a small test case would be required to understand and explain what is going on
Ok, I understand now why I have some monomorphic sites, I hadn't thought it but it makes sense.
On the other hand, I've already tried running the pipeline with BAQ realignment off, and the results are practically identical. All the genotypes are 1/1 and just change few allele depths, but nothing important, just one or two values.
it continue looks strange genotyping but I guess it is due to bases qualities and the internal algorithm of the pipeline.
I am genotyping 10 individuals with mpileup and call commands and I am getting some strange results.
I am using this code:
bcftools mpileup --skip-indels --annotate FORMAT/AD,FORMAT/DP,INFO/AD -C 0 -d 550 -Ou -f $ref $name | bcftools call -mv -Ov -o calls.vcf
variable $name has the 10 bam files of each individual and $ref the reference genome.
Here is one of the lines of the final vcf:
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Number of high-quality bases">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths (high-quality bases)">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
........GT:PL:DP:AD 1/1:31,6,0:26:13,13 1/1:0,7,1:26:14,12 1/1:8,9,0:38:17,21 1/1:0,16,1:44:28,16 1/1:28,6,0:24:14,10 1/1:3,5,0:22:11,11 1/1:9,0,10:17:11,6 1/1:2,0,6:20:10,10 1/1:10,12,0:57:33,24 1/1:6,0,10:24:9,15
so, it looks so strange for me that the pipeline calls all the 10 genotypes 1/1 when the allele depths are similar in every case.
If I just look at the allele depth I would say all the genotypes should be 0/1
What am I missing?
And yes, I can see the Shred-scaled genotype likelihoods, and several case looks to indicate homozygous 1/1 (here for example 1/1:31,6,0:26:13,13), but in other cases, wouldn't be heterozygous? (here for example 1/1:28,6,0:24:14,10).
And another final question. If I am getting all individuals 1/1, in other words.. a monomorphic site.....why is this site present in the final vcf? I would spect at least one allele of ALT (or REF).
The text was updated successfully, but these errors were encountered: