SNP indel variant quality metrics and filtering (v2.6 or newer)

Hannes Pétur Eggertsson edited this page Apr 29, 2022 · 4 revisions

v2.6 includes all the metrics available in v2.5 and older, so read the page for v2.5 and older first. In v2.6 we introduced several new quality metrics tailored to multi-allelic variants. Each is reported separately for every alternative allele, instead of as one value covering all alleles. These are:

  • QDalt, ALT quality by depth
  • SBalt, read strand bias per ALT
  • MMalt, mismatch percentage in the graph alignment per ALT
  • CRalt, clipped bases percentage in the graph alignment per ALT
  • SDalt, alignment score difference (AS-XS) given by mapper per ALT
  • MQalt, root-mean-square mapping quality per ALT
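
Because each of these metrics carries one value per alternative allele, the values can be paired with the alleles in the ALT column by position. The following is a minimal sketch in Python, assuming the metrics are written as comma-separated INFO entries with one value per ALT allele; the function name `per_alt_metrics` and the example record are made up for illustration and are not part of GraphTyper.

```python
# Hypothetical helper: pair per-ALT INFO values with their ALT alleles.
# Assumes each metric is a comma-separated INFO entry with one value per ALT.
PER_ALT_KEYS = ("QDalt", "SBalt", "MMalt", "CRalt", "SDalt", "MQalt")


def per_alt_metrics(alt_field, info_field, keys=PER_ALT_KEYS):
    """Return {alt_allele: {metric: value}} for one VCF record."""
    alts = alt_field.split(",")

    # Parse "KEY=VALUE;KEY=VALUE;FLAG" into a dict, skipping bare flags.
    info = {}
    for entry in info_field.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value

    result = {alt: {} for alt in alts}
    for key in keys:
        if key not in info:
            continue
        values = info[key].split(",")
        if len(values) != len(alts):
            raise ValueError(f"{key} has {len(values)} values for {len(alts)} ALT alleles")
        for alt, value in zip(alts, values):
            # "." is the VCF missing-value marker.
            result[alt][key] = None if value == "." else float(value)
    return result


# Made-up record with two ALT alleles.
example = per_alt_metrics("T,TA", "QDalt=25.1,3.2;MQalt=60,41")
```

Here `example["T"]` holds the metrics of the first ALT allele and `example["TA"]` those of the second.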

We also trained a new logistic regression classifier, INFO/AAScore, which gives a score between 0 and 1 for each alternative allele.

For single-sample calling, we recommend keeping only the "FILTER=PASS" variants. Additionally, in larger genotyping runs, we suggest keeping only alternative alleles with INFO/AAScore > 0.5. This threshold may be adjusted for different applications.
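
As an illustration of this filtering rule, here is a minimal Python sketch that works on a single tab-separated VCF record. It assumes INFO/AAScore holds one comma-separated value per ALT allele; the function name `passing_alt_alleles` and the example records are hypothetical, and a real pipeline would more likely use an established VCF tool or library.

```python
# Hypothetical helper: apply FILTER=PASS and the per-allele AAScore threshold.
# Assumes INFO/AAScore has one comma-separated value per ALT allele.
def passing_alt_alleles(vcf_line, min_aascore=0.5):
    """Return the ALT alleles of one VCF record that pass both criteria."""
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")   # column 5: ALT
    filter_value = fields[6]      # column 7: FILTER
    info_field = fields[7]        # column 8: INFO

    if filter_value != "PASS":
        return []

    # Parse the INFO column, skipping bare flags without "=".
    info = {}
    for entry in info_field.split(";"):
        if "=" in entry:
            key, value = entry.split("=", 1)
            info[key] = value

    if "AAScore" not in info:
        return []

    scores = info["AAScore"].split(",")
    return [
        alt
        for alt, score in zip(alts, scores)
        if score != "." and float(score) > min_aascore
    ]


# Made-up records for illustration only.
passed = passing_alt_alleles("chr1\t1000\t.\tA\tT,TA\t255\tPASS\tAAScore=0.98,0.12")
failed = passing_alt_alleles("chr1\t2000\t.\tG\tC\t12\tLowQD\tAAScore=0.91")
```

In this sketch `passed` keeps only the first ALT allele, and `failed` is empty because the record does not have FILTER=PASS. The threshold is exposed as `min_aascore` so it can be adjusted per application.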