Filtering of polymorphisms according to thresholds of quality score, read depth and frequency of base calls

ratope/VcfFilter

VcfFilter

Filtering of polymorphisms according to thresholds of quality score, read depth and frequency of base calls.

Reference: "Polymorphisms and minihaplotypes in the VvNAC26 gene associate with berry size variation in grapevine" (Tello J et al., BMC Plant Biology 2015)

The script tar_FilteringVariants.pl takes an extended VCF file and filters SNPs and indels according to thresholds of quality score, read depth and frequency of base calls. The filtering parameters and their thresholds are:

1.  SNP_HOMOZ_MIN_RATIO (minimum RATIO for polymorphisms of type "homozygous SNP") = 0.942
2.  SNP_HOMOZ_MIN_DEPTH (minimum DEPTH for polymorphisms of type "homozygous SNP") = 10
3.  SNP_HOMOZ_MIN_QUAL (minimum QUAL for polymorphisms of type "homozygous SNP") = 76
4.  SNP_HETER_MIN_RATIO (minimum RATIO for polymorphisms of type "heterozygous SNP") = 0.38
5.  SNP_HETER_MIN_DEPTH (minimum DEPTH for polymorphisms of type "heterozygous SNP") = 20
6.  SNP_HETER_MIN_QUAL (minimum QUAL for polymorphisms of type "heterozygous SNP") = 57
7.  INDEL_HOMOZ_MIN_RATIO (minimum RATIO for polymorphisms of type "homozygous Indel") = 0.83
8.  INDEL_HOMOZ_MIN_DEPTH (minimum DEPTH for polymorphisms of type "homozygous Indel") = 14
9.  INDEL_HOMOZ_MIN_QUAL (minimum QUAL for polymorphisms of type "homozygous Indel") = 43
10. INDEL_HETER_MIN_RATIO (minimum RATIO for polymorphisms of type "heterozygous Indel") = 0.28
11. INDEL_HETER_MIN_DEPTH (minimum DEPTH for polymorphisms of type "heterozygous Indel") = 38
12. INDEL_HETER_MIN_QUAL (minimum QUAL for polymorphisms of type "heterozygous Indel") = 33

where RATIO is the ratio "calls_alternative_allele/calls_reference_allele", DEPTH is the coverage (number of reads) at the position of the polymorphism, and QUAL is the quality of the polymorphism call (determined by the variant caller, samtools).
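As a rough illustration of how the twelve thresholds combine, the following Python sketch checks one polymorphism against the three minimums for its class. It is not the code of tar_FilteringVariants.pl: the dictionary layout, the function name `passes_filter`, and the use of inclusive comparisons (`>=`) are assumptions made for the example, and the RATIO, DEPTH and QUAL values are assumed to have been extracted from the extended VCF file already.

```python
# Threshold values copied from the list above.
# The dictionary layout and key names are hypothetical, chosen for this sketch.
THRESHOLDS = {
    ("SNP", "HOMOZ"): {"ratio": 0.942, "depth": 10, "qual": 76},
    ("SNP", "HETER"): {"ratio": 0.38, "depth": 20, "qual": 57},
    ("INDEL", "HOMOZ"): {"ratio": 0.83, "depth": 14, "qual": 43},
    ("INDEL", "HETER"): {"ratio": 0.28, "depth": 38, "qual": 33},
}


def passes_filter(variant_type, zygosity, ratio, depth, qual):
    """Return True if the polymorphism meets all three minimums for its class.

    Assumption: a value equal to the threshold passes (inclusive comparison).
    """
    limits = THRESHOLDS[(variant_type, zygosity)]
    return (
        ratio >= limits["ratio"]
        and depth >= limits["depth"]
        and qual >= limits["qual"]
    )


if __name__ == "__main__":
    # A heterozygous SNP with RATIO 0.5, DEPTH 25 and QUAL 60.
    print(passes_filter("SNP", "HETER", ratio=0.5, depth=25, qual=60))
    # The same SNP with DEPTH 19, below SNP_HETER_MIN_DEPTH.
    print(passes_filter("SNP", "HETER", ratio=0.5, depth=19, qual=60))
```

A polymorphism is kept only when all three of its values reach the minimums for its type and zygosity, so a single failing parameter is enough to discard it in this sketch.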

  • Each of the thresholds 1 to 12 is the mean minus one standard deviation of the distribution of the corresponding parameter across the entire set of samples. For example, across all depth values in the set of heterozygous SNPs, the mean is 80 and the standard deviation is 60, so the threshold for the parameter SNP_HETER_MIN_DEPTH is 80 - 60 = 20.
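The rule above can be sketched in a few lines of Python. The function name `threshold_from_values` is hypothetical, and the choice of the population standard deviation (`statistics.pstdev`) is an assumption, since the text does not say whether the population or the sample estimate was used. The two input values are toy data picked only so that the mean is 80 and the standard deviation is 60, matching the worked example; they are not real depth values.

```python
import statistics


def threshold_from_values(values):
    """Mean minus one standard deviation of a list of observed values.

    Assumption: population standard deviation; swap in statistics.stdev
    if the sample estimate is wanted instead.
    """
    return statistics.mean(values) - statistics.pstdev(values)


if __name__ == "__main__":
    # Toy data with mean 80 and population standard deviation 60.
    toy_depths = [20, 140]
    print(threshold_from_values(toy_depths))
```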
