forked from samtools/bcftools
Commit
There is now a maximum quality value, defaulting to 60, which is above the values Illumina produces. This is useful for PacBio CCS, which produces abnormally high qualities that lead to over-certainty in calls, especially for incorrect genotype assignment.

SNP base quality is now the minimum of this base's quality and the quality of the base on either side. CCS data commonly has a low quality value adjacent to an incorrect but high-quality substitution. However, it turns out this is also beneficial to the HG002 Illumina data.

An example of the effect for Illumina 60x data (with the minimum QUAL adjusted to get the same FN rate, as the scores are now slightly suppressed):

Before:
    SNP       Q>0 /  Q>=20 / Filtered
    SNP TP 262541 / 262287 / 262283
    SNP FP   2313 /   1442 /   1405
    SNP GT    287 /    269 /    269
    SNP FN   1769 /   2023 /   2027

After:
    SNP       Q>0 /  Q>=15 / Filtered
    SNP TP 262503 / 262298 / 262294
    SNP FP   1787 /   1349 /   1312  -6.6%
    SNP GT    283 /    268 /    268  ~=
    SNP FN   1807 /   2012 /   2016  -0.5%

For 31x PacBio CCS, the quality assignment suppression is more pronounced due to the excessively high QUALs in the BAM records, but the FN/FP tradeoff is the same (again tuning to compare at ~= FN):

Before:
    SNP       Q>0 /  Q>=47 / Filtered
    SNP TP 263847 / 263693 / 263693
    SNP FP   4908 /   3089 /   3089
    SNP GT    804 /    793 /    793
    SNP FN    463 /    617 /    617

After:
    SNP       Q>0 /  Q>=15 / Filtered
    SNP TP 263779 / 263692 / 263692
    SNP FP   3493 /   2973 /   2973  -10%
    SNP GT    509 /    503 /    503  -37%
    SNP FN    531 /    618 /    618  ~=

Finally, added a -X,--config STR option to specify sets of parameters tuned for specific platforms. Earlier experiments showed that the indel caller on PacBio CCS data needs very different gap-open and gap-extend parameters. It also benefits a bit from raising the minimum qual to 5, and because this is a new option we also enable partial realignment by default.
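The two quality adjustments described above (the cap and the neighbouring-base minimum) can be sketched as follows. This is an illustrative Python sketch only, not the code from this commit: the function name `effective_snp_qual` is hypothetical, and how read ends are handled (using only the neighbours that exist) is an assumption the commit message does not specify.

```python
def effective_snp_qual(quals, i, max_qual=60):
    """Return an adjusted base quality for position i of a read.

    quals    -- list of per-base Phred qualities for one read
    i        -- index of the base being considered for a SNP call
    max_qual -- quality cap (60 is the default stated in the commit message)

    Assumption: at the ends of the read only the neighbours that
    exist are used; the commit message does not say how edges are treated.
    """
    lo = max(i - 1, 0)
    hi = min(i + 1, len(quals) - 1)
    # Minimum of this base and the base on either side.
    q = min(quals[lo:hi + 1])
    # Apply the maximum quality value.
    return min(q, max_qual)


# A CCS-like read: very high qualities with one low-quality base.
read_quals = [93, 93, 12, 93, 93]
adjusted = [effective_snp_qual(read_quals, i) for i in range(len(read_quals))]
print(adjusted)
```

In this toy read, the bases next to the low-quality position inherit its quality of 12, while the bases at either end are simply capped at 60. That is the mechanism by which a high-quality but incorrect substitution beside a low-quality base contributes less to the call.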
1 parent 98bbd48, commit f4e8f72
Showing 3 changed files with 55 additions and 8 deletions.