Releases: bioinformatics-centre/BayesTyper

v1.5

01 Apr 22:56

Release featuring:

  • Noise parameter estimation: Changed noise parameter estimation so that all variation types (except nested) are now used. This allows BayesTyper to run on variant sets containing few or even no SNVs. In addition, the minimum requirement on the number of variants needed for noise estimation has been removed and replaced with a warning.

  • Noise genotyping mode: Added new genotyping mode (--noise-genotyping) where noise parameters and genotypes are estimated jointly instead of sequentially. This allows uncertainty in the noise estimates to be propagated directly into the genotype posteriors. For larger genomes the noise estimates are generally fairly stable; for smaller genomes with few variants, however, this is often not the case. All variants, including nested ones, are used for noise estimation in this mode. Note that this mode will in most cases be slower and require more memory than the default.

  • Seeding and threading: Fixed seeding so that identical results (within floating-point error) are obtained across runs regardless of the number of threads used. Previously, the same number of threads was needed to get identical results with the same seed.

  • Genotype quality: Added genotype quality (GQ) as a sample attribute to the bayesTyper genotype output. The quality is calculated from the maximum genotype posterior probability (GPP) and is Phred-scaled.

  • Filters: Removed the --min-homozygote-genotypes filter from bayesTyper genotype. Due to several improvements to BayesTyper over the last couple of releases, this filter is not as important as it used to be. Note that it is still possible to apply the filter using bayesTyperTools filter.

  • Haplotype option: Renamed the option for setting the maximum number of haplotype candidates per sample to --max-number-of-sample-haplotypes and increased its default value to 32. A higher value has been shown to give better results when genotyping a small number of samples. Note that this increase might result in longer computation time, especially for more complex variant clusters.

  • Prior option: Changed the default parameters of the gamma-distributed noise rate prior (--noise-rate-prior) to better reflect the expected Illumina error rate.

  • Insertion alleles: Added support for insertions in bayesTyperTools convertAllele. The sequences stored in the variant attributes SEQ or SVINSSEQ are now used as the inserted sequence for <INS> alleles. In addition, a FASTA file containing the inserted sequences can be given with >"name" matching <"name">. Furthermore, support has been added for partial insertions (Manta output) where the center and length are unknown.

  • Scripts: Removed addMaxGenotypePosterior since it is no longer relevant now that genotype qualities are calculated during genotyping. Added filterAlleleCallsetOrigin script that can filter alleles based on their origin (ACO).

  • General: Made smaller improvements to the inference algorithm. Converted some common asserts related to input data to more readable error messages.
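
The genotype quality described above can be illustrated with a small sketch. The release notes only state that GQ is Phred-scaled and derived from the maximum GPP; the exact formula below (the conventional Phred scaling of the probability that the best call is wrong) and the cap of 99 are assumptions for illustration, not confirmed details of BayesTyper. The function name is hypothetical.

```python
import math


def phred_genotype_quality(gpps, max_quality=99.0):
    """Phred-scale the probability that the most likely genotype is wrong.

    gpps: genotype posterior probabilities for one sample at one variant.
    max_quality: hypothetical cap used when the error probability is zero
                 (an assumption, not taken from the release notes).
    """
    if not gpps:
        raise ValueError("at least one genotype posterior is required")
    max_gpp = max(gpps)
    if not 0.0 <= max_gpp <= 1.0:
        raise ValueError("posterior probabilities must lie in [0, 1]")
    error_probability = 1.0 - max_gpp
    if error_probability <= 0.0:
        # log10(0) is undefined, so fall back to the cap.
        return max_quality
    return min(max_quality, -10.0 * math.log10(error_probability))


# A maximum posterior of 0.99 corresponds to a quality of roughly 20.
print(round(phred_genotype_quality([0.99, 0.005, 0.005]), 2))
```

Under this convention each additional 10 quality units corresponds to a tenfold lower probability that the reported genotype is incorrect.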

v1.4.1

29 Jan 02:48

Patch fixing:

  • Bug resulting in variants being incorrectly excluded when the reference allele in the VCF file is uppercase and the reference sequence is lowercase.
  • Bug resulting in the code trying to estimate genomic parameters from k-mers with a multiplicity above 2, causing it to occasionally fail during k-mer classification.

v1.4

18 Oct 23:02

Release featuring:

  • Sparsity estimation: Fixed bug when estimating the sparsity parameter used for the population prior. This fix should result in better estimates for complex clusters.
  • Ploidy input file: The ploidy of each chromosome for each gender (female and male) can now be specified using --chromosome-ploidy-file in bayesTyper genotype. Ploidy levels 0, 1 (haploid) and 2 (diploid) are supported. Human ploidy levels are assumed if no file is given (see wiki for more details).
  • Genomic parameter estimation: Genomic parameters are now estimated using either haploid or diploid k-mers. The ploidy level with the highest number of informative k-mers is used for estimation.
  • Noise parameter estimation: Noise parameters are now estimated using SNVs across all supported ploidy levels. In addition, SNVs in clusters are now also used in parameter estimation.
  • Error handling: Incorrect inputs now produce more informative error messages.

v1.3.1

18 Jun 08:39

Patch fixing incompatibility with bcftools merge.

v1.3

15 Jun 12:14
55bf411

Major overhaul of BayesTyper. Important new features:

  • New interface: bayesTyper has been refactored into bayesTyper cluster and bayesTyper genotype. The cluster part partitions the variants into units that can then be genotyped. Please refer to the new readme for details on how to update your pipeline.
  • Much reduced memory: Only graphs and k-mers for a single unit need to reside in memory at the same time; the rest remains on disk. A bloom filter stores information about k-mers shared across units. This construct ensures that memory usage is (almost) independent of the number of candidate variants.
  • Cluster support: Each unit can be genotyped independently and hence distributed across nodes on a cluster followed by simple concatenation of the unit vcf files (e.g. using bcftools concat).
  • Simultaneous genotyping and filtering: No need to run bayesTyperTools filter. Hard filters are now applied up front by bayesTyper genotype. Genotypes can still be refiltered using bayesTyperTools filter after genotyping if necessary.
  • Parallel read bloom generation: bayesTyperTools makeBloom can now use multiple threads (and scales very well with the number of threads).
  • snakemake workflow: We have added an example snakemake workflow to the repo - this can orchestrate the entire pipeline straight from BAM(s) over variant candidates to final genotypes.
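
The concatenation step mentioned under cluster support could be scripted along the following lines. This is a minimal sketch, not part of BayesTyper: the helper name and all file paths are hypothetical, and it assumes the per-unit VCF files are passed in the order in which they should appear in the combined file. Only the bcftools concat options -O (output type) and -o (output file) are used.

```python
import shlex


def build_concat_command(unit_vcfs, output_vcf):
    """Build a bcftools concat command for per-unit VCF files.

    unit_vcfs: paths to the genotyped unit VCFs, in the desired output order
               (hypothetical paths; the caller is responsible for ordering).
    output_vcf: path of the bgzip-compressed combined VCF to write.
    """
    if not unit_vcfs:
        raise ValueError("at least one unit VCF is required")
    # -O z writes compressed VCF; -o sets the output file name.
    return ["bcftools", "concat", "-O", "z", "-o", output_vcf, *unit_vcfs]


# Hypothetical unit outputs, used only to show the resulting command line.
command = build_concat_command(
    ["unit_1/genotypes.vcf.gz", "unit_2/genotypes.vcf.gz"],
    "all_units.vcf.gz",
)
print(shlex.join(command))
```

The returned list can be handed to subprocess.run(command, check=True) on a machine where bcftools is installed.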

v1.2

26 Feb 14:14
5ae8dcb

This release contains the following major changes to BayesTyper:

  • New haplotype generation approach based on Bloom filters.

    • Reduced memory usage, especially for low coverage data.
    • Removal of singleton k-mers is no longer needed for high coverage data.
  • Variant alleles longer than 500,000 nts are now excluded by default.

    • Reduced computation time and memory usage.
    • Can be changed using the option --max-allele-length.
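
For readers unfamiliar with the data structure, a Bloom filter over k-mers can be sketched as follows. This is a generic illustration of the technique, not BayesTyper's implementation: the class name, the bit-array size, the number of hash functions, and the use of BLAKE2b are all assumptions chosen for brevity.

```python
import hashlib


class KmerBloomFilter:
    """Minimal Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, kmer):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                kmer.encode("ascii"), digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(kmer)
        )


def kmers(sequence, k):
    """Yield all overlapping k-mers of a sequence."""
    return (sequence[i:i + k] for i in range(len(sequence) - k + 1))


bloom = KmerBloomFilter()
for kmer in kmers("ACGTACGTGA", 5):
    bloom.add(kmer)
print("ACGTA" in bloom)  # an inserted k-mer is always reported as present
```

Memory use is fixed by the size of the bit array rather than by the number of k-mers stored, which is why the structure suits large k-mer sets; the trade-off is that a k-mer that was never inserted may occasionally be reported as present.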

Please note that gzip parsing is currently not working in the static build.

v1.1

16 Aug 09:07
Update README.md

v1.0

26 Jul 13:44
BayesTyper (v1.0)

v0.9: First release

01 Jul 11:03

First release