Releases: HKU-BAL/Clair3

v0.1-r12

20 Aug 03:21
2dd8b44
  1. CRAM input is supported (#117) (see the example after this list).
  2. Bumped dependency versions to Python 3.9 (#96), TensorFlow 2.8, Samtools 1.15.1, and WhatsHap 1.4.
  3. The VCF DP tag now shows raw coverage for both pileup and full-alignment calls; before r12, sub-sampled coverage was shown for pileup calls if average DP > 144 (#128).
  4. Fixed an out-of-range error in Illumina representation unification during training (#110).
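A minimal sketch of a CRAM run, not taken from the release notes, assuming the standard run_clair3.sh interface; all paths are placeholders, and for CRAM the reference passed to --ref_fn should be the one the file was encoded against:

```bash
# CRAM input goes through the same --bam_fn option (supported since r12).
# All paths below are placeholders.
run_clair3.sh \
  --bam_fn=sample.cram \
  --ref_fn=GRCh38.fa \
  --threads=16 \
  --platform="ont" \
  --model_path="/opt/models/ont" \
  --output=clair3_output
```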

v0.1-r11.1

13 Jun 03:28
Pre-release

Users, please ignore this pre-release. It exists only so that Zenodo could pull and archive Clair3 for the first time.

v0.1-r11

04 Apr 10:16
e8c2e50
  1. Variant calling is ~2.5x faster than v0.1-r10 when tested with ONT Q20 data, with feature generation in both pileup and full-alignment now implemented in C (co-contributors @cjw85, @ftostevin-ont, @EpiSlim).
  2. Added the lightning-fast longphase as an option for phasing; enable it with --longphase_for_phasing. The new option is disabled by default to match the behavior of previous versions, but we recommend enabling it when calling human variants from ≥20x long reads (see the sketch after this list).
  3. Added the --min_coverage and --min_mq options (#83).
  4. Added the --min_contig_size option to skip calling variants in short contigs when using a genome assembly as input.
  5. Read haplotagging, which runs after phasing and before full-alignment calling, is now integrated into full-alignment calling to avoid generating an intermediate BAM file.
  6. Added support for the .csi BAM index for large references (#90). For more details on the speedup, please check Notes on r11.
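A minimal sketch combining the new r11 options, assuming the standard run_clair3.sh interface; all paths, the model name, and the numeric thresholds are placeholders, not recommendations:

```bash
# Placeholder paths, model name, and threshold values; adjust to your data.
run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=assembly.fa \
  --threads=16 \
  --platform="ont" \
  --model_path="/opt/models/r941_prom_sup_g5014" \
  --output=clair3_output \
  --longphase_for_phasing \
  --min_coverage=4 \
  --min_mq=10 \
  --min_contig_size=1000000
```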

Two minor patches for v0.1-r11 are included in all installation options.

v0.1-r10

13 Jan 12:43
  1. Added a new ONT Guppy5 model (r941_prom_sup_g5014). Click here for some benchmarking results. This sup model is also applicable to reads called using the hac and fast modes. The old r941_prom_sup_g506 model, which was fine-tuned from the Guppy3/4 model, is now obsolete.

  2. Added the --var_pct_phasing option to control the percentage of top-ranked heterozygous pileup variants used for WhatsHap phasing.
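A minimal sketch, assuming the standard run_clair3.sh interface; the 0.7 value and all paths are illustrative placeholders:

```bash
# Pass a larger fraction of top-ranked heterozygous pileup variants to WhatsHap phasing.
# The 0.7 value and all paths are placeholders.
run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=ref.fa \
  --threads=16 \
  --platform="ont" \
  --model_path="/opt/models/r941_prom_sup_g5014" \
  --output=clair3_output \
  --var_pct_phasing=0.7
```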

v0.1-r9

01 Dec 12:05

Added the --enable_long_indel option to output indel variant calls >50bp (#64). Click here to see more benchmarking results.
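A minimal sketch of enabling long indel output, assuming the standard run_clair3.sh interface with placeholder paths:

```bash
# Output indel calls longer than 50bp (not emitted by default); all paths are placeholders.
run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=ref.fa \
  --threads=16 \
  --platform="hifi" \
  --model_path="/opt/models/hifi" \
  --output=clair3_output \
  --enable_long_indel
```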

v0.1-r8

11 Nov 13:59
  1. Added the --enable_phasing option, which adds a step after Clair3 calling to output variants phased by WhatsHap (#63) (see the example after this list).
  2. Fixed unexpected program termination on successful runs.
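A minimal sketch of the optional phasing step, assuming the standard run_clair3.sh interface with placeholder paths:

```bash
# Run an extra WhatsHap phasing step on the Clair3 output; all paths are placeholders.
run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=ref.fa \
  --threads=16 \
  --platform="ont" \
  --model_path="/opt/models/ont" \
  --output=clair3_output \
  --enable_phasing
```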

v0.1-r7

19 Oct 09:10
6cd8994
  1. Increased var_pct_full in ONT mode from 0.3 to 0.7. The indel F1-score increased by ~0.2%, but calling a ~50x ONT dataset took ~30 minutes longer.
  2. Expanded falling through to the next most likely variant when the network prediction has insufficient read coverage (#53, commit 09a7d18, contributor @ftostevin-ont); accuracy improved on complex indels.
  3. Streamlined the pileup and full-alignment training workflows and reduced disk-space demand in model training (#55, commit 09a7d18, contributor @ftostevin-ont).
  4. Added the mini_epochs option to Train.py; performance slightly improved when training a model for ONT Q20 data using mini-epochs (#60, contributor @ftostevin-ont).
  5. Massively reduced disk-space demand when outputting GVCF. GVCF intermediate files are now compressed with lz4 and are five times smaller, with little speed penalty.
  6. Added the --remove_intermediate_dir option to remove intermediate files as soon as they are no longer needed (#48) (see the sketch after this list).
  7. Renamed ONT pre-trained models to follow Medaka's naming convention.
  8. Fixed training data spilling over to validation data (#57).
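A minimal sketch of the intermediate-file clean-up option, assuming the standard run_clair3.sh interface with placeholder paths:

```bash
# Remove intermediate files as soon as they are no longer needed; all paths are placeholders.
run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=ref.fa \
  --threads=16 \
  --platform="ont" \
  --model_path="/opt/models/ont" \
  --output=clair3_output \
  --remove_intermediate_dir
```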

v0.1-r6

04 Sep 13:47
ab47f45
  1. Reduced memory footprint at the SortVcf stage (#45).
  2. Reduced the ulimit -n (number of simultaneously opened files) requirement (#45, #47).
  3. Added the Clair3-Illumina package to bioconda (#42).

v0.1-r5

19 Jul 15:11
  1. Modified the data generator in model training to avoid memory exhaustion and unexpected segmentation faults in TensorFlow (contributor @ftostevin-ont).
  2. Simplified the Dockerfile workflow to reuse container caching (contributor @amblina).
  3. Fixed ALT output for reference calls (contributor @wdecoster).
  4. Fixed a bug in multi-allelic AF computation (the AF of [ACGT]Del variants was wrong before r5).
  5. Added the AD tag to the GVCF output.
  6. Added the --call_snp_only option to call SNPs only (#40) (see the example after this list).
  7. Added pileup and full-alignment output validity checks to avoid workflow crashes (#32, #38).
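A minimal sketch of SNP-only calling, assuming the standard run_clair3.sh interface with placeholder paths:

```bash
# Call SNPs only and skip indel calling; all paths are placeholders.
run_clair3.sh \
  --bam_fn=sample.bam \
  --ref_fn=ref.fa \
  --threads=16 \
  --platform="ilmn" \
  --model_path="/opt/models/ilmn" \
  --output=clair3_output \
  --call_snp_only
```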

v0.1-r4

28 Jun 13:44
  1. Install via bioconda (see the example after this list).
  2. Added an ONT Guppy2 model to the images (ont_guppy2). Click here for more benchmarking results. The results show that you have to use the Guppy2 model for Guppy2 or earlier data.
  3. Added Google Colab notebooks for a quick demo.
  4. Fixed a bug when there are too few variant candidates (#28).
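A minimal sketch of a bioconda installation, assuming conda is available; the environment name is arbitrary and the bioconda/conda-forge channels are specified explicitly:

```bash
# Create a conda environment and install Clair3 from bioconda.
conda create -n clair3 -c bioconda -c conda-forge clair3 -y
conda activate clair3
run_clair3.sh --help
```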