Releases: nanoporetech/dorado

v0.7.0

21 May 17:30

[0.7.0] (21 May 2024)

This release of Dorado introduces new, more accurate v5 models for improved basecalling. It also adds a new subcommand, dorado correct, for single-read error correction to help Nanopore-based de novo assemblies of haploid or diploid genomes. In addition, this release contains a number of bug fixes, stability enhancements and updates to barcode classification.

New feature highlights

  1. DNA, RNA and duplex basecalling models with improved single read accuracy.
  2. Support for 4mC_5mC methylation calling in DNA and all-context m6A and pseU in RNA.
  3. dorado correct subcommand for single-read error correction of haploid and diploid genomes (for assembly pipelines).
  4. Poly(A) tail estimation for plasmids and transcripts with interrupted tails.
  5. Support for --junc-bed minimap2 splice option.
  6. Faster BAM indexing and sorting code.

Changes to default behavior

  1. Data type of mean Q-score tag (qs) updated to float.
  2. Adapter trimming is enabled when poly(A) estimation is requested.
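
Downstream scripts that read the qs tag as an integer may need to accept both encodings after this change. The sketch below is a minimal illustration, not part of Dorado: it assumes plain-text SAM records, where optional fields use the TAG:TYPE:VALUE layout and start at column 12, and the helper name parse_mean_qscore is hypothetical.

```python
def parse_mean_qscore(sam_line):
    """Return the mean Q-score from the qs tag as a float, or None if absent.

    Hypothetical helper: accepts both the older integer encoding (qs:i:...)
    and the float encoding (qs:f:...) described in this release.
    """
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields begin at column 12 of a SAM alignment record.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) != 3:
            continue  # not a well-formed TAG:TYPE:VALUE field
        tag, type_code, value = parts
        if tag == "qs" and type_code in ("i", "f"):
            return float(value)
    return None


# Made-up unaligned records for illustration only.
old_record = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tqs:i:12"
new_record = "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\tqs:f:12.5"
print(parse_mean_qscore(old_record), parse_mean_qscore(new_record))
```

Returning a float in both cases keeps any later filtering or averaging code independent of which Dorado version produced the file.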

All key changes

  • 7a09ca3 - Add v5 basecalling models for DNA, RNA and duplex
  • 159b73c - Add new models for calling DNA and RNA base modifications (4mC_5mC, m6A, pseU)
  • be8ac08 - Add dorado correct support for read error correction
  • 67dc5ba - Poly(A) estimation for plasmids and interrupted tails
  • 381f6c3 - Enable adapter trimming when poly(A) estimation is requested
  • d6b0f68 - Change data type of mean Q-score (qs tag) to float
  • f938c41 - List supported models in structured format
  • 70ff95d - Enable dorado summary to run on trimmed BAM files
  • 6373792 - Detect presence of midstrand barcodes to reduce false positive classifications
  • 68d40da - Add support for --junc-bed minimap2 splice option
  • c443f75 - Output BAM instead of SAM from dorado trim command
  • a3dce7e - Support dorado demux from input folders with mix of PG and SQ headers
  • 08e2c7b - Speed up sorting and merging of BAM files
  • b8de2d9 - Set maximum memory sizes in minimap2
  • b8de2d9 - Calculate scaling for RNA on non-adapter signal only
  • c88e9f7 - Update CMake Minimum Version to 3.23

v0.6.2

10 May 03:13

[0.6.2] (9 May 2024)

This release of Dorado disables trimming of the rapid adapter during basecalling, which was causing some RBK datasets to have a high unclassified rate during demux.

  • a64492b - Fix bug with loading reverse aligned records in dorado demux and trim
  • 6cc278f - Disable rapid adapter trimming to prevent signal overtrimming in some RBK datasets

v0.6.1

24 Apr 01:29

[0.6.1] (23 April 2024)

This release of Dorado fixes a bug in dorado aligner that incorrectly overrode minimap2 options when a preset was specified, and bugs in dorado demux that caused demultiplexed outputs to be malformed.

  • 3e060db - Skip stripping of SQ header lines in dorado demux --no-classify
  • a2abf83 - Fix incorrect overriding of minimap2 options when minimap2 preset is specified
  • 1cc207a - Fix bug causing unclassified records from dorado demux to be unreadable by samtools
  • 2982771 - Fix issue with allocating memory on unused GPU during basecalling
  • fa79f4a - Fix reverse strand alignments when re-mapping a SAM/BAM file with dorado aligner
  • 3b2c825 - Propagate sv tag to split reads
  • 11675a5 - Fix bug where errors were being swallowed in HtsFile class
  • 73046e1 - Fix typo in Warnings.cmake

v0.6.0

02 Apr 13:54

[0.6.0] (2 April 2024)

This release of Dorado improves performance for short read basecalling and RBK barcode classification rates, introduces sorted and indexed BAM generation in Dorado aligner and demux, and updates the minimap2 version and default mapping preset. It also adds GPU information to the output BAM or FASTQ and includes several other improvements and bug fixes.

New feature highlights

  1. --emit-summary option to generate summary files from dorado demux and dorado aligner.
  2. Support for loading inputs from/saving outputs to a folder for dorado demux and dorado aligner.
  3. --bed-file option in dorado aligner to capture alignment hits in specific intervals of the reference. Hits per read are stored in the bh:i tag.
  4. --sort-bam option in dorado demux to output sorted reads when input is sorted and barcodes are not trimmed.
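
As an illustration of how the bh:i tag might be consumed downstream, the sketch below lists the reads that have at least one hit in the BED intervals. It is not part of Dorado: it assumes plain-text SAM input (for example, a BAM file converted to SAM), treats a missing bh tag as zero hits (an assumption), and the function names bed_hits and reads_with_hits are hypothetical.

```python
def bed_hits(sam_line):
    """Return the integer value of the bh tag, or 0 if the tag is absent.

    Treating a missing tag as zero hits is an assumption of this sketch.
    """
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields begin at column 12 of a SAM alignment record.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[0] == "bh" and parts[1] == "i":
            return int(parts[2])
    return 0


def reads_with_hits(sam_lines):
    """Return the names of reads whose bh tag reports one or more hits."""
    names = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        if bed_hits(line) > 0:
            names.append(line.split("\t", 1)[0])
    return names


# Made-up records for illustration only.
records = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tbh:i:2",
    "r2\t0\tchr1\t900\t60\t4M\t*\t0\t0\tACGT\tIIII\tbh:i:0",
]
print(reads_with_hits(records))
```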

Changes to default behavior

  1. Default mapping preset for dorado aligner updated to lr:hq.
  2. dorado trim and dorado demux now output unaligned records by default (i.e. all alignment information such as tags and headers removed).

Backwards incompatible changes

  1. New scoring parameters for barcode classification to support an updated classification algorithm. Older scoring config files will no longer be compatible.

All key changes

  • dc22d7f - Update method for barcode classification
  • e65eaf4 - Improve basecalling speed on short reads
  • f0b829d - Emit sorted, indexed BAM files from dorado demux and dorado aligner
  • 913f062 - Add DS:gpu information to output FASTQ and SAM/BAM files
  • c459890 - Added support for demux and aligner reading from a folder and a --recursive option
  • d994a4d - Add --emit-summary option to dorado demux and dorado aligner
  • 246b9b9 - Add --bed-file argument to dorado aligner
  • f6b6554 - Add --sort-bam option to dorado demux
  • 9b49ae5 - Update to minimap2-2.27 and use lr:hq as default mapping preset
  • a0f9462 - Add RG and st tags to FASTQ for consistency with BAM
  • ae47155 - Calculate mean Q-score for RNA on bases after the poly(A)
  • 3cf15fa - Trimming rapid adapter from raw signal
  • b40d001 - Improve read splitting for RBK
  • 9d3af87 - Trim low-quality data from reads with end reason mux_change or unblock_mux_change
  • ec106d6 - Improve performance of calling modified bases on NVIDIA GPUs
  • 77c5599 - Improve Apple silicon auto batch sizing
  • b4fdb24 - Fix bug with MM/ML tags not updating correctly with dorado trim
  • bacd354 - Remove invalidated tags if running dorado demux or dorado trim on aligned BAM
  • b6077db - Fix bug with modbase model auto detection on @v0
  • ba0d708 - Ensure ts set to zero if --no-trim or --estimate-poly-a enabled
  • 12c5a3e - Fix duplicate SQ lines in header of aligned BAM
  • 9dc052d - Ensure read group header lines include custom barcodes
  • e8fb085 - Skip barcode trimming when running poly(A) estimation
  • bbe6ad6 - Handle issues related to user locale
  • bdc05e3 - Fix bug using simplex-only model complex and --modified-bases{-models} arguments
  • b31e5c8 - Fix resume loading for split reads
  • 2919fe0 - Fix bug with custom barcode arrangements
  • 98763da - Fix bug when aligner writing to stdout
  • 74b4b53 - Fix regression with calling modified bases on macOS
  • 3929003 - Perform an allocation-less matmul when using torch
  • 6f283a5 - Prevent CUDA OOM due to small allocations
  • 0fa2c2f - Fix CUDA OOM during batch size calculation
  • 7506d44 - Add support for additional barcodes
  • 13ba5af - Add deprecation warning for FAST5
  • b5dc9f8 - Update to Koi v0.4.5
  • c9c5ad0 - Update to POD5 v0.2.4
  • 901f700 - Improve error reporting when the device string is invalid for CUDA devices
  • e3442ec - Log errors reported by Metal and enable warnings
  • e61cfe4 - Output Dorado commandline arguments in logs
  • de59f33 - Move default download path for third-party libraries into the build folder
  • d7defcc - Log a warning message if running on Apple Silicon with less than 16GB RAM
  • 8dfd180 - Consolidate pipeline node input thread handling
  • 4018823 - Update DEV.md to install the correct package

v0.5.3

06 Feb 15:10

[0.5.3] (06 Feb 2024)

This release of Dorado fixes a bug causing low Poly(A) estimation rates in RNA.

  • 59a083c - Fix RNA poly(A) tail estimation in the absence of adapter trimming.
  • f0f9883 - Clarify ns tag in Dorado SAM spec.

v0.5.2

19 Jan 10:00

[0.5.2] (18 Jan 2024)

This release of Dorado fixes a bug causing malformed CIGAR strings, prevents crashing when calling modifications with duplex, and improves adapter and primer trimming support.

  • 062e5e3 - Fix malformed CIGAR string for non-primary alignment
  • 0a057bb - Fix duplex modifications crash
  • d453db2 - Add missing support for RAD adapter detection and trimming
  • 8c2d004 - Correctly trim modbase tags for reverse strand alignments
  • 76f24b2 - Update custom barcode documentation
  • 9959654 - Only require standardisation parameters if standardisation is active

v0.5.1

21 Dec 21:49

[0.5.1] (21 Dec 2023)

This release of Dorado fixes bugs with adapter trimming and custom barcodes, introduces a more accurate 6mA model, and adds several quality of life improvements.

  • 9a46392 - Replace use of constant with a parameter from custom barcode file.
  • 1893d69 - Decouple basecall library from models library.
  • e42761c - Allow RNA adapter trimming to be skipped.
  • a510d53 - Prevent simultaneous usage of multiple modbase models affecting the same canonical base.
  • 371a252 - Fix incorrect sample count in the ns tag with sequence trimming.
  • 9f532ff - Remove modbase tags for non-primary alignments except when soft clipping is enabled.
  • 52431e6 - Update 6mA model.
  • 7109c1c - Remove superfluous clamp from Metal model implementation.
  • 5fa4de7 - Refactor decoder interfaces.
  • a3dfc94 - Improve README for adapter trimming.
  • 3bfb1f0 - Fix bug with out-of-order primer trimming positions.
  • b1302ae - Allow alignment to be skipped for disconnected clients.
  • 55d09f9 - Update HDF5 pre-built library location.
  • 2048ad5 - Decrease httplib connection timeout.
  • aae47b1 - Refactor codebase to unify interfaces and reduce dependencies.
  • 6ed81c5 - Run separate modbase models in different CUDA streams.
  • decb9e7 - Update build settings to simplify integration into basecall server.
  • e8b07e2 - Report warning and skip FAST5 files when datasets contain FAST5 and POD5 files.
  • 6c984a0 - Enable Xcode builds.
  • 6d31793 - Split Metal LSTM kernel into multiple command buffers.
  • 364d15d - Fix bug with passing custom barcode file into basecaller command.
  • 951e3c3 - Allow read to override adapter/primer trimming defaults.
  • d6e2a80 - Clean up model auto download directories.
  • c552351 - Improve error handling during model auto download.
  • 936d408 - Report incorrect results warning for CPU basecalling on TX2.

v0.5.0

05 Dec 19:30

[0.5.0] (5 Dec 2023)

This release of Dorado introduces new, more accurate, and faster v4.3 basecalling models. It also enables hemi-methylation basecalling of duplex reads. Dorado now supports DNA primer and adapter trimming, custom barcode arrangements and sequences, and can automatically select the correct model for your data. Furthermore, this release introduces speed and memory enhancements for basecalling on Apple silicon and various stability improvements.

  • 1415969 - Add v4.3 basecalling models
  • b7d4b38 - Support for modified bases with duplex basecalling (hemi-methylation)
  • 30e639c - Primer and adapter trimming
  • fb85a70 - Enable automatic model selection
  • 16e5b6a - Support for custom barcode arrangements and sequences
  • 46bbfdd - Add barcode column to summary file
  • e9f060c - Improve the precision of read splitting
  • 4102ffc - Increase speed of v4.3 model execution
  • 0a07110 - Prevent progress bar from --resume-from logging excessive dots
  • 20b5637 - Ensure that aligner outputs SAM when not piped to a file
  • 942a35a - Add MN tag to output BAM to help downstream tools interpret modified base tags
  • f0ac935 - Added modbase model name to BAM files in RG header section.
  • a7fa371 - Improve performance of HAC and SUP on Apple silicon
  • 152d5fd - Improvements to auto batch sizing on Apple silicon
  • b0767a6 - Fix bug causing segfault with summary command on Windows
  • 1c2c6a9 - Make AVX reverse_complement implementation preserve nucleotide case
  • 4a4dd1c - Use updated Koi functions for small LSTM layers, final convolutional layer in LSTM models, and final linear layer

v0.4.3

14 Nov 19:41

[0.4.3] (14 Nov 2023)

This release of Dorado introduces a new RNA m6A modified base model and initial support for poly(A)/poly(T) tail length estimation. It also introduces duplex performance enhancements and bug fixes to improve the stability of Dorado.

  • 803e3a7 - Add RNA m6A DRACH-context model
  • 0f282cd - Add poly(A)/poly(T) tail length estimation support for RNA and cDNA
  • 54e14ca - Add RNA read splitting
  • 2dc1f03 - Enable RNA adapter trimming
  • 80114c0 - Correctly update CIGAR and POS entries when trimming barcodes
  • 4b2025c - Add documentation for sample sheet support
  • 641cb08 - Reduce host memory footprint for duplex basecalling
  • 7c1c0f0 - Reduce working reads size, in particular for duplex.
  • 831f0a9 - Fix pairing check for split reads in duplex basecalling
  • b630567 - Account for split reads during progress tracking
  • 383fe02 - Update to Koi v0.4.1
  • 873c6b1 - Fix warnings about ONLY_C_LOCAL mismatches in PCH builds
  • 52cbabf - Encapsulate date dependency
  • 8fb8a4d - Disable Cutlass LSTM codepath for 128-wide LSTM layers because this kernel does not work
  • 6a9dad9 - Enable warnings as errors at build time
  • 5aaef31 - Address auto batchsize issues on unified memory Linux systems
  • 92b5a67 - Reduce compilation times
  • 062e3fd - Minor speed improvements to CPU beam search

v0.4.2

30 Oct 09:27

[0.4.2] (30 Oct 2023)

This release of Dorado fixes a bug with the CpG-context 5mC/5hmC model calling all contexts and adds beta support for using a barcode alias from a sample sheet.

  • 90a4d01 - Fix motif for 5mCG_5hmCG compatible with dna_r10.4.1_e8.2_400bps_sup@v4.2.0
  • 616b951 - Beta support for sample sheet aliasing