Merge branch 'slee/release-v0.4.3' into 'master'
Release v0.4.3

See merge request machine-learning/dorado!704
tijyojwad committed Nov 14, 2023
2 parents 9c62776 + 2e14ba5 commit 656766b
Showing 4 changed files with 35 additions and 11 deletions.
24 changes: 24 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,30 @@

All notable changes to Dorado will be documented in this file.

# [0.4.3] (14 Nov 2023)

This release of Dorado introduces a new RNA m6A modified base model and initial support for poly(A)/poly(T) tail length estimation. It also introduces duplex performance enhancements and bug fixes to improve the stability of Dorado.

* 803e3a7ce2590b1c95b4754117185983ac2ad560 - Add RNA m6A DRACH-context model
* 0f282cde507a36bf91863270bd0323564235c15b - Add poly(A)/poly(T) tail length estimation support for RNA and cDNA
* 54e14ca01e7391c8857989da7db086a4591375a1 - Add RNA read splitting
* 2dc1f039cac7f3e6cd082b77a5b020fed5488e2f - Enable RNA adapter trimming
* 80114c08c45bc902843a2e18b5949ebf5cfefdf2 - Correctly update CIGAR and POS entries when trimming barcodes
* 4b2025c57fd3b87b2ce6cd52be07adfd9ae5acf9 - Add documentation for sample sheet support
* 641cb08b457d727c3da682185c6fe491df49dab2 - Reduce host memory footprint for duplex basecalling
* 7c1c0f04d93113d4dd2c632bdcd242304b54d270 - Reduce working reads size, in particular for duplex.
* 831f0a91f0100c2586720f6026450fdbae1a8d21 - Fix pairing check for split reads in duplex basecalling
* b63056743be6e5442f2f5af65a36c592bbf96184 - Account for split reads during progress tracking
* 383fe0226bfa7956705376ac5e4a32096ff80c45 - Update to Koi v0.4.1
* 873c6b11e0113735b21305afce5057138558388d - Fix warnings about `ONLY_C_LOCAL` mismatches in PCH builds
* 52cbabff83de3c9fb6f1a0db9194828b92418855 - Encapsulate `date` dependency
* 8fb8a4df567ba22df6a298f4e30277a0d47ceaa4 - Disable Cutlass LSTM codepath for 128-wide LSTM layers because this kernel does not work
* 6a9dad907af8dd2b4e556d49a329a8a0fbc5c32c - Enable warnings as errors at build time
* 5aaef312027836ffbd6e2b944e6cd3ba4a259267 - Address auto batchsize issues on unified memory Linux systems
* 92b5a6792fca4d2bb2b76727ec486efe8bdfae97 - Reduce compilation times
* 062e3fd53f58380070efff660303b71c03cd02c0 - Minor speed improvements to CPU beam search


# [0.4.2] (30 Oct 2023)

This release of Dorado fixes a bug with the CpG-context 5mC/5hmC model calling all contexts and adds beta support for using a barcode alias from a sample sheet.
16 changes: 8 additions & 8 deletions README.md
@@ -19,10 +19,10 @@ If you encounter any problems building or running Dorado, please [report an issu

## Installation

- [dorado-0.4.2-linux-x64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-linux-x64.tar.gz)
- [dorado-0.4.2-linux-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-linux-arm64.tar.gz)
- [dorado-0.4.2-osx-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-osx-arm64.zip)
- [dorado-0.4.2-win64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.2-win64.zip)
- [dorado-0.4.3-linux-x64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-linux-x64.tar.gz)
- [dorado-0.4.3-linux-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-linux-arm64.tar.gz)
- [dorado-0.4.3-osx-arm64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-osx-arm64.zip)
- [dorado-0.4.3-win64](https://cdn.oxfordnanoportal.com/software/analysis/dorado-0.4.3-win64.zip)

## Platforms

@@ -201,12 +201,12 @@ SQK-RPB004_barcode03.bam
unclassified.bam
```

#### Using a Sample Sheet
`dorado` is able to use a sample sheet to restrict the barcode classifications to only those present, and to apply aliases to the detected classifications. This is enabled by passing the path to a sample sheet to the `--sample-sheet` argument when using the `basecaller` or `demux` commands. See [here](documentation/SampleSheets.md) for more information.
#### Using a sample sheet
Dorado is able to use a sample sheet to restrict the barcode classifications to only those present, and to apply aliases to the detected classifications. This is enabled by passing the path to a sample sheet to the `--sample-sheet` argument when using the `basecaller` or `demux` commands. See [here](documentation/SampleSheets.md) for more information.

### Poly(A) tail estimation

Dorado has initial support for estimating poly(A) tail lengths for DNA and RNA. Note that Oxford Nanopore cDNA reads sequence in two different orientations and transcript poly(A) length estimation handles both (A and T homopolymers). This feature can be enabled by passing `--estimate-poly-a` to the `basecaller` command. It is disabled by default. The estimated tail length is stored in the `pt:i` tag of the output record. Reads for which the tail length could not be estimated will not have the `pt:i` tag.
Dorado has initial support for estimating poly(A) tail lengths for cDNA and RNA. Note that Oxford Nanopore cDNA reads are sequenced in two different orientations and Dorado poly(A) tail length estimation handles both (A and T homopolymers). This feature can be enabled by passing `--estimate-poly-a` to the `basecaller` command. It is disabled by default. The estimated tail length is stored in the `pt:i` tag of the output record. Reads for which the tail length could not be estimated will not have the `pt:i` tag.
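
As a rough illustration of the `pt:i` tag described above, the sketch below reads the tail length from a single record. It assumes the output records are available as SAM text lines; the function name `poly_a_tail_length` and both sample records are hypothetical and made up for this example, not taken from Dorado.

```python
from typing import Optional


def poly_a_tail_length(sam_line: str) -> Optional[int]:
    """Return the pt:i value of one SAM text record, or None if the tag is absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns are the mandatory SAM fields; optional tags follow.
    for tag in fields[11:]:
        if tag.startswith("pt:i:"):
            return int(tag[len("pt:i:"):])
    # No pt:i tag: the tail length could not be estimated for this read.
    return None


# Hypothetical records, invented for illustration only.
with_tail = "read1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!\tpt:i:87"
without_tail = "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t!!!!"

print(poly_a_tail_length(with_tail))
print(poly_a_tail_length(without_tail))
```

Returning `None` rather than a sentinel such as `0` keeps "no estimate" distinct from a genuinely short tail, which matches the behaviour of omitting the tag.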

## Available basecalling models

@@ -273,7 +273,7 @@ Below is a table of the available basecalling models and the modified basecallin
| :-------- | :------- | :--- | :--- |
| **rna004_130bps_fast@v3.0.1** | N/A | N/A | 4 kHz |
| **rna004_130bps_hac@v3.0.1** | N/A | N/A | 4 kHz |
| **rna004_130bps_sup@v3.0.1** | 6mA_DRACH | v1 | 4 kHz |
| **rna004_130bps_sup@v3.0.1** | m6A_DRACH | v1 | 4 kHz |
| rna002_70bps_fast@v3 | N/A | N/A | 3 kHz |
| rna002_70bps_hac@v3 | N/A | N/A | 3 kHz |

2 changes: 1 addition & 1 deletion cmake/DoradoVersion.cmake
@@ -1,6 +1,6 @@
set(DORADO_VERSION_MAJOR 0)
set(DORADO_VERSION_MINOR 4)
set(DORADO_VERSION_REV 2)
set(DORADO_VERSION_REV 3)

find_package(Git QUIET)
if(GIT_FOUND AND EXISTS "${PROJECT_SOURCE_DIR}/.git")
4 changes: 2 additions & 2 deletions documentation/SampleSheets.md
@@ -1,4 +1,4 @@
# Sample Sheet specification
# Sample sheet specification

`dorado` can make use of a MinKNOW-compatible sample sheet containing data used to identify a particular classification of read. To apply a sample sheet, provide the path to the appropriate CSV file using the `--sample-sheet` argument:

@@ -20,7 +20,7 @@ Note that `dorado` currently uses the sample sheet only for barcode filtering an

In the case of `demux`, the sample sheet must contain a 1-to-1 mapping of `barcode` identifiers to `flow_cell_id`/`position_id` - i.e. all entries in the `barcode` column must be unique.
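
The uniqueness requirement on the `barcode` column could be checked before running `demux` with something like the sketch below. It is only an illustration: the helper name `duplicated_barcodes` is hypothetical, the sheet is a reduced example with invented values, and it is not necessarily a complete, valid sample sheet.

```python
import csv
import io
from collections import Counter
from typing import List


def duplicated_barcodes(sheet_text: str) -> List[str]:
    """Return the barcode values that occur more than once in the sheet."""
    reader = csv.DictReader(io.StringIO(sheet_text))
    counts = Counter(row["barcode"] for row in reader)
    return sorted(barcode for barcode, n in counts.items() if n > 1)


# Hypothetical sheet contents, invented for illustration only.
sheet = (
    "flow_cell_id,position_id,barcode\n"
    "FC0001,P1,barcode01\n"
    "FC0001,P1,barcode02\n"
    "FC0001,P1,barcode01\n"
)

print(duplicated_barcodes(sheet))
```

An empty list means every `barcode` entry is unique, which is the condition stated above for `demux`.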

#### Column Headers
#### Column headers

A sample sheet may only contain the column names below:
| | | |
