This is a very fast basecaller that can basecall reads as quickly as they come from a MinION on an ordinary laptop.
If you find this work useful, please cite it (a preprint about these updates is coming soon):
- Tested only on 64-bit Linux (external parties have made it work on macOS, see below).
- Only R9.4.1 is supported for now.
- On AMD CPUs it is advised to set the following (a quick way to check your CPU vendor is sketched after this list):

  ```
  export MKL_DEBUG_CPU_TYPE=5
  ```

- You need Python 3 (tested with Python 3.9). The PyO3 package needs at least Python 3.5.
- In some situations you might need to set:

  ```
  export OMP_NUM_THREADS=1
  ```
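If you are not sure whether the AMD advice applies to your machine, here is a quick vendor check (a Linux-only sketch; it just reads /proc/cpuinfo and is not part of the package):

```python
# Linux-only sketch: check the CPU vendor to decide whether to set
# MKL_DEBUG_CPU_TYPE=5 (relevant on AMD CPUs, as noted above).
with open("/proc/cpuinfo") as f:
    info = f.read()
if "AuthenticAMD" in info:
    print("AMD CPU detected: consider `export MKL_DEBUG_CPU_TYPE=5`")
else:
    print("Non-AMD CPU: the MKL workaround should not be needed")
```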
- Install Rust (the programming language, not the game; besides, you should already have it ;) ). You can view the instructions here: https://www.rust-lang.org/tools/install, or just run:

  ```
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  ```

- Ask for the nightly version:

  ```
  rustup default nightly-2021-01-03
  ```

  (other nightly versions might also work, but this one is tested).
- Prepare your conda environment (you do not have to do this if you manage your Python packages in a different way):

  ```
  conda create python=3.9 --name deepnanoblitz
  ```

  and activate it:

  ```
  conda activate deepnanoblitz
  ```

- Clone this repository:

  ```
  git clone https://github.com/fmfi-compbio/deepnano-blitz.git
  ```

- Go inside the package:

  ```
  cd deepnano-blitz
  ```

- Run:

  ```
  python setup.py install
  ```

- Change your Rust version back to whatever you like (not needed).
- Warning, outdated: prepare your environment (conda, virtualenv, ...) with Python 3.6; on a Linux machine with Python 3.6 (and a CPU with AVX2 instructions) you can then use:

  ```
  pip install dist/deepnano2-0.1-cp36-cp36m-linux_x86_64.whl
  ```
We heavily rely on the MKL libraries (downloaded from https://anaconda.org/intel/mkl-static/files via https://github.com/fmfi-compbio/deepnano-blitz/blob/master/build.rs#L17). Changing the download URL (and maybe some filenames) should work on other platforms. If it does, please send us a pointer to the tested configuration (we will then add platform detection and the relevant code branches to master).
Try one of the following (ordered by increasing accuracy and decreasing speed):

```
deepnano2_caller.py --output out.fasta --directory reads_directory/ --network-type 48 --beam-size 1
deepnano2_caller.py --output out.fasta --directory reads_directory/ --network-type 48
deepnano2_caller.py --output out.fasta --directory reads_directory/ --network-type 56
deepnano2_caller.py --output out.fasta --directory reads_directory/ --network-type 64
deepnano2_caller.py --output out.fasta --directory reads_directory/ --network-type 80
deepnano2_caller.py --output out.fasta --directory reads_directory/ --network-type 96
```
You can also increase the number of threads:

```
deepnano2_caller.py --output out.fasta --directory reads_directory/ --threads 4
```

Or ask for FASTQ output or compressed output:

```
deepnano2_caller.py --output out.fastq --directory reads_directory/ --output-format fastq
deepnano2_caller.py --output out.fasta.gz --directory reads_directory/ --gzip-output
```

For more accurate (but much slower) basecalling, run:

```
deepnano2_caller.py --output out.fasta --directory reads_directory/ --threads 16 --network-type 256
```
You can check the installation via:

```
deepnano2_caller.py --output testx.fastq --reads test_sample/*.fast5 --network-type 64 --beam-size 5 --threads 1 --output-format fastq
```

and compare the result to test_sample/expected.fastq.
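If you prefer a programmatic check, here is a minimal sketch (not part of the package; it assumes standard four-line FASTQ records and the file names from the command above):

```python
# Minimal sketch: compare basecalled sequences in testx.fastq against
# test_sample/expected.fastq. Assumes standard four-line FASTQ records.
def read_fastq_seqs(path):
    seqs = {}
    with open(path) as f:
        lines = f.read().splitlines()
    for i in range(0, len(lines) - 3, 4):
        read_id = lines[i].split()[0].lstrip("@")  # line i is the header
        seqs[read_id] = lines[i + 1]               # line i+1 is the sequence
    return seqs

got = read_fastq_seqs("testx.fastq")
expected = read_fastq_seqs("test_sample/expected.fastq")
matches = sum(got.get(read_id) == seq for read_id, seq in expected.items())
print(f"{matches}/{len(expected)} reads match the expected output exactly")
```

Exact matches are a reasonable expectation for a deterministic caller, though small differences can appear across versions.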
The basecaller can also be used directly from Python:

```python
import deepnano2
import os
import numpy as np

def med_mad(x, factor=1.4826):
    """
    Calculate signal median and median absolute deviation
    """
    med = np.median(x)
    mad = np.median(np.absolute(x - med)) * factor
    return med, mad

def rescale_signal(signal):
    signal = signal.astype(np.float32)
    med, mad = med_mad(signal)
    signal -= med
    signal /= mad
    return signal

network_type = "48"
beam_size = 5
beam_cut_threshold = 0.01
weights = os.path.join(deepnano2.__path__[0], "weights", "rnn%s.txt" % network_type)
caller = deepnano2.Caller(network_type, weights, beam_size, beam_cut_threshold)

# Minimal size for calling is STEP*3 + PAD*6 + 1 (STEP and PAD are defined in src/lib.rs)
signal = np.random.normal(size=(1000*3 + 10*6 + 1))
signal = rescale_signal(signal)

print(caller.call_raw_signal(signal))
```
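To basecall a real read instead of random data, you can load the raw signal from a fast5 file yourself. Below is a minimal sketch using h5py (h5py is not a dependency of this package; the sketch assumes the single-read fast5 layout with the raw signal under `Raw/Reads/<read>/Signal`, and the file name is hypothetical):

```python
import h5py

def load_raw_signal(fast5_path):
    # Single-read fast5 layout: the raw signal lives in the (only) read group
    # under Raw/Reads. Multi-read fast5 files are organized differently.
    with h5py.File(fast5_path, "r") as f:
        read_group = next(iter(f["Raw/Reads"].values()))
        return read_group["Signal"][()]

raw = load_raw_signal("test_sample/read.fast5")  # hypothetical file name
print(caller.call_raw_signal(rescale_signal(raw)))
```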
When using only a very short first part of a read (and especially when you have lowered STEP), you might need to cut off the bad part of the read (usually called the stall) using the following (the trim function is adapted from nanonet):
```python
def trim(signal, window_size=40, threshold_factor=3.0, min_elements=3):
    med, mad = med_mad(signal[-(window_size*25):])
    threshold = med + mad * threshold_factor
    num_windows = len(signal) // window_size
    for pos in range(num_windows):
        start = pos * window_size
        end = start + window_size
        window = signal[start:end]
        if len(window[window > threshold]) > min_elements:
            if window[-1] > threshold:
                continue
            return end
    return 0

# Call this before normalization
start = trim(signal)
signal = signal[start:]
```
Alternatively, you can use this trimming function (courtesy of https://github.com/Gogis0):
```python
def trim_blank(sig, window=300):
    N = len(sig)
    variances = [np.var(sig[i:i+window]) for i in range(N//2, N-window, window)]
    mean_var = np.mean(variances)
    trim_idx = 20
    while window > 5:
        while np.var(sig[trim_idx: trim_idx + window]) < 0.3*mean_var:
            trim_idx += 1
        window //= 2
    return trim_idx
```
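Putting the pieces together, here is a minimal end-to-end sketch (the placeholder signal stands in for whatever raw signal you have loaded; `caller`, `rescale_signal`, and `trim_blank` are defined above):

```python
# End-to-end sketch: trim the stall, then normalize, then basecall.
raw_signal = np.random.normal(size=5000).astype(np.float32)  # placeholder for a real raw signal
trimmed = raw_signal[trim_blank(raw_signal):]                # trim before normalization
print(caller.call_raw_signal(rescale_signal(trimmed)))
```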
Benchmarks were run on a subset of reads from a Klebsiella dataset and also on a human dataset.
The MinION's theoretical maximum output is 2M signals/s (which in reality never happens; a realistic number is around 1.5M signals/s).
Speed is given in signals/s; for what these numbers imply for live basecalling, see the sketch after the table.
Network type | Laptop 1 core | Laptop 4 cores | Xeon 1 core | Xeon 4 cores | Klebsiella mapped % | Klebsiella 10% acc | Klebsiella median acc | Klebsiella 90% acc | Human mapped % | Human median acc | Human 90% acc |
---|---|---|---|---|---|---|---|---|---|---|---|
w48, beam1 | 1.4M | 4.4M | 1.0M | 4.1M | 98.6 | 73.0 | 84.0 | 88.8 | 84.3 | 80.6 | 86.8 |
w48, beam5 | 1.3M | 3.8M | 920.6K | 3.8M | 98.8 | 75.2 | 85.1 | 89.5 | 84.9 | 81.6 | 87.7 |
w56, beam5 | 941.8K | 2.8M | 649.7K | 2.6M | 98.9 | 76.1 | 85.9 | 90.2 | 86.1 | 82.3 | 88.5 |
w64, beam5 | 728.8K | 2.1M | 486.3K | 2.1M | 99.0 | 77.2 | 86.6 | 90.8 | 85.5 | 83.4 | 89.3 |
w80, beam5 | 477.6K | 1.5M | 351.5K | 1.4M | 99.0 | 77.3 | 87.3 | 91.6 | 86.1 | 84.3 | 89.8 |
w96, beam5 | 324.1K | 1.0M | 249.1K | 1.0M | 99.3 | 79.0 | 88.4 | 92.4 | 87.4 | 85.9 | 91.0 |
guppy 3.4.4 fast | 87.9K | 328.6K | 66.4K | 264.4K | 99.5 | 79.6 | 88.4 | 92.5 | 89.1 | 85.1 | 91.0 |
guppy 3.4.4 hac | 9.5K | 35.1K | 7.2K | 29.1K | 99.5 | 81.6 | 90.6 | 94.5 | 89.6 | 87.4 | 93.3 |
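As a back-of-the-envelope illustration (a sketch using the single-core Xeon speeds from the table above and the realistic MinION output of 1.5M signals/s mentioned earlier; it assumes perfect linear scaling across cores, which the 4-core columns show is only approximately true):

```python
import math

# Realistic MinION output (signals/s), per the text above
minion_signals_per_s = 1.5e6

# Single-core Xeon speeds (signals/s) from the table above
xeon_one_core = {
    "w48, beam1": 1.0e6,
    "w48, beam5": 920.6e3,
    "w64, beam5": 486.3e3,
    "w96, beam5": 249.1e3,
    "guppy 3.4.4 fast": 66.4e3,
}

for name, speed in xeon_one_core.items():
    cores = math.ceil(minion_signals_per_s / speed)
    print(f"{name}: ~{cores} Xeon core(s) to keep up in real time")
```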
There is also a GPU version of the basecaller, which is slightly worse and slower than guppy, but does not require compute capability 6.2 (anything that can run PyTorch 1.0 is good enough).
If you want to use the GPU version, we recommend setting up a conda environment with PyTorch first.
It can be run like this:

```
deepnano2_caller_gpu.py --output out.fasta --directory reads_directory/
```

If you have a newer GPU (like the RTX series), this might be faster:

```
deepnano2_caller_gpu.py --output out.fasta --directory reads_directory/ --half --batch-size 2048
```
Note that we are using 16 threads. Results on the Klebsiella dataset:
Basecaller | Time to basecall | 10%-percentile accuracy | Median accuracy | 90%-percentile accuracy |
---|---|---|---|---|
Guppy 3.3.0 hac, 16 threads XEON E5-2695 v4 | 18m 16s | 82.5% | 89.8% | 93.8% |
DN-blitz big, 16 threads XEON E5-2695 v4 | 12m 5s | 82.1% | 89.1% | 93.2% |
DN-blitz big, 16 threads XEON E5-2695 v4, beam | 12m 30s | 83.4% | 89.8% | 93.6% |
-------------------------------------------------- | ---- | -------- | ---- | ------- |
Guppy 3.3.0 hac, Titan XP GPU | 36s | 82.5% | 89.8% | 93.8% |
DN-blitz big, Titan XP GPU | 54s | 82.2% | 89.2% | 93.2% |
-------------------------------------------------- | ---- | -------- | ---- | ------- |
DN-blitz big, Quadro M1200 GPU (laptop) | 4m 5s | 82.2% | 89.2% | 93.2% |
Guppy 3.3.0 hac, Quadro M1200 GPU (laptop) | N/A | N/A | N/A | N/A |
TODO: beam search for the GPU version and RTX results.