Using BLAST databases now requires a preprocessing step with the prepdb command: diamond prepdb -d /path/to/database. This call runs quickly and writes some small auxiliary files into the database directory.
Improved performance of searching small query files.
Added the "iterative" search mode (option --iterate), which searches the query dataset with increasing sensitivity: at each sensitivity level, only those queries that did not produce a significant alignment at a lower sensitivity are searched. For example, --sensitive --iterate first searches the query file at default sensitivity, then searches again in --sensitive mode all query sequences that failed to align in the first round.
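The control flow of an iterative search can be sketched in Python as follows. This is only an illustration of the idea, not DIAMOND's implementation: the function iterative_search, the search callback, the level names and the max_evalue cutoff are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical callback: takes a query ID and a sensitivity level and returns
# the best e-value found, or None if the query did not align at all.
SearchFn = Callable[[str, str], Optional[float]]


def iterative_search(
    queries: Iterable[str],
    search: SearchFn,
    levels: List[str],
    max_evalue: float = 0.001,  # assumed significance cutoff, for illustration
) -> Tuple[Dict[str, Tuple[str, float]], List[str]]:
    """Search at increasing sensitivity, re-running only unaligned queries."""
    hits: Dict[str, Tuple[str, float]] = {}
    remaining = list(queries)
    for level in levels:
        still_unaligned = []
        for query in remaining:
            evalue = search(query, level)
            if evalue is not None and evalue <= max_evalue:
                # Significant hit: this query is done and is not searched again.
                hits[query] = (level, evalue)
            else:
                still_unaligned.append(query)
        remaining = still_unaligned
        if not remaining:
            break  # every query aligned; skip the more expensive levels
    return hits, remaining
```

The point of the design is that the more expensive levels only ever see the queries that the cheaper levels could not align.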
Added the "global ranking" mode (option -g) to set a limit on the number of Smith-Waterman extensions per query, with the target sequences ranked by their ungapped extension scores.
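As a rough illustration of ranking targets before the expensive extension step, the Python sketch below keeps only the best-scoring targets for one query. The function name select_targets_for_extension and the dictionary layout are assumptions made for this example; how DIAMOND ranks targets internally is not described here.

```python
import heapq
from typing import Dict, List


def select_targets_for_extension(
    ungapped_scores: Dict[str, int], limit: int
) -> List[str]:
    """Return up to `limit` target IDs with the highest ungapped scores.

    `ungapped_scores` maps a (hypothetical) target ID to its ungapped
    extension score for one query. Only the returned targets would go on
    to the more expensive Smith-Waterman extension.
    """
    return heapq.nlargest(limit, ungapped_scores, key=ungapped_scores.get)


scores = {"t1": 50, "t2": 120, "t3": 80, "t4": 30}
top_two = select_targets_for_extension(scores, 2)
```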
Added the --fast sensitivity mode that is faster and less sensitive than the default mode.
Reduced the time for loading target sequences from BLAST databases.
Added the contiguous-seed mode (option --algo ctg) to improve performance for small query files.
Added support for using --comp-based-stats (3,4) in combination with --ext full.
Fixed a bug that could cause a Traceback error when using --comp-based-stats (3,4) in rare cases.
Changed the full_sseq output field to always contain unmasked sequences.
Fixed an issue that could cause target output order to be nondeterministic in case of identically scoring hits.
Added support for reading zstd-compressed input files (auto-detected) and writing zstd-compressed output files (option --compress zstd). This requires compilation using cmake -DWITH_ZSTD=ON.
Compilation with BLAST database support requires the zstd library.
Added an error message when reading protein sequences from FASTA files that only contain DNA letters (can be disabled using --ignore-warnings).
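A check of this kind might look like the following Python sketch. The letter set ACGTN and the function name looks_like_dna are assumptions for illustration; the exact rule DIAMOND applies is not stated here.

```python
# Assumed DNA alphabet for this sketch; the real check may differ.
DNA_LETTERS = frozenset("ACGTN")


def looks_like_dna(fasta_text: str) -> bool:
    """Return True if every sequence letter in the FASTA text is a DNA letter.

    Header lines (starting with '>') and blank lines are ignored. Input
    without any sequence letters returns False.
    """
    seen_residue = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        for letter in line.upper():
            if letter not in DNA_LETTERS:
                return False  # one non-DNA letter is enough to rule it out
            seen_residue = True
    return seen_residue
```

A caller expecting protein input could raise an error when this returns True, unless the user has opted out of the warning.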
Added support for directly using BLAST database files instead of the Diamond-formatted .dmnd database files. This feature is not yet available through all release channels. It can currently be accessed by downloading the GitHub release version or by compiling from source. Taxonomy features are not yet supported for BLAST databases.
Added the option --seqidlist to filter the database by sequence accession (only supported for BLAST databases).
Fixed a bug that caused the --dbsize option not to function correctly.
Changed the computation of expected values to use the method described in Park, Y., Sheetlin, S., Ma, N. et al. New finite-size correction for local alignment score distributions. BMC Res Notes 5, 286 (2012).
Enabled the use of a custom scoring matrix without having to specify the statistical parameters (option --custom-matrix).
Added support for compositional matrix adjustment as described in Yi-Kuo Yu, Stephen F. Altschul, The construction of amino acid substitution matrices for the comparison of proteins with non-standard compositions, Bioinformatics, Volume 21, Issue 7, 1 April 2005, Pages 902–911. Three additional modes have been added that can be enabled by setting --comp-based-stats (2,3,4). The feature is not enabled by default and does not currently support translated searches.
Fixed a bug that could cause incorrect alignment coordinates, gap counts and sequence identities to be reported by diamond view.
Targets are sorted by bit score instead of e-value in the alignment output when the --top parameter is used.
Disabled support of custom scoring matrices for the DAA format.
Fixed a bug that caused the use of a custom scoring matrix not to function correctly.
Fixed an issue that caused the portable binary not to function on systems that did not support AVX.
Added the option --no-unlink to prevent unlinking of temporary files.