BURST v0.99.5a
New:
- Miscellaneous fixes
- New CUBICLUST clustering engine (disabled by default; must build with -D CUBICLUST to enable). Much higher quality clusters in much less RAM, but very time-consuming.
- New alignment mode: "ANY". This introduces a new guarantee: if any valid match above the desired threshold exists in the database, report any one such match, even if not a best or minimizing match. Slightly faster for contaminant screening.
- Compressive database generation: Now reduces database size when many redundant genomes are present by dynamically adjusting shearing points and deduplicating repeats. This is a lossless abstraction: all information about location and original references is retained, and no alignments are jeopardized. Additionally, this mode is faster than previous sheared DB methods if run without fingerprints/clustering. Enable with: -d DNA -s (shearing is required for this mode).
- New --dbpartition flag: saves memory when using compressive genomics databases. Using N partitions will consume ~1/N the RAM, at the expense of some duplicate detection (the search range is limited to within partitions). Alignment quality is unaffected.
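
To illustrate the --dbpartition trade-off, here is a minimal Python sketch, not BURST's actual implementation. It assumes a hypothetical scheme in which references are split into contiguous partitions and exact duplicates are only detected inside each partition; the function name `stored_after_dedup` is made up for this example. The point is only that duplicates falling in different partitions are both kept, so the database shrinks less as N grows.

```python
def stored_after_dedup(sequences, n_partitions):
    """Count sequences kept when duplicates are only detected within a partition.

    Hypothetical model: contiguous partitions, exact-match deduplication.
    """
    if not sequences:
        return 0
    # Ceiling division gives the size of each contiguous partition.
    size = -(-len(sequences) // n_partitions)
    kept = 0
    for start in range(0, len(sequences), size):
        partition = sequences[start:start + size]
        # The duplicate lookup only spans this partition, so its memory
        # footprint is roughly 1/N of a whole-database lookup.
        kept += len(set(partition))
    return kept


refs = ["ACGT", "TTTT", "ACGT", "TTTT"]
print(stored_after_dedup(refs, 1))  # one partition: every duplicate is visible
print(stored_after_dedup(refs, 2))  # two partitions: cross-partition duplicates are missed
```

In this toy model the retained sequences are still complete in both cases, which matches the note above that only duplicate detection, not alignment quality, is affected.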
Binaries for older systems (pre-AVX) and Macs are available on request, in case they are not posted soon.
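
The guarantee behind the "ANY" alignment mode described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not BURST's algorithm: similarity is plain position-wise identity between equal-length strings (BURST itself performs real alignment), and the names `identity`, `any_match`, and `best_match` are hypothetical. It shows why stopping at the first hit above the threshold can be faster, and why the reported hit need not be the best one.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length strings."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def any_match(query, refs, threshold):
    """Return the first reference at or above the threshold, or None."""
    for name, seq in refs:
        if identity(query, seq) >= threshold:
            return name  # stop early: a valid hit, not necessarily the best
    return None


def best_match(query, refs, threshold):
    """Return the highest-identity reference at or above the threshold, or None."""
    scored = [(identity(query, seq), name) for name, seq in refs]
    score, name = max(scored, default=(0.0, None))
    return name if name is not None and score >= threshold else None


refs = [("r1", "ACGTACGA"), ("r2", "ACGTACGT")]
query = "ACGTACGT"
print(any_match(query, refs, 0.8))   # first acceptable reference in scan order
print(best_match(query, refs, 0.8))  # must score every reference before answering
```

For contaminant screening, where the only question is whether a read matches anything in the database, the early exit is sufficient.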