v0.2.0

@github-actions github-actions released this 23 Feb 15:07
· 83 commits to master since this release

New v2 search functions

The main new feature of this v2 release is search_encoded_patterns, written by @rickbeeloo,
for searching many short, equal-length patterns in parallel (say 23 bp; the length should be
at most 32, or otherwise at most 64). This is 2x or more faster than the previous search_patterns.

(Improved docs and examples for these new features are still TODO. This release was mostly triggered by breaking changes to the cigar output.)

Breaking changes

  • Cigar output now follows the SAM spec, which swaps the meaning of I and D
    compared to before. I now means the pattern contains a character that is not
    in the text; D means the text contains a character that is not in the pattern.
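
For code that stored or parsed cigar strings from earlier releases, the change can be illustrated with a small standalone sketch. It assumes the stringified cigar uses the SAM text form (a count followed by an operation letter, e.g. 3=1I2=); that format is an assumption here, not something these notes confirm. The helpers swap_indels and consumed_lengths are hypothetical names for illustration and are not part of the sassy API.

```rust
/// Hypothetical helper: convert a pre-v0.2.0 cigar string to the new
/// SAM-style convention (or back) by swapping the I and D operations.
/// Counts and all other operations are left untouched.
fn swap_indels(cigar: &str) -> String {
    cigar
        .chars()
        .map(|c| match c {
            'I' => 'D',
            'D' => 'I',
            other => other,
        })
        .collect()
}

/// Hypothetical helper: return (pattern_len, text_len) consumed by a
/// SAM-style cigar string, or None if the string is malformed.
/// Under the SAM convention, I consumes only the pattern (query) and
/// D consumes only the text (reference).
fn consumed_lengths(cigar: &str) -> Option<(usize, usize)> {
    let mut pattern = 0usize;
    let mut text = 0usize;
    let mut count = 0usize;
    let mut have_digits = false;

    for c in cigar.chars() {
        if let Some(d) = c.to_digit(10) {
            // Accumulate the decimal run length preceding an operation.
            count = count.checked_mul(10)?.checked_add(d as usize)?;
            have_digits = true;
            continue;
        }
        if !have_digits {
            return None; // operation letter without a count
        }
        match c {
            'M' | '=' | 'X' => {
                pattern += count;
                text += count;
            }
            'I' => pattern += count,
            'D' => text += count,
            _ => return None, // other SAM operations are not handled in this sketch
        }
        count = 0;
        have_digits = false;
    }

    if have_digits {
        return None; // trailing count without an operation
    }
    Some((pattern, text))
}
```

With these definitions, swap_indels("2=1D") gives "2=1I", which under the new convention consumes three pattern characters and two text characters.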

Further new search functions

v2 also adds new variants of the existing Searcher::search method:

  • search_patterns: search multiple similar-length patterns in a single text.
    Useful for short texts, where splitting the text into chunks via the
    normal search has a large overhead. This uses one SIMD lane per pattern.
  • search_texts: search a single pattern in multiple similar-length texts.
    This uses one SIMD lane per text.
  • search_many: a higher-level wrapper that takes a list of
    patterns and a list of texts and does an all-vs-all search on multiple threads, using
    a user-specified underlying algorithm.

Further changes

  • feat: Support AVX512 for pattern tiling.
  • feat: moved pretty printing from bin/grep.rs to publicly available Match::pretty_print.
  • feat: Match now contains text_idx and pattern_idx for multi-search variants.
  • feat: Add Searcher::only_best_match and Searcher::without_trace.
  • perf: Collect matches into an internal Vec before returning it.
  • fix: sassy grep no longer crashes when printing reverse complement matches with overhang.
  • fix: Resolve duplicate reported matches in overhang.
  • misc: debug-printing a Match now uses stringified cigar.
  • misc: Add Python bindings for search_many.