v0.2.0
New v2 search functions
The main new feature of this v2 release is search_encoded_patterns, written by @rickbeeloo,
for searching many short, equal-length patterns in parallel (say 23 bp; the length should be
at most 32, or else at most 64). This is at least 2x faster than the previous search_patterns.
(Improved docs and examples for these new features are still TODO. This release was mostly triggered by breaking changes to the cigar output.)
Breaking changes
- Cigar output now follows the SAM spec and reverses I and D compared to before.
  I now means the pattern contains a character that is not in the text.
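To make the new convention concrete, here is a minimal standalone sketch of how a SAM-style cigar string consumes pattern and text characters. It does not use the sassy API: `consumed_lengths` and `swap_indels` are hypothetical helpers, and the sketch assumes the stringified cigar always carries an explicit count before each operation (as in `3=1I2=`), which may not match the exact output format.

```rust
/// Returns (pattern_len, text_len) consumed by a SAM-style cigar string.
/// `M`, `=`, `X` consume both; `I` consumes only the pattern (query);
/// `D` consumes only the text (reference). Returns None on malformed input.
fn consumed_lengths(cigar: &str) -> Option<(usize, usize)> {
    let (mut pattern_len, mut text_len) = (0usize, 0usize);
    let mut count = 0usize;
    let mut have_digits = false;
    for c in cigar.chars() {
        if let Some(d) = c.to_digit(10) {
            // Accumulate the decimal run length that precedes each operation.
            count = count.checked_mul(10)?.checked_add(d as usize)?;
            have_digits = true;
            continue;
        }
        if !have_digits {
            return None; // operation without a count
        }
        match c {
            'M' | '=' | 'X' => {
                pattern_len += count;
                text_len += count;
            }
            'I' => pattern_len += count, // extra character in the pattern
            'D' => text_len += count,    // extra character in the text
            _ => return None,            // other operations are not handled here
        }
        count = 0;
        have_digits = false;
    }
    // A trailing count without an operation is malformed.
    if have_digits { None } else { Some((pattern_len, text_len)) }
}

/// Hypothetical migration helper: rewrites a cigar stored with the old
/// meaning of I and D into the new convention by swapping the two letters.
fn swap_indels(cigar: &str) -> String {
    cigar
        .chars()
        .map(|c| match c {
            'I' => 'D',
            'D' => 'I',
            other => other,
        })
        .collect()
}
```

For example, `consumed_lengths("3=1I2=")` gives `Some((6, 5))`: the pattern is one character longer than the matched text, which is what I means under the SAM convention. Cigars saved by an earlier version could be passed through something like `swap_indels` before being compared with new output.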
Further new search functions
v2 also adds new variants of the existing Searcher::search method:
- search_patterns: search multiple similar-length patterns in a single text.
  Useful for searching in short texts where chunking them in pieces via the
  normal search has large overhead. This uses one SIMD lane per pattern.
- search_texts: search a single pattern in multiple similar-length texts.
  This uses one SIMD lane per text.
- search_many: a more high-level wrapper that takes a list of
  patterns and texts and does an all-vs-all search on multiple threads, using
  a user-specified underlying algorithm.
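The all-vs-all shape of search_many can be illustrated with a small standalone sketch. This is not the sassy implementation or its API: `naive_search` is a hypothetical stand-in for the underlying algorithm (it only counts mismatches, with no indels and no SIMD), `all_vs_all` and the `Hit` tuple are names invented for this example, and the split of texts across threads is just one plausible scheme.

```rust
use std::thread;

/// Hypothetical stand-in for a real searcher: start positions where `pattern`
/// matches a window of `text` with at most `k` mismatches (no indels).
fn naive_search(pattern: &[u8], text: &[u8], k: usize) -> Vec<usize> {
    if pattern.is_empty() || pattern.len() > text.len() {
        return Vec::new();
    }
    text.windows(pattern.len())
        .enumerate()
        .filter(|(_, w)| w.iter().zip(pattern).filter(|(a, b)| a != b).count() <= k)
        .map(|(i, _)| i)
        .collect()
}

/// One reported hit: (pattern_idx, text_idx, start position in the text).
type Hit = (usize, usize, usize);

/// Searches every pattern in every text, spreading the texts over `threads`
/// scoped worker threads (text i goes to thread i % threads).
fn all_vs_all(patterns: &[&[u8]], texts: &[&[u8]], k: usize, threads: usize) -> Vec<Hit> {
    let threads = threads.max(1);
    let mut hits: Vec<Hit> = thread::scope(|s| {
        let handles: Vec<_> = (0..threads)
            .map(|t| {
                s.spawn(move || {
                    let mut local = Vec::new();
                    for (ti, text) in texts.iter().enumerate().skip(t).step_by(threads) {
                        for (pi, pattern) in patterns.iter().enumerate() {
                            for pos in naive_search(pattern, text, k) {
                                local.push((pi, ti, pos));
                            }
                        }
                    }
                    local
                })
            })
            .collect();
        // Merge the per-thread results once all workers have finished.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    });
    // Sort so the output does not depend on how work was split across threads.
    hits.sort_unstable();
    hits
}
```

Each hit records which pattern and which text it came from, loosely mirroring the pattern_idx and text_idx fields that Match gains for the multi-search variants. Collecting into a per-thread Vec and merging at the end avoids any locking between workers.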
Further changes
- feat: Support AVX512 for pattern tiling.
- feat: moved pretty printing from bin/grep.rs to publicly available Match::pretty_print.
- feat: Match now contains text_idx and pattern_idx for multi-search variants.
- feat: Add Searcher::only_best_match and Searcher::without_trace.
- perf: Collect matches into an internal Vec before returning that.
- fix: sassy grep would crash on printing reverse complement matches with overhang.
- fix: fix issues with duplicate reported matches in overhang
- misc: debug-printing a Match now uses stringified cigar.
- misc: Add python bindings for search_many