MHAP v1.5b1

@skoren skoren released this 23 Feb 18:25
· 9 commits to v1.5b1 since this release

Major updates:

  • Eliminated repetitive k-mer filtering in index lookup: why filter k-mers when you can down-weight them?
  • Increased performance of the ordered k-mer second-stage filter.

Changelog:

  • Implemented weighted (discretized tf-idf) MinHashing in first-stage filter.
  • Random subsampling in second-stage filter.
  • k-mer size is now unlimited.
  • Reduced the memory and disk footprint of the binary sketch representation, allowing a larger set of sequences to fit in memory.
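
To illustrate the idea behind weighted MinHashing with discretized tf-idf weights, here is a minimal Python sketch. It is not MHAP's implementation (MHAP is written in Java), and the function names, the weight scaling in `discrete_weight`, and the `max_weight` parameter are all assumptions made for this example. The principle it shows: a k-mer with integer weight w is hashed w times per hash function, so informative k-mers are more likely to supply the minimum, while repeat k-mers stay in the sketch with a low weight instead of being filtered out.

```python
import hashlib
import math
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def discrete_weight(tf, df, num_reads, max_weight=3):
    """Map a tf-idf score to an integer in [1, max_weight].

    Hypothetical scaling chosen for illustration: k-mers seen in every
    read get weight 1, k-mers seen in a single read get max_weight.
    """
    if num_reads <= 1:
        return 1
    idf = math.log(num_reads / df)
    score = min(1.0, tf * idf / math.log(num_reads))
    return 1 + int((max_weight - 1) * score)


def hash64(kmer, copy, seed):
    """64-bit hash of one (k-mer, copy index) pair under a given seed."""
    data = f"{seed}:{copy}:{kmer}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def weighted_minhash(seq, k, doc_freq, num_reads, num_hashes=64, max_weight=3):
    """Build a sketch in which a k-mer of weight w is hashed w times per seed."""
    counts = Counter(kmers(seq, k))
    weights = {
        km: discrete_weight(tf, doc_freq.get(km, 1), num_reads, max_weight)
        for km, tf in counts.items()
    }
    return [
        min(hash64(km, c, seed) for km, w in weights.items() for c in range(w))
        for seed in range(num_hashes)
    ]


def similarity(a, b):
    """Fraction of sketch positions on which two sketches agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

Here `doc_freq` is a plain dictionary mapping each k-mer to the number of reads that contain it; k-mers missing from it are treated as appearing in one read. The sequence must be at least `k` bases long, otherwise there is nothing to hash.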

Known Issues:

  • If no repeat k-mer filter is specified, MHAP will use an experimental implementation of a count-min sketch to identify repeat k-mers and down-weight them. This option has not been fully tested and may not always work. Users should always specify a filter file using the -f option.
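
For readers unfamiliar with the data structure, the following is a small Python sketch of how a count-min sketch can flag repeat k-mers. It is a generic illustration, not MHAP's code: the class, the `repeat_kmers` helper, the table dimensions, and the `threshold` parameter are assumptions made for this example. A count-min sketch can overestimate a count because of hash collisions but never underestimates it, so it may occasionally flag a non-repeat k-mer as a repeat.

```python
import hashlib


class CountMinSketch:
    """Approximate counter using depth rows of width counters each."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One independent hash per row, derived by salting with the row number.
        data = f"{row}:{item}".encode()
        digest = hashlib.blake2b(data, digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # The minimum over rows is the estimate least inflated by collisions.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


def repeat_kmers(reads, k, threshold):
    """Return the k-mers whose estimated count is at least threshold."""
    cms = CountMinSketch()
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            cms.add(kmer)
            seen.add(kmer)
    return {kmer for kmer in seen if cms.estimate(kmer) >= threshold}
```

The `seen` set is kept here only to make the example easy to inspect; a memory-conscious version would query the sketch during a second pass over the reads instead of storing every k-mer.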

Please see documentation at http://mhap.readthedocs.org/en/