Releases: marbl/MHAP

MHAP 2.1.3

12 Mar 17:21

This is a minor update:

  • Added the ability to store the full sequence ID in binary format (previously this was supported only for FASTA inputs).
  • Improved the allocation strategy for repeat filtering so that large sets of filtered k-mers no longer cause long startup times.

MHAP 2.1.1

04 Oct 21:45

Fixes for incorrectly processed tf-idf flags.

MHAP 2.1

13 Jul 18:09

Changelog:

  • Added a repeat aggression flag (--repeat-weight) that controls how sharply the weighting transitions from no suppression to maximum suppression.
  • Added an option to also suppress rare k-mers (--supress-noise), defined as k-mers not listed in the k-mer filter file (-f).
  • Various bug fixes.

MHAP 2.0

02 Mar 19:26

Changelog:

  • Up to 10x speedup (5x on average).
  • Second-stage filter is now a bottom-sketch rather than random sampling, improving memory usage and speed.
  • Distance (1 - identity) is now derived from the Jaccard score using the Mash distance.
  • Complete switch to fastutil collections API for speed/memory improvements.
  • Maven build system to consolidate into single jar and remove lib directory dependency.
  • Code cleanup.
  • Bug fixes.
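
The bottom-sketch and Mash distance items above can be illustrated with a minimal Python sketch. This is not MHAP's Java implementation: the function names, the BLAKE2b hash, and the parameters `k` (k-mer size) and `s` (sketch size) are assumptions chosen for illustration. The distance uses the published Mash formula, D = -ln(2j / (1 + j)) / k, where j is the Jaccard estimate.

```python
import hashlib
import math


def kmers(seq, k):
    """Return the set of distinct k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Deterministic 64-bit hash of a k-mer (hash choice is an assumption)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_sketch(seq, k, s):
    """Keep the s smallest distinct hash values over all k-mers of seq."""
    return sorted({hash64(km) for km in kmers(seq, k)})[:s]


def jaccard_estimate(sketch_a, sketch_b, s):
    """Estimate Jaccard similarity from two bottom sketches.

    Take the s smallest hashes of the merged sketches and count how many
    of them occur in both inputs.
    """
    set_a, set_b = set(sketch_a), set(sketch_b)
    merged = sorted(set_a | set_b)[:s]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def mash_distance(j, k):
    """Convert a Jaccard estimate j into a Mash distance, clamped to [0, 1]."""
    if j <= 0.0:
        return 1.0
    return max(0.0, min(1.0, -math.log(2.0 * j / (1.0 + j)) / k))
```

For example, `mash_distance(jaccard_estimate(bottom_sketch(a, 16, 1000), bottom_sketch(b, 16, 1000), 1000), 16)` gives a distance estimate for two reads `a` and `b`. Unlike random sampling, both sketches keep the same deterministic subset of the hash space, so they can be compared directly.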

MHAP 1.6

25 May 14:19

Changelog:

  • Improved weighting (discretized tf-idf) in first-stage filter.
  • Code cleanup, leading to a 15% speedup.
  • Support for bzip2/gzip input files
  • Bug fixes

MHAP v1.5b1

23 Feb 18:25

Major updates:

  • Repetitive k-mers are no longer filtered out during index lookup; they are down-weighted instead.
  • Increased performance of the ordered k-mer second-stage filter.

Changelog:

  • Implemented weighted (discretized tf-idf) MinHashing in first-stage filter.
  • Random subsampling in second-stage filter.
  • k-mer size is now unlimited.
  • Reduced memory footprint and disk footprint of binary sketch representation, allowing a larger set of sequences to fit in memory.
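
As a rough illustration of weighted MinHashing with discretized weights, the Python sketch below gives each k-mer an integer weight and hashes it that many times, so that rare k-mers influence the sketch more than repeats. The exact weighting used by MHAP is not stated here; the tf-idf variant, the `max_weight` cap, and all function names are assumptions made for this example.

```python
import hashlib
import math
from collections import Counter


def seeded_hash(text, seed):
    """64-bit hash of text under one of several seeded hash functions."""
    key = seed.to_bytes(8, "big")
    digest = hashlib.blake2b(text.encode(), digest_size=8, key=key).digest()
    return int.from_bytes(digest, "big")


def discrete_weights(seq, k, doc_freq, n_reads, max_weight=3):
    """Assign each k-mer an integer weight in 1..max_weight.

    tf is the k-mer count within seq; idf shrinks as the k-mer appears in
    more reads (doc_freq), so repeat k-mers end up with weight 1.
    This particular formula is a hypothetical choice.
    """
    tf = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    weights = {}
    for kmer, count in tf.items():
        idf = math.log(1.0 + n_reads / (1.0 + doc_freq.get(kmer, 0)))
        weights[kmer] = max(1, min(max_weight, round(count * idf)))
    return weights


def weighted_minhash(weights, num_hashes):
    """For each hash function, take the minimum over all (k-mer, copy) pairs.

    A k-mer with weight w contributes w distinct items, which makes it
    proportionally more likely to supply the minimum.
    """
    sketch = []
    for seed in range(num_hashes):
        values = (
            seeded_hash(f"{kmer}#{copy}", seed)
            for kmer, w in weights.items()
            for copy in range(w)
        )
        sketch.append(min(values, default=2**64))
    return sketch


def sketch_similarity(sketch_a, sketch_b):
    """Fraction of positions where two equal-length sketches agree."""
    matches = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return matches / len(sketch_a)
```

With this scheme no k-mer is discarded outright, which matches the down-weighting idea described in the major updates: a repeat still takes part in the sketch, just with less influence.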

Known Issues:

  • If no repeat k-mer filter is specified, MHAP will use an experimental implementation of a count-min sketch to identify repeat k-mers and down-weight them. This option has not been fully tested and may not always work. Users should always specify a filter file using the -f option.
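
For readers unfamiliar with the data structure, a count-min sketch can be outlined in a few lines of Python. This is a generic textbook version, not MHAP's experimental implementation; the class name, the `width` and `depth` parameters, and the `frequent_kmers` helper are hypothetical.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: estimates never undercount, but may overcount."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        """Column for item in the given row, using a row-specific salt."""
        salt = row.to_bytes(8, "big")
        digest = hashlib.blake2b(item.encode(), digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        """Increment one counter per row for this item."""
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        """The minimum over the rows is the tightest upper bound on the count."""
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))


def frequent_kmers(sketch, candidates, threshold):
    """Return the candidate k-mers whose estimated count reaches threshold."""
    return {km for km in candidates if sketch.estimate(km) >= threshold}
```

Because hash collisions can only inflate counts, such a sketch may flag some non-repeat k-mers as repeats. An explicit filter file supplied with -f avoids that source of error.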

Please see documentation at http://mhap.readthedocs.org/en/

MHAP v1.0

23 Feb 17:54

Minor update. Changelog:

  • Fixed issues #3 and #4.
  • Updated overlap validation in EstimateROC.
  • Minor speed improvements.

Please see documentation at http://mhap.readthedocs.org/en/

MHAP v0.1

13 Jul 23:52
Pre-release

First release of MHAP. Please see documentation at http://mhap.readthedocs.org/en/