Features

  • Triangular matrix - A new triangle command computes a lower triangular
    distance matrix in relaxed Phylip format. This streamlines all-pairs distance
    commands and avoids computational redundancy.
  • Custom IDs - The ID and Comment fields of a sketch can now be set with
    -I and -C. For multi-sketch files, this applies only to the first sketch.
  • Read pooling - If multiple input files are given in read mode (-r), e.g.
    paired ends as in mash sketch -r read1.fq read2.fq, they are now pooled
    into a single sketch, avoiding the need for concatenation.
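
To illustrate why a lower triangular matrix avoids redundant work, here is a minimal Python sketch, not Mash's own code. The function name `lower_triangle_phylip`, the tab separator, and the number formatting are assumptions made for illustration. The layout (a count line, then one row per name holding distances to the names before it) is the usual convention for lower triangular Phylip matrices.

```python
def lower_triangle_phylip(names, dist):
    """Format all-pairs distances as a lower triangular matrix.

    `dist` is any symmetric function of two names (hypothetical here).
    Each unordered pair is evaluated exactly once, so n names need
    n * (n - 1) / 2 comparisons instead of n * n.
    """
    lines = [str(len(names))]  # first line: number of entries
    for i, name in enumerate(names):
        # Row i holds distances to entries 0 .. i-1 only.
        row = [name] + [f"{dist(name, names[j]):.6g}" for j in range(i)]
        lines.append("\t".join(row))
    return "\n".join(lines)
```

Calling it with three names produces a count line followed by rows of zero, one, and two distances, respectively.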

@ondovb ondovb released this Sep 23, 2017 · 34 commits to master since this release

Mash's first major version increment focuses on a new top-level command, screen, which estimates containment within (rather than distance to) a read set for many sketches simultaneously.

Features

  • Screen - A new command that estimates how well sketches are contained
    within a set of reads.
  • Hash seed parameter - The seed of the hash function can now be set with
    -S. Note that if the seed is changed from the default (42), the resulting
    sketches will not work with older versions of Mash: those versions will
    see the file as containing no sketches, causing an error or empty output.
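
The idea of containment (as opposed to distance) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the screen implementation: it hashes with MD5 from the standard library as a stand-in for Mash's hash function, ignores reverse complements, and all function names are hypothetical.

```python
import hashlib


def kmers(seq, k):
    """Return the set of k-mers in a sequence (empty if len(seq) < k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_hash(kmer):
    """Stand-in 64-bit hash; Mash itself uses a different hash function."""
    return int.from_bytes(hashlib.md5(kmer.encode()).digest()[:8], "big")


def bottom_sketch(seq, k, size):
    """Keep the `size` smallest k-mer hashes of a reference sequence."""
    return set(sorted(kmer_hash(km) for km in kmers(seq, k))[:size])


def containment(sketch, reads, k):
    """Fraction of sketch hashes that also occur among the read k-mers."""
    if not sketch:
        return 0.0
    read_hashes = {kmer_hash(km) for read in reads for km in kmers(read, k)}
    return len(sketch & read_hashes) / len(sketch)
```

A reference whose k-mers all appear in the reads scores 1.0 regardless of what else the read set contains, which is why containment suits mixed samples better than a symmetric distance.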

Fixes

  • Fix for large sketch files (Issue #16)
  • Fix for multi-threaded sketching (Issue #41)

@ondovb ondovb released this Aug 24, 2016 · 117 commits to master since this release

Features

  • JSON sketch dumps - Sketches can now be converted to text in JSON format
    for interoperability with other tools. Metadata, such as k-mer and hash function
    information, are included with the hashes themselves, which are represented as
    unsigned integers.

Fixes

  • Fix for stdin sketch input (Issue #32)

@ondovb ondovb released this Apr 16, 2016 · 133 commits to master since this release

Features

  • Read sketching
    • Minimum k-mer copy number (-m) for more precise and flexible filtering
      than the Bloom filter.
    • Genome size and coverage estimation for improved p-values and optional
      termination at sufficient coverage (-c).
  • Parallelism
    • Parallel sketching (-p), if more than one sketch is being created.
    • Parallel distance for all comparisons, not just multiple files.
  • Alphabets
    • Amino acid (-a) and arbitrary alphabets (-z).
    • Case sensitivity option (-Z), which allows lowercase masking.
  • Information
    • A new bounds command for printing expected accuracy for various
      parameters and distances.
    • K-mer copy number histogram available from mash info (-c) for
      sketches made with this version.
    • Tabular mode (-t) and more header information (-H) for mash info.
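
As a rough illustration of minimum copy number filtering for reads, the Python sketch below counts k-mers exactly and keeps those seen at least a given number of times. It is a simplified stand-in, not Mash's implementation; the function name and the exact-counting approach are assumptions.

```python
from collections import Counter


def filter_kmers(reads, k, min_copies):
    """Return k-mers that occur at least `min_copies` times across reads.

    Low-copy k-mers in read data are often sequencing errors, so raising
    the threshold trades sensitivity for precision.
    """
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, n in counts.items() if n >= min_copies}
```

With reads "ACGT" and "ACGA" and k = 3, a threshold of 2 keeps only "ACG", since the other k-mers each appear once.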

Fixes

  • Large sketch file crash (Issue #16)
  • Large pairwise comparison crash (Issue #18)

@ondovb ondovb released this Nov 4, 2015 · 194 commits to master since this release

  • Fix for buffer overflow that causes a "stack smashing" error with recent GCC versions (Issue #15)