Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.
sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
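For readers new to MinHash, the following is a minimal pure-Python sketch of the general idea: hash every k-mer of a sequence, keep the smallest hashes as the signature, and estimate Jaccard similarity from two such signatures. It is an illustration only and not sourmash's implementation. The function names (`kmers`, `hash_kmer`, `minhash_sketch`, `jaccard_estimate`), the use of SHA-1 as the hash, and the parameter defaults are all assumptions made for this example, and reverse-complement handling for DNA is left out.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping substring of length k from seq."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer.

    Assumption for illustration: SHA-1 truncated to 8 bytes.
    This is not the hash function sourmash uses.
    """
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_sketch(seq, k=21, num=500):
    """Return the `num` smallest distinct k-mer hashes, sorted ascending."""
    hashes = {hash_kmer(km) for km in kmers(seq, k)}
    return sorted(hashes)[:num]


def jaccard_estimate(sketch_a, sketch_b, num=500):
    """Estimate Jaccard similarity from two bottom-`num` sketches.

    Take the `num` smallest hashes of the union, then count how many
    of those appear in both sketches.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:num]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


# Toy sequences, made up for this example.
seq_a = "ACGTACGTTAGCCGATAGCTAGCTAGGATCCGATCGATCGTAGCTAGCTAGCTAGGCTA"
seq_b = seq_a[:40] + "TTGACCATGGTACCAATGC"

sig_a = minhash_sketch(seq_a, k=5, num=20)
sig_b = minhash_sketch(seq_b, k=5, num=20)

print(jaccard_estimate(sig_a, sig_a, num=20))  # identical sketches give 1.0
print(jaccard_estimate(sig_a, sig_b, num=20))  # some value between 0 and 1
```

Keeping only a fixed number of the smallest hashes is what makes signatures small enough to compare quickly, at the cost of the similarity being an estimate rather than an exact value.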
Sourmash 1.0 is published on JOSS; please cite that paper if you use sourmash (
We recommend using bioconda to install sourmash:
conda install sourmash
This will install the latest stable version of sourmash 2.
You can also use pip to install sourmash:
pip install sourmash
A quickstart tutorial is available.
sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a C++ development environment and the CPython development headers and libraries (for the C++ extension).
The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without them. For gather you also need khmer version 2.1+.
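One way to see which of these packages are present before running sourmash is a small check like the one below. This is a hedged sketch, not part of sourmash: the helper name `missing_modules` and the grouping into `REQUIRED` and `OPTIONAL` are assumptions for this example, the module names are taken from the requirements listed above, and it does not check versions (for example khmer 2.1+). It relies on `importlib.util.find_spec`, so it needs Python 3.

```python
import importlib.util

# Module names taken from the requirements described above.
REQUIRED = ["screed", "ijson"]
OPTIONAL = {
    "numpy": "sourmash compare",
    "matplotlib": "plotting",
    "scipy": "plotting",
    "khmer": "gather",
}


def missing_modules(names):
    """Return the names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


for name in missing_modules(REQUIRED):
    print("missing base requirement:", name)

for name in missing_modules(OPTIONAL):
    print("missing optional module:", name, "(needed for", OPTIONAL[name] + ")")
```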
Installation with conda
Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:
$ conda create -n sourmash_env sourmash python=3.6.4
$ source activate sourmash_env
$ sourmash compute -h
which will install the latest alpha release.
Please ask questions and file issues on GitHub.
Development happens on GitHub at dib-lab/sourmash.
sourmash is the main command-line entry point; run it with python -m sourmash, or do pip install -e /path/to/repo to do a developer install in a virtual environment.
The sourmash/ directory contains the library code.
Tests require py.test and can be run with
Please see the developer notes for more information.
CTB Dec 2018