Compute and compare MinHash signatures for DNA data sets.
luizirber 2.0.0a9
- avoid calling node.data (#516)
  This makes `sourmash gather` ~40% faster...
- make sourmash compatible with khmer 3 (#508)
  Even though khmer 3 is not released yet, there is an alpha version on bioconda
- manylinux1 wheels and travis build improvements (#507)
- PyPI fixes (#504)
Latest commit 988459c Jul 23, 2018


sourmash


Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.

Usage:

sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
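
To give a feel for what the compute and compare steps do conceptually, here is a minimal pure-Python sketch of a bottom-k MinHash signature and a Jaccard estimate. It is an illustration only and not sourmash's implementation: the function names, the SHA-1 based hash, and the default k and num values are all assumptions chosen for this example, and it requires Python 3.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative hash, an assumption)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_sketch(seq, k=21, num=500):
    """Return the `num` smallest distinct k-mer hashes (a bottom-k sketch)."""
    hashes = {hash_kmer(km) for km in kmers(seq.upper(), k)}
    return sorted(hashes)[:num]


def jaccard_estimate(sketch_a, sketch_b, num=500):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    # Bottom-k sketch of the union of both hash sets.
    merged = sorted(set(sketch_a) | set(sketch_b))[:num]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    # Fraction of the union sketch that appears in both inputs.
    return sum(1 for h in merged if h in shared) / len(merged)
```

Comparing a sequence's sketch against itself gives 1.0, and two sequences with no k-mers in common give 0.0. A real signature file stores more than the hashes, so treat this only as a model of the idea.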

We have demo notebooks on Binder that you can interact with.

Sourmash is published on JOSS.


The name is a riff off of Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).

sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.

Installation

We currently recommend installing the 2.0 pre-release series. You can use pip to do that like so:

pip install --pre sourmash

sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a C++ development environment and the CPython development headers and libraries (for the C++ extension).

The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without these.

For search and gather you also need khmer version 2.1+.
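
If you are unsure which of these optional pieces are present in your environment, a small check like the following may help. It is a sketch, not part of sourmash: the grouping of modules by feature follows the requirements listed above, the helper name missing_modules is made up for this example, and it only tests whether a module can be found, not its version (so it will not verify khmer 2.1+).

```python
import importlib.util

# Module names taken from the requirements listed above.
REQUIREMENTS = {
    "base": ["screed", "ijson"],
    "compare": ["numpy"],
    "plot": ["matplotlib", "scipy"],
    "search/gather": ["khmer"],
}


def missing_modules(requirements=REQUIREMENTS):
    """Return {feature: [modules that cannot be found]} for incomplete features."""
    missing = {}
    for feature, modules in requirements.items():
        absent = [m for m in modules if importlib.util.find_spec(m) is None]
        if absent:
            missing[feature] = absent
    return missing


if __name__ == "__main__":
    for feature, modules in missing_modules().items():
        print("{}: missing {}".format(feature, ", ".join(modules)))
```

An empty result means every listed module was found in the current Python environment.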

Installation with conda

Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:

$ conda create -n sourmash_env sourmash python=3.6.4
$ source activate sourmash_env
$ sourmash compute -h

which will install the latest alpha release.

Support

Please ask questions and file issues on GitHub.

Development

Development happens on GitHub at dib-lab/sourmash.

After installation, sourmash is the main command-line entry point. You can also run it with python -m sourmash, or do a developer install in a virtual environment with pip install -e /path/to/repo.

The sourmash/ directory contains the library code.

Tests require py.test and can be run with make test.

Please see the developer notes for more information.


CTB June 2018