Compute and compare MinHash signatures for DNA data sets.
luizirber (build) pin virtualenv version for asv, and also run GH actions on rust version tags (#903)

* pin virtualenv version for asv
* also build rust release tags
Latest commit 4966055 Feb 13, 2020

Files

Type Name Latest commit message Commit time
.github (build) pin virtualenv version for asv, and also run GH actions on ru… Feb 13, 2020
.travis Bring in rust codebase from luizirber/sourmash-rust (#593) Jan 3, 2019
benchmarks Compute oxidation (#845) Jan 21, 2020
binder [MRG] update documentation & add notebooks to docs (#631) Mar 7, 2019
data Update documentation with links to new SBTs and a tutorial (#216) May 16, 2017
doc more refactoring of MinHash API (#889) Feb 9, 2020
include more refactoring of MinHash API (#889) Feb 9, 2020
sourmash Fix `lca classify` bug with -o (#902) Feb 11, 2020
sourmash_lib [MRG] Rename library to sourmash (#374) Mar 10, 2018
src/core more refactoring of MinHash API (#889) Feb 9, 2020
tests Fix `lca classify` bug with -o (#902) Feb 11, 2020
utils LGTM setup and initial fixes (#736) Sep 28, 2019
.coveragerc Replacing C++ with Rust (#424) Dec 17, 2019
.git_archival.txt [MRG] use setuptools_scm for version management (#471) Dec 15, 2018
.gitattributes Ignore .sig and .sbt files for repo language stats (#846) Jan 15, 2020
.gitignore more refactoring of MinHash API (#889) Feb 9, 2020
.readthedocs.yml Fix read the docs build (#820) Jan 4, 2020
.travis.yml Fix travis conditions and pip usage (#873) Jan 27, 2020
CITATION.cff add citation output (#617) Jan 11, 2019
CODE_OF_CONDUCT.rst PR template & Coc Jun 8, 2016
CONTRIBUTING.md contributing Jun 10, 2016
Cargo.toml Make the sourmash crate library-only (#812) Jan 5, 2020
LICENSE remove MSU from license Jun 29, 2016
MANIFEST.in Replacing C++ with Rust (#424) Dec 17, 2019
Makefile Fix travis conditions and pip usage (#873) Jan 27, 2020
README.md Make the sourmash crate library-only (#812) Jan 5, 2020
asv.conf.json Fix asv benchmarks (#509) Oct 21, 2019
codemeta.json rename paper.json to codemeta.json Jun 7, 2016
matplotlibrc tests for fig Jun 3, 2016
netlify.toml Replacing C++ with Rust (#424) Dec 17, 2019
paper.bib Hyperlink DOIs to preferred resolver (#562) Oct 27, 2018
paper.md update for v1.0 Sep 13, 2016
pytest.ini [MRG] Rename library to sourmash (#374) Mar 10, 2018
requirements.txt Indexing in Rust (#773) Dec 16, 2019
setup.py more refactoring of MinHash API (#889) Feb 9, 2020
tox.ini Use Python dev mode on 3.7, and build PRs that merge against any bran… Jan 25, 2020

README.md

sourmash

Documentation Build Status PyPI codecov DOI License: 3-Clause BSD


Compute MinHash signatures for nucleotide (DNA/RNA) and protein sequences.

Usage:

sourmash compute *.fq.gz
sourmash compare *.sig -o distances
sourmash plot distances
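
To make the idea behind these commands concrete, here is a minimal pure-Python sketch of a MinHash signature and a Jaccard similarity estimate. It is not sourmash's implementation: the SHA-1 based hash, the function names, and the default `ksize` and `num` values are assumptions chosen for illustration, and the library's actual hashing and k-mer handling differ.

```python
import hashlib


def kmers(sequence, ksize):
    """Yield every overlapping k-mer of length ksize."""
    sequence = sequence.upper()
    for i in range(len(sequence) - ksize + 1):
        yield sequence[i:i + ksize]


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer (illustrative hash choice)."""
    digest = hashlib.sha1(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(sequence, ksize=21, num=500):
    """Keep the num smallest distinct k-mer hashes (a bottom-k sketch)."""
    hashes = {hash_kmer(k) for k in kmers(sequence, ksize)}
    return sorted(hashes)[:num]


def estimate_jaccard(sig_a, sig_b, num=500):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    # Take the num smallest hashes of the union, then count how many
    # of them occur in both signatures.
    merged = sorted(set(sig_a) | set(sig_b))[:num]
    if not merged:
        return 0.0
    shared = set(sig_a) & set(sig_b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

Two signatures built from the same sequence give an estimate of 1.0, and signatures that share no k-mers give 0.0. The signature is much smaller than the full k-mer set, which is what makes comparing many data sets cheap.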

sourmash 1.0 is published in JOSS; please cite that paper if you use sourmash (doi: 10.21105/joss.00027).


The name is a riff on Mash, combined with @ctb's love of whiskey. (Sour mash is used in making whiskey.)

Primary authors: C. Titus Brown (@ctb) and Luiz C. Irber, Jr (@luizirber).

sourmash is a product of the Lab for Data-Intensive Biology at the UC Davis School of Veterinary Medicine.

Installation

We recommend using bioconda to install sourmash:

conda install -c conda-forge -c bioconda sourmash

This will install the latest stable version of sourmash 3.

You can also use pip to install sourmash:

pip install sourmash

A quickstart tutorial is available.

Requirements

sourmash runs under both Python 2.7.x and Python 3.5+. The base requirements are screed and ijson, together with a Rust environment (for the extension code). We suggest using rustup to install the Rust environment:

curl https://sh.rustup.rs -sSf | sh

The comparison code (sourmash compare) uses numpy, and the plotting code uses matplotlib and scipy, but most of the code is usable without these.

For search and gather you also need khmer version 2.1+.
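
Because these extra packages are only needed by some subcommands, it can help to check which ones are importable before running a workflow. The sketch below is illustrative only: the `OPTIONAL_MODULES` mapping simply restates the requirements listed in this section, `missing_modules` is a hypothetical helper and not part of sourmash, and it does not check version constraints such as khmer 2.1+.

```python
import importlib.util

# Feature -> modules, as listed in the Requirements text (assumption:
# the import names match the package names).
OPTIONAL_MODULES = {
    "compare": ["numpy"],
    "plot": ["matplotlib", "scipy"],
    "search": ["khmer"],
    "gather": ["khmer"],
}


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def report_missing():
    """Print the missing modules, if any, for each feature."""
    for feature, names in OPTIONAL_MODULES.items():
        missing = missing_modules(names)
        if missing:
            print("{}: missing {}".format(feature, ", ".join(missing)))
```

Calling `report_missing()` prints one line per feature whose modules are not installed, and prints nothing when everything is available.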

Installation with conda

Bioconda is a channel for the conda package manager with a focus on bioinformatics software. After installing conda, you will need to add the bioconda channel as well as the other channels bioconda depends on. Once you have set up bioconda, you can install sourmash by running:

$ conda create -n sourmash_env -c conda-forge -c bioconda sourmash python=3.7
$ source activate sourmash_env
$ sourmash compute -h

which will install the latest alpha release.

Support

Please ask questions and file issues on GitHub.

Development

Development happens on GitHub at dib-lab/sourmash.

After installation, sourmash is the main command-line entry point. You can also run it with python -m sourmash, or do a developer install in a virtual environment with pip install -e /path/to/repo.
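
When scripting sourmash from Python, the python -m sourmash form can be wrapped in a small helper. This is a sketch under stated assumptions, not part of sourmash: `sourmash_command` and `run_sourmash` are hypothetical names, the code needs Python 3.5+, and sourmash must be installed for the interpreter that runs the script.

```python
import subprocess
import sys


def sourmash_command(*args):
    """Build the argument list for `python -m sourmash <args>`."""
    # sys.executable keeps the call inside the current (virtual) environment.
    return [sys.executable, "-m", "sourmash"] + list(args)


def run_sourmash(*args):
    """Run sourmash and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(sourmash_command(*args), check=True)
```

For example, `run_sourmash("compute", "-h")` runs the same help command shown in the conda instructions, using whichever interpreter is currently active.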

The sourmash/ directory contains the Python library and command-line interface code.

The src/core/ directory contains the Rust library implementing core functionality.

Tests require py.test and can be run with make test.

Please see the developer notes for more information.


CTB Jan 2020
