Default reference data files for use with QIIME.

qiime-default-reference

qiime-default-reference, canonically pronounced chime default reference, is a Python package containing default reference data files for use with QIIME. The current default reference data is compiled from the Greengenes 16S rRNA database version 13_8. Please see the Attribution section below for more details.

Installation

To install qiime-default-reference:

pip install qiime-default-reference

Running the tests

To run qiime-default-reference's unit tests:

nosetests --with-doctest qiime_default_reference

Attribution

The reference data distributed in this Python package were copied from the Greengenes 16S rRNA database:

An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea. McDonald D, Price MN, Goodrich J, Nawrocki EP, DeSantis TZ, Probst A, Andersen GL, Knight R, Hugenholtz P. ISME J. 2012 Mar;6(3):610-8. doi: 10.1038/ismej.2011.139.

The Greengenes reference data is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

The default template alignment column mask (i.e., "Lane mask") is derived from:

Lane,D.J. (1991) 16S/23S rRNA sequencing. In Stackebrandt,E. and Goodfellow,M. (eds), Nucleic Acid Techniques in Bacterial Systematics. John Wiley and Sons, New York, pp. 115-175.

The Lane mask was originally available in ARB.

The Lane mask was taken from here.

Usage

qiime-default-reference provides access to the Greengenes 97% OTUs and other key Greengenes files from within Python. After installing the package, you can do the following:

import qiime_default_reference

# Get the absolute filepath to the reference sequences in fasta format.
qiime_default_reference.get_reference_sequences()

# Get the absolute filepath to the PyNAST template alignment in fasta format.
qiime_default_reference.get_template_alignment()

# Get the absolute filepath to the reference phylogenetic tree in newick format.
qiime_default_reference.get_reference_tree()

# Get the absolute filepath to the reference taxonomy in tab-separated text format.
qiime_default_reference.get_reference_taxonomy()

# Get the alignment column mask (currently the Lane mask) as 1s and 0s.
# This will be str/bytes in Python 2 (they're the same), and bytes in Python 3.
qiime_default_reference.get_template_alignment_column_mask()
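
Because the column mask is returned as bytes in Python 3, it usually needs decoding before it can be compared against an aligned sequence. The following is a minimal sketch of one way to do that, not part of this package: the helper names `decode_mask` and `apply_column_mask` are hypothetical, and the sketch assumes that a `1` marks a column to keep, that a `0` marks a column to drop, and that any surrounding whitespace in the mask can be ignored.

```python
def decode_mask(mask):
    """Return the mask as text, accepting either bytes or str input."""
    if isinstance(mask, bytes):
        mask = mask.decode("ascii")
    # Assumption: leading/trailing whitespace (e.g. a newline) is not part of the mask.
    return mask.strip()


def apply_column_mask(aligned_sequence, mask):
    """Keep only the alignment columns whose mask character is '1'.

    Assumption: '1' means keep the column and '0' means drop it.
    """
    mask = decode_mask(mask)
    if len(aligned_sequence) != len(mask):
        raise ValueError(
            "aligned sequence has %d columns but mask has %d"
            % (len(aligned_sequence), len(mask))
        )
    return "".join(
        base for base, keep in zip(aligned_sequence, mask) if keep == "1"
    )
```

With the package installed, the mask could then be passed in directly, for example `apply_column_mask(seq, qiime_default_reference.get_template_alignment_column_mask())`, where `seq` is one aligned sequence read from the template alignment file. The length check is there to catch a sequence that was not aligned against the same template as the mask.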

Getting Help

Please post your questions about this repository/package on the QIIME Forum.