



ghost-tree is a bioinformatics tool that combines sequence data from two genetic marker databases into one phylogenetic tree that can be used for diversity analyses. One database is used as a "foundation tree" because it provides better phylogeny across all phyla, and the other database provides finer taxonomic resolution.

For application to ITS, you don't need to install ghost-tree but can use our pre-built trees. The tree that you download needs to match the UNITE database that you used (or plan to use) for your ITS analyses. The trees directory contains pre-built reference phylogenetic trees for the UNITE QIIME reference files available here.

The most recent ghost-trees we've created are in our tree repository.

If you use ghost-tree in published research, please cite our software publication in Microbiome. Thank you!

J. Fouquier, J.R. Rideout, E. Bolyen, J. Chase, A. Shiffer, D. McDonald, R. Knight, J.G. Caporaso, and S.T. Kelley. ghost-tree: creating hybrid-gene phylogenetic trees for diversity analysis. Microbiome. (February 2016) DOI: 10.1186/s40168-016-0153-6

Using ghost-tree.nwk files for your analyses:

To use the ghost-tree.nwk files in scripts, such as those in QIIME, you will need to filter your .biom table so that it doesn't contain extra OTUs, which would cause those scripts to fail. Note: We understand that QIIME isn't the only downstream use for ghost-tree, but this has been a popular user request.

A helper script can be downloaded and used to create a ghost_tree_tips.txt output file containing only the accession numbers from the ghost-tree.nwk that you will be using for your diversity analyses. You must have skbio installed to use it. See scikit-bio for install directions. scikit-bio is very handy! You'll love it.

You will then use the ghost_tree_tips.txt output file (containing the accession numbers from the ghost-tree.nwk) to filter your .biom table so that it contains only the OTUs that are in the ghost-tree.nwk that you are using.
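As a rough illustration of what producing ghost_tree_tips.txt involves, here is a minimal sketch that pulls tip names out of a Newick string. It is not the downloadable helper script. It uses only the standard library and assumes unquoted tip labels, and the function names `tip_names` and `write_tips` are hypothetical. With scikit-bio installed, reading the tree with `skbio.TreeNode.read` and iterating over `tips()` is the more robust route.

```python
import re

# A tip (leaf) label is a label that directly follows "(" or ",".
# Internal node labels follow ")" and branch lengths follow ":",
# so neither is matched. Assumption: labels are unquoted.
_TIP_LABEL = re.compile(r"[(,]\s*([^\s():,;]+)")


def tip_names(newick_text):
    """Return the tip labels of a Newick tree, in order of appearance."""
    return _TIP_LABEL.findall(newick_text)


def write_tips(newick_path, out_path="ghost_tree_tips.txt"):
    """Write one tip label per line and return how many were written."""
    with open(newick_path) as handle:
        names = tip_names(handle.read())
    with open(out_path, "w") as out:
        out.write("\n".join(names) + "\n")
    return len(names)
```

Calling `write_tips("ghost-tree.nwk")` would then produce a text file with one accession number per line, which is the shape the filtering step expects.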

A separate filtering script will filter your .biom table.

Use that script's required arguments and also include the following two arguments:

-e, --otu_ids_to_exclude_fp (provide the text file containing OTU ids to exclude)
--negate_ids_to_exclude (this will keep OTUs in otu_ids_to_exclude_fp, rather than discard them)
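The combined effect of those two arguments can be sketched in a few lines: the ID file normally lists OTUs to drop, and negating it turns the list into the set of OTUs to keep. The snippet below is only an illustration of that logic, with a plain dictionary standing in for a .biom table. The function names and the OTU ids are made up for the example.

```python
def load_ids(lines):
    """Collect the non-empty, whitespace-stripped IDs, one per line."""
    return {line.strip() for line in lines if line.strip()}


def keep_only(table, ids_to_keep):
    """Return a copy of the table restricted to OTUs listed in ids_to_keep."""
    return {otu: counts for otu, counts in table.items() if otu in ids_to_keep}


# Hypothetical data: OTU id -> per-sample counts.
table = {"SH001": [3, 0], "SH002": [1, 5], "SH999": [0, 2]}

# Stand-in for the lines of ghost_tree_tips.txt.
tips = load_ids(["SH001\n", "SH002\n", "\n"])

# "SH999" is not a tip in the tree, so it is removed.
filtered = keep_only(table, tips)
```

Every OTU left in `filtered` has a matching tip in the tree, which is the condition the downstream diversity scripts rely on.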

You should then have your filtered .biom table, a ghost-tree.nwk, and a mapping file, which will then allow you to use them in QIIME.


If you are an experienced developer or are interested in trying out the ghost-tree tool via the command line, follow the directions below.

ghost-tree requires two external software tools to build a hybrid-tree or the "ghost-tree":

MUSCLE (Version 3.8.31):

FastTree (Version >2.1.7):

To optionally regroup the extension OTUs (recommended), ghost-tree requires SUMACLUST:

SUMACLUST (Version 1.0.01):

Please install the software and make sure that it is in your PATH variable. To test this, type "muscle", "fasttree" or "sumaclust" in the command line; you should see the corresponding "usage" or "help" documentation for each software tool.
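If you prefer to check all three tools at once, a small script along these lines can report which executables are not found on your PATH. This is a sketch that assumes the command names are exactly those quoted above. The helper name `missing_tools` is hypothetical, and the check only confirms that a command can be found, not that its version is the required one.

```python
import shutil

# Command names as quoted in the instructions above.
REQUIRED = ["muscle", "fasttree"]
OPTIONAL = ["sumaclust"]  # only needed for regrouping the extension OTUs


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    for label, names in (("required", REQUIRED), ("optional", OPTIONAL)):
        missing = missing_tools(names)
        status = "all found" if not missing else "missing: " + ", ".join(missing)
        print(f"{label} tools: {status}")
```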

You will then need to "clone" the ghost-tree repository to download all of the necessary files. You can then install it from the cloned directory via "pip install -e ."

Typing the command ghost-tree will then display the ghost-tree help page that provides command and subcommand help documentation.

You should also check out our "ipython notebook". See the ipynb install directions here. For a detailed explanation of how to create your own ghost-tree.nwk using the command line tool, see our ghost-tree .ipynb workflow.

This project is currently under active development and may evolve without notice. So, please update your local repository, and check here for changes. If you're interested in contributing, please contact @JTFouquier.


If you have any trouble, please email me. I am happy to help! :)

I am also interested in improving documentation, so please let me know if you find errors or have suggestions!