Gavin Douglas edited this page Apr 6, 2021 · 30 revisions

PICRUSt2 (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) is software for predicting functional abundances based only on marker gene sequences. Check out the paper here.

"Function" usually refers to gene families such as KEGG orthologs and Enzyme Commission (EC) numbers, but predictions can be made for any arbitrary trait. Similarly, predictions are typically based on 16S rRNA gene sequencing data, but other marker genes can also be used.
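
To make "functional abundances" concrete, here is a toy sketch of how per-sample function abundances could be derived once per-sequence copy numbers have been predicted. It is a simplified picture under stated assumptions, not PICRUSt2's implementation: every name and number is hypothetical, the copy numbers are hard-coded rather than predicted, and dividing by the marker gene copy number is an assumption of this sketch.

```python
# Toy illustration only: all names and numbers here are hypothetical.

# Read counts of each amplicon sequence variant (ASV) per sample.
asv_counts = {
    "sample1": {"ASV_a": 10, "ASV_b": 20},
    "sample2": {"ASV_a": 0, "ASV_b": 8},
}

# Assumed 16S rRNA gene copies per genome for each ASV.
marker_copies = {"ASV_a": 2, "ASV_b": 4}

# Assumed copies of each gene family per genome for each ASV.
function_copies = {
    "ASV_a": {"K00001": 1, "K00002": 0},
    "ASV_b": {"K00001": 2, "K00002": 3},
}


def predict_function_abundances(asv_counts, marker_copies, function_copies):
    """Sum per-ASV function copies, weighted by normalized ASV abundance."""
    result = {}
    for sample, counts in asv_counts.items():
        totals = {}
        for asv, count in counts.items():
            # Correct read counts for the number of marker gene copies per genome.
            normalized = count / marker_copies[asv]
            for function, copies in function_copies[asv].items():
                totals[function] = totals.get(function, 0.0) + normalized * copies
        result[sample] = totals
    return result


abundances = predict_function_abundances(asv_counts, marker_copies, function_copies)
for sample, totals in abundances.items():
    print(sample, totals)
```

Any trait that can be expressed as a per-genome count could stand in for the gene families in `function_copies`, which is why predictions are not limited to a particular function database.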

On this wiki you will find descriptions of the scripts, installation instructions, and workflows. See the right side-bar for details.

PICRUSt2 includes these and other improvements over the original version:

  • Allows users to predict functions for any 16S sequences. Representative sequences from OTUs or amplicon sequence variants (e.g. DADA2 and Deblur output) can be used as input because PICRUSt2 takes a sequence placement approach.
  • The database of reference genomes used for prediction has been expanded more than 10-fold.
  • Addition of hidden-state prediction algorithms from the castor R package.
  • Allows output of MetaCyc ontology predictions that will be comparable with common shotgun metagenomics outputs.
  • Inference of pathway abundances now relies on MinPath, which makes these predictions more stringent.

PICRUSt2 Flowchart

Citations

PICRUSt2 wraps a number of tools to generate functional predictions from amplicon sequences. The PICRUSt2 paper can be found here. However, if you use PICRUSt2 you also need to cite the tools below.

For phylogenetic placement of reads:

For hidden state prediction:

For pathway inference:

  • MinPath (paper, website) - A modified version of this tool from the HMP project is packaged with PICRUSt2. This tool was released under the GNU General Public License.