Generating shuffled predictions

It can be helpful to compare the PICRUSt2 output tables with tables generated after shuffling the predictions across all amplicon sequence variants (ASVs). The script shuffle_predictions.py was added in v2.4.0 to make this task easier. It randomizes the ASV labels across all predicted genomes, so each individual predicted genome is unchanged but is linked to a different ASV, and therefore to different abundances across samples.

This is how you could run the command with the tutorial data:

shuffle_predictions.py -i EC_predicted.tsv.gz \
                       -o EC_predicted_shuffled \
                       -r 5 \
                       -s 131

Here, -r specifies how many random replicates to make, and -s sets the random seed (131 in this example) so that the same shuffled tables are output if the same seed is used again.
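To make the idea of label shuffling concrete, here is a minimal sketch on a toy, uncompressed table. The function shuffle_labels is a hypothetical helper, not part of PICRUSt2, and it is not claimed to match how shuffle_predictions.py is implemented. It only illustrates the concept: the data rows stay exactly as they are, while the first-column labels are permuted with a seeded random number generator.

```shell
# Hypothetical helper, not part of PICRUSt2: permute the first-column labels
# of a tab-separated table while leaving every data row untouched.
# Usage: shuffle_labels TABLE.tsv SEED
shuffle_labels() {
    awk -F'\t' -v OFS='\t' -v seed="$2" '
        NR == 1 { print; next }        # pass the header line through unchanged
        {
            n++
            label[n] = $1              # remember this row label
            $1 = ""; rest[n] = $0      # remember the data (leading tab is kept)
        }
        END {
            srand(seed)                # fixed seed gives a repeatable permutation
            # Fisher-Yates shuffle applied to the labels only
            for (i = n; i > 1; i--) {
                j = int(rand() * i) + 1
                tmp = label[i]; label[i] = label[j]; label[j] = tmp
            }
            for (i = 1; i <= n; i++) print label[i] rest[i]
        }
    ' "$1"
}

# Toy example (made-up values): three ASVs and two gene families.
toy=$(mktemp)
printf 'sequence\tEC:1\tEC:2\nASV1\t1\t0\nASV2\t0\t2\nASV3\t3\t3\n' > "$toy"
shuffle_labels "$toy" 131
```

Running the function twice with the same seed on the same machine should print the same table, which mirrors the role of -s above. The exact permutation depends on the awk implementation, so it may differ between systems.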

The gene family and pathway-level prediction tables can then be generated from these shuffled tables by running the standard PICRUSt2 commands. Below is an example of how to quickly run metagenome_pipeline.py and pathway_pipeline.py on all shuffled tables with a bash loop.

# Make folders for shuffled output
mkdir EC_metagenome_out_shuffled
mkdir pathways_out_shuffled

for i in {1..5}; do

    # Define input and output file paths.
    EC_SHUFFLED="EC_predicted_shuffled/EC_predicted_shuf${i}.tsv.gz"
    OUT_META="EC_metagenome_out_shuffled/rep${i}"
    OUT_PATHWAYS="pathways_out_shuffled/rep${i}"

    # Run the PICRUSt2 scripts that produce the gene family and pathway abundance tables, respectively.
    metagenome_pipeline.py -i ../table.biom \
                           -m marker_predicted_and_nsti.tsv.gz \
                           -f "$EC_SHUFFLED" \
                           -o "$OUT_META" \
                           --strat_out

    pathway_pipeline.py -i "$OUT_META/pred_metagenome_contrib.tsv.gz" \
                        -o "$OUT_PATHWAYS" \
                        -p 1
done
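After a loop like this, it can be worth confirming that every replicate actually produced output before moving on. The sketch below assumes the rep1 to rep5 folder layout used above. The function missing_outputs is a hypothetical helper, not a PICRUSt2 command, and it only checks that the named file exists and is non-empty in each replicate folder.

```shell
# Hypothetical helper, not part of PICRUSt2: print the path of every
# replicate output file that is missing or empty.
# Usage: missing_outputs PARENT_DIR FILENAME N_REPS
missing_outputs() {
    i=1
    while [ "$i" -le "$3" ]; do
        f="$1/rep$i/$2"
        # -s is true only if the file exists and has a size greater than zero
        [ -s "$f" ] || echo "$f"
        i=$((i + 1))
    done
}

# Example call using the folder and file names from the loop above.
missing_outputs EC_metagenome_out_shuffled pred_metagenome_contrib.tsv.gz 5
```

If nothing is printed, all five replicate folders contain the file.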

These shuffled tables are especially helpful for establishing a baseline: they show how well the predicted functional data differentiate samples (e.g. based on ordination or differential abundance testing) when the predicted genomes are assigned to ASVs at random.
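One simple way to use such a baseline is to compute the same summary statistic on the real table and on each shuffled replicate, then ask how often the shuffled values are at least as large as the observed one. The sketch below assumes you have already computed those statistics yourself (for example, a test statistic from an ordination-based analysis). The function empirical_p is a hypothetical helper, and the numbers in the example are made up. It uses the common (k + 1) / (n + 1) estimate, where k is the number of shuffled values greater than or equal to the observed value and n is the number of replicates.

```shell
# Hypothetical helper, not part of PICRUSt2: empirical p-value of an observed
# statistic against statistics computed on shuffled replicates.
# Usage: empirical_p OBSERVED SHUF_1 [SHUF_2 ...]
empirical_p() {
    obs="$1"; shift
    printf '%s\n' "$@" | awk -v obs="$obs" '
        $1 + 0 >= obs + 0 { ge++ }     # count shuffled values >= observed
        { n++ }                        # count replicates
        END { printf "%.4f\n", (ge + 1) / (n + 1) }
    '
}

# Made-up example: observed statistic 0.8 and five shuffled replicates.
empirical_p 0.8 0.1 0.2 0.9 0.3 0.4
```

With only five replicates the smallest value this estimate can take is 1/6, so a larger -r is likely needed if you want a finer-grained comparison.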

Please first check our FAQ if you have any questions about PICRUSt2.

For other general questions and comments about PICRUSt2, please search the PICRUSt Google group. If your question has not been answered previously, please start a new thread.

To report a bug or to make a feature request, please make a new issue at the top of this page.
