# 07/04/2020

Source: https://github.com/picrust/picrust2/wiki/q2-picrust2-Tutorial

1. PICRUSt2

In [1]:
cd /xdisk/tfaily/mig2020/extra/nathaliagg/sulfate_experiment/microbial_16S/qiime2

### Load module and environment

In [2]:
module load anaconda



In [3]:
source activate qiime2-2019.10


### 1. PICRUSt2

The required inputs are `--i-table` and `--i-seq`, which need to correspond to QIIME2 artifacts of types `FeatureTable[Frequency]` and `FeatureData[Sequence]`, respectively. The Feature Table needs to contain the abundances of ASVs (i.e., a BIOM table) and the sequence file needs to be a FASTA file containing the sequences for each ASV.
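Before launching the pipeline, it can help to confirm that both inputs have the expected semantic types. The sketch below wraps `qiime tools peek`, which prints an artifact's UUID, type, and data format. The helper names `artifact_type` and `require_type` are hypothetical and are not part of QIIME2 or the plugin.

```shell
# Print the semantic type reported by `qiime tools peek` for one artifact.
artifact_type() {
    qiime tools peek "$1" | awk -F': *' '$1 == "Type" { print $2 }'
}

# Return non-zero (with a message on stderr) if the artifact type differs
# from the expected one. Hypothetical helper, for illustration only.
require_type() {
    actual=$(artifact_type "$1")
    if [ "$actual" != "$2" ]; then
        echo "ERROR: $1 has type '$actual', expected '$2'" >&2
        return 1
    fi
}

# Usage with the paths from this notebook:
#   require_type denoise_dada2/table.qza 'FeatureTable[Frequency]'
#   require_type denoise_dada2/rep-seqs.qza 'FeatureData[Sequence]'
```

Running both checks before `qiime picrust2 full-pipeline` avoids waiting on a long job that would fail on a mistyped input.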

The most important options available in the plugin are for selecting the number of threads (`--p-threads`), the hidden-state prediction (HSP) method (`--p-hsp-method`), and the maximum nearest-sequenced taxon index (NSTI) value (`--p-max-nsti`).

The `--p-max-nsti` option specifies how distantly placed a sequence needs to be in the reference phylogeny before it is excluded. The default cut-off is 2. In human datasets used for testing PICRUSt2, the only ASVs above this default cut-off were 18S sequences erroneously present in 16S datasets, which suggests this cut-off is highly lenient. For environmental datasets, a higher proportion of ASVs may be thrown out based on this default cut-off.

Note that this run uses `--p-hsp-method pic` (the phylogenetic independent contrasts hidden-state prediction method) because it is the fastest. In practice, however, it is recommended that users use `--p-hsp-method mp`.

In [5]:
qiime picrust2 full-pipeline \
   --i-table denoise_dada2/table.qza \
   --i-seq denoise_dada2/rep-seqs.qza \
   --output-dir q2-picrust2_output \
   --p-threads 26 \
   --p-hsp-method pic \
   --p-max-nsti 2 \
   --verbose


This is the set of poorly aligned input sequences to be excluded: 17cd9435a2ca03715c8da1928300828a, e9febe97c5ec827cde0eec0447f87085, 00caa959b22c156dc1211c0e29468729, bad1661ff25ea838d6ed821038878a9a, d17a87b1cb92a9aaeaa86ea1de877712, ea433e1cd3c065ae5036637f193256c4, 2ec424c155efd3dca0caf5d1ce9c3fe5, bdc590da8ddc2a8f359cc2c257239cae, 8db5477fa52649386c151336eb75de39, 3a96524e83d010667e404a77a77c4c1a, 4d8b84b9971f85397a9077c5e850844f, 04aab49fd34a8db1de8ca1057ca74a09, c38c8730e9075351c6ed2477281e191d, 8e43dffc2d273426d6a18c1c9c1373b4, 6b10feb901fde9d2433b0ffc037e45b5, 3c286b5e68db34d1869991ef925a56d1, 695d5aa9d781c8aa113f98b11bb6a9d7, eba19505d7c9449912e29855a9ab8732, 20e9e3037b68aee3edad26522213117f, 1179757c4106e868e132599c5d18b14f, 0c5783fa09ebfd27d65f5c0d8ce4234d, 21c29e25241c6aebdcedddaa8b2ba727, 1511bb0c00dcb06012e623074eef470a, 0aa51061c994844ad7e34c881363b066, 913e03cecf71b4291a8a0fa37c10e3b5, c69dee280cbc552a65c43ff76fb7b94d, 849900c569bfb864cbb4da62fa3e7834, a31dd8cd8afcf11
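The list above is cut off in the notebook display, so the number of excluded ASVs cannot be read from it directly. As a rough sketch, assuming the full `--verbose` output had been saved to a text file (the file name `picrust2_verbose.log` and the helper `count_excluded` are hypothetical), the IDs on that line could be counted like this:

```shell
# Count the comma-separated ASV IDs listed after the
# "poorly aligned input sequences to be excluded:" message.
# $1 = text file holding the captured pipeline output (hypothetical).
count_excluded() {
    grep 'poorly aligned input sequences to be excluded:' "$1" |
        sed 's/^.*to be excluded: *//' |
        tr ',' '\n' |
        grep -c '[0-9a-f]'
}

# Usage:
#   count_excluded picrust2_verbose.log
```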


The output files in `q2-picrust2_output` are:

- ec_metagenome.qza - EC metagenome predictions (rows are EC numbers and columns are samples).  
- ko_metagenome.qza - KO metagenome predictions (rows are KOs and columns are samples).  
- pathway_abundance.qza - MetaCyc pathway abundance predictions (rows are pathways and columns are samples).  


The artifacts are all of type `FeatureTable[Frequency]`, which means they can be used with QIIME2 plugins that process and analyze these datatypes.

For instance, you can generate summary information for each table, which you can view like any other QIIME2 visualization:

In [6]:
qiime feature-table summarize \
   --i-table q2-picrust2_output/pathway_abundance.qza \
   --o-visualization q2-picrust2_output/pathway_abundance.qzv
   
qiime feature-table summarize \
   --i-table q2-picrust2_output/ko_metagenome.qza \
   --o-visualization q2-picrust2_output/ko_metagenome.qzv
   
qiime feature-table summarize \
   --i-table q2-picrust2_output/ec_metagenome.qza \
   --o-visualization q2-picrust2_output/ec_metagenome.qzv

Saved Visualization to: q2-picrust2_output/pathway_abundance.qzv
Saved Visualization to: q2-picrust2_output/ko_metagenome.qzv
Saved Visualization to: q2-picrust2_output/ec_metagenome.qzv


Note that these tables are not in units of relative abundance (e.g. percent). Each value is instead the sum, across ASVs, of the predicted functional abundance contributed by each ASV multiplied by that ASV's abundance (the number of input reads).

The above metagenome predictions can be integrated into a number of QIIME2 analyses. For instance, you can quickly calculate diversity metrics based on these tables. The first-quartile sample pathway abundance found above was 3,583,038, so we will rarefy the pathway table to this cut-off when calculating the core diversity metrics; the KO and EC tables use their own sampling depths (30,928,844 and 16,857,683, respectively):

In [7]:
qiime diversity core-metrics \
   --i-table q2-picrust2_output/pathway_abundance.qza \
   --p-sampling-depth 3583038 \
   --m-metadata-file metadata.tsv \
   --output-dir path-abun_core_metrics_out \
   --p-n-jobs 5

qiime diversity core-metrics \
   --i-table q2-picrust2_output/ko_metagenome.qza \
   --p-sampling-depth 30928844 \
   --m-metadata-file metadata.tsv \
   --output-dir ko-abun_core_metrics_out \
   --p-n-jobs 5
   
qiime diversity core-metrics \
   --i-table q2-picrust2_output/ec_metagenome.qza \
   --p-sampling-depth 16857683 \
   --m-metadata-file metadata.tsv \
   --output-dir ec-abun_core_metrics_out \
   --p-n-jobs 5
   


Saved FeatureTable[Frequency] to: path-abun_core_metrics_out/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: path-abun_core_metrics_out/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: path-abun_core_metrics_out/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: path-abun_core_metrics_out/evenness_vector.qza
Saved DistanceMatrix to: path-abun_core_metrics_out/jaccard_distance_matrix.qza
Saved DistanceMatrix to: path-abun_core_metrics_out/bray_curtis_distance_matrix.qza
Saved PCoAResults to: path-abun_core_metrics_out/jaccard_pcoa_results.qza
Saved PCoAResults to: path-abun_core_metrics_out/bray_curtis_pcoa_results.qza
Saved Visualization to: path-abun_core_metrics_out/jaccard_emperor.qzv
Saved Visualization to: path-abun_core_metrics_out/bray_curtis_emperor.qzv
Saved FeatureTable[Frequency] to: ko-abun_core_metrics_out/rarefied_table


Take a look at `path-abun_core_metrics_out/bray_curtis_emperor.qzv` in the QIIME2 viewer. You should see an ordination plot.
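Rarefying to a fixed depth drops every sample whose total falls below that depth, so it is worth checking how many samples survive a candidate cut-off. The following is a minimal sketch, assuming a table has already been converted to TSV with `biom convert --to-tsv` (as in the export step of this notebook); the helper name `samples_retained` is hypothetical.

```shell
# Report how many samples have a column total at or above a sampling depth.
# $1 = candidate sampling depth, $2 = TSV from `biom convert --to-tsv`.
samples_retained() {
    awk -F'\t' -v depth="$1" '
        BEGIN { n = 1 }
        /^#/ { next }                      # skip the "#"-prefixed header lines
        {
            # Column 1 is the feature ID; columns 2..NF are samples.
            for (i = 2; i <= NF; i++) total[i] += $i
            if (NF > n) n = NF
        }
        END {
            kept = 0
            for (i = 2; i <= n; i++) if (total[i] >= depth) kept++
            printf "%d of %d samples retained at depth %s\n", kept, n - 1, depth
        }
    ' "$2"
}

# Usage with the pathway table and depth from this notebook:
#   samples_retained 3583038 pathway_picrust_exported/pathway_abundance.tsv
```

Since the depth used here is a first-quartile value, roughly a quarter of the samples would be expected to fall below it.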

Users are typically most interested in the predicted KEGG orthologs and MetaCyc pathways. If you want to use the tables outside of QIIME2, you can export them to BIOM format and then convert them to tab-separated text. For example, the following commands export each table and convert it to TSV:

In [9]:
qiime tools export \
   --input-path q2-picrust2_output/pathway_abundance.qza \
   --output-path pathway_picrust_exported

# Each export is saved as feature-table.biom, so convert it to TSV with biom convert
biom convert \
   -i pathway_picrust_exported/feature-table.biom \
   -o pathway_picrust_exported/pathway_abundance.tsv \
   --to-tsv
   
qiime tools export \
   --input-path q2-picrust2_output/ec_metagenome.qza \
   --output-path ec_picrust_exported
   
biom convert \
   -i ec_picrust_exported/feature-table.biom \
   -o ec_picrust_exported/ec_metagenome.tsv \
   --to-tsv
   
qiime tools export \
   --input-path q2-picrust2_output/ko_metagenome.qza \
   --output-path ko_picrust_exported
   
biom convert \
   -i ko_picrust_exported/feature-table.biom \
   -o ko_picrust_exported/ko_metagenome.tsv \
   --to-tsv

Exported q2-picrust2_output/pathway_abundance.qza as BIOMV210DirFmt to directory pathway_picrust_exported
Exported q2-picrust2_output/ec_metagenome.qza as BIOMV210DirFmt to directory ec_picrust_exported
Exported q2-picrust2_output/ko_metagenome.qza as BIOMV210DirFmt to directory ko_picrust_exported
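Because the exported values are predicted abundances rather than relative abundances, a common next step is to rescale each sample to percentages. The sketch below does this with `awk` on a TSV produced by `biom convert --to-tsv`, assuming the usual layout of `#`-prefixed header lines followed by one row per feature; `to_percent` is a hypothetical helper name.

```shell
# Rescale every sample column of an exported table to percentages.
# The file is read twice: the first pass totals each sample column,
# the second pass divides each value by its column total.
# $1 = TSV from `biom convert --to-tsv`.
to_percent() {
    awk -F'\t' -v OFS='\t' '
        /^#/ { if (NR != FNR) print; next }   # echo header lines on the second pass only
        NR == FNR {                           # first pass: accumulate column totals
            for (i = 2; i <= NF; i++) total[i] += $i
            next
        }
        {                                     # second pass: rescale and print
            for (i = 2; i <= NF; i++)
                $i = (total[i] > 0) ? sprintf("%.4f", 100 * $i / total[i]) : 0
            print
        }
    ' "$1" "$1"
}

# Usage with a file written by the export step:
#   to_percent pathway_picrust_exported/pathway_abundance.tsv > pathway_abundance_percent.tsv
```

The output file name in the usage line is only an example; the original TSV is left unchanged.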


Amazing! 