The following conda environment for QIIME 2 was used for these analyses:

>qiime2-2022.2

The following commands were run using the command line interface (CLI) of QIIME 2.

# Import Taxa table

In [15]:
!qiime tools import \
 --type 'FeatureData[Taxonomy]' \
 --input-path ./Qiime_input_data/taxa.txt \
 --output-path taxonomy.qza

Imported ./Qiime_input_data/taxa.txt as TSVTaxonomyDirectoryFormat to taxonomy.qza

# Import PT data

In [16]:
!qiime tools import \
--type 'FeatureTable[Frequency]' \
--input-path ./Qiime_input_data/countMerged_PT_HMS.biom \
--input-format BIOMV100Format \
--output-path countMerged_PT_HMS.qza

Imported ./Qiime_input_data/countMerged_PT_HMS.biom as BIOMV100Format to countMerged_PT_HMS.qza

In [17]:
!qiime tools import \
--type 'FeatureTable[Frequency]' \
--input-path ./Qiime_input_data/countMerged_PT_Broad.biom \
--input-format BIOMV100Format \
--output-path countMerged_PT_Broad.qza

Imported ./Qiime_input_data/countMerged_PT_Broad.biom as BIOMV100Format to countMerged_PT_Broad.qza

In [18]:
!qiime tools import \
--type 'FeatureTable[Frequency]' \
--input-path ./Qiime_input_data/countMerged_PT_MDA.biom \
--input-format BIOMV100Format \
--output-path countMerged_PT_MDA.qza

Imported ./Qiime_input_data/countMerged_PT_MDA.biom as BIOMV100Format to countMerged_PT_MDA.qza

# Import BDN data

In [19]:
!qiime tools import \
--type 'FeatureTable[Frequency]' \
--input-path ./Qiime_input_data/countMerged_BDN_HMS.biom \
--input-format BIOMV100Format \
--output-path countMerged_BDN_HMS.qza

Imported ./Qiime_input_data/countMerged_BDN_HMS.biom as BIOMV100Format to countMerged_BDN_HMS.qza

# DEICODE for PT (beta diversity that does not require rarefaction)

Martino et al. 2019. mSystems. See the following links:
- https://journals.asm.org/doi/10.1128/mSystems.00016-19
- https://forum.qiime2.org/t/robust-aitchison-pca-beta-diversity-with-deicode/8333
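
As a rough illustration of why this approach does not need rarefaction: DEICODE's RPCA starts with a robust centered log-ratio (rclr) transform, which takes logs of only the nonzero counts in a sample and centers them on that sample's mean log, and then estimates the unobserved entries by matrix completion. The sketch below covers only the transform step on hypothetical toy counts; the function name `rclr_sample` is an assumption for illustration and is not DEICODE's API.

```python
import math

def rclr_sample(counts):
    """Robust centered log-ratio of one sample; zeros stay missing (None)."""
    logs = [math.log(c) for c in counts if c > 0]
    if not logs:
        raise ValueError("sample has no nonzero counts")
    center = sum(logs) / len(logs)  # mean log of the observed counts only
    return [math.log(c) - center if c > 0 else None for c in counts]

# Hypothetical toy samples: same composition, tenfold difference in depth.
shallow = rclr_sample([10, 0, 1000])
deep = rclr_sample([100, 0, 10000])

# The transformed values agree up to floating-point error.
for a, b in zip(shallow, deep):
    print(a, b)
```

Because the per-sample centering cancels any constant scaling of the counts, two samples that differ only in sequencing depth map to the same values. Real samples also differ in which features are observed at all, which this toy case does not capture.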

### Harvard Medical School (HMS)

In [20]:
!qiime deicode rpca \
    --i-table countMerged_PT_HMS.qza \
    --o-biplot ./Deicode_outputs/deicode_ordination_HMS.qza \
    --o-distance-matrix ./Deicode_outputs/deicode_distance_HMS.qza

Saved PCoAResults % Properties('biplot') to: ./Deicode_outputs/deicode_ordination_HMS.qza
Saved DistanceMatrix to: ./Deicode_outputs/deicode_distance_HMS.qza

In [21]:
!qiime emperor biplot \
    --i-biplot ./Deicode_outputs/deicode_ordination_HMS.qza \
    --m-sample-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_HMS.txt \
    --m-feature-metadata-file taxonomy.qza \
    --o-visualization ./Deicode_outputs/deicode_biplot_HMS.qzv \
    --p-number-of-features 1

Saved Visualization to: ./Deicode_outputs/deicode_biplot_HMS.qzv

In [22]:
!qiime diversity adonis \
--i-distance-matrix ./Deicode_outputs/deicode_distance_HMS.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_HMS.txt \
--p-formula "investigation" \
--p-n-jobs 8 \
--o-visualization ./Deicode_outputs/deicode_adonis_HMS.qzv

Saved Visualization to: ./Deicode_outputs/deicode_adonis_HMS.qzv

### MD Anderson (MDA)

In [23]:
!qiime deicode rpca \
    --i-table countMerged_PT_MDA.qza \
    --o-biplot ./Deicode_outputs/deicode_ordination_MDA.qza \
    --o-distance-matrix ./Deicode_outputs/deicode_distance_MDA.qza

Saved PCoAResults % Properties('biplot') to: ./Deicode_outputs/deicode_ordination_MDA.qza
Saved DistanceMatrix to: ./Deicode_outputs/deicode_distance_MDA.qza

In [24]:
!qiime emperor biplot \
    --i-biplot ./Deicode_outputs/deicode_ordination_MDA.qza \
    --m-sample-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_MDA.txt \
    --m-feature-metadata-file taxonomy.qza \
    --o-visualization ./Deicode_outputs/deicode_biplot_MDA.qzv \
    --p-number-of-features 1

Saved Visualization to: ./Deicode_outputs/deicode_biplot_MDA.qzv

In [25]:
!qiime diversity adonis \
    --i-distance-matrix ./Deicode_outputs/deicode_distance_MDA.qza \
    --m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_MDA.txt \
    --p-formula "investigation" \
    --p-n-jobs 8 \
    --o-visualization ./Deicode_outputs/deicode_adonis_MDA.qzv

Saved Visualization to: ./Deicode_outputs/deicode_adonis_MDA.qzv

### Broad Institute WGS (Broad_WGS)

In [26]:
!qiime deicode rpca \
    --i-table countMerged_PT_Broad.qza \
    --o-biplot ./Deicode_outputs/deicode_ordination_Broad_WGS.qza \
    --o-distance-matrix ./Deicode_outputs/deicode_distance_Broad_WGS.qza

Saved PCoAResults % Properties('biplot') to: ./Deicode_outputs/deicode_ordination_Broad_WGS.qza
Saved DistanceMatrix to: ./Deicode_outputs/deicode_distance_Broad_WGS.qza

In [27]:
!qiime emperor biplot \
    --i-biplot ./Deicode_outputs/deicode_ordination_Broad_WGS.qza \
    --m-sample-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_Broad.txt \
    --m-feature-metadata-file taxonomy.qza \
    --o-visualization ./Deicode_outputs/deicode_biplot_Broad_WGS.qzv \
    --p-number-of-features 1

Saved Visualization to: ./Deicode_outputs/deicode_biplot_Broad_WGS.qzv

In [28]:
!qiime diversity adonis \
    --i-distance-matrix ./Deicode_outputs/deicode_distance_Broad_WGS.qza \
    --m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_Broad.txt \
    --p-formula "investigation" \
    --p-n-jobs 8 \
    --o-visualization ./Deicode_outputs/deicode_adonis_Broad_WGS.qzv

Saved Visualization to: ./Deicode_outputs/deicode_adonis_Broad_WGS.qzv

# DEICODE for BDN (beta diversity that does not require rarefaction)

Martino et al. 2019. mSystems. See the following links:
- https://journals.asm.org/doi/10.1128/mSystems.00016-19
- https://forum.qiime2.org/t/robust-aitchison-pca-beta-diversity-with-deicode/8333

### Harvard Medical School (HMS)

In [29]:
!qiime deicode rpca \
    --i-table countMerged_BDN_HMS.qza \
    --o-biplot ./Deicode_outputs/deicode_ordination_HMS_BDN.qza \
    --o-distance-matrix ./Deicode_outputs/deicode_distance_HMS_BDN.qza

Saved PCoAResults % Properties('biplot') to: ./Deicode_outputs/deicode_ordination_HMS_BDN.qza
Saved DistanceMatrix to: ./Deicode_outputs/deicode_distance_HMS_BDN.qza

In [30]:
!qiime emperor biplot \
    --i-biplot ./Deicode_outputs/deicode_ordination_HMS_BDN.qza \
    --m-sample-metadata-file ./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt \
    --m-feature-metadata-file taxonomy.qza \
    --o-visualization ./Deicode_outputs/deicode_biplot_HMS_BDN.qzv \
    --p-number-of-features 1

Saved Visualization to: ./Deicode_outputs/deicode_biplot_HMS_BDN.qzv

In [31]:
!qiime diversity adonis \
    --i-distance-matrix ./Deicode_outputs/deicode_distance_HMS_BDN.qza \
    --m-metadata-file ./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt \
    --p-formula "investigation" \
    --p-n-jobs 8 \
    --o-visualization ./Deicode_outputs/deicode_adonis_HMS_BDN.qzv

Saved Visualization to: ./Deicode_outputs/deicode_adonis_HMS_BDN.qzv
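
The DEICODE cells above repeat one three-step pattern per dataset (`rpca`, then `emperor biplot`, then `adonis`) and differ only in file names. A hypothetical helper such as the one below could generate those command lines from a table, a metadata file, and a tag. The function name `deicode_commands` and its parameters are assumptions for illustration; the flags and paths are copied from the cells above. The sketch only prints the commands and does not run QIIME 2.

```python
import shlex

def deicode_commands(table, metadata, tag, out_dir="./Deicode_outputs",
                     taxonomy="taxonomy.qza", formula="investigation", n_jobs=8):
    """Build the rpca, biplot, and adonis command lines for one dataset."""
    ordination = f"{out_dir}/deicode_ordination_{tag}.qza"
    distance = f"{out_dir}/deicode_distance_{tag}.qza"
    return [
        ["qiime", "deicode", "rpca",
         "--i-table", table,
         "--o-biplot", ordination,
         "--o-distance-matrix", distance],
        ["qiime", "emperor", "biplot",
         "--i-biplot", ordination,
         "--m-sample-metadata-file", metadata,
         "--m-feature-metadata-file", taxonomy,
         "--o-visualization", f"{out_dir}/deicode_biplot_{tag}.qzv",
         "--p-number-of-features", "1"],
        ["qiime", "diversity", "adonis",
         "--i-distance-matrix", distance,
         "--m-metadata-file", metadata,
         "--p-formula", formula,
         "--p-n-jobs", str(n_jobs),
         "--o-visualization", f"{out_dir}/deicode_adonis_{tag}.qzv"],
    ]

# Print (rather than run) the three commands for the BDN HMS dataset.
commands = deicode_commands(
    "countMerged_BDN_HMS.qza",
    "./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt",
    "HMS_BDN",
)
for command in commands:
    print(shlex.join(command))
```

Each list could be passed to `subprocess.run(command, check=True)` inside the activated conda environment to execute it.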

# QIIME 2 Core Metrics - PT
Note: Since beta diversity was already calculated above using DEICODE, we are mostly interested in the alpha diversity results. For each dataset, a single rarefaction depth was selected near the 1st quartile of its sample read distribution.

### Harvard Medical School (HMS)
Note the following sample read distribution from R:
```
> summary(rowSums(countMerged_PT_HMS))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
      1.0     618.5    1799.0   51852.0   17551.5 1863068.0 
```
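
The first-quartile heuristic can be sketched in Python as follows. The read totals here are hypothetical toy values, not the study data, and the helper names `first_quartile` and `samples_retained` are assumptions for illustration. The `inclusive` method of `statistics.quantiles` should match the default interpolation used by R's `quantile`, and the retention count reflects that rarefaction drops samples with fewer reads than the sampling depth.

```python
import statistics

def first_quartile(sample_totals):
    """First quartile of per-sample read totals (linear interpolation)."""
    return statistics.quantiles(sample_totals, n=4, method="inclusive")[0]

def samples_retained(sample_totals, depth):
    """Count samples with at least `depth` reads, i.e. those kept by rarefaction."""
    return sum(1 for total in sample_totals if total >= depth)

# Hypothetical per-sample read totals (not the study data).
totals = [100, 200, 400, 800, 1600]
q1 = first_quartile(totals)
kept = samples_retained(totals, depth=200)
print(f"Q1 = {q1}, samples kept at depth 200: {kept} of {len(totals)}")
```

Choosing a depth near the first quartile keeps roughly three quarters of the samples while discarding the shallowest ones.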

In [35]:
!qiime diversity core-metrics \
--i-table countMerged_PT_HMS.qza \
--p-sampling-depth 600 \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_HMS.txt \
--output-dir ./core_metrics_pt_hms/

Saved FeatureTable[Frequency] to: ./core_metrics_pt_hms/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_hms/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_hms/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_hms/evenness_vector.qza
Saved DistanceMatrix to: ./core_metrics_pt_hms/jaccard_distance_matrix.qza
Saved DistanceMatrix to: ./core_metrics_pt_hms/bray_curtis_distance_matrix.qza
Saved PCoAResults to: ./core_metrics_pt_hms/jaccard_pcoa_results.qza
Saved PCoAResults to: ./core_metrics_pt_hms/bray_curtis_pcoa_results.qza
Saved Visualization to: ./core_metrics_pt_hms/jaccard_emperor.qzv
Saved Visualization to: ./core_metrics_pt_hms/bray_curtis_emperor.qzv

In [36]:
!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_pt_hms/observed_features_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_HMS.txt \
--o-visualization ./core_metrics_pt_hms/observed_features_vector_significance.qzv

!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_pt_hms/shannon_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_HMS.txt \
--o-visualization ./core_metrics_pt_hms/shannon_vector_significance.qzv

Saved Visualization to: ./core_metrics_pt_hms/observed_features_vector_significance.qzv
Saved Visualization to: ./core_metrics_pt_hms/shannon_vector_significance.qzv

In [37]:
!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_pt_hms/jaccard_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_HMS.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_pt_hms/jaccard_distance_matrix_significance.qzv

!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_pt_hms/bray_curtis_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_HMS.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_pt_hms/bray_curtis_distance_matrix_significance.qzv

Saved Visualization to: ./core_metrics_pt_hms/jaccard_distance_matrix_significance.qzv
Saved Visualization to: ./core_metrics_pt_hms/bray_curtis_distance_matrix_significance.qzv

### MD Anderson (MDA)
Note the following sample read distribution from R:
```
> summary(rowSums(countMerged_PT_MDA))
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
   157.0    337.5    492.0  27181.9   1731.5 758959.0 
```

In [38]:
!qiime diversity core-metrics \
--i-table countMerged_PT_MDA.qza \
--p-sampling-depth 330 \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_MDA.txt \
--output-dir ./core_metrics_pt_mda/

Saved FeatureTable[Frequency] to: ./core_metrics_pt_mda/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_mda/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_mda/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_mda/evenness_vector.qza
Saved DistanceMatrix to: ./core_metrics_pt_mda/jaccard_distance_matrix.qza
Saved DistanceMatrix to: ./core_metrics_pt_mda/bray_curtis_distance_matrix.qza
Saved PCoAResults to: ./core_metrics_pt_mda/jaccard_pcoa_results.qza
Saved PCoAResults to: ./core_metrics_pt_mda/bray_curtis_pcoa_results.qza
Saved Visualization to: ./core_metrics_pt_mda/jaccard_emperor.qzv
Saved Visualization to: ./core_metrics_pt_mda/bray_curtis_emperor.qzv

In [39]:
!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_pt_mda/observed_features_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_MDA.txt \
--o-visualization ./core_metrics_pt_mda/observed_features_vector_significance.qzv

!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_pt_mda/shannon_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_MDA.txt \
--o-visualization ./core_metrics_pt_mda/shannon_vector_significance.qzv

Saved Visualization to: ./core_metrics_pt_mda/observed_features_vector_significance.qzv
Saved Visualization to: ./core_metrics_pt_mda/shannon_vector_significance.qzv

In [40]:
!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_pt_mda/jaccard_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_MDA.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_pt_mda/jaccard_distance_matrix_significance.qzv

!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_pt_mda/bray_curtis_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_MDA.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_pt_mda/bray_curtis_distance_matrix_significance.qzv

Saved Visualization to: ./core_metrics_pt_mda/jaccard_distance_matrix_significance.qzv
Saved Visualization to: ./core_metrics_pt_mda/bray_curtis_distance_matrix_significance.qzv

### Broad Institute WGS only (Broad WGS)
Note the following sample read distribution from R (read counts are lower than at the other sequencing centers):
```
> summary(rowSums(countMerged_PT_Broad))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
     77     396     733    8981    4636   81314
```

In [41]:
!qiime diversity core-metrics \
--i-table countMerged_PT_Broad.qza \
--p-sampling-depth 400 \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_Broad.txt \
--output-dir ./core_metrics_pt_broad_WGS/

Saved FeatureTable[Frequency] to: ./core_metrics_pt_broad_WGS/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_broad_WGS/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_broad_WGS/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_pt_broad_WGS/evenness_vector.qza
Saved DistanceMatrix to: ./core_metrics_pt_broad_WGS/jaccard_distance_matrix.qza
Saved DistanceMatrix to: ./core_metrics_pt_broad_WGS/bray_curtis_distance_matrix.qza
Saved PCoAResults to: ./core_metrics_pt_broad_WGS/jaccard_pcoa_results.qza
Saved PCoAResults to: ./core_metrics_pt_broad_WGS/bray_curtis_pcoa_results.qza
Saved Visualization to: ./core_metrics_pt_broad_WGS/jaccard_emperor.qzv
Saved Visualization to: ./core_metrics_pt_broad_WGS/bray_curtis_emperor.qzv

In [42]:
!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_pt_broad_WGS/observed_features_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_Broad.txt \
--o-visualization ./core_metrics_pt_broad_WGS/observed_features_vector_significance.qzv

!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_pt_broad_WGS/shannon_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_Broad.txt \
--o-visualization ./core_metrics_pt_broad_WGS/shannon_vector_significance.qzv

Saved Visualization to: ./core_metrics_pt_broad_WGS/observed_features_vector_significance.qzv
Saved Visualization to: ./core_metrics_pt_broad_WGS/shannon_vector_significance.qzv

In [44]:
!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_pt_broad_WGS/jaccard_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_Broad.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_pt_broad_WGS/jaccard_distance_matrix_significance.qzv

!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_pt_broad_WGS/bray_curtis_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_PT_Broad.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_pt_broad_WGS/bray_curtis_distance_matrix_significance.qzv

Saved Visualization to: ./core_metrics_pt_broad_WGS/jaccard_distance_matrix_significance.qzv
Saved Visualization to: ./core_metrics_pt_broad_WGS/bray_curtis_distance_matrix_significance.qzv

# QIIME 2 Core Metrics - BDN
Note: Since beta diversity was already calculated above using DEICODE, we are mostly interested in the alpha diversity results. For each dataset, a single rarefaction depth was selected near the 1st quartile of its sample read distribution.

### Harvard Medical School (HMS)
Note the following sample read distribution from R:
```
> summary(rowSums(countMerged_BDN_HMS))
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
    164.0     600.5    1607.0   22539.8    4843.0 1154723.0
```

In [45]:
!qiime diversity core-metrics \
--i-table countMerged_BDN_HMS.qza \
--p-sampling-depth 600 \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt \
--output-dir ./core_metrics_bdn_hms/

Saved FeatureTable[Frequency] to: ./core_metrics_bdn_hms/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_bdn_hms/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_bdn_hms/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: ./core_metrics_bdn_hms/evenness_vector.qza
Saved DistanceMatrix to: ./core_metrics_bdn_hms/jaccard_distance_matrix.qza
Saved DistanceMatrix to: ./core_metrics_bdn_hms/bray_curtis_distance_matrix.qza
Saved PCoAResults to: ./core_metrics_bdn_hms/jaccard_pcoa_results.qza
Saved PCoAResults to: ./core_metrics_bdn_hms/bray_curtis_pcoa_results.qza
Saved Visualization to: ./core_metrics_bdn_hms/jaccard_emperor.qzv
Saved Visualization to: ./core_metrics_bdn_hms/bray_curtis_emperor.qzv

In [46]:
!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_bdn_hms/observed_features_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt \
--o-visualization ./core_metrics_bdn_hms/observed_features_vector_significance.qzv

!qiime diversity alpha-group-significance \
--i-alpha-diversity ./core_metrics_bdn_hms/shannon_vector.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt \
--o-visualization ./core_metrics_bdn_hms/shannon_vector_significance.qzv

Saved Visualization to: ./core_metrics_bdn_hms/observed_features_vector_significance.qzv
Saved Visualization to: ./core_metrics_bdn_hms/shannon_vector_significance.qzv

In [47]:
!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_bdn_hms/jaccard_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_bdn_hms/jaccard_distance_matrix_significance.qzv

!qiime diversity beta-group-significance \
--i-distance-matrix ./core_metrics_bdn_hms/bray_curtis_distance_matrix.qza \
--m-metadata-file ./Qiime_input_data/metadataSamplesMerged_BDN_HMS.txt \
--m-metadata-column investigation \
--o-visualization ./core_metrics_bdn_hms/bray_curtis_distance_matrix_significance.qzv

Saved Visualization to: ./core_metrics_bdn_hms/jaccard_distance_matrix_significance.qzv
Saved Visualization to: ./core_metrics_bdn_hms/bray_curtis_distance_matrix_significance.qzv

# Figures

- The output .qzv files can be viewed at https://view.qiime2.org/, where the beta diversity PCoA plots are rendered with Emperor (https://biocore.github.io/emperor/) and can be captured as screenshots.
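
For a quick local check before uploading, note that a .qzv file is a zip archive. The sketch below lists the entry page of a visualization, assuming the usual QIIME 2 archive layout in which the page sits at `<uuid>/data/index.html`. The function name `find_index_pages` is an assumption for illustration.

```python
import zipfile

def find_index_pages(qzv_path):
    """Return archive members that look like the visualization entry page."""
    with zipfile.ZipFile(qzv_path) as archive:
        return [name for name in archive.namelist()
                if name.endswith("data/index.html")]
```

For example, `find_index_pages("./Deicode_outputs/deicode_biplot_HMS.qzv")` should return a single path inside the archive if the file was written correctly.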