In [3]:
import pandas as pd, numpy as np
from emperor import Emperor, nbinstall
from skbio import OrdinationResults

Phylogenetic diversity metrics need a rooted tree, so we build one from the Deblur sOTU representative sequences; the tree captures the ancestral relatedness between features.
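To make the role of the rooted tree concrete, here is a minimal sketch of Faith's phylogenetic diversity: the summed branch length of the tree connecting a sample's observed features to the root. The toy tree, its node names, and its branch lengths are invented for illustration and are not taken from this dataset; QIIME 2 computes the metric itself from the rooted tree built below.

```python
# Toy rooted tree stored as child -> (parent, branch_length).
# Node names and branch lengths are hypothetical.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("root", 4.0),
    "n1": ("root", 0.5),
}


def faith_pd(observed_tips, tree=TREE):
    """Sum the branch lengths on the paths from each observed tip to the root."""
    counted = set()
    for tip in observed_tips:
        node = tip
        # Walk towards the root; stop once a branch has already been counted.
        while node in tree and node not in counted:
            counted.add(node)
            node = tree[node][0]
    return sum(tree[node][1] for node in counted)


pd_close = faith_pd(["A", "B"])    # sister tips share the n1 branch
pd_distant = faith_pd(["A", "C"])  # tips from both sides of the root
```

Two samples with the same number of features can therefore differ in phylogenetic diversity: here the pair of distant tips spans more branch length than the pair of sister tips.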

In [20]:
! qiime feature-table tabulate-seqs \
  --i-data ../rep_sotus.qza \
  --o-visualization ../rep_sotus.qzv

Saved Visualization to: ../rep_sotus.qzv


In [21]:
! qiime alignment mafft \
  --i-sequences ../rep_sotus.qza \
  --o-alignment ../aligned_rep_sotus.qza

Saved FeatureData[AlignedSequence] to: ../aligned_rep_sotus.qza


In [22]:
! qiime alignment mask \
  --i-alignment ../aligned_rep_sotus.qza \
  --o-masked-alignment ../masked_aligned_rep_sotus.qza

Saved FeatureData[AlignedSequence] to: ../masked_aligned_rep_sotus.qza


In [23]:
! qiime phylogeny fasttree \
  --i-alignment ../masked_aligned_rep_sotus.qza \
  --o-tree ../unrooted-deblur-tree.qza

Saved Phylogeny[Unrooted] to: ../unrooted-deblur-tree.qza


In [24]:
! qiime phylogeny midpoint-root \
  --i-tree ../unrooted-deblur-tree.qza \
  --o-rooted-tree ../rooted-deblur-tree.qza

Saved Phylogeny[Rooted] to: ../rooted-deblur-tree.qza


Next, we rarefy the Deblur sOTU table to an even sampling depth so that differences in sequencing effort between samples do not mask biological signal.
Here we use QIIME 2's core-metrics method. A sampling depth of 2000 was chosen after interactively inspecting the Deblur sOTU table in the QIIME 2 viewer. The 182 samples with enough sequences to reach this depth were retained.
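As a rough illustration of what rarefying to a fixed depth means, the sketch below subsamples each sample without replacement and drops samples that fall short of the depth. The `rarefy` helper, the toy table, and the depth of 20 are hypothetical; this is not the QIIME 2 implementation, only the same idea in plain Python.

```python
import random


def rarefy(table, depth, seed=0):
    """Subsample every sample to `depth` reads without replacement.

    `table` maps sample id -> {feature id: count}. Samples with fewer
    than `depth` reads are dropped from the result.
    """
    rng = random.Random(seed)
    rarefied = {}
    for sample, counts in table.items():
        if sum(counts.values()) < depth:
            continue
        # Expand the counts into one entry per read, then draw `depth` of them.
        reads = [feat for feat, n in counts.items() for _ in range(n)]
        picked = rng.sample(reads, depth)
        rarefied[sample] = {feat: picked.count(feat) for feat in set(picked)}
    return rarefied


# Toy table; sample and feature ids are made up.
toy = {
    "s1": {"f1": 30, "f2": 10},
    "s2": {"f1": 5, "f2": 20, "f3": 15},
    "s3": {"f1": 3, "f2": 2},  # only 5 reads, so it is dropped at depth 20
}
even = rarefy(toy, depth=20)
```

After this step every retained sample has exactly the same total count, which is why shallow samples have to be discarded rather than padded.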

In [27]:
! qiime diversity core-metrics \
  --i-phylogeny ../rooted-deblur-tree.qza \
  --i-table ../deblur_table_sotus_ucase_hdf5.qza \
  --p-sampling-depth 2000 \
  --output-dir ../core-metrics-2k-results

Saved SampleData[AlphaDiversity] to: ../core-metrics-2k-results/faith_pd_vector.qza
Saved SampleData[AlphaDiversity] to: ../core-metrics-2k-results/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: ../core-metrics-2k-results/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: ../core-metrics-2k-results/evenness_vector.qza
Saved DistanceMatrix to: ../core-metrics-2k-results/unweighted_unifrac_distance_matrix.qza
Saved DistanceMatrix to: ../core-metrics-2k-results/weighted_unifrac_distance_matrix.qza
Saved DistanceMatrix to: ../core-metrics-2k-results/jaccard_distance_matrix.qza
Saved DistanceMatrix to: ../core-metrics-2k-results/bray_curtis_distance_matrix.qza
Saved PCoAResults to: ../core-metrics-2k-results/unweighted_unifrac_pcoa_results.qza
Saved PCoAResults to: ../core-metrics-2k-results/weighted_unifrac_pcoa_results.qza
Saved PCoAResults to: ../core-metrics-2k-results/jacc

In [31]:
! qiime emperor plot --help

Usage: qiime emperor plot [OPTIONS]

  Generate visualization of your ordination.

Options:
  --i-pcoa PATH           Artifact: PCoAResults  [required]
                          The principal
                          coordinates matrix to be plotted.
  --m-metadata-file PATH  Metadata file or artifact viewable as metadata. This
                          option may be supplied multiple times to merge
                          metadata  [required]
                          The sample metadata.
  --p-custom-axis TEXT    [optional]
                          A sample metadata category containing
                          continuous values that should be included as an axis
                          in the Emperor plot.
  --o-visualization PATH  Artifact: Visualization  [required if not passing
                          --output-dir]
  --output-dir DIRECTORY  Output unspecified results to a directory
  --cmd-config PATH       Use config file for command options
  --verbo

In [None]:
# mf = load_mf('keyboard/mapping-file.txt')
# res = OrdinationResults.read('keyboard/unweighted-unifrac.even1000.txt')
# x = Emperor(res, mf, remote=True)
# x

We first test for associations between discrete metadata categories and **alpha diversity**.
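To our understanding, alpha-group-significance compares groups with a Kruskal-Wallis test. The sketch below computes only the H statistic, using average ranks for ties and no tie correction; the `kruskal_h` helper and the diversity values are hypothetical, and the QIIME 2 visualization additionally reports p-values and pairwise comparisons.

```python
def kruskal_h(*groups):
    """Kruskal-Wallis H statistic (average ranks for ties, no tie correction)."""
    pooled = sorted(value for group in groups for value in group)
    n_total = len(pooled)

    # Assign each distinct value the average of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j

    # Sum over groups of (rank sum squared / group size).
    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group)
        for group in groups
    )
    return 12.0 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)


# Made-up alpha diversity values for two metadata groups.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
h_stat = kruskal_h(group_a, group_b)
```

Because the test works on ranks, it does not assume that the diversity values are normally distributed; a larger H means the groups' ranks are more clearly separated.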

In [45]:
! qiime diversity alpha-group-significance \
  --i-alpha-diversity ../core-metrics-2k-results/faith_pd_vector.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --o-visualization ../core-metrics-2k-results/faith-pd-group-significance.qzv

Saved Visualization to: ../core-metrics-2k-results/faith-pd-group-significance.qzv


In [46]:
! qiime diversity alpha-group-significance \
  --i-alpha-diversity ../core-metrics-2k-results/evenness_vector.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --o-visualization ../core-metrics-2k-results/evenness-group-significance.qzv

Saved Visualization to: ../core-metrics-2k-results/evenness-group-significance.qzv


We can visualize how samples cluster by turning the **beta diversity** PCoA results into QIIME 2 visualizations.
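The ordinations plotted here come in an unweighted (presence/absence) and a weighted (abundance-aware) flavour. As a loose analogy that leaves the phylogeny out, the sketch below contrasts the two non-phylogenetic metrics that core-metrics also reported above: Jaccard, which only looks at which features are present, and Bray-Curtis, which uses their counts. The count vectors are invented, and UniFrac itself additionally uses the rooted tree.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors (abundance-aware)."""
    differences = sum(abs(a - b) for a, b in zip(u, v))
    totals = sum(a + b for a, b in zip(u, v))
    return differences / totals


def jaccard(u, v):
    """Jaccard distance on presence/absence: 1 - shared / all observed features."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    return 1 - len(present_u & present_v) / len(present_u | present_v)


# Hypothetical rarefied counts for three samples over four features.
s1 = [10, 0, 5, 5]
s2 = [5, 5, 0, 10]
s3 = [1, 0, 1, 18]  # same features as s1, very different abundances

same_membership = jaccard(s1, s3)
different_abundance = bray_curtis(s1, s3)
```

Samples `s1` and `s3` contain the same features, so a presence/absence metric sees them as identical while an abundance-aware metric does not; this is the kind of difference to keep in mind when comparing the unweighted and weighted plots.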

In [36]:
! qiime emperor plot --i-pcoa ../core-metrics-2k-results/unweighted_unifrac_pcoa_results.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --o-visualization ../core-metrics-2k-results/unweighted_unifrac_pcoa_results.qzv \
  --output-dir ../core-metrics-2k-results/unweighted_vis_dump

Saved Visualization to: ../core-metrics-2k-results/unweighted_unifrac_pcoa_results.qzv


In [38]:
! qiime emperor plot --i-pcoa ../core-metrics-2k-results/unweighted_unifrac_pcoa_results.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --p-custom-axis age \
  --o-visualization ../core-metrics-2k-results/unweighted_unifrac_pcoa_results_age.qzv \
  --output-dir ../core-metrics-2k-results/unweighted_age_vis_dump

Saved Visualization to: ../core-metrics-2k-results/unweighted_unifrac_pcoa_results_age.qzv


In [40]:
! qiime emperor plot --i-pcoa ../core-metrics-2k-results/weighted_unifrac_pcoa_results.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --o-visualization ../core-metrics-2k-results/weighted_unifrac_pcoa_results.qzv \
  --output-dir ../core-metrics-2k-results/weighted_vis_dump

Saved Visualization to: ../core-metrics-2k-results/weighted_unifrac_pcoa_results.qzv


In [41]:
! qiime emperor plot --i-pcoa ../core-metrics-2k-results/weighted_unifrac_pcoa_results.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --p-custom-axis age \
  --o-visualization ../core-metrics-2k-results/weighted_unifrac_pcoa_results_age.qzv \
  --output-dir ../core-metrics-2k-results/weighted_age_vis_dump

Saved Visualization to: ../core-metrics-2k-results/weighted_unifrac_pcoa_results_age.qzv


Next, we use the beta-group-significance command to test whether sample composition differs between discrete metadata groups with PERMANOVA (first described in Anderson (2001)).
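For intuition, PERMANOVA compares the observed pseudo-F statistic (between-group versus within-group variation in the distance matrix) against the values obtained after shuffling the group labels. The following is a minimal sketch under that description, with hypothetical helper names and a toy distance matrix; it is not the code that QIIME 2 runs.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """Pseudo-F from a square distance matrix and one group label per sample."""
    n = len(labels)
    groups = {}
    for idx, label in enumerate(labels):
        groups.setdefault(label, []).append(idx)
    n_groups = len(groups)
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_among = ss_total - ss_within
    return (ss_among / (n_groups - 1)) / (ss_within / (n - n_groups))


def permanova(dist, labels, permutations=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: six samples placed on a line, forming two tight clusters.
positions = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in positions] for a in positions]
labels = ["a", "a", "a", "b", "b", "b"]
f_stat, p_value = permanova(dist, labels)
```

With only three samples per group, just 2 of the 20 possible label assignments reproduce the observed split, so the p-value cannot fall much below 0.1 however well the groups separate. This is worth remembering when a category such as cage number has very few samples per group.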

In [51]:
! qiime diversity beta-group-significance \
  --i-distance-matrix ../core-metrics-2k-results/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --m-metadata-category exposure_type \
  --o-visualization ../core-metrics-2k-results/unweighted-unifrac-exposure-group-significance.qzv \
  --p-pairwise

Saved Visualization to: ../core-metrics-2k-results/unweighted-unifrac-exposure-group-significance.qzv


In [52]:
! qiime diversity beta-group-significance \
  --i-distance-matrix ../core-metrics-2k-results/unweighted_unifrac_distance_matrix.qza \
  --m-metadata-file ../haddad_6week_metadata_sampMatch.txt \
  --m-metadata-category cage_number \
  --o-visualization ../core-metrics-2k-results/unweighted-unifrac-cage-group-significance.qzv \
  --p-pairwise

Saved Visualization to: ../core-metrics-2k-results/unweighted-unifrac-cage-group-significance.qzv


We also save a rarefied version of the BIOM table for downstream use (matching samples, etc.).

In [1]:
! qiime feature-table rarefy --help

Usage: qiime feature-table rarefy [OPTIONS]

  Subsample frequencies from all samples without replacement so that the sum
  of frequencies in each sample is equal to sampling-depth.

Options:
  --i-table PATH              Artifact: FeatureTable[Frequency]  [required]
                              The feature table to be rarefied.
  --p-sampling-depth INTEGER  [required]
                              The total frequency that each sample
                              should be rarefied to. Samples where the sum of
                              frequencies is less than the sampling depth will
                              be not be included in the resulting table.
  --o-rarefied-table PATH     Artifact: FeatureTable[Frequency]  [required if
                              not passing --output-dir]
                              The resulting rarefied
                              feature table.
  --output-dir DIRECTORY      Output unspecified results to a directory
  --cmd-

In [2]:
! qiime feature-table rarefy --i-table ../deblur_table_sotus_ucase_hdf5.qza --p-sampling-depth 2000 \
  --o-rarefied-table ../deblur_table_sotus_ucase_hdf5_rare2k.qza

Saved FeatureTable[Frequency] to: ../deblur_table_sotus_ucase_hdf5_rare2k.qza


**Exporting below**

Based on visual inspection, the PCoA of unweighted UniFrac distances shows meaningful clustering, so we export that distance matrix.

In [2]:
! qiime tools export \
  ../core-metrics-2k-results/unweighted_unifrac_distance_matrix.qza \
  --output-dir ../core-metrics-2k-results/unweighted_unifrac_distance_matrix.txt
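
Once exported, the distance matrix can be read back in Python. The sketch below assumes the export writes a square, tab-separated file with sample IDs in the header row and first column, named `distance-matrix.tsv` inside the output directory; both the layout and the file name are assumptions to verify against the actual export. The `read_distance_matrix` helper and the inline example data are hypothetical.

```python
import csv
import io


def read_distance_matrix(handle):
    """Parse a square, tab-separated distance matrix into {id: {id: distance}}."""
    rows = csv.reader(handle, delimiter="\t")
    header = next(rows)[1:]  # the first cell of the header row is empty
    matrix = {}
    for row in rows:
        if not row:
            continue
        matrix[row[0]] = {col: float(val) for col, val in zip(header, row[1:])}
    return matrix


# Inline stand-in for the exported file; sample ids and values are made up.
example = (
    "\ts1\ts2\ts3\n"
    "s1\t0.0\t0.4\t0.7\n"
    "s2\t0.4\t0.0\t0.5\n"
    "s3\t0.7\t0.5\t0.0\n"
)
dm = read_distance_matrix(io.StringIO(example))

# With the real export, the call would look something like this (path assumed):
# path = "../core-metrics-2k-results/unweighted_unifrac_distance_matrix.txt/distance-matrix.tsv"
# with open(path) as fh:
#     dm = read_distance_matrix(fh)
```

Note that the `--output-dir` given above ends in `.txt` but is created as a directory, so the matrix file sits inside it.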

Next, we export the rarefied BIOM table.

In [4]:
! qiime tools export \
  ../deblur_table_sotus_ucase_hdf5_rare2k.qza \
  --output-dir ../deblur_table_sotus_ucase_hdf5_rare2k

In [None]:
! cp ../deblur_table_sotus_ucase_hdf5_rare2k/feature-table.biom ../deblur_table_sotus_ucase_hdf5_rare2k.biom

In [None]:
! biom convert -i ../deblur_table_sotus_ucase_hdf5_rare2k.biom \
  -o ../deblur_table_sotus_ucase_hdf5_rare2k.txt --to-tsv  # output path and TSV format are assumed