In [1]:
import os
import qiime2 as q2
import pandas as pd
from qiime2 import Visualization

data_dir = 'poop_data'
# create the data directory if it does not exist yet
os.makedirs(data_dir, exist_ok=True)
    
# do not increase this value!
n_jobs = 3
    
%matplotlib inline

<a id='function'></a>
## Functional redundancy: GEN_sex

In [16]:
! qiime feature-table filter-samples \
    --i-table $data_dir/Differential_abundance/table_abund_species.qza \
    --m-metadata-file $data_dir/metadata.tsv \
    --p-where "[GEN_sex]='female'" \
    --o-filtered-table $data_dir/Functional/pathway_abundance_female.qza

Saved FeatureTable[Frequency] to: poop_data/Functional/pathway_abundance_female.qza
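The `--p-where` expression above uses SQLite WHERE-clause syntax, with the column name quoted in square brackets. As a minimal sketch of how such a clause selects samples, the same expression can be run with Python's built-in `sqlite3` module. The table name `metadata`, the sample IDs, and the rows are made up for illustration and are not taken from `metadata.tsv`.

```python
import sqlite3

# Hypothetical metadata rows (sample ID, GEN_sex); not taken from metadata.tsv.
rows = [("S1", "female"), ("S2", "male"), ("S3", "female"), ("S4", None)]

con = sqlite3.connect(":memory:")
con.execute('CREATE TABLE metadata ("sample-id" TEXT PRIMARY KEY, "GEN_sex" TEXT)')
con.executemany("INSERT INTO metadata VALUES (?, ?)", rows)

# Same expression as the one passed to --p-where in the cell above.
where = "[GEN_sex]='female'"
query = f'SELECT "sample-id" FROM metadata WHERE {where} ORDER BY "sample-id"'
kept_ids = [row[0] for row in con.execute(query)]
con.close()

print(kept_ids)  # ['S1', 'S3']
```

The sample with a missing `GEN_sex` value (`S4`) is not matched, because a comparison against NULL is never true in SQL.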

In [18]:
! qiime diversity core-metrics \
  --i-table $data_dir/Functional/pathway_abundance_female.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --p-sampling-depth 10000 \
  --p-n-jobs $n_jobs \
  --output-dir $data_dir/Functional/core-metrics-picrust2-GEN_sex

Saved FeatureTable[Frequency] to: poop_data/Functional/core-metrics-picrust2-GEN_sex/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-picrust2-GEN_sex/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-picrust2-GEN_sex/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-picrust2-GEN_sex/evenness_vector.qza
Saved DistanceMatrix to: poop_data/Functional/core-metrics-picrust2-GEN_sex/jaccard_distance_matrix.qza
Saved DistanceMatrix to: poop_data/Functional/core-metrics-picrust2-GEN_sex/bray_curtis_distance_matrix.qza
Saved PCoAResults to: poop_data/Functional/core-metrics-picrust2-GEN_sex/jaccard_pcoa_results.qza
Saved PCoAResults to: poop_data/Functional/core-metrics-picrust2-GEN_sex/bray_curtis_pcoa_results.qza
Saved Visualization to: poop_data/Functional/core-metrics-picrust2-GEN_sex/

In [19]:
! qiime diversity adonis \
  --i-distance-matrix $data_dir/Functional/core-metrics-picrust2-GEN_sex/jaccard_distance_matrix.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --p-formula 'HEA_ibd*HEA_cdiff*HEA_antibiotic_history*HEA_bowel_movement_quality' \
  --o-visualization $data_dir/Functional/core-metrics-picrust2-GEN_sex/adonis-results.qzv

Saved Visualization to: poop_data/Functional/core-metrics-picrust2-GEN_sex/adonis-results.qzv

In [20]:
Visualization.load(f'{data_dir}/Functional/core-metrics-picrust2-GEN_sex/jaccard_emperor.qzv')
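The Emperor plot loaded above is an ordination of Jaccard distances, one of the two distance matrices saved by `core-metrics` (the other is Bray-Curtis). As a rough sketch of what these two measures capture, the functions below compute them for a pair of samples. The count vectors are hypothetical, and QIIME 2 computes these distances through its own libraries rather than with this code.

```python
def jaccard_distance(a, b):
    """1 - (features present in both samples) / (features present in either)."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # convention chosen here for two empty samples
    return 1.0 - len(present_a & present_b) / len(union)


def bray_curtis_distance(a, b):
    """Sum of absolute count differences divided by the total count."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # convention chosen here for two empty samples
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Hypothetical counts for four features in two samples.
sample_1 = [10, 0, 5, 5]
sample_2 = [0, 0, 15, 5]

jaccard = jaccard_distance(sample_1, sample_2)
bray_curtis = bray_curtis_distance(sample_1, sample_2)
print(jaccard, bray_curtis)
```

Jaccard only looks at presence or absence, so it is insensitive to how abundant a shared feature is, while Bray-Curtis weights features by their counts.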

In [21]:
Visualization.load(f'{data_dir}/Functional/core-metrics-picrust2-GEN_sex/adonis-results.qzv')

<a id='function'></a>
## Functional redundancy: GEN_bmi_cat

In [23]:
! qiime feature-table filter-samples \
    --i-table $data_dir/Differential_abundance/table_abund_species.qza \
    --m-metadata-file $data_dir/metadata.tsv \
    --p-where "[GEN_bmi_cat]='Overweight'" \
    --o-filtered-table $data_dir/Functional/pathway_abundance_Overweight.qza

Saved FeatureTable[Frequency] to: poop_data/Functional/pathway_abundance_Overweight.qza

In [25]:
! qiime diversity core-metrics \
  --i-table $data_dir/Functional/pathway_abundance_Overweight.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --p-sampling-depth 10000 \
  --p-n-jobs $n_jobs \
  --output-dir $data_dir/Functional/core-metrics-picrust2-GEN_bmi_cat

Saved FeatureTable[Frequency] to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/evenness_vector.qza
Saved DistanceMatrix to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/jaccard_distance_matrix.qza
Saved DistanceMatrix to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/bray_curtis_distance_matrix.qza
Saved PCoAResults to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/jaccard_pcoa_results.qza
Saved PCoAResults to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/bray_curtis_pcoa_results.qza
Saved Visualization to: poop_data/Functiona

In [26]:
! qiime diversity adonis \
  --i-distance-matrix $data_dir/Functional/core-metrics-picrust2-GEN_bmi_cat/jaccard_distance_matrix.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --p-formula 'HEA_ibd*HEA_cdiff*HEA_antibiotic_history*HEA_bowel_movement_quality' \
  --o-visualization $data_dir/Functional/core-metrics-picrust2-GEN_bmi_cat/adonis-results.qzv

Saved Visualization to: poop_data/Functional/core-metrics-picrust2-GEN_bmi_cat/adonis-results.qzv

In [27]:
Visualization.load(f'{data_dir}/Functional/core-metrics-picrust2-GEN_bmi_cat/jaccard_emperor.qzv')

In [28]:
Visualization.load(f'{data_dir}/Functional/core-metrics-picrust2-GEN_bmi_cat/adonis-results.qzv')

<a id='longitudinal'></a>
## Longitudinal resilience analysis: GEN_sex

In [31]:
! qiime diversity core-metrics \
  --i-table $data_dir/Differential_abundance/table_abund_F_M_spec.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --p-sampling-depth 1800 \
  --p-n-jobs $n_jobs \
  --output-dir $data_dir/Functional/core-metrics-Gen_sex-with-male

Saved FeatureTable[Frequency] to: poop_data/Functional/core-metrics-Gen_sex-with-male/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-Gen_sex-with-male/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-Gen_sex-with-male/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: poop_data/Functional/core-metrics-Gen_sex-with-male/evenness_vector.qza
Saved DistanceMatrix to: poop_data/Functional/core-metrics-Gen_sex-with-male/jaccard_distance_matrix.qza
Saved DistanceMatrix to: poop_data/Functional/core-metrics-Gen_sex-with-male/bray_curtis_distance_matrix.qza
Saved PCoAResults to: poop_data/Functional/core-metrics-Gen_sex-with-male/jaccard_pcoa_results.qza
Saved PCoAResults to: poop_data/Functional/core-metrics-Gen_sex-with-male/bray_curtis_pcoa_results.qza
Saved Visualization to: poop_data/Functional/core-metrics-Gen_sex-w

In [9]:
# Creates an interactive line plot, useful for looking at changes in alpha and beta diversity across time
! qiime longitudinal volatility \
  --m-metadata-file $data_dir/metadata.tsv \
  --m-metadata-file $data_dir/Functional/core-metrics-Gen_sex-with-male/observed_features_vector.qza \
  --m-metadata-file $data_dir/Functional/core-metrics-Gen_sex-with-male/shannon_vector.qza \
  --m-metadata-file $data_dir/Functional/core-metrics-Gen_sex-with-male/jaccard_pcoa_results.qza \
  --p-default-group-column 'diet' \
  --p-default-metric 'observed_features' \
  --p-state-column 'month' \
  --p-individual-id-column 'host_subject_id' \
  --o-visualization $data_dir/core-metrics-with-mothers/volatility.qzv

Saved FeatureTable[Frequency] to: w10_data/filtered-table-deblur-with-mothers.qza
Usage: qiime diversity core-metrics [OPTIONS]

  Applies a collection of diversity metrics (non-phylogenetic) to a feature
  table.

Inputs:
  --i-table ARTIFACT FeatureTable[Frequency]
                       The feature table containing the samples over which
                       diversity metrics should be computed.        [required]
Parameters:
  --p-sampling-depth INTEGER
    Range(1, None)     The total frequency that each sample should be
                       rarefied to prior to computing diversity metrics.
                                                                    [required]
  --m-metadata-file METADATA...
    (multiple          The sample metadata to use in the emperor plots.
     arguments will    
     be merged)                                                     [
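
The help text above describes `--p-sampling-depth` as the total frequency each sample is rarefied to before diversity metrics are computed. A minimal sketch of that idea, assuming subsampling without replacement and that samples with fewer reads than the depth are dropped, is shown below. The helper `rarefy` and the feature counts are hypothetical and are not QIIME 2's implementation.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample a {feature: count} dict to `depth` reads without replacement.

    Returns None when the sample holds fewer than `depth` reads, which stands
    in for dropping that sample from the table.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # One list entry per read, labelled with the feature it belongs to.
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for feature in drawn:
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


# Hypothetical samples: one deep enough for depth=5, one too shallow.
deep_sample = {"feat_a": 6, "feat_b": 3, "feat_c": 1}
shallow_sample = {"feat_a": 2, "feat_b": 1}

rarefied_deep = rarefy(deep_sample, depth=5)
rarefied_shallow = rarefy(shallow_sample, depth=5)
print(rarefied_deep, rarefied_shallow)
```

This is why the choice of depth is a trade-off: a higher depth keeps more reads per sample but discards more samples.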

<a id='lme'></a>
## 4.1 Applying a statistical test to longitudinal data

We will probably not get to this in class, but this section will show you how to apply a statistical test to quantitatively answer some of the questions posed above regarding longitudinal variation and resilience. You only need to run the code below and examine the results; there are no questions for you to answer.

From our analysis above, it looks like there is an initial "disruption" in the composition of the microbiota following birth (i.e., the microbiota of infants are very dissimilar to their mothers' during an initial chaotic period), and an eventual "return" to normalcy (i.e., the microbiota form a stable community that better resembles an adult gut). However, the rate of return differs between groups, which we can view here as an indication that some groups are more "resilient" than others (or, more properly, that they develop an adult-like microbiome and stabilize more quickly).

Here we will use a [linear mixed effects model](https://en.wikipedia.org/wiki/Mixed_model) as a statistical test to compare individual infants' developmental trajectories as a measure of resilience. We will examine developmental trajectories in infants only, not in their mothers, so we will use our initial `core-metrics` results. We specify a formula consisting of a dependent variable (here `shannon_entropy`, i.e., Shannon diversity) and several independent variables and their interactions (`month*diet*delivery`) to test their association with variation in the dependent variable. We also specify random effects to account for individual variation in baseline and slope: a random intercept for each individual is included by default, and we specify `month` to include a random slope for each individual.

You can read more about the QIIME 2 implementation of this test, and its interpretation, [here](https://docs.qiime2.org/2022.8/tutorials/longitudinal/#linear-mixed-effect-models).
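
In formula notation, the `*` operator in `month*diet*delivery` is shorthand for all main effects plus all of their interactions. The small sketch below spells out that expansion; `expand_star` is a hypothetical helper written for illustration and is not part of QIIME 2 or of any formula library.

```python
from itertools import combinations


def expand_star(formula_rhs):
    """List the main effects and interactions implied by 'a*b*c'."""
    variables = [v.strip() for v in formula_rhs.split("*")]
    terms = []
    # order 1 = main effects, order 2 = two-way interactions, and so on
    for order in range(1, len(variables) + 1):
        for combo in combinations(variables, order):
            terms.append(":".join(combo))
    return terms


terms = expand_star("month*diet*delivery")
print(terms)
# ['month', 'diet', 'delivery', 'month:diet', 'month:delivery',
#  'diet:delivery', 'month:diet:delivery']
```

For categorical variables, the fitted model reports these terms per non-reference level, which is why rows such as `delivery[T.Vaginal]` and `month:delivery[T.Vaginal]` appear in the results.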

Run the code below. The results should indicate that delivery mode has a significant impact on Shannon diversity (see the `delivery[T.Vaginal]` row in the model results section), and that there is a significant interaction between delivery mode and age on Shannon diversity (see the `month:delivery[T.Vaginal]` row in the model results).


In [38]:
! qiime longitudinal linear-mixed-effects \
  --m-metadata-file $data_dir/metadata.tsv \
  --m-metadata-file $data_dir/Diversity/core-metrics-results/shannon_vector.qza \
  --p-random-effects 'month' \
  --p-formula 'shannon_entropy~month*diet*delivery' \
  --p-state-column 'month' \
  --p-individual-id-column 'host_subject_id' \
  --o-visualization $data_dir/core-metrics-results/lme-shannon.qzv

Saved Visualization to: w10_data/core-metrics-results/lme-shannon.qzv

In [18]:
Visualization.load(f'{data_dir}/core-metrics-results/lme-shannon.qzv')
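
The model above uses `shannon_entropy` as its dependent variable. As a rough sketch of what that metric and the related `core-metrics` alpha-diversity outputs (observed features, evenness) measure, the functions below compute them from a vector of feature counts. The counts are hypothetical, the base-2 logarithm is an assumption made for this example, and QIIME 2's own implementation may differ in such details.

```python
import math


def observed_features(counts):
    """Number of features with a non-zero count (richness)."""
    return sum(1 for c in counts if c > 0)


def shannon_entropy(counts, base=2):
    """Shannon entropy of the relative abundances (log base is an assumption)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def pielou_evenness(counts):
    """Shannon entropy divided by its maximum for the observed richness."""
    richness = observed_features(counts)
    if richness < 2:
        return float("nan")  # undefined for fewer than two features
    return shannon_entropy(counts, base=math.e) / math.log(richness)


# Hypothetical samples with the same richness but different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, observed_features(sample),
          round(shannon_entropy(sample), 3),
          round(pielou_evenness(sample), 3))
```

Both samples contain four features, but the skewed one has a much lower Shannon entropy because a single feature dominates it.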