**Functional redundancy and resilience**

This notebook examines functional redundancy and resilience in the context of microbial biodiversity measurements:
1. Compare taxonomic and functional diversity in the same samples to evaluate _functional redundancy_ in microbial communities.
2. Examine longitudinal change in these biodiversity metrics to measure _resilience_ in microbial communities.

- Metadata : sample_meta_data (done in A)
- Diversity analysis : alpha = core-metrics-results (done in F), beta = core-metrics-results-bd (done in G)
- Metagenome content predicted by PICRUSt2 : (done in I)


<a id='setup'></a>
## 0. Setup

In [None]:
import os
import qiime2 as q2
import pandas as pd
from qiime2 import Visualization

data_dir = 'w10_data'
os.makedirs(data_dir, exist_ok=True)

# do not increase this value!
n_jobs = 3

%matplotlib inline

In [None]:
data_dir = 'project_data'

In [None]:
md = q2.Metadata.load(data_dir + '/sample_meta_data.tsv').to_dataframe()
pd.DataFrame([str(sorted(md[col].astype(str).unique())) for col in md.columns],
             index=pd.Index(md.columns, name='Column'), columns=['Values'])

<a id='function'></a>
## 1.a Functional redundancy calculation

Next we will look at predicted gene pathway information to compare taxonomic and functional diversity patterns.
We will run the `core-metrics` pipeline on the `pathway_abundance.qza` table, which contains PICRUSt2-predicted gene pathway counts.
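
To make the idea of functional redundancy concrete, here is a minimal sketch that compares taxonomic and functional dissimilarity for two samples using the Jaccard distance on presence/absence data. The sample names, ASV labels, and pathway labels are hypothetical toy values, not project data, and the helper `jaccard_distance` is written for illustration only (it is not the QIIME 2 implementation).

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets of observed features (presence/absence)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty samples are treated as identical
    return 1.0 - len(a & b) / len(union)

# Hypothetical toy data: features observed in each sample.
asvs = {
    "sample_1": {"ASV_a", "ASV_b", "ASV_c"},
    "sample_2": {"ASV_d", "ASV_e", "ASV_f"},
}
pathways = {
    "sample_1": {"PWY_1", "PWY_2", "PWY_3", "PWY_4"},
    "sample_2": {"PWY_1", "PWY_2", "PWY_3", "PWY_5"},
}

taxonomic = jaccard_distance(asvs["sample_1"], asvs["sample_2"])
functional = jaccard_distance(pathways["sample_1"], pathways["sample_2"])
print(f"taxonomic Jaccard distance:  {taxonomic:.2f}")
print(f"functional Jaccard distance: {functional:.2f}")
```

In this toy case the two samples share no ASVs but most of their pathways, so the taxonomic distance is much larger than the functional one. That gap is the pattern we would read as functional redundancy.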

In [None]:
# Open question: would it be useful to subset the table by a categorical variable (age?),
# as we did in the mother/child exercise?
# A comment cannot follow a line-continuation backslash, so the notes are collected here:
#   --i-table : adjust to the table of interest
#   --p-where : adjust to the variable of interest
! qiime feature-table filter-samples \
    --i-table $data_dir/pathway_abundance.qza \
    --m-metadata-file $data_dir/sample_meta_data.tsv \
    --p-where "[mom_or_child]='C'" \
    --o-filtered-table $data_dir/pathway_abundance_sorted.qza

# TODO: confirm the output directory name
! qiime diversity core-metrics \
  --i-table $data_dir/pathway_abundance_sorted.qza \
  --m-metadata-file $data_dir/sample_meta_data.tsv \
  --p-sampling-depth 100000 \
  --p-n-jobs $n_jobs \
  --output-dir $data_dir/core-metrics-picrust2

# TODO: adjust the formula (should we keep the same long one?)
! qiime diversity adonis \
  --i-distance-matrix $data_dir/core-metrics-picrust2/jaccard_distance_matrix.qza \
  --m-metadata-file $data_dir/sample_meta_data.tsv \
  --p-formula 'month*diet*delivery' \
  --o-visualization $data_dir/core-metrics-picrust2/adonis-results.qzv

In [None]:
Visualization.load(f'{data_dir}/core-metrics-picrust2/bray_curtis_emperor.qzv')

In [None]:
Visualization.load(f'{data_dir}/core-metrics-picrust2/adonis-results.qzv')

<a id='procrustes'></a>
## 1.b Comparing ordinations

One way to compare beta diversity ordination results directly is with [Procrustes analysis](https://en.wikipedia.org/wiki/Procrustes_analysis). This method rotates and scales two ordinations to align them as closely as possible. We can then view the transformed PCoA coordinates together in a single plot to compare the ordinations visually.
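
The rotate-and-scale step can be illustrated with a small, self-contained sketch. It is a simplified 2-D version written for this notebook, not the code QIIME 2 runs: it assumes exactly two axes, allows rotation and scaling but no reflection, and assumes the points are not all identical. The coordinates are hypothetical toy values. A full implementation would handle any number of axes and typically solves the rotation with a singular value decomposition.

```python
import math

def standardize(points):
    """Center a list of (x, y) points and scale them to unit total norm."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    norm = math.sqrt(sum(x * x + y * y for x, y in centered))
    return [(x / norm, y / norm) for x, y in centered]

def procrustes_2d(reference, other):
    """Align `other` to `reference` by rotation and scaling (2-D, no reflection).

    Returns the standardized reference, the transformed other, and the
    disparity (sum of squared differences after alignment).
    """
    ref = standardize(reference)
    oth = standardize(other)
    # The best rotation angle maximizes a*cos(theta) + b*sin(theta).
    a = sum(rx * ox + ry * oy for (rx, ry), (ox, oy) in zip(ref, oth))
    b = sum(ry * ox - rx * oy for (rx, ry), (ox, oy) in zip(ref, oth))
    theta = math.atan2(b, a)
    scale = math.hypot(a, b)  # optimal scaling once both sets have unit norm
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    aligned = [
        (scale * (ox * cos_t - oy * sin_t), scale * (ox * sin_t + oy * cos_t))
        for ox, oy in oth
    ]
    disparity = sum(
        (rx - ax) ** 2 + (ry - ay) ** 2 for (rx, ry), (ax, ay) in zip(ref, aligned)
    )
    return ref, aligned, disparity

# Hypothetical ordinations: `other` is `reference` rotated by 90 degrees,
# scaled by 3 and shifted, so the two should align almost perfectly.
reference = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
other = [(5.0, -2.0), (5.0, 1.0), (2.0, 1.0), (-1.0, -2.0)]
ref_std, other_aligned, disparity = procrustes_2d(reference, other)
print(f"disparity: {disparity:.6f}")
```

A disparity near 0 means the two ordinations have essentially the same shape. Values closer to 1 mean the sample arrangement differs between the two ordinations.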



In [None]:
# TODO: we will probably not use Jaccard; change the distance matrix for both inputs
! qiime diversity procrustes-analysis \
  --i-reference $data_dir/core-metrics-results/jaccard_pcoa_results.qza \
  --i-other $data_dir/core-metrics-picrust2/jaccard_pcoa_results.qza \
  --output-dir $data_dir/core-metrics-picrust2/procrustes/

! qiime emperor procrustes-plot \
  --i-reference-pcoa $data_dir/core-metrics-picrust2/procrustes/transformed_reference.qza \
  --i-other-pcoa $data_dir/core-metrics-picrust2/procrustes/transformed_other.qza \
  --m-metadata-file $data_dir/sample_meta_data.tsv \
  --o-visualization $data_dir/core-metrics-picrust2/procrustes-pcoa-plot.qzv

In [None]:
Visualization.load(f'{data_dir}/core-metrics-picrust2/procrustes-pcoa-plot.qzv')

# TODO: describe how good the fit looks, and whether it indicates
# that ASV and pathway abundances are similar or dissimilar

<a id='longitudinal'></a>
## 2.a Longitudinal resilience analysis

Now let's examine resilience: the ability of an ecosystem to recover from disturbance, e.g., by repopulation of the species and ecosystem functions necessary in that system.

We will use the `q2-longitudinal` plugin to examine temporal dynamics, in relation to XXXXX. The XXXXX microbiota serve as a baseline of stabilized adult microbiota to which the XXXXX microbiota can be compared (to measure resilience as the rate of (re-)stabilization).
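
One simple way to think about the rate of stabilization is to look at how much a diversity metric changes between consecutive time points for a single subject. The sketch below computes these first differences in plain Python. The Shannon values and months are hypothetical toy numbers, and `first_differences` is an illustrative helper rather than part of `q2-longitudinal` (the plugin offers its own `first-differences` action for real data).

```python
def first_differences(values_by_state):
    """Change in a metric between consecutive states, keyed by the later state."""
    states = sorted(values_by_state)
    return {
        later: values_by_state[later] - values_by_state[earlier]
        for earlier, later in zip(states, states[1:])
    }

# Hypothetical Shannon entropy values for one subject, keyed by month.
shannon_by_month = {0: 1.0, 1: 2.5, 2: 2.0, 4: 2.25, 6: 2.25}
changes = first_differences(shannon_by_month)

for month, delta in changes.items():
    print(f"month {month}: change of {delta:+.2f}")
```

Changes that shrink toward zero over time suggest that the community is stabilizing. Comparing how quickly this happens between groups is one informal reading of resilience.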


In [None]:
# TODO (where?): confirm the input table location and whether a --p-where filter is needed
! qiime feature-table filter-samples \
    --i-table $data_dir/filtered-table-deblur.qza \
    --m-metadata-file $data_dir/sample_meta_data.tsv \
    --o-filtered-table $data_dir/filtered-table-deblur-with-mothers.qza

# Repeat our core-metrics diversity analysis from before, this time including the other sample category
! qiime diversity core-metrics \
  --i-table $data_dir/filtered-table-deblur-with-mothers.qza \
  --m-metadata-file $data_dir/sample_meta_data.tsv \
  --p-sampling-depth 1800 \
  --p-n-jobs $n_jobs \
  --output-dir $data_dir/core-metrics-with-mothers

# This creates an interactive line plot, useful for looking at changes in alpha and beta diversity across time
! qiime longitudinal volatility \
  --m-metadata-file $data_dir/sample_meta_data.tsv \
  --m-metadata-file $data_dir/core-metrics-with-mothers/observed_features_vector.qza \
  --m-metadata-file $data_dir/core-metrics-with-mothers/shannon_vector.qza \
  --m-metadata-file $data_dir/core-metrics-with-mothers/jaccard_pcoa_results.qza \
  --p-default-group-column 'diet' \
  --p-default-metric 'observed_features' \
  --p-state-column 'month' \
  --p-individual-id-column 'host_subject_id' \
  --o-visualization $data_dir/core-metrics-with-mothers/volatility.qzv

In [None]:
Visualization.load(f'{data_dir}/core-metrics-with-mothers/volatility.qzv')

<a id='lme'></a>
## 2.b Applying a statistical test to longitudinal data

Note: adjust this text for our data if this section is used (not yet decided).
From our analysis above, it looks like there is an initial "disruption" in the composition of the microbiota following birth (i.e., the microbiota of infants are very dissimilar to their mothers during an initial chaotic period), and an eventual "return" to normalcy (i.e., the microbiota form a stable community that better resembles an adult gut). However, the rate of return differs between some groups, which we can view here as an indication that some groups are more "resilient" than others (or more properly they develop an adult-like microbiome and stabilize more quickly). 

Here we will use a [linear mixed effects model](https://en.wikipedia.org/wiki/Mixed_model) as a statistical test to examine individual infants' trajectories of development as a comparison of resilience. We will examine developmental trajectories in infants only, not in their mothers, so we will use our initial `core-metrics` results. We specify a formula consisting of a dependent variable (here `shannon_entropy`, i.e., Shannon diversity) and several independent variables and their interactions (`month*diet*delivery`) to test their association with variation in the dependent variable. We also specify random effects to account for individual variation in baseline and slope: a random intercept for each individual is included by default, and we specify `month` to include a random slope for each individual.

The results should indicate that delivery mode has a significant impact on Shannon diversity (see the `delivery[T.Vaginal]` row in the model results section), and that there is a significant interaction between delivery mode and age on Shannon diversity (see the `month:delivery[T.Vaginal]` row in the model results).
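
For reference, the model for the formula `shannon_entropy~month*diet*delivery` with a random slope on `month` can be written out roughly as follows. This is a sketch under simplifying assumptions: `diet` and `delivery` are each treated as a two-level categorical variable coded 0/1, and the interaction terms are summarized rather than listed in full.

$$
y_{ij} = \beta_0 + \beta_1\,\mathrm{month}_{ij} + \beta_2\,\mathrm{diet}_{ij} + \beta_3\,\mathrm{delivery}_{i} + (\text{two- and three-way interactions}) + b_{0i} + b_{1i}\,\mathrm{month}_{ij} + \varepsilon_{ij}
$$

Here $y_{ij}$ is the Shannon entropy of individual $i$ at observation $j$, the $\beta$ terms are fixed effects shared by all individuals, $b_{0i}$ and $b_{1i}$ are the random intercept and random slope for individual $i$ (identified by `host_subject_id`), and $\varepsilon_{ij}$ is the residual error. The `delivery[T.Vaginal]` row corresponds to $\beta_3$, and the `month:delivery[T.Vaginal]` row is one of the two-way interaction coefficients.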


In [None]:
! qiime longitudinal linear-mixed-effects \
  --m-metadata-file $data_dir/sample_meta_data.tsv \
  --m-metadata-file $data_dir/core-metrics-results1000/shannon_vector.qza \
  --p-random-effects 'month' \
  --p-formula 'shannon_entropy~month*diet*delivery' \
  --p-state-column 'month' \
  --p-individual-id-column 'host_subject_id' \
  --o-visualization $data_dir/core-metrics-results1000/lme-shannon.qzv

In [None]:
Visualization.load(f'{data_dir}/core-metrics-results1000/lme-shannon.qzv')