#### 04. Alpha Diversity 

Author: Willem Fuetterer


This Jupyter Notebook analyzes the alpha diversity of the samples.

**Exercise overview:**<br>
[1. Setup](#setup)<br>
[2. Identification of correct sampling depth](#depth)<br>
[3. Calculating the alpha diversity](#calc)<br>
[4. Testing the associations between categorical metadata columns and the diversity metric](#categorical)<br>
[5. Testing whether numeric sample metadata columns are correlated with microbial community richness](#numeric)<br>






<a id='setup'></a>

## 1. Setup

In [1]:
# importing all required packages & notebook extensions at the start of the notebook
import os
import biom
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import qiime2 as q2
from qiime2 import Visualization

%matplotlib inline

In [2]:
# defining variables used throughout the notebook

# location of this week's data and all the results produced by this notebook
# - this should be a path relative to your working directory
raw_data_dir = "../data/raw"
data_dir = "../data/processed"
vis_dir  = "../results"

<a id='depth'></a>

## 2. Identification of correct sampling depth

In [4]:
! qiime tools peek $data_dir/table-filtered.qza #-> FeatureTable[Frequency]

UUID:        702565bb-ce3d-472e-acd6-4b914601f892
Type:        FeatureTable[Frequency]
Data format: BIOMV210DirFmt


In [12]:
! qiime tools peek $data_dir/taxonomy.qza #-> FeatureData[Taxonomy]

UUID:        0eabf2a9-d83b-4ef2-950c-d00a600e77fd
Type:        FeatureData[Taxonomy]
Data format: TSVTaxonomyDirectoryFormat


In [7]:
! qiime tools peek $data_dir/fasttree-tree-rooted.qza #-> Phylogeny[Rooted]

UUID:        54dbac30-b904-41cf-bdc2-9ac608bc6561
Type:        Phylogeny[Rooted]
Data format: NewickDirectoryFormat


### Summary of the feature table

In [6]:
! qiime feature-table summarize \
  --i-table $data_dir/table-filtered.qza \
  --m-sample-metadata-file $data_dir/metadata.tsv \
  --o-visualization $data_dir/feature-table-filtered-summary.qzv

Saved Visualization to: ../data/processed/feature-table-filtered-summary.qzv

In [3]:
Visualization.load(f"{data_dir}/feature-table-filtered-summary.qzv")

### Alpha rarefaction

In [11]:
! qiime diversity alpha-rarefaction \
    --i-table $data_dir/table-filtered.qza \
    --i-phylogeny $data_dir/fasttree-tree-rooted.qza \
    --p-max-depth 290000 \
    --m-metadata-file $data_dir/metadata.tsv \
    --o-visualization $data_dir/alpha-rarefaction.qzv

Saved Visualization to: ../data/processed/alpha-rarefaction.qzv

In [4]:
Visualization.load(f"{data_dir}/alpha-rarefaction.qzv")

Based on the alpha rarefaction curves, a sampling depth of 40000 is a good cutoff: it keeps the rarefaction threshold as high as possible while minimizing the number of samples lost to insufficient coverage.
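The trade-off behind this choice can be sketched with a few lines of Python. This is a minimal illustration, not part of the QIIME 2 workflow: the per-sample read counts below are hypothetical placeholders (the real values are the per-sample frequencies shown in the feature-table summary), and `retention_at_depth` is a helper name introduced here.

```python
# Hypothetical per-sample read counts (placeholders, NOT the real data).
sample_counts = {
    "sample_a": 35_000,
    "sample_b": 41_500,
    "sample_c": 58_200,
    "sample_d": 120_000,
    "sample_e": 290_000,
}


def retention_at_depth(counts, depth):
    """Return (samples kept, reads kept) when rarefying every sample to `depth`.

    Samples with fewer than `depth` reads are dropped; every remaining
    sample contributes exactly `depth` reads after subsampling.
    """
    kept = [name for name, total in counts.items() if total >= depth]
    return len(kept), len(kept) * depth


# Compare a few candidate depths side by side.
for depth in (30_000, 40_000, 60_000):
    n_samples, n_reads = retention_at_depth(sample_counts, depth)
    print(f"depth={depth}: {n_samples} samples kept, {n_reads} reads kept")
```

A higher depth keeps more reads per sample but discards every sample below the threshold, which is why the cutoff is chosen where few samples are lost.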

<a id='calc'></a>

## 3. Calculating the alpha diversity

The core diversity metrics are calculated on the feature table rarefied to the chosen sampling depth of 40000.
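For orientation, the non-phylogenetic alpha diversity metrics produced in this step (observed features, Shannon entropy, Pielou's evenness) can be sketched in plain Python for a single sample. This is a simplified illustration, not QIIME 2's implementation; the function names are introduced here, and the logarithm base of 2 is an assumption that can be changed via the `base` parameter.

```python
import math


def observed_features(counts):
    """Number of features with a non-zero count in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_entropy(counts, base=2.0):
    """Shannon entropy H = -sum(p_i * log(p_i)) over features with p_i > 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p, base)
    return h


def pielou_evenness(counts, base=2.0):
    """Pielou's evenness J = H / log(S); undefined (NaN) for fewer than 2 features."""
    s = observed_features(counts)
    if s < 2:
        return float("nan")
    return shannon_entropy(counts, base) / math.log(s, base)


# Hypothetical rarefied counts for two samples with the same richness.
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]

for name, counts in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, observed_features(counts),
          round(shannon_entropy(counts), 3), round(pielou_evenness(counts), 3))
```

Both hypothetical samples contain four features, but the evenly distributed one has the higher Shannon entropy and an evenness of 1.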

In [15]:
! qiime diversity core-metrics-phylogenetic \
  --i-table $data_dir/table-filtered.qza \
  --i-phylogeny $data_dir/fasttree-tree-rooted.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --p-sampling-depth 40000 \
  --output-dir $data_dir/core-metrics-results

Saved FeatureTable[Frequency] to: ../data/processed/core-metrics-results/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: ../data/processed/core-metrics-results/faith_pd_vector.qza
Saved SampleData[AlphaDiversity] to: ../data/processed/core-metrics-results/observed_features_vector.qza
Saved SampleData[AlphaDiversity] to: ../data/processed/core-metrics-results/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: ../data/processed/core-metrics-results/evenness_vector.qza
Saved DistanceMatrix to: ../data/processed/core-metrics-results/unweighted_unifrac_distance_matrix.qza
Saved DistanceMatrix to: ../data/processed/core-metrics-results/weighted_unifrac_distance_matrix.qza
Saved DistanceMatrix to: ../data/processed/core-metrics-results/jaccard_distance_matrix.qza
Saved DistanceMatrix to: ../data/processed/core-metrics-results/bray_curtis_distance_matrix.qza
Saved PCoAResults to: ../data/processe

<a id='categorical'></a>

## 4. Testing the associations between categorical metadata columns and the diversity metric

### Shannon

In [16]:
! qiime diversity alpha-group-significance \
  --i-alpha-diversity $data_dir/core-metrics-results/shannon_vector.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --o-visualization $data_dir/core-metrics-results/shannon-group-significance.qzv

Saved Visualization to: ../data/processed/core-metrics-results/shannon-group-significance.qzv

In [5]:
Visualization.load(f"{data_dir}/core-metrics-results/shannon-group-significance.qzv")
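The group comparison reported by `alpha-group-significance` is a Kruskal-Wallis test, which works on the ranks of the diversity values rather than on the values themselves. The following is a minimal sketch of the H statistic with tie correction, using hypothetical group labels and values; the function names are introduced here and the code is not taken from QIIME 2.

```python
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a dict {group label: [values]}."""
    values = [v for vals in groups.values() for v in vals]
    labels = [g for g, vals in groups.items() for _ in vals]
    n = len(values)
    ranks = average_ranks(values)

    rank_sums = defaultdict(float)
    for label, rank in zip(labels, ranks):
        rank_sums[label] += rank

    h = 12 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / len(groups[g]) for g in groups
    ) - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over runs of tied values.
    tie_counts = defaultdict(int)
    for v in values:
        tie_counts[v] += 1
    correction = 1 - sum(t ** 3 - t for t in tie_counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    return h / correction


# Hypothetical Shannon values for three metadata groups.
example_groups = {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0], "c": [7.0, 8.0, 9.0]}
print(kruskal_wallis_h(example_groups))
```

The p-value is obtained by comparing H against a chi-squared distribution with (number of groups - 1) degrees of freedom, which is omitted here to keep the sketch free of external dependencies.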

In [8]:
! qiime tools export \
    --input-path $data_dir/core-metrics-results/shannon-group-significance.qzv \
    --output-path $data_dir/core-metrics-results/shannon-group-significance_exported

Exported ../data/processed/core-metrics-results/shannon-group-significance.qzv as Visualization to directory ../data/processed/core-metrics-results/shannon-group-significance_exported


### Faith PD

In [18]:
! qiime diversity alpha-group-significance \
  --i-alpha-diversity $data_dir/core-metrics-results/faith_pd_vector.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --o-visualization $data_dir/core-metrics-results/faith-pd-group-significance.qzv

Saved Visualization to: ../data/processed/core-metrics-results/faith-pd-group-significance.qzv

In [6]:
Visualization.load(f"{data_dir}/core-metrics-results/faith-pd-group-significance.qzv")
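Faith's PD measures diversity as the total branch length of the phylogenetic tree spanned by the features observed in a sample. A minimal sketch on a hypothetical toy tree follows; the tree, the tip names, and the `faith_pd` helper are illustrative assumptions, and the sketch assumes the convention that the path up to the root is always included.

```python
# Hypothetical rooted tree: child -> (parent, branch length to the parent).
toy_tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}


def faith_pd(tree, observed_tips):
    """Sum the lengths of all branches on the paths from observed tips to the root.

    Each branch is counted once, even if several observed tips share it.
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk towards the root until a branch that was already counted is reached.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


# Two closely related tips share the internal branch above "n1".
print(faith_pd(toy_tree, ["A", "B"]))
# Two distantly related tips cover more of the tree.
print(faith_pd(toy_tree, ["A", "C"]))
```

With the same number of observed features, the sample containing distantly related tips obtains the higher value, which is what distinguishes Faith's PD from a simple feature count.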

In [7]:
! qiime tools export \
    --input-path $data_dir/core-metrics-results/faith-pd-group-significance.qzv \
    --output-path $data_dir/core-metrics-results/faith-pd-group-significance_exported

Exported ../data/processed/core-metrics-results/faith-pd-group-significance.qzv as Visualization to directory ../data/processed/core-metrics-results/faith-pd-group-significance_exported


<a id='numeric'></a>

## 5. Testing whether numeric sample metadata columns are correlated with microbial community richness

### Shannon

In [20]:
! qiime diversity alpha-correlation \
  --i-alpha-diversity $data_dir/core-metrics-results/shannon_vector.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --o-visualization $data_dir/core-metrics-results/shannon-group-significance-numeric.qzv

Saved Visualization to: ../data/processed/core-metrics-results/shannon-group-significance-numeric.qzv

In [9]:
Visualization.load(f"{data_dir}/core-metrics-results/shannon-group-significance-numeric.qzv")
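Since no `--p-method` is passed, `alpha-correlation` uses its default, the Spearman rank correlation. A minimal sketch of that coefficient is given here, computed as the Pearson correlation of the average ranks; the metadata values and diversity values are hypothetical, and the function names are introduced for illustration only.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical numeric metadata column and Shannon values for four samples.
example_metadata = [20, 35, 50, 65]
example_shannon = [5.1, 4.8, 4.9, 4.0]
print(spearman_rho(example_metadata, example_shannon))
```

Because only the ranks enter the calculation, any monotonic relationship yields a coefficient of 1 or -1, regardless of whether it is linear.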

In [10]:
! qiime tools export \
    --input-path $data_dir/core-metrics-results/shannon-group-significance-numeric.qzv \
    --output-path $data_dir/core-metrics-results/shannon-group-significance-numeric_exported

Exported ../data/processed/core-metrics-results/shannon-group-significance-numeric.qzv as Visualization to directory ../data/processed/core-metrics-results/shannon-group-significance-numeric_exported


### Faith PD

In [22]:
! qiime diversity alpha-correlation \
  --i-alpha-diversity $data_dir/core-metrics-results/faith_pd_vector.qza \
  --m-metadata-file $data_dir/metadata.tsv \
  --o-visualization $data_dir/core-metrics-results/faith-pd-group-significance-numeric.qzv

Saved Visualization to: ../data/processed/core-metrics-results/faith-pd-group-significance-numeric.qzv

In [11]:
Visualization.load(f"{data_dir}/core-metrics-results/faith-pd-group-significance-numeric.qzv")

In [12]:
! qiime tools export \
    --input-path $data_dir/core-metrics-results/faith-pd-group-significance-numeric.qzv \
    --output-path $data_dir/core-metrics-results/faith-pd-group-significance-numeric_exported

Exported ../data/processed/core-metrics-results/faith-pd-group-significance-numeric.qzv as Visualization to directory ../data/processed/core-metrics-results/faith-pd-group-significance-numeric_exported
