---
# Homework 13: Taxonomic Diversity with Microbiome Analyst

### Questions:
- How can we measure diversity?
- How can we visualize diversity metrics in Microbiome Analyst?

### Objectives:
- Understand alpha and beta diversity.
- Use Microbiome Analyst to calculate and visualize taxonomic diversity.
- Compare the taxonomic diversity of samples based on metadata types.

### Keypoints:
- Alpha diversity measures the intra-sample diversity.
- Beta diversity measures the inter-sample diversity.
- We can explore the taxonomic diversity of our data and "tell a story".

---

## Getting Started

In [None]:
# set the variables for your netid
netid = "YOUR_NETID"

In [None]:
# Go into the working directory
work_dir = "/xdisk/bhurwitz/bh_class/" + netid + "/assignments/13_diversity"
%cd $work_dir

### Go to the Microbiome Analyst website

https://www.microbiomeanalyst.ca/MicrobiomeAnalyst

This homework was adapted from:

Chong, Jasmine, et al. "Using MicrobiomeAnalyst for comprehensive statistical, functional, and meta-analysis of microbiome data." Nature protocols 15.3 (2020): 799-821.

#### Input file

The input BIOM file for this homework was generated in hw11_data_filter. Download that file from your home directory to your desktop and use that file for the steps below.

#### Step 1:

Go to the MicrobiomeAnalyst home page (https://www.microbiomeanalyst.ca) and click the ‘Get Started’ button.

![image-2.png](attachment:image-2.png)

#### Step 2:

Click the ‘Marker Data Profiling (MDP)’ circle to enter the MDP module.

![image.png](attachment:image.png)

#### Step 3:

Click on the BIOM format tab. Select Metadata included. Choose the example.biom file for this exercise. Select "Not Specific / Other" for the taxonomy label, and click submit.

![image.png](attachment:image.png)


#### Step 4 

Data integrity check. 

Inspect your taxonomic count data. This page consists of two tabs: 

The first tab, ‘Text Summary’, provides a text summary of the uploaded files. Scroll down to see the Figure for read counts. 

The second tab, ‘Library Size Overview’, graphically describes the read counts for all uploaded samples, which is informative for downstream data filtering and normalization. 

Microbiome Overview:

![image.png](attachment:image.png)


![image-2.png](attachment:image-2.png)
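
The library size shown in the overview is the total read count per sample. A minimal sketch of that calculation, assuming a small made-up count table (the sample names and values below are hypothetical and are not taken from the homework data):

```python
# Toy feature-by-sample count table (hypothetical values for illustration only).
counts = {
    "sample_A": [120, 30, 0, 5],
    "sample_B": [10, 0, 2, 1],
    "sample_C": [300, 150, 40, 10],
}

# Library size = total number of reads assigned to any feature in a sample.
library_sizes = {sample: sum(reads) for sample, reads in counts.items()}

# List samples from smallest to largest library size.
for sample, size in sorted(library_sizes.items(), key=lambda kv: kv[1]):
    print(f"{sample}\t{size}")
```

Samples with very small library sizes compared to the rest are the ones to keep an eye on during filtering and normalization.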


#### Step 5

Edit your metadata to change the order of samples as needed for visualization (for example, by age in weeks). Click ‘Proceed’ at the bottom of the page to move forward.

![image.png](attachment:image.png)

#### Step 6 

Data Filtering

Filtering is generally recommended to remove low-quality features, thereby improving downstream statistical analysis. Keep the default selections for the ‘Low count filter’ and ‘Low variance filter’ sliders and click ‘Submit’ to perform data filtering. A message will appear in the upper-right corner, indicating the results of the data filtering step. 

Click ‘Proceed’ at the bottom right of the page to navigate to the next page.

![image.png](attachment:image.png)
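
To get a feel for what a low count filter does, here is a rough sketch of a prevalence-based filter. The function name `low_count_filter`, the thresholds (`min_count=4`, `min_prevalence=0.2`), and the toy table are assumptions chosen for illustration; check the sliders on the page for the values the website actually applies.

```python
def low_count_filter(table, min_count=4, min_prevalence=0.2):
    """Keep features with >= min_count reads in >= min_prevalence of samples.

    The thresholds are illustrative assumptions, not confirmed tool defaults.
    """
    kept = {}
    for feature, reads in table.items():
        # Count the samples in which this feature reaches the minimum count.
        n_passing = sum(1 for r in reads if r >= min_count)
        if n_passing / len(reads) >= min_prevalence:
            kept[feature] = reads
    return kept


# Hypothetical feature table: one row per feature, one column per sample.
table = {
    "otu_1": [50, 40, 60, 55, 45],  # abundant everywhere
    "otu_2": [0, 1, 0, 2, 0],       # never reaches the minimum count
    "otu_3": [0, 0, 9, 7, 0],       # present in 2 of 5 samples
}

filtered = low_count_filter(table)
print(sorted(filtered))
```

In this toy table, `otu_2` is dropped because it never reaches the minimum count in any sample.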


#### Step 7

Data Normalization

On the ‘Data Normalization’ page, users can perform data rarefying, scaling, and transformation. The aim of data normalization is to standardize the data to enable accurate comparisons. Keep the default selections for the options (only ‘Data transformation’ set to ‘Center log ratio’) and click ‘Submit’, followed by ‘Proceed’ to move to the ‘Analysis Overview’ page.

![image.png](attachment:image.png)
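
For intuition, the centered log-ratio (CLR) transformation takes the log of each count and subtracts the mean of the logs for that sample. The sketch below assumes a pseudocount of 1 to handle zeros; that pseudocount and the example counts are assumptions, and the website may handle zeros differently.

```python
import math


def clr(counts, pseudocount=1.0):
    """Centered log-ratio: log of each value minus the mean log of the sample.

    The pseudocount (an assumption here) avoids taking log(0).
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical read counts for four features in one sample.
sample = [120, 30, 0, 5]
transformed = clr(sample)
print([round(v, 3) for v in transformed])
```

CLR values within a sample always sum to zero, so they describe each feature relative to the sample's geometric mean rather than in absolute terms.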

#### What is Alpha diversity?

This box describes the alpha-diversity analyses available in MicrobiomeAnalyst for community profiling.

Alpha diversity is a measure of within-sample diversity, whereas beta diversity is a measure of between-sample diversity. Alpha-diversity measures can be considered summary statistics of the diversity of single samples, whereas beta-diversity estimates can be considered dissimilarity scores between pairs of samples. For the latter, these measures permit further analyses via clustering or dimensionality reduction techniques. Various statistical tests can be applied to evaluate whether the differences are significant. More details are available below.

Alpha diversity
Alpha diversity summarizes the species richness (total number of species) and/or evenness (abundance distribution across species) within a sample. Six alpha-diversity measures are currently supported in MicrobiomeAnalyst, each assessing different aspects of the community. ‘Observed’ calculates the total number of features per sample, whereas ‘ACE’ and ‘Chao1’ estimate taxa richness by accounting for features that are undetected because of low abundance. ‘Shannon’ and ‘Simpson’ take both species richness and evenness into account, with varying weight given to evenness. Finally, ‘Fisher’ models the community abundance structure as a logarithmic series distribution.
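
To make these measures concrete, here is a minimal sketch of four of them on toy data. It assumes the natural-log Shannon index, the Gini-Simpson form of the Simpson index (1 minus the sum of squared proportions), and the bias-corrected Chao1 estimator; the website may use different variants, and the two example communities are made up.

```python
import math


def observed(counts):
    """Richness: number of features with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index, -sum(p_i * ln(p_i)) over features that are present."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def simpson(counts):
    """Gini-Simpson index, 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)),
    where F1 and F2 are the numbers of singletons and doubletons."""
    f1 = sum(1 for c in counts if c == 1)
    f2 = sum(1 for c in counts if c == 2)
    return observed(counts) + f1 * (f1 - 1) / (2 * (f2 + 1))


# Two hypothetical communities with the same richness but different evenness.
even = [25, 25, 25, 25]
uneven = [97, 1, 1, 1]

for name, community in [("even", even), ("uneven", uneven)]:
    print(name, observed(community), round(shannon(community), 3),
          round(simpson(community), 3), chao1(community))
```

Both communities have four observed features, but the uneven one scores lower on Shannon and Simpson, and its three singletons push the Chao1 estimate above the observed richness.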

#### Step 8

Community profiling. Users can evaluate microbial community diversity profiles using the ‘Alpha-diversity’ and ‘Beta-diversity’ analysis options (refer to the box for further details). To start, click ‘Alpha-diversity analysis’ from the ‘Analysis Overview’ page.

![image.png](attachment:image.png)


#### Alpha diversity

At the top of the page are several drop-down menus where users can explore different alpha-diversity measures or choose a taxonomic level to evaluate diversity differences. By default, alpha diversity is evaluated at the feature (OTU/ASV) level using Chao1, and significant differences are evaluated using t-tests. The bottom half of the page contains two graphical summaries of the results. To the left is a dot plot displaying the alpha-diversity measures across samples, and to the right is a box plot summarizing the alpha-diversity measures across groups. From these results, we can see that the within-sample diversities of the male and female babies in the dataset are not significantly different: the alpha-diversity measures are slightly lower in the male babies compared to the values in the female babies, but the result is not significant.

![image.png](attachment:image.png)


#### Create a figure showing the Alpha diversity for your project.

Explore different alpha-diversity measures; each one makes different assumptions about the community structure and will therefore reveal different aspects of the community structure (refer to the Box above for further details). Also try different taxonomic levels to see whether the same trend can be observed across higher taxonomic levels.

Create 1-2 images from Microbiome Analyst showing the alpha diversity for your samples. Paste those here. Write a figure legend.



#### Record your Methods for creating the Figure(s)

Write your methods down here to describe how you obtained the figure for Alpha diversity.

#### Get the Pairwise Comparison Statistics

Go to the Pairwise comparisons tab and record the pairwise comparisons here.

#### Record the Pairwise Comparison Statistics Methods

You can get the methods for the pairwise comparisons directly from the pairwise comparisons tab. Note those here.

#### Describe the results of your alpha diversity analysis.

Write down the results of your analyses here.

#### What is Beta Diversity?

This box describes the beta-diversity analyses available in MicrobiomeAnalyst for community profiling.

Beta diversity
Beta diversity evaluates differences in the community composition between samples. Resulting beta-diversity estimates can be combined into a distance matrix and used for ordination to visualize patterns. Samples close to each other are more similar in their microbial community profiles. MicrobiomeAnalyst supports the five most commonly used beta-diversity measures. ‘Jaccard distance’ uses just the presence or absence of features to calculate differences in microbial composition; ‘Bray-Curtis dissimilarity’ uses abundance data and calculates differences in feature abundance; ‘Jensen-Shannon divergence’ assesses the distance between two probability distributions that account for the presence and abundance of microbial features; ‘Unweighted UniFrac’ and ‘weighted UniFrac’ use the phylogenetic distance between features – the former is based purely on phylogenetic distance, whereas the latter is further weighted by the relative abundance of features.
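
As a small illustration of the difference between an abundance-based and a presence/absence-based measure, the sketch below computes Bray-Curtis dissimilarity and Jaccard distance for two made-up samples. The sample vectors are hypothetical, and the functions assume both samples list the same features in the same order.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator


def jaccard_distance(a, b):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, y in enumerate(b) if y > 0}
    union = present_a | present_b
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical counts for four features in two samples.
s1 = [10, 0, 5, 5]
s2 = [5, 5, 0, 10]

print("Bray-Curtis:", bray_curtis(s1, s2))
print("Jaccard:", jaccard_distance(s1, s2))
```

Both measures are 0 for identical samples and 1 for samples that share no features; computing them for every pair of samples gives the distance matrix used for ordination.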

Beta-diversity measures can be visualized using either PCoA or nonmetric multidimensional scaling (NMDS). Both methods take the distance matrix as input; PCoA maximizes the linear correlation between samples, whereas NMDS maximizes the rank-order correlation between samples. Users should use PCoA if distances between samples are so close that a linear transformation would suffice. NMDS is suggested if users wish to highlight the gradient structure within their data. NMDS is iterative and may return different results for the same dataset. Furthermore, MicrobiomeAnalyst calculates a stress value for the NMDS plot, which is a measure of goodness of fit. Generally, values >0.2 suggest a poor fit, whereas values <0.1 indicate a good fit.

Ordination measures between the groups are assessed for their statistical significance using PERMANOVA, analysis of group similarities (ANOSIM) or homogeneity of group dispersions (PERMDISP). These tests evaluate global differences in microbiome composition between groups. PERMANOVA tests whether the centroids of all groups are equivalent. It uses the distances (or dissimilarity) between samples of the same group and compares them to the distances between groups. This method is sensitive to multivariate dispersions; therefore, PERMDISP should also be used to evaluate whether the dispersion (or variation) between samples differs from the dispersion between groups. ANOSIM tests whether within-group distances are greater than or equal to between-group distances, using the ranks of all pair-wise sample distances.
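
The idea behind PERMANOVA can be sketched in a few lines: compute a pseudo-F statistic from the distance matrix, then shuffle the group labels many times to see how often a value at least that large arises by chance. This is a simplified one-factor sketch, not the website's implementation; the function names, the 999 permutations, the fixed seed, and the toy data are all assumptions.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """One-factor PERMANOVA pseudo-F computed from a square distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for g in groups:
        members = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return (pseudo-F, permutation P value) for the given grouping."""
    rng = random.Random(seed)
    f_obs = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Hypothetical data: six samples placed on a line, two well-separated groups.
positions = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2]
labels = ["F", "F", "F", "M", "M", "M"]
dist = [[abs(p - q) for q in positions] for p in positions]

f_obs, p_value = permanova(dist, labels)
print(round(f_obs, 2), p_value)
```

With only six samples there are very few distinct ways to assign the labels, so the P value cannot get very small even though the groups are clearly separated. Keep this in mind when interpreting results from projects with few samples per group.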

#### Step 9

Beta diversity. Click the ‘Analysis Overview’ link on the navigation track at the top of the page. Next, click ‘Beta-diversity analysis’. The top half of this page contains parameters for beta-diversity analysis (refer to the Box above for further details). The two tabs on the bottom of the page show 2D and 3D PCoA plots, respectively. By default, the difference in diversity between the samples is assessed using the Bray–Curtis index. In the image below, we are looking at differences in Beta diversity due to sex. The permutational multivariate analysis of variance (PERMANOVA) suggests that the clusters for the two groups are NOT significantly different (P value is not < 0.001).

How about your project based on the primary comparison you want to make?

![image.png](attachment:image.png)


#### Try out the 3D PCoA exploration tab

Click the ‘Interactive PCoA 3D’ tab to further explore the PCoA results in an interactive 3D scatter plot based on the first three components. Use your mouse to rotate and zoom in and out of the plot. Again, there is NO clear separation between the two sexes of the infants.

![image.png](attachment:image.png)

#### Create a figure showing the Beta diversity for your project.

Explore different beta-diversity measures; each one makes different assumptions about the community structure and will therefore reveal different aspects of the community structure (refer to the Box above for further details). Also try different taxonomic levels to see whether the same trend can be observed across higher taxonomic levels.

Create 1-2 images from Microbiome Analyst showing the beta diversity for your samples. Paste those graphs here. Write a figure legend.

#### Record your Methods for creating the Figure(s)

Write your methods down here to describe how you obtained the figure for Beta diversity.

#### Get the Pairwise PERMANOVA Statistics

Go to the Pairwise PERMANOVA tab and record the pairwise comparisons here.

#### Record the Pairwise PERMANOVA Statistics Methods

You can get the methods for the pairwise PERMANOVA directly from the pairwise PERMANOVA tab for your Beta diversity analysis. Note those here.

#### Describe the results of your Beta diversity analysis.

Write down the results of your analyses here.

## The End

Copy your notebook for future reference...

In [None]:
!cp ~/be487-fall-2024/assignments/13_diversity/hw13_diversity.ipynb $work_dir