 
# Alpha and Beta Diversity

Different higher-level measures are often used to describe the microbiome in a sample. These do not provide information on changes in the abundance of specific taxa, but they allow us to assess a broader change or difference in the composition of microorganisms. Alpha and beta diversity are examples of such measures.

Several measures exist to estimate diversity within a single sample; together they are called alpha diversity. They reflect the richness (number of taxa) or the evenness (how abundance is distributed among taxa) of a microbial sample, or aim to combine both properties.
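To make the distinction between richness and evenness concrete, the following minimal sketch computes both from a vector of per-taxon counts. It is not the QIIME2 implementation: the function names, the example counts, and the choice of logarithm base 2 are assumptions made for illustration only.

```python
import math


def observed_richness(counts):
    """Richness: the number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon_entropy(counts, base=2):
    """Shannon index H = -sum(p_i * log(p_i)) over the observed taxa."""
    total = sum(counts)
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def pielou_evenness(counts, base=2):
    """Evenness: Shannon entropy divided by its maximum, log(richness)."""
    richness = observed_richness(counts)
    if richness < 2:
        # Evenness is not meaningful for fewer than two taxa (sketch convention).
        return 0.0
    return shannon_entropy(counts, base) / math.log(richness, base)


# Two hypothetical samples with the same richness but different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, observed_richness(sample), pielou_evenness(sample))
```

Both samples contain four taxa, so a richness-only measure cannot tell them apart, while the evenness value drops sharply for the sample dominated by a single taxon.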

Rarefaction curves are often used when calculating alpha diversity indices because deeper sequencing recovers more taxa and therefore gives increasingly accurate estimates of total population diversity. A rarefaction curve plots the number of taxa observed against the number of sequences sampled, so it can be used to estimate the full sample richness, as compared to the observed sample richness.
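The idea behind a rarefaction curve can be sketched by repeatedly subsampling reads without replacement at increasing depths and recording the mean number of taxa seen. This is a simplified illustration, not the QIIME2 `alpha_rarefaction` visualizer; the function name, the example counts, the depths, and the number of iterations are all assumptions.

```python
import random


def rarefaction_curve(counts, depths, iterations=10, seed=0):
    """Return (depth, mean observed richness) pairs for one sample."""
    rng = random.Random(seed)
    # Expand the count vector into one taxon label per read.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            raise ValueError(f"depth {depth} exceeds the {len(reads)} available reads")
        # Subsample without replacement and count the distinct taxa each time.
        richness = [len(set(rng.sample(reads, depth))) for _ in range(iterations)]
        curve.append((depth, sum(richness) / iterations))
    return curve


# Hypothetical sample: 100 reads spread over 7 taxa, two of them singletons.
counts = [50, 30, 10, 5, 3, 1, 1]
curve = rarefaction_curve(counts, depths=[10, 50, 100])
for depth, mean_richness in curve:
    print(depth, mean_richness)
```

When the curve flattens before the maximum depth is reached, additional sequencing is unlikely to reveal many new taxa; a curve that is still rising suggests that the observed richness underestimates the full richness.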

While alpha diversity is a measure of microbiome diversity applicable to a single sample, beta diversity is a measure of the similarity or dissimilarity of two communities. As for alpha diversity, many indices exist, each reflecting different aspects of community heterogeneity. Key differences relate to how the indices weight variation in rare species, whether they consider presence/absence only or incorporate abundance, and how they interpret shared absence. Bray-Curtis dissimilarity is a popular measure that considers both size (overall abundance per sample) and shape (abundance of each taxon) of the communities (Bray, 1957). Beta diversity is an essential input for many popular statistical methods in ecology, such as ordination-based methods, and is widely used for studying the association between environmental variables and microbial composition.
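The difference between an abundance-based index and a presence/absence index can be seen in a small sketch. The two functions below follow the standard definitions of Bray-Curtis dissimilarity and Jaccard distance; the function names and the example count vectors are assumptions chosen for illustration.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - |shared taxa| / |taxa in either|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0  # two empty samples are treated as identical in this sketch
    return 1 - len(present_u & present_v) / len(union)


# Two hypothetical samples containing the same taxa at very different abundances.
# Taxa at positions 1 and 3 are absent from both (shared absence).
sample_a = [10, 0, 5, 0]
sample_b = [1, 0, 50, 0]

print("Bray-Curtis:", bray_curtis(sample_a, sample_b))
print("Jaccard:", jaccard_distance(sample_a, sample_b))
```

The presence/absence index treats the two samples as identical because they share the same taxa, whereas Bray-Curtis reports a large dissimilarity because the abundances differ. Neither index is affected by the taxa that are absent from both samples.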

In summary, alpha diversity measures can be seen as a summary statistic of a single population (within-sample diversity), while beta diversity measures are estimates of similarity or dissimilarity between populations (between samples).

**Source**: (https://biomcare.com/info/key-terms-in-microbiome-projects/)

### STEP: Diversity Analysis

This step uses QIIME2 to compute diversity metrics and create the corresponding graphs.

- [QIIME2 Workflow Overview](https://docs.qiime2.org/2022.8/tutorials/overview/)


#### Methods
- [diversity](https://docs.qiime2.org/2022.8/plugins/available/diversity/)
- [diversity alpha](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha/)
- [diversity alpha_phylogenetic](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-phylogenetic/)
- [diversity beta](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta/)
- [diversity core_metrics](https://docs.qiime2.org/2022.8/plugins/available/diversity/core-metrics/)
- [diversity alpha_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-group-significance/)
- [diversity beta_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta-group-significance/)
- [feature_table core_features](https://docs.qiime2.org/2022.8/plugins/available/feature-table/core-features/)
- [feature_table summarize](https://docs.qiime2.org/2022.8/plugins/available/feature-table/summarize/)
- [taxa filter-table](https://docs.qiime2.org/2022.8/plugins/available/taxa/filter-table/)
- [taxa collapse](https://docs.qiime2.org/2022.8/plugins/available/taxa/collapse/)

## Setup and settings

In [1]:
# Importing packages
import os
import pandas as pd
from qiime2 import Artifact
from qiime2 import Visualization
from qiime2 import Metadata

from qiime2.plugins.phylogeny.pipelines import align_to_tree_mafft_fasttree

from qiime2.plugins.diversity.pipelines import alpha
from qiime2.plugins.diversity.pipelines import beta
from qiime2.plugins.diversity.pipelines import core_metrics
from qiime2.plugins.diversity.pipelines import alpha_phylogenetic

from qiime2.plugins.diversity.visualizers import alpha_group_significance
from qiime2.plugins.diversity.visualizers import beta_group_significance
from qiime2.plugins.diversity.visualizers import alpha_correlation
from qiime2.plugins.diversity.visualizers import beta_rarefaction

from qiime2.plugins.taxa.methods import filter_table
from qiime2.plugins.taxa.methods import collapse

from qiime2.plugins.feature_table.visualizers import tabulate_seqs
from qiime2.plugins.feature_table.visualizers import summarize
from qiime2.plugins.feature_table.visualizers import core_features

from qiime2.plugins.alignment.methods import mafft


import matplotlib.pyplot as plt

%matplotlib inline

### Receiving the parameters

The following cell can receive parameters using the [papermill](https://papermill.readthedocs.io/en/latest/) tool.

In [2]:
base_dir = os.path.join('/', 'home')
metadata_file = os.path.abspath(os.path.join(base_dir, 'data', 'metadatada.tsv'))
experiment_name = ''
class_col = ''
replace_files = False

In [3]:
# Parameters
experiment_name = "ana-flavia-HSD-NCxHSD-NR-trim"
base_dir = "/home/lauro/nupeb/rede-micro/redemicro-ana-flavia-nutri"
manifest_file = "/home/lauro/nupeb/rede-micro/redemicro-ana-flavia-nutri/data/raw/manifest/manifest-ana-flavia-HSD-NCxHSD-NR.csv"
metadata_file = "/home/lauro/nupeb/rede-micro/redemicro-ana-flavia-nutri/data/raw/metadata/metadata-ana-flavia-HSD-NCxHSD-NR.tsv"
class_col = "group-id"
classifier_file = "/home/lauro/nupeb/rede-micro/models/silva-138-99-nb-classifier.qza"
top_n = 20
replace_files = False
phred = 20
trunc_f = 0
trunc_r = 0
overlap = 12
threads = 6
trim = {
    "overlap": 8,
    "forward_primer": "CCTACGGGRSGCAGCAG",
    "reverse_primer": "GGACTACHVGGGTWTCTAAT",
}
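
The `# Parameters` cell above is the kind of cell that papermill injects after the defaults cell, which it locates through the `parameters` cell tag. A minimal sketch of how a driver script might run this notebook with its own values is shown below. The helper names, the notebook file names, and the example values are hypothetical; only `papermill.execute_notebook` and its `parameters` argument come from papermill itself.

```python
def build_parameters(experiment_name, base_dir, class_col, threads=6):
    """Collect the values that should overwrite the notebook defaults."""
    return {
        "experiment_name": experiment_name,
        "base_dir": base_dir,
        "class_col": class_col,
        "threads": threads,
        "replace_files": False,
    }


def run_step(input_notebook, output_notebook, parameters):
    """Execute a notebook with papermill, injecting the given parameters."""
    # Imported lazily so build_parameters can be used without papermill installed.
    import papermill as pm

    return pm.execute_notebook(input_notebook, output_notebook, parameters=parameters)


# Hypothetical values; replace them with the paths of a real project.
params = build_parameters("my-experiment", "/path/to/project", class_col="group-id")

# Uncomment to execute (both notebook names are placeholders):
# run_step("step-diversity-analysis.ipynb", "executed-diversity-analysis.ipynb", params)
```

Keeping the parameter dictionary separate from the execution call makes it easy to reuse the same settings across several step notebooks.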


In [4]:
experiment_folder = os.path.abspath(os.path.join(base_dir, 'experiments', experiment_name))
img_folder = os.path.abspath(os.path.join(experiment_folder, 'imgs'))

### Defining names, paths and flags

In [5]:
# QIIME2 Artifacts folder
qiime_folder = os.path.join(experiment_folder, 'qiime-artifacts')

# Input - DADA2 Artifacts
dada2_tabs_path = os.path.join(qiime_folder, 'dada2-tabs.qza')
dada2_reps_path = os.path.join(qiime_folder, 'dada2-reps.qza')
dada2_stat_path = os.path.join(qiime_folder, 'dada2-stat.qza')

# Input - Taxonomic Artifacts
taxonomy_path = os.path.join(qiime_folder, 'metatax.qza')

# Create folder to store Alpha files
alpha_path = os.path.join(qiime_folder, 'alpha-analysis')
if not os.path.exists(alpha_path):
    os.makedirs(alpha_path)
    print(f'The new directory is created in {alpha_path}')
    
# Create folder to store Beta files
beta_path = os.path.join(qiime_folder, 'beta-analysis')
if not os.path.exists(beta_path):
    os.makedirs(beta_path)
    print(f'The new directory is created in {beta_path}')

# Output - Diversity Artifacts
alpha_diversity_path = os.path.join(alpha_path, 'alpha-diversity.qza')
alpha_diversity_view_path = os.path.join(alpha_path, 'alpha-diversity.qzv')
beta_diversity_path = os.path.join(beta_path, 'beta-diversity.qza')
beta_diversity_view_path = os.path.join(beta_path, 'beta-diversity.qzv')

The new directory is created in /home/lauro/nupeb/rede-micro/redemicro-ana-flavia-nutri/experiments/ana-flavia-HSD-NCxHSD-NR-trim/qiime-artifacts/alpha-analysis
The new directory is created in /home/lauro/nupeb/rede-micro/redemicro-ana-flavia-nutri/experiments/ana-flavia-HSD-NCxHSD-NR-trim/qiime-artifacts/beta-analysis


In [6]:
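def filter_and_collapse(tab, seqs, tax, meta, lvl, exclude=True, exclude_list='uncultured,unidentified,metagenome'):
    """Filter a feature table by taxonomy and collapse it to taxonomic level `lvl`.

    Returns the collapsed table, its summary visualization, and the
    representative sequences that remain after filtering.
    """
    from qiime2.plugins.taxa.methods import collapse
    from qiime2.plugins.taxa.methods import filter_table
    from qiime2.plugins.feature_table.methods import filter_seqs
    from qiime2.plugins.feature_table.visualizers import summarize

    # Keep only features annotated down to the requested level (1 = domain ... 7 = species)
    to_include = ('d', 'p', 'c', 'o', 'f', 'g', 's')[lvl-1]
    to_include += '__'
    # Optionally drop features whose taxonomy contains any of the listed terms
    to_exclude = exclude_list if exclude else None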
    
    filtered_tabs = filter_table(
        table=tab, 
        taxonomy=tax,
        include=to_include,
        exclude=to_exclude,
        mode='contains').filtered_table
    
    filtered_seqs = filter_seqs(
        data = seqs,
        table = filtered_tabs,
    ).filtered_data
    
    collapsed_table = collapse(table=filtered_tabs, taxonomy=tax, level=lvl).collapsed_table
    collapsed_table_view = summarize(table=collapsed_table, sample_metadata=meta).visualization
    
    return collapsed_table, collapsed_table_view, filtered_seqs

## Step execution

### Load input files

This step imports the QIIME2 `FeatureTable[Frequency]` Artifact and the `Metadata` file.

In [7]:
# Load Metadata
metadata_qa = Metadata.load(metadata_file)

# Load FeatureTable[Frequency]
tabs = Artifact.load(dada2_tabs_path)
tabs_df = tabs.view(Metadata).to_dataframe().T

# FeatureData[Sequence]
reps = Artifact.load(dada2_reps_path)

# FeatureData[Taxonomy]
tax = Artifact.load(taxonomy_path)

In [8]:
# lvl = 7
# exclude = True
# tabs, collapsed_table_view, reps = filter_and_collapse(
#                     tabs, reps, tax, metadata_qa, 
#                     lvl=lvl,
#                     exclude=exclude, 
#                     exclude_list='uncultured,unidentified,metagenome')
# collapsed_table_view

## Alpha diversity analysis

#### Reference
- [The Use and Types of Alpha-Diversity Metrics in Microbial NGS](https://www.cd-genomics.com/microbioseq/the-use-and-types-of-alpha-diversity-metrics-in-microbial-ngs.html)
- [Alpha diversity metrics](http://scikit-bio.org/docs/0.2.0/generated/skbio.diversity.alpha.html)

#### Methods
- [diversity alpha](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha/): Computes a user-specified alpha diversity metric for all samples in a feature table.
- [diversity alpha_phylogenetic](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-phylogenetic/): Computes a user-specified phylogenetic alpha diversity metric for all samples in a feature table.
- [diversity alpha_correlation](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-correlation/): Determine whether numeric sample metadata columns are correlated with alpha diversity.
- [diversity alpha_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-group-significance/): Visually and statistically compare groups of alpha diversity values.

### Compute Alpha Diversity vectors
- [diversity alpha](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha/): Computes a user-specified alpha diversity metric for all samples in a feature table.
- [Alpha diversity metrics](http://scikit-bio.org/docs/0.2.0/generated/skbio.diversity.alpha.html)
 - Choices: ('ace', 'berger_parker_d', 'brillouin_d', 'chao1', 'chao1_ci', 'dominance', 'doubles', 'enspie', 'esty_ci', 'fisher_alpha', 'gini_index', 'goods_coverage', 'heip_e', 'kempton_taylor_q', 'lladser_pe', 'margalef', 'mcintosh_d', 'mcintosh_e', 'menhinick', 'michaelis_menten_fit', 'observed_features', 'osd', 'pielou_e', 'robbins', 'shannon', 'simpson', 'simpson_e', 'singles', 'strong')
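
Several of the listed metrics are closely related. As an illustration, the sketch below shows how `singles` and `doubles` (the numbers of taxa seen exactly once and exactly twice) feed into a Chao1-style richness estimate. It uses the commonly cited bias-corrected form of Chao1; whether this matches the exact variant computed by the plugin is an assumption, and the function names and example counts are illustrative.

```python
def singles(counts):
    """Number of taxa observed exactly once."""
    return sum(1 for c in counts if c == 1)


def doubles(counts):
    """Number of taxa observed exactly twice."""
    return sum(1 for c in counts if c == 2)


def chao1_bias_corrected(counts):
    """Chao1 estimate: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts if c > 0)
    f1 = singles(counts)
    f2 = doubles(counts)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical sample: 7 observed taxa, 3 singletons and 2 doubletons.
counts = [10, 4, 2, 2, 1, 1, 1, 0]
print(singles(counts), doubles(counts), chao1_bias_corrected(counts))
```

The estimate exceeds the observed richness only when singletons are present, which reflects the intuition that many rarely seen taxa hint at further taxa that were not sampled at all.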

In [9]:
# All metrics accepted by `diversity alpha`, kept here for reference
all_metrics = ('ace', 'berger_parker_d', 'brillouin_d', 'chao1', 'chao1_ci', 'dominance', 'doubles', 'enspie', 'esty_ci', 'fisher_alpha', 'gini_index', 'goods_coverage', 'heip_e', 'kempton_taylor_q', 'lladser_pe', 'margalef', 'mcintosh_d', 'mcintosh_e', 'menhinick', 'michaelis_menten_fit', 'observed_features', 'osd', 'pielou_e', 'robbins', 'shannon', 'simpson', 'simpson_e', 'singles', 'strong')
# Subset of metrics actually computed in this step
metrics = ('chao1', 'observed_features', 'shannon', 'simpson', 'dominance', 'gini_index', 'goods_coverage', 'singles', 'strong')
alpha_diversities = dict()
for metric in metrics:
    print(f"Calculating alpha diversity: {metric}")
    try:
        alpha_diversity = alpha(table=tabs, metric=metric).alpha_diversity
        alpha_diversities[metric] = alpha_diversity
        # Save SampleData[AlphaDiversity] Artifact
        file_path = os.path.join(alpha_path, f'alpha-values-{metric}.qza')
        alpha_diversity.save(file_path)
        print(f"DONE: Calculating alpha diversity: {metric}")
    except Exception as e:
        print(f"ERROR: Calculating alpha diversity: {metric}")
        print(e)

Calculating alpha diversity: chao1
DONE: Calculating alpha diversity: chao1
Calculating alpha diversity: observed_features
DONE: Calculating alpha diversity: observed_features
Calculating alpha diversity: shannon
DONE: Calculating alpha diversity: shannon
Calculating alpha diversity: simpson
DONE: Calculating alpha diversity: simpson
Calculating alpha diversity: dominance
DONE: Calculating alpha diversity: dominance
Calculating alpha diversity: gini_index
DONE: Calculating alpha diversity: gini_index
Calculating alpha diversity: goods_coverage
DONE: Calculating alpha diversity: goods_coverage
Calculating alpha diversity: singles
DONE: Calculating alpha diversity: singles
Calculating alpha diversity: strong
DONE: Calculating alpha diversity: strong


### Create Phylogenetic inference

- [phylogeny align_to_tree_mafft_fasttree](https://docs.qiime2.org/2022.8/plugins/available/phylogeny/align-to-tree-mafft-fasttree/): Build a phylogenetic tree using FastTree and a MAFFT alignment

This pipeline will start by creating a sequence alignment using MAFFT,
after which any alignment columns that are phylogenetically uninformative
or ambiguously aligned will be removed (masked). The resulting masked
alignment will be used to infer a phylogenetic tree and then subsequently
rooted at its midpoint. Output files from each step of the pipeline will be
saved. This includes both the unmasked and masked MAFFT alignment from
q2-alignment methods, and both the rooted and unrooted phylogenies from
q2-phylogeny methods.


Returns
- alignment : FeatureData[AlignedSequence] : The aligned sequences.
- masked_alignment : FeatureData[AlignedSequence] : The masked alignment.
- tree : Phylogeny[Unrooted] : The unrooted phylogenetic tree.
- rooted_tree : Phylogeny[Rooted] : The rooted phylogenetic tree.

In [10]:
mafft_alignment, mafft_masked_alignment, mafft_tree, mafft_rooted_tree = align_to_tree_mafft_fasttree(
    sequences=reps, n_threads=6)

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: mafft --preservecase --inputorder --thread 6 /tmp/qiime2-archive-4mimwoa5/be79eb0f-e4b3-4ef7-8935-5c7a86f34509/data/dna-sequences.fasta



inputfile = orig
2732 x 430 - 170 d
nthread = 6
nthreadpair = 6
nthreadtb = 6
ppenalty_ex = 0
stacksize: 8192 kb
generating a scoring matrix for nucleotide (dist=200) ... done
Gap Penalty = -1.53, +0.00, +0.00



disttbfast (nuc) Version 7.490
alg=A, model=DNA200 (2), 1.53 (4.59), -0.00 (-0.00), noshift, amax=0.0
6 thread(s)


Strategy:
 FFT-NS-2 (Fast but rough)
 Progressive method (guide trees were built 2 times.)

If unsure which option to use, try 'mafft --auto input > output'.
For more information, see 'mafft --help', 'mafft --man' and the mafft page.

The default gap scoring scheme has been changed in version 7.110 (2013 Oct).
It tends to insert more gaps into gap-rich regions than previous versions.
To disable this change, add the --leavegappyregion option.



Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: FastTreeMP -quote -nt /tmp/qiime2-archive-jg59_alm/28dd0794-2641-4265-80f0-b72a8dd4fb6d/data/aligned-dna-sequences.fasta



FastTree Version 2.1.10 Double precision (No SSE3), OpenMP (6 threads)
Alignment: /tmp/qiime2-archive-jg59_alm/28dd0794-2641-4265-80f0-b72a8dd4fb6d/data/aligned-dna-sequences.fasta
Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits: 1.00*sqrtN close=default refresh=0.80
ML Model: Jukes-Cantor, CAT approximation with 20 rate categories
Initial topology in 3.55 seconds
Refining topology: 45 rounds ME-NNIs, 2 rounds ME-SPRs, 23 rounds ML-NNIs
Total branch-length 39.266 after 10.47 sec
ML-NNI round 1: LogLk = -104388.576 NNIs 420 max delta 19.24 Time 13.70
Switched to using 20 rate categories (CAT approximation)
Rate categories were divided by 1.187 so that average rate = 1.0
CAT-based log-likelihoods may not be comparable across runs
Use -gamma for approximate but comparable Gamma(20) log-likelihoods
ML-NNI round 2: LogLk = -87042.082 NNIs 258 max delta 8.21 Time 17.10
     17.18 seconds: ML NNI round 3 of 23, 101 of 2545 splits, 2 changes (max delta 2.727)


     17.35 seconds: ML NNI round 3 of 23, 301 of 2545 splits, 10 changes (max delta 2.793)
     17.52 seconds: ML NNI round 3 of 23, 501 of 2545 splits, 18 changes (max delta 2.793)


     17.71 seconds: ML NNI round 3 of 23, 701 of 2545 splits, 32 changes (max delta 9.614)
     17.90 seconds: ML NNI round 3 of 23, 901 of 2545 splits, 49 changes (max delta 9.614)


     18.09 seconds: ML NNI round 3 of 23, 1101 of 2545 splits, 56 changes (max delta 9.614)
     18.27 seconds: ML NNI round 3 of 23, 1301 of 2545 splits, 69 changes (max delta 9.614)


     18.43 seconds: ML NNI round 3 of 23, 1501 of 2545 splits, 79 changes (max delta 9.614)
ML-NNI round 3: LogLk = -86968.777 NNIs 83 max delta 9.61 Time 18.52
     18.61 seconds: ML NNI round 4 of 23, 101 of 2545 splits, 0 changes


     18.77 seconds: ML NNI round 4 of 23, 301 of 2545 splits, 7 changes (max delta 3.544)
     18.96 seconds: ML NNI round 4 of 23, 501 of 2545 splits, 18 changes (max delta 3.544)


     19.16 seconds: ML NNI round 4 of 23, 701 of 2545 splits, 28 changes (max delta 3.977)
     19.32 seconds: ML NNI round 4 of 23, 901 of 2545 splits, 32 changes (max delta 3.977)


ML-NNI round 4: LogLk = -86935.108 NNIs 34 max delta 3.98 Time 19.42
     19.50 seconds: ML NNI round 5 of 23, 101 of 2545 splits, 6 changes (max delta 3.187)


     19.69 seconds: ML NNI round 5 of 23, 301 of 2545 splits, 10 changes (max delta 3.187)
ML-NNI round 5: LogLk = -86925.844 NNIs 14 max delta 3.19 Time 19.81
     19.81 seconds: ML NNI round 6 of 23, 1 of 2545 splits


     20.00 seconds: ML NNI round 6 of 23, 201 of 2545 splits, 5 changes (max delta 0.000)
ML-NNI round 6: LogLk = -86925.539 NNIs 7 max delta 0.00 Time 20.05
Turning off heuristics for final round of ML NNIs (converged)
     20.13 seconds: ML NNI round 7 of 23, 101 of 2545 splits, 0 changes


     20.30 seconds: ML NNI round 7 of 23, 301 of 2545 splits, 0 changes
     20.46 seconds: ML NNI round 7 of 23, 501 of 2545 splits, 0 changes


     20.64 seconds: ML NNI round 7 of 23, 701 of 2545 splits, 2 changes (max delta 0.000)
     20.81 seconds: ML NNI round 7 of 23, 901 of 2545 splits, 2 changes (max delta 0.000)


     20.99 seconds: ML NNI round 7 of 23, 1101 of 2545 splits, 4 changes (max delta 0.000)
     21.17 seconds: ML NNI round 7 of 23, 1301 of 2545 splits, 7 changes (max delta 0.677)


     21.36 seconds: ML NNI round 7 of 23, 1501 of 2545 splits, 8 changes (max delta 0.677)
     21.54 seconds: ML NNI round 7 of 23, 1701 of 2545 splits, 10 changes (max delta 6.597)


     21.72 seconds: ML NNI round 7 of 23, 1901 of 2545 splits, 11 changes (max delta 6.597)
     21.90 seconds: ML NNI round 7 of 23, 2101 of 2545 splits, 14 changes (max delta 6.597)


     22.06 seconds: ML NNI round 7 of 23, 2301 of 2545 splits, 16 changes (max delta 6.597)
     22.22 seconds: ML NNI round 7 of 23, 2501 of 2545 splits, 16 changes (max delta 6.597)


ML-NNI round 7: LogLk = -86912.345 NNIs 16 max delta 6.60 Time 22.28 (final)
     22.33 seconds: ML Lengths 201 of 2545 splits
     22.45 seconds: ML Lengths 701 of 2545 splits


     22.56 seconds: ML Lengths 1101 of 2545 splits
     22.66 seconds: ML Lengths 1501 of 2545 splits


     22.77 seconds: ML Lengths 1901 of 2545 splits
     22.87 seconds: ML Lengths 2301 of 2545 splits
Optimize all lengths: LogLk = -86910.932 Time 22.96


     23.11 seconds: ML split tests for    100 of   2544 internal splits
     23.26 seconds: ML split tests for    200 of   2544 internal splits


     23.42 seconds: ML split tests for    300 of   2544 internal splits
     23.57 seconds: ML split tests for    400 of   2544 internal splits


     23.72 seconds: ML split tests for    500 of   2544 internal splits
     23.88 seconds: ML split tests for    600 of   2544 internal splits


     24.03 seconds: ML split tests for    700 of   2544 internal splits
     24.19 seconds: ML split tests for    800 of   2544 internal splits


     24.33 seconds: ML split tests for    900 of   2544 internal splits
     24.48 seconds: ML split tests for   1000 of   2544 internal splits


     24.63 seconds: ML split tests for   1100 of   2544 internal splits
     24.78 seconds: ML split tests for   1200 of   2544 internal splits


     24.92 seconds: ML split tests for   1300 of   2544 internal splits
     25.07 seconds: ML split tests for   1400 of   2544 internal splits


     25.22 seconds: ML split tests for   1500 of   2544 internal splits
     25.37 seconds: ML split tests for   1600 of   2544 internal splits


     25.52 seconds: ML split tests for   1700 of   2544 internal splits
     25.67 seconds: ML split tests for   1800 of   2544 internal splits


     25.82 seconds: ML split tests for   1900 of   2544 internal splits
     25.96 seconds: ML split tests for   2000 of   2544 internal splits


     26.11 seconds: ML split tests for   2100 of   2544 internal splits
     26.25 seconds: ML split tests for   2200 of   2544 internal splits


     26.40 seconds: ML split tests for   2300 of   2544 internal splits
     26.54 seconds: ML split tests for   2400 of   2544 internal splits


     26.68 seconds: ML split tests for   2500 of   2544 internal splits
Total time: 26.75 seconds Unique: 2547/2732 Bad splits: 1/2544 Worst delta-LogLk 0.253


### Compute Alpha Diversity (Phylogeny)
- [diversity alpha_phylogenetic](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-phylogenetic/): Computes a user-specified phylogenetic alpha diversity metric for all samples in a feature table.
- Metric choices: ('faith_pd')

In [11]:
metrics = ('faith_pd', )
alpha_diversities_phylogenetic = dict()
for metric in metrics:
    print(f"Calculating alpha diversity: {metric}")
    try:
        alpha_diversity = alpha_phylogenetic(table=tabs, phylogeny=mafft_rooted_tree, metric=metric).alpha_diversity
        alpha_diversities_phylogenetic[metric] = alpha_diversity
        # Save Artifact
        file_path = os.path.join(alpha_path, f'alpha-phylogeny-{metric}.qza')
        alpha_diversity.save(file_path)
        print(f"DONE: Calculating alpha phylogeny: {metric}")
    except Exception as e:
        # Report the reason for the failure instead of silently discarding it
        print(f"ERROR: Calculating alpha phylogeny: {metric}: {e}")

Calculating alpha diversity: faith_pd
DONE: Calculating alpha phylogeny: faith_pd
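
To make the `faith_pd` metric more concrete, the following is a minimal sketch of Faith's phylogenetic diversity on a toy tree. It is not the QIIME 2 implementation: the tree, the node names, and the helper `faith_pd` are hypothetical, and the sketch assumes that branches are counted from each observed tip up to the root.

```python
# Toy rooted tree stored as child -> (parent, branch_length).
# All names and lengths are hypothetical and only serve the illustration.
tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}


def faith_pd(tree, observed_tips):
    """Sum the branch lengths on the paths from the observed tips to the root."""
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk towards the root; stop at the root or at a branch already counted.
        while node in tree and node not in visited:
            parent, length = tree[node]
            total += length
            visited.add(node)
            node = parent
    return total


pd_ab = faith_pd(tree, ["A", "B"])  # branches A, B and n1
pd_ac = faith_pd(tree, ["A", "C"])  # branches A, n1 and C
print(pd_ab, pd_ac)
```

Two samples with the same number of features can therefore differ in Faith's PD when their features sit on longer or more distant branches.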


### Alpha diversity correlation

This method processes only `numeric` metadata columns.


In [12]:
methods = ('spearman', 'pearson')
numerics_cols = metadata_qa.filter_columns(column_type='numeric')
if numerics_cols.column_count > 0:
    for metric, alpha_values in alpha_diversities.items():
        for method in methods:
            try:
                corr_view = alpha_correlation(alpha_diversity=alpha_values, metadata=numerics_cols,
                                              method=method, intersect_ids=True).visualization
                view_path = os.path.join(alpha_path, f'alpha-correlation-{metric}-{method}.qzv')
                corr_view.save(view_path)
                print(f"DONE: Calculating alpha correlation: {metric} {method}")
            except Exception as e:
                # Show why the correlation failed for this metric/method pair
                print(f"ERROR: Calculating alpha correlation: {metric} {method}: {e}")
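
The two methods used above can give different answers on the same data. The sketch below, written in plain Python with hypothetical values, shows the idea: Pearson measures linear association, while Spearman is Pearson applied to ranks and so only measures whether the relationship is monotonic. The helper names and the example data are assumptions and are not part of QIIME 2.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical alpha diversity values and a hypothetical numeric metadata column
shannon = [1.2, 1.9, 2.4, 2.6, 2.7]
ph = [5.0, 5.5, 6.0, 7.0, 9.0]

r = pearson(shannon, ph)
rho = spearman(shannon, ph)
print(f"pearson={r:.3f} spearman={rho:.3f}")
```

Here the relationship is perfectly monotonic but not linear, so Spearman's coefficient reaches 1 while Pearson's stays below it.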

### Alpha diversity comparisons

Visually and statistically compare groups of alpha diversity values.

[diversity alpha_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-group-significance/)

In [13]:
significance_errors = dict()  # metric -> exception, kept for later inspection
for metric, alpha_values in alpha_diversities.items():
    print(f"Processing alpha_group_significance: {metric}")
    try:
        significance_view = alpha_group_significance(alpha_diversity=alpha_values, metadata=metadata_qa).visualization
        view_path = os.path.join(alpha_path, f'alpha-group-significance-{metric}.qzv')
        significance_view.save(view_path)
        print(f"DONE: Calculating alpha group significance: {metric}")
    except Exception as e:
        # Keep the exception so the cause of the failure can be examined afterwards
        significance_errors[metric] = e
        print(f"ERROR: Calculating alpha group significance: {metric}")

Processing alpha_group_significance: chao1
ERROR: Calculating alpha group significance: chao1
Processing alpha_group_significance: observed_features
ERROR: Calculating alpha group significance: observed_features
Processing alpha_group_significance: shannon
ERROR: Calculating alpha group significance: shannon
Processing alpha_group_significance: simpson
ERROR: Calculating alpha group significance: simpson
Processing alpha_group_significance: dominance
ERROR: Calculating alpha group significance: dominance
Processing alpha_group_significance: gini_index
ERROR: Calculating alpha group significance: gini_index
Processing alpha_group_significance: goods_coverage
ERROR: Calculating alpha group significance: goods_coverage
Processing alpha_group_significance: singles
ERROR: Calculating alpha group significance: singles
Processing alpha_group_significance: strong
ERROR: Calculating alpha group significance: strong
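
For orientation, group comparisons of alpha diversity are typically based on the Kruskal-Wallis test, which, to our understanding, is also what `alpha_group_significance` reports. The following plain-Python sketch computes the H statistic for hypothetical groups; the function name and the data are assumptions, and no p-value is derived here.

```python
from collections import defaultdict


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a dict of group label -> list of values."""
    pooled = sorted(v for values in groups.values() for v in values)
    n_total = len(pooled)

    # Average rank of every distinct value, so that ties share a rank.
    positions = defaultdict(list)
    for position, value in enumerate(pooled, start=1):
        positions[value].append(position)
    rank_of = {v: sum(p) / len(p) for v, p in positions.items()}

    # Sum of squared rank sums, each divided by its group size.
    weighted = 0.0
    for values in groups.values():
        rank_sum = sum(rank_of[v] for v in values)
        weighted += rank_sum ** 2 / len(values)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Standard correction for ties (equals 1 when there are none).
    ties = sum(len(p) ** 3 - len(p) for p in positions.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    return h / correction


# Hypothetical Shannon values for three sample groups
groups = {
    "control": [1.0, 2.0, 3.0],
    "treated": [4.0, 5.0, 6.0],
    "recovered": [7.0, 8.0, 9.0],
}
h_stat = kruskal_wallis_h(groups)
print(f"H = {h_stat:.2f}")
```

A larger H means the rank sums of the groups differ more than expected by chance; the statistic is usually compared against a chi-squared distribution with (number of groups - 1) degrees of freedom.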


## Beta diversity analysis

#### Reference
- [diversity beta](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta/): Computes a user-specified beta diversity metric for all pairs of samples in a feature table.
- [Beta diversity metrics](http://scikit-bio.org/docs/0.2.0/generated/skbio.diversity.beta.html)

- Metric choices: ('aitchison', 'braycurtis', 'canberra', 'canberra_adkins', 'chebyshev', 'cityblock', 'correlation', 'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'jensenshannon', 'kulsinski', 'matching', 'minkowski', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'yule')

In [14]:
# All non-phylogenetic metrics accepted by `beta`, kept for reference
all_metrics = ('aitchison', 'braycurtis', 'canberra', 'canberra_adkins', 'chebyshev', 'cityblock', 'correlation', 'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'jensenshannon', 'kulsinski', 'matching', 'minkowski', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'yule')
# Subset of metrics actually computed in this notebook
metrics = ('euclidean', 'dice', 'braycurtis', 'correlation', 'cosine', 'matching', 'jaccard')
beta_diversities = dict()
beta_errors = dict()  # metric -> exception, kept for later inspection
for metric in metrics:
    print(f"Calculating beta diversity: {metric}")
    try:
        # `pseudocount` only affects compositional metrics such as 'aitchison'
        beta_diversity = beta(table=tabs, metric=metric, n_jobs=6, pseudocount=1).distance_matrix
        beta_diversities[metric] = beta_diversity
        # Save the DistanceMatrix artifact
        file_path = os.path.join(beta_path, f'beta-values-{metric}.qza')
        beta_diversity.save(file_path)
        print(f"DONE: Calculating beta diversity: {metric}")
    except Exception as e:
        # Keep the exception so the cause of the failure can be examined afterwards
        beta_errors[metric] = e
        print(f"ERROR: Calculating beta diversity: {metric}")

Calculating beta diversity: euclidean
DONE: Calculating beta diversity: euclidean
Calculating beta diversity: dice
DONE: Calculating beta diversity: dice
Calculating beta diversity: braycurtis
DONE: Calculating beta diversity: braycurtis
Calculating beta diversity: correlation
ERROR: Calculating beta diversity: correlation
Calculating beta diversity: cosine
ERROR: Calculating beta diversity: cosine
Calculating beta diversity: matching
DONE: Calculating beta diversity: matching
Calculating beta diversity: jaccard
DONE: Calculating beta diversity: jaccard
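
To illustrate how two of the metrics above differ, here is a small plain-Python sketch of Bray-Curtis dissimilarity (abundance-based) and Jaccard distance (presence/absence only). The count vectors are hypothetical and the helper functions are illustrative, not the scikit-bio or QIIME 2 implementations.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum of |u_i - v_i| over sum of (u_i + v_i)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - |shared| / |union|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    return 1 - len(present_u & present_v) / len(union)


# Hypothetical feature counts for three samples (same feature order)
sample_a = [10, 0, 5, 5]
sample_b = [5, 5, 0, 10]
sample_c = [1, 0, 1, 1]  # same features as sample_a, much lower abundance

bc_ab = bray_curtis(sample_a, sample_b)
jc_ab = jaccard_distance(sample_a, sample_b)
bc_ac = bray_curtis(sample_a, sample_c)
jc_ac = jaccard_distance(sample_a, sample_c)
print(bc_ab, jc_ab, bc_ac, jc_ac)
```

Samples `sample_a` and `sample_c` contain exactly the same features, so their Jaccard distance is zero, while Bray-Curtis still separates them because the abundances differ.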




### Beta group significance

- [diversity beta_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta-group-significance/): Determine whether groups of samples are significantly different from one another using a permutation-based statistical test.
- Marti J. Anderson. A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26(1):32–46, 2001. https://doi.org/10.1111/j.1442-9993.2001.01070.pp.x

In [15]:
methods = ('permanova', 'anosim', 'permdisp')
beta_significance_errors = dict()  # (method, metric) -> exception, kept for later inspection
for method in methods:
    for metric, beta_diversity in beta_diversities.items():
        print(f'Calculating beta group significance with method {method} and metric {metric}')
        try:
            # Compare the groups defined by the categorical column `class_col`
            beta_view = beta_group_significance(distance_matrix=beta_diversity,
                                                metadata=metadata_qa.get_column(class_col),
                                                pairwise=True, method=method).visualization
            view_name = os.path.join(beta_path, f'beta-group-significance-{metric}-{method}.qzv')
            beta_view.save(view_name)
            print(f"DONE: Calculating beta group significance: {method} {metric}")
        except Exception as e:
            # Keep the exception so the cause of the failure can be examined afterwards
            beta_significance_errors[(method, metric)] = e
            print(f"ERROR: Calculating beta group significance: {method} {metric}")

Calculating beta group significance with method permanova and metric euclidean
ERROR: Calculating beta group significance: permanova euclidean
Calculating beta group significance with method permanova and metric dice
ERROR: Calculating beta group significance: permanova dice
Calculating beta group significance with method permanova and metric braycurtis
ERROR: Calculating beta group significance: permanova braycurtis
Calculating beta group significance with method permanova and metric matching
ERROR: Calculating beta group significance: permanova matching
Calculating beta group significance with method permanova and metric jaccard
ERROR: Calculating beta group significance: permanova jaccard
Calculating beta group significance with method anosim and metric euclidean
ERROR: Calculating beta group significance: anosim euclidean
Calculating beta group significance with method anosim and metric dice
ERROR: Calculating beta group significance: anosim dice
Calculating beta group significance
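
As a rough illustration of what a permutation-based test such as PERMANOVA (Anderson, 2001) does, the sketch below computes the pseudo-F statistic from a distance matrix and estimates a p-value by shuffling the group labels. It is a simplified plain-Python version with hypothetical data and helper names, not the scikit-bio code that QIIME 2 relies on, and it assumes that every group contains at least two samples with non-zero within-group distances.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, group by group.
    ss_within = 0.0
    for group in groups:
        members = [i for i, label in enumerate(labels) if label == group]
        pairs = combinations(members, 2)
        ss_within += sum(dist[i][j] ** 2 for i, j in pairs) / len(members)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, permutations=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical samples placed on a line; distances are absolute differences
coords = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in coords] for a in coords]
labels = ["A", "A", "B", "B"]

f_stat, p_value = permanova(dist, labels)
print(f"pseudo-F = {f_stat:.1f}, p = {p_value:.3f}")
```

With only four samples there are just three distinct ways to split them into two groups of two, so the p-value cannot fall much below 1/3 however well the groups separate. This is one reason why very small groups limit what a permutation test can show.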

### Beta rarefaction

- [diversity beta_rarefaction](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta-rarefaction/): Repeatedly rarefy a feature table to compare beta diversity results within a given rarefaction depth.  For a given beta diversity metric, this visualizer will provide: an Emperor jackknifed PCoA plot, samples clustered by UPGMA or neighbor joining with support calculation, and a heatmap showing the correlation between rarefaction trials of that beta diversity metric.
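
The core operation that is repeated in each trial is rarefying: drawing a fixed number of reads from every sample without replacement. A minimal plain-Python sketch of that step, with a hypothetical count vector and a hypothetical helper `rarefy`, could look as follows; it is not the QIIME 2 implementation.

```python
import random


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a vector of feature counts."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    # One entry per read, labelled with the index of its feature.
    pool = [idx for idx, count in enumerate(counts) for _ in range(count)]
    picked = rng.sample(pool, depth)
    rarefied = [0] * len(counts)
    for idx in picked:
        rarefied[idx] += 1
    return rarefied


rng = random.Random(42)          # fixed seed so the sketch is reproducible
sample = [50, 30, 15, 5, 0]      # hypothetical feature counts for one sample
trials = [rarefy(sample, 20, rng) for _ in range(3)]
for trial in trials:
    print(trial)
```

Each trial yields a slightly different table at the same depth. Computing the beta diversity metric on every trial and comparing the resulting distance matrices indicates how sensitive the results are to the random subsampling.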