 
# Alpha and Beta Diversity

Different higher-level measures are often used to describe the microbiome in a sample. These do not provide information on changes in the abundance of specific taxa but allow us to assess broader changes or differences in the composition of microorganisms. Alpha and beta diversity are examples of such measures.

Different measures exist to estimate diversity within a single sample, jointly called alpha diversity. These measures reflect the richness (number of taxa) or the evenness (how uniformly abundances are distributed) of a microbial sample, or aim to reflect a combination of both properties.
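To make the distinction between richness and evenness concrete, here is a minimal pure-Python sketch. The function names and toy count vectors are illustrative assumptions, and the logarithm base (2) is a choice made for this example; it is not the code QIIME2 runs internally.

```python
import math

def richness(counts):
    """Number of taxa observed at least once (observed features)."""
    return sum(1 for c in counts if c > 0)

def shannon(counts, base=2):
    """Shannon entropy of the relative abundances."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)

def pielou_evenness(counts):
    """Shannon entropy divided by its maximum, log(richness)."""
    s = richness(counts)
    if s < 2:
        return 0.0  # convention chosen here; evenness is undefined for a single taxon
    return shannon(counts) / math.log(s, 2)

# Two toy samples with the same richness but very different evenness.
even_sample = [25, 25, 25, 25]
skewed_sample = [97, 1, 1, 1]

for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, richness(sample), round(shannon(sample), 3), round(pielou_evenness(sample), 3))
```

Both samples contain four taxa, so richness alone cannot separate them; the Shannon and Pielou values drop sharply for the skewed sample because one taxon dominates.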

Rarefaction curves are often used when calculating alpha diversity indices because increasing numbers of sequenced reads allow increasingly accurate estimates of total population diversity. Rarefaction curves can therefore be used to estimate the full sample richness, as compared to the observed sample richness.
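The idea behind a rarefaction curve can be sketched by repeatedly subsampling reads at increasing depths and recording how many taxa are observed. The snippet below is an illustrative approximation with made-up counts and a hypothetical `rarefaction_curve` helper, not the QIIME2 `alpha_rarefaction` implementation.

```python
import random

def rarefaction_curve(counts, depths, n_iter=20, seed=0):
    """Mean observed richness when subsampling `depth` reads without replacement."""
    rng = random.Random(seed)
    # Expand the count vector into one taxon label per read.
    reads = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    curve = {}
    for depth in depths:
        if depth > len(reads):
            continue  # cannot subsample more reads than were sequenced
        observed = [len(set(rng.sample(reads, depth))) for _ in range(n_iter)]
        curve[depth] = sum(observed) / n_iter
    return curve

# Toy sample: 10 taxa, 1000 reads in total, with several rare taxa.
sample = [500, 300, 100, 50, 20, 10, 5, 5, 5, 5]
curve = rarefaction_curve(sample, depths=[10, 100, 500, 1000, 2000])
for depth, mean_richness in curve.items():
    print(depth, mean_richness)
```

When the curve flattens as depth grows, additional sequencing is unlikely to reveal many new taxa; a curve that is still rising suggests the observed richness underestimates the full richness.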

While alpha diversity is a measure of microbiome diversity applicable to a single sample, beta diversity is a measure of the similarity or dissimilarity of two communities. As for alpha diversity, many indices exist, each reflecting different aspects of community heterogeneity. Key differences relate to how the indices weight variation in rare species, whether they consider presence/absence only or incorporate abundance, and how they interpret shared absence. Bray-Curtis dissimilarity is a popular measure that considers both size (overall abundance per sample) and shape (abundance of each taxon) of the communities (Bray and Curtis, 1957). Beta diversity is an essential measure for many popular statistical methods in ecology, such as ordination-based methods, and is widely used for studying the association between environmental variables and microbial composition.
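The difference between an abundance-based index and a presence/absence index can be illustrated with a small sketch. The two functions below follow the textbook definitions of Bray-Curtis dissimilarity and Jaccard distance; the count vectors are invented for the example.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i)."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den

def jaccard(u, v):
    """Jaccard distance on presence/absence: 1 - |shared taxa| / |union of taxa|."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    return 1 - len(present_u & present_v) / len(union)

# a and b contain the same taxa in very different proportions;
# c shares no taxa with a.
a = [10, 0, 5, 0]
b = [1, 0, 50, 0]
c = [0, 8, 0, 7]

print(bray_curtis(a, b), jaccard(a, b))
print(bray_curtis(a, c), jaccard(a, c))
```

Samples `a` and `b` are identical under Jaccard because the same taxa are present, while Bray-Curtis reports a large dissimilarity driven by the abundance shift. Taxa absent from both samples contribute nothing to either index.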

In summary, alpha diversity measures can be seen as a summary statistic of a single population (within-sample diversity), while beta diversity measures are estimates of similarity or dissimilarity between populations (between samples).

**Source**: (https://biomcare.com/info/key-terms-in-microbiome-projects/)

### STEP: Diversity Analysis

Using QIIME2 to create diversity analysis graphs and calculations.

- [QIIME2 Workflow Overview](https://docs.qiime2.org/2022.8/tutorials/overview/)


#### Methods
- [diversity](https://docs.qiime2.org/2022.8/plugins/available/diversity/)
- [diversity alpha](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha/)
- [diversity alpha_phylogenetic](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-phylogenetic/)
- [diversity beta](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta/)
- [diversity core_metrics](https://docs.qiime2.org/2022.8/plugins/available/diversity/core-metrics/)
- [diversity alpha_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-group-significance/)
- [diversity beta_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta-group-significance/)
- [feature_table core_features](https://docs.qiime2.org/2022.8/plugins/available/feature-table/core-features/)
- [feature_table summarize](https://docs.qiime2.org/2022.8/plugins/available/feature-table/summarize/)
- [taxa filter-table](https://docs.qiime2.org/2022.8/plugins/available/taxa/filter-table/)
- [taxa collapse](https://docs.qiime2.org/2022.8/plugins/available/taxa/collapse/)

## Setup and settings

In [1]:
# Importing packages
import os
import pandas as pd
from qiime2 import Artifact
from qiime2 import Visualization
from qiime2 import Metadata

from qiime2.plugins.phylogeny.pipelines import align_to_tree_mafft_fasttree

from qiime2.plugins.diversity.pipelines import alpha
from qiime2.plugins.diversity.pipelines import beta
from qiime2.plugins.diversity.pipelines import core_metrics
from qiime2.plugins.diversity.pipelines import alpha_phylogenetic

from qiime2.plugins.diversity.visualizers import alpha_group_significance
from qiime2.plugins.diversity.visualizers import beta_group_significance
from qiime2.plugins.diversity.visualizers import alpha_correlation
from qiime2.plugins.diversity.visualizers import beta_rarefaction

from qiime2.plugins.taxa.methods import filter_table
from qiime2.plugins.taxa.methods import collapse

from qiime2.plugins.feature_table.visualizers import tabulate_seqs
from qiime2.plugins.feature_table.visualizers import summarize
from qiime2.plugins.feature_table.visualizers import core_features
from qiime2.plugins.diversity.pipelines import core_metrics_phylogenetic

from qiime2.plugins.feature_table.methods import filter_samples
from qiime2.plugins.feature_table.methods import filter_seqs

from qiime2.plugins.alignment.methods import mafft


import matplotlib.pyplot as plt

%matplotlib inline

### Receiving the parameters

The following cell can receive parameters using the [papermill](https://papermill.readthedocs.io/en/latest/) tool.

In [2]:
base_dir = os.path.join('/', 'home')
metadata_file = os.path.abspath(os.path.join(base_dir, 'data', 'metadatada.tsv'))
experiment_name = ''
class_col = ''
replace_files = False

In [3]:
# Parameters
experiment_name = "thayane-meno-hist"
base_dir = "/mnt/nupeb/rede-micro/redemicro-thayane"
manifest_file = "/mnt/nupeb/rede-micro/redemicro-thayane/data/manifest-paired.csv"
metadata_file = (
    "/mnt/nupeb/rede-micro/redemicro-thayane/data/metadata-meno-joined-hist.tsv"
)
class_col = "class"
classifier_file = "/mnt/nupeb/rede-micro/datasets/16S_classifiers_qiime2/silva-138-99-nb-classifier.qza"
replace_files = False
phred = 20
trunc_f = 0
trunc_r = 0
overlap = 12
threads = 6
trim = {
    "overlap": 8,
    "forward_primer": "CCTACGGGRSGCAGCAG",
    "reverse_primer": "GGACTACHVGGGTWTCTAAT",
}
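Parameters such as those in the injected cell above are typically supplied when the notebook is executed through papermill. As a rough sketch, the snippet below assembles a `papermill` command line using its `-p name value` option; the notebook file names are placeholders and the `papermill_command` helper is hypothetical, not part of this project.

```python
import shlex

def papermill_command(input_nb, output_nb, params):
    """Build a `papermill` CLI call with one `-p name value` pair per parameter."""
    cmd = ["papermill", input_nb, output_nb]
    for name, value in params.items():
        cmd += ["-p", name, str(value)]
    return cmd

params = {
    "experiment_name": "thayane-meno-hist",
    "class_col": "class",
    "replace_files": False,
}

# File names below are placeholders, not the project's actual notebooks.
cmd = papermill_command("diversity.ipynb", "diversity-out.ipynb", params)
print(shlex.join(cmd))
```

Only scalar parameters are shown here; structured values such as the `trim` dictionary would more naturally be passed through a YAML parameters file.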


In [4]:
experiment_folder = os.path.abspath(os.path.join(base_dir, 'experiments', experiment_name))
img_folder = os.path.abspath(os.path.join(experiment_folder, 'imgs'))

### Defining names, paths and flags

In [5]:
# QIIME2 Artifacts folder
qiime_folder = os.path.join(experiment_folder, 'qiime-artifacts')

# Input - DADA2 Artifacts
dada2_tabs_path = os.path.join(qiime_folder, 'dada2-tabs.qza')
dada2_reps_path = os.path.join(qiime_folder, 'dada2-reps.qza')
dada2_stat_path = os.path.join(qiime_folder, 'dada2-stat.qza')

# Input - Taxonomic Artifacts
taxonomy_path = os.path.join(qiime_folder, 'metatax.qza')

# Create folder to store Alpha files
alpha_path = os.path.join(qiime_folder, 'alpha-analysis')
if not os.path.exists(alpha_path):
    os.makedirs(alpha_path)
    print(f'The new directory is created in {alpha_path}')
    
# Create folder to store Beta files
beta_path = os.path.join(qiime_folder, 'beta-analysis')
if not os.path.exists(beta_path):
    os.makedirs(beta_path)
    print(f'The new directory is created in {beta_path}')

# Output - Diversity Artifacts
alpha_diversity_path = os.path.join(alpha_path, 'alpha-diversity.qza')
alpha_diversity_view_path = os.path.join(alpha_path, 'alpha-diversity.qzv')
beta_diversity_path = os.path.join(beta_path, 'beta-diversity.qza')
beta_diversity_view_path = os.path.join(beta_path, 'beta-diversity.qzv')

The new directory is created in /mnt/nupeb/rede-micro/redemicro-thayane/experiments/thayane-meno-hist/qiime-artifacts/alpha-analysis
The new directory is created in /mnt/nupeb/rede-micro/redemicro-thayane/experiments/thayane-meno-hist/qiime-artifacts/beta-analysis


In [6]:
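def filter_and_collapse(tab, seqs, tax, meta, lvl, exclude=True, exclude_list='uncultured,unidentified,metagenome'):
    """Keep features annotated down to taxonomic level `lvl` (1=domain ... 7=species),
    optionally drop annotations matching `exclude_list`, and collapse the table at that level.

    Returns the collapsed table, its summary visualization and the filtered sequences.
    """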
    from qiime2.plugins.taxa.methods import collapse
    from qiime2.plugins.taxa.methods import filter_table
    from qiime2.plugins.feature_table.methods import filter_seqs
    from qiime2.plugins.feature_table.visualizers import summarize
    
    to_include = ('d', 'p', 'c', 'o', 'f', 'g', 's')[lvl-1]
    to_include += '__'
    to_exclude = exclude_list if exclude else None
    
    filtered_tabs = filter_table(
        table=tab, 
        taxonomy=tax,
        include=to_include,
        exclude=to_exclude,
        mode='contains').filtered_table
    
    filtered_seqs = filter_seqs(
        data = seqs,
        table = filtered_tabs,
    ).filtered_data
    
    collapsed_table = collapse(table=filtered_tabs, taxonomy=tax, level=lvl).collapsed_table
    collapsed_table_view = summarize(table=collapsed_table, sample_metadata=meta).visualization
    
    return collapsed_table, collapsed_table_view, filtered_seqs
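
The rank-prefix logic used by `filter_and_collapse` can be illustrated without QIIME2. The sketch below approximates the include/exclude test with plain, case-sensitive substring matching on SILVA-style taxonomy strings; the helper names and annotations are made up for the example, and the real `taxa filter-table` method may differ in details such as case handling.

```python
RANK_PREFIXES = ("d", "p", "c", "o", "f", "g", "s")

def rank_prefix(lvl):
    """Map a 1-based taxonomic level to its SILVA-style prefix, e.g. 6 -> 'g__'."""
    return RANK_PREFIXES[lvl - 1] + "__"

def keep_feature(taxon, lvl, exclude_list="uncultured,unidentified,metagenome"):
    """Approximate the include/exclude test with plain substring matching."""
    if rank_prefix(lvl) not in taxon:
        return False  # not classified down to the requested level
    return not any(term in taxon for term in exclude_list.split(","))

# Invented annotations: one resolved to genus, one stopping at class,
# and one whose genus label is an excluded placeholder.
annotations = [
    "d__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae; g__Lactobacillus",
    "d__Bacteria; p__Firmicutes; c__Clostridia",
    "d__Bacteria; p__Firmicutes; c__Clostridia; o__Oscillospirales; f__Ruminococcaceae; g__uncultured",
]

kept = [t for t in annotations if keep_feature(t, lvl=6)]
print(kept)
```

At genus level (`lvl=6`) only the first annotation survives: the second lacks a `g__` label and the third matches an excluded term.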

## Step execution

### Load input files

This step imports the QIIME2 `FeatureTable[Frequency]` Artifact and the `Metadata` file.

In [7]:
#Load Metadata
metadata_qa = Metadata.load(metadata_file)

#Load FeatureTable[Frequency]
tabs = Artifact.load(dada2_tabs_path)
tabs_df = tabs.view(Metadata).to_dataframe().T

# FeatureData[Sequence]
reps = Artifact.load(dada2_reps_path)

# FeatureData[Taxonomy]
tax = Artifact.load(taxonomy_path)

In [8]:
# Filter FeatureTable[Frequency | RelativeFrequency | PresenceAbsence | Composition] based on Metadata sample ID values
tabs = filter_samples(
    table=tabs,
    metadata=metadata_qa,
).filtered_table
# Filter FeatureData[Sequence | AlignedSequence] to keep only the features present in the filtered table
reps = filter_seqs(
    data=reps,
    table=tabs,
).filtered_data



## Alpha diversity analysis

#### Reference
- [The Use and Types of Alpha-Diversity Metrics in Microbial NGS](https://www.cd-genomics.com/microbioseq/the-use-and-types-of-alpha-diversity-metrics-in-microbial-ngs.html)
- [Alpha diversity metrics](http://scikit-bio.org/docs/0.2.0/generated/skbio.diversity.alpha.html)

#### Methods
- [diversity alpha](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha/): Computes a user-specified alpha diversity metric for all samples in a
feature table.
- [diversity alpha_phylogenetic](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-phylogenetic/): Computes a user-specified phylogenetic alpha diversity metric for all
samples in a feature table.
- [diversity alpha_correlation](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-correlation/): Determine whether numeric sample metadata columns are correlated with alpha diversity.
- [diversity alpha_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-group-significance/): Visually and statistically compare groups of alpha diversity values.

### Compute Alpha Diversity vectors
- [diversity alpha](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha/): Computes a user-specified alpha diversity metric for all samples in a feature table.
- [Alpha diversity metrics](http://scikit-bio.org/docs/0.2.0/generated/skbio.diversity.alpha.html)
 - Choices: ('ace', 'berger_parker_d', 'brillouin_d', 'chao1', 'chao1_ci', 'dominance', 'doubles', 'enspie', 'esty_ci', 'fisher_alpha', 'gini_index', 'goods_coverage', 'heip_e', 'kempton_taylor_q', 'lladser_pe', 'margalef', 'mcintosh_d', 'mcintosh_e', 'menhinick', 'michaelis_menten_fit', 'observed_features', 'osd', 'pielou_e', 'robbins', 'shannon', 'simpson', 'simpson_e', 'singles', 'strong')
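
As an example of what a richness estimator adds over a plain count of observed features, the sketch below implements the bias-corrected Chao1 formula, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of taxa seen exactly once and exactly twice. It is an illustrative re-implementation with toy counts, written under the assumption that the bias-corrected form is the one of interest.

```python
def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts if c > 0)      # observed taxa
    singles = sum(1 for c in counts if c == 1)   # taxa seen exactly once
    doubles = sum(1 for c in counts if c == 2)   # taxa seen exactly twice
    return s_obs + singles * (singles - 1) / (2 * (doubles + 1))

# Toy sample: 8 observed taxa, of which 3 are singletons and 2 are doubletons.
sample = [40, 12, 7, 2, 2, 1, 1, 1, 0, 0]
print(chao1(sample))
```

The estimate exceeds the observed richness whenever the sample contains at least two singletons, reflecting the expectation that many rare taxa remain undetected.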

In [9]:
metrics = ('ace', 'berger_parker_d', 'brillouin_d', 'chao1', 'chao1_ci', 'dominance', 'doubles', 'enspie', 'esty_ci', 'fisher_alpha', 'gini_index', 'goods_coverage', 'heip_e', 'kempton_taylor_q', 'lladser_pe', 'margalef', 'mcintosh_d', 'mcintosh_e', 'menhinick', 'michaelis_menten_fit', 'observed_features', 'osd', 'pielou_e', 'robbins', 'shannon', 'simpson', 'simpson_e', 'singles', 'strong')

# Suggested metrics for alpha diversity:
# chao1 and observed_features (richness); shannon and simpson (diversity, taking both richness and evenness into account).
metrics = ('chao1', 'chao1_ci', 'observed_features', 'shannon', 'simpson', 'simpson_e')
alpha_diversities = dict()
for metric in metrics:
    print(f"Calculating alpha diversity: {metric}")
    try:
        alpha_diversity = alpha(table=tabs, metric=metric).alpha_diversity
        alpha_diversities[metric] = alpha_diversity
        # Save SampleData[AlphaDiversity] Artifact
        file_path = os.path.join(alpha_path, f'alpha-values-{metric}.qza')
        alpha_diversity.save(file_path)
        print(f"DONE: Calculating alpha diversity: {metric}")
    except Exception as e:
        print(f"ERROR: Calculating alpha diversity: {metric}")
        print(e)

Calculating alpha diversity: chao1
DONE: Calculating alpha diversity: chao1
Calculating alpha diversity: chao1_ci
DONE: Calculating alpha diversity: chao1_ci
Calculating alpha diversity: observed_features


DONE: Calculating alpha diversity: observed_features
Calculating alpha diversity: shannon
DONE: Calculating alpha diversity: shannon
Calculating alpha diversity: simpson
DONE: Calculating alpha diversity: simpson
Calculating alpha diversity: simpson_e


DONE: Calculating alpha diversity: simpson_e


### Create Phylogenetic inference

- [phylogeny align_to_tree_mafft_fasttree](https://docs.qiime2.org/2022.8/plugins/available/phylogeny/align-to-tree-mafft-fasttree/): Build a phylogenetic tree using fasttree and mafft alignment

This pipeline will start by creating a sequence alignment using MAFFT,
after which any alignment columns that are phylogenetically uninformative
or ambiguously aligned will be removed (masked). The resulting masked
alignment will be used to infer a phylogenetic tree and then subsequently
rooted at its midpoint. Output files from each step of the pipeline will be
saved. This includes both the unmasked and masked MAFFT alignment from
q2-alignment methods, and both the rooted and unrooted phylogenies from
q2-phylogeny methods.


Returns
- alignment : FeatureData[AlignedSequence] : The aligned sequences.
- masked_alignment : FeatureData[AlignedSequence] : The masked alignment.
- tree : Phylogeny[Unrooted] : The unrooted phylogenetic tree.
- rooted_tree : Phylogeny[Rooted] : The rooted phylogenetic tree.

In [10]:
mafft_alignment, mafft_masked_alignment, mafft_tree, mafft_rooted_tree = align_to_tree_mafft_fasttree(
    sequences=reps, n_threads=6, )

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: mafft --preservecase --inputorder --thread 6 /tmp/qiime2/lauro/data/1ecef25c-4d9c-46a2-82cf-d4b8aee3eff0/data/dna-sequences.fasta



inputfile = orig
2963 x 430 - 24 d
nthread = 6
nthreadpair = 6
nthreadtb = 6
ppenalty_ex = 0
stacksize: 8192 kb
generating a scoring matrix for nucleotide (dist=200) ... done
Gap Penalty = -1.53, +0.00, +0.00



Making a distance matrix ..
[progress output omitted]
done.

Constructing a UPGMA tree (efffree=0) ... 
[progress output omitted]
Reallocating..done. *alloclen = 1861
done.

Making a distance matrix from msa.. 
[progress output omitted]
done.

Constructing a UPGMA tree (efffree=1) ... 
[progress output omitted]
Reallocating..done. *alloclen = 1863
done.

disttbfast (nuc) Version 7.520
alg=A, model=DNA200 (2), 1.53 (4.59), -0.00 (-0.00), noshift, amax=0.0
6 thread(s)


Strategy:
 FFT-NS-2 (Fast but rough)
 Progressive method (guide trees were built 2 times.)

If unsure which option to use, try 'mafft --auto input > output'.
For more information, see 'mafft --help', 'mafft --man' and the mafft page.

The default gap scoring scheme has been changed in version 7.110 (2013 Oct).
It tends to insert more gaps into gap-rich regions than previous versions.
To disable this change, add the --leavegappyregion option.



Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: FastTreeMP -quote -nt /tmp/qiime2/lauro/data/be23ddc3-3268-4c29-a11e-470a31c43312/data/aligned-dna-sequences.fasta



FastTree Version 2.1.11 Double precision (No SSE3), OpenMP (6 threads)
Alignment: /tmp/qiime2/lauro/data/be23ddc3-3268-4c29-a11e-470a31c43312/data/aligned-dna-sequences.fasta
Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits: 1.00*sqrtN close=default refresh=0.80
ML Model: Jukes-Cantor, CAT approximation with 20 rate categories
[per-step timing lines omitted]
Initial topology in 1.70 seconds
Refining topology: 46 rounds ME-NNIs, 2 rounds ME-SPRs, 23 rounds ML-NNIs
Total branch-length 52.665 after 6.42 sec
ML-NNI round 1: LogLk = -125495.847 NNIs 569 max delta 14.68 Time 7.65
Switched to using 20 rate categories (CAT approximation)
Rate categories were divided by 1.227 so that average rate = 1.0
CAT-based log-likelihoods may not be comparable across runs
Use -gamma for approximate but comparable Gamma(20) log-likelihoods
ML-NNI round 2: LogLk = -104960.137 NNIs 358 max delta 20.82 Time 8.90
ML-NNI round 3: LogLk = -104838.208 NNIs 139 max delta 8.75 Time 9.46
ML-NNI round 4: LogLk = -104808.867 NNIs 48 max delta 3.26 Time 9.81
ML-NNI round 5: LogLk = -104790.942 NNIs 16 max delta 6.82 Time 9.96
ML-NNI round 6: LogLk = -104790.279 NNIs 2 max delta 0.00 Time 10.04
Turning off heuristics for final round of ML NNIs (converged)
ML-NNI round 7: LogLk = -104778.492 NNIs 35 max delta 1.40 Time 10.80 (final)
Optimize all lengths: LogLk = -104776.013 Time 11.15
Total time: 13.17 seconds Unique: 2785/2963 Bad splits: 3/2782 Worst delta-LogLk 5.164


### Compute Alpha Diversity (Phylogeny)
- [diversity alpha_phylogenetic](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-phylogenetic/): Computes a user-specified phylogenetic alpha diversity metric for all samples in a feature table.
- Metrics: Choices ('faith_pd')
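
Faith's phylogenetic diversity sums the branch lengths of the tree that are needed to connect the taxa present in a sample. The sketch below computes it on a tiny hand-made tree, assuming the root-inclusive variant (paths are followed up to the root); the tree, its branch lengths and the `faith_pd` helper are invented for illustration and are unrelated to the `faithpd` binary QIIME2 calls.

```python
# Each node maps to (parent, branch length to parent); the root has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 2.0),
    "C": ("n2", 3.0),
    "n1": ("n2", 0.5),
    "n2": ("root", 1.5),
    "D": ("root", 4.0),
}

def faith_pd(present_tips, tree=TREE):
    """Sum the lengths of all branches on the paths from the present tips to the root."""
    visited = set()
    total = 0.0
    for tip in present_tips:
        node = tip
        # Walk towards the root, counting each branch only once.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total

print(faith_pd(["A"]))            # one tip: its path to the root
print(faith_pd(["A", "B"]))       # close relatives share most of the path
print(faith_pd(["A", "D"]))       # distant relatives add more branch length
```

Two samples with the same number of taxa can therefore differ in Faith's PD: a sample made of close relatives shares most of its branches, while distantly related taxa each contribute long unshared branches.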

In [11]:
metrics = ('faith_pd', )
alpha_diversities_phylogenetic = dict()
for metric in metrics:
    print(f"Calculating alpha diversity: {metric}")
    try:
        alpha_diversity = alpha_phylogenetic(table=tabs, phylogeny=mafft_rooted_tree, metric=metric).alpha_diversity
        alpha_diversities_phylogenetic[metric] = alpha_diversity
        # Save Artifact
        file_path = os.path.join(alpha_path, f'alpha-phylogeny-{metric}.qza')
        alpha_diversity.save(file_path)
        print(f"DONE: Calculating alpha phylogeny: {metric}")
    except Exception as e:
        print(f"ERROR: Calculating alpha phylogeny: {metric}")
        print(e)

Calculating alpha diversity: faith_pd
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command:

faithpd -i /tmp/qiime2/lauro/data/f0370047-28a2-4f3f-9541-4431aa48e21b/data/feature-table.biom -t /tmp/qiime2/lauro/data/a727d7c8-d2c1-4d5e-9888-5efb4197a91c/data/tree.nwk -o /tmp/q2-AlphaDiversityFormat-tm2w1xhr

DONE: Calculating alpha phylogeny: faith_pd


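- [core-metrics-phylogenetic](https://docs.qiime2.org/2023.7/plugins/available/diversity/core-metrics-phylogenetic/): Applies a collection of diversity metrics (both phylogenetic and non-phylogenetic) to a feature table.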
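
Before computing its metrics, this pipeline rarefies every sample to a common sampling depth, and the next cell sets that depth to the smallest per-sample total so that no sample is dropped. A rough sketch of rarefaction as subsampling without replacement is shown here; the `rarefy` helper and the toy counts are hypothetical and only illustrate the idea.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample a {feature: count} dict to `depth` reads without replacement."""
    total = sum(counts.values())
    if total < depth:
        return None  # a sample this small would be dropped at this depth
    # Expand counts into one entry per read, then draw `depth` of them.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    rarefied = {}
    for feature in rng.sample(reads, depth):
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


# Invented feature counts for two samples.
samples = {
    "s1": {"otu1": 50, "otu2": 30, "otu3": 20},
    "s2": {"otu1": 10, "otu2": 5, "otu4": 45},
}

# Same idea as s_depth in the notebook: the smallest per-sample total.
depth = min(sum(c.values()) for c in samples.values())
rarefied = {sid: rarefy(c, depth) for sid, c in samples.items()}
print(depth, {sid: sum(c.values()) for sid, c in rarefied.items()})
```

Choosing the minimum total keeps every sample but discards reads from deeper samples; a higher depth would retain more reads per sample at the cost of losing the shallowest ones.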

In [12]:
s_depth = int(tabs.view(pd.DataFrame).sum(axis=1).min())
results = core_metrics_phylogenetic(
    table = tabs,
    phylogeny = mafft_rooted_tree,
    sampling_depth = s_depth,
    metadata = metadata_qa,
    n_jobs_or_threads = 6,
)

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command:

faithpd -i /tmp/qiime2/lauro/data/e82b63ce-dca2-43dc-a176-7776e7d90b77/data/feature-table.biom -t /tmp/qiime2/lauro/data/a727d7c8-d2c1-4d5e-9888-5efb4197a91c/data/tree.nwk -o /tmp/q2-AlphaDiversityFormat-7wsnk2rf

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command:

ssu -i /tmp/qiime2/lauro/data/e82b63ce-dca2-43dc-a176-7776e7d90b77/data/feature-table.biom -t /tmp/qiime2/lauro/data/a727d7c8-d2c1-4d5e-9888-5efb4197a91c/data/tree.nwk -m unweighted -o /tmp/q2-LSMatFormat-01t3kft9



Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command:

ssu -i /tmp/qiime2/lauro/data/e82b63ce-dca2-43dc-a176-7776e7d90b77/data/feature-table.biom -t /tmp/qiime2/lauro/data/a727d7c8-d2c1-4d5e-9888-5efb4197a91c/data/tree.nwk -m weighted_unnormalized -o /tmp/q2-LSMatFormat-a4tkmyyh




In [13]:
results_info = [("rarefied_table", "FeatureTable[Frequency]", "The resulting rarefied feature table."),
("faith_pd_vector", "SampleData[AlphaDiversity]", "Vector of Faith PD values by sample."),
("observed_features_vector", "SampleData[AlphaDiversity]", "Vector of Observed Features values by sample."),
("shannon_vector", "SampleData[AlphaDiversity]", "Vector of Shannon diversity values by sample."),
("evenness_vector", "SampleData[AlphaDiversity]", "Vector of Pielou's evenness values by sample."),
("unweighted_unifrac_distance_matrix", "DistanceMatrix", "Matrix of unweighted UniFrac distances between pairs of samples."),
("weighted_unifrac_distance_matrix", "DistanceMatrix", "Matrix of weighted UniFrac distances between pairs of samples."),
("jaccard_distance_matrix", "DistanceMatrix", "Matrix of Jaccard distances between pairs of samples."),
("bray_curtis_distance_matrix", "DistanceMatrix", "Matrix of Bray-Curtis distances between pairs of samples."),
("unweighted_unifrac_pcoa_results", "PCoAResults", "PCoA matrix computed from unweighted UniFrac distances between samples."),
("weighted_unifrac_pcoa_results", "PCoAResults", "PCoA matrix computed from weighted UniFrac distances between samples."),
("jaccard_pcoa_results", "PCoAResults", "PCoA matrix computed from Jaccard distances between samples."),
("bray_curtis_pcoa_results", "PCoAResults", "PCoA matrix computed from Bray-Curtis distances between samples."),
("unweighted_unifrac_emperor", "Visualization", "Emperor plot of the PCoA matrix computed from unweighted UniFrac."),
("weighted_unifrac_emperor", "Visualization", "Emperor plot of the PCoA matrix computed from weighted UniFrac."),
("jaccard_emperor", "Visualization", "Emperor plot of the PCoA matrix computed from Jaccard."),
("bray_curtis_emperor", "Visualization", "Emperor plot of the PCoA matrix computed from Bray-Curtis.")]

In [14]:
distance_matrix = dict()
for i, info in enumerate(results_info):
    r_id, r_type, r_desc = info
    #print(i, r_id, r_type)
    file_name = f"{r_id}.qzv"
    if r_type == "FeatureTable[Frequency]":
        pass
    elif r_type == "DistanceMatrix":
        distance_matrix[r_id] = results[i]
    elif r_id.endswith('emperor'):
        print(i, r_id, r_type)
        print(f"--- {r_desc} ---")
        file_name = os.path.join(beta_path, file_name)
        print(f'Saving emperor file at: {file_name}\n')
        results[i].save(filepath=file_name)

13 unweighted_unifrac_emperor Visualization
--- Emperor plot of the PCoA matrix computed from unweighted UniFrac. ---
Saving emperor file at: /mnt/nupeb/rede-micro/redemicro-thayane/experiments/thayane-meno-hist/qiime-artifacts/beta-analysis/unweighted_unifrac_emperor.qzv

14 weighted_unifrac_emperor Visualization
--- Emperor plot of the PCoA matrix computed from weighted UniFrac. ---
Saving emperor file at: /mnt/nupeb/rede-micro/redemicro-thayane/experiments/thayane-meno-hist/qiime-artifacts/beta-analysis/weighted_unifrac_emperor.qzv

15 jaccard_emperor Visualization
--- Emperor plot of the PCoA matrix computed from Jaccard. ---
Saving emperor file at: /mnt/nupeb/rede-micro/redemicro-thayane/experiments/thayane-meno-hist/qiime-artifacts/beta-analysis/jaccard_emperor.qzv



16 bray_curtis_emperor Visualization
--- Emperor plot of the PCoA matrix computed from Bray-Curtis. ---
Saving emperor file at: /mnt/nupeb/rede-micro/redemicro-thayane/experiments/thayane-meno-hist/qiime-artifacts/beta-analysis/bray_curtis_emperor.qzv



### Alpha diversity correlation

This method only processes `numeric` metadata columns.
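
The two correlation methods used in the next cell differ in what they measure: Pearson captures linear association, while Spearman is Pearson computed on ranks and so captures any monotonic association. The sketch below illustrates that relationship with plain Python; the helper names and the example values are assumptions, not part of the notebook's data.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical alpha diversity values and a numeric metadata column.
shannon = [1.0, 2.0, 3.0, 4.0]
age = [1.0, 4.0, 9.0, 16.0]  # monotonic but not linear in shannon

print(round(pearson(shannon, age), 3), round(spearman(shannon, age), 3))
```

Because `age` increases with `shannon` but not linearly, the Spearman coefficient reaches 1 while the Pearson coefficient stays below it.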


In [15]:
methods = ('spearman', 'pearson')
numerics_cols = metadata_qa.filter_columns(column_type='numeric')
if numerics_cols.column_count > 0:
    for metric, alpha_values in alpha_diversities.items():
        for method in methods:
            try:
                corr_view = alpha_correlation(alpha_diversity=alpha_values, metadata=numerics_cols, 
                                          method=method, intersect_ids=True).visualization
                view_path = os.path.join(alpha_path, f'alpha-correlation-{metric}-{method}.qzv')
                corr_view.save(view_path)
                corr_view
                print(f"DONE: Calculating alpha correlation: {metric} {method}")
            except Exception as e:
                print(f"ERROR: Calculating alpha correlation: {metric} {method}")

DONE: Calculating alpha correlation: chao1 spearman



DONE: Calculating alpha correlation: chao1 pearson
ERROR: Calculating alpha correlation: chao1_ci spearman
ERROR: Calculating alpha correlation: chao1_ci pearson
DONE: Calculating alpha correlation: observed_features spearman


DONE: Calculating alpha correlation: observed_features pearson



DONE: Calculating alpha correlation: shannon spearman
DONE: Calculating alpha correlation: shannon pearson


DONE: Calculating alpha correlation: simpson spearman



DONE: Calculating alpha correlation: simpson pearson
DONE: Calculating alpha correlation: simpson_e spearman


DONE: Calculating alpha correlation: simpson_e pearson



## Alpha diversity comparisons

Visually and statistically compare groups of alpha diversity values.

[diversity alpha_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/alpha-group-significance/)
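
To my understanding, this visualizer compares groups with the Kruskal-Wallis test, a rank-based alternative to one-way ANOVA. The following sketch computes the H statistic by hand under the simplifying assumption that there are no tied values; the `kruskal_h` helper and the Shannon values are hypothetical.

```python
def kruskal_h(groups):
    """Kruskal-Wallis H statistic for {group: [values]}, assuming no ties."""
    pooled = sorted(v for values in groups.values() for v in values)
    n = len(pooled)
    # Rank of each value in the pooled data (valid only without ties).
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    weighted = 0.0
    for values in groups.values():
        rank_sum = sum(rank[v] for v in values)
        weighted += rank_sum ** 2 / len(values)
    return 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Invented Shannon diversity values for two groups of samples.
separated = {"control": [1.0, 2.0, 3.0], "treated": [4.0, 5.0, 6.0]}
interleaved = {"control": [1.0, 4.0, 5.0], "treated": [2.0, 3.0, 6.0]}

print(round(kruskal_h(separated), 3))    # groups do not overlap: large H
print(round(kruskal_h(interleaved), 3))  # groups overlap: H close to 0
```

H grows as the groups' rank sums move apart; a p-value is then obtained by comparing H with a chi-squared distribution with (number of groups - 1) degrees of freedom.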

In [16]:
for metric, alpha_values in alpha_diversities.items():
    print(f"Processing alpha_group_significance: {metric}")
    try:
        significance_view = alpha_group_significance(alpha_diversity=alpha_values, metadata=metadata_qa).visualization
        view_path = os.path.join(alpha_path, f'alpha-group-significance-{metric}.qzv')
        significance_view.save(view_path)
        significance_view
        print(f"DONE: Calculating alpha group significance: {metric}")
    except Exception as e:
        print(f"ERROR: Calculating alpha group significance: {metric}")

Processing alpha_group_significance: chao1


DONE: Calculating alpha group significance: chao1
Processing alpha_group_significance: chao1_ci
ERROR: Calculating alpha group significance: chao1_ci
Processing alpha_group_significance: observed_features


DONE: Calculating alpha group significance: observed_features
Processing alpha_group_significance: shannon


DONE: Calculating alpha group significance: shannon
Processing alpha_group_significance: simpson


DONE: Calculating alpha group significance: simpson
Processing alpha_group_significance: simpson_e


DONE: Calculating alpha group significance: simpson_e


## Beta diversity analysis

#### Reference
- [diversity beta](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta/): Computes a user-specified beta diversity metric for all pairs of samples in a feature table.
- [Beta diversity metrics](http://scikit-bio.org/docs/0.2.0/generated/skbio.diversity.beta.html)

- Metric Choices('aitchison', 'braycurtis', 'canberra', 'canberra_adkins', 'chebyshev', 'cityblock', 'correlation', 'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'jensenshannon', 'kulsinski', 'matching', 'minkowski', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'yule')
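
The listed metrics differ mainly in whether they use abundances or only presence/absence. As a small illustration under made-up counts, the sketch below computes Bray-Curtis (abundance-based) and Jaccard (presence/absence) dissimilarities by hand; the helper functions are assumptions for illustration and not the library code that `beta` calls.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two {feature: count} samples."""
    features = set(u) | set(v)
    num = sum(abs(u.get(f, 0) - v.get(f, 0)) for f in features)
    den = sum(u.get(f, 0) + v.get(f, 0) for f in features)
    return num / den


def jaccard(u, v):
    """Jaccard distance on presence/absence of features."""
    present_u = {f for f, n in u.items() if n > 0}
    present_v = {f for f, n in v.items() if n > 0}
    shared = present_u & present_v
    return 1 - len(shared) / len(present_u | present_v)


# Invented feature counts for two samples.
s1 = {"otu1": 10, "otu2": 0, "otu3": 5}
s2 = {"otu1": 2, "otu2": 8, "otu3": 5}

print(round(bray_curtis(s1, s2), 3))  # 16 / 30
print(round(jaccard(s1, s2), 3))      # 1 - 2 / 3
```

Here Jaccard only sees that one of three features is not shared, while Bray-Curtis also penalizes the shift in abundance of `otu1`.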

In [17]:
# Every metric accepted by `beta`, kept for reference only.
all_metrics = ('aitchison', 'braycurtis', 'canberra', 'canberra_adkins', 'chebyshev', 'cityblock', 'correlation', 'cosine', 'dice', 'euclidean', 'hamming', 'jaccard', 'jensenshannon', 'kulsinski', 'matching', 'minkowski', 'rogerstanimoto', 'russellrao', 'seuclidean', 'sokalmichener', 'sokalsneath', 'sqeuclidean', 'yule')
# Subset of metrics actually computed in this run.
metrics = ('euclidean', 'dice', 'braycurtis', 'correlation', 'cosine', 'matching', 'jaccard')
beta_diversities = dict()
for metric in metrics:
    print(f"Calculating beta diversity: {metric}")
    try:
        beta_diversity = beta(table=tabs, metric=metric, n_jobs=6, pseudocount=1).distance_matrix
        beta_diversities[metric] = beta_diversity
        # Save the DistanceMatrix artifact
        file_path = os.path.join(beta_path, f'beta-values-{metric}.qza')
        beta_diversity.save(file_path)
        print(f"DONE: Calculating beta diversity: {metric}")
    except Exception as e:
        print(f"ERROR: Calculating beta diversity: {metric}")

Calculating beta diversity: euclidean


DONE: Calculating beta diversity: euclidean
Calculating beta diversity: dice


DONE: Calculating beta diversity: dice
Calculating beta diversity: braycurtis


DONE: Calculating beta diversity: braycurtis
Calculating beta diversity: correlation
ERROR: Calculating beta diversity: correlation
Calculating beta diversity: cosine




ERROR: Calculating beta diversity: cosine
Calculating beta diversity: matching


DONE: Calculating beta diversity: matching
Calculating beta diversity: jaccard


DONE: Calculating beta diversity: jaccard




### Beta group significance

- [diversity beta_group_significance](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta-group-significance/): Determine whether groups of samples are significantly different from one another using a permutation-based statistical test.
- Marti J. Anderson. A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26(1):32–46, 2001. doi:10.1111/j.1442-9993.2001.01070.pp.x.
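
The permutation test described by Anderson (PERMANOVA) compares a pseudo-F statistic computed from the distance matrix with the values obtained after shuffling the group labels. The following is a minimal sketch of that procedure in plain Python, not the code QIIME 2 runs; the function names, the toy distance matrix, and the labels are assumptions for illustration.

```python
import random


def pseudo_f(dist, labels):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares: same quantity per group, divided by group size.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss = sum(dist[i][j] ** 2 for p, i in enumerate(idx) for j in idx[p + 1:])
        ss_within += ss / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, permutations=999, seed=0):
    """Return the observed pseudo-F and its permutation p-value."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data: six samples placed on a line, two well-separated groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in positions] for p in positions]
labels = ["A", "A", "A", "B", "B", "B"]

f_stat, p_value = permanova(dist, labels)
print(f_stat, p_value)
```

With only six samples there are few distinct label arrangements, so the p-value cannot become very small even for clearly separated groups; this is one reason small groups limit the power of pairwise tests.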

In [18]:
methods = ('permanova', 'anosim', 'permdisp')
for method in methods:
    for metric, beta_diversity in beta_diversities.items():
        print(f'Calculating beta group significance with method {method} and metric {metric}')
        try:
            beta_view = beta_group_significance(distance_matrix=beta_diversity, 
                                                metadata=metadata_qa.get_column(class_col), 
                                                pairwise=True, method=method).visualization
            view_name = os.path.join(beta_path, f'beta-group-significance-{metric}-{method}.qzv')
            beta_view.save(view_name)
            print(f"DONE: Calculating beta group significance: {method} {metric}")
        except Exception as e:
            print(f"ERROR: Calculating beta group significance: {method} {metric}")

Calculating beta group significance with method permanova and metric euclidean


DONE: Calculating beta group significance: permanova euclidean
Calculating beta group significance with method permanova and metric dice


DONE: Calculating beta group significance: permanova dice
Calculating beta group significance with method permanova and metric braycurtis


DONE: Calculating beta group significance: permanova braycurtis
Calculating beta group significance with method permanova and metric matching


DONE: Calculating beta group significance: permanova matching
Calculating beta group significance with method permanova and metric jaccard


DONE: Calculating beta group significance: permanova jaccard
Calculating beta group significance with method anosim and metric euclidean


DONE: Calculating beta group significance: anosim euclidean
Calculating beta group significance with method anosim and metric dice


DONE: Calculating beta group significance: anosim dice
Calculating beta group significance with method anosim and metric braycurtis


DONE: Calculating beta group significance: anosim braycurtis
Calculating beta group significance with method anosim and metric matching


DONE: Calculating beta group significance: anosim matching
Calculating beta group significance with method anosim and metric jaccard


DONE: Calculating beta group significance: anosim jaccard
Calculating beta group significance with method permdisp and metric euclidean


DONE: Calculating beta group significance: permdisp euclidean
Calculating beta group significance with method permdisp and metric dice


DONE: Calculating beta group significance: permdisp dice
Calculating beta group significance with method permdisp and metric braycurtis


DONE: Calculating beta group significance: permdisp braycurtis
Calculating beta group significance with method permdisp and metric matching


DONE: Calculating beta group significance: permdisp matching
Calculating beta group significance with method permdisp and metric jaccard


DONE: Calculating beta group significance: permdisp jaccard


In [19]:
# Expand tests using UNIFRAC metrics
methods = ('permanova', 'anosim', 'permdisp')
for method in methods:
    for metric, beta_diversity in distance_matrix.items():
        print(f'Calculating beta group significance with method {method} and metric {metric}')
        try:
            beta_view = beta_group_significance(distance_matrix=beta_diversity, 
                                                metadata=metadata_qa.get_column(class_col), 
                                                pairwise=True, method=method).visualization
            view_name = os.path.join(beta_path, f'beta-group-significance-{metric}-{method}.qzv')
            beta_view.save(view_name)
            print(f"DONE: Calculating beta group significance: {method} {metric}")
        except Exception as e:
            print(f"ERROR: Calculating beta group significance: {method} {metric}")

Calculating beta group significance with method permanova and metric unweighted_unifrac_distance_matrix


DONE: Calculating beta group significance: permanova unweighted_unifrac_distance_matrix
Calculating beta group significance with method permanova and metric weighted_unifrac_distance_matrix


DONE: Calculating beta group significance: permanova weighted_unifrac_distance_matrix
Calculating beta group significance with method permanova and metric jaccard_distance_matrix


DONE: Calculating beta group significance: permanova jaccard_distance_matrix
Calculating beta group significance with method permanova and metric bray_curtis_distance_matrix


DONE: Calculating beta group significance: permanova bray_curtis_distance_matrix
Calculating beta group significance with method anosim and metric unweighted_unifrac_distance_matrix


DONE: Calculating beta group significance: anosim unweighted_unifrac_distance_matrix
Calculating beta group significance with method anosim and metric weighted_unifrac_distance_matrix


DONE: Calculating beta group significance: anosim weighted_unifrac_distance_matrix
Calculating beta group significance with method anosim and metric jaccard_distance_matrix


DONE: Calculating beta group significance: anosim jaccard_distance_matrix
Calculating beta group significance with method anosim and metric bray_curtis_distance_matrix


DONE: Calculating beta group significance: anosim bray_curtis_distance_matrix
Calculating beta group significance with method permdisp and metric unweighted_unifrac_distance_matrix


DONE: Calculating beta group significance: permdisp unweighted_unifrac_distance_matrix
Calculating beta group significance with method permdisp and metric weighted_unifrac_distance_matrix


DONE: Calculating beta group significance: permdisp weighted_unifrac_distance_matrix
Calculating beta group significance with method permdisp and metric jaccard_distance_matrix


DONE: Calculating beta group significance: permdisp jaccard_distance_matrix
Calculating beta group significance with method permdisp and metric bray_curtis_distance_matrix


DONE: Calculating beta group significance: permdisp bray_curtis_distance_matrix


### Beta rarefaction

- [diversity beta_rarefaction](https://docs.qiime2.org/2022.8/plugins/available/diversity/beta-rarefaction/): Repeatedly rarefy a feature table to compare beta diversity results within a given rarefaction depth. For a given beta diversity metric, this visualizer provides an Emperor jackknifed PCoA plot, samples clustered by UPGMA or neighbor joining with support calculation, and a heatmap showing the correlation between rarefaction trials of that beta diversity metric.