# Phylogeny Analysis

#### Notebook overview 

[1. Setup](#setup)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[1.1 _Import data_](#import_data)<br>
[2. De-novo Phylogeny analysis](#De_novo_Phylogeny_analysis)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.1 _Sequence alignment_](#sequence_alignment)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.2 _Alignment masking_](#alignment_masking)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.3 _Tree construction and visualization using FastTree_](#FastTree)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.4 _Bootstrapping with RAxML_](#bootstrapping)<br>
&nbsp;&nbsp;&nbsp;&nbsp;[2.5 _Tree construction with RAxML tree search_](#RaxMl)<br>
[3. Fragment insertion](#fragment_insertion)<br>



<a id='setup'></a>

## 1. Setup

In [1]:
from qiime2 import Visualization
import os
import pandas as pd
import numpy as np

import qiime2 as q2

%matplotlib inline
# Directory for the input data and all results produced by this notebook
data_dir = 'project_data'

# Create the directory if it does not exist yet
os.makedirs(data_dir, exist_ok=True)

<a id='import_data'></a>
### 1.1 Import data

In [3]:
# Filtered representative sequences
! wget -nv -O $data_dir/rep-seqs.qza 'https://polybox.ethz.ch/index.php/s/MBLSUQXzglnn66u/download?path=%2F&files=Sequences_rep_set.qza'

# Taxonomy file generated from SILVA
! wget -nv -O $data_dir/taxonomy_1.qza 'https://polybox.ethz.ch/index.php/s/MBLSUQXzglnn66u/download?path=%2F&files=taxonomy_1.qza'

2022-11-12 18:30:03 URL:https://polybox.ethz.ch/index.php/s/MBLSUQXzglnn66u/download?path=%2F&files=Sequences_rep_set.qza [390624/390624] -> "project_data/rep-seqs.qza" [1]
2022-11-12 18:30:04 URL:https://polybox.ethz.ch/index.php/s/MBLSUQXzglnn66u/download?path=%2F&files=Feature_table.qza [504534/504534] -> "project_data/table.qza" [1]
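
A quick way to confirm that a download produced a usable file is to look inside the archive. QIIME 2 artifacts are zip archives whose UUID-named root directory holds a `metadata.yaml` with the artifact's `uuid`, `type`, and `format`. The sketch below reads that file with the standard library only. The helper name `read_artifact_metadata` is hypothetical, and the line-based parsing assumes the flat `key: value` layout of that file rather than full YAML.

```python
import zipfile


def read_artifact_metadata(path):
    """Return the key/value pairs of an artifact's top-level metadata.yaml.

    Hypothetical helper (not part of QIIME 2): assumes the file is a zip
    archive laid out as <uuid>/metadata.yaml with flat "key: value" lines.
    """
    with zipfile.ZipFile(path) as archive:
        # Keep only the metadata file directly below the root directory;
        # files nested deeper (e.g. under provenance/) are skipped.
        candidates = [
            name for name in archive.namelist()
            if name.count("/") == 1 and name.endswith("/metadata.yaml")
        ]
        if not candidates:
            raise ValueError(f"{path} has no top-level metadata.yaml")
        text = archive.read(candidates[0]).decode("utf-8")

    metadata = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata
```

Calling `read_artifact_metadata(f'{data_dir}/rep-seqs.qza')` after the download should return a dictionary whose `type` entry names the semantic type of the artifact. A failed download (for example an HTML error page saved under the `.qza` name) would raise `zipfile.BadZipFile` instead.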


<a id='De_novo_Phylogeny_analysis'></a>

## 2. De-novo Phylogeny analysis


<a id='sequence_alignment'></a>
### 2.1 Sequence alignment

In [6]:
! qiime alignment mafft \
    --i-sequences $data_dir/rep-seqs.qza \
    --o-alignment $data_dir/aligned-rep-seqs.qza

Saved FeatureData[AlignedSequence] to: project_data/aligned-rep-seqs.qza

<a id='alignment_masking'></a>
### 2.2 Alignment masking

In [8]:
! qiime alignment mask \
    --i-alignment $data_dir/aligned-rep-seqs.qza \
    --o-masked-alignment $data_dir/masked-aligned-rep-seqs.qza

Saved FeatureData[AlignedSequence] to: project_data/masked-aligned-rep-seqs.qza

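Masking removes alignment columns that are mostly gaps or too variable to be phylogenetically informative. The toy function below illustrates the idea on a dictionary of aligned strings. It is a simplified sketch, not the QIIME 2 implementation: the name `mask_alignment`, the threshold defaults, and the conservation score (share of sequences carrying the most common residue) are assumptions made for illustration, so compare them with `qiime alignment mask --help` before relying on them.

```python
from collections import Counter

GAP_CHARS = "-."


def mask_alignment(seqs, max_gap_frequency=1.0, min_conservation=0.4):
    """Drop alignment columns that are too gappy or too poorly conserved.

    Toy illustration: `seqs` maps sequence names to equal-length aligned
    strings. Thresholds and the conservation score are assumptions.
    """
    names = list(seqs)
    n_seqs = len(names)
    length = len(seqs[names[0]])
    if any(len(seqs[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")

    kept = []
    for i in range(length):
        column = [seqs[name][i] for name in names]
        residues = [c for c in column if c not in GAP_CHARS]
        gap_frequency = 1 - len(residues) / n_seqs
        # Conservation = share of sequences carrying the most common residue.
        top_count = Counter(residues).most_common(1)[0][1] if residues else 0
        conservation = top_count / n_seqs
        if gap_frequency <= max_gap_frequency and conservation >= min_conservation:
            kept.append(i)

    return {name: "".join(seqs[name][i] for i in kept) for name in names}


toy = {"a": "ACG-T", "b": "ACGAT", "c": "ATG-A", "d": "AGG-C"}
masked = mask_alignment(toy)
print(masked["a"])  # ACGT: the fourth column (three gaps, one A) is dropped
```

In the toy alignment, the fourth column is removed because only one of four sequences carries a residue there, which puts its conservation score (0.25) below the 0.4 threshold.
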
<a id='FastTree'></a>
### 2.3 Tree construction and visualization using FastTree

In [9]:
! qiime phylogeny fasttree \
    --i-alignment $data_dir/masked-aligned-rep-seqs.qza \
    --o-tree $data_dir/fasttree-tree.qza

# Root the tree at its midpoint
! qiime phylogeny midpoint-root \
    --i-tree $data_dir/fasttree-tree.qza \
    --o-rooted-tree $data_dir/fasttree-tree-rooted.qza

Saved Phylogeny[Unrooted] to: project_data/fasttree-tree.qza
Saved Phylogeny[Rooted] to: project_data/fasttree-tree-rooted.qza

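Midpoint rooting places the root halfway along the longest path between any two tips of the unrooted tree. The sketch below shows that idea on a small hand-written tree given as a list of `(node, node, branch_length)` edges. The function name `find_midpoint` and the edge-list representation are assumptions for illustration; this is not how `qiime phylogeny midpoint-root` is implemented internally.

```python
from collections import defaultdict


def _distances(adj, start):
    """Return distance and parent maps for every node, walking from `start`."""
    dist = {start: 0.0}
    parent = {start: None}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour, length in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + length
                parent[neighbour] = node
                stack.append(neighbour)
    return dist, parent


def find_midpoint(edges):
    """Locate the midpoint of the longest tip-to-tip path in an unrooted tree.

    Returns (tip_a, tip_b, path_length, (u, v, offset)), where the midpoint
    lies on edge u-v at distance `offset` from u.
    """
    adj = defaultdict(list)
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))
    tips = [node for node in adj if len(adj[node]) == 1]
    if len(tips) < 2:
        raise ValueError("need at least two tips")

    # Two passes: the tip farthest from any start tip is one end of the
    # longest path; the tip farthest from that end is the other end.
    dist_start, _ = _distances(adj, tips[0])
    tip_a = max(tips, key=lambda t: dist_start[t])
    dist_a, parent = _distances(adj, tip_a)
    tip_b = max(tips, key=lambda t: dist_a[t])

    # Walk from tip_b back toward tip_a until the edge containing the midpoint.
    half = dist_a[tip_b] / 2
    node = tip_b
    while dist_a[parent[node]] > half:
        node = parent[node]
    u = parent[node]
    return tip_a, tip_b, dist_a[tip_b], (u, node, half - dist_a[u])


toy_edges = [("A", "n1", 1.0), ("B", "n1", 2.0), ("n1", "n2", 1.0),
             ("C", "n2", 5.0), ("D", "n2", 1.0)]
print(find_midpoint(toy_edges))  # ('C', 'B', 8.0, ('C', 'n2', 4.0))
```

In the toy tree the longest path runs from C to B (length 8.0), so the root would be placed on the branch leading to C, 4.0 units away from that tip.
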
#### 2.3.1 Visualization using QIIME 2

In [11]:
! qiime empress tree-plot \
    --i-tree $data_dir/fasttree-tree-rooted.qza \
    --m-feature-metadata-file $data_dir/taxonomy_1.qza \
    --o-visualization $data_dir/fasttree-tree-rooted.qzv

Saved Visualization to: project_data/fasttree-tree-rooted.qzv

In [2]:
Visualization.load(f'{data_dir}/fasttree-tree-rooted.qzv')

### Note

Bootstrapping, RAxML tree search, and fragment insertion could not be executed because of insufficient computational resources. The commands in sections 2.4, 2.5, and 3 are therefore shown without output.

<a id='bootstrapping'></a>
### 2.4 Bootstrapping with RAxML

! qiime phylogeny raxml-rapid-bootstrap \
    --i-alignment $data_dir/masked-aligned-rep-seqs.qza \
    --p-seed 1723 \
    --p-rapid-bootstrap-seed 9384 \
    --p-bootstrap-replicates 100 \
    --p-substitution-model GTRCAT \
    --p-n-threads 3 \
    --o-tree $data_dir/raxml-cat-bootstrap-tree.qza

! qiime phylogeny midpoint-root \
    --i-tree $data_dir/raxml-cat-bootstrap-tree.qza \
    --o-rooted-tree $data_dir/raxml-cat-bootstrap-tree-rooted.qza

! qiime empress tree-plot \
    --i-tree $data_dir/raxml-cat-bootstrap-tree-rooted.qza \
    --m-feature-metadata-file $data_dir/taxonomy_1.qza \
    --o-visualization $data_dir/raxml-cat-bootstrap-tree-rooted.qzv

Visualization.load(f'{data_dir}/raxml-cat-bootstrap-tree-rooted.qzv')
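
Since the bootstrap cells above were not run, the following toy sketch may help illustrate what bootstrap support measures. Each replicate resamples the alignment columns with replacement, a tree is built from every replicate, and the support of a clade is the fraction of replicate trees that contain it. This is the classic nonparametric bootstrap, swapped in here for clarity; RAxML's rapid bootstrap uses its own faster search heuristics. The names `bootstrap_alignment` and `clade_support` are hypothetical, and tree building itself is left out.

```python
import random


def bootstrap_alignment(seqs, seed):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    rng = random.Random(seed)
    length = len(next(iter(seqs.values())))
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in seqs.items()}


def clade_support(clade, replicate_clades):
    """Fraction of replicate trees whose clade set contains `clade`.

    `replicate_clades` holds one set of frozensets (tip names) per replicate.
    """
    target = frozenset(clade)
    hits = sum(target in clades for clades in replicate_clades)
    return hits / len(replicate_clades)


toy = {"a": "ACGT", "b": "AGGT", "c": "TCGA"}
replicate = bootstrap_alignment(toy, seed=9384)
print(replicate)  # same length as the input; columns drawn with replacement
```

Fixing the seed, as `--p-seed` and `--p-rapid-bootstrap-seed` do in the command above, makes the resampling reproducible: the same seed yields the same replicate.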

<a id='RaxMl'></a>
### 2.5 Tree construction with RAxML tree search

! qiime phylogeny raxml \
    --i-alignment $data_dir/masked-aligned-rep-seqs.qza \
    --p-substitution-model GTRCAT \
    --p-seed 1723 \
    --p-n-searches 3 \
    --o-tree $data_dir/raxml-cat-searches-tree.qza

<a id='fragment_insertion'></a>
## 3. Fragment insertion

! wget -nv -O $data_dir/sepp-refs-gg-13-8.qza https://data.qiime2.org/2021.4/common/sepp-refs-gg-13-8.qza

! qiime fragment-insertion sepp \
    --i-representative-sequences $data_dir/rep-seqs.qza \
    --i-reference-database $data_dir/sepp-refs-gg-13-8.qza \
    --p-threads 2 \
    --o-tree $data_dir/sepp-tree.qza \
    --o-placements $data_dir/sepp-tree-placements.qza

! qiime empress tree-plot \
    --i-tree $data_dir/sepp-tree.qza \
    --m-feature-metadata-file $data_dir/taxonomy_1.qza \
    --o-visualization $data_dir/sepp-tree.qzv

Visualization.load(f'{data_dir}/sepp-tree.qzv')