# Phylo-CTF-analysis-ASV

**Note**: This notebook assumes you have installed [QIIME2](https://qiime2.org/) using one of the procedures in the [install documents](https://docs.qiime2.org/2020.2/install/). It also assumes you have installed [Qurro](https://github.com/biocore/qurro), [DEICODE](https://github.com/biocore/DEICODE), and [gemelli](https://github.com/biocore/gemelli).

First, we will make a tutorial directory, download the data, and move the files to the `ECC2P/data` directory:

```bash
mkdir ECC2P
```
```bash
# move downloaded data here
mkdir ECC2P/data
```

Next, we will convert the count table to BIOM format, import it as a QIIME2 artifact, and load it with the QIIME2 Python API.


In [27]:
# ! pip install gemelli
# ! pip install deicode 
# ! pip install qurro
# ! pip install empress

In [28]:
!mkdir -p ../../Results/Feature_table
!mkdir -p ../../Results/Dist_matrix

!biom convert \
    -i ../../Data/637/function/637_functional_kos_count_table.tsv \
    -o ../../Results/Feature_table/637_functional_kos_count_table.biom \
    --to-hdf5 \
    --table-type "OTU table"

!qiime tools import \
    --input-path ../../Results/Feature_table/637_functional_kos_count_table.biom \
    --output-path ../../Results/Feature_table/637_functional_kos_count_table.qza \
    --type "FeatureTable[Frequency]" 

Imported ../../Results/Feature_table/637_functional_kos_count_table.biom as BIOMV210DirFmt to ../../Results/Feature_table/637_functional_kos_count_table.qza

In [29]:
import os
import warnings
import qiime2 as q2
import pandas as pd
from qiime2.plugins.feature_table.actions import filter_seqs
from qiime2.plugins.feature_table.actions import filter_samples
from qiime2.plugins.feature_table.actions import summarize

# hide pandas Future/Deprecation Warning(s) for tutorial
warnings.filterwarnings("ignore", category=DeprecationWarning) 
warnings.simplefilter(action='ignore', category=FutureWarning)

# import table(s)
table = q2.Artifact.load('../../Results/Feature_table/637_functional_kos_count_table.qza')
# import metadata
metadata = q2.Metadata.load('../../Data/637/637_metadata.txt')
# make directory to store results
output_path = '../../Results/Dist_matrix/637_functional_rPCA'
os.makedirs(output_path, exist_ok=True)


Next, we will demonstrate the issues with using conventional dimensionality reduction methods on time series data. To do this, we will perform PCoA dimensionality reduction on weighted and unweighted UniFrac $\beta$-diversity distances. We will also run Aitchison Robust PCA (RPCA) with _DEICODE_, which is built on the same framework as compositional tensor factorization (CTF) but does not account for repeated measures.
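
To make the preprocessing behind Aitchison Robust PCA more concrete, the sketch below applies a robust centered log-ratio (rclr) transform to a toy count matrix: logs are taken only over observed (nonzero) entries, and each sample is centered by the mean of its observed log values. The `rclr_sketch` helper and the toy counts are illustrative assumptions, not DEICODE's own code.

```python
import numpy as np


def rclr_sketch(counts):
    """Hypothetical helper: robust clr on a samples x features count matrix."""
    counts = np.asarray(counts, dtype=float)
    observed = counts > 0
    # Take logs of observed entries only; zeros are treated as missing (NaN).
    logged = np.full(counts.shape, np.nan)
    logged[observed] = np.log(counts[observed])
    # Center each sample (row) by the mean of its observed log values.
    row_means = np.nanmean(logged, axis=1, keepdims=True)
    return logged - row_means


# Toy counts (assumed for illustration): 3 samples x 3 features.
toy_counts = np.array([[10, 0, 100],
                       [5, 50, 0],
                       [1, 1, 1]])
transformed = rclr_sketch(toy_counts)
print(transformed)
```

Zeros stay missing rather than being replaced by a pseudocount, which is why the subsequent factorization has to be fit on the observed entries only.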


### Sample filtering based on the metadata

In [30]:
table.view(pd.DataFrame).shape

(637, 3996)
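
The shape above reads as 637 samples (rows) by 3996 features (columns), since a feature table viewed as a `pandas.DataFrame` has samples as rows. As a minimal sketch of what metadata-based sample filtering does to such a table, the example below uses a toy table and a toy metadata frame; the `group` column, the sample IDs, and the KO IDs are assumptions for illustration. Within QIIME2, the `filter_samples` action imported earlier would be the usual route.

```python
import pandas as pd

# Toy table (assumed): samples as rows, features (KOs) as columns.
toy_table = pd.DataFrame(
    {
        "K00001": [12, 0, 7, 3],
        "K00002": [0, 5, 9, 0],
        "K00003": [4, 4, 0, 1],
        "K00004": [0, 2, 0, 1],
    },
    index=["S1", "S2", "S3", "S4"],
)
# Toy metadata (assumed) indexed by the same sample IDs.
toy_metadata = pd.DataFrame(
    {"group": ["case", "control", "case", "control"]},
    index=["S1", "S2", "S3", "S4"],
)

# Keep only the samples whose metadata matches the condition.
keep_ids = toy_metadata.index[toy_metadata["group"] == "case"]
filtered = toy_table.loc[keep_ids]
# Drop features that are all zero once samples have been removed.
filtered = filtered.loc[:, filtered.sum(axis=0) > 0]
# Per-sample total count, useful for spotting low-depth samples.
depth = filtered.sum(axis=1)

print(filtered.shape)
print(depth)
```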

### RPCA based on all samples

In [31]:
from qiime2.plugins.deicode.actions import rpca
from qiime2.plugins.emperor.actions import (plot, biplot)
from qiime2.plugins.diversity.actions import (beta_phylogenetic, pcoa, beta_group_significance)

# run RPCA and plot with emperor
rpca_biplot, rpca_distance = rpca(table)
rpca_biplot_emperor = biplot(rpca_biplot, metadata)
# save the ordination, the distance matrix, and the Emperor visualization
rpca_biplot.save(os.path.join(output_path, 'rpca.biplot.qza'))
rpca_distance.save(os.path.join(output_path, 'rpca.dist.qza'))
rpca_biplot_emperor.visualization.save(os.path.join(output_path, 'RPCA-biplot.qzv'))


'../../Results/Dist_matrix/637_functional_rPCA/RPCA-biplot.qzv'

In [32]:
from skbio.stats.distance import DistanceMatrix
distance_matrix = rpca_distance.view(DistanceMatrix)
distance_df = distance_matrix.to_data_frame()
distance_df.to_csv(os.path.join(output_path, '637_functional_rpca_dist.txt'), sep='\t', index=True)
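
A quick way to sanity-check a tab-separated distance matrix like the one written above is to read it back and confirm that it is square, symmetric, and has a zero diagonal. The sketch below does this on a toy matrix through an in-memory buffer; the sample IDs and distances are assumptions, and with real output you would pass the file path to `pd.read_csv` instead.

```python
import io

import numpy as np
import pandas as pd

# Toy distance matrix (assumed values) with matching row and column IDs.
ids = ["S1", "S2", "S3"]
toy_dist = pd.DataFrame(
    [[0.0, 1.5, 2.0],
     [1.5, 0.0, 0.5],
     [2.0, 0.5, 0.0]],
    index=ids,
    columns=ids,
)

# Write and re-read the matrix the same way the notebook exports it.
buffer = io.StringIO()
toy_dist.to_csv(buffer, sep="\t", index=True)
buffer.seek(0)
reloaded = pd.read_csv(buffer, sep="\t", index_col=0)

# Basic checks: same IDs on both axes, symmetric values, zero diagonal.
ids_match = list(reloaded.index) == list(reloaded.columns)
is_symmetric = np.allclose(reloaded.values, reloaded.values.T)
has_zero_diagonal = np.allclose(np.diag(reloaded.values), 0.0)

print(ids_match, is_symmetric, has_zero_diagonal)
```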

In [33]:
from skbio import OrdinationResults

sample_coords = rpca_biplot.view(OrdinationResults).samples
sample_coords = sample_coords.rename(columns={0: 'PC1', 1: 'PC2', 2: 'PC3'})
sample_coords.index.name = 'SampleID'
sample_coords.head(5)

sample_coords.to_csv(os.path.join(output_path, '637_functional_rpca_sample_coordinates.tsv'), sep='\t', index=True)