In this notebook we analyze differential abundance in our samples, i.e. we test whether individual ASVs/taxa differ in abundance between sample groups.

We will first explore the data (finding out that it is not normally distributed, shocking) and then use ANCOM as an appropriate statistical test.

In [5]:
import os
import matplotlib.pyplot as plt
import pandas as pd
import qiime2 as q2
from qiime2 import Visualization
import seaborn as sns
from scipy.stats import shapiro, kruskal, f_oneway

data_dir = 'CE'
%matplotlib inline

Artifacts we need to run this Notebook:
1. feature table = 'dada2_table_align_filtered.qza'
2. metadata table = 'food-metadata.tsv'
3. taxonomic classification = 'taxonomy_v4.qza'


In [2]:
## Data Exploration

In [6]:
data = q2.Artifact.load(f'{data_dir}/dada2_table_align_filtered.qza').view(pd.DataFrame)

TypeError: No format: BIOMV210DirFmt

In [None]:
data.head()
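
The normality check mentioned in the introduction could look roughly like the sketch below. It runs a Shapiro-Wilk test per feature on a small made-up table; the values and the names `toy` and `normality` are hypothetical and not taken from our data. With the real feature table, `data` would take the place of `toy`.

```python
import pandas as pd
from scipy.stats import shapiro

# Hypothetical toy table (not real data): rows = samples, columns = ASVs
toy = pd.DataFrame(
    {
        'ASV_a': [0, 0, 0, 0, 0, 0, 1, 2, 3, 500],
        'ASV_b': [0, 0, 1, 0, 0, 2, 0, 0, 0, 250],
    },
    index=[f'sample_{i}' for i in range(10)],
)

# Shapiro-Wilk test per feature; H0: the counts are normally distributed
rows = []
for col in toy.columns:
    stat, p = shapiro(toy[col])
    rows.append({'feature': col, 'W': stat, 'p_value': p})
normality = pd.DataFrame(rows).set_index('feature')

print(normality)
```

A small p-value rejects normality for that feature, which is the usual picture for sparse, skewed count data and the reason a plain ANOVA on raw counts is not a good fit here.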

## ANCOM

In [8]:
# Only retain features with a total frequency of at least 25 that are present in at least 4 samples
! qiime feature-table filter-features \
--i-table $data_dir/dada2_table_align_filtered.qza \
--p-min-frequency 25 \
--p-min-samples 4 \
--o-filtered-table $data_dir/table_abund254.qza

Saved FeatureTable[Frequency] to: CE/table_abund254.qza
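
To make the two thresholds concrete, here is a minimal pandas sketch of what we understand `--p-min-frequency` and `--p-min-samples` to do. The table `toy` and its values are made up for illustration, and this is not QIIME 2's own implementation.

```python
import pandas as pd

# Hypothetical toy table (not real data): rows = samples, columns = features
toy = pd.DataFrame(
    {
        'ASV_a': [10, 8, 5, 4, 0],  # total 27, present in 4 samples
        'ASV_b': [30, 0, 0, 0, 0],  # total 30, present in 1 sample
        'ASV_c': [2, 2, 2, 2, 2],   # total 10, present in 5 samples
    },
    index=['s1', 's2', 's3', 's4', 's5'],
)

min_frequency = 25  # minimum total count of a feature across all samples
min_samples = 4     # minimum number of samples in which the feature occurs

total_count = toy.sum(axis=0)
n_samples_present = (toy > 0).sum(axis=0)

# A feature has to pass both thresholds to be retained
keep = (total_count >= min_frequency) & (n_samples_present >= min_samples)
filtered = toy.loc[:, keep]

print(filtered.columns.tolist())
```

In this toy table only `ASV_a` passes both thresholds: `ASV_b` is abundant but confined to one sample, and `ASV_c` is widespread but too rare overall.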

In [10]:
# Example: comparing differential abundance between continents (North America vs. Europe)
! qiime feature-table filter-samples \
--i-table $data_dir/table_abund254.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[continent]='North_America' or [continent]='Europe'" \
--o-filtered-table $data_dir/table_abund254_continent.qza

Saved FeatureTable[Frequency] to: CE/table_abund254_continent.qza
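
The `--p-where` clause keeps only samples whose metadata match the condition. A rough pandas equivalent, assuming a made-up metadata table `meta` and feature table `toy` (both hypothetical, not the real `food-metadata.tsv`), might look like this:

```python
import pandas as pd

# Hypothetical toy metadata and feature table (not real data)
meta = pd.DataFrame(
    {'continent': ['Europe', 'North_America', 'Asia', 'Europe']},
    index=pd.Index(['s1', 's2', 's3', 's4'], name='sample-id'),
)
toy = pd.DataFrame(
    {'ASV_a': [5, 0, 3, 7], 'ASV_b': [1, 9, 0, 2]},
    index=meta.index,
)

# Same condition as: [continent]='North_America' or [continent]='Europe'
keep_ids = meta.index[meta['continent'].isin(['North_America', 'Europe'])]
toy_continent = toy.loc[keep_ids]

print(toy_continent.index.tolist())
```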

In [15]:
! qiime composition add-pseudocount \
--i-table $data_dir/table_abund254_continent.qza \
--o-composition-table $data_dir/table_abund254_continent_comp.qza

Saved FeatureTable[Composition] to: CE/table_abund254_continent_comp.qza
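
The pseudocount is needed because ANCOM works on logarithms of the counts, and log(0) is undefined. A minimal sketch of the idea, assuming a pseudocount of 1 (which we believe is the `add-pseudocount` default) and a made-up table `toy`:

```python
import numpy as np
import pandas as pd

# Hypothetical toy table (not real data): rows = samples, columns = features
toy = pd.DataFrame(
    {'ASV_a': [0, 4, 9], 'ASV_b': [99, 0, 1]},
    index=['s1', 's2', 's3'],
)

pseudocount = 1  # assumption: matches the add-pseudocount default

# Shift every count so that no zeros remain, then take the natural log
composition = toy + pseudocount
log_composition = np.log(composition)

print(log_composition)
```

Without the shift, every zero count would turn into `-inf` after the log transform.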

In [16]:
! qiime composition ancom \
--i-table $data_dir/table_abund254_continent_comp.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--m-metadata-column continent \
--p-transform-function log \
--o-visualization $data_dir/ancom254_continent.qzv

Saved Visualization to: CE/ancom254_continent.qzv

In [17]:
Visualization.load(f'{data_dir}/ancom254_continent.qzv')
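
The W statistic shown in the ANCOM visualization counts, for each feature, how many of its log-ratios with the other features differ significantly between groups. The sketch below illustrates that idea in a strongly simplified form: it uses a one-way ANOVA (`f_oneway`) per log-ratio and skips the multiple-testing correction and the W cutoff that the real method applies. The function `ancom_w_sketch`, the table `toy` and the grouping `groups` are hypothetical; this is not the QIIME 2 implementation.

```python
import numpy as np
import pandas as pd
from scipy.stats import f_oneway


def ancom_w_sketch(composition, groups, alpha=0.05):
    """Per feature, count the pairwise log-ratios that differ between groups.

    Simplified illustration: no multiple-testing correction, no W cutoff.
    """
    log_table = np.log(composition)
    features = list(log_table.columns)
    w = pd.Series(0, index=features, name='W')
    for i in features:
        for j in features:
            if i == j:
                continue
            # log(x_i / x_j) for every sample
            log_ratio = log_table[i] - log_table[j]
            per_group = [log_ratio[groups == g] for g in groups.unique()]
            _, p = f_oneway(*per_group)
            if p < alpha:
                w.loc[i] += 1
    return w


# Hypothetical toy data (not real): 4 samples per group, 4 features
samples = [f's{i}' for i in range(1, 9)]
groups = pd.Series(['A'] * 4 + ['B'] * 4, index=samples)
toy = pd.DataFrame(
    {
        'ASV_diff': [100, 120, 90, 110, 5, 6, 4, 5],  # drops in group B
        'ASV_x': [50, 55, 48, 52, 51, 49, 53, 50],
        'ASV_y': [20, 22, 19, 21, 20, 21, 19, 22],
        'ASV_z': [30, 28, 31, 29, 30, 31, 29, 28],
    },
    index=samples,
)

w = ancom_w_sketch(toy + 1, groups)  # pseudocount of 1 before the log
print(w.sort_values(ascending=False))
```

A feature whose W is close to the number of other features changes relative to almost everything else, which is what ANCOM flags as differentially abundant.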

In [18]:
# Example: comparing differential abundance between rind types (washed vs. natural)
! qiime feature-table filter-samples \
--i-table $data_dir/table_abund254.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--p-where "[rindtype]='washed' or [rindtype]='natural'" \
--o-filtered-table $data_dir/table_abund254_rindtype1.qza

Saved FeatureTable[Frequency] to: CE/table_abund254_rindtype1.qza

In [22]:
! qiime composition add-pseudocount \
--i-table $data_dir/table_abund254_rindtype1.qza \
--o-composition-table $data_dir/table_abund254_rindtype1_comp.qza

Saved FeatureTable[Composition] to: CE/table_abund254_rindtype1_comp.qza

In [23]:
! qiime composition ancom \
--i-table $data_dir/table_abund254_rindtype1_comp.qza \
--m-metadata-file $data_dir/food-metadata.tsv \
--m-metadata-column rindtype \
--p-transform-function log \
--o-visualization $data_dir/ancom254_rindtype1.qzv

Saved Visualization to: CE/ancom254_rindtype1.qzv

In [24]:
Visualization.load(f'{data_dir}/ancom254_rindtype1.qzv')