## Python Tutorial: Part 3 – QIIME 2 Artifact API

We will explore the QIIME 2 Artifact API (https://docs.qiime2.org/2021.2/interfaces/artifact-api/). This API lets us load QIIME 2 artifacts in Python and view them as pandas objects such as DataFrames, which we can then use for additional statistical analyses and plotting.

#### Import the required libraries

In [1]:
import pandas as pd
import numpy as np
from qiime2 import Artifact, Metadata
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

In [2]:
root = '.'

Note: Set the path variable `root` to the location of your main working directory, for example `/wrk/student028`. The cells below read `sample-metadata.tsv` and the `core-metrics` directory relative to `root`.
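Because every input path in this notebook is built from `root`, it can help to check that a file exists before trying to load it. The following is a minimal sketch using only the standard library; the helper name `resolve_input` is hypothetical and not part of QIIME 2 or pandas.

```python
from pathlib import Path


def resolve_input(root, relative_path):
    """Return root/relative_path as a Path, or raise a descriptive error.

    Hypothetical helper for this tutorial, not a QIIME 2 function.
    """
    path = Path(root).expanduser() / relative_path
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; check that 'root' ({root!r}) points to the "
            "main working directory"
        )
    return path


# Example use (file name taken from the cells in this notebook):
# path_metadata = resolve_input(root, 'sample-metadata.tsv')
```

The returned `Path` can be passed to `pd.read_csv` directly, or converted with `str(path)` where a plain string is preferred.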

#### Metadata

In [3]:
df_metadata = pd.read_csv('%s/sample-metadata.tsv' % root, sep='\t', index_col=0, comment='#')

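Note: With `root = '.'`, this cell fails with `FileNotFoundError: [Errno 2] No such file or directory: './sample-metadata.tsv'` unless the notebook is run from the directory that contains `sample-metadata.tsv`.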

In [None]:
df_metadata.head()

In [None]:
df_metadata['treatment-group'].value_counts()

In [None]:
df_metadata['sequencing-run'].value_counts()

#### Matplotlib scatter plots: principal coordinates analysis

In [None]:
path_pcoa = '%s/core-metrics/unweighted_unifrac_pcoa_results.qza' % root
pcoa = Artifact.load(path_pcoa)
md = pcoa.view(Metadata)
df_pcoa = md.to_dataframe()

In [None]:
df_pcoa.head()

In [None]:
color_dict = {'treatment': 'red', 'donor': 'blue', 'control': 'orange'}

In [None]:
treatment_colors = [color_dict[df_metadata.loc[x, 'treatment-group']] for x in df_pcoa.index]
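The list comprehension above raises a `KeyError` if a sample in the ordination is missing from the metadata, or if its treatment group has no entry in `color_dict`. The following is a small sketch of a more forgiving lookup, written with plain dictionaries so that it runs on its own. The function name `colors_for` and the `'gray'` fallback are assumptions made for illustration.

```python
def colors_for(sample_ids, groups, palette, fallback='gray'):
    """Map each sample ID to a color via its group.

    Samples without a group, or with a group that is not in the palette,
    get the fallback color instead of raising KeyError.
    """
    return [palette.get(groups.get(sample_id), fallback)
            for sample_id in sample_ids]


# Toy stand-ins for the notebook's objects (values are made up):
palette = {'treatment': 'red', 'donor': 'blue', 'control': 'orange'}
groups = {'s1': 'treatment', 's2': 'donor', 's3': 'unlabelled'}
toy_colors = colors_for(['s1', 's2', 's3', 's4'], groups, palette)
```

In the notebook, `groups` could be obtained with `df_metadata['treatment-group'].to_dict()` and `sample_ids` would be `df_pcoa.index`.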

In [None]:
fig, ax = plt.subplots(figsize=(5,5))
ax.scatter(df_pcoa['Axis 1'], df_pcoa['Axis 2'], c=treatment_colors)
ax.set_xlabel('PC1')
ax.set_ylabel('PC2')
plt.savefig('bdiv_scatter.pdf')

#### Seaborn box plots: alpha-diversity data

In [None]:
path_adiv = '%s/core-metrics/observed_otus_vector.qza' % root
adiv = Artifact.load(path_adiv)
df_adiv = adiv.view(pd.Series)

In [None]:
df_adiv.head()

In [None]:
df_metadata['adiv_observed_otus'] = [df_adiv[x] if x in df_adiv.index else np.nan for x in df_metadata.index]
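The same alignment can also be expressed with `Series.reindex`, which looks up each metadata index label in the alpha-diversity series and fills missing samples with `NaN`. The following sketch uses made-up toy data rather than the tutorial's files, and the helper name `add_alpha_column` is hypothetical.

```python
import pandas as pd


def add_alpha_column(metadata, alpha, column):
    """Return a copy of `metadata` with `alpha` aligned to its index.

    Samples absent from `alpha` get NaN; samples present only in `alpha`
    are dropped because they have no metadata row.
    """
    result = metadata.copy()
    result[column] = alpha.reindex(result.index)
    return result


# Toy data: 's2' has no alpha-diversity value, 's9' has no metadata row.
toy_metadata = pd.DataFrame(
    {'treatment-group': ['control', 'donor', 'treatment']},
    index=['s1', 's2', 's3'],
)
toy_alpha = pd.Series({'s1': 120, 's3': 95, 's9': 40})
toy_merged = add_alpha_column(toy_metadata, toy_alpha, 'adiv_observed_otus')
```

Applied to the notebook's objects, the call would be `add_alpha_column(df_metadata, df_adiv, 'adiv_observed_otus')`. Because the helper works on a copy, the original `df_metadata` is left unchanged.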

In [None]:
df_metadata.head()

In [None]:
sns.boxplot(x='treatment-group', y='adiv_observed_otus', hue='sequencing-run', data=df_metadata)
plt.savefig('adiv_boxplot.pdf')