# AM data analysis

#### Description of data and aims:
In the summer of 2024, a mysterious disease dubbed the “pundemic” by the media began cropping up worldwide. Diseased patients make puns at every opportunity. A link between the pundemic and changes in the gut microbiome was discovered, and a doctor at the USZ set up a clinical trial using fecal microbiota transplants (FMT) as a possible treatment.

Trial data:  
Collection of fecal microbiome samples from pundemic patients before and after the trial, from both treatment and placebo groups. Pundemic severity in patients was quantified in terms of puns per hour. Fecal samples were collected from the FMT donors as well.

Because the bacterial and fungal gut microbiome are both of interest, the USZ team collected both **16S rRNA gene** and **ITS** data from the study cohort. 

Aims:
1. Analyzing the ITS data in order to further explore the connection between pundemic symptoms and an altered gut mycobiome composition
2. Analyzing the potential of FMT as a pandemic treatment option. You have received DNA sequences as well as metadata allowing you to distinguish pundemic from healthy samples.


In [3]:
# Package import
import os
import pandas as pd
from qiime2 import Visualization
import matplotlib.pyplot as plt
import numpy as np
import qiime2 as q2

%matplotlib inline

## Data import and denoising

In [5]:
# Data and metadata import
! wget -O data/raw/pundemic_metadata.tsv https://polybox.ethz.ch/index.php/s/7LxWSbaw2q37yof/download
! wget -O data/raw/pundemic_forward_reads.qza https://polybox.ethz.ch/index.php/s/o8HqHJqvuf9e2on/download


--2024-10-03 22:55:02--  https://polybox.ethz.ch/index.php/s/7LxWSbaw2q37yof/download
Resolving polybox.ethz.ch (polybox.ethz.ch)... 129.132.71.243
Connecting to polybox.ethz.ch (polybox.ethz.ch)|129.132.71.243|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 10868 (11K) [application/octet-stream]
Saving to: ‘data/raw/pundemic_metadata.tsv’


2024-10-03 22:55:02 (106 MB/s) - ‘data/raw/pundemic_metadata.tsv’ saved [10868/10868]

--2024-10-03 22:55:02--  https://polybox.ethz.ch/index.php/s/o8HqHJqvuf9e2on/download
Resolving polybox.ethz.ch (polybox.ethz.ch)... 129.132.71.243
Connecting to polybox.ethz.ch (polybox.ethz.ch)|129.132.71.243|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 942144925 (898M) [application/octet-stream]
Saving to: ‘data/raw/pundemic_forward_reads.qza’


2024-10-03 22:55:05 (314 MB/s) - ‘data/raw/pundemic_forward_reads.qza’ saved [942144925/942144925]



In [21]:
# Metadata df creation and overview
meta_df = pd.read_csv('data/raw/pundemic_metadata.tsv', sep='\t', index_col=0)
meta_df

Unnamed: 0_level_0,patient_id,age,sex,ethnicity,continent,country,region,city,group,disease_subgroup,blinded_clinical_response,puns_per_hour_pre_treatment,puns_per_hour_post_treatment,time_point
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
SRR10505051,1048,36,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,NR,9.0,8.0,post-treatment
SRR10505052,1048,36,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,NR,9.0,8.0,pre-treatment
SRR10505053,1045,29,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,Res,6.0,0.0,pre-treatment
SRR10505054,1045,29,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,Res,6.0,0.0,post-treatment
SRR10505055,1044,34,male,Indian Subcontinental,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,,4.0,,pre-treatment
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRR10505151,D54,Unknown,Unknown,,Europe,Switzerland,Zurich,Zurich,Healthy,donor,,,,t1
SRR10505152,D53,Unknown,Unknown,,Europe,Switzerland,Zurich,Zurich,Healthy,donor,,,,t1
SRR10505153,2225,34,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,6.0,5.0,pre-treatment
SRR10505154,1024,Unknown,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,Res,8.0,0.0,post-treatment


In [26]:
meta_df['disease_subgroup'].unique()

array(['Placebo', 'FMT', 'donor'], dtype=object)

In [29]:
meta_df[(meta_df['disease_subgroup'] == "placebo") | (meta_df['disease_subgroup'] == "FMT")]

Unnamed: 0_level_0,patient_id,age,sex,ethnicity,continent,country,region,city,group,disease_subgroup,blinded_clinical_response,puns_per_hour_pre_treatment,puns_per_hour_post_treatment,time_point
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
SRR10505057,1043,35,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,9.0,6.0,post-treatment
SRR10505058,1043,35,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,9.0,6.0,pre-treatment
SRR10505061,2212,27,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,8.0,4.0,pre-treatment
SRR10505062,1041,39,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,Res,6.0,0.0,post-treatment
SRR10505063,1041,39,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,Res,6.0,0.0,pre-treatment
SRR10505066,1038,35,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,6.0,2.0,pre-treatment
SRR10505067,1037,46,male,Mediterranean,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,9.0,5.0,post-treatment
SRR10505070,1035,27,male,Asian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,8.0,5.0,post-treatment
SRR10505071,1035,27,male,Asian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,8.0,5.0,pre-treatment
SRR10505072,2212,27,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,8.0,4.0,post-treatment


In [25]:
meta_df[(meta_df['time_point'] == "pre-treatment") | (meta_df['time_point'] == "post-treatment")]

Unnamed: 0_level_0,patient_id,age,sex,ethnicity,continent,country,region,city,group,disease_subgroup,blinded_clinical_response,puns_per_hour_pre_treatment,puns_per_hour_post_treatment,time_point
id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1
SRR10505051,1048,36,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,NR,9.0,8.0,post-treatment
SRR10505052,1048,36,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,NR,9.0,8.0,pre-treatment
SRR10505053,1045,29,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,Res,6.0,0.0,pre-treatment
SRR10505054,1045,29,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,Res,6.0,0.0,post-treatment
SRR10505055,1044,34,male,Indian Subcontinental,Europe,Switzerland,Zurich,Zurich,Puns,Placebo,,4.0,,pre-treatment
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRR10505141,1001,57,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,Res,7.0,2.0,pre-treatment
SRR10505142,1001,57,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,Res,7.0,2.0,post-treatment
SRR10505153,2225,34,male,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,NR,6.0,5.0,pre-treatment
SRR10505154,1024,Unknown,female,Caucasian,Europe,Switzerland,Zurich,Zurich,Puns,FMT,Res,8.0,0.0,post-treatment


We have 105 samples