# UC Severity Qiime Analyses

By RM

Environment: Qiime2 (2018.4) & Qiime1

Project Abstract:

Ulcerative colitis (UC) is a chronic autoimmune condition defined by intestinal inflammation and concurrent microbiome dysbiosis. Here, we advance the understanding of host-microbiome interactions governing UC by collecting six meta-omic datasets profiling host and microbial molecules in 40 patients displaying a wide range of clinically assessed disease activity (remission to severe). The six datasets provided unique evidence toward a central hypothesis of proteolysis co-occurring with increased disease activity. Metaproteomics identified Bacteroides proteases as a distinguishing feature of severity. Shotgun metagenomics guided taxonomic inferences and revealed that the Bacteroides association was driven primarily by changes in protein and not DNA abundances. Potential evidence of a host response to Bacteroides serine proteases was found in the increase of serum and fecal serine protease inhibitors. Metapeptidomics added evidence of protease activity, as peptide fragments were increased among the patients with high severity. In addition, we compare the prediction of severity and clinical parameters between data types. In total, our meta-omic platform has provided compelling integrated evidence for host-microbiome interactions during UC and opens the door for protease inhibition as a therapeutic approach for severe UC patients.


In [2]:
# Initializes the notebook with inline display
%matplotlib inline

from os import mkdir
import os
import copy
from os.path import abspath, join as pjoin, exists
from shutil import copy2, move
from time import strftime, strptime
from numpy import nan, isnan, arange
from pandas import read_csv, Series, DataFrame
from IPython.display import Image
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

16S.biom
16S_biom.qza
16S_biom.qzv
16S_pmayo_PCoA_Unweighted_CalprotectinSize.png
16S_pmayo_PCoA_Unweighted_CalprotectinSize.svg
1Search_only/
1search_biom.qza
1search_biom.qzv
1search_common_biom.qza
2Search/
2search_biom.qza
2search_biom.qzv
2search_common_biom.qza
2searchpDB_pmayo_PCoA_braycurtis_CalprotectinSize.png
2searchpDB_pmayo_PCoA_braycurtis_CalprotectinSize.svg
Bacteroides_Proteases_by_Species_workup.xlsx
Bacteroides_of_interest.xlsx
Bacteroides_workup.xlsx
COG_Spearman_Corrs_Stats_pergene.csv
Descriptive_Statistics.xlsx
IGCNormalizedCommonReps.biom
IGCNormalizedDataAll.biom
IGC_biom.qza
IGC_biom.qzv
IGC_common_biom.qza
List_of_proteins_NotInSearch1.xlsx
LowMayoStoolFreq_BH.csv
Low_severity_remission.csv
Low_severity_remission_volcanos.ai
Low_severity_remission_volcanos.png
Lysozyme_correlations_ordered_MP_corrnonan.csv
Lysozyme_correlations_ordered_MP_corrwnan.csv
MG_Bacteroides_Composition.csv
MG_Enzymes.csv
MGw0s_LR

### Make subset of no-human metaproteomics

In [4]:
df = pd.read_csv("./2Search/CSVs/NormalizedCommonReps.txt", 
                 sep = '\t', index_col= "datarest$ProteinID")

In [5]:
#Keep only proteins whose IDs contain 'k99_', which removes any human-derived proteins
df = df[df.index.str.contains('k99_') != False]
df

datarest$ProteinID,H16,H3,L4,L11,H8,L3,H11,H20,L10,L7,...,L16,H14,H1,L1,H6,L14,H2,H10,L17,L26
k99_1000928_1,111.069493,94.231149,166.363367,99.088192,54.591755,173.691705,50.662214,134.574253,26.011840,59.754656,...,68.093307,71.056933,33.271179,161.770360,70.540445,101.274311,149.492346,99.904085,58.012858,60.128184
k99_1001938_3,46.630938,320.607204,264.045104,74.501569,33.129914,149.089471,239.364036,60.129463,6.352032,16.593794,...,88.449492,410.993816,15.131730,19.084435,3.638069,2.979719,106.715396,22.169808,66.089185,35.278634
k99_1001990_2,57.391732,47.461439,123.003403,32.235909,66.112950,69.628543,106.518997,705.535557,116.762120,146.875607,...,17.494716,12.224755,34.637102,12.661787,126.771758,104.163969,37.907910,168.860490,30.236346,143.500305
k99_1002150_3,30.935689,125.472858,347.763489,148.560410,39.245119,43.271356,129.515300,79.409804,25.220030,71.962896,...,62.935518,21.794327,36.540982,79.051493,42.634040,61.234510,69.066572,725.394872,136.957015,48.171145
k99_1002370_6,46.583836,164.440568,166.747069,134.089458,14.935078,52.832114,52.968206,178.658634,62.721379,55.951204,...,162.402114,1.037451,144.370090,208.113864,14.262005,43.673788,74.932959,38.915743,45.162795,79.388883
k99_1002967_3,27.341552,120.232326,349.095667,193.395062,48.804329,53.359054,47.009773,152.000113,58.934078,43.200448,...,87.707173,9.256106,59.617181,247.086797,24.663200,51.850043,4.419552,27.218740,21.916938,68.751905
k99_1003965_10,24.376980,172.249152,150.505049,18.110006,132.123840,28.703120,8.149837,115.727792,21.424122,16.858031,...,57.223006,41.482728,204.751568,35.337095,53.272303,127.598116,74.795004,93.190566,92.367805,104.356005
k99_1004041_1,53.284447,0.306497,4.638289,61.191568,203.231975,88.284329,3.046813,5.299889,195.578089,6.252366,...,130.641019,9.193953,4.496304,142.587605,0.826234,266.520335,8.812933,52.492953,17.674539,40.778811
k99_1004623_1,119.368003,33.717070,12.766574,38.006418,117.899451,185.485691,446.934636,11.686298,182.239036,24.190626,...,33.647304,118.834157,46.292226,35.476515,33.872342,52.061963,36.413021,111.688583,22.764500,39.151955
k99_1005212_10,54.244065,267.630639,62.279420,124.831251,56.083609,54.110338,70.235402,175.656809,51.662677,47.703592,...,95.664787,7.083284,126.750742,115.386956,37.891127,82.778597,97.051830,151.377216,138.447806,85.003125


In [6]:
df.to_csv('./2Search/CSVs/NormalizedCommonReps_nohuman.txt', sep = '\t')
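
The filtering step above can be illustrated on a toy table. This is a minimal sketch, assuming (as in the cell above) that non-human protein IDs contain `k99_`; the human-style ID and all abundance values below are made up for illustration, and `na=False` is an extra safeguard not present in the original cell.

```python
import pandas as pd

# Toy abundance table; IDs and values are hypothetical.
toy = pd.DataFrame(
    {"H16": [111.1, 46.6, 5.0], "L4": [166.4, 264.0, 7.5]},
    index=pd.Index(
        ["k99_1000928_1", "k99_1001938_3", "sp|P02768|ALBU_HUMAN"],
        name="datarest$ProteinID",
    ),
)

# Boolean mask: True where the ID contains 'k99_'.
# na=False treats a missing ID as "no match" instead of propagating NaN.
is_contig_protein = toy.index.str.contains("k99_", regex=False, na=False)

# Keep only the rows flagged by the mask.
toy_nohuman = toy[is_contig_protein]
print(toy_nohuman.index.tolist())
```

Comparing row counts before and after the filter is a quick way to confirm how many proteins were dropped.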

### Create Biom Files

In [7]:
#2 Search pDB Approach - no human
#Convert tab-separated file to biom file
!biom convert -i ./2Search/CSVs/NormalizedCommonReps_nohuman.txt \
-o ./2Search/NormalizedCommonReps2_nohuman.biom \
-m ../UC_MP_Emperor_Map_1.txt \
--table-type="OTU table" --to-hdf5

In [12]:
#2 Search pDB Approach
#Convert tab-separated file to biom file
!biom convert -i ./2Search/CSVs/NormalizedCommonReps.txt \
-o ./2Search/NormalizedCommonReps2.biom \
-m ../UC_MP_Emperor_Map_1.txt \
--table-type="OTU table" --to-hdf5

In [28]:
#Metabolomics
#Convert tab-separated file to biom file
!biom convert -i ../Metabolomics/Feature_table.txt \
-o ./Metabolomics.biom \
-m ../UC_MP_Emperor_Map_1.txt \
--table-type="OTU table" --to-hdf5

In [3]:
#16S
#Convert tab-separated file to biom file
!biom convert -i ../Genomics/16S/reference-hit_idswap2_blankremove.txt \
-o ./16S.biom \
-m ../UC_MP_Emperor_Map_1.txt \
--table-type="OTU table" --to-hdf5

In [1]:
#Serum
!biom convert -i ../Serum/CSVs/NormalizedCommonReps_ids.txt \
-o ./Serum_Common.biom \
-m ../UC_MP_Emperor_Map_1.txt \
--table-type="OTU table" --to-hdf5

In [2]:
#MG
!biom convert -i ../Genomics/Shotgun/Salmon_CPMs_0s.txt \
-o ./Salmon_CPMs.biom \
-m ../UC_MP_Emperor_Map_1.txt \
--table-type="OTU table" --to-hdf5
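
Each conversion above attaches sample metadata with `-m`, so it can help to check beforehand that every sample column in the feature table has a row in the mapping file. The following is a rough sketch of such a check, not part of the original workflow; the two in-memory files are hypothetical stand-ins, and it assumes the mapping file uses the QIIME 1 `#SampleID` first column.

```python
import io
import pandas as pd

# Hypothetical stand-ins for a feature table and a mapping file.
feature_tsv = io.StringIO(
    "datarest$ProteinID\tH16\tH3\tL4\n"
    "k99_1000928_1\t111.07\t94.23\t166.36\n"
)
mapping_tsv = io.StringIO(
    "#SampleID\tpartial_mayo\n"
    "H16\t7\n"
    "L4\t1\n"
)

# Rows are features and columns are samples in the feature table.
table = pd.read_csv(feature_tsv, sep="\t", index_col=0)
mapping = pd.read_csv(mapping_tsv, sep="\t", index_col="#SampleID")

# Samples present in the table but absent from the mapping file.
missing_metadata = sorted(set(table.columns) - set(mapping.index))
print(missing_metadata)
```

An empty list means every sample in the table has metadata.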

### Import all as Qiime2 artifacts

In [29]:
!qiime tools import \
  --input-path ./Metabolomics.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path Metabolomics_biom.qza

In [14]:
!qiime tools import \
  --input-path ./2Search/NormalizedCommonReps2.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path 2search_common_biom.qza

In [4]:
!qiime tools import \
  --input-path ./16S.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path 16S_biom.qza

In [3]:
!qiime tools import \
  --input-path ./Salmon_CPMs.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path ./Salmon_CPMs_biom.qza

In [4]:
!qiime tools import \
  --input-path ./Serum_Common.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path ./Serum_Common_biom.qza

In [8]:
!qiime tools import \
  --input-path ./2Search/NormalizedCommonReps2_nohuman.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path ./pDB_Common_nohuman_biom.qza

In [None]:
#Metagenomics using UniFrac. Based on centrifuge counts.
!qiime tools import \
  --input-path ../Genomics/Shotgun/all.filt.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path MG_all_biom.qza

### Feature table summarize

In [9]:
!qiime feature-table summarize \
  --i-table ./pDB_Common_nohuman_biom.qza \
  --o-visualization ./pDB_Common_nohuman_biom.qzv \
  --m-sample-metadata-file ../UC_MP_Emperor_Map.txt

Saved Visualization to: ./pDB_Common_nohuman_biom.qzv


In [30]:
!qiime feature-table summarize \
  --i-table Metabolomics_biom.qza \
  --o-visualization Metabolomics_biom.qzv \
  --m-sample-metadata-file ../UC_MP_Emperor_Map.txt

Saved Visualization to: Metabolomics_biom.qzv


In [17]:
!qiime feature-table summarize \
  --i-table 2search_common_biom.qza \
  --o-visualization 2search_biom.qzv \
  --m-sample-metadata-file ../UC_MP_Emperor_Map.txt

Saved Visualization to: 2search_biom.qzv


In [5]:
!qiime feature-table summarize \
  --i-table 16S_biom.qza \
  --o-visualization 16S_biom.qzv \
  --m-sample-metadata-file ../UC_MP_Emperor_Map.txt

Saved Visualization to: 16S_biom.qzv


In [5]:
!qiime feature-table summarize \
  --i-table ./Serum_Common_biom.qza \
  --o-visualization ./Serum_Common_biom.qzv \
  --m-sample-metadata-file ../UC_MP_Emperor_Map.txt

Saved Visualization to: ./Serum_Common_biom.qzv


In [6]:
!qiime feature-table summarize \
  --i-table ./Salmon_CPMs_biom.qza \
  --o-visualization Salmon_CPMs_biom.qzv \
  --m-sample-metadata-file ../UC_MP_Emperor_Map.txt

^C

Aborted!


In [None]:
!qiime feature-table summarize \
  --i-table ./Serum_biom.qza \
  --o-visualization Serum_biom.qzv \
  --m-sample-metadata-file ../../UC_MP_Emperor_Map.txt

### Qiime 1 PCoAs

In [19]:
!validate_mapping_file.py -o vmf-map/ -m ../UC_MP_Emperor_Map_10052018.txt



In [21]:
from IPython.display import FileLinks, FileLink
FileLinks('vmf-map/')

In [23]:
!beta_diversity_through_plots.py -i ./Metabolomics.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_Metabolomics -p ../smallDB_correct/PCoA_Plots/paramaters2.txt

  if rank(datamtx) != 2:
  if rank(datamtx) != 2:


In [24]:
!beta_diversity_through_plots.py -i ./2Search/NormalizedCommonReps2.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_2search -p ../smallDB_correct/PCoA_Plots/paramaters2.txt

  if rank(datamtx) != 2:
  if rank(datamtx) != 2:


In [3]:
!beta_diversity_through_plots.py -i ./16S.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_16S -p ../smallDB_correct/PCoA_Plots/paramaters2_16S.txt -t ../Genomics/16S/insertion_tree.relabelled.tre 

  if rank(datamtx) != 2:
  if rank(datamtx) != 2:


In [5]:
!beta_diversity_through_plots.py -i ./Salmon_CPMs.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_MG -p ../smallDB_correct/PCoA_Plots/paramaters2.txt

  if rank(datamtx) != 2:
  if rank(datamtx) != 2:


In [6]:
!beta_diversity_through_plots.py -i ./Serum_Common.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_Serum -p ../smallDB_correct/PCoA_Plots/paramaters2.txt

  if rank(datamtx) != 2:
  if rank(datamtx) != 2:


In [4]:
!beta_diversity_through_plots.py -i ./2Search/NormalizedCommonReps2_nohuman.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_2search_nohuman -p ../smallDB_correct/PCoA_Plots/paramaters2.txt


  if rank(datamtx) != 2:
  if rank(datamtx) != 2:


In [2]:
!beta_diversity_through_plots.py -i ../Serum/UC_Serum_CommonReps_nored_hdf5.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_Serum_nored -p ../smallDB_correct/PCoA_Plots/paramaters2.txt

Traceback (most recent call last):
  File "/Users/rhmills/miniconda3/envs/qiime1/bin/beta_diversity_through_plots.py", line 4, in <module>
    __import__('pkg_resources').run_script('qiime==1.9.1', 'beta_diversity_through_plots.py')
  File "/Users/rhmills/miniconda3/envs/qiime1/lib/python2.7/site-packages/pkg_resources/__init__.py", line 750, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/rhmills/miniconda3/envs/qiime1/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1527, in run_script
    exec(code, namespace, namespace)
  File "/Users/rhmills/miniconda3/envs/qiime1/lib/python2.7/site-packages/qiime-1.9.1-py2.7.egg-info/scripts/beta_diversity_through_plots.py", line 153, in <module>
    main()
  File "/Users/rhmills/miniconda3/envs/qiime1/lib/python2.7/site-packages/qiime-1.9.1-py2.7.egg-info/scripts/beta_diversity_through_plots.py", line 127, in main
    create_dir(output_dir, fail_on_exist=not opts.force)
  File "/Users

In [5]:
!beta_diversity_through_plots.py -i ../Genomics/Shotgun/all.filt.biom -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -o ./Qiime/PCoA_MG_allUniq -p ../smallDB_correct/PCoA_Plots/paramaters2_16S.txt -t ../Genomics/Shotgun/tree.nwk 

  if rank(datamtx) != 2:
  if rank(datamtx) != 2:
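
The statistics cells later in this notebook read `bray_curtis_dm.txt` files from these output folders. As a reminder of what each entry in such a matrix represents, here is a small sketch of the Bray-Curtis dissimilarity between two samples in plain Python. The abundance vectors are invented, and this is only the textbook formula, not QIIME's own code.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    # Sum of absolute per-feature differences.
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    # Sum of all abundances across both samples.
    denominator = sum(a + b for a, b in zip(u, v))
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


# Hypothetical abundances for three features in two samples.
sample_a = [10.0, 0.0, 5.0]
sample_b = [5.0, 5.0, 5.0]
print(bray_curtis(sample_a, sample_b))
```

Identical samples give 0 and samples with no shared features give 1.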


### Qiime 1 Beta-Diversity Statistics Adonis & PERMANOVA

In [3]:
!pwd

/Users/rhmills/Documents/Thesis Work/UC_Severity/pDB_Proteomics


In [2]:
#Categorical metadata columns, tested for significance with PERMANOVA.
Perma = ['sex','race','historic_extent','ASA_exposure', 'current_5ASA', 'steroid_exposure', 'current_steroids', 'IM_exposure', 'IM_type', 'biologic_exposure',
        'biologic_exposure_type','current_biologic','current_biologic_type','Experiment','TMT_Label']
#Remaining (mostly numeric or ordinal) metadata columns, tested with Adonis.
Adonis = ['CRP','Calprotectin', 'partial_mayo','age','age_diagnosis','disease_duration','height','stool_frequency','rectal_bleeding','PGA','mayo_endoscopic_score','UCEIS_endoscopic_score','COLLECTION_TIMESTAMP','Endoscopy_date']
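
The loops in this section repeat the same `compare_categories.py` call for every metadata column. One way to make that pattern easier to inspect is a small helper that builds each command and prints it as a dry run. This is only a sketch: the helper name `compare_categories_cmd` is hypothetical, and the flags are the ones already used in this notebook.

```python
import shlex


def compare_categories_cmd(method, dm_path, map_path, column, out_root,
                           permutations=None):
    """Build the argument list for one compare_categories.py call."""
    cmd = [
        "compare_categories.py",
        "--method", method,
        "-i", dm_path,
        "-m", map_path,
        "-c", column,
        "-o", "{}/{}".format(out_root, column),
    ]
    # -n is only added when a permutation count is given.
    if permutations is not None:
        cmd += ["-n", str(permutations)]
    return cmd


# Dry run: print the commands instead of executing them.
for column in ["sex", "race"]:
    cmd = compare_categories_cmd(
        "permanova",
        "./Qiime/PCoA_16S/bray_curtis_dm.txt",
        "./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt",
        column,
        "./Qiime/Permanova/16S_bray",
    )
    print(" ".join(shlex.quote(part) for part in cmd))
```

Passing each list to `subprocess.run` would execute the command; printing first makes it easy to spot a wrong distance matrix or output folder before anything runs.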

In [7]:
#Iterative Permanovas on MG UniFrac
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_MG_allUniq/unweighted_unifrac_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/MG_UniFrac/$i

In [8]:
#Iterative Adonis on MG UniFrac
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_MG_allUniq/unweighted_unifrac_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/MG_UniFrac/$i -n 999

In [9]:
#Iterative Permanovas on 16S (Bray-Curtis)
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_16S/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/16S_bray/$i

In [None]:
#Iterative Permanovas on 16S - all samples
for i in Perma:
        !compare_categories.py --method permanova -i ../Genomics/16S/core-metrics-results_idswap2_newmetadata_allsamples/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/16S_bray/$i

In [10]:
#Iterative Adonis on 16S (Bray-Curtis)
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_16S/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/16S_bray/$i -n 999

In [12]:
#Iterative Permanovas on pDB
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_2search/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/pDB_MP/$i

In [10]:
#Iterative Adonis on pDB
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_2search/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/pDB_MP/$i -n 999

In [6]:
#Iterative Permanovas on pDB NOHUMAN
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_2search_nohuman/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/pDB_MP_nohuman/$i

In [3]:
#Iterative Adonis on pDB NOHUMAN
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_2search_nohuman/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/pDB_MP_nohuman/$i -n 999

In [13]:
#Iterative Permanovas on Metabolomics
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_Metabolomics/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/Metabolomics/$i

In [14]:
#Iterative Adonis on Metabolomics
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_Metabolomics/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/Metabolomics/$i -n 999

In [15]:
#Iterative Permanovas on Metagenome
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_MG/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/MG/$i

In [16]:
#Iterative Adonis on Metagenome
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_MG/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/MG/$i -n 999

In [17]:
#Iterative Permanovas on Serum
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_Serum/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/Serum/$i

In [19]:
#Iterative Adonis on Serum
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_Serum/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/Serum/$i -n 999

In [36]:
#Iterative Permanovas on Serum NORED
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_Serum_nored/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/Serum_nored/$i

In [37]:
#Iterative Adonis on Serum NORED
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_Serum_nored/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/Serum_nored/$i -n 999

In [20]:
#Iterative Permanovas on 16S (unweighted UniFrac)
for i in Perma:
        !compare_categories.py --method permanova -i ./Qiime/PCoA_16S/unweighted_unifrac_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Permanova/16S/$i

In [21]:
#Iterative Adonis on 16S (unweighted UniFrac)
for i in Adonis:
        !compare_categories.py --method adonis -i ./Qiime/PCoA_16S/unweighted_unifrac_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c $i -o ./Qiime/Adonis/16S/$i -n 999

In [3]:
!compare_categories.py --method permanova -i ./Qiime/PCoA_2search/bray_curtis_dm.txt -m ./vmf-map/UC_MP_Emperor_Map_10052018_corrected.txt -c IM_type -o ./Qiime/Permanova/pDB_MP/IM_type
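
For readers less familiar with what the PERMANOVA runs above test, the statistic behind them can be sketched from a distance matrix alone. The code below computes the usual pseudo-F (among-group over within-group variation) in plain Python on an invented four-sample example. It is an illustration of the formula, not QIIME's implementation, and it omits the label permutations that produce the p-value.

```python
from itertools import combinations


def pseudo_f(dm, groups):
    """PERMANOVA pseudo-F from a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares from all pairwise distances.
    ss_total = sum(dm[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, one term per group.
    ss_within = 0.0
    for label in labels:
        members = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(
            dm[i][j] ** 2 for i, j in combinations(members, 2)
        ) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


# Four hypothetical samples on a line at positions 0, 1, 10 and 11.
positions = [0.0, 1.0, 10.0, 11.0]
dm = [[abs(p - q) for q in positions] for p in positions]
groups = ["low", "low", "high", "high"]
f_stat = pseudo_f(dm, groups)
print(f_stat)
```

A large pseudo-F means samples differ far more between groups than within them; significance then comes from recomputing the statistic on shuffled labels.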

### Core metrics - Qiime2
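
The cells in this section each pass a `--p-sampling-depth` value, but the notebook does not say how those values were chosen. One common rule is to use the smallest per-sample total so that no sample is discarded. The sketch below assumes that rule and uses an invented table, so it should be read as an illustration and not as a record of what was done here.

```python
import pandas as pd

# Hypothetical feature table: rows are features, columns are samples.
toy = pd.DataFrame(
    {"H16": [120, 30, 50], "L4": [80, 80, 10], "H3": [200, 5, 45]},
    index=["feat_1", "feat_2", "feat_3"],
)

# Total abundance per sample (column sums).
sample_totals = toy.sum(axis=0)

# Rarefying to the minimum total keeps every sample in the analysis.
depth = int(sample_totals.min())
print(sample_totals.to_dict(), depth)
```

The `.qzv` files produced by `qiime feature-table summarize` report the same per-sample totals interactively.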

In [10]:
#2Search pDB using the no-human proteins
!qiime diversity core-metrics \
  --i-table pDB_Common_nohuman_biom.qza \
  --p-sampling-depth 539826 \
  --m-metadata-file ../UC_MP_Emperor_Map.txt \
  --output-dir core-metrics-results_pDB_noHuman

Saved FeatureTable[Frequency] to: core-metrics-results_pDB_noHuman/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_pDB_noHuman/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_pDB_noHuman/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_pDB_noHuman/evenness_vector.qza
Saved DistanceMatrix to: core-metrics-results_pDB_noHuman/jaccard_distance_matrix.qza
Saved DistanceMatrix to: core-metrics-results_pDB_noHuman/bray_curtis_distance_matrix.qza
Saved PCoAResults to: core-metrics-results_pDB_noHuman/jaccard_pcoa_results.qza
Saved PCoAResults to: core-metrics-results_pDB_noHuman/bray_curtis_pcoa_results.qza
Saved Visualization to: core-metrics-results_pDB_noHuman/jaccard_emperor.qzv
Saved Visualization to: core-metrics-results_pDB_noHuman/bray_curtis_emperor.qzv


In [35]:
#Huge experimental (TMT experiment) variability in this approach.

!qiime diversity core-metrics \
  --i-table 2search_common_biom.qza \
  --p-sampling-depth 694193 \
  --m-metadata-file ../UC_MP_Emperor_Map.txt \
  --output-dir core-metrics-results_2search

Saved FeatureTable[Frequency] to: core-metrics-results_2search/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_2search/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_2search/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_2search/evenness_vector.qza
Saved DistanceMatrix to: core-metrics-results_2search/jaccard_distance_matrix.qza
Saved DistanceMatrix to: core-metrics-results_2search/bray_curtis_distance_matrix.qza
Saved PCoAResults to: core-metrics-results_2search/jaccard_pcoa_results.qza
Saved PCoAResults to: core-metrics-results_2search/bray_curtis_pcoa_results.qza
Saved Visualization to: core-metrics-results_2search/jaccard_emperor.qzv
Saved Visualization to: core-metrics-results_2search/bray_curtis_emperor.qzv


In [40]:
#Huge experimental (TMT experiment) variability in this approach.

!qiime diversity core-metrics \
  --i-table IGC_biom.qza \
  --p-sampling-depth 1813092 \
  --m-metadata-file ../UC_MP_Emperor_Map.txt \
  --output-dir core-metrics-results_IGC

Saved FeatureTable[Frequency] to: core-metrics-results_IGC/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_IGC/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_IGC/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_IGC/evenness_vector.qza
Saved DistanceMatrix to: core-metrics-results_IGC/jaccard_distance_matrix.qza
Saved DistanceMatrix to: core-metrics-results_IGC/bray_curtis_distance_matrix.qza
Saved PCoAResults to: core-metrics-results_IGC/jaccard_pcoa_results.qza
Saved PCoAResults to: core-metrics-results_IGC/bray_curtis_pcoa_results.qza
Saved Visualization to: core-metrics-results_IGC/jaccard_emperor.qzv
Saved Visualization to: core-metrics-results_IGC/bray_curtis_emperor.qzv


In [32]:
#Metabolomics

!qiime diversity core-metrics \
  --i-table Metabolomics_biom.qza \
  --p-sampling-depth 184074904 \
  --m-metadata-file ../UC_MP_Emperor_Map.txt \
  --output-dir core-metrics-results_Metabolomics

Saved FeatureTable[Frequency] to: core-metrics-results_Metabolomics/rarefied_table.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_Metabolomics/observed_otus_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_Metabolomics/shannon_vector.qza
Saved SampleData[AlphaDiversity] to: core-metrics-results_Metabolomics/evenness_vector.qza
Saved DistanceMatrix to: core-metrics-results_Metabolomics/jaccard_distance_matrix.qza
Saved DistanceMatrix to: core-metrics-results_Metabolomics/bray_curtis_distance_matrix.qza
Saved PCoAResults to: core-metrics-results_Metabolomics/jaccard_pcoa_results.qza
Saved PCoAResults to: core-metrics-results_Metabolomics/bray_curtis_pcoa_results.qza
Saved Visualization to: core-metrics-results_Metabolomics/jaccard_emperor.qzv
Saved Visualization to: core-metrics-results_Metabolomics/bray_curtis_emperor.qzv


In [2]:
#Shannon diversity metabolome

!qiime diversity alpha \
  --i-table Metabolomics_biom.qza \
  --p-metric shannon \
  --o-alpha-diversity Metabolome_shannon.qza

Saved SampleData[AlphaDiversity] to: Metabolome_shannon.qza


### Random Forest Classifier - Qiime2
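
The regression in this section targets the `pielou_e` metadata column, which is assumed here to hold Pielou's evenness as the cell comment suggests. As a quick reference, Pielou's evenness is the Shannon entropy divided by its maximum possible value, the logarithm of the number of observed features. The sketch below computes both in plain Python on invented counts. The base-2 logarithm is an assumption, and the evenness ratio is the same for any base.

```python
import math


def shannon(counts, base=2.0):
    """Shannon entropy of a vector of non-negative abundances."""
    total = sum(counts)
    # Zero counts contribute nothing to the entropy.
    proportions = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in proportions)


def pielou_evenness(counts, base=2.0):
    """Shannon entropy divided by log(number of observed features)."""
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        return float("nan")  # evenness is undefined for a single feature
    return shannon(counts, base) / math.log(observed, base)


# Hypothetical samples: one perfectly even, one dominated by one feature.
even = [25, 25, 25, 25]
skewed = [97, 1, 1, 1]
print(pielou_evenness(even), pielou_evenness(skewed))
```

A value near 1 means abundance is spread evenly across features, while a value near 0 means a few features dominate.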

In [4]:
#Code for determining features most related to the pielou evenness - supplemental figure

!qiime sample-classifier regress-samples \
--i-table ./16S_biom.qza \
--m-metadata-file ../UC_MP_Emperor_Map.txt \
--m-metadata-column pielou_e \
--p-optimize-feature-selection \
--p-parameter-tuning \
--p-estimator RandomForestRegressor \
--o-visualization \
../Random_Forests/pieloue_rforest_16S.qzv

Saved Visualization to: ../Random_Forests/pieloue_rforest_16S.qzv


In [None]:
#Create combined table with all data
Mb = pd.read_csv('./Metabolomics/Feature_table.txt', sep = '\t', index_col = '#OTUID')
Mp = pd.read_csv('./pDB_Proteomics/2Search/CSVs/NormalizedCommonReps.txt', sep = '\t', index_col = 'datarest$ProteinID')
Ser = pd.read_csv('./Serum/CSVs/NormalizedCommonReps_ids.txt', sep = '\t', index_col = 'datarest$ProteinID')
MG = pd.read_csv('./Genomics/Shotgun/Salmon_CPMs_0s.txt', sep = '\t', index_col = '#OTU')
Amp = pd.read_csv('./Genomics/16S/reference-hit_idswap2_noblanks.txt', sep = '\t', index_col = '#OTU ID')

#Feature IDs overlap between datasets, so append _MP to the metaproteome IDs to keep them distinct.
Mp.index = Mp.index + '_MP'

#Concatenate data
Objs = [Mb, Mp, Ser, MG, Amp]
alldf = pd.concat(Objs)

alldf.index.rename('Features', inplace = True)

#Save the new feature table
#alldf.to_csv('./allfeaturesconcat.txt', sep = '\t')
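
The ID-suffixing step above matters because a concatenated table with repeated feature IDs is ambiguous. A small sketch of the same idea on toy tables follows. The frames and IDs are invented, and `verify_integrity=True` is an optional pandas safeguard that is not used in the original cell.

```python
import pandas as pd

# Hypothetical stand-ins for two feature tables; both contain the ID 'P1'.
mp = pd.DataFrame({"H16": [1.0, 2.0], "L4": [3.0, 4.0]}, index=["P1", "k99_1_1"])
ser = pd.DataFrame({"H16": [5.0], "L4": [6.0]}, index=["P1"])

# Suffix one table's IDs so the combined index stays unique.
mp.index = mp.index + "_MP"

# verify_integrity=True raises ValueError if any feature ID is still duplicated.
combined = pd.concat([mp, ser], verify_integrity=True)
combined.index.rename("Features", inplace=True)
print(combined)
```

Without the suffix, the same `pd.concat` call would raise an error here, which is a convenient way to catch ID collisions before writing the combined table.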

In [None]:
#Create biom file
!biom convert -i ./allfeaturesconcat.txt \
-o ./Allfeaturesconcat.biom \
-m ./UC_MP_Emperor_Map_1.txt \
--table-type="OTU table" --to-hdf5

<i>Random forest analyses were run on a supercomputer using the Allfeaturesconcat.biom file.</i>

### 16S analysis - Qiime2

In [None]:
# Initializes the notebook with inline display
%matplotlib inline

from os import mkdir
import os
import copy
from os.path import abspath, join as pjoin, exists
from shutil import copy2, move
from time import strftime, strptime
from numpy import nan, isnan, arange
from pandas import read_csv, Series, DataFrame
from IPython.display import Image
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
from IPython.display import FileLinks, FileLink

In [None]:
!validate_mapping_file.py -o vmf-map/ -m ./11549_prep_4716_qiime_20180508-114955.txt

In [None]:
!biom convert -i ./reference-hit_idswap2.txt -o ./reference-hit_idswap2.biom --table-type="OTU table" --to-hdf5

#### Import all files as Qiime2 artifacts

In [None]:
!qiime tools import \
  --input-path reference-hit_idswap2.biom \
  --type 'FeatureTable[Frequency]' \
  --output-path biom_id2.qza

In [None]:
!qiime feature-table summarize \
  --i-table biom_id2.qza \
  --o-visualization biom_id2.qzv \
  --m-sample-metadata-file ./UC_Severity_1_MF_Idswap6.21.18.txt

In [None]:
## rep-seqs 
!qiime tools import \
  --input-path reference-hit.seqs.fa \
  --output-path sequences.qza \
  --type 'FeatureData[Sequence]'

In [None]:
!qiime alignment mafft \
  --i-sequences sequences.qza \
  --o-alignment aligned-rep-seqs.qza

In [None]:
!qiime alignment mask \
  --i-alignment aligned-rep-seqs.qza \
  --o-masked-alignment masked-aligned-rep-seqs.qza

In [None]:
!qiime phylogeny fasttree \
  --i-alignment masked-aligned-rep-seqs.qza \
  --o-tree unrooted-tree.qza

In [None]:
!qiime phylogeny midpoint-root \
  --i-tree unrooted-tree.qza \
  --o-rooted-tree rooted-tree.qza

#### Core Diversity Analysis

In [None]:
!qiime diversity core-metrics-phylogenetic \
  --i-phylogeny rooted-tree.qza \
  --i-table biom_id2.qza \
  --p-sampling-depth 4166 \
  --m-metadata-file UC_Severity_1_MF_Idswap6.21.18.txt \
  --output-dir core-metrics-results_idswap2_newmetadata_allsamples

#### Alpha diversity analyses

In [None]:
!qiime diversity alpha-group-significance \
  --i-alpha-diversity core-metrics-results_idswap2/faith_pd_vector.qza \
  --m-metadata-file UC_Severity_1_MF_Idswap6.21.18.txt \
  --o-visualization core-metrics-results_idswap2_newmetadata/faith-pd-group-significance.qzv

In [None]:
!qiime diversity alpha-group-significance \
  --i-alpha-diversity core-metrics-results_idswap2/evenness_vector.qza \
  --m-metadata-file UC_Severity_1_MF_Idswap6.21.18.txt \
  --o-visualization core-metrics-results_idswap2_newmetadata/evenness-group-significance.qzv

#### Taxonomic analysis

In [None]:
!qiime feature-classifier classify-sklearn \
  --i-classifier gg-13-8-99-515-806-nb-classifier.qza \
  --i-reads sequences.qza \
  --o-classification taxonomy.qza

In [None]:
!qiime metadata tabulate \
  --m-input-file taxonomy.qza \
  --o-visualization taxonomy.qzv

In [None]:
!qiime taxa barplot \
  --i-table biom_id2.qza \
  --i-taxonomy taxonomy.qza \
  --m-metadata-file UC_Severity_1_MF_Idswap6.21.18.txt \
  --o-visualization taxa-bar-plots_idswap2.qzv