# Functional Prediction

You can learn more about PICRUSt2 on its [GitHub wiki](https://github.com/picrust/picrust2/wiki), in [this tutorial](https://github.com/picrust/picrust2/wiki/q2-picrust2-Tutorial), and in the [Nature Biotechnology article](https://doi.org/10.1038/s41587-020-0548-6).

<a id='setup'></a>
## 0. Setup

In [1]:
import os
import pandas as pd
import qiime2 as q2
import requests

from qiime2 import Visualization

data_dir = 'shared_data/'
    
%matplotlib inline

In [2]:
def fetch_ipath(ids: list, img_output_path: str, verbose: bool = False):
    """Fetches a enriched pathways map from iPATH3 for given IDs."""
    url = 'https://pathways.embl.de/mapping.cgi'
    
    # remove colon from EC names (e.g. 'EC:1.1.1.1' -> 'EC1.1.1.1');
    # applied to every ID so that an empty or mixed list is also handled
    ids = [x.replace(':', '') for x in ids]
    
    if verbose:
        print(f'Fetching iPATH3 diagram for ids: {ids}')
    params = {
        'default_opacity': 0.6,
        'export_type': 'svg',
        'selection': '\n'.join(ids)
    }   
    response = requests.get(url=url, params=params)
    # fail early on HTTP errors instead of writing an error page to disk
    response.raise_for_status()
    
    with open(img_output_path, 'wb') as img:
        img.write(response.content)
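
To make the request payload easier to inspect, here is a minimal sketch that rebuilds the parameters `fetch_ipath` sends, without contacting the server. The helper name `build_ipath_params` and the two example IDs are assumptions made for illustration; they are not part of the notebook.

```python
def build_ipath_params(ids, opacity=0.6):
    """Return the query parameters used by fetch_ipath (hypothetical helper)."""
    # the notebook strips the colon from EC names before sending them
    cleaned = [x.replace(':', '') for x in ids]
    return {
        'default_opacity': opacity,
        'export_type': 'svg',
        'selection': '\n'.join(cleaned),  # one identifier per line
    }


# example IDs chosen only for illustration
params = build_ipath_params(['EC:1.1.1.1', 'K00001'])
print(params['selection'])
```

Checking the payload this way can help when a returned map looks empty, since a malformed ID list is one possible cause.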

In [3]:
# path to the picrust2 conda environment - do not change!
picrust_env = '/opt/conda/envs/picrust2/bin'

<a id='picrust'></a>
## 1. Functional inference

We use PICRUSt2 to predict metagenome functional abundances from our representative sequences and feature table.


In [4]:
%%script env picrust_env="$picrust_env" data_dir="$data_dir" bash

# prepend the env location to PATH so that qiime
# can find all required executables
export PATH=$picrust_env:$PATH

$picrust_env/qiime picrust2 full-pipeline \
    --i-seq $data_dir/Denoising/dada2_rep_set.qza \
    --i-table $data_dir/Denoising/dada2_table.qza \
    --output-dir $data_dir/picrust2_results \
    --p-placement-tool sepp \
    --p-threads 2 \
    --p-hsp-method pic \
    --p-max-nsti 2 

QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment.


Saved FeatureTable[Frequency] to: shared_data//picrust2_results/ko_metagenome.qza
Saved FeatureTable[Frequency] to: shared_data//picrust2_results/ec_metagenome.qza
Saved FeatureTable[Frequency] to: shared_data//picrust2_results/pathway_abundance.qza


In [5]:
! qiime feature-table filter-samples \
    --i-table $data_dir/picrust2_results/ko_metagenome.qza \
    --m-metadata-file $data_dir/metadata/sample_metadata.tsv \
    --p-where "[alleged_abduction]='0'" \
    --o-filtered-table $data_dir/picrust2_results/ko_metagenome_abducted.qza

! qiime feature-table filter-samples \
    --i-table $data_dir/picrust2_results/ec_metagenome.qza \
    --m-metadata-file $data_dir/metadata/sample_metadata.tsv \
    --p-where  "[alleged_abduction]='0'" \
    --o-filtered-table $data_dir/picrust2_results/ec_metagenome_abducted.qza

! qiime feature-table filter-samples \
    --i-table $data_dir/picrust2_results/pathway_abundance.qza \
    --m-metadata-file $data_dir/metadata/sample_metadata.tsv \
    --p-where  "[alleged_abduction]='0'" \
    --o-filtered-table $data_dir/picrust2_results/pathway_abundance_abducted.qza

QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment.
Saved FeatureTable[Frequency] to: shared_data//picrust2_results/ko_metagenome_abducted.qza
Saved FeatureTable[Frequency] to: shared_data//picrust2_results/ec_metagenome_abducted.qza
Saved FeatureTable[Frequency] to: shared_data//picrust2_results/pathway_abundance_abducted.qza

Now we can read all three filtered artifacts with the QIIME 2 Python API and view them as pandas DataFrames:

In [6]:
ko_a = q2.Artifact.load(f'{data_dir}/picrust2_results/ko_metagenome_abducted.qza').view(pd.DataFrame)
ec_a = q2.Artifact.load(f'{data_dir}/picrust2_results/ec_metagenome_abducted.qza').view(pd.DataFrame)
pa_a = q2.Artifact.load(f'{data_dir}/picrust2_results/pathway_abundance_abducted.qza').view(pd.DataFrame)

Let's briefly examine the contents of each of those tables:

In [7]:
ko_a.head(1)

sample_id,K00001,K00002,K00003,K00004,K00005,K00007,K00008,K00009,K00010,K00011,...,K19777,K19778,K19779,K19780,K19784,K19785,K19787,K19788,K19789,K19791
0KB68F,1361.400386,22.47605,6638.807061,1283.583009,2819.624601,0.0,12651.473549,4992.0463,5519.033478,3.783984e-21,...,0.024526,0.0,0.0,0.0,609.994359,0.0,0.0,0.0,0.010882,0.00029


In [8]:
ec_a.head(1)

sample_id,EC:1.1.1.1,EC:1.1.1.10,EC:1.1.1.100,EC:1.1.1.101,EC:1.1.1.102,EC:1.1.1.103,EC:1.1.1.105,EC:1.1.1.107,EC:1.1.1.108,EC:1.1.1.11,...,EC:6.4.1.8,EC:6.5.1.1,EC:6.5.1.2,EC:6.5.1.3,EC:6.5.1.4,EC:6.5.1.5,EC:6.5.1.6,EC:6.5.1.7,EC:6.6.1.1,EC:6.6.1.2
0KB68F,8873.981782,0.001547,11849.507136,1.5720369999999998e-19,0.034769,958.574356,2.834738e-11,0.054358,34.422218,0.0,...,3.210485,1998.437665,6643.33606,266.921716,9.72331e-13,0.0,248.185353,248.185353,170.785566,111.015333


In [9]:
pa_a.head(1)

sample_id,1CMET2-PWY,3-HYDROXYPHENYLACETATE-DEGRADATION-PWY,AEROBACTINSYN-PWY,ALL-CHORISMATE-PWY,ANAEROFRUCAT-PWY,ANAGLYCOLYSIS-PWY,ARG+POLYAMINE-SYN,ARGDEG-PWY,ARGORNPROST-PWY,ARGSYN-PWY,...,THISYN-PWY,THREOCAT-PWY,THRESYN-PWY,TRNA-CHARGING-PWY,TRPSYN-PWY,TYRFUMCAT-PWY,UBISYN-PWY,UDPNAGSYN-PWY,VALDEG-PWY,VALSYN-PWY
0KB68F,5609.428252,56.803589,0.707068,0.0,6474.200627,8260.825591,981.9506,1e-05,1187.378324,5202.509647,...,2405.565906,42.45194,5210.690347,5785.023943,4657.981644,106.453291,40.903373,5043.6082,2.866319,7555.79568


In [10]:
! qiime feature-table filter-samples \
    --i-table $data_dir/picrust2_results/ko_metagenome.qza \
    --m-metadata-file $data_dir/metadata/sample_metadata.tsv \
    --p-where "[alleged_abduction]='1'" \
    --o-filtered-table $data_dir/picrust2_results/ko_metagenome_not_abducted.qza

! qiime feature-table filter-samples \
    --i-table $data_dir/picrust2_results/ec_metagenome.qza \
    --m-metadata-file $data_dir/metadata/sample_metadata.tsv \
    --p-where  "[alleged_abduction]='1'" \
    --o-filtered-table $data_dir/picrust2_results/ec_metagenome_not_abducted.qza

! qiime feature-table filter-samples \
    --i-table $data_dir/picrust2_results/pathway_abundance.qza \
    --m-metadata-file $data_dir/metadata/sample_metadata.tsv \
    --p-where  "[alleged_abduction]='1'" \
    --o-filtered-table $data_dir/picrust2_results/pathway_abundance_not_abducted.qza

Saved FeatureTable[Frequency] to: shared_data//picrust2_results/ko_metagenome_not_abducted.qza
Saved FeatureTable[Frequency] to: shared_data//picrust2_results/ec_metagenome_not_abducted.qza
Saved FeatureTable[Frequency] to: shared_data//picrust2_results/pathway_abundance_not_abducted.qza

In [11]:
ko_na = q2.Artifact.load(f'{data_dir}/picrust2_results/ko_metagenome_not_abducted.qza').view(pd.DataFrame)
ec_na = q2.Artifact.load(f'{data_dir}/picrust2_results/ec_metagenome_not_abducted.qza').view(pd.DataFrame)
pa_na = q2.Artifact.load(f'{data_dir}/picrust2_results/pathway_abundance_not_abducted.qza').view(pd.DataFrame)
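
The filtering cells above repeat the same `filter-samples` call for three tables and two metadata values. As a hedged sketch, the six commands could be generated in a loop. The helper `filter_command` is hypothetical; the flags, file names, and the value-to-suffix mapping are copied from the cells above. Nothing is executed here, the commands are only built and printed.

```python
data_dir = 'shared_data/'
tables = ['ko_metagenome', 'ec_metagenome', 'pathway_abundance']
# metadata value -> output file suffix, as used in the cells above
groups = {'0': 'abducted', '1': 'not_abducted'}


def filter_command(table, value, suffix):
    """Build the argument list for one filter-samples call (hypothetical helper)."""
    base = f'{data_dir}picrust2_results'
    return [
        'qiime', 'feature-table', 'filter-samples',
        '--i-table', f'{base}/{table}.qza',
        '--m-metadata-file', f'{data_dir}metadata/sample_metadata.tsv',
        '--p-where', f"[alleged_abduction]='{value}'",
        '--o-filtered-table', f'{base}/{table}_{suffix}.qza',
    ]


commands = [filter_command(t, v, s) for v, s in groups.items() for t in tables]
for cmd in commands:
    print(' '.join(cmd))
# each list could then be passed to subprocess.run(cmd, check=True)
```

Keeping the value-to-suffix mapping in one dictionary also makes it easier to double-check that each output file name matches the metadata value it was filtered on.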

In [13]:
ko_na.head(1)

sample_id,K00001,K00002,K00003,K00004,K00005,K00007,K00008,K00009,K00010,K00011,...,K19776,K19777,K19778,K19779,K19780,K19784,K19785,K19788,K19789,K19791
0DOSLC,54182.069959,6.981562,114120.68968,72354.209497,45604.268712,10.652446,79362.866673,59942.895952,33794.194597,5e-06,...,25.698715,37.359132,22.442708,5.201112,19.893207,27133.128495,0.007928,0.0,37.851223,0.0


In [14]:
ec_na.head(1)

sample_id,EC:1.1.1.1,EC:1.1.1.10,EC:1.1.1.100,EC:1.1.1.101,EC:1.1.1.102,EC:1.1.1.103,EC:1.1.1.105,EC:1.1.1.107,EC:1.1.1.108,EC:1.1.1.11,...,EC:6.4.1.8,EC:6.5.1.1,EC:6.5.1.2,EC:6.5.1.3,EC:6.5.1.4,EC:6.5.1.5,EC:6.5.1.6,EC:6.5.1.7,EC:6.6.1.1,EC:6.6.1.2
0DOSLC,279428.793373,1.404367e-09,333782.8077,3.7427060000000004e-17,0.064341,624.524986,2.125909,0.112491,7238.304061,10.652446,...,3.295559,2392.831915,136766.045596,6958.754988,34.220674,0.002587,941.460175,941.460175,1173.472589,633.545709


In [15]:
pa_na.head(1)

sample_id,1CMET2-PWY,3-HYDROXYPHENYLACETATE-DEGRADATION-PWY,AEROBACTINSYN-PWY,ALL-CHORISMATE-PWY,ANAEROFRUCAT-PWY,ANAGLYCOLYSIS-PWY,ARG+POLYAMINE-SYN,ARGDEG-PWY,ARGORNPROST-PWY,ARGSYN-PWY,...,THISYN-PWY,THREOCAT-PWY,THRESYN-PWY,TRNA-CHARGING-PWY,TRPSYN-PWY,TYRFUMCAT-PWY,UBISYN-PWY,UDPNAGSYN-PWY,VALDEG-PWY,VALSYN-PWY
0DOSLC,63771.529103,3814.332432,269.19087,364.005719,155715.806612,154301.035122,13280.963268,61.024873,91538.094937,81425.820836,...,4605.343494,906.927598,124580.518839,37532.251923,83719.081706,346.681116,169.56297,136811.33978,827.124515,133300.743599
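
In the KO previews above, K19787 appears among the displayed columns of the first group but not of the second, which suggests that the two filtered tables do not necessarily share the same set of columns. If the groups are to be compared column by column, one option is to align them first. The sketch below uses two made-up one-row tables (sample IDs and values are invented) and pandas' `DataFrame.align`, filling features absent from one group with 0.

```python
import pandas as pd

# toy tables: sample IDs and abundances are invented for illustration
a = pd.DataFrame({'K00001': [1.0], 'K19787': [2.0]}, index=['S1'])
b = pd.DataFrame({'K00001': [3.0], 'K19776': [4.0]}, index=['S2'])

# take the union of the columns; features missing in one table become 0
a_al, b_al = a.align(b, join='outer', axis=1, fill_value=0)

print(a_al)
print(b_al)
```

Whether 0 is the right fill value depends on the downstream analysis; it is used here only because a feature that was dropped from a group had no predicted abundance in the retained samples.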


You can see that they look just like the other feature tables we worked with before. The difference is that they no longer describe ASVs; instead, each table describes a different level of the functional profile:

1. `ko` table: columns represent KEGG orthologs, as indicated by their names (e.g., **K**19777)
2. `ec` table: columns represent enzymes, as indicated by the Enzyme Commission numbers (e.g., **EC**:1.1.1.108)
3. `pa` table: columns represent entire pathways using the MetaCyc classification (e.g., ANAGLYCOLYSIS-PWY)
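
The naming conventions listed above can be turned into a small check that tells which kind of table a column belongs to. This is only a sketch: the function name `feature_level` is hypothetical, and treating every remaining name as a MetaCyc pathway is an assumption that holds for the three tables shown here but not in general.

```python
import re


def feature_level(name):
    """Guess the functional level of a column name (hypothetical helper)."""
    if re.fullmatch(r'K\d{5}', name):
        return 'KEGG ortholog'
    if name.startswith('EC:'):
        return 'EC number'
    # assumption: anything else in these tables is a MetaCyc pathway ID
    return 'MetaCyc pathway'


for col in ['K19777', 'EC:1.1.1.108', 'ANAGLYCOLYSIS-PWY']:
    print(col, '->', feature_level(col))
```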

<a id='ipath'></a>
### 1.1 Enriched KEGG orthologs visualization

We start by merging each feature table with the grouping column (`disease`) from the metadata:

In [17]:
metadata = pd.read_csv(f'{data_dir}/metadata/sample_metadata.tsv', sep='\t', header=0, index_col=0)

In [18]:
ko_a_meta = ko_a.merge(metadata[['disease']], left_index=True, right_index=True)
ec_a_meta = ec_a.merge(metadata[['disease']], left_index=True, right_index=True)
pa_a_meta = pa_a.merge(metadata[['disease']], left_index=True, right_index=True)

In [19]:
ko_na_meta = ko_na.merge(metadata[['disease']], left_index=True, right_index=True)
ec_na_meta = ec_na.merge(metadata[['disease']], left_index=True, right_index=True)
pa_na_meta = pa_na.merge(metadata[['disease']], left_index=True, right_index=True)
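
Because `merge` performs an inner join by default, any sample that is missing from the metadata is silently dropped from the merged table. The toy example below (sample IDs and values are invented; only the `disease` column name comes from the notebook) shows this behaviour and one way to list the dropped samples.

```python
import pandas as pd

# toy feature table with three samples
table = pd.DataFrame({'K00001': [10.0, 20.0, 30.0]}, index=['S1', 'S2', 'S3'])
# toy metadata that lacks an entry for S3
meta = pd.DataFrame(
    {'disease': ['Leukemia', 'Leukemia'], 'age': [40, 50]},
    index=['S1', 'S2'],
)

merged = table.merge(meta[['disease']], left_index=True, right_index=True)
missing = table.index.difference(merged.index)

print(merged)
print('samples without metadata:', list(missing))
```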

Next, we calculate the average abundance of each KO, EC number, and pathway within each disease group:

In [20]:
# collapse samples per disease group - calculate average abundance

ko_a_meta_avg = ko_a_meta.groupby('disease').mean()
ec_a_meta_avg = ec_a_meta.groupby('disease').mean()
pa_a_meta_avg = pa_a_meta.groupby('disease').mean()
ko_na_meta_avg = ko_na_meta.groupby('disease').mean()
ec_na_meta_avg = ec_na_meta.groupby('disease').mean()
pa_na_meta_avg = pa_na_meta.groupby('disease').mean()
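
The six `groupby` calls above follow one pattern, so they could also be written as a loop over a dictionary of merged tables. The sketch below demonstrates the idea on a single made-up table; the dictionary name `merged_tables`, the key `ko_toy`, and all values are assumptions for illustration.

```python
import pandas as pd

# toy merged table: two KO columns plus the grouping column
toy = pd.DataFrame(
    {
        'K00001': [10.0, 30.0, 5.0],
        'K00002': [0.0, 2.0, 4.0],
        'disease': ['Leukemia', 'Leukemia', "Hodgkin's Disease"],
    },
    index=['S1', 'S2', 'S3'],
)

merged_tables = {'ko_toy': toy}

# one row per disease group, one column per feature
averages = {
    name: df.groupby('disease').mean()
    for name, df in merged_tables.items()
}

print(averages['ko_toy'])
```

In the notebook itself, the dictionary would hold the six `*_meta` DataFrames, and the per-group means would be looked up by key instead of by separate variable names.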

In [22]:
ko_a_meta_avg

disease,K00001,K00002,K00003,K00004,K00005,K00007,K00008,K00009,K00010,K00011,...,K19777,K19778,K19779,K19780,K19784,K19785,K19787,K19788,K19789,K19791
Hodgkin's Disease,19028.235594,1065.362017,34387.09214,5062.437658,14231.673873,0.909434,41327.780462,8815.128883,16317.379247,0.002319,...,357.612024,374.245338,82.476389,357.136535,1636.803313,0.0,0.0,0.0,764.903179,0.004102
Leukemia,17227.619162,423.810503,16639.077543,2499.003063,6982.46835,21.041076,18481.732069,11486.349201,9974.081115,0.003699,...,38.012421,371.852232,8.854783,35.672199,5126.323039,0.000112,0.000287,1.960334e-65,190.707274,0.002403
Myelodysplastic Syndromes,13056.800498,1193.270974,14331.828948,360.672058,22843.442305,0.0,28629.95561,3001.358,8689.148528,0.10126,...,0.0,0.180566,0.0,0.0,220.871235,0.0,0.0,0.0,1450.629391,0.0
Non-Hodgkin's Lymphoma,11439.37848,578.621286,26721.551141,3549.133715,21085.984097,1.041146,40508.693863,26224.881159,12625.656417,0.048428,...,352.133688,354.08343,87.898479,351.837638,4058.66147,0.00013,0.0,8.615288e-62,505.411611,0.001922


In [23]:
ec_a_meta_avg

disease,EC:1.1.1.1,EC:1.1.1.10,EC:1.1.1.100,EC:1.1.1.101,EC:1.1.1.102,EC:1.1.1.103,EC:1.1.1.105,EC:1.1.1.107,EC:1.1.1.108,EC:1.1.1.11,...,EC:6.4.1.8,EC:6.5.1.1,EC:6.5.1.2,EC:6.5.1.3,EC:6.5.1.4,EC:6.5.1.5,EC:6.5.1.6,EC:6.5.1.7,EC:6.6.1.1,EC:6.6.1.2
Hodgkin's Disease,59201.622706,1.936135,84269.563776,1.420883e-19,0.492604,9563.47267,4.016185e-10,1.398176,204.295806,0.909434,...,30.085516,6494.472966,45453.742876,4127.30498,371.255156,4.711116e-93,1369.242137,1369.242137,3375.390229,2979.604866
Leukemia,42676.254918,1.48646,59547.867017,6.923547999999999e-19,4.797126,1508.77492,0.4480973,0.850152,201.423154,21.041076,...,4.553,3345.707116,24506.12636,2459.493577,362.636418,6.425743e-05,510.747908,510.747908,1154.913759,1857.370608
Myelodysplastic Syndromes,21570.018224,0.930931,76914.138717,6.164715e-21,1e-05,7444.313042,0.0,0.361765,3975.211256,0.0,...,0.0,3237.27934,34037.845491,3234.30545,0.641981,0.0,338.171783,338.171783,6838.272213,12205.136175
Non-Hodgkin's Lymphoma,47317.641787,3.671824,73667.712361,9.768579999999999e-19,0.132309,3453.161945,0.08812602,2.626358,1960.874303,1.041146,...,98.21908,8831.497716,34687.161603,3556.853184,454.37882,5.192716e-05,1349.660664,1349.660664,5263.938253,4059.135492


In [24]:
pa_a_meta_avg

disease,1CMET2-PWY,3-HYDROXYPHENYLACETATE-DEGRADATION-PWY,AEROBACTINSYN-PWY,ALL-CHORISMATE-PWY,ANAEROFRUCAT-PWY,ANAGLYCOLYSIS-PWY,ARG+POLYAMINE-SYN,ARGDEG-PWY,ARGORNPROST-PWY,ARGSYN-PWY,...,THISYN-PWY,THREOCAT-PWY,THRESYN-PWY,TRNA-CHARGING-PWY,TRPSYN-PWY,TYRFUMCAT-PWY,UBISYN-PWY,UDPNAGSYN-PWY,VALDEG-PWY,VALSYN-PWY
Hodgkin's Disease,31162.419166,236.455288,5.834883,1765.841533,28808.684822,41706.15374,5869.759793,691.860993,5122.506043,31287.6296,...,17824.192086,350.323012,30792.296334,32061.643846,29981.198004,654.706874,1455.22073,18286.029379,161.726056,44546.328173
Leukemia,14474.596924,228.825864,98.334934,690.832831,22961.374273,26294.150322,3087.242212,261.65823,4693.925958,14788.007768,...,6551.055647,406.654119,17005.916606,15561.742897,13460.415022,241.475892,1396.212503,18221.914383,69.064722,20282.531193
Myelodysplastic Syndromes,27539.507557,10.19969,2.598471,100.701889,28461.994549,37301.772122,6832.735976,6.832042,5473.324694,21181.456474,...,21293.844391,30.95701,30566.034246,28779.619168,19231.325066,212.058219,642.385398,16477.394555,0.138583,35292.947268
Non-Hodgkin's Lymphoma,23344.962993,255.005188,7.088741,1103.823229,28260.608367,35310.414158,6211.704743,566.729025,7420.252549,23448.637803,...,8995.458905,678.610439,27794.143991,25098.281772,18179.796477,502.104125,949.595462,24969.884756,30.244374,31692.309386


In [25]:
ko_na_meta_avg

disease,K00001,K00002,K00003,K00004,K00005,K00007,K00008,K00009,K00010,K00011,...,K19776,K19777,K19778,K19779,K19780,K19784,K19785,K19788,K19789,K19791
Hodgkin's Disease,34953.4687,2.55238,41150.899491,4825.747676,6031.352334,2.987571,2319.676886,1318.989544,6554.63751,1.033848e-21,...,516.795292,60.329399,242.171153,14.429723,57.305988,1022.062706,0.0,0.0,1147.702727,0.42267
Leukemia,12857.970171,87.614986,36454.100688,25317.228261,17271.679642,12.264023,26162.447626,23579.801853,13815.782371,0.0007105095,...,1459.920138,671.360458,696.323734,146.345722,601.635315,12723.246532,0.010643,3.690182e-59,1491.308203,0.000295
Myelodysplastic Syndromes,1994.688516,0.0,6361.559141,0.374649,3321.432803,0.0,19096.681963,25235.315634,8.007399,0.0,...,0.0,0.0,0.0,0.0,0.0,3297.361743,0.0,0.0,40.44593,0.0
Non-Hodgkin's Lymphoma,16468.199466,76.265046,41743.571134,4283.070794,15978.635289,0.0,53992.016472,63361.706906,1237.352134,1.9271329999999998e-21,...,2.694251,2.572491,3.047133,0.639329,2.559274,7232.517849,0.0,0.0,147.29541,0.006617


In [26]:
ec_na_meta_avg

disease,EC:1.1.1.1,EC:1.1.1.10,EC:1.1.1.100,EC:1.1.1.101,EC:1.1.1.102,EC:1.1.1.103,EC:1.1.1.105,EC:1.1.1.107,EC:1.1.1.108,EC:1.1.1.11,...,EC:6.4.1.8,EC:6.5.1.1,EC:6.5.1.2,EC:6.5.1.3,EC:6.5.1.4,EC:6.5.1.5,EC:6.5.1.6,EC:6.5.1.7,EC:6.6.1.1,EC:6.6.1.2
Hodgkin's Disease,231499.446004,2e-06,53734.031499,4.042741e-20,1.012892,1244.27278,0.04558188,41.374385,111.70763,2.987571,...,21.85233,1718.308313,43140.181236,2953.253153,60.367298,3.3583439999999996e-124,359.114744,359.114744,1102.277545,1131.022601
Leukemia,82104.480065,0.27112,124752.655966,1.0006400000000001e-17,0.333364,2603.384905,0.2769499,0.086174,4841.721009,12.264023,...,1.874056,7091.477817,44897.114723,9195.087383,3605.289367,0.008241254,747.180949,747.180949,6465.561181,7710.808756
Myelodysplastic Syndromes,3918.127866,0.0,9123.134872,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,3291.16,6431.549308,0.0,0.0,0.0,0.0,0.0,240.10436,198.473166
Non-Hodgkin's Lymphoma,87493.46437,0.550642,73266.07786,2.8410709999999998e-19,0.017707,3856.331836,1.443694e-11,0.24562,91.187411,0.0,...,13.319997,12216.842675,42925.781427,295.685416,4.284694,0.0,294.224634,294.224634,664.838419,482.420288


In [27]:
pa_na_meta_avg

disease,1CMET2-PWY,3-HYDROXYPHENYLACETATE-DEGRADATION-PWY,AEROBACTINSYN-PWY,ALL-CHORISMATE-PWY,ANAEROFRUCAT-PWY,ANAGLYCOLYSIS-PWY,ARG+POLYAMINE-SYN,ARGDEG-PWY,ARGORNPROST-PWY,ARGSYN-PWY,...,THISYN-PWY,THREOCAT-PWY,THRESYN-PWY,TRNA-CHARGING-PWY,TRPSYN-PWY,TYRFUMCAT-PWY,UBISYN-PWY,UDPNAGSYN-PWY,VALDEG-PWY,VALSYN-PWY
Hodgkin's Disease,42485.462649,864.096703,459.858644,3086.112382,16521.682461,44396.42784,2411.114775,1637.173916,2216.609938,43174.928143,...,4788.809612,1372.54673,42802.903098,28873.829793,46079.973418,536.934202,1363.540412,42289.663817,0.743029,58880.829111
Leukemia,29240.267115,6260.231204,822.717498,5513.193568,46508.56544,48800.519429,11017.178063,2465.368221,25652.647536,29915.998824,...,10236.472859,2946.600518,41146.338823,29254.868565,28071.626259,4261.056857,5609.540358,42331.36625,217.641211,44847.75365
Myelodysplastic Syndromes,3664.39311,0.0,0.0,0.0,5598.846816,8351.513944,0.0,0.0,28.119596,5051.817137,...,161.837161,0.0,6108.471542,6047.609136,3954.873567,0.0,0.0,6531.384339,0.0,9099.813351
Non-Hodgkin's Lymphoma,42112.848363,36.017001,0.242872,68.961719,23851.1864,59788.860836,6819.374557,6.207856,2307.334787,41111.530388,...,10085.182092,43.019198,42527.89724,33097.505587,38342.722224,398.477999,444.067383,42166.907824,5.414037,57810.029458
