# Construct integrated data matrices

This example analytical workflow integrates data from different experiments, sample types, and analytical methods. Its final output is a unified fold-change matrix for all identified metabolites.

This example workflow uses the biological sample types common in microbiome studies (e.g., feces, serum, urine). The same workflow applies to bacterial experiments, with two changes: 1) replace `sample_type` with `media_type`, which specifies the medium used to culture the bacteria, and 2) set the `experiment_type` column to `bacteria`.

First, the sample database and MS-DIAL analysis data files from all experiments are parsed and integrated into a single data matrix of raw ion counts across the three analytical methods (referred to as "modes" below). 

Next, a shared set of internal standards that are robustly detected across experiments is used to normalize within and across experiments, accounting for intra- and inter-experimental variation in instrument sensitivity. Specifically, the raw ion count of each metabolite is normalized to the sum of the internal standards specific to each experiment (e.g., mono-colonization) and sample type (e.g., serum).
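The normalization step can be sketched as follows. This is a minimal illustration, not the code in `data_analysis.py`: the column names `met_1`, `met_2`, `istd_a`, and `istd_b`, the `istds_by_group` lookup, and the function name are all hypothetical, and the sketch assumes a plain ratio of each run's ion counts to that run's summed internal-standard signal.

```python
import pandas as pd

# Toy raw ion counts: rows are runs, columns are metabolites and internal standards.
raw = pd.DataFrame(
    {
        "met_1": [1000.0, 2400.0, 900.0],
        "met_2": [500.0, 600.0, 300.0],
        "istd_a": [100.0, 200.0, 50.0],
        "istd_b": [100.0, 200.0, 100.0],
    },
    index=["m0011", "m0012", "m0054"],
)
groups = pd.DataFrame(
    {
        "experiment_type": ["mono-colonization"] * 3,
        "sample_type": ["serum", "serum", "feces"],
    },
    index=raw.index,
)
# Hypothetical lookup: internal standards considered robust for each
# (experiment type, sample type) combination.
istds_by_group = {
    ("mono-colonization", "serum"): ["istd_a", "istd_b"],
    ("mono-colonization", "feces"): ["istd_a"],
}


def normalize_to_istd_sum(raw, groups, istds_by_group):
    """Divide each run's ion counts by the summed ISTD signal of that run."""
    parts = []
    grouped = groups.groupby(["experiment_type", "sample_type"]).groups
    for key, runs in grouped.items():
        block = raw.loc[runs]
        istd_sum = block[istds_by_group[key]].sum(axis=1)
        parts.append(block.div(istd_sum, axis=0))
    # Restore the original run order.
    return pd.concat(parts).loc[raw.index]


istd_corrected = normalize_to_istd_sum(raw, groups, istds_by_group)
```

Keeping the internal-standard list per group makes it explicit that a standard can be used for one sample type and excluded for another.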

Lastly, the fold change matrix is generated by calculating the relative fold change between metabolite ion counts detected in colonized mice vs. germ-free controls for each experiment and sample type. A separate metadata file is also generated to provide details on properties associated with each sample run in each mode.
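As a rough sketch of the fold-change step, the example below compares every sample with the mean of the germ-free controls from the same experiment and sample type. The log2 scale is an assumption (the text does not state it, although the signed values shown later in this document are consistent with it), and the function and variable names are hypothetical rather than taken from `data_analysis.py`.

```python
import numpy as np
import pandas as pd

# Toy ISTD-corrected values: rows are samples, columns are metabolites.
corrected = pd.DataFrame(
    {"met_1": [1.0, 3.0, 8.0, 16.0], "met_2": [2.0, 2.0, 1.0, 4.0]},
    index=["s1001", "s1002", "s1005", "s1006"],
)
meta = pd.DataFrame(
    {
        "experiment_type": ["mono-colonization"] * 4,
        "sample_type": ["serum"] * 4,
        "colonization": ["germ-free", "germ-free", "Bt", "Bt"],
    },
    index=corrected.index,
)


def fold_change_vs_control(corrected, meta, control="germ-free"):
    """log2(sample / mean of control samples), per experiment and sample type."""
    out = []
    for _, group_meta in meta.groupby(["experiment_type", "sample_type"]):
        block = corrected.loc[group_meta.index]
        is_control = (group_meta["colonization"] == control).values
        control_mean = block[is_control].mean(axis=0)
        out.append(np.log2(block / control_mean))
    return pd.concat(out).loc[corrected.index]


fold_change = fold_change_vs_control(corrected, meta)
```

With this definition the control samples scatter around zero, and colonized samples are read as doublings or halvings relative to the control mean.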

See supporting code in `data_analysis.py`.


## Input files

### MS-DIAL Data Map

The MS-DIAL data map documents the file paths and worksheet names of all MS-DIAL post-analysis data files. One MS-DIAL data file is generated for each of the three modes (C18 positive, C18 negative, or HILIC positive). Each MS-DIAL data file contains all experiments analyzed using the same mode, with each experiment stored as a separate worksheet in the same Excel file.

experiment_type   | chromatography | ionization | sample_type | msdial_fp                     | sheetname                   | collection_time
----------------- | -------------- | ---------- | ----------- | ----------------------------- | --------------------------- | ---------------
mono-colonization | c18            | positive   | feces       | c18_positive_msdial_data.xlsx | 20180504_GFBtCs_FecesCaecal | 1
mono-colonization | c18            | positive   | caecal      | c18_positive_msdial_data.xlsx | 20180504_GFBtCs_FecesCaecal | 1

Required columns: 

- msdial_fp: The file path to the MS-DIAL data file (relative to the current file)
- sheetname: The name of the worksheet inside the MS-DIAL data file
- experiment_type: The type of experiment (e.g., 'mono-colonization' for mouse experiments, 'bacterial_culture' for bacterial experiments) 
- chromatography: Should be either `c18` or `hilic`
- ionization: Should be either `positive` or `negative`
- sample_type: The biological sample type collected (e.g., serum, urine, feces, caecal) from mammalian hosts. For bacterial data, name this column `media_type`; it gives the medium in which the bacteria were cultured.

Optional column:

- collection_time: Used for distinguishing samples collected from independent experimental repeats of the same set of conditions. Use `1` for the first experiment, and `2`, `3`, and so on for any additional experimental repeats. 
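A quick check of a data map against the column rules above might look like the sketch below. The helper `check_data_map` and its messages are hypothetical and not part of `data_analysis.py`; the column names and allowed values are the ones listed in this section.

```python
import pandas as pd

REQUIRED_COLUMNS = ["msdial_fp", "sheetname", "experiment_type", "chromatography", "ionization"]
ALLOWED_VALUES = {
    "chromatography": {"c18", "hilic"},
    "ionization": {"positive", "negative"},
}


def check_data_map(data_map):
    """Return a list of problems; an empty list means the map passed these checks."""
    problems = [
        f"missing column: {col}" for col in REQUIRED_COLUMNS if col not in data_map.columns
    ]
    # Mouse data uses sample_type; bacterial data uses media_type instead.
    if not {"sample_type", "media_type"} & set(data_map.columns):
        problems.append("missing column: sample_type (or media_type)")
    for col, allowed in ALLOWED_VALUES.items():
        if col in data_map.columns:
            unexpected = set(data_map[col].dropna()) - allowed
            if unexpected:
                problems.append(f"unexpected {col} values: {sorted(unexpected)}")
    return problems


data_map = pd.DataFrame(
    {
        "experiment_type": ["mono-colonization", "mono-colonization"],
        "chromatography": ["c18", "c18"],
        "ionization": ["positive", "positive"],
        "sample_type": ["feces", "caecal"],
        "msdial_fp": ["c18_positive_msdial_data.xlsx"] * 2,
        "sheetname": ["20180504_GFBtCs_FecesCaecal"] * 2,
        "collection_time": [1, 1],
    }
)
problems = check_data_map(data_map)

# A deliberately broken copy, to show what the report looks like.
bad_map = data_map.drop(columns=["sheetname"]).assign(ionization="pos")
bad_problems = check_data_map(bad_map)
```

Running a check like this before the analysis surfaces typos in the map early, rather than as a failure partway through parsing the Excel files.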


### MS-DIAL Data

The MS-DIAL data file contains all post-analysis output from the MS-DIAL analysis of one mode. If all three modes are used, there should be three MS-DIAL data files, one per mode.

Average Rt(min) | Average Mz | Metabolite name | GF_F4  | GF_F3  | GF_F2  | GF_F1
--------------- | ---------- | --------------- | ------ | ------ | ------ | ------
0.71            | 180.0864   | m_c18p_0000     | 131492 | 85440  | 164425 | 130352
3.46            | 153.054    | m_c18p_0006     | 173172 | 139179 | 131060 | 160408
1.64            | 132.1021   | m_c18p_0018     | 43081  | 41948  | 34707  | 39552

Required columns:

- Metabolite name: Contains metabolite dnames matching the dnames from the published mz-rt compound library.
- `sample name columns`: The columns coming after `Metabolite name` should be MS-DIAL sample name columns containing raw ion counts
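To illustrate the layout rule above (every column after `Metabolite name` holds raw ion counts for one sample), here is a small sketch that turns one worksheet into a samples-by-dnames table. The function name is hypothetical; the values are the example rows shown in the table above.

```python
import pandas as pd

# One worksheet, as it might look after pd.read_excel(...).
sheet = pd.DataFrame(
    {
        "Average Rt(min)": [0.71, 3.46, 1.64],
        "Average Mz": [180.0864, 153.054, 132.1021],
        "Metabolite name": ["m_c18p_0000", "m_c18p_0006", "m_c18p_0018"],
        "GF_F4": [131492, 173172, 43081],
        "GF_F3": [85440, 139179, 41948],
        "GF_F2": [164425, 131060, 34707],
        "GF_F1": [130352, 160408, 39552],
    }
)


def raw_counts_from_sheet(sheet):
    """Return ion counts with MS-DIAL sample names as rows and dnames as columns."""
    first_sample = sheet.columns.get_loc("Metabolite name") + 1
    sample_columns = sheet.columns[first_sample:]
    return sheet.set_index("Metabolite name")[sample_columns].T


raw_counts = raw_counts_from_sheet(sheet)
```

Selecting by position relative to `Metabolite name` avoids hard-coding sample names, which differ between experiments.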

### Sample Database

The database file contains all information regarding each sample that is used in this workflow. Specifically, a unique sample ID is assigned to each sample row, and different run IDs are assigned to the same sample (and the same sample ID) when it is analyzed using different modes. For example, when one sample is analyzed using two or three modes, two or three unique run IDs can be associated with the same sample and its sample ID. 

sample_id | run_id | ms_dial_sample_name | chromatography | ionization | sample_type | colonization | experiment_type   | collection_time
--------- | ------ | ------------------- | -------------- | ---------- | ----------- | ------------ | ----------------- | ---------------
s1001     | m0011  | GF_F1               | c18            | positive   | feces       | germ-free    | mono-colonization | 1
s1002     | m0012  | GF_F2               | c18            | positive   | feces       | germ-free    | mono-colonization | 1
s1003     | m0013  | GF_F3               | c18            | positive   | feces       | germ-free    | mono-colonization | 1
s1004     | m0014  | GF_F4               | c18            | positive   | feces       | germ-free    | mono-colonization | 1
s1001     | m0054  | GF_F1               | c18            | negative   | feces       | germ-free    | mono-colonization | 1
s1002     | m0055  | GF_F2               | c18            | negative   | feces       | germ-free    | mono-colonization | 1
s1003     | m0056  | GF_F3               | c18            | negative   | feces       | germ-free    | mono-colonization | 1
s1004     | m0057  | GF_F4               | c18            | negative   | feces       | germ-free    | mono-colonization | 1

Required columns:

- sample_id: Uniquely identifies a sample. When the same sample is run through multiple modes, the rows belonging to the same sample should have the same sample_id.
- run_id: Uniquely identifies one run of a sample in one mode. Every row in the database should have a different run_id.
- ms_dial_sample_name: The sample names in this column should match the column names in the MS-DIAL data files
- chromatography: Should be either `c18` or `hilic`
- ionization: Should be either `positive` or `negative`
- sample_type: The biological sample type collected (e.g., serum, urine, feces, caecal) from mammalian hosts. For bacterial data, name this column `media_type`; it gives the medium in which the bacteria were cultured.
- colonization: The colonization status (e.g., 'germ-free') for mouse data. For bacterial data, name this column `bacteria`; it gives the bacterial species used in the sample. Fold change values are calculated relative to the control rows: colonization = `germ-free` for mouse data, and `media_blank` for bacterial data.
- experiment_type: Represents a unique set of experimental conditions.

Optional column:

- collection_time: Used for distinguishing samples collected from independent experimental repeats of the same set of conditions. Use `1` for the first experiment, and `2`, `3`, and so on for any additional experimental repeats. 
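The relationship between sample IDs and run IDs can be sketched with a pivot: one row per sample, one column per mode, with the run ID in each cell. This is an illustration only; the `mode` helper column is an assumption, chosen to mirror the `c18positive` and `c18negative` column names that appear in the metadata output further down.

```python
import pandas as pd

db = pd.DataFrame(
    {
        "sample_id": ["s1001", "s1002", "s1001", "s1002"],
        "run_id": ["m0011", "m0012", "m0054", "m0055"],
        "chromatography": ["c18"] * 4,
        "ionization": ["positive", "positive", "negative", "negative"],
    }
)

# Every row must carry its own run_id.
assert db["run_id"].is_unique, "duplicate run_id found"

# Hypothetical helper column combining chromatography and ionization.
db["mode"] = db["chromatography"] + db["ionization"]

# pivot raises an error if a sample has two runs in the same mode,
# which doubles as a consistency check on the database.
runs_by_sample = db.pivot(index="sample_id", columns="mode", values="run_id")
```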


### mz-rt compound library

Contains all the compounds and dnames relevant to the data analysis. This file is identical to the published mz-rt library in Supplementary Table 1 (Han et al, Nature, 2021).

dname       | Compound            | Peak
----------- | ------------------- | ------
m_c18p_0312 | O-PHOSPHO-SERINE    |
m_c18n_0016 | 3-PHENYLLACTIC ACID | PEAK1
m_c18n_0016 | 3-PHENYLLACTIC ACID | PEAK2

Required columns:

- dname: The unique ID assigned to each compound in a given mode. If a compound is detected in multiple modes, it has a different dname in each mode. Very occasionally, a dname is repeated in separate rows, because the rows correspond to two adjacent peaks of the same compound analyzed in the same mode.
- Compound: The compound name assigned based on information from the PubChem database.
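The sketch below shows one way dnames could be translated into compound-plus-mode labels of the kind that appear as column names in the outputs below (e.g., `URIDINE.c18positive`). It is an assumption-laden illustration: the `MODE_LABELS` mapping and the function name are hypothetical, and the mode is read from the middle part of the dname.

```python
import pandas as pd

library = pd.DataFrame(
    {
        "dname": ["m_c18p_0312", "m_c18n_0016", "m_c18n_0016"],
        "Compound": ["O-PHOSPHO-SERINE", "3-PHENYLLACTIC ACID", "3-PHENYLLACTIC ACID"],
        "Peak": [None, "PEAK1", "PEAK2"],
    }
)

# Assumed mapping from the dname mode code to the mode suffix used in column names.
MODE_LABELS = {"c18p": "c18positive", "c18n": "c18negative", "hilicp": "hilicpositive"}


def dname_labels(library):
    """Map each dname to 'COMPOUND.mode', keeping one entry per repeated dname."""
    compounds = library.drop_duplicates("dname").set_index("dname")["Compound"]
    return {
        dname: f"{compound}.{MODE_LABELS[dname.split('_')[1]]}"
        for dname, compound in compounds.items()
    }


labels = dname_labels(library)
```

Dropping repeated dnames first means that a compound with two adjacent peaks still maps to a single label.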

In [1]:
import numpy as np
import pandas as pd
import os
from data_analysis import DataAnalysis

db = pd.read_excel('input/sample_database.xlsx').set_index('run_id')

msdial_analysis_map = pd.read_excel('input/msdial_data_map.xlsx')
msdial_analysis_dir = os.path.dirname('input/msdial_data_map.xlsx')

cpd_library = pd.read_excel('input/mz-rt_library.xlsx')

analysis = DataAnalysis(
    db,
    msdial_analysis_map=msdial_analysis_map,
    msdial_analysis_dir=msdial_analysis_dir,
    cpd_library=cpd_library
)

# Generate matrices with columns as dnames
analysis_result_with_dnames = analysis.run(
    output_cpd_names=False, 
    remove_dnames=True
)

analysis_result_with_dnames['metadata'].to_csv('metadata.txt', sep='\t')
analysis_result_with_dnames['raw_ion_counts_matrix'].to_csv('raw_ion_counts_matrix.txt', sep='\t')
analysis_result_with_dnames['istd_corrected_matrix'].to_csv('istd_corrected_matrix.txt', sep='\t')
analysis_result_with_dnames['fold_change_matrix'].to_csv('fold_change_matrix.txt', sep='\t')

# Generate matrices with columns as compound names for plotting purposes
analysis_result = analysis.run(
    output_cpd_names=True, 
    remove_dnames=True
)

metadata = analysis_result['metadata']
fold_change_matrix = analysis_result['fold_change_matrix']

In [2]:
metadata

sample_id | experiment_type   | sample_type | colonization | c18positive | c18negative | hilicpositive | collection_time
--------- | ----------------- | ----------- | ------------ | ----------- | ----------- | ------------- | ---------------
s1001     | mono-colonization | feces       | germ-free    | m0011       | m0054       | m0088         | 1
s1002     | mono-colonization | feces       | germ-free    | m0012       | m0055       | m0089         | 1
s1003     | mono-colonization | feces       | germ-free    | m0013       | m0056       | m0090         | 1
s1004     | mono-colonization | feces       | germ-free    | m0014       | m0057       | m0091         | 1
s1005     | mono-colonization | feces       | Bt           | m0015       | m0058       | m0092         | 1
s1006     | mono-colonization | feces       | Bt           | m0016       | m0059       | m0093         | 1
s1007     | mono-colonization | feces       | Bt           | m0017       | m0060       | m0094         | 1
s1008     | mono-colonization | feces       | Bt           | m0018       | m0061       | m0095         | 1
s1009     | mono-colonization | feces       | Bt           | m0019       | m0062       | m0096         | 1
s1010     | mono-colonization | feces       | Cs           | m0020       | m0063       | m0097         | 1


In [3]:
fold_change_matrix

sample_id | 1,5-ANHYDRO-GLUCITOL.c18negative | 1,6-ANHYDRO-B-GLUCOSE.c18negative | 1,6-ANHYDRO-B-GLUCOSE.c18positive | 1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID.c18positive | 1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID.hilicpositive | 1-METHYLADENOSINE.c18positive | 1-METHYLGUANIDINE.c18positive | 1-METHYLGUANIDINE.hilicpositive | 1-METHYLNICOTINAMIDE.c18positive | 1-METHYLNICOTINAMIDE.hilicpositive | ... | URIDINE.c18negative | URIDINE.c18positive | UROCANIC ACID.c18positive | URSOCHOLIC ACID.hilicpositive | VALINE.c18positive | VITAMIN D3.hilicpositive | XANTHINE.c18negative | XANTHOSINE.c18positive | XANTHURENIC ACID.c18negative | XANTHURENIC ACID.c18positive
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
s1001 |  | -0.218056 | -0.005839 | -0.06103 |  | -0.144163 |  | 0.254625 | -0.089892 | -0.004897 | ... | 0.280329 | 0.285868 | -0.166864 | -0.476704 |  | 0.148454 | 0.233243 |  |  | 0.024187
s1002 |  | 0.060171 | -0.09417 | 0.23834 |  | 0.100255 |  | -0.08571 | 0.319946 | 0.082823 | ... | -0.264986 | -0.235241 | 0.249569 | 0.328136 |  | -0.21334 | -0.173237 |  |  | -0.692183
s1003 |  | -0.002076 | 0.060398 | -0.447115 |  | -0.358842 |  | -0.110645 | -0.551284 | -0.375779 | ... | -0.109456 | -0.050325 | -0.005899 | -0.039374 |  | -0.170612 | 0.192142 |  |  | 0.084908
s1004 |  | 0.136356 | 0.034885 | 0.174122 |  | 0.314224 |  | -0.091603 | 0.175953 | 0.230955 | ... | 0.03757 | -0.05051 | -0.11333 | 0.074388 |  | 0.189812 | -0.33058 |  |  | 0.382453
s1005 |  | 0.599146 | -4.130862 | -0.872575 |  | -2.130107 |  | 0.844104 | -0.864498 | 0.115232 | ... | -1.657661 | -2.0002 | 3.462068 | 3.236005 |  | 0.092709 | 0.625561 |  |  | 4.490826
s1006 |  | 0.843235 | -3.99468 | -0.56093 |  | -0.036527 |  | -0.054176 | -0.460992 | 0.435536 | ... | -1.184175 | -1.832401 | 3.181058 | 4.562597 |  | 0.005544 | 2.075968 |  |  | 4.625499
s1007 |  | 0.440317 | -3.679783 | -0.447307 |  | 0.357342 |  | 0.390119 | -0.209119 | 0.925003 | ... | -1.413892 | -2.244526 | 3.041337 | 2.203029 |  | -0.21052 | -0.582959 |  |  | 4.645014
s1008 |  | 0.76062 | -3.962769 | -0.855617 |  | -2.122358 |  | 0.243682 | 0.052203 | 0.717259 | ... | -1.302852 | -2.007077 | 2.987108 | 3.552502 |  | -0.284693 | -0.569233 |  |  | 5.033459
s1009 |  | 0.559096 | -3.357828 | -1.438843 |  | -1.051576 |  | 0.318677 | 0.482707 | 1.060911 | ... | -1.472597 | -2.27625 | 2.105712 | 2.933618 |  | -0.23272 | 0.653569 |  |  | 5.13958
s1010 |  | -0.407534 | -0.473163 | -0.510748 |  | 0.950097 |  | 0.212652 | 0.834965 | 0.228821 | ... | -3.398783 | -3.669699 | 0.649495 | 0.849227 |  | 0.346354 | -1.853404 |  |  | 1.505358


## Mode picking preference

Because a metabolite can be detected by more than one mode (C18 positive, C18 negative, and/or HILIC positive), we developed a strategy to select a preferred mode and so reduce the dimensionality of the metabolite fold-change data. Based on the internal standard (ISTD)-corrected matrix, for each metabolite reported, we 1) compute the coefficient of variation (CV) of each set of biological replicates per sample type, colonization, and experiment, and 2) plot the CVs from all experiments in a summary histogram, so users can visually inspect their distributions.

When a metabolite is co-detected by two or three analytical methods: 

- A mode is chosen as preferred when its average CV is less than or equal to a threshold (0.2 by default; users can change this value in the code). If only one mode passes, the reported metabolite fold-change value is the one generated by that mode.
- If two or three modes pass the threshold, all of them are chosen as preferred, and the reported metabolite fold-change value is the average over the selected modes.

When a metabolite is detected by only one mode, its fold-change value is reported by default and is not eliminated based on the CV threshold.
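The selection rule can be sketched for a single metabolite as below. This is a simplified illustration rather than the logic inside `plot_cv_histograms`: the function names are hypothetical, replicate sets are identified here by one label instead of the full sample type, colonization, and experiment combination, and the CV uses the sample standard deviation (pandas default). The text does not say what happens when a co-detected metabolite has no mode under the threshold, so the sketch simply returns an empty list in that case.

```python
import pandas as pd


def average_cv(values, replicate_group):
    """Mean of per-group CVs (standard deviation / mean) for one metabolite in one mode."""
    grouped = values.groupby(replicate_group)
    return (grouped.std() / grouped.mean()).mean()


def pick_modes(cv_by_mode, threshold=0.2):
    """Keep every mode at or below the threshold; a lone detected mode is always kept."""
    if len(cv_by_mode) == 1:
        return list(cv_by_mode)
    return [mode for mode, cv in cv_by_mode.items() if cv <= threshold]


# Toy ISTD-corrected values for one metabolite detected in two modes.
replicate_group = pd.Series(["germ-free", "germ-free", "Bt", "Bt"])
istd_corrected = {
    "c18p": pd.Series([10.0, 11.0, 20.0, 22.0]),   # tight replicates
    "hilicp": pd.Series([5.0, 15.0, 10.0, 30.0]),  # noisy replicates
}

cv_by_mode = {
    mode: average_cv(values, replicate_group) for mode, values in istd_corrected.items()
}
preferred = pick_modes(cv_by_mode)
```

In this toy case only `c18p` passes the 0.2 threshold, so it would be the single preferred mode.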

In [4]:
# Plot CV histograms and generate the mode picking preferences
mode_picker = analysis.plot_cv_histograms(analysis_result['istd_corrected_matrix'], 
                                          metadata)

mode_picker.to_excel('mode_picking_preference.xlsx')
mode_picker

Generating histograms in cv_histograms/
Generating histograms in cv_histograms_codetected/


metabolite                            | mode_detected | mode_pref
------------------------------------- | ------------- | ----------
1,5-ANHYDRO-GLUCITOL                  | c18n          | c18n
1,6-ANHYDRO-B-GLUCOSE                 | c18n, c18p    | c18n, c18p
1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID | c18p, hilicp  | c18p
1-METHYLADENOSINE                     | c18p          | c18p
1-METHYLGUANIDINE                     | c18p, hilicp  | hilicp
...                                   | ...           | ...
VALINE                                | c18p          | c18p
VITAMIN D3                            | hilicp        | hilicp
XANTHINE                              | c18n          | c18n
XANTHOSINE                            | c18p          | c18p


# Join metabolite data collected by three analytical methods

When a metabolite is detected by multiple analytical methods ("modes"), its fold-change values are averaged across the preferred modes. The mode preference is based on the consistency of detection, measured as the coefficient of variation (CV) across all sets of biological replicates for each mode.

The following calculation uses the file "mode_picking_preference.xlsx" generated above to choose data from the best detected mode or average data among two or more well-performing modes for each metabolite.
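A rough sketch of the collapsing step is shown below. It is not the implementation of `collapse_modes`: the function name and the `MODE_SUFFIX` mapping are hypothetical. The sketch treats the fold-change values as log2 ratios and averages them on the linear scale before converting back; that choice is an inference, made because it reproduces the example values printed in this document for `1,6-ANHYDRO-B-GLUCOSE` in samples s1001 and s1002.

```python
import numpy as np
import pandas as pd

# Assumed mapping from mode codes in the preference table to column suffixes.
MODE_SUFFIX = {"c18p": "c18positive", "c18n": "c18negative", "hilicp": "hilicpositive"}


def collapse(fold_change, mode_pref):
    """Average each metabolite over its preferred modes on the linear scale, in log2 units."""
    collapsed = {}
    for metabolite, modes in mode_pref.items():
        columns = [f"{metabolite}.{MODE_SUFFIX[m.strip()]}" for m in modes.split(",")]
        linear = np.exp2(fold_change[columns])
        collapsed[metabolite] = np.log2(linear.mean(axis=1))
    return pd.DataFrame(collapsed)


# Values copied from the fold-change matrix shown earlier in this document.
fold_change = pd.DataFrame(
    {
        "1,6-ANHYDRO-B-GLUCOSE.c18negative": [-0.218056, 0.060171],
        "1,6-ANHYDRO-B-GLUCOSE.c18positive": [-0.005839, -0.094170],
        "1-METHYLGUANIDINE.hilicpositive": [0.254625, -0.085710],
    },
    index=["s1001", "s1002"],
)
mode_pref = pd.Series(
    {"1,6-ANHYDRO-B-GLUCOSE": "c18n, c18p", "1-METHYLGUANIDINE": "hilicp"}
)

collapsed = collapse(fold_change, mode_pref)
```

For a metabolite with a single preferred mode, the round trip through the linear scale leaves the value unchanged up to floating-point error.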

In [5]:
analysis = DataAnalysis()

fold_change_matrix_mode_collapsed = analysis.collapse_modes(
    fold_change_matrix, 
    mode_picker # mode picker generated from previous section
)

fold_change_matrix_mode_collapsed.to_csv('fold_change_matrix_mode_collapsed.txt', sep='\t')

In [6]:
fold_change_matrix_mode_collapsed

sample_id | 1,5-ANHYDRO-GLUCITOL | 1,6-ANHYDRO-B-GLUCOSE | 1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID | 1-METHYLADENOSINE | 1-METHYLGUANIDINE | 1-METHYLNICOTINAMIDE | 1-OLEOYL-RAC-GLYCEROL | 2,4-DIHYDROXYBUTANOIC ACID | 2,6-DIAMINOHEPTANEDIOIC ACID | 2,6-DIHYDROXYBENZOIC ACID | ... | URACIL | URIC ACID | URIDINE | UROCANIC ACID | URSOCHOLIC ACID | VALINE | VITAMIN D3 | XANTHINE | XANTHOSINE | XANTHURENIC ACID
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
s1001 |  | -0.108049 | -0.06103 | -0.144163 | 0.254625 | -0.046769 |  | 0.236494 |  | -0.273943 | ... | 0.677032 | -0.119102 | 0.285868 | -0.166864 | -0.476704 |  | 0.148454 | 0.233243 |  | 0.024187
s1002 |  | -0.014937 | 0.23834 | 0.100255 | -0.08571 | 0.206251 |  | 0.024265 |  | -1.58813 | ... | 0.493056 | 0.831752 | -0.235241 | 0.249569 | 0.328136 |  | -0.21334 | -0.173237 |  | -0.692183
s1003 |  | 0.029499 | -0.447115 | -0.358842 | -0.110645 | -0.460865 |  | -0.273559 |  | 1.133007 | ... | -1.242202 | -0.599941 | -0.050325 | -0.005899 | -0.039374 |  | -0.170612 | 0.192142 |  | 0.084908
s1004 |  | 0.086512 | 0.174122 | 0.314224 | -0.091603 | 0.203716 |  | -0.03263 |  | -0.627744 | ... | -0.808431 | -0.64473 | -0.05051 | -0.11333 | 0.074388 |  | 0.189812 | -0.33058 |  | 0.382453
s1005 |  | -0.34749 | -0.872575 | -2.130107 | 0.844104 | -0.293017 |  | 0.596317 |  | 2.040091 | ... | 3.508261 | -0.003 | -2.0002 | 3.462068 | 3.236005 |  | 0.092709 | 0.625561 |  | 4.490826
s1006 |  | -0.107182 | -0.56093 | -0.036527 | -0.054176 | 0.05582 |  | -0.392388 |  | 2.142802 | ... | 2.284099 | -0.389697 | -1.832401 | 3.181058 | 4.562597 |  | 0.005544 | 2.075968 |  | 4.625499
s1007 |  | -0.479014 | -0.447307 | 0.357342 | 0.390119 | 0.466629 |  | 0.105126 |  | 1.129072 | ... | 1.502961 | -0.500564 | -2.244526 | 3.041337 | 2.203029 |  | -0.21052 | -0.582959 |  | 4.645014
s1008 |  | -0.185776 | -0.855617 | -2.122358 | 0.243682 | 0.422719 |  | -0.541848 |  | 1.89502 | ... | 1.912114 | -0.143964 | -2.007077 | 2.987108 | 3.552502 |  | -0.284693 | -0.569233 |  | 5.033459
s1009 |  | -0.34842 | -1.438843 | -1.051576 | 0.318677 | 0.800584 |  | 0.406672 |  | 2.57521 | ... | 2.007252 | -0.26384 | -2.27625 | 2.105712 | 2.933618 |  | -0.23272 | 0.653569 |  | 5.13958
s1010 |  | -0.439976 | -0.510748 | 0.950097 | 0.212652 | 0.563495 |  | -0.115114 |  | 0.779426 | ... | -0.567564 | -0.649932 | -3.669699 | 0.649495 | 0.849227 |  | 0.346354 | -1.853404 |  | 1.505358


# Compute statistics

P-values are calculated with Student's t-test and corrected for multiple comparisons with the Benjamini-Hochberg procedure. Both the raw and the corrected p-values are reported per metabolite and per sample type. See the supporting code in `calculate_stats_sig_with_correction.py`.
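The two statistical ingredients can be sketched as follows. This is a minimal illustration and not the code in `calculate_stats_sig_with_correction.py`: the `benjamini_hochberg` helper and the toy data are hypothetical, the t-test uses SciPy's `ttest_ind` with its default equal-variance (Student's) setting, and the set of tests pooled for the correction is an assumption.

```python
import numpy as np
from scipy import stats


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)
    return adjusted


# Toy fold-change values for two metabolites in one sample type.
germ_free = {"met_1": [0.1, -0.1, 0.0, 0.05], "met_2": [0.1, -0.2, 0.3, 0.0]}
colonized = {"met_1": [2.0, 2.2, 1.9, 2.1], "met_2": [0.0, 0.2, -0.1, 0.1]}

metabolites = ["met_1", "met_2"]
pvalues = [stats.ttest_ind(colonized[m], germ_free[m]).pvalue for m in metabolites]
pvalues_corrected = benjamini_hochberg(pvalues)
```

The corrected value is never smaller than the raw p-value, which is why metabolites with modest raw p-values in the tables below can end up well above common significance cutoffs after correction.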

### p-value calculation for serum metabolites

In [7]:
from calculate_stats_sig_with_correction import calculate_pvalues

colonizations = ['Bt', 'Cs']
sample_type = 'serum'

matrix = fold_change_matrix_mode_collapsed.join(metadata[['experiment_type', 'sample_type', 'colonization']])
matrix = matrix[matrix['colonization'].isin(colonizations + ['germ-free']) & (matrix['sample_type'] == sample_type)]

serum_pvalues = calculate_pvalues(matrix,
                                  colonizations=colonizations,
                                  sample_type=sample_type)

serum_pvalues.to_csv('serum_pvalues.txt', sep='\t')
serum_pvalues

metabolite                            | colonization | fc_value_avg | pvalue   | pvalue_corrected
------------------------------------- | ------------ | ------------ | -------- | ----------------
1,6-ANHYDRO-B-GLUCOSE                 | Bt           | 0.350215     | 0.039012 | 0.211890
1,6-ANHYDRO-B-GLUCOSE                 | Cs           | 0.226852     | 0.059663 | 0.261671
1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID | Bt           | -0.146675    | 0.435541 | 0.767603
1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID | Cs           | 0.226074     | 0.062912 | 0.261671
1-METHYLNICOTINAMIDE                  | Bt           | -0.000263    | 0.974820 | 0.989103
...                                   | ...          | ...          | ...      | ...
VALINE                                | Cs           | -0.001425    | 0.926748 | 0.970269
VITAMIN D3                            | Bt           | -0.263886    | 0.494690 | 0.784311
VITAMIN D3                            | Cs           | 0.576966     | 0.182440 | 0.515673
XANTHOSINE                            | Bt           | 0.155549     | 0.619141 | 0.816696


### p-value calculation for urine metabolites

In [8]:
colonizations = ['Bt', 'Cs']
sample_type = 'urine'

matrix = fold_change_matrix_mode_collapsed.join(metadata[['experiment_type', 'sample_type', 'colonization']])
matrix = matrix[matrix['colonization'].isin(colonizations + ['germ-free']) & (matrix['sample_type'] == sample_type)]

urine_pvalues = calculate_pvalues(matrix,
                                  colonizations=colonizations,
                                  sample_type=sample_type)

urine_pvalues.to_csv('urine_pvalues.txt', sep='\t')
urine_pvalues

metabolite                            | colonization | fc_value_avg | pvalue   | pvalue_corrected
------------------------------------- | ------------ | ------------ | -------- | ----------------
1,5-ANHYDRO-GLUCITOL                  | Bt           | -0.201694    | 0.381044 | 0.654715
1,5-ANHYDRO-GLUCITOL                  | Cs           | -4.530191    | 0.112377 | 0.400621
1,6-ANHYDRO-B-GLUCOSE                 | Bt           | -0.634500    | 0.142821 | 0.425951
1,6-ANHYDRO-B-GLUCOSE                 | Cs           | -0.558963    | 0.397898 | 0.666285
1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID | Bt           | 0.282983     | 0.050320 | 0.279047
...                                   | ...          | ...          | ...      | ...
URIDINE                               | Cs           | 0.130440     | 0.653621 | 0.804022
UROCANIC ACID                         | Bt           | -0.316654    | 0.611575 | 0.800201
UROCANIC ACID                         | Cs           | -0.369896    | 0.651990 | 0.804022
XANTHURENIC ACID                      | Bt           | 0.332625     | 0.242553 | 0.533699


### p-value calculation for caecal metabolites

In [9]:
colonizations = ['Bt', 'Cs']
sample_type = 'caecal'

matrix = fold_change_matrix_mode_collapsed.join(metadata[['experiment_type', 'sample_type', 'colonization']])
matrix = matrix[matrix['colonization'].isin(colonizations + ['germ-free']) & (matrix['sample_type'] == sample_type)]

caecal_pvalues = calculate_pvalues(matrix,
                                   colonizations=colonizations,
                                   sample_type=sample_type)

caecal_pvalues.to_csv('caecal_pvalues.txt', sep='\t')
caecal_pvalues

metabolite                            | colonization | fc_value_avg | pvalue       | pvalue_corrected
------------------------------------- | ------------ | ------------ | ------------ | ----------------
1,6-ANHYDRO-B-GLUCOSE                 | Bt           | -0.586226    | 8.304410e-04 | 0.002886
1,6-ANHYDRO-B-GLUCOSE                 | Cs           | -0.256677    | 7.715344e-02 | 0.120472
1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID | Bt           | -0.692923    | 1.215686e-03 | 0.003819
1-AMINOCYCLOPROPANE-1-CARBOXYLIC ACID | Cs           | -0.595714    | 7.410130e-03 | 0.016450
1-METHYLADENOSINE                     | Bt           | 0.581987     | 3.999027e-02 | 0.068096
...                                   | ...          | ...          | ...          | ...
VITAMIN D3                            | Cs           | 0.044059     | 5.747855e-01 | 0.648162
XANTHINE                              | Bt           | 0.300951     | 2.065227e-01 | 0.282470
XANTHINE                              | Cs           | -1.296677    | 1.117973e-02 | 0.023701
XANTHURENIC ACID                      | Bt           | 4.723487     | 1.371515e-07 | 0.000004
