This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for an LC-MS/MS input file. Input files used:

 - mal_g9_240min_raw.csv - demo raw MS/MS intensity file containing intensities for malate at 9 mM glucose at the 240 min time point, from the paper by Alves et al. (2015)
 - metadata_mq.csv - MS/MS fragment information: fragment name, chemical formulas of the parent and daughter ions, and the input label used (C13 in this example)
 - meta_sample_malglu.csv - metadata associated with sample names

In [1]:
import pandas as pd
import numpy as np
import re

import corna.constants as const
from corna.helpers import replace_negatives_in_column, merge_multiple_dfs
from corna.inputs import multiquant_parser
from corna.output import convert_to_df
from corna.postprocess import fractional_enrichment
from corna.algorithms.background_correction import background_correction 
from corna.algorithms.mimosa_nacorr import na_correction_mimosa

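**Load the input files and set the processing options. The natural abundance values of the elements come from `corna.constants` (`const.ISOTOPE_NA_MASS`), which is passed to the correction step.**

- isMSMS - set to True so that postprocessing treats the data as MS/MS
- sample_metadata - set to None here, so no sample metadata is merged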



In [2]:
raw_df = pd.read_csv('C13_MSMS_no_bg_input.csv')
metadata_df = pd.read_csv('metadata_mq.csv')
sample_metadata = None
isMSMS = True

Merge the raw intensity dataframe with the metabolite metadata and the sample metadata.

In [3]:
msms_df, list_of_replicates, sample_background = multiquant_parser.merge_mq_metadata(raw_df, metadata_df, sample_metadata)



In [4]:
msms_df

Unnamed: 0,Component Name,Sample,Intensity,Cohort Name,Formula,Name,Parent Formula,Isotopologue,Metab,Label
0,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,2710000,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
1,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,2820000,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
2,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,2590000,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
3,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,2520000,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
4,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,2480000,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
5,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,2050000,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
6,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,1920000,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
7,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13H66DDglcG9240...,2100000,TA13Cglucosewithvaryingglc2Sept13H66DDglcG9240...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
8,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,2490000,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
9,Glutamate 146/128,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,2150000,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,C5H6NO3,Glutamate 146/128,C5H8NO4,M+0,Glutamate,C13_146.0_128.0
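
The Label column in the output above appears to encode the isotope followed by the parent and daughter masses of the isotopologue (for example, the M+0 isotopologue of Glutamate 146/128 is labelled C13_146.0_128.0). The sketch below shows one way to split such a label into its parts. The layout `<isotope>_<parent mass>_<daughter mass>` is an assumption read off the table, and `parse_label` is a hypothetical helper, not part of corna.

```python
import re

# Assumed label layout: <isotope>_<parent mass>_<daughter mass>,
# e.g. "C13_146.0_128.0". This is inferred from the table, not from corna.
LABEL_PATTERN = re.compile(
    r"^(?P<isotope>[A-Z][a-z]?\d+)"
    r"_(?P<parent>\d+(?:\.\d+)?)"
    r"_(?P<daughter>\d+(?:\.\d+)?)$"
)


def parse_label(label):
    """Split a label string into (isotope, parent mass, daughter mass)."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError("Unexpected label format: {!r}".format(label))
    return (
        match.group("isotope"),
        float(match.group("parent")),
        float(match.group("daughter")),
    )


print(parse_label("C13_146.0_128.0"))  # ('C13', 146.0, 128.0)
```

Raising on an unexpected format, rather than returning a partial result, makes a mislabelled row visible early.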


Perform natural abundance correction. No separate background correction step is run in this example.

In [5]:
na_corrected = na_correction_mimosa(None, msms_df, const.ISOTOPE_NA_MASS)
na_corr_df = convert_to_df(na_corrected, isMSMS, const.NA_CORRECTED_COL)
na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)

In [6]:
na_corr_df

Unnamed: 0,Label,Sample,NA Corrected,Name,Formula,NA Corrected with zero
0,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13C13CglcG915mi...,34092.93,Lactate 89/89,C3H5O3,34092.93
1,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13E13CglcG960mi...,0.00,Lactate 89/89,C3H5O3,0.00
2,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,18587.90,Lactate 89/89,C3H5O3,18587.90
3,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13F13CglcG9120m...,41485.35,Lactate 89/89,C3H5O3,41485.35
4,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13F13CglcG9120m...,40374.69,Lactate 89/89,C3H5O3,40374.69
5,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13H66DDglcG9240...,0.00,Lactate 89/89,C3H5O3,0.00
6,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13F13CglcG9120m...,0.00,Lactate 89/89,C3H5O3,0.00
7,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13H66DDglcG9240...,336.87,Lactate 89/89,C3H5O3,336.87
8,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13A13CglcG90min...,0.00,Lactate 89/89,C3H5O3,0.00
9,C13_92.0_92.0,TA13Cglucosewithvaryingglc2Sept13C13CglcG915mi...,0.00,Lactate 89/89,C3H5O3,0.00
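
The "NA Corrected with zero" column above holds the corrected intensities with negative values set to 0. A minimal pandas sketch of that kind of clipping is shown below. `add_zero_clipped_column` is a hypothetical stand-in rather than corna's `replace_negatives_in_column`, and the demo values are made up for illustration.

```python
import pandas as pd


def add_zero_clipped_column(df, new_col, source_col):
    """Return a copy of df where new_col is source_col with negatives set to 0."""
    out = df.copy()
    out[new_col] = out[source_col].clip(lower=0)
    return out


# Made-up intensities, including one negative value.
demo = pd.DataFrame({"NA Corrected": [34092.93, -120.5, 0.0]})
demo = add_zero_clipped_column(demo, "NA Corrected with zero", "NA Corrected")
print(demo["NA Corrected with zero"].tolist())  # [34092.93, 0.0, 0.0]
```

Keeping the original column next to the clipped one makes it possible to see afterwards which values were changed.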


Negative NA Corrected values are replaced with 0 before fractional enrichment is calculated, because intensities cannot be negative and negative values are treated as noise. The following cell calculates fractional enrichments.

In [7]:
frac_enr_df = fractional_enrichment(na_corr_df)
frac_enr_df

Unnamed: 0,Sample,Name,Label,Formula,Pool_total,Fractional enrichment
0,TA13Cglucosewithvaryingglc2Sept13C13CglcG915mi...,Lactate 89/89,C13_92.0_92.0,C3H5O3,86779.58,0.392868
1,TA13Cglucosewithvaryingglc2Sept13C13CglcG915mi...,Lactate 89/89,C13_91.0_91.0,C3H5O3,86779.58,0.007010
2,TA13Cglucosewithvaryingglc2Sept13C13CglcG915mi...,Lactate 89/89,C13_90.0_90.0,C3H5O3,86779.58,0.000000
3,TA13Cglucosewithvaryingglc2Sept13C13CglcG915mi...,Lactate 89/89,C13_89.0_89.0,C3H5O3,86779.58,0.600122
4,TA13Cglucosewithvaryingglc2Sept13E13CglcG960mi...,Lactate 89/89,C13_92.0_92.0,C3H5O3,0.00,0.000000
5,TA13Cglucosewithvaryingglc2Sept13E13CglcG960mi...,Lactate 89/89,C13_91.0_91.0,C3H5O3,0.00,0.000000
6,TA13Cglucosewithvaryingglc2Sept13E13CglcG960mi...,Lactate 89/89,C13_90.0_90.0,C3H5O3,0.00,0.000000
7,TA13Cglucosewithvaryingglc2Sept13E13CglcG960mi...,Lactate 89/89,C13_89.0_89.0,C3H5O3,0.00,0.000000
8,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,Lactate 89/89,C13_92.0_92.0,C3H5O3,96470.00,0.192681
9,TA13Cglucosewithvaryingglc2Sept13B13CglcG95min...,Lactate 89/89,C13_91.0_91.0,C3H5O3,96470.00,0.010439
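
The values above are consistent with dividing each isotopologue's zero-clipped intensity by the pool total of its sample and fragment (for instance, 34092.93 / 86779.58 ≈ 0.392868), with the enrichment reported as 0 when the pool total is 0. The sketch below reproduces that arithmetic in pandas. It is an illustration under those assumptions, not corna's `fractional_enrichment`, and the sample names and intensities are made up.

```python
import pandas as pd


def fractional_enrichment_sketch(df, value_col="NA Corrected with zero"):
    """Add Pool_total and Fractional enrichment columns to a copy of df."""
    out = df.copy()
    # Pool total: sum over all isotopologues of one fragment in one sample.
    out["Pool_total"] = out.groupby(["Sample", "Name"])[value_col].transform("sum")
    # Mask empty pools so the division yields NaN there, then report 0.
    nonzero_pool = out["Pool_total"].where(out["Pool_total"] > 0)
    out["Fractional enrichment"] = (out[value_col] / nonzero_pool).fillna(0.0)
    return out


# Made-up data: sample s1 has signal, sample s2 has an empty pool.
demo = pd.DataFrame({
    "Sample": ["s1"] * 4 + ["s2"] * 4,
    "Name": ["Lactate 89/89"] * 8,
    "Label": ["C13_92.0_92.0", "C13_91.0_91.0",
              "C13_90.0_90.0", "C13_89.0_89.0"] * 2,
    "NA Corrected with zero": [30.0, 10.0, 0.0, 60.0, 0.0, 0.0, 0.0, 0.0],
})
result = fractional_enrichment_sketch(demo)
print(result[["Sample", "Label", "Pool_total", "Fractional enrichment"]])
```

Within each sample and fragment the enrichments sum to 1, except for empty pools, where every value is 0.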


Merge the input intensities, NA corrected values and fractional enrichments into one dataframe and save the output as a CSV file.

In [9]:
merged_df = merge_multiple_dfs([msms_df, na_corr_df, frac_enr_df])
merged_df.to_csv('C13_MSMS_no_bg_out.csv')
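
For readers without corna at hand, the merge step can be approximated by folding `pd.merge` over the list of dataframes. This is a sketch only: joining on the columns each pair shares and using an outer join are assumptions, and `merge_frames` is a hypothetical name, so the result may differ from what `merge_multiple_dfs` produces.

```python
from functools import reduce

import pandas as pd


def merge_frames(frames):
    """Merge DataFrames left to right on the columns each pair has in common."""
    # With no `on` argument, pd.merge joins on the intersection of column names.
    return reduce(lambda left, right: pd.merge(left, right, how="outer"), frames)


# Made-up frames sharing the key columns Sample and Label.
raw = pd.DataFrame({"Sample": ["s1", "s2"], "Label": ["L0", "L0"],
                    "Intensity": [10.0, 20.0]})
corrected = pd.DataFrame({"Sample": ["s1", "s2"], "Label": ["L0", "L0"],
                          "NA Corrected": [9.0, 18.0]})
enrichment = pd.DataFrame({"Sample": ["s1", "s2"], "Label": ["L0", "L0"],
                           "Fractional enrichment": [1.0, 1.0]})

merged = merge_frames([raw, corrected, enrichment])
print(merged)
```

An outer join keeps rows that appear in only one frame, which makes missing values visible in the output instead of silently dropping them.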