This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for low-resolution LC-MS data. The example uses a C13-labeled dataset:

 - C13_LCMS_large_example.csv - demo raw MS intensity file in ElMaven output format, with intensities for 24 samples and 56 metabolites each

In [1]:
import pandas as pd
import numpy as np
import re

from corna.inputs import maven_parser as parser
import corna.constants as const
from corna.helpers import replace_negatives_in_column, merge_multiple_dfs
from corna.algorithms.nacorr_lcms import na_correction
from corna.postprocess import fractional_enrichment


Read the raw file and merge it with the sample metadata, if present. This example runs without sample metadata.

In [2]:
raw_df = pd.read_csv('C13_LCMS_large_example.csv')
# An empty DataFrame means the example runs without sample metadata
sample_metadata = pd.DataFrame()

# To run the example with sample metadata instead:
# sample_metadata = pd.read_csv('C13_succ_metadata.csv')

merged_df, iso_tracer_data, element_list = parser.read_maven_file(raw_df, sample_metadata)
merged_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity,Unlabeled Fragment
0,Glycine,C12 PARENT,C2H5NO2,SAMPLE_2_10,1.527025e+06,Glycine
1,Glycine,C13-label-1,C2H5NO2,SAMPLE_2_10,5.085417e+04,Glycine
2,Glycine,C13-label-2,C2H5NO2,SAMPLE_2_10,0.000000e+00,Glycine
3,Pyruvic acid,C12 PARENT,C3H4O3,SAMPLE_2_10,2.603249e+05,Pyruvic acid
4,Pyruvic acid,C13-label-1,C3H4O3,SAMPLE_2_10,0.000000e+00,Pyruvic acid
5,Pyruvic acid,C13-label-2,C3H4O3,SAMPLE_2_10,0.000000e+00,Pyruvic acid
6,Pyruvic acid,C13-label-3,C3H4O3,SAMPLE_2_10,0.000000e+00,Pyruvic acid
7,L-Alanine,C12 PARENT,C3H7NO2,SAMPLE_2_10,1.398106e+06,L-Alanine
8,L-Alanine,C13-label-1,C3H7NO2,SAMPLE_2_10,5.677794e+04,L-Alanine
9,L-Alanine,C13-label-2,C3H7NO2,SAMPLE_2_10,1.604293e+05,L-Alanine


Perform NA correction. Inputs that are not relevant to this workflow are left empty. Negative corrected values are then replaced with zero in a separate column.

In [3]:
# Use the default natural abundance (NA) dictionary
na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13'], res_type='low res')


# Copy the NA corrected values into a new column with negatives replaced by zero
na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)
na_corr_df

Unnamed: 0,Name,Formula,Sample,NA Corrected,Intensity,Label,NA Corrected with zero
0,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,3.194672e+05,2.914025e+05,C12 PARENT,3.194672e+05
1,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,-1.592653e+04,9.314458e+03,C13-label-1,0.000000e+00
2,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,-2.933114e+03,0.000000e+00,C13-label-2,0.000000e+00
3,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,9.103946e+01,0.000000e+00,C13-label-3,9.103946e+01
4,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,1.863962e+01,0.000000e+00,C13-label-4,1.863962e+01
5,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,-1.479520e-01,0.000000e+00,C13-label-5,0.000000e+00
6,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,-9.057763e-02,0.000000e+00,C13-label-6,0.000000e+00
7,2-Isopropylmalic acid,C7H12O5,SAMPLE_2_10,1.501159e+05,1.480564e+05,C13-label-7,1.501159e+05
8,3-Phosphoglyceric acid,C3H7O7P,SAMPLE_2_10,0.000000e+00,0.000000e+00,C12 PARENT,0.000000e+00
9,3-Phosphoglyceric acid,C3H7O7P,SAMPLE_2_10,0.000000e+00,0.000000e+00,C13-label-1,0.000000e+00


Calculate fractional enrichments, merge all data into one table, and save it as 'C13_LCMS_large_example_out.csv'.

In [4]:
frac_enr_df = fractional_enrichment(na_corr_df)
frac_enr_df

Unnamed: 0,Sample,Name,Label,Formula,Pool_total,Fractional enrichment
0,SAMPLE_2_10,2-Isopropylmalic acid,C12 PARENT,C7H12O5,4.696928e+05,6.801620e-01
1,SAMPLE_2_10,2-Isopropylmalic acid,C13-label-1,C7H12O5,4.696928e+05,0.000000e+00
2,SAMPLE_2_10,2-Isopropylmalic acid,C13-label-2,C7H12O5,4.696928e+05,0.000000e+00
3,SAMPLE_2_10,2-Isopropylmalic acid,C13-label-3,C7H12O5,4.696928e+05,1.938277e-04
4,SAMPLE_2_10,2-Isopropylmalic acid,C13-label-4,C7H12O5,4.696928e+05,3.968470e-05
5,SAMPLE_2_10,2-Isopropylmalic acid,C13-label-5,C7H12O5,4.696928e+05,0.000000e+00
6,SAMPLE_2_10,2-Isopropylmalic acid,C13-label-6,C7H12O5,4.696928e+05,0.000000e+00
7,SAMPLE_2_10,2-Isopropylmalic acid,C13-label-7,C7H12O5,4.696928e+05,3.196045e-01
8,SAMPLE_2_10,3-Phosphoglyceric acid,C12 PARENT,C3H7O7P,0.000000e+00,0.000000e+00
9,SAMPLE_2_10,3-Phosphoglyceric acid,C13-label-1,C3H7O7P,0.000000e+00,0.000000e+00
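
The numbers above are consistent with a simple rule: Pool_total is the sum of the zero-replaced NA corrected values per sample and metabolite, and the fractional enrichment of a label is its share of that pool. The pandas sketch below reproduces the 2-Isopropylmalic acid rows under that assumption. It illustrates the arithmetic only and is not the code inside corna's `fractional_enrichment`.

```python
import pandas as pd

# NA Corrected values for 2-Isopropylmalic acid in SAMPLE_2_10, from the output above.
df = pd.DataFrame({
    "Sample": ["SAMPLE_2_10"] * 8,
    "Name": ["2-Isopropylmalic acid"] * 8,
    "Label": ["C12 PARENT"] + [f"C13-label-{i}" for i in range(1, 8)],
    "NA Corrected": [
        3.194672e05, -1.592653e04, -2.933114e03, 9.103946e01,
        1.863962e01, -1.479520e-01, -9.057763e-02, 1.501159e05,
    ],
})

# Replace negative corrected values with zero.
df["NA Corrected with zero"] = df["NA Corrected"].clip(lower=0)

# Pool total: sum over all labels of one metabolite in one sample.
pool = df.groupby(["Sample", "Name"])["NA Corrected with zero"].transform("sum")
df["Pool_total"] = pool

# Share of the pool per label; an empty pool (total of 0) gives an enrichment of 0.
df["Fractional enrichment"] = (
    df["NA Corrected with zero"] / pool.where(pool != 0)
).fillna(0.0)

print(df[["Label", "Pool_total", "Fractional enrichment"]])
```

The unlabeled row comes out at about 0.680 and the fully labeled row at about 0.320, matching the table.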


In [5]:
output_df = merge_multiple_dfs([merged_df, na_corr_df, frac_enr_df])
output_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity_x,Unlabeled Fragment,NA Corrected,Intensity_y,NA Corrected with zero,Pool_total,Fractional enrichment
0,Glycine,C12 PARENT,C2H5NO2,SAMPLE_2_10,1.527025e+06,Glycine,1.575871e+06,1.527025e+06,1.575871e+06,1.584735e+06,9.944061e-01
1,Glycine,C13-label-1,C2H5NO2,SAMPLE_2_10,5.085417e+04,Glycine,8.864906e+03,5.085417e+04,8.864906e+03,1.584735e+06,5.593934e-03
2,Glycine,C13-label-2,C2H5NO2,SAMPLE_2_10,0.000000e+00,Glycine,-6.705847e+03,0.000000e+00,0.000000e+00,1.584735e+06,0.000000e+00
3,Pyruvic acid,C12 PARENT,C3H4O3,SAMPLE_2_10,2.603249e+05,Pyruvic acid,2.713000e+05,2.603249e+05,2.713000e+05,2.713208e+05,9.999235e-01
4,Pyruvic acid,C13-label-1,C3H4O3,SAMPLE_2_10,0.000000e+00,Pyruvic acid,-9.518009e+03,0.000000e+00,0.000000e+00,2.713208e+05,0.000000e+00
5,Pyruvic acid,C13-label-2,C3H4O3,SAMPLE_2_10,0.000000e+00,Pyruvic acid,-1.484167e+03,0.000000e+00,0.000000e+00,2.713208e+05,0.000000e+00
6,Pyruvic acid,C13-label-3,C3H4O3,SAMPLE_2_10,0.000000e+00,Pyruvic acid,2.075329e+01,0.000000e+00,2.075329e+01,2.713208e+05,7.648988e-05
7,L-Alanine,C12 PARENT,C3H7NO2,SAMPLE_2_10,1.398106e+06,L-Alanine,1.459461e+06,1.398106e+06,1.459461e+06,1.618762e+06,9.015906e-01
8,L-Alanine,C13-label-1,C3H7NO2,SAMPLE_2_10,5.677794e+04,L-Alanine,2.124110e+03,5.677794e+04,2.124110e+03,1.618762e+06,1.312182e-03
9,L-Alanine,C13-label-2,C3H7NO2,SAMPLE_2_10,1.604293e+05,L-Alanine,1.571773e+05,1.604293e+05,1.571773e+05,1.618762e+06,9.709722e-02
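
The `Intensity_x` and `Intensity_y` columns appear because both the parsed table and the NA corrected table carry an `Intensity` column, and pandas adds suffixes when a non-key column exists on both sides of a merge. The sketch below shows the same effect with plain pandas. The key columns and the outer join are assumptions, and `merge_multiple_dfs` may be implemented differently.

```python
from functools import reduce

import pandas as pd

# Assumed merge keys: the columns that identify one row in every table.
keys = ["Name", "Label", "Formula", "Sample"]
ident = {
    "Name": ["Glycine"],
    "Label": ["C12 PARENT"],
    "Formula": ["C2H5NO2"],
    "Sample": ["SAMPLE_2_10"],
}

# One Glycine row per table, using values from the output above.
raw = pd.DataFrame({**ident, "Intensity": [1.527025e06]})
corrected = pd.DataFrame(
    {**ident, "NA Corrected": [1.575871e06], "Intensity": [1.527025e06]}
)
enrichment = pd.DataFrame({**ident, "Fractional enrichment": [9.944061e-01]})

# Merge the tables pairwise on the shared keys.
merged = reduce(
    lambda left, right: pd.merge(left, right, on=keys, how="outer"),
    [raw, corrected, enrichment],
)
print(list(merged.columns))

# The two intensity columns are identical, so one copy can be dropped.
tidy = merged.drop(columns="Intensity_y").rename(columns={"Intensity_x": "Intensity"})
print(list(tidy.columns))
```

Dropping the duplicate before saving is optional and only keeps the output file narrower.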


In [6]:
output_df.to_csv('C13_LCMS_large_example_out.csv')