This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for LC-MS input, assuming high-resolution data. The example uses a dataset with two labels, C13 and N15:

 - C13N15_lcms_high_res.csv - demo raw MS intensity file containing intensities for C17H27N3O17P2, taken from the data file in the repository published by Carreer et al. (2013)

In [7]:
import pandas as pd
import numpy as np
import re

from corna.inputs import maven_parser as parser
import corna.constants as const
from corna.helpers import get_isotope_na, replace_negatives_in_column, merge_multiple_dfs
from corna.algorithms import matrix_calc as algo
from corna.algorithms.nacorr_lcms import na_correction
from corna.postprocess import fractional_enrichment


Reading the raw file and merging it with the sample metadata, if present

In [8]:
raw_df = pd.read_csv('C13N15_lcms_high_res.csv')
sample_metadata = pd.read_csv('meta_sample_lcms_high_res.csv')

# if sample metadata is not available, use an empty DataFrame instead
#sample_metadata = pd.DataFrame()

merged_df, iso_tracer_data, element_list = parser.read_maven_file(raw_df, sample_metadata)
merged_df.head()

Unnamed: 0,Name,Label,Formula,Sample,Intensity,Metadata1,Metadata2,Unlabeled Fragment
0,Cpd1,C12 PARENT,C17H27N3O17P2,Sample 1,41592.2,meta1,meta2,Cpd1
1,Cpd1,C13-label-1,C17H27N3O17P2,Sample 1,6143.7,meta1,meta2,Cpd1
2,Cpd1,C13-label-2,C17H27N3O17P2,Sample 1,2716.9,meta1,meta2,Cpd1
3,Cpd1,C13-label-3,C17H27N3O17P2,Sample 1,123.8,meta1,meta2,Cpd1
4,Cpd1,C13-label-4,C17H27N3O17P2,Sample 1,45.9,meta1,meta2,Cpd1
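
The Label column encodes how many atoms of each tracer isotope a species carries. The sketch below decodes it, assuming the format is exactly what the rows above show: 'C12 PARENT' for the unlabeled species and '<isotopes>-label-<counts>' otherwise. `parse_label` is a hypothetical helper written for illustration; it is not a corna function and may not match how the package parses labels internally.

```python
import re


def parse_label(label):
    """Return {isotope: count} for a label string (hypothetical helper)."""
    # The unlabeled species carries no tracer atoms.
    if label.endswith("PARENT"):
        return {}
    isotopes, counts = label.split("-label-")
    # "C13N15" -> ["C13", "N15"]: element symbol followed by a mass number.
    names = re.findall(r"[A-Z][a-z]?\d+", isotopes)
    numbers = [int(n) for n in counts.split("-")]
    if len(names) != len(numbers):
        raise ValueError("isotopes and counts do not match in %r" % label)
    return dict(zip(names, numbers))


parsed = [parse_label(s) for s in ("C12 PARENT", "C13-label-4", "C13N15-label-1-2")]
```

For the three labels above, `parsed` holds an empty dictionary, `{'C13': 4}` and `{'C13': 1, 'N15': 2}`.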


The natural abundance dictionary holds the natural abundance fractions of the isotopes of common elements. It can be defined by the user, or the default values from the package can be used. The format of the dictionary is:

{E: [M0, M1, ..., Mn]}, where E is the element symbol and the natural abundance fractions are listed in order of increasing isotope mass, with 0 for a mass that has no naturally occurring isotope. For example:

In [3]:
#user defined
# na_dict = {'O': [0.99757, 0.00038, 0.00205], 'H':[0.99985, 0.00015], 'N': [0.99632, 0.00368], 
#             'C': [0.9892, 0.0108], 'Si':[0.922297, 0.046832, 0.030872], 'S':[0.9493, 0.0076, 0.0429, 0, 0.0002]}
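
Because each list holds the fractions of all isotopes of one element, its entries should be non-negative and sum to approximately 1. The sketch below runs that sanity check on a user-defined dictionary. `check_na_dict` and its tolerance are assumptions made for illustration, not part of corna.

```python
import math

na_dict = {
    'O': [0.99757, 0.00038, 0.00205],
    'H': [0.99985, 0.00015],
    'N': [0.99632, 0.00368],
    'C': [0.9892, 0.0108],
    'Si': [0.922297, 0.046832, 0.030872],
    'S': [0.9493, 0.0076, 0.0429, 0, 0.0002],
}


def check_na_dict(na_dict, tol=1e-3):
    """Return {element: total} for entries that are negative or do not sum to ~1."""
    problems = {}
    for element, fractions in na_dict.items():
        total = math.fsum(fractions)
        if any(f < 0 for f in fractions) or abs(total - 1.0) > tol:
            problems[element] = total
    return problems


problems = check_na_dict(na_dict)  # an empty dict means every entry passed
```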

Performing NA correction with the default natural abundance dictionary from the package. Inputs that are not relevant to this workflow (ppm_input_user and eleme_corr) are passed as empty dictionaries

In [9]:
na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13', 'N15'], ppm_input_user={}, eleme_corr={})

# for a user-defined NA dictionary
#na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13', 'N15'], ppm_input_user={}, eleme_corr={}, na_dict=na_dict)

na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)
na_corr_df

Unnamed: 0,Name,Formula,Indistinguishable_isotope,Sample,NA Corrected,Intensity,Label,NA Corrected with zero
0,Cpd1,C17H27N3O17P2,{},Sample 1,5.083003e+04,41592.2,C12 PARENT,5.083003e+04
1,Cpd1,C17H27N3O17P2,{},Sample 1,8.009174e+04,66223.5,N15-label-1,8.009174e+04
2,Cpd1,C17H27N3O17P2,{},Sample 1,2.237158e+04,18915.2,N15-label-2,2.237158e+04
3,Cpd1,C17H27N3O17P2,{},Sample 1,4.909828e+03,4128.7,N15-label-3,4.909828e+03
4,Cpd1,C17H27N3O17P2,{},Sample 1,-2.166722e+03,6143.7,C13-label-1,0.000000e+00
5,Cpd1,C17H27N3O17P2,{},Sample 1,5.345574e+01,12661.6,C13N15-label-1-1,5.345574e+01
6,Cpd1,C17H27N3O17P2,{},Sample 1,2.648302e+03,5816.8,C13N15-label-1-2,2.648302e+03
7,Cpd1,C17H27N3O17P2,{},Sample 1,3.385685e+02,1079.0,C13N15-label-1-3,3.385685e+02
8,Cpd1,C17H27N3O17P2,{},Sample 1,2.780108e+03,2716.9,C13-label-2,2.780108e+03
9,Cpd1,C17H27N3O17P2,{},Sample 1,4.458649e+04,38606.2,C13N15-label-2-1,4.458649e+04
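
The negative 'NA Corrected' value above, and its zeroed counterpart in 'NA Corrected with zero', can be illustrated with a deliberately simplified model. The sketch below is not corna's algorithm. It assumes a single tracer element, a binomial natural abundance model and made-up intensities, and all function names are hypothetical. It shows how solving the linear system measured = matrix × true can produce a small negative value when a measured peak is noisy, and how clipping at zero removes it.

```python
from math import comb

import numpy as np


def correction_matrix(n_atoms, p_heavy):
    """Column j is the expected observed distribution for j tracer atoms."""
    size = n_atoms + 1
    matrix = np.zeros((size, size))
    for j in range(size):
        free = n_atoms - j  # atoms still subject to natural abundance
        for k in range(free + 1):  # k of them are naturally heavy
            matrix[j + k, j] = comb(free, k) * p_heavy**k * (1 - p_heavy)**(free - k)
    return matrix


def na_correct(measured, n_atoms, p_heavy):
    """Solve for the corrected intensities and a copy clipped at zero."""
    matrix = correction_matrix(n_atoms, p_heavy)
    corrected = np.linalg.solve(matrix, np.asarray(measured, dtype=float))
    return corrected, np.clip(corrected, 0.0, None)


# Made-up pool: 3 carbons, only M+0 and M+2 species are truly present.
true_pool = np.array([1000.0, 0.0, 200.0, 0.0])
measured = correction_matrix(3, 0.0108) @ true_pool
measured[1] -= 20.0  # simulate noise: the M+1 peak is under-reported

corrected, corrected_with_zero = na_correct(measured, 3, 0.0108)
```

Here `corrected[1]` comes out negative because the measured M+1 peak is smaller than the natural abundance contribution expected from M+0, while `corrected_with_zero[1]` is 0.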


Calculating fractional enrichment, merging all data into a single DataFrame and saving it as 'C13N15_lcms_high_res_corrected.csv'

In [10]:
frac_enr_df = fractional_enrichment(na_corr_df)
frac_enr_df

Unnamed: 0,Sample,Name,Label,Formula,Pool_total,Fractional enrichment
0,Sample 1,Cpd1,C12 PARENT,C17H27N3O17P2,279327.351862,1.819730e-01
1,Sample 1,Cpd1,N15-label-1,C17H27N3O17P2,279327.351862,2.867307e-01
2,Sample 1,Cpd1,N15-label-2,C17H27N3O17P2,279327.351862,8.009091e-02
3,Sample 1,Cpd1,N15-label-3,C17H27N3O17P2,279327.351862,1.757733e-02
4,Sample 1,Cpd1,C13-label-1,C17H27N3O17P2,279327.351862,0.000000e+00
5,Sample 1,Cpd1,C13N15-label-1-1,C17H27N3O17P2,279327.351862,1.913731e-04
6,Sample 1,Cpd1,C13N15-label-1-2,C17H27N3O17P2,279327.351862,9.480996e-03
7,Sample 1,Cpd1,C13N15-label-1-3,C17H27N3O17P2,279327.351862,1.212085e-03
8,Sample 1,Cpd1,C13-label-2,C17H27N3O17P2,279327.351862,9.952866e-03
9,Sample 1,Cpd1,C13N15-label-2-1,C17H27N3O17P2,279327.351862,1.596209e-01
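
The values above are consistent with fractional enrichment being the zero-replaced corrected intensity divided by the pool total for that sample and compound (for the first row, 50830.03 / 279327.35 ≈ 0.181973). The sketch below shows that calculation on made-up numbers. It assumes the pool total is the sum over all labels of a compound within a sample; `fractional_enrichment_sketch` is a hypothetical stand-in, not the corna function.

```python
import pandas as pd


def fractional_enrichment_sketch(df, value_col="NA Corrected with zero"):
    """Add Pool_total and Fractional enrichment columns (illustrative only)."""
    out = df.copy()
    # Pool total: sum of corrected intensities per sample and compound.
    out["Pool_total"] = out.groupby(["Sample", "Name"])[value_col].transform("sum")
    # A pool total of zero yields NaN here rather than raising an error.
    out["Fractional enrichment"] = out[value_col] / out["Pool_total"]
    return out


toy = pd.DataFrame({
    "Sample": ["Sample 1"] * 3,
    "Name": ["Cpd1"] * 3,
    "Label": ["C12 PARENT", "C13-label-1", "C13-label-2"],
    "NA Corrected with zero": [600.0, 0.0, 400.0],  # made-up values
})
result = fractional_enrichment_sketch(toy)
```

By construction, the fractions of one compound in one sample sum to 1.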


In [11]:
output_df = merge_multiple_dfs([merged_df, na_corr_df, frac_enr_df])
output_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity_x,Metadata1,Metadata2,Unlabeled Fragment,Indistinguishable_isotope,NA Corrected,Intensity_y,NA Corrected with zero,Pool_total,Fractional enrichment
0,Cpd1,C12 PARENT,C17H27N3O17P2,Sample 1,41592.2,meta1,meta2,Cpd1,{},50830.02567,41592.2,50830.02567,279327.351862,0.181973
1,Cpd1,C13-label-1,C17H27N3O17P2,Sample 1,6143.7,meta1,meta2,Cpd1,{},-2166.722339,6143.7,0.0,279327.351862,0.0
2,Cpd1,C13-label-2,C17H27N3O17P2,Sample 1,2716.9,meta1,meta2,Cpd1,{},2780.10757,2716.9,2780.10757,279327.351862,0.009953
3,Cpd1,C13-label-3,C17H27N3O17P2,Sample 1,123.8,meta1,meta2,Cpd1,{},-331.809745,123.8,0.0,279327.351862,0.0
4,Cpd1,C13-label-4,C17H27N3O17P2,Sample 1,45.9,meta1,meta2,Cpd1,{},69.064956,45.9,69.064956,279327.351862,0.000247
5,Cpd1,C13-label-5,C17H27N3O17P2,Sample 1,0.0,meta1,meta2,Cpd1,{},-7.968906,0.0,0.0,279327.351862,0.0
6,Cpd1,C13-label-6,C17H27N3O17P2,Sample 1,0.0,meta1,meta2,Cpd1,{},0.505708,0.0,0.505708,279327.351862,2e-06
7,Cpd1,C13N15-label-1-1,C17H27N3O17P2,Sample 1,12661.6,meta1,meta2,Cpd1,{},53.455745,12661.6,53.455745,279327.351862,0.000191
8,Cpd1,C13N15-label-1-2,C17H27N3O17P2,Sample 1,5816.8,meta1,meta2,Cpd1,{},2648.301551,5816.8,2648.301551,279327.351862,0.009481
9,Cpd1,C13N15-label-1-3,C17H27N3O17P2,Sample 1,1079.0,meta1,meta2,Cpd1,{},338.568459,1079.0,338.568459,279327.351862,0.001212
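
The Intensity_x and Intensity_y columns above are the suffixes pandas adds when two merged frames share a column that is not a merge key. The sketch below reproduces this with a chained `pd.merge`. How `merge_multiple_dfs` is actually implemented is an assumption here; `merge_on_keys`, the key columns and the data are illustrative only.

```python
from functools import reduce

import pandas as pd


def merge_on_keys(dfs, keys):
    """Merge a list of DataFrames pairwise on the given key columns."""
    return reduce(lambda left, right: pd.merge(left, right, on=keys, how="left"), dfs)


keys = ["Name", "Label", "Sample"]
raw = pd.DataFrame({
    "Name": ["Cpd1", "Cpd1"],
    "Label": ["C12 PARENT", "C13-label-1"],
    "Sample": ["Sample 1", "Sample 1"],
    "Intensity": [100.0, 10.0],
})
corrected = pd.DataFrame({
    "Name": ["Cpd1", "Cpd1"],
    "Label": ["C13-label-1", "C12 PARENT"],  # different row order on purpose
    "Sample": ["Sample 1", "Sample 1"],
    "Intensity": [10.0, 100.0],
    "NA Corrected": [-2.0, 120.0],
})
enrichment = pd.DataFrame({
    "Name": ["Cpd1", "Cpd1"],
    "Label": ["C12 PARENT", "C13-label-1"],
    "Sample": ["Sample 1", "Sample 1"],
    "Fractional enrichment": [1.0, 0.0],
})

merged = merge_on_keys([raw, corrected, enrichment], keys)
```

With a left merge, rows follow the order of the first frame, and the shared Intensity column appears twice, as Intensity_x and Intensity_y.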


In [12]:
# save the merged output (raw, NA corrected and fractional enrichment columns)
output_df.to_csv('C13N15_lcms_high_res_corrected.csv')