This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for low-resolution LC-MS data. The example uses a dataset with a C13 label:

 - C13_succ.csv - demo raw MS intensity file containing intensities for C4H4O3, taken from the single measure data of the IsoCor repository published by Millard, Pierre et al. in 2012

In [1]:
import pandas as pd
import numpy as np
import re

from corna.inputs import maven_parser as parser
import corna.constants as const
from corna.helpers import replace_negatives_in_column, merge_multiple_dfs
from corna.algorithms.nacorr_lcms import na_correction
from corna.postprocess import fractional_enrichment


Read the raw file and merge it with the sample metadata, if present. This example runs without sample metadata, so an empty DataFrame is passed instead.

In [2]:
raw_df = pd.read_csv('C13_succ.csv')
sample_metadata = pd.DataFrame()

#For the example with sample metadata
#sample_metadata = pd.read_csv('C13_succ_metadata.csv')

merged_df, iso_tracer_data, element_list = parser.read_maven_file(raw_df, sample_metadata)
merged_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity,Unlabeled Fragment
0,Succ,C12 PARENT,C4H4O3,Sample1,0.572503,Succ
1,Succ,C13-label-1,C4H4O3,Sample1,0.219132,Succ
2,Succ,C13-label-2,C4H4O3,Sample1,0.122481,Succ
3,Succ,C13-label-3,C4H4O3,Sample1,0.054081,Succ
4,Succ,C13-label-4,C4H4O3,Sample1,0.0318,Succ


The natural abundance values of the common isotopes are supplied as a dictionary. It can be defined by the user, or the default values from the package can be used. The dictionary has the following format:

{E: [M0, M1, ..., Mn]}, where E is the element symbol and the list holds the natural abundance fractions of its isotopes in order of increasing mass. For example:

In [3]:
na_dict = {'O': [0.99757, 0.00038, 0.00205], 'H':[0.99985, 0.00015], 'N': [0.99632, 0.00368], 
           'C': [0.9892, 0.0108], 'Si':[0.922297, 0.046832, 0.030872], 'S':[0.9493, 0.0076, 0.0429, 0, 0.0002]}
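
Because each list holds the isotope fractions of a single element, its entries should sum to roughly 1. A minimal sanity check along those lines, assuming a small tolerance for rounding (the helper name `check_na_dict` is hypothetical and not part of corna):

```python
import math

na_dict = {'O': [0.99757, 0.00038, 0.00205], 'H': [0.99985, 0.00015], 'N': [0.99632, 0.00368],
           'C': [0.9892, 0.0108], 'Si': [0.922297, 0.046832, 0.030872],
           'S': [0.9493, 0.0076, 0.0429, 0, 0.0002]}


def check_na_dict(na_dict, tol=1e-4):
    """Return the elements whose abundance fractions do not sum to 1 within tol."""
    return [element for element, fractions in na_dict.items()
            if not math.isclose(sum(fractions), 1.0, abs_tol=tol)]


# Hypothetical helper: lists any element with an inconsistent entry
bad_elements = check_na_dict(na_dict)
print(bad_elements)
```

A check like this catches a mistyped fraction before it silently distorts the correction.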

Perform the NA correction using the dictionary defined above for the NA values. Inputs that are not relevant to this workflow are left empty. Isotopes that cannot be distinguished from the tracer because of the low mass resolution can be defined in the format {'Tracer': [List of Indistinguishable Isotopes]}.

In [4]:
# To use the package's default NA dictionary instead:
#na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13'], res_type='low res')

na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13'], res_type='low res', na_dict=na_dict)

# Add a 'NA Corrected with zero' column in which negative corrected values are replaced by zero
na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)
na_corr_df

Unnamed: 0,Name,Formula,Sample,NA Corrected,Intensity,Label,NA Corrected with zero
0,Succ,C4H4O3,Sample1,0.602659,0.572503,C12 PARENT,0.602659
1,Succ,C4H4O3,Sample1,0.201109,0.219132,C13-label-1,0.201109
2,Succ,C4H4O3,Sample1,0.115197,0.122481,C13-label-2,0.115197
3,Succ,C4H4O3,Sample1,0.050957,0.054081,C13-label-3,0.050957
4,Succ,C4H4O3,Sample1,0.03065,0.0318,C13-label-4,0.03065
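
To illustrate what a low-resolution correction does, the sketch below uses one common formulation: a correction matrix whose column k is the mass distribution expected from natural abundance when k carbons are labeled, solved against the measured intensities. This is an assumption about the general technique, not a description of corna's internals; the function names are hypothetical, and the tracer is assumed to be isotopically pure.

```python
import numpy as np

# Natural abundance fractions for the elements in C4H4O3 (same values as na_dict above)
na = {'C': [0.9892, 0.0108], 'H': [0.99985, 0.00015], 'O': [0.99757, 0.00038, 0.00205]}


def element_distribution(abundances, n_atoms):
    """Mass distribution of n_atoms of one element, by repeated convolution."""
    dist = np.array([1.0])
    for _ in range(n_atoms):
        dist = np.convolve(dist, abundances)
    return dist


def correction_matrix(n_tracer, other_atoms, na, tracer='C'):
    """Column k: distribution observed when k tracer atoms are labeled (hypothetical helper)."""
    size = n_tracer + 1
    # Contribution of the non-tracer elements, which low resolution cannot separate
    background = np.array([1.0])
    for element, count in other_atoms.items():
        background = np.convolve(background, element_distribution(na[element], count))
    matrix = np.zeros((size, size))
    for k in range(size):
        # The remaining (n_tracer - k) unlabeled tracer atoms follow natural abundance
        column = np.convolve(element_distribution(na[tracer], n_tracer - k), background)
        n = min(len(column), size - k)
        matrix[k:k + n, k] = column[:n]  # shifted down by the k labeled atoms
    return matrix


measured = np.array([0.572503, 0.219132, 0.122481, 0.054081, 0.0318])
A = correction_matrix(4, {'H': 4, 'O': 3}, na)
corrected = np.linalg.solve(A, measured)
print(corrected)
```

Truncating the matrix to M0..M4 makes it lower triangular, so the first corrected value is simply the measured M0 divided by the probability that an unlabeled molecule carries no heavy isotope.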


Calculate the fractional enrichments, merge all data into a single table, and save it as 'C13_succ_corr.csv'.

In [5]:
frac_enr_df = fractional_enrichment(na_corr_df)
frac_enr_df

Unnamed: 0,Sample,Name,Label,Formula,Pool_total,Fractional enrichment
0,Sample1,Succ,C12 PARENT,C4H4O3,1.000573,0.602314
1,Sample1,Succ,C13-label-1,C4H4O3,1.000573,0.200994
2,Sample1,Succ,C13-label-2,C4H4O3,1.000573,0.115131
3,Sample1,Succ,C13-label-3,C4H4O3,1.000573,0.050928
4,Sample1,Succ,C13-label-4,C4H4O3,1.000573,0.030632
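
The values in the table are consistent with each corrected intensity being divided by the pool total of its metabolite and sample. A minimal pandas sketch of that arithmetic, assuming the pool is summed per (Sample, Name) over the 'NA Corrected with zero' column (an assumption about the grouping, not a copy of corna's code):

```python
import pandas as pd

na_corr = pd.DataFrame({
    'Sample': ['Sample1'] * 5,
    'Name': ['Succ'] * 5,
    'Label': ['C12 PARENT', 'C13-label-1', 'C13-label-2', 'C13-label-3', 'C13-label-4'],
    'NA Corrected with zero': [0.602659, 0.201109, 0.115197, 0.050957, 0.03065],
})

# Pool total: sum of corrected intensities over all isotopologues of a metabolite in a sample
pool_total = na_corr.groupby(['Sample', 'Name'])['NA Corrected with zero'].transform('sum')
na_corr['Pool_total'] = pool_total

# Fractional enrichment: share of the pool carried by each isotopologue
na_corr['Fractional enrichment'] = na_corr['NA Corrected with zero'] / pool_total
print(na_corr)
```

By construction, the fractional enrichments of one metabolite in one sample sum to 1.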


In [6]:
output_df = merge_multiple_dfs([merged_df, na_corr_df, frac_enr_df])
output_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity_x,Unlabeled Fragment,NA Corrected,Intensity_y,NA Corrected with zero,Pool_total,Fractional enrichment
0,Succ,C12 PARENT,C4H4O3,Sample1,0.572503,Succ,0.602659,0.572503,0.602659,1.000573,0.602314
1,Succ,C13-label-1,C4H4O3,Sample1,0.219132,Succ,0.201109,0.219132,0.201109,1.000573,0.200994
2,Succ,C13-label-2,C4H4O3,Sample1,0.122481,Succ,0.115197,0.122481,0.115197,1.000573,0.115131
3,Succ,C13-label-3,C4H4O3,Sample1,0.054081,Succ,0.050957,0.054081,0.050957,1.000573,0.050928
4,Succ,C13-label-4,C4H4O3,Sample1,0.0318,Succ,0.03065,0.0318,0.03065,1.000573,0.030632
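
The merged table carries Intensity_x and Intensity_y with identical values. In plain pandas, such suffixes appear when both frames contain a column that is not used as a merge key. The sketch below reproduces that effect on a one-row example and collapses the duplicate; the key columns chosen here are an assumption for illustration, not necessarily those used by `merge_multiple_dfs`.

```python
import pandas as pd

keys = ['Name', 'Label', 'Formula', 'Sample']  # assumed merge keys, for illustration only

left = pd.DataFrame({'Name': ['Succ'], 'Label': ['C12 PARENT'], 'Formula': ['C4H4O3'],
                     'Sample': ['Sample1'], 'Intensity': [0.572503]})
right = pd.DataFrame({'Name': ['Succ'], 'Label': ['C12 PARENT'], 'Formula': ['C4H4O3'],
                      'Sample': ['Sample1'], 'Intensity': [0.572503],
                      'NA Corrected': [0.602659]})

# 'Intensity' is in both frames but not a key, so pandas adds the _x/_y suffixes
merged = left.merge(right, on=keys)

# Keep a single Intensity column only if the two copies really are identical
if merged['Intensity_x'].equals(merged['Intensity_y']):
    merged = merged.drop(columns='Intensity_y').rename(columns={'Intensity_x': 'Intensity'})
print(merged.columns.tolist())
```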


In [7]:
output_df.to_csv('C13_succ_corr.csv')