This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for LC-MS input, assuming infinite resolution. The example uses a dataset with two labels, C13 and N15:

 - C13N15_lcms_high_res.csv - demo raw MS intensity file containing intensities for C17H27N3O17P2, taken from the data file of the repository published by Carreer William et al. in 2013

In [1]:
import pandas as pd
import numpy as np
import re

from corna.inputs import maven_parser as parser
import corna.constants as const
from corna.helpers import get_isotope_na, replace_negatives_in_column
from corna.algorithms import matrix_calc as algo
from corna.algorithms.nacorr_lcms import na_correction
from corna.postprocess import fractional_enrichment


Read the raw file and merge it with the sample metadata, if present.

In [2]:
raw_df = pd.read_csv('C13N15_lcms_high_res.csv')
#sample_metadata = pd.read_csv('meta_sample_lcms_high_res.csv')

#if sample metadata not present, set it to empty dataframe
sample_metadata = pd.DataFrame()

merged_df, iso_tracer_data, element_list = parser.read_maven_file(raw_df, sample_metadata)
merged_df.head()

Unnamed: 0,Name,Label,Formula,Sample,Intensity,Unlabeled Fragment
0,Cpd1,C12 PARENT,C17H27N3O17P2,Sample 1,41592.2,Cpd1
1,Cpd1,C13-label-1,C17H27N3O17P2,Sample 1,6143.7,Cpd1
2,Cpd1,C13-label-2,C17H27N3O17P2,Sample 1,2716.9,Cpd1
3,Cpd1,C13-label-3,C17H27N3O17P2,Sample 1,123.8,Cpd1
4,Cpd1,C13-label-4,C17H27N3O17P2,Sample 1,45.9,Cpd1
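
The Label column above encodes how many heavy atoms of each tracer a species carries ('C12 PARENT' for the unlabeled species, 'C13-label-2' for two C13 atoms, and combined forms such as 'C13N15-label-1-2'). As a minimal sketch of how such labels could be split into per-isotope counts, assuming exactly the label patterns seen in this notebook: `parse_label` is a hypothetical helper written for illustration and is not part of corna.

```python
import re


def parse_label(label):
    """Return {isotope: count} for a label string (hypothetical helper, not corna's API)."""
    # The unlabeled species carries no heavy tracer atoms.
    if label.endswith('PARENT'):
        return {}
    # 'C13N15-label-1-2' -> tracers 'C13N15', counts '1-2'
    tracers, counts = label.split('-label-')
    # An isotope is an element symbol followed by its mass number, e.g. 'C13'.
    isotopes = re.findall(r'[A-Z][a-z]?\d+', tracers)
    numbers = [int(n) for n in counts.split('-')]
    if len(isotopes) != len(numbers):
        raise ValueError('cannot parse label: %r' % label)
    return dict(zip(isotopes, numbers))


print(parse_label('C12 PARENT'))   # {}
print(parse_label('C13-label-2'))  # {'C13': 2}
print(parse_label('C13N15-label-1-3') == {'C13': 1, 'N15': 3})  # True
```

Labels that follow a different naming scheme would need their own pattern; the sketch raises a ValueError when isotopes and counts do not line up.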


na_dict is a dictionary of natural abundance values for the common isotopes found in nature. It can be defined by the user, or the default values from the package can be used; this example uses user-defined values. The format is {E: [M0, M1, ..., Mn]}, where E is the element symbol and the natural abundance fractions are listed in order of increasing isotope mass. For example: na_dict = {'O': [0.99757, 0.00038, 0.00205], 'H':[0.99985, 0.00015], 'N': [0.99632, 0.00368], 'C': [0.9892, 0.0108], 'Si':[0.922297, 0.046832, 0.030872], 'S':[0.9493, 0.0076, 0.0429, 0, 0.0002]}

In [3]:
#user defined
na_dict = {'O': [0.99757, 0.00038, 0.00205], 'H':[0.99985, 0.00015], 'N': [0.99632, 0.00368], 
            'C': [0.9892, 0.0108], 'Si':[0.922297, 0.046832, 0.030872], 'S':[0.9493, 0.0076, 0.0429, 0, 0.0002]}
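
Because the values are fractions, the entries for each element should sum to approximately 1. A quick sanity check for a user-defined dictionary might look like the sketch below; `check_na_dict` and its tolerance are assumptions made for illustration and are not provided by corna.

```python
def check_na_dict(na_dict, tol=1e-4):
    """Return a sorted list of elements whose fractions are negative or do not sum to 1.

    Hypothetical helper for illustration; the tolerance is an arbitrary choice.
    """
    bad = []
    for element, fractions in na_dict.items():
        if any(f < 0 for f in fractions) or abs(sum(fractions) - 1.0) > tol:
            bad.append(element)
    return sorted(bad)


na_dict = {'O': [0.99757, 0.00038, 0.00205], 'H': [0.99985, 0.00015],
           'N': [0.99632, 0.00368], 'C': [0.9892, 0.0108],
           'Si': [0.922297, 0.046832, 0.030872],
           'S': [0.9493, 0.0076, 0.0429, 0, 0.0002]}

print(check_na_dict(na_dict))  # []
```

An empty list means every element passed the check; the Si entry sums to 1.000001, which is why a small tolerance is used instead of an exact comparison.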

Performing NA correction. Inputs that are not relevant for this workflow are set as empty:
 - merge_mv_metdata - merged dataframe from cell 2
 - list of isotope tracers - ['C13', 'N15'] (in this example)
 - ppm_input - not relevant here; any value can be entered ({} in the call below)
 - na_dict - from cell 3
 - indistinguishable isotopes - not relevant here, {}

In [4]:
na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13', 'N15'], ppm_input_user={}, eleme_corr={}, na_dict=na_dict)
na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)
na_corr_df.head()

{'C': [0.9892, 0.0108], 'H': [0.99985, 0.00015], 'Si': [0.922297, 0.046832, 0.030872], 'O': [0.99757, 0.00038, 0.00205], 'N': [0.99632, 0.00368], 'S': [0.9493, 0.0076, 0.0429, 0, 0.0002]}


AttributeError: 'module' object has no attribute 'index'
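
To illustrate the idea behind natural abundance correction, here is a minimal sketch for a single tracer element, assuming a binomial model in which every atom not labeled by the tracer is heavy with the natural probability `p_heavy`. It is a simplified illustration and not the implementation in corna, which handles multiple tracers and elements; `correction_matrix` and `correct` are hypothetical names.

```python
from math import factorial

import numpy as np


def binom(n, k):
    """Binomial coefficient n choose k."""
    return factorial(n) // (factorial(k) * factorial(n - k))


def correction_matrix(n_atoms, p_heavy):
    """M[i, j]: probability that a species with j tracer atoms is observed with i heavy atoms."""
    size = n_atoms + 1
    m = np.zeros((size, size))
    for j in range(size):
        unlabeled = n_atoms - j  # atoms that can still be heavy by natural abundance
        for i in range(j, size):
            k = i - j  # extra heavy atoms contributed by natural abundance
            m[i, j] = binom(unlabeled, k) * p_heavy ** k * (1 - p_heavy) ** (unlabeled - k)
    return m


def correct(observed, n_atoms, p_heavy):
    """Solve M x = observed for the corrected intensities x."""
    m = correction_matrix(n_atoms, p_heavy)
    return np.linalg.solve(m, np.asarray(observed, dtype=float))


# Round trip on made-up intensities for a 2-carbon species (illustrative values only).
true = np.array([100.0, 0.0, 50.0])
observed = correction_matrix(2, 0.0108).dot(true)
print(correct(observed, 2, 0.0108))
```

The matrix is lower triangular with a nonzero diagonal, so the system always has a unique solution; in the round trip above the corrected vector recovers the made-up intensities up to floating point error. With noisy measurements the solution can contain small negative values, which is why the workflow replaces negatives with zero afterwards.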

Post-processing involves replacing negative values with zero, calculating fractional enrichment, merging all data into one file, and saving it as 'C13N15_lcms_high_res_corrected.csv'.

In [13]:
output_df = fractional_enrichment(na_corr_df)
print(output_df)

      Sample  Name              Label        Formula     Pool_total  \
0   Sample 1  Cpd1         C12 PARENT  C17H27N3O17P2  278958.590202   
1   Sample 1  Cpd1        N15-label-1  C17H27N3O17P2  278958.590202   
2   Sample 1  Cpd1        N15-label-2  C17H27N3O17P2  278958.590202   
3   Sample 1  Cpd1        N15-label-3  C17H27N3O17P2  278958.590202   
4   Sample 1  Cpd1        C13-label-1  C17H27N3O17P2  278958.590202   
5   Sample 1  Cpd1   C13N15-label-1-1  C17H27N3O17P2  278958.590202   
6   Sample 1  Cpd1   C13N15-label-1-2  C17H27N3O17P2  278958.590202   
7   Sample 1  Cpd1   C13N15-label-1-3  C17H27N3O17P2  278958.590202   
8   Sample 1  Cpd1        C13-label-2  C17H27N3O17P2  278958.590202   
9   Sample 1  Cpd1   C13N15-label-2-1  C17H27N3O17P2  278958.590202   
10  Sample 1  Cpd1   C13N15-label-2-2  C17H27N3O17P2  278958.590202   
11  Sample 1  Cpd1   C13N15-label-2-3  C17H27N3O17P2  278958.590202   
12  Sample 1  Cpd1        C13-label-3  C17H27N3O17P2  278958.590202   
13  Sa
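
The Pool_total column in the output above is constant within a sample and compound, which suggests that fractional enrichment is each corrected intensity divided by the pool total for that group. A minimal pandas sketch of that calculation follows, assuming this interpretation: the function name, the 'corrected' and 'Fractional enrichment' column names, and the toy values are hypothetical and are not taken from corna.

```python
import pandas as pd


def fractional_enrichment_sketch(df, value_col='corrected'):
    """Add Pool_total and a fraction column (illustrative, not corna's implementation)."""
    out = df.copy()
    # Replace negative corrected intensities with zero before summing.
    out[value_col] = out[value_col].clip(lower=0)
    # Pool total: sum of corrected intensities per sample and compound.
    out['Pool_total'] = out.groupby(['Sample', 'Name'])[value_col].transform('sum')
    # A pool total of zero yields NaN here.
    out['Fractional enrichment'] = out[value_col] / out['Pool_total']
    return out


# Toy values chosen for illustration only.
toy = pd.DataFrame({
    'Sample': ['Sample 1'] * 3,
    'Name': ['Cpd1'] * 3,
    'Label': ['C12 PARENT', 'C13-label-1', 'N15-label-1'],
    'corrected': [60.0, -5.0, 40.0],
})
result = fractional_enrichment_sketch(toy)
print(result[['Label', 'Pool_total', 'Fractional enrichment']])
```

In the toy data the negative value is clipped to zero, the pool total is 100, and the fractions are 0.6, 0.0 and 0.4, so they sum to 1 within each sample and compound.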

In [16]:
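# join fractional enrichment with the NA corrected intensities on the shared key columns
df = pd.merge(output_df, na_corr_df, on=['Label', 'Sample', 'Name', 'Formula'])
# add the raw intensities; without `on`, pd.merge joins on all columns common to both frames
merged_results_df = pd.merge(df, merged_df)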

In [17]:
merged_results_df.to_csv('C13N15_lcms_high_res_corrected.csv')