This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for LCMS data whose resolution varies with molecular mass. The example uses a dataset with C13 and N15 labels:

 - auto_detect_indistinguish_dual_label.csv - demo raw MS intensity file for C10H17N3O6S (glutathione), simulated combinatorially with C13 treated as indistinguishable from O17 and N15 treated as indistinguishable from S34

In [1]:
import pandas as pd
import numpy as np
import re

from corna.inputs import maven_parser as parser
import corna.constants as const
from corna.helpers import replace_negatives_in_column, merge_multiple_dfs
from corna.algorithms.nacorr_lcms import na_correction
from corna.postprocess import fractional_enrichment


Read the raw file and merge it with the sample metadata, if present. This example runs without sample metadata, so an empty DataFrame is passed instead.

In [2]:
raw_df = pd.read_csv('auto_detect_indistinguish_dual_label.csv')
sample_metadata = pd.DataFrame()

merged_df, iso_tracer_data, element_list = parser.read_maven_file(raw_df, sample_metadata)
merged_df.head()

Unnamed: 0,Name,Label,Formula,Sample,Intensity,Unlabeled Fragment
0,Glutathione,C12 PARENT,C10H17N3O6S,Sample 1,0.0,Glutathione
1,Glutathione,C13-label-1,C10H17N3O6S,Sample 1,0.0,Glutathione
2,Glutathione,C13-label-2,C10H17N3O6S,Sample 1,0.0,Glutathione
3,Glutathione,C13-label-3,C10H17N3O6S,Sample 1,0.178,Glutathione
4,Glutathione,C13-label-4,C10H17N3O6S,Sample 1,0.013,Glutathione


The natural abundance dictionary holds the abundance fractions of the common isotopes of each element. It can be defined by the user, or the default values from the package can be used. The format of the dictionary is:

{E: [M0, M1, ..., Mn]}, where E is the element symbol and the natural abundance fractions are listed in order of increasing isotope mass. For example:

In [3]:
# User-defined natural abundance fractions, in order of increasing isotope mass
na_dict = {
    'C': [0.9889, 0.0111],
    'H': [0.99985, 0.00015],
    'N': [0.9964, 0.0036],
    'O': [0.9976, 0.0004, 0.002],
    'S': [0.950, 0.0076, 0.0424],
}
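
Before passing a user-defined dictionary on, it can be worth checking that each element's fractions add up to 1. The snippet below is only a sanity-check sketch in plain Python; the helper name `check_na_dict` is hypothetical and is not part of corna.

```python
import math

# Same values as the user-defined dictionary above.
na_dict = {
    'C': [0.9889, 0.0111],
    'H': [0.99985, 0.00015],
    'N': [0.9964, 0.0036],
    'O': [0.9976, 0.0004, 0.002],
    'S': [0.950, 0.0076, 0.0424],
}

def check_na_dict(na_values, tol=1e-6):
    """Return the elements whose abundance fractions do not sum to 1 (hypothetical helper)."""
    return [element for element, fractions in na_values.items()
            if not math.isclose(sum(fractions), 1.0, abs_tol=tol)]

bad_elements = check_na_dict(na_dict)
print(bad_elements)  # an empty list means every element sums to 1 within tol
```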

Perform the NA correction using the dictionary above for the NA values. For an Orbitrap, at a molecular mass of 307 (our input compound) and a resolution of 293808, the ppm value is ~7 according to the formula from Su, Xiaoyang et al., 2017:

\begin{equation*}
\frac{\Delta m}{m} = 1.66 \times \frac{m^{1/2}}{\text{MinimalNominalResolution} \times \sqrt{200}} \times 10^{6}
\end{equation*}

This value is passed to na_correction as the ppm_input_user argument.
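
The arithmetic behind the ~7 ppm figure can be reproduced with a few lines of plain Python. This is a sketch, not corna code: the function name `orbitrap_ppm` is hypothetical, the isotope masses are standard rounded values that do not come from this notebook, and the final comparison (mass shift difference against the ppm window, with S34 compared to two N15 at the M+2 level) is an assumed illustration of why these isotopes are treated as indistinguishable, not necessarily the criterion corna applies.

```python
import math

def orbitrap_ppm(mass, minimal_nominal_resolution):
    """delta m / m in ppm, following the formula quoted above (hypothetical helper)."""
    return 1.66 * math.sqrt(mass) / (minimal_nominal_resolution * math.sqrt(200)) * 1e6

mass = 307           # molecular mass used in this notebook
resolution = 293808  # resolution used in this notebook
ppm = orbitrap_ppm(mass, resolution)
tolerance_da = mass * ppm * 1e-6  # the same window expressed in Da

# Mass shifts in Da relative to the lightest isotope (assumed standard values).
shift_c13 = 13.0033548 - 12.0
shift_o17 = 16.9991317 - 15.9949146
shift_n15 = 15.0001089 - 14.0030740
shift_s34 = 33.9678669 - 31.9720707

# Assumed criterion: two species overlap if their shifts differ by less than the window.
c13_o17_overlap = abs(shift_o17 - shift_c13) < tolerance_da
n15_s34_overlap = abs(shift_s34 - 2 * shift_n15) < tolerance_da

print(round(ppm, 2), round(tolerance_da, 5))
print(c13_o17_overlap, n15_s34_overlap)
```

Under these assumptions the window is roughly 0.002 Da wide at mass 307, which is wider than either shift difference.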

In [4]:
na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13', 'N15'], ppm_input_user=7, 
                                          eleme_corr={}, na_dict=na_dict, autodetect=True)
na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)
na_corr_df


Unnamed: 0,Name,Formula,Indistinguishable_isotope,Sample,NA Corrected,Intensity,Label,NA Corrected with zero
0,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,9.867204000000001e-17,0.0,C12 PARENT,9.867204000000001e-17
1,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,3.311852e-06,0.0,N15-label-1,3.311852e-06
2,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,-2.384534e-08,0.0,N15-label-2,0.0
3,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,0.8128195,0.6808,N15-label-3,0.8128195
4,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,6.476387000000001e-17,0.0,C13-label-1,6.476387000000001e-17
5,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,-1.305946e-07,0.0,C13N15-label-1-1,0.0
6,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,9.402814e-10,0.0,C13N15-label-1-2,9.402814e-10
7,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,0.0005284966,0.0785,C13N15-label-1-3,0.0005284966
8,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,2.1934590000000002e-17,0.0,C13-label-2,2.1934590000000002e-17
9,Glutathione,C10H17N3O6S,"{'C': ['O17'], 'N': ['S34']}",Sample 1,-6.844887e-07,0.0,C13N15-label-2-1,0.0
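
The effect of the negative replacement step can be seen on a few values copied from the table above. This is a plain-Python sketch that assumes the helper simply sets negative corrected values to zero and leaves the rest unchanged, which is what the output suggests; it is not the corna implementation.

```python
# NA Corrected values copied from the output above (label -> value).
na_corrected = {
    'N15-label-2': -2.384534e-08,
    'N15-label-3': 0.8128195,
    'C13N15-label-1-1': -1.305946e-07,
    'C13N15-label-1-3': 0.0005284966,
}

# Assumed behavior: negative values become 0.0, everything else is kept as is.
na_corrected_with_zero = {label: max(value, 0.0)
                          for label, value in na_corrected.items()}

for label, value in na_corrected_with_zero.items():
    print(label, value)
```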


Calculate the fractional enrichments, merge all data into a single table and save it as 'auto_detect_dual_label_isotope_ppm7.csv'.

In [5]:
frac_enr_df = fractional_enrichment(na_corr_df)
frac_enr_df

Unnamed: 0,Sample,Name,Label,Formula,Pool_total,Fractional enrichment
0,Sample 1,Glutathione,C12 PARENT,C10H17N3O6S,1.023658,9.639160000000001e-17
1,Sample 1,Glutathione,N15-label-1,C10H17N3O6S,1.023658,3.235311e-06
2,Sample 1,Glutathione,N15-label-2,C10H17N3O6S,1.023658,0.0
3,Sample 1,Glutathione,N15-label-3,C10H17N3O6S,1.023658,0.7940342
4,Sample 1,Glutathione,C13-label-1,C10H17N3O6S,1.023658,6.326709000000001e-17
5,Sample 1,Glutathione,C13N15-label-1-1,C10H17N3O6S,1.023658,0.0
6,Sample 1,Glutathione,C13N15-label-1-2,C10H17N3O6S,1.023658,9.185503e-10
7,Sample 1,Glutathione,C13N15-label-1-3,C10H17N3O6S,1.023658,0.0005162824
8,Sample 1,Glutathione,C13-label-2,C10H17N3O6S,1.023658,2.1427650000000002e-17
9,Sample 1,Glutathione,C13N15-label-2-1,C10H17N3O6S,1.023658,0.0
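
The values in this table are consistent with dividing each 'NA Corrected with zero' value by Pool_total. The sketch below checks that relationship on three rows, taking Pool_total directly from the output (the displayed rows are truncated, so the total is not recomputed here). It is an illustration under that assumption, not the fractional_enrichment implementation.

```python
pool_total = 1.023658  # Pool_total for Sample 1, copied from the output above

# 'NA Corrected with zero' values copied from the NA correction output.
na_corrected_with_zero = {
    'N15-label-3': 0.8128195,
    'C13N15-label-1-3': 0.0005284966,
    'N15-label-2': 0.0,
}

# Assumed relationship: fractional enrichment = corrected intensity / pool total.
fractional_enrichment_sketch = {label: value / pool_total
                                for label, value in na_corrected_with_zero.items()}

for label, fraction in fractional_enrichment_sketch.items():
    print(label, fraction)
```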


In [6]:
output_df = merge_multiple_dfs([merged_df, na_corr_df, frac_enr_df])
output_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity_x,Unlabeled Fragment,Indistinguishable_isotope,NA Corrected,Intensity_y,NA Corrected with zero,Pool_total,Fractional enrichment
0,Glutathione,C12 PARENT,C10H17N3O6S,Sample 1,0.0,Glutathione,"{'C': ['O17'], 'N': ['S34']}",9.867204000000001e-17,0.0,9.867204000000001e-17,1.023658,9.639160000000001e-17
1,Glutathione,C13-label-1,C10H17N3O6S,Sample 1,0.0,Glutathione,"{'C': ['O17'], 'N': ['S34']}",6.476387000000001e-17,0.0,6.476387000000001e-17,1.023658,6.326709000000001e-17
2,Glutathione,C13-label-2,C10H17N3O6S,Sample 1,0.0,Glutathione,"{'C': ['O17'], 'N': ['S34']}",2.1934590000000002e-17,0.0,2.1934590000000002e-17,1.023658,2.1427650000000002e-17
3,Glutathione,C13-label-3,C10H17N3O6S,Sample 1,0.178,Glutathione,"{'C': ['O17'], 'N': ['S34']}",0.2077735,0.178,0.2077735,1.023658,0.2029716
4,Glutathione,C13-label-4,C10H17N3O6S,Sample 1,0.013,Glutathione,"{'C': ['O17'], 'N': ['S34']}",-0.001632273,0.013,0.0,1.023658,0.0
5,Glutathione,C13-label-5,C10H17N3O6S,Sample 1,0.0002,Glutathione,"{'C': ['O17'], 'N': ['S34']}",-0.0002356019,0.0002,0.0,1.023658,0.0
6,Glutathione,C13-label-6,C10H17N3O6S,Sample 1,0.0001,Glutathione,"{'C': ['O17'], 'N': ['S34']}",0.0001185349,0.0001,0.0001185349,1.023658,0.0001157954
7,Glutathione,C13N15-label-1-3,C10H17N3O6S,Sample 1,0.0785,Glutathione,"{'C': ['O17'], 'N': ['S34']}",0.0005284966,0.0785,0.0005284966,1.023658,0.0005162824
8,Glutathione,C13N15-label-1-1,C10H17N3O6S,Sample 1,0.0,Glutathione,"{'C': ['O17'], 'N': ['S34']}",-1.305946e-07,0.0,0.0,1.023658,0.0
9,Glutathione,C13N15-label-1-2,C10H17N3O6S,Sample 1,0.0,Glutathione,"{'C': ['O17'], 'N': ['S34']}",9.402814e-10,0.0,9.402814e-10,1.023658,9.185503e-10


In [7]:
output_df.to_csv('auto_detect_dual_label_isotope_ppm7.csv')