This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for LC-MS data, taking the resolution of the mass spectrometer into account. The example uses a dataset with an N15 label:

 - N_Sample_Input_Simple_Glutathione_Disulfide.xlsx - data from Su et al., 2017 with intensities for C20H32N6O12S2 

In [1]:
import pandas as pd
import numpy as np
import re

from corna.inputs import maven_parser as parser
import corna.constants as const
from corna.helpers import replace_negatives_in_column, merge_multiple_dfs
from corna.algorithms.nacorr_lcms import na_correction
from corna.postprocess import fractional_enrichment


Read the raw file and merge it with the sample metadata, if present. This example runs without sample metadata.

In [2]:
raw_df = pd.read_excel('N_Sample_Input_Simple_Glutathione_Disulfide.xlsx')
sample_metadata = pd.DataFrame()

merged_df, iso_tracer_data, element_list = parser.read_maven_file(raw_df, sample_metadata)
merged_df.head()

Unnamed: 0,Name,Label,Formula,Sample,Intensity,Unlabeled Fragment
0,glutathione disulfide,C12 PARENT,C20H32N6O12S2,N15_50_140k_A,16085.53,glutathione disulfide
1,glutathione disulfide,N15-label-1,C20H32N6O12S2,N15_50_140k_A,76741.95,glutathione disulfide
2,glutathione disulfide,N15-label-2,C20H32N6O12S2,N15_50_140k_A,283352.8,glutathione disulfide
3,glutathione disulfide,N15-label-3,C20H32N6O12S2,N15_50_140k_A,571908.3,glutathione disulfide
4,glutathione disulfide,N15-label-4,C20H32N6O12S2,N15_50_140k_A,602110.1,glutathione disulfide


The relative mass difference that the instrument can resolve, in ppm, is estimated with the formula from Su, Xiaoyang et al., 2017:

\begin{equation*}
\frac{\Delta m}{m} = 1.66 \times \frac{m^{1/2}}{\mathrm{MinimalNominalResolution} \times \sqrt{200}} \times 10^{6}
\end{equation*}

Different vendors use different formulas; the package currently supports Orbitrap and FT-ICR.
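As a rough illustration, the Orbitrap formula above can be evaluated directly. This is a minimal sketch, not the package's own code: the function name `orbitrap_min_ppm` and its parameter names are hypothetical, and the mass used for C20H32N6O12S2 is an approximate monoisotopic value.

```python
import math


def orbitrap_min_ppm(mass, resolution, resolution_mw=200.0):
    """Evaluate 1.66 * sqrt(m) / (R * sqrt(resolution_mw)) * 1e6 (result in ppm).

    Hypothetical helper written for illustration only.
    """
    return 1.66 * math.sqrt(mass) / (resolution * math.sqrt(resolution_mw)) * 1e6


# Approximate monoisotopic mass of C20H32N6O12S2 (assumed value, ~612.15).
ppm = orbitrap_min_ppm(mass=612.15, resolution=140000)
print(round(ppm, 2))
```

At the reference mass itself (m = 200) the square roots cancel and the expression reduces to 1.66 / R × 10^6, which is a quick sanity check on any implementation.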

The natural abundance dictionary holds the natural abundance values of the common isotopes of each element. It can be defined by the user, or the default values from the package can be used. The format of the dictionary is:

{E:[M0, M1, ..Mn]}, where E is the element symbol and the natural abundance fractions are listed in increasing order of isotope mass. For example:

In [3]:
# user defined
na_dict = {'C': [0.9893, 0.0107],
           'H': [0.999885, 0.000115],
           'N': [0.99636, 0.00364],
           'O': [0.99757, 0.00038, 0.00205],
           'S': [0.9493, 0.00762, 0.0429]}
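A user-defined dictionary is easy to mistype, so a quick sanity check can help before running the correction. The sketch below is not part of the package; `check_na_dict` is a hypothetical helper that flags elements whose fractions do not sum to roughly 1. The tolerance is deliberately loose because the S entry above sums to slightly less than 1.

```python
import math

na_dict = {'C': [0.9893, 0.0107],
           'H': [0.999885, 0.000115],
           'N': [0.99636, 0.00364],
           'O': [0.99757, 0.00038, 0.00205],
           'S': [0.9493, 0.00762, 0.0429]}


def check_na_dict(na_values, tol=1e-3):
    """Return the elements whose abundance fractions do not sum to ~1."""
    return [element for element, fractions in na_values.items()
            if not math.isclose(sum(fractions), 1.0, abs_tol=tol)]


bad_elements = check_na_dict(na_dict)
print(bad_elements)
```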

In [4]:
na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['N15'], res_type='autodetect',
                                          na_dict=na_dict, autodetect=True, res=140000, 
                                          res_mw=200, instrument='orbitrap')
na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)
na_corr_df

Unnamed: 0,Name,Formula,Sample,NA Corrected,Intensity,Label,NA Corrected with zero
0,glutathione disulfide,C20H32N6O12S2,N15_50_140k_A,23380.202176,16085.53,C12 PARENT,23380.202176
1,glutathione disulfide,C20H32N6O12S2,N15_50_140k_A,105021.961854,76741.95,N15-label-1,105021.961854
2,glutathione disulfide,C20H32N6O12S2,N15_50_140k_A,378328.532304,283352.8,N15-label-2,378328.532304
3,glutathione disulfide,C20H32N6O12S2,N15_50_140k_A,709945.385008,571908.3,N15-label-3,709945.385008
4,glutathione disulfide,C20H32N6O12S2,N15_50_140k_A,626254.016002,602110.1,N15-label-4,626254.016002
5,glutathione disulfide,C20H32N6O12S2,N15_50_140k_A,203869.872085,331400.5,N15-label-5,203869.872085
6,glutathione disulfide,C20H32N6O12S2,N15_50_140k_A,-19653.089879,100732.9,N15-label-6,0.0
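The last row shows why the extra column exists: the correction can yield a small negative intensity, which is not physically meaningful. The sketch below reproduces the effect in plain pandas, assuming the helper simply replaces negative values with zero in a new column; the package's actual implementation is not shown here.

```python
import pandas as pd

# Two rows copied from the output above.
corrected = pd.DataFrame({
    'Label': ['N15-label-5', 'N15-label-6'],
    'NA Corrected': [203869.872085, -19653.089879],
})

# Keep the original column and store the clipped values next to it.
corrected['NA Corrected with zero'] = corrected['NA Corrected'].clip(lower=0)
print(corrected)
```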


Calculate fractional enrichments, merge all data into a single table and save it as 'N15_autodetect_glutathione_disulfide_orbitrap_140000_corna_out.csv'.

In [5]:
frac_enr_df = fractional_enrichment(na_corr_df)
frac_enr_df

Unnamed: 0,Sample,Name,Label,Formula,Pool_total,Fractional enrichment
0,N15_50_140k_A,glutathione disulfide,C12 PARENT,C20H32N6O12S2,2046800.0,0.011423
1,N15_50_140k_A,glutathione disulfide,N15-label-1,C20H32N6O12S2,2046800.0,0.05131
2,N15_50_140k_A,glutathione disulfide,N15-label-2,C20H32N6O12S2,2046800.0,0.184839
3,N15_50_140k_A,glutathione disulfide,N15-label-3,C20H32N6O12S2,2046800.0,0.346856
4,N15_50_140k_A,glutathione disulfide,N15-label-4,C20H32N6O12S2,2046800.0,0.305967
5,N15_50_140k_A,glutathione disulfide,N15-label-5,C20H32N6O12S2,2046800.0,0.099604
6,N15_50_140k_A,glutathione disulfide,N15-label-6,C20H32N6O12S2,2046800.0,0.0
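The numbers in this table can be reproduced by hand. The sketch below assumes that the fractional enrichment is the zero-replaced corrected intensity divided by the pool total of the same sample and metabolite, which matches the values shown. It is written in plain pandas and is not the package's own code.

```python
import pandas as pd

df = pd.DataFrame({
    'Sample': ['N15_50_140k_A'] * 7,
    'Name': ['glutathione disulfide'] * 7,
    'Label': ['C12 PARENT'] + ['N15-label-%d' % i for i in range(1, 7)],
    'NA Corrected with zero': [23380.202176, 105021.961854, 378328.532304,
                               709945.385008, 626254.016002, 203869.872085,
                               0.0],
})

# Pool total: sum over all labels of one metabolite in one sample.
pool_total = (df.groupby(['Sample', 'Name'])['NA Corrected with zero']
                .transform('sum'))
df['Pool_total'] = pool_total
df['Fractional enrichment'] = df['NA Corrected with zero'] / pool_total
print(df[['Label', 'Pool_total', 'Fractional enrichment']])
```

By construction, the fractions of one metabolite in one sample sum to 1.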


In [6]:
output_df = merge_multiple_dfs([merged_df, na_corr_df, frac_enr_df])
output_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity_x,Unlabeled Fragment,NA Corrected,Intensity_y,NA Corrected with zero,Pool_total,Fractional enrichment
0,glutathione disulfide,C12 PARENT,C20H32N6O12S2,N15_50_140k_A,16085.53,glutathione disulfide,23380.202176,16085.53,23380.202176,2046800.0,0.011423
1,glutathione disulfide,N15-label-1,C20H32N6O12S2,N15_50_140k_A,76741.95,glutathione disulfide,105021.961854,76741.95,105021.961854,2046800.0,0.05131
2,glutathione disulfide,N15-label-2,C20H32N6O12S2,N15_50_140k_A,283352.8,glutathione disulfide,378328.532304,283352.8,378328.532304,2046800.0,0.184839
3,glutathione disulfide,N15-label-3,C20H32N6O12S2,N15_50_140k_A,571908.3,glutathione disulfide,709945.385008,571908.3,709945.385008,2046800.0,0.346856
4,glutathione disulfide,N15-label-4,C20H32N6O12S2,N15_50_140k_A,602110.1,glutathione disulfide,626254.016002,602110.1,626254.016002,2046800.0,0.305967
5,glutathione disulfide,N15-label-5,C20H32N6O12S2,N15_50_140k_A,331400.5,glutathione disulfide,203869.872085,331400.5,203869.872085,2046800.0,0.099604
6,glutathione disulfide,N15-label-6,C20H32N6O12S2,N15_50_140k_A,100732.9,glutathione disulfide,-19653.089879,100732.9,0.0,2046800.0,0.0
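The `Intensity_x` and `Intensity_y` columns are the default pandas suffixes for a column that appears in both frames but is not used as a merge key. The sketch below shows the effect on a one-row example and one way to collapse the duplicate. It assumes a pandas-style merge and says nothing about how `merge_multiple_dfs` selects its keys internally.

```python
import pandas as pd

left = pd.DataFrame({'Label': ['C12 PARENT'], 'Intensity': [16085.53]})
right = pd.DataFrame({'Label': ['C12 PARENT'], 'Intensity': [16085.53],
                      'NA Corrected': [23380.202176]})

# 'Intensity' is in both frames but is not a key, so pandas adds suffixes.
merged = left.merge(right, on='Label')
suffixed_columns = list(merged.columns)

# If both copies are identical, keep one and restore the original name.
if (merged['Intensity_x'] == merged['Intensity_y']).all():
    merged = (merged.drop(columns='Intensity_y')
                    .rename(columns={'Intensity_x': 'Intensity'}))
print(merged)
```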


In [7]:
output_df.to_csv('N15_autodetect_glutathione_disulfide_orbitrap_140000_corna_out.csv')