This notebook calculates natural abundance (NA) corrected intensities and fractional enrichment for GCMS data, including derivatized compounds. This example uses a C13 label and the following input file:

 - GCMS_raw.csv - demo raw MS intensity file containing intensities for glucose derivatized as pentaacetate (C16H22O11), taken from Cline, Gary W. and Gerald I. Shulman, 1995, together with a simulated example of glucose derivatized with TMS
 
 The compound formula covers both the compound itself and the derivatizing agent.
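To illustrate how the derivatizing agent ends up in the formula: glucose is C6H12O6, and each of the five acetylations replaces a hydroxyl hydrogen with an acetyl group, a net gain of C2H2O per site. The minimal sketch below checks that this adds up to the C16H22O11 used in the raw file. The helper `parse_formula` is hypothetical and written only for this illustration; it is not part of corna.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a simple formula such as 'C16H22O11' into element counts.

    Hypothetical helper for illustration only; handles plain
    element/count pairs, not brackets or charges.
    """
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


glucose = parse_formula("C6H12O6")
acetyl_gain = parse_formula("C2H2O")  # net change per acetylated hydroxyl

# Five hydroxyl groups are acetylated in glucose pentaacetate.
derivatized = glucose + Counter({el: 5 * n for el, n in acetyl_gain.items()})

print(dict(derivatized))
print(derivatized == parse_formula("C16H22O11"))
```

Only 6 of the 16 carbons come from glucose itself, which is why the formula passed to the correction must describe the whole derivatized molecule.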

In [1]:
import pandas as pd
import numpy as np
import re

from corna.inputs import maven_parser as parser
import corna.constants as const
from corna.helpers import replace_negatives_in_column, merge_multiple_dfs
from corna.algorithms.nacorr_lcms import na_correction
from corna.postprocess import fractional_enrichment

Reading the raw file and merging it with sample metadata, if present. This example runs without sample metadata, so an empty DataFrame is passed.

In [3]:
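# Raw MS intensities, one row per compound, label and sample
raw_df = pd.read_csv('GCMS_raw.csv')
# No sample metadata in this example, so pass an empty DataFrame
sample_metadata = pd.DataFrame()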

merged_df, iso_tracer_data, element_list = parser.read_maven_file(raw_df, sample_metadata)
merged_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity,Unlabeled Fragment
0,Glucosepentaacetate,C12 PARENT,C16H22O11,sample1,0.571376,Glucosepentaacetate
1,Glucosepentaacetate,C13-label-1,C16H22O11,sample1,0.103652,Glucosepentaacetate
2,Glucosepentaacetate,C13-label-2,C16H22O11,sample1,0.272024,Glucosepentaacetate
3,Glucosepentaacetate,C13-label-3,C16H22O11,sample1,0.042745,Glucosepentaacetate
4,Glucosepentaacetate,C13-label-4,C16H22O11,sample1,0.008984,Glucosepentaacetate
5,Glucosepentaacetate,C13-label-5,C16H22O11,sample1,0.001073,Glucosepentaacetate
6,Glucosepentaacetate,C13-label-6,C16H22O11,sample1,0.000133,Glucosepentaacetate
7,GlucoseTMS,C12 PARENT,C9O12H15Si,sample1,0.1627,GlucoseTMS
8,GlucoseTMS,C13-label-1,C9O12H15Si,sample1,0.0242,GlucoseTMS
9,GlucoseTMS,C13-label-2,C9O12H15Si,sample1,0.0104,GlucoseTMS


Performing NA correction. Inputs not relevant to this workflow are left empty. Isotopes that are indistinguishable because of low-resolution mass spectrometry can be defined in the format {'Tracer': [List of Indistinguishable Isotopes]}. Negative corrected values are then replaced with zero.

In [4]:
na_corr_df, ele_corr_dict = na_correction(merged_df, iso_tracers=['C13'], res_type='low res')

na_corr_df = replace_negatives_in_column(na_corr_df, const.NA_CORRECTED_WITH_ZERO, const.NA_CORRECTED_COL)
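For intuition about what this step does, here is a deliberately simplified sketch of a matrix-based NA correction followed by clipping of negative values. It is not corna's `na_correction`: it considers carbon only, assumes a 13C natural abundance of about 1.07%, and ignores the other elements in the formula, which a full correction also has to account for. The function names and the example distribution are hypothetical.

```python
from math import comb

import numpy as np

P_C13 = 0.0107  # assumed natural abundance of 13C


def carbon_correction_matrix(n_carbons, p=P_C13):
    """Column k holds the pattern measured for a species with k tracer carbons.

    The remaining (n_carbons - k) carbons each carry 13C with
    probability p, which shifts part of the signal to higher masses.
    """
    size = n_carbons + 1
    matrix = np.zeros((size, size))
    for k in range(size):
        unlabeled = n_carbons - k
        for extra in range(unlabeled + 1):
            matrix[k + extra, k] = (
                comb(unlabeled, extra) * p**extra * (1 - p) ** (unlabeled - extra)
            )
    return matrix


def correct_and_clip(observed, n_carbons):
    """Solve matrix @ corrected = observed, then set negatives to zero."""
    matrix = carbon_correction_matrix(n_carbons)
    corrected = np.linalg.solve(matrix, np.asarray(observed, dtype=float))
    return corrected, np.clip(corrected, 0.0, None)


# Hypothetical labeling pattern for a six-carbon fragment.
true_dist = np.array([0.7, 0.0, 0.3, 0.0, 0.0, 0.0, 0.0])
observed = carbon_correction_matrix(6) @ true_dist  # simulated measurement
corrected, clipped = correct_and_clip(observed, 6)

print(np.round(observed, 4))
print(np.round(clipped, 4))
```

With real, noisy measurements the solved values can come out slightly negative, which is why the notebook keeps a separate column with those values replaced by zero.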

Calculating fractional enrichment, merging all data into one table, and saving it as 'GCMS_corrected.csv'.

In [5]:
frac_enr_df = fractional_enrichment(na_corr_df)

In [6]:
output_df = merge_multiple_dfs([merged_df, na_corr_df, frac_enr_df])
output_df

Unnamed: 0,Name,Label,Formula,Sample,Intensity_x,Unlabeled Fragment,NA Corrected,Intensity_y,NA Corrected with zero,Pool_total,Fractional enrichment
0,Glucosepentaacetate,C12 PARENT,C16H22O11,sample1,0.571376,Glucosepentaacetate,0.7037145,0.571376,0.703715,1.005674,0.699745
1,Glucosepentaacetate,C13-label-1,C16H22O11,sample1,0.103652,Glucosepentaacetate,-0.004104075,0.103652,0.0,1.005674,0.0
2,Glucosepentaacetate,C13-label-2,C16H22O11,sample1,0.272024,Glucosepentaacetate,0.3017928,0.272024,0.301793,1.005674,0.30009
3,Glucosepentaacetate,C13-label-3,C16H22O11,sample1,0.042745,Glucosepentaacetate,-0.001568692,0.042745,0.0,1.005674,0.0
4,Glucosepentaacetate,C13-label-4,C16H22O11,sample1,0.008984,Glucosepentaacetate,0.0001628501,0.008984,0.000163,1.005674,0.000162
5,Glucosepentaacetate,C13-label-5,C16H22O11,sample1,0.001073,Glucosepentaacetate,2.778004e-06,0.001073,3e-06,1.005674,3e-06
6,Glucosepentaacetate,C13-label-6,C16H22O11,sample1,0.000133,Glucosepentaacetate,-2.356081e-07,0.000133,0.0,1.005674,0.0
7,GlucoseTMS,C12 PARENT,C9O12H15Si,sample1,0.1627,GlucoseTMS,0.2012074,0.1627,0.201207,1.006063,0.199995
8,GlucoseTMS,C13-label-1,C9O12H15Si,sample1,0.0242,GlucoseTMS,-0.001985262,0.0242,0.0,1.006063,0.0
9,GlucoseTMS,C13-label-2,C9O12H15Si,sample1,0.0104,GlucoseTMS,-0.0005946323,0.0104,0.0,1.006063,0.0
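The Pool_total and Fractional enrichment columns for Glucosepentaacetate are consistent with a simple normalization: each zero-replaced corrected intensity divided by the sum of those intensities over all labels of the same compound and sample. The sketch below reproduces those rows with plain pandas under that assumption. It is not corna's `fractional_enrichment` implementation, and the grouping keys are a guess based on the output shown.

```python
import pandas as pd

# Zero-replaced NA corrected intensities copied from the output table.
df = pd.DataFrame({
    "Name": ["Glucosepentaacetate"] * 7,
    "Sample": ["sample1"] * 7,
    "Label": ["C12 PARENT"] + [f"C13-label-{i}" for i in range(1, 7)],
    "NA Corrected with zero": [
        0.703715, 0.0, 0.301793, 0.0, 0.000163, 0.000003, 0.0,
    ],
})

# Assumed grouping: one pool per compound and sample.
pool_total = df.groupby(["Name", "Sample"])["NA Corrected with zero"].transform("sum")

df["Pool_total"] = pool_total
df["Fractional enrichment"] = df["NA Corrected with zero"] / pool_total

print(df[["Label", "Pool_total", "Fractional enrichment"]].round(6))
```

By construction the fractional enrichments of one compound in one sample sum to 1, which makes them comparable across samples with different total signal.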


In [8]:
output_df.to_csv('GCMS_corrected.csv')