R program to read CDF files containing multiple mass spectra of 13C-labeled metabolites and to write the extracted spectra in a format exchangeable with the MetaboLights database

seliv55/cdf2mid

cdf2MID

Version: 1.0

Short description

R-program to extract mass isotopomer distributions (MID) of 13C-labeled metabolites from raw experimental time course recordings of mass spectra.

Contents

  1. Description
  2. Functions
  3. Ways of accessing the program
  4. Executing the program

1. Description

Cdf2MID:

Cdf2mid is a computer program for the primary processing of 13C mass isotopomer data obtained with GC-MS; it starts a workflow of comprehensive data analysis. It reads files generated by mass spectrometers and saved in netCDF format, which contain the recorded time course of m/z chromatograms. It evaluates the MID at the moments when peaks are reached and saves the result in a form that facilitates its inclusion in the MetaboLights database and its further correction for natural isotope occurrence. Cdf2MID is written in R; its code is available at https://github.com/seliv55/cdf2mid. It uses the 'ncdf4' library to read netCDF files, and it analyzes and visualizes the spectra they contain. In addition to a collection of CDF files, Cdf2MID needs some further information, such as the retention times and m/z values of the metabolites of interest. This information can be provided in a simple text file.
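As an illustration of the reading step, the sketch below shows how per-scan spectra could be pulled from a netCDF file with 'ncdf4'. It is not the cdf2mid code: the function names `split_scans` and `read_cdf_scans` are hypothetical, and the variable names (`scan_acquisition_time`, `scan_index`, `point_count`, `mass_values`, `intensity_values`) assume that the files follow the usual ANDI-MS netCDF layout.

```r
# Regroup the flat mass/intensity arrays into one data frame per scan.
# scan_index is assumed to hold the zero-based offset of each scan's first point.
split_scans <- function(mass, intensity, scan_index, point_count) {
  lapply(seq_along(scan_index), function(i) {
    idx <- scan_index[i] + seq_len(point_count[i])
    data.frame(mz = mass[idx], intensity = intensity[idx])
  })
}

# Hypothetical reader (not part of cdf2mid); requires the 'ncdf4' package.
read_cdf_scans <- function(path) {
  nc <- ncdf4::nc_open(path)
  on.exit(ncdf4::nc_close(nc))
  list(
    time  = ncdf4::ncvar_get(nc, "scan_acquisition_time"),
    scans = split_scans(
      ncdf4::ncvar_get(nc, "mass_values"),
      ncdf4::ncvar_get(nc, "intensity_values"),
      ncdf4::ncvar_get(nc, "scan_index"),
      ncdf4::ncvar_get(nc, "point_count")
    )
  )
}

# Synthetic check without a file: two scans with 2 and 3 points.
demo <- split_scans(
  mass        = c(459, 460, 459, 460, 461),
  intensity   = c(10, 2, 50, 12, 3),
  scan_index  = c(0, 2),
  point_count = c(2, 3)
)
```

Keeping the regrouping separate from the file access makes it possible to try the logic on small synthetic vectors before pointing it at real recordings.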

Key features

  • primary processing of 13C mass isotopomer data obtained with GCMS

Functionality

  • Preprocessing of raw data
  • initiation of workflows of the data analysis

Approaches

  • Isotopic Labeling Analysis / 13C

Instrument Data Types

  • MS

2. Functions

Cdf2mid reads the CDF files present in the working directory, and then

  • separates the time courses for selected m/z peaks corresponding to specific mass isotopomers;
  • corrects the baseline for each selected m/z;
  • chooses the time points corresponding to the peak intensities, where the measured values are least contaminated by other compounds and are thus most representative of the real distribution of mass isotopomers;
  • evaluates this distribution and saves it in files readable by MIDcor, a program that performs the next step of the analysis: correction of the Cdf2mid spectra for natural isotope occurrence, which is necessary to carry out a fluxomic analysis.
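The steps listed above can be sketched on synthetic data. This is a minimal illustration, not the algorithm implemented in cdf2mid: the baseline is assumed to be a constant offset estimated by the minimum of each trace, and the peak is taken as the scan with the highest summed intensity. All names and numbers are invented for the example.

```r
# Synthetic chromatogram: rows are scans, columns are m/z traces (M+0..M+2).
scans <- 1:21
peak_shape <- exp(-((scans - 11)^2) / 8)   # Gaussian peak centred on scan 11
true_mid <- c(0.6, 0.3, 0.1)               # distribution used to build the data
baseline <- c(5, 3, 4)                     # constant offset of each trace
traces <- outer(peak_shape, 100 * true_mid) +
  matrix(baseline, nrow = length(scans), ncol = 3, byrow = TRUE)
colnames(traces) <- c("m0", "m1", "m2")

# 1. Baseline correction: subtract the minimum of each trace (an assumption).
corrected <- sweep(traces, 2, apply(traces, 2, min))

# 2. Peak position: the scan with the highest summed intensity.
peak_scan <- which.max(rowSums(corrected))

# 3. MID: intensities at the peak, normalised to sum to 1.
mid <- corrected[peak_scan, ] / sum(corrected[peak_scan, ])
print(round(mid, 3))
```

On real recordings the baseline is rarely a constant and peaks of neighbouring compounds may overlap, so both the baseline estimate and the choice of time points would need to be more careful than in this sketch.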

Tool Authors

  • Vitaly Selivanov (Universitat de Barcelona)

Container Contributors

Website

Git Repository

3. Ways of accessing the program

  • Way 1. Using a local copy of the code.
    Clone the repository:
 git clone https://github.com/seliv55/cdf2mid

Optionally, the R functions can be built and installed as a library "cdf2mid":

 cd <'path to the directory'>/cdf2mid
 sudo R
 library(devtools)
 build()
 install()
  • Way 2. Using the Docker image of Cdf2mid.
    The image can be pulled from the registry:
 docker pull container-registry.phenomenal-h2020.eu/phnmnl/cdf2mid

or built locally from a copy of the container repository:

 git clone https://github.com/phnmnl/container-cdf2mid
 cd <'path to the directory'>/container-cdf2mid
 docker build -t cdf2mid .

The Docker image is built from the same GitHub repository, "https://github.com/seliv55/cdf2mid".

4. Executing the program

  • Direct execution of the downloaded code.
    Start R and load the necessary libraries or, if the library 'cdf2mid' was not installed, read the code directly:
 R
 library(cdf2mid) # optionally, if this library was created (if not, use the option below)
 library(ncdf4)
 source("<'path to the directory'>/R/cdf2mid.R") # if the library 'cdf2mid' was not installed
 source("<'path to the directory'>/R/libcdf.R") # if the library 'cdf2mid' was not installed

Then run the main program:

 metan(infile, cdfdir, outfile)

Here the text after # is a comment. The main function, metan(infile, cdfdir, outfile), takes three parameters. The first, infile (default value "simetdat"), is the name of a file with additional information (e.g. retention time and m/z interval for the metabolites of interest). The second, cdfdir (default value "wd/"), is the path to a directory containing the netCDF files intended for the analysis. The third, outfile (default value "cdf2midout.csv"), is the name of the output file with the obtained results.
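Before calling metan() it can help to confirm that both inputs are in place. The helper below is a hypothetical sketch, not part of cdf2mid; it assumes that the netCDF files carry a ".cdf" extension (in any letter case), which the description above does not state.

```r
# Hypothetical helper (not part of cdf2mid): verify the inputs of metan().
# Assumption: the netCDF files end in ".cdf" or ".CDF".
check_inputs <- function(infile = "simetdat", cdfdir = "wd/") {
  if (!file.exists(infile)) stop("metabolite file not found: ", infile)
  cdf <- list.files(cdfdir, pattern = "\\.cdf$", ignore.case = TRUE)
  if (length(cdf) == 0) stop("no CDF files found in: ", cdfdir)
  cdf  # names of the files that would be analysed
}

# Intended use, with the default arguments described above:
# files <- check_inputs("simetdat", "wd/")
# metan(infile = "simetdat", cdfdir = "wd/", outfile = "cdf2midout.csv")
```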

  • The file "simetdat" is an example of the information that has to be provided in addition to the netCDF files. This information is necessary for further analysis. Currently its content is:

    Name          RT     mz0  Fragment  Formula        control
    Citrate       37.5   459  C1-C6     C20H39O6Si3    459
    Aspartate     28.5   418  C1-C4     C18H40O4N1Si3  418
    Malate        27.2   419  C1-C4     C18H39O5Si3    419
    Glucose       3.74   328  C1-C6     C14H18O8N1     328
    Glutamate2-4  3.79   152  C2-C4     C5H5O1N1F3     152
    Glutamate2-5  3.79   198  C2-C5     C6H7O3N1F3     198
    Lactate       5.33   328  C1-C3     C10H13O3N1F7   328
    Ribose        5.28   256  C1-C5     C11H14O6N1     256

    The first column gives the names of the metabolites of interest; the second, the corresponding retention times in minutes; the third, the m/z value of the lightest isotopomer of the desired fragment. The next two columns show which carbons of the initial molecule the fragment contains and the formula of the whole derivative. Gas chromatography often produces several fragments of the same derivatized metabolite. The last column gives the m/z value of the lightest isotopomer of another fragment of the same metabolite, which serves as a control that the given metabolite was indeed detected. However, since in this example only one fragment of each metabolite was registered, the last column simply repeats the third.

    Based on this information, cdf2mid extracts the raw MID from the netCDF files present in cdfdir and creates tables of data saved in outfile. Such tables are exchangeable with the MetaboLights database.
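To show how the columns of such a table relate to the m/z traces of interest, the sketch below parses a few of the example rows and lists, for each fragment, the m/z values from the lightest isotopomer (M+0) up to the fully labeled one (M+n, with n the number of carbons in the fragment). It is an illustration only: the whitespace separator is an assumption about the file layout, the helper `n_carbons` is hypothetical, and cdf2mid itself may read a wider m/z interval.

```r
# A few rows of the example table; whitespace-separated columns are assumed.
txt <- "Name RT mz0 Fragment Formula control
Citrate 37.5 459 C1-C6 C20H39O6Si3 459
Glutamate2-4 3.79 152 C2-C4 C5H5O1N1F3 152
Lactate 5.33 328 C1-C3 C10H13O3N1F7 328"
met <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Number of carbons kept in a fragment such as "C2-C4" (here 3).
n_carbons <- function(fragment) {
  pos <- as.integer(regmatches(fragment, gregexpr("[0-9]+", fragment))[[1]])
  pos[2] - pos[1] + 1L
}

# m/z values of the isotopomers M+0 .. M+n of each fragment.
met$nC <- vapply(met$Fragment, n_carbons, integer(1))
mz_list <- lapply(seq_len(nrow(met)), function(i) met$mz0[i] + 0:met$nC[i])
names(mz_list) <- met$Name
print(mz_list[["Citrate"]])
```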

    • Run this example using the command:
      metan(infile="simetdat",cdfdir="wd/",outfile="cdf2midout.csv")

    The file containing the results provided by cdf2mid (here "cdf2midout.csv") can then be passed to MIDcor (https://github.com/seliv55/midcor) for further correction.

    • To run Cdf2mid as a locally built Docker image, go to the folder containing the input data and run the image:
     docker run -it -v $PWD:/data cdf2mid -i /data/<infile> -z /data/<cdfdir> -o /data/<outfile>

    To run Cdf2mid as the Docker image provided in the PhenoMeNal registry, execute:

    docker run -it -v $PWD:/data container-registry.phenomenal-h2020.eu/phnmnl/cdf2mid -i /data/<infile> -z /data/<cdfdir> -o /data/<outfile>

    Cdf2mid can also be used without downloading the code or installing the Docker image, directly as part of the PhenoMeNal Cloud Research Environment (CRE). Go to the Fluxomics tool category, click on Cdf2mid, provide the expected input files, then press Run. Additionally, the tool can be used as part of a workflow with the Midcor, Iso2flux and Escher-Fluxomics tools. On a deployed PhenoMeNal CRE you should also find a Fluxomics Stationary workflow, which includes Cdf2mid. This way of using it is described here.
