Version: 1.0
R-program to extract mass isotopomer distributions (MID) of 13C-labeled metabolites from raw experimental time course recordings of mass spectra.
Cdf2mid is a computer program designed for primary processing of 13C mass isotopomer data obtained with GCMS, and for initiating a workflow of comprehensive data analysis. It reads files generated by mass spectrometers and saved in netCDF format, which contain the recorded time courses of m/z chromatograms. It evaluates the MID at the time points where the peaks are reached, and saves the result in a form that facilitates its inclusion in the MetaboLights database and its further correction for natural isotope occurrence. Cdf2mid is written in R; its code can be found at https://github.com/seliv55/cdf2mid. It uses the library 'ncdf4' to read netCDF files, and analyzes and visualizes the spectra that they contain. In addition to a collection of CDF files, Cdf2mid needs some supplementary information, such as the retention times and m/z values of the metabolites of interest. This information can be provided in a simple text file.
- primary processing of 13C mass isotopomer data obtained with GCMS
- Preprocessing of raw data
- initiation of workflows of the data analysis
- Isotopic Labeling Analysis / 13C
- MS
Cdf2mid reads the CDF files present in the working directory, and then
- separates the time courses for selected m/z peaks corresponding to specific mass isotopomers;
- corrects the baseline for each selected m/z;
- chooses the time points corresponding to the peak intensities, where the measured values are less contaminated by other compounds and thus are the most representative of the real distribution of mass isotopomers being analyzed;
- evaluates this distribution, and saves it in files readable by MIDcor, a program that performs the next step of the analysis, i.e. correction of the Cdf2mid spectra for natural isotope occurrence, which is necessary to carry out a fluxomic analysis.
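The steps listed above can be illustrated with a minimal sketch in base R on synthetic data. It is not the code used by Cdf2mid: the chromatogram, the baseline estimate (the minimum of each channel) and the peak choice (the scan with the highest summed signal) are simplifying assumptions made only for illustration.

```r
# Synthetic chromatogram: rows are scans (time points), columns are m/z channels.
# All numbers are invented for illustration; they are not real measurements.
time <- seq(37.0, 38.0, by = 0.05)          # retention time, min
mz   <- 459:462                             # M+0 .. M+3 of a hypothetical fragment
frac <- c(0.6, 0.25, 0.1, 0.05)             # assumed "true" isotopomer fractions
peak <- exp(-((time - 37.5) / 0.1)^2)       # Gaussian elution profile
base <- c(50, 40, 30, 20)                   # constant baseline in each channel
intens <- outer(peak, frac * 1000) +
  matrix(base, length(time), length(mz), byrow = TRUE)
colnames(intens) <- mz

# 1. Baseline correction: subtract the minimum of each channel (crude estimate).
corrected <- sweep(intens, 2, apply(intens, 2, min))

# 2. Peak position: the scan where the summed corrected signal is maximal.
ipeak <- which.max(rowSums(corrected))

# 3. Raw MID: channel intensities at the peak, normalised to sum to 1.
mid <- corrected[ipeak, ] / sum(corrected[ipeak, ])
print(round(mid, 3))
```

With real data the baseline is rarely constant and neighbouring compounds may overlap with the peak, so a production implementation needs a more careful choice of time points than a single maximum.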
- Vitaly Selivanov (Universitat de Barcelona)
- Way 1. Accessing Cdf2mid code directly, downloading it from the GitHub repository.
git clone https://github.com/seliv55/cdf2mid
Optionally, a library of R functions "cdf2mid" can be created:
cd <'path to the directory'>/cdf2mid
sudo R
library(devtools)
build()
install()
- Way 2. Using docker image of Cdf2mid.
The image can be pulled from repo:
docker pull container-registry.phenomenal-h2020.eu/phnmnl/cdf2mid
or installed locally using a local copy of this repo:
git clone https://github.com/phnmnl/container-cdf2mid
cd <'path to the directory'>/container-cdf2mid
docker build -t cdf2mid .
Here, the docker image is created from the same GitHub repository, "https://github.com/seliv55/cdf2mid".
- Direct execution of the downloaded code.
Enter the R environment and load the necessary libraries and/or, as an option, read the code directly:
R
library(cdf2mid) # optionally, if this library was created (if not, use the option below)
library(ncdf4)
source("<'path to the directory'>/R/cdf2mid.R") # if the library 'cdf2mid' was not installed
source("<'path to the directory'>/R/libcdf.R") # if the library 'cdf2mid' was not installed
Then run the main program:
metan(infile, cdfdir, outfile)
Here the text after # is a comment. The main function, metan(infile, cdfdir, outfile), takes three parameters. The first, infile (default value "simetdat"), is the name of a file with additional information (e.g. retention time and m/z interval for the metabolites of interest). The second, cdfdir (default value "wd/"), is the path to a directory containing the netCDF files designated for the analysis. The third, outfile (default value "cdf2midout.csv"), is the name of the output file with the obtained results.
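Before calling metan() it can be convenient to confirm that the inputs exist. The helper below is a hypothetical sketch and not part of cdf2mid; the name check_inputs and the assumption that the netCDF files carry a .cdf or .CDF extension are introduced here only for illustration.

```r
# Hypothetical helper (not part of cdf2mid): verify the inputs of metan().
check_inputs <- function(infile = "simetdat", cdfdir = "wd/") {
  if (!file.exists(infile)) {
    stop("metabolite description file not found: ", infile)
  }
  if (!dir.exists(cdfdir)) {
    stop("netCDF directory not found: ", cdfdir)
  }
  # Assumption: the netCDF files are named *.cdf or *.CDF.
  cdf <- list.files(cdfdir, pattern = "\\.cdf$", ignore.case = TRUE,
                    full.names = TRUE)
  if (length(cdf) == 0) {
    stop("no .cdf files found in ", cdfdir)
  }
  cdf  # paths of the files that would be analyzed
}

# Usage, assuming metan() has been loaded as described above:
# files <- check_inputs("simetdat", "wd/")
# metan(infile = "simetdat", cdfdir = "wd/", outfile = "cdf2midout.csv")
```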
- The file "simetdat" is an example of the information that has to be provided in addition to the netCDF files. This information is necessary for further analysis. Its current content is:
Name          RT     mz0  Fragment  Formula         control
Citrate       37.5   459  C1-C6     C20H39O6Si3     459
Aspartate     28.5   418  C1-C4     C18H40O4N1Si3   418
Malate        27.2   419  C1-C4     C18H39O5Si3     419
Glucose       3.74   328  C1-C6     C14H18O8N1      328
Glutamate2-4  3.79   152  C2-C4     C5H5O1N1F3      152
Glutamate2-5  3.79   198  C2-C5     C6H7O3N1F3      198
Lactate       5.33   328  C1-C3     C10H13O3N1F7    328
Ribose        5.28   256  C1-C5     C11H14O6N1      256
The first column gives the names of the metabolites of interest, the second the corresponding retention times in minutes, and the third the m/z value of the lightest isotopomer of the desired fragment. The carbons of this fragment that originate from the initial molecule, and the formula of the whole derivative, are shown in the next two columns. Gas chromatography often produces several fragments of the same derivatized metabolite. The last column shows the m/z value of the lightest isotopomer of another fragment of the same metabolite, which serves as a control that the given metabolite was indeed detected. However, since in the presented example only one fragment of each metabolite was registered, here the last column just repeats the third column.
Based on this information, cdf2mid extracts the raw MID from the netCDF files present in cdfdir, and creates tables of data saved in outfile. Such tables are accepted as exchangeable with the MetaboLights database.
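As an illustration of how such a description file can be interpreted, the sketch below parses the example content with read.table and derives, for each metabolite, the number of labelable carbons in the fragment and the m/z of its fully labeled isotopomer. This is an assumption about how the columns may be used, not a reproduction of the Cdf2mid code; the column names ncarb and mz_max are introduced here.

```r
# Example content of "simetdat", embedded as text so the sketch is self-contained.
simetdat <- "Name RT mz0 Fragment Formula control
Citrate 37.5 459 C1-C6 C20H39O6Si3 459
Aspartate 28.5 418 C1-C4 C18H40O4N1Si3 418
Malate 27.2 419 C1-C4 C18H39O5Si3 419
Glucose 3.74 328 C1-C6 C14H18O8N1 328
Glutamate2-4 3.79 152 C2-C4 C5H5O1N1F3 152
Glutamate2-5 3.79 198 C2-C5 C6H7O3N1F3 198
Lactate 5.33 328 C1-C3 C10H13O3N1F7 328
Ribose 5.28 256 C1-C5 C11H14O6N1 256"

tab <- read.table(text = simetdat, header = TRUE, stringsAsFactors = FALSE)

# Number of carbons of the original molecule kept in the fragment,
# e.g. "C2-C5" -> 4.
pat   <- "^C([0-9]+)-C([0-9]+)$"
first <- as.integer(sub(pat, "\\1", tab$Fragment))
last  <- as.integer(sub(pat, "\\2", tab$Fragment))
tab$ncarb <- last - first + 1L

# Assumption: isotopomers M+0 .. M+n occupy consecutive unit m/z values,
# so the fully 13C-labeled fragment appears at mz0 + ncarb.
tab$mz_max <- tab$mz0 + tab$ncarb
print(tab[, c("Name", "RT", "mz0", "ncarb", "mz_max")])
```

Natural isotopes of the other atoms in the derivative (for example Si) add signal above mz0 + ncarb, so the range actually scanned in practice may need to be wider than this lower estimate.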
- Run this example using the command:
metan(infile="simetdat",cdfdir="wd/",outfile="cdf2midout.csv")
The file containing the results provided by cdf2mid (here "cdf2midout.csv") can then be passed to MIDcor (https://github.com/seliv55/midcor) for further correction.
- To run Cdf2mid as a locally created docker image, go to a folder containing the input data, and run the image:
docker run -it -v $PWD:/data cdf2mid -i /data/<infile> -z /data/<cdfdir> -o /data/<outfile>
To run Cdf2mid as a docker image created in the PhenoMeNal repository, execute
docker run -it -v $PWD:/data container-registry.phenomenal-h2020.eu/phnmnl/cdf2mid -i /data/<infile> -z /data/<cdfdir> -o /data/<outfile>
Cdf2mid can also be used without any of the previous steps of downloading the code or installing the docker image, directly as part of the PhenoMeNal Cloud Research Environment. Go to the Fluxomics tool category, click on Cdf2mid, fill in the expected input files, and press Run. Additionally, the tool can be used as part of a workflow with Midcor, Iso2flux and the Escher-Fluxomics tools. On a deployed PhenoMeNal CRE you should also find a Fluxomics Stationary workflow, which includes Cdf2mid. This way of using it is described here.