A Dead Simple Toolkit for Quantitative Chromatography
This package takes raw chromatographic data (mass spec or colorimetric) and outputs relative or absolute quantities of identified compounds for compositional biochemistry analyses. It reproduces functions of proprietary instrument software, so researchers can liberate raw data and script reproducible analyses.
- ingredients you can pronounce
No fancy algorithms <cough>ordered bijective interpolated time warping</cough>, though nothing explicitly prevents their use with this package.
Caveat: since tidychrom implements neither RT adjustment nor spectrum deconvolution, it expects chromatographic data with good separation and fairly consistent retention times (RTs, within a few seconds) across samples. Analyze highly complex mixtures at your own risk, and maybe with a dash of special sauce.
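To see why consistent RTs matter, here is a minimal sketch of assigning peaks to compounds by a fixed RT tolerance. It is not tidychrom's own code: the tables, the column names (rt_expected, rt_diff) and the 3-second tolerance are all made up for illustration.

```r
library(dplyr)
library(tibble)

# Hypothetical ROI table: one expected RT (in seconds) per compound
rois <- tibble(
  id = c("C16:0", "C18:1", "C22:6"),
  rt_expected = c(612.0, 701.5, 934.2)
)

# Hypothetical peak table for a single sample
peaks <- tibble(
  samp = "JWL0012",
  rt = c(613.1, 700.9, 820.0, 936.0),
  area = c(1.2e6, 3.4e6, 5.0e4, 8.1e5)
)

rt_tol <- 3 # assumed tolerance, in seconds

# Index of the closest ROI for every peak
nearest <- vapply(
  peaks$rt,
  function(x) which.min(abs(rois$rt_expected - x)),
  integer(1)
)

# Keep only peaks that fall within the tolerance of their closest ROI
matched <- peaks %>%
  mutate(
    id = rois$id[nearest],
    rt_diff = rt - rois$rt_expected[nearest]
  ) %>%
  filter(abs(rt_diff) <= rt_tol)
```

Without RT adjustment, a peak that drifts beyond the tolerance simply goes unmatched (like the one at 820 s here), which is why runs with large RT shifts need extra handling.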
- data you can see and touch
Existing R-based chromatography solutions rely on S3/S4 objects with slots that are not really standardized. This package attempts to keep all analysis products in tibbles and to facilitate downstream analysis with dplyr. Visualization is implemented with ggplot2.
(master base peak chromatogram from analyze_standards.R)
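As a rough illustration of what keeping results in tibbles buys you, the sketch below summarizes a small table of peak areas with plain dplyr verbs. The column names samp and id follow the examples in this README; the area column and all values are invented.

```r
library(dplyr)
library(tibble)

# Hypothetical analysis product: one row per sample and compound
areas <- tibble(
  samp = c("JWL0012", "JWL0138", "JWL0012", "JWL0138"),
  id = c("C16:0", "C16:0", "C22:6", "C22:6"),
  area = c(2.0e6, 3.0e6, 1.0e6, 5.0e5)
)

# Per-compound summary across samples, no custom accessors needed
area_summary <- areas %>%
  group_by(id) %>%
  summarise(
    n_samp = n(),
    mean_area = mean(area),
    .groups = "drop"
  )
```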
This repo contains a couple of pre-cooked workflows (see workflows below), but above all, tidychrom is meant to be modular. Take the handful of functions provided and use them in your own dplyr-based workflows, perhaps with inspiration from those provided here. To aid you in dissecting this repo, some tips on its organization:
- There are no custom objects or methods, only functions.
- Functions are packaged one to a file.
- Visualization is implemented in ggplot2. Custom plotting functions return a list of gg objects, which can be
- stored in a tibble column and retrieved later,
(from analyze_samples.R)
ggplot() + areas_all_qc %>%
filter(
samp == "JWL0012" &
id == "C22:6"
) %>%
pull(b2b)
- arranged alongside other stored plots,
(as in analyze_standards.R)
# grid.arrange() comes from the gridExtra package
b2b <- lapply(scans_best$b2b, function(x){ggplot() + x})
do.call("grid.arrange", c(b2b, nrow = 5, ncol = 7))
- overlaid with other gg elements, like titles and other plots.
(from analyze_samples.R)
# to show all the ROIs integrated in a given sample
ggplot() + areas_all_qc %>%
filter(
samp == "JWL0138"
) %>%
pull(xic) +
ggtitle("all ROIs: sample JWL0138")
# or to show all the samples found in a given ROI
ggplot() + areas_all_qc %>%
filter(
id == "C22:6"
) %>%
pull(xic) +
ggtitle("C22:6 (DHA): 39 samples")
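The snippets above only show retrieval. For the storage side, here is a minimal sketch of how ggplot2 layers can sit in a tibble list column and be pulled back out. The trace() helper and its Gaussian toy data are hypothetical and stand in for whatever tidychrom's plotting functions actually return.

```r
library(dplyr)
library(tibble)
library(ggplot2)

# Hypothetical helper: a toy extracted-ion trace centered on a given RT
trace <- function(center) {
  tibble(
    rt = seq(930, 940, by = 0.5),
    intensity = exp(-(rt - center)^2 / 2)
  )
}

# One layer per row, stored in the list column xic
plots <- tibble(
  samp = c("JWL0012", "JWL0138"),
  id = "C22:6",
  xic = list(
    geom_line(data = trace(934), aes(x = rt, y = intensity)),
    geom_line(data = trace(935), aes(x = rt, y = intensity))
  )
)

# pull() returns a plain list of layers, which can be added to ggplot() in one go
layers <- plots %>%
  filter(id == "C22:6") %>%
  pull(xic)

p <- ggplot() + layers + ggtitle("C22:6 (DHA): 2 samples")
```

Because each layer carries its own data, the empty ggplot() call is all the scaffolding needed to overlay any subset of rows.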
Preprocessing scripts to subtract blank data (like solvent peaks and column bleed):
Scanwise blanking (multisession)
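For orientation, scan-by-scan blank subtraction might look something like the sketch below. This is an assumption about the general idea, not a copy of the blanking script: the scan, mz and intensity columns are hypothetical, and joining on exact m/z values only works for nominal (integer) masses.

```r
library(dplyr)
library(tibble)

# Hypothetical long-format scan data: one row per scan and m/z
sample_scans <- tibble(
  scan = c(1, 1, 2, 2),
  mz = c(74, 87, 74, 87),
  intensity = c(500, 300, 900, 120)
)

blank_scans <- tibble(
  scan = c(1, 1, 2),
  mz = c(74, 87, 74),
  intensity = c(100, 350, 150)
)

blanked <- sample_scans %>%
  # line up each sample reading with the blank reading at the same scan and m/z
  left_join(blank_scans, by = c("scan", "mz"), suffix = c("", "_blank")) %>%
  mutate(
    # ions absent from the blank contribute nothing to subtract
    intensity_blank = coalesce(intensity_blank, 0),
    # subtract, and floor at zero so no intensity goes negative
    intensity = pmax(intensity - intensity_blank, 0)
  ) %>%
  select(-intensity_blank)
```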
This workflow was made to calculate the ratio (molar percentages) of fatty acid methyl esters (FAMEs) in a set of biological samples. It has 2 main steps:
- ROI determination by peak frequency, saturation analysis, and spectrum extraction
- Sample ID (against above ROIs) and relative quantitation
Comments will walk you through each script. All user-provided parameters (including data directories) are provided at the top. Data files used in the scripts will be hosted at a later date.
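The relative quantitation step boils down to a grouped normalization. Below is a minimal sketch, assuming integrated areas are converted to moles with per-compound response factors. The area_per_mol values and the resp table are invented for illustration, and the actual scripts may calibrate differently.

```r
library(dplyr)
library(tibble)

# Hypothetical integrated peak areas
areas <- tibble(
  samp = c("JWL0012", "JWL0012", "JWL0012", "JWL0138", "JWL0138"),
  id = c("C16:0", "C18:1", "C22:6", "C16:0", "C22:6"),
  area = c(2e6, 1e6, 1e6, 3e6, 1e6)
)

# Hypothetical response factors (detector area per mole of compound)
resp <- tibble(
  id = c("C16:0", "C18:1", "C22:6"),
  area_per_mol = c(1e12, 1e12, 5e11)
)

molpct <- areas %>%
  inner_join(resp, by = "id") %>%
  mutate(mol = area / area_per_mol) %>%
  # normalize within each sample so the percentages sum to 100
  group_by(samp) %>%
  mutate(mol_pct = 100 * mol / sum(mol)) %>%
  ungroup()
```

Each sample's mol_pct values sum to 100, so compositions can be compared across samples regardless of how much material was injected.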